handling missing kdump pages in diskdump format
by Bob Montgomery
I've been experimenting with the makedumpfile utility for kdump on ia64.
One of my experiments was to verify that a page that should have been
missing indeed was missing. I used crash 4.0-3.8 to look for a user
page that should have been omitted from the dump.
crash> x/xg 0xe0000040fc00c000
0xe0000040fc00c000: 0x0000000000000000
On a full dump from makedumpfile as well as on a straight copy of
vmcore, crash reports this:
crash> x/xg 0xe0000040fc00c000
0xe0000040fc00c000: 0x00010102464c457f
The dumpfiles created by makedumpfile look like diskdump files to crash,
and crash appears to excuse missing pages and report 0x0 contents here:
diskdump.c:read_diskdump, line 454:
        if (!page_is_dumpable(pfn)) {
                memset(bufptr, 0, cnt);
                return cnt;
        }
Shouldn't there be some indication that a requested page is missing as
opposed to being legitimately full of zeros?
Bob Montgomery
17 years, 9 months
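For the diskdump question above, one way a reader routine could tell "page excluded from the dump" apart from "page really contains zeros" is to return a distinct status instead of zero-filling the buffer. The following is only a minimal sketch under stated assumptions: the bitmap layout, `page_in_dump()`, `read_page_checked()` and `PAGE_EXCLUDED` are hypothetical names made up for illustration and are not taken from crash's diskdump.c.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_EXCLUDED (-2L)   /* hypothetical status code, not from crash */

/* Hypothetical dumpable-page bitmap: bit n set means pfn n was saved. */
static int page_in_dump(const unsigned char *bitmap, unsigned long pfn)
{
    return (bitmap[pfn >> 3] >> (pfn & 7)) & 1;
}

/*
 * Copy cnt bytes of a saved page into bufptr and return cnt, or return
 * PAGE_EXCLUDED without touching bufptr if the page was filtered out.
 */
static long read_page_checked(const unsigned char *bitmap, unsigned long pfn,
                              const void *page_data, void *bufptr, size_t cnt)
{
    if (!page_in_dump(bitmap, pfn))
        return PAGE_EXCLUDED;   /* caller can report "page excluded" */
    memcpy(bufptr, page_data, cnt);
    return (long)cnt;
}
```

With a status like this, the caller can choose between printing an error and zero-filling, rather than the choice being made silently in the read path.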
Re: [Crash-utility] problem with crash on upstream kernel cores
by Dave Anderson
The 2.6.21-rc1 kernel has removed the zone->free_pages member, and
replaced it by using the zone->vm_stat[NR_FREE_PAGES] counter.
Without this patch, crash will die during initialization with:
crash: invalid (optional) structure member offsets: zone_struct_free_pages or zone_free_pages
The attached patch to the 4.0-3.20 tree is queued for the next release.
Thanks for the report,
Dave
memory.c symbols.c defs.h
--- crash-4.0-3.20/memory.c 2007-02-26 15:34:57.000000000 -0500
+++ crash-next/memory.c 2007-02-26 15:38:51.000000000 -0500
@@ -659,8 +659,17 @@
vt->dump_free_pages = dump_free_pages_zones_v1;
} else if (VALID_STRUCT(zone)) {
- MEMBER_OFFSET_INIT(zone_free_pages,
- "zone", "free_pages");
+ MEMBER_OFFSET_INIT(zone_vm_stat, "zone", "vm_stat");
+ MEMBER_OFFSET_INIT(zone_free_pages, "zone", "free_pages");
+ if (INVALID_MEMBER(zone_free_pages) &&
+ VALID_MEMBER(zone_vm_stat)) {
+ long nr_free_pages = 0;
+ if (!enumerator_value("NR_FREE_PAGES", &nr_free_pages))
+ error(WARNING,
+ "cannot determine NR_FREE_PAGES enumerator\n");
+ ASSIGN_OFFSET(zone_free_pages) = OFFSET(zone_vm_stat) +
+ (nr_free_pages * sizeof(long));
+ }
MEMBER_OFFSET_INIT(zone_free_area,
"zone", "free_area");
MEMBER_OFFSET_INIT(zone_zone_pgdat,
--- crash-4.0-3.20/symbols.c 2007-02-26 15:34:57.000000000 -0500
+++ crash-next/symbols.c 2007-02-26 14:33:41.000000000 -0500
@@ -6414,6 +6414,8 @@
OFFSET(zone_pages_low));
fprintf(fp, " zone_pages_high: %ld\n",
OFFSET(zone_pages_high));
+ fprintf(fp, " zone_vm_stat: %ld\n",
+ OFFSET(zone_vm_stat));
fprintf(fp, " neighbour_next: %ld\n",
OFFSET(neighbour_next));
--- crash-4.0-3.20/defs.h 2007-02-26 15:34:57.000000000 -0500
+++ crash-next/defs.h 2007-02-26 14:52:33.000000000 -0500
@@ -1282,6 +1282,7 @@
long zone_pages_min;
long zone_pages_low;
long zone_pages_high;
+ long zone_vm_stat;
long neighbour_next;
long neighbour_primary_key;
long neighbour_ha;
17 years, 9 months
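The offset arithmetic in the memory.c hunk above can be tried out in isolation. This is a sketch against a mock structure: `struct mock_zone`, the `MOCK_*` enumerators and `free_pages_offset()` are hypothetical stand-ins, and the enumerator order is invented for the example rather than copied from the kernel.

```c
#include <stddef.h>

/* Mock counters; the real kernel enum has different members and order. */
enum mock_zone_stat_item {
    MOCK_NR_OTHER_A,
    MOCK_NR_OTHER_B,
    MOCK_NR_FREE_PAGES,
    MOCK_NR_STAT_ITEMS
};

/* Mock layout; the real struct zone has many more members. */
struct mock_zone {
    unsigned long pages_min;
    long vm_stat[MOCK_NR_STAT_ITEMS];
};

/*
 * Same computation as the patch: the free-page counter lives at
 * offset(vm_stat) + index * sizeof(long).
 */
static long free_pages_offset(long vm_stat_offset, long nr_free_pages_index)
{
    return vm_stat_offset + nr_free_pages_index * (long)sizeof(long);
}
```

Note that `sizeof(long)` here is the size on the machine running the code, so the sketch (like the patch) assumes it matches the element size of `vm_stat` in the kernel being analyzed.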
problem with crash on upstream kernel cores
by Josef Whiter
Hello,
I'm having a problem using crash on cores generated by upstream kernels (in this
case 2.6.20). I'm using crash 4.0-3.20, and I've built my kernel with -g.
Whenever I try to open it, this is the error I get:
[root@rh5cluster2 127.0.0.1-2007-02-23-15:09:25]#
crash /root/linux-2.6/vmlinux vmcore
crash 4.0-3.20
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
crash: invalid (optional) structure member offsets: zone_struct_free_pages or
zone_free_pages
FILE: memory.c LINE: 11520 FUNCTION: dump_memory_nodes()
[/usr/bin/crash] error trace: 8096dcc => 80baca0 => 80ba076 => 812ee1a
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
WARNING: Because this kernel was compiled with gcc version 4.1.1, certain
commands or command options may fail unless crash is invoked with
the "--readnow" command line option.
[root@rh5cluster2 127.0.0.1-2007-02-23-15:09:25]# file vmcore
vmcore: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style
The file is definitely a proper core. I used the --readnow option and that did
not help either. Thank you,
Josef
17 years, 9 months
crash version 4.0-3.20 is available
by Dave Anderson
- Merged third round of "xencrash" patches, which allows a crash
session to alternatively be brought up against the xen-syms
binary instead of a vmlinux kernel. This update introduces
support for ia64. (oda(a)valinux.co.jp)
- Verified support of live system analysis of ia64 xen kernels, and
removed unnecessary EFI memory verification warning message during
their initialization. (anderson(a)redhat.com)
- Added gdb's "shell" command to the prohibited gdb command list, and
updated the "help output" page to describe shell escape usage.
(anderson(a)redhat.com)
- Fix for the x86 "bt" command for the 2.6.20 kernel, which has added
the "xgs" field to the pt_regs structure. Without this patch, the
exception frame dump in "bt" would show invalid contents for several
registers; the fix also shows the GS register contents.
(anderson(a)redhat.com)
- Fix for the "mount" command for the 2.6.20 kernel to recognize the
new "nsproxy" field in the task_struct and the contents of the
nsproxy and mnt_namespace structures, in order to find the root
mount namespace. Without the patch, the command would fail with:
"mount: invalid kernel virtual address: 69 type: first list entry".
(anderson(a)redhat.com)
- Fix for the "files" command for the 2.6.20 kernel to handle the
removal of the fdtable "max_fdset" member. Without the patch, the
command would fail with: "files: invalid structure member offset:
fdtable_max_fdset". (anderson(a)redhat.com)
- Fix for the "net -[sS]" command options for the 2.6.20 kernel to
handle the removal of the fdtable "max_fdset" member. Without the
patch, the command would fail with: "net: invalid structure member
offset: fdtable_max_fdset". (anderson(a)redhat.com)
- Fix for the "vm" command for the 2.6.20 kernel to handle the removal
of the file structure's "f_dentry" member, and its placement inside
the embedded "path" structure. Without the patch the command would
fail with: "vm: invalid structure member offset: file_f_dentry".
(anderson(a)redhat.com)
- Fix for the "swap" command for the 2.6.20 kernel to handle the removal
of the file structure's "f_vfsmnt" member, and its placement inside
the embedded "path" structure. Without the patch the command would
fail with: "swap: invalid structure member offset: file_f_vfsmnt".
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
17 years, 9 months
Re: [Crash-utility] 2.6.20 data structure changes
by Dave Anderson
> One of the other changes is the removal of the "max_fdset"
> field from the "fdtable" structure. I'm not entirely clear how
> best to handle it -- it almost seems that if it doesn't exist,
> the "max_fds" value can be used alone in all cases where
> it used to be used in conjunction with "max_fdset"?
>
If I may answer my own question, yeah, that works Dave...
BTW, "mount" is broken also.
Dave
17 years, 9 months
2.6.20 data structure changes
by Dave Anderson
Just as a heads-up, all of a sudden there are changes to
some crucial data structures in 2.6.20 that break several
commands, such as "vm", "files", "swap", "net -s", and probably
more...
I've fixed the "vm" and "swap" commands by recognizing
that the file structure's former "f_dentry" and "f_vfsmnt"
fields are now contained within the embedded "path"
structure. That was simple enough...
One of the other changes is the removal of the "max_fdset"
field from the "fdtable" structure. I'm not entirely clear how
best to handle it -- it almost seems that if it doesn't exist,
the "max_fds" value can be used alone in all cases where
it used to be used in conjunction with "max_fdset"? Any
filesystem-ophile out there that can offer up a suggestion
would be appreciated...
I'll also fix the i386 "xgs" register addition to its pt_regs
structure. Unfortunately the pt_regs register offsets have
been hardwired in crash since the beginning of time (i.e.,
long before gdb was embedded in crash) and it's screwing
up exception frame output...
Dave
17 years, 9 months
Problem with "shell" command in crash
by Raghuveer R
I am having some problems using the "shell" command in crash.
If I execute the "shell" command in crash, the commands typed at the shell
prompt are not displayed properly. Here is what I see:
crash> shell
mac:/home/raghu # 2a a.out addr.c crash-4.0-3.19 dump
^^^^^entered ls
mac:/home/raghu # llm22:/home/raghu # llm22:/home/raghu # llm22:/home/raghu #
^^^^^enter ^^^^^enter
However, the problem disappears if I execute 'stty echo' or 'reset' at the shell prompt.
mac:/home/raghu # llm22:/home/raghu #
^^^^entered stty echo
mac:/home/raghu # echo This is fine
This is fine
mac:/home/raghu #
Also, the problem does not occur if I run crash under gdb.
The problem is reproducible in the latest version of crash, crash-4.0-3.19.
Does anyone know how to fix this?
Thanks,
Raghuveer R
17 years, 9 months
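Since 'stty echo' cures the symptom in the report above, terminal echo is evidently switched off when the shell starts. As a minimal sketch, and not crash's actual fix, echo can be re-enabled from C with the POSIX termios calls before a shell is spawned. The helper names `set_echo()` and `restore_echo()` are hypothetical.

```c
#include <termios.h>
#include <unistd.h>

/* Turn on the ECHO bit in an already-fetched settings structure. */
static void set_echo(struct termios *t)
{
    t->c_lflag |= ECHO;
}

/*
 * Re-enable echo on the terminal behind fd (e.g. STDIN_FILENO).
 * Returns 0 on success, -1 if fd is not a terminal or the call fails.
 */
static int restore_echo(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)
        return -1;
    set_echo(&t);
    return tcsetattr(fd, TCSANOW, &t);
}
```

A caller would invoke `restore_echo(STDIN_FILENO)` just before running the shell, and would normally save the original settings first so they can be put back afterwards.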
IA64 support for xen hypervisor analysis
by Itsuro ODA
Hi,
This is an update for xen hypervisor analysis.
Now IA64 xen hypervisor analysis is supported.
(We used an ELF vmcore taken by Fujitsu sadump to check this.)
This patch is for crash-4.0-3.19.
Thanks.
--
Itsuro ODA <oda(a)valinux.co.jp>
17 years, 9 months
crash version 4.0-3.19 is available
by Dave Anderson
- Fix for support of paravirtual x86 xendumps that were:
1) created on host machines with greater than 4GB of memory, and
2) the active guest task at crash-time had been assigned a page
directory page (cr3) with a machine address greater than 4GB.
If both of the above apply, the crash session would fail with one of
two error messages, either "crash: cannot read/find cr3 page", or
"crash: cannot create xen pfn-to-mfn mapping". (anderson(a)redhat.com)
- Fix for the "kmem -p [page-struct-address]" command construct, which
would cause a segmentation violation when run on SPARSEMEM kernels.
(anderson(a)redhat.com)
- Added a new "struct -u" option, which indicates that the subsequent
address argument is a user virtual address in the current context.
This option could be used, for example, if a known kernel data
structure exists at a user virtual address in the current context,
or if the debuginfo data of a user program were loaded into the
crash session via the gdb "add-symbol-file" command.
(anderson(a)redhat.com)
- Added new "rd -f" and "struct -f" options, which indicate that the
subsequent address argument is a dumpfile file offset. These options
could be used, for example, to print a known kernel data structure
that exists in the dumpfile header, or to simply dump data directly
from the dumpfile. (anderson(a)redhat.com)
- Cosmetic fix to prevent double-printing of "kmem -p" and "kmem -v"
headers when they are passed multiple address arguments.
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
17 years, 10 months
kmem command problem (bug of dump_mem_map_SPARSEMEM?)
by Takao Indoh
Hi,
I used crash-4.0-3.18 and found a problem.
When I used the kmem command to get information on
virtual address e000000105090000, which is a page pointer,
a segmentation fault occurred.
-----------------------------------------------------------------------
KERNEL: /usr/lib/debug/lib/modules/2.6.18-8.el5/vmlinux
DUMPFILE: /dev/mem
CPUS: 4
DATE: Tue Feb 6 15:53:25 2007
UPTIME: 01:12:05
LOAD AVERAGE: 0.55, 0.56, 0.30
TASKS: 185
NODENAME: build.fujitsu.com
RELEASE: 2.6.18-8.el5
VERSION: #1 SMP Fri Jan 26 14:16:09 EST 2007
MACHINE: ia64 (1600 Mhz)
MEMORY: 31.2 GB
PID: 16173
COMMAND: "crash"
TASK: e0000040b2660000 [THREAD_INFO: e0000040b2661040]
CPU: 1
STATE: TASK_RUNNING (ACTIVE)
crash> kmem -p | grep e000000105090000
e000000105090000 180000000 ------- ----- 0 600200080000
crash> kmem -p e000000105090000
Segmentation fault (core dumped)
-----------------------------------------------------------------------
This is a backtrace.
#0 0x40000000000faa20 in nr_to_section (nr=15032385540) at memory.c:11897
#1 0x40000000000fb360 in valid_section_nr (nr=15032385540) at memory.c:11966
#2 0x40000000000ab000 in dump_mem_map_SPARSEMEM (mi=0x60000fffffd88370)
at memory.c:3754
#3 0x40000000000b00c0 in dump_mem_map (mi=0x60000fffffd88370) at memory.c:4120
#4 0x40000000000a7b70 in cmd_kmem () at memory.c:3345
#5 0x400000000005ad20 in exec_command () at main.c:528
#6 0x400000000005a900 in main_loop () at main.c:487
It seems that there is a problem in dump_mem_map_SPARSEMEM.
Here is a part of it.
switch (mi->flags)
{
case ADDRESS_SPECIFIED:
switch (mi->memtype)
{
case KVADDR:
1) if (is_page_ptr(mi->spec_addr, NULL))
pg_spec = TRUE;
else {
if (kvtop(NULL, mi->spec_addr, &phys, 0)) {
2) mi->spec_addr = phys;
phys_spec = TRUE;
}
else
return;
}
break;
(snipped)
if (mi->flags & ADDRESS_SPECIFIED) {
3) ulong pfn = mi->spec_addr >> PAGESHIFT();
section_nr = pfn_to_section_nr(pfn);
}
4) if (!(section = valid_section_nr(section_nr))) {
1) e000000105090000 is a page pointer, so is_page_ptr returns TRUE.
2) mi->spec_addr is converted to a physical address here, but this
code is not executed in this case.
3) mi->spec_addr is converted to a pfn, but mi->spec_addr still holds a
virtual address in this case, so the conversion yields a bogus pfn.
4) section_nr is therefore an invalid section number, which causes the fault.
I have not yet found how to fix it.
Any idea?
Takao Indoh
17 years, 10 months
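The diagnosis in the kmem report above fits the backtrace: nr=15032385540 is exactly 0xe000000105090000 >> 30, i.e. the virtual address shifted as if it were physical (which suggests a combined 30-bit shift on that kernel). The sketch below reproduces that value and shows one possible shape of a fix, translating a page pointer to the physical address it describes before computing the section number. It is an illustration under assumptions: the flat `struct mock_memmap`, the `MOCK_*` constants and all function names are hypothetical, and a real SPARSEMEM kernel keeps a separate mem_map per section.

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define MOCK_PAGE_SHIFT        14   /* 16KB pages */
#define MOCK_SECTION_SIZE_BITS 30

/* Hypothetical flat page-struct array (simplification of SPARSEMEM). */
struct mock_memmap {
    uint64_t base;             /* virtual address of the first page struct */
    uint64_t page_struct_size; /* bytes per page struct */
    uint64_t first_pfn;        /* pfn described by the first page struct */
    uint64_t npages;           /* number of page structs in the array */
};

/* If addr points at a page struct, store the physical address it describes. */
static int mock_page_ptr_to_phys(const struct mock_memmap *mm,
                                 uint64_t addr, uint64_t *phys)
{
    uint64_t off;

    if (addr < mm->base ||
        addr >= mm->base + mm->npages * mm->page_struct_size)
        return 0;
    off = addr - mm->base;
    if (off % mm->page_struct_size)
        return 0;
    *phys = (mm->first_pfn + off / mm->page_struct_size) << MOCK_PAGE_SHIFT;
    return 1;
}

/* What the reported code path does: shift whatever address it was given. */
static uint64_t naive_section_nr(uint64_t spec_addr)
{
    uint64_t pfn = spec_addr >> MOCK_PAGE_SHIFT;

    return pfn >> (MOCK_SECTION_SIZE_BITS - MOCK_PAGE_SHIFT);
}

/* Possible fix: translate page pointers to physical addresses first. */
static uint64_t fixed_section_nr(const struct mock_memmap *mm,
                                 uint64_t spec_addr)
{
    uint64_t phys;

    if (mock_page_ptr_to_phys(mm, spec_addr, &phys))
        spec_addr = phys;
    return naive_section_nr(spec_addr);
}
```

Whether the translation belongs in the KVADDR case or just before the pfn computation is a design choice for the maintainer; the sketch only shows that the shift must operate on a physical address.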