Vivek Goyal wrote:
On Mon, Sep 10, 2007 at 11:35:21AM -0700, Randy Dunlap wrote:
>On Fri, 7 Sep 2007 17:57:46 +0900 Ken'ichi Ohmichi wrote:
>
>
>>Hi,
>
>>I released a new makedumpfile (version 1.2.0) with vmcoreinfo support.
>>I updated the patches for linux and kexec-tools.
>>
>>PATCH SET:
>>[1/2] [linux-2.6.22] Add vmcoreinfo
>> The patch is for linux-2.6.22.
>> The patch adds the vmcoreinfo data. Its address and size are output
>> to /sys/kernel/vmcoreinfo.
>>
>>[2/2] [kexec-tools] Pass vmcoreinfo's address and size
>> The patch is for kexec-tools-testing-20070330.
>> (http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/)
>> The kexec command gets the address and size of the vmcoreinfo data from
>> /sys/kernel/vmcoreinfo, and passes them to the second kernel through
>> the ELF header of /proc/vmcore. When the second kernel boots, it
>> gets them from the ELF header and creates a vmcoreinfo PT_NOTE
>> segment in /proc/vmcore.
>
>Hi,
>When using the vmcoreinfo patches, what tool(s) are available for
>analyzing the vmcore (dump) file? E.g., lkcd or crash or just gdb?
>
>gdb works for me, but I tried to use crash (4.0-4.6 from
>http://people.redhat.com/anderson/) and crash complained:
>
>crash: invalid kernel virtual address: 0 type: "cpu_pda entry"
>
>Should crash work, or does it need to be modified?
>
Hi Randy,
Crash should just work. It might be broken on the latest kernel. Copying
this to the crash-utility mailing list; Dave will be able to tell us better.
>This is on a 2.6.23-rc3 kernel with vmcoreinfo patches and a dump file
>with -l 31 (dump level 31, omitting all possible pages).
There's always the possibility that something crucial (to the crash
utility) has changed in the upstream kernel; that's just the nature
of the beast.
In this case, crash is reading this set of per-cpu pointers:
struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly;
and for each one, it then reads the x8664_pda data structure
that it points to -- but finds a NULL. It's possible that it
has incorrectly determined the number of x8664_pda structures
(cpus) that exist. Or less likely, the array contents were read
as zeroes from the dumpfile.
Anyway, with any initialization-time failure, it's usually helpful
to invoke crash with the "-d7" (debug level) argument, as in:
$ crash -d7 vmlinux vmcore
That will display information re: every read made to the dumpfile.
In this case, normally you would see, for each cpu, a read of the
individual 8-byte address from the array, and then based upon what
it read, the subsequent read of the whole 128-byte data structure:
<readmem: ffffffff8042d9c0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffffffff80406000, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU0: level4_pgt: 200000010 data_offset: ffff8100899c1000
<readmem: ffffffff8042d9c8, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff027c0, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU1: level4_pgt: 200000010 data_offset: ffff8100899c9000
<readmem: ffffffff8042d9d0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff19e40, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU2: level4_pgt: 200000010 data_offset: ffff8100899d1000
<readmem: ffffffff8042d9d8, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff19640, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU3: level4_pgt: 200000010 data_offset: ffff8100899d9000
<readmem: ffffffff8042d9e0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffffffff80406200, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
From that data structure it grabs the level4_pgt and data_offset
fields for subsequent use. So in your case, it should show how
many (if any) of the x8664_pda structures it read before encountering
a NULL pointer in one of the array entries.
Dave