On Tue, 11 Sep 2007 14:12:00 -0400 Dave Anderson wrote:
Randy Dunlap wrote:
>
> I have the vmcoreinfo patch applied.
> Kernel is 2.6.23-rc3.
>
> The crash debug output is below. Please let me know if you'd like
> me to test without the vmcoreinfo patch or anything else.
>
> ---
[snipped]
A few things come to mind. Walking through the debug data above...
The very first readmem() from the dumpfile is from the kernel symbol
"kernel_config_data", where you can see that it found the CONFIG_HZ and
CONFIG_NR_CPUS values. The next readmem()'s are of "xtime" and then
"init_uts_ns". We don't know what was read from the "xtime" location,
but the utsname data from "init_uts_ns" gets displayed later on here:
> utsname version: #19 SMP Tue Sep 4 09:52:06 PDT 2007
And then the "linux_banner" address of ffffffff80537000 is first
checked for accessibility (OK), and then it is read successfully,
and its contents are displayed here:
> /proc/version:
> Linux version 2.6.23-rc3 (rddunlap(a)unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #19 SMP Tue Sep 4 09:52:06 PDT 2007
The string above from the dumpfile is correlated against the
linux_banner string in the vmlinux file, which is subsequently
displayed here:
> /boot/vmlinux-2.6.23-rc3:
> Linux version 2.6.23-rc3 (rddunlap(a)unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #22 SMP Thu Sep 6 21:24:54 PDT 2007
The utsname data and the linux_banner string from the dumpfile
are from "Tue Sep 4 09:52:06 PDT 2007", whereas the vmlinux file
was built 2 days later at "Thu Sep 6 21:24:54 PDT 2007". I don't
know whether that's the issue or not. Is there a reason that
you are *not* using the same vmlinux that the dumpfile was created
from?
Just sorry user error. Sorry to use your time like that
and thanks for the intro-to-crash lesson.
It's working now as expected. Thanks.
But, for now let's suppose that the two kernels are identical except
for the date in the linux_banner strings. I don't have a 2.6.23
kernel source tree handy, but at least as of 2.6.22-5, it was still
declared statically like so:
struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly;
Has that changed?
Nope.
If not, it would be worth checking a dumpfile with no pages
excluded with makedumpfile. I wouldn't think the in-kernel
part of the vmcoreinfo patches would make a difference, but
I suppose anything's possible.
crash works (loads without error) with the vmcore file and one that
has all possible pages removed from it using 'makedumpfile'.
But again -- the very first thing to do is make sure that you
are using the exact same vmlinux as was booted/dumped.
Very true.
Thanks again.
---
~Randy