Re: [Crash-utility] kmem -[sS] segfault on 2.6.25.17
by Dave Anderson
----- "Mike Snitzer" <snitzer(a)gmail.com> wrote:
> On Thu, Oct 16, 2008 at 12:25 PM, Dave Anderson <anderson(a)redhat.com>
> wrote:
> >
> > ----- "Mike Snitzer" <snitzer(a)gmail.com> wrote:
> >
> >> I'm getting a core when I try to show slab data (kmem -[sS]) on
> >> 2.6.25.17, both with a live crash session and with a saved vmcore.
> >>
> >> The core shows that the segv is coming from memset() via
> >> gather_cpudata_list_v2_nodes (memory.c:10119). This is with crash
> >> 4.0-7.4, but the same crash occurs with crash 4.0-6.3
> >> (memory.c:10108)
> >> and older.
> >>
> >> I've also seen kmem -[sS] segfaults with older kernels too (e.g.
> >> 2.6.22.x).
> >>
> >> Have others experienced this?  Would it be useful for me to
> >> provide my kernel config?
> >
> > No, that won't help.
>
> Actually I think it may, considering kmem -[sS] works perfectly fine
> on the identical 2.6.22.19 kernel if various debug features are
> _not_ enabled; see the attached .config diff.  Of note:
> -# CONFIG_DEBUG_SLAB is not set
> +CONFIG_DEBUG_SLAB=y
> +CONFIG_DEBUG_SLAB_LEAK=y
>
> Comparable debug features are enabled in my 2.6.25 kernel that causes
> crash to segfault.
Good point...
>
> > It's failing in the BZERO() here:
> >
> > 10117         for (i = 0; (i < ARRAY_LENGTH(kmem_cache_s_array)) &&
> > 10118                      (cpudata[i]) && !(index); i++) {
> > 10119                 BZERO(si->cpudata[i], sizeof(ulong) * vt->kmem_max_limit);
> >
> > What is "i" equal to when it segfaults?  If you have a crash core file,
> > print out the contents of the global "vm_table".  In that structure
> > there is a "kmem_max_cpus" field.  If "i" is greater than or equal to
> > that, then that's one explanation.
>
> i=0 and kmem_max_cpus=4.
Ok, then I can't see off-hand why it would segfault. Prior to this
routine running, si->cpudata[0...i] all get allocated buffers equal
to the size that's being BZERO'd.
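For reference, a rough sketch of the allocation being described -- the
allocation loop shown here is an assumption about the crash internals, not
the exact source; only gather_cpudata_list_v2_nodes() and the BZERO() line
appear verbatim above:

        /* earlier in the kmem -s/-S setup path, one buffer per cpu slot */
        for (i = 0; i < vt->kmem_max_cpus; i++)
                si->cpudata[i] = (ulong *)GETBUF(vt->kmem_max_limit * sizeof(ulong));

        /* later, gather_cpudata_list_v2_nodes() clears each buffer */
        BZERO(si->cpudata[i], sizeof(ulong) * vt->kmem_max_limit);

So with i=0 and kmem_max_cpus=4, si->cpudata[0] should point at a valid
buffer unless the allocation was skipped or the pointer was clobbered.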
Is si->cpudata[i] NULL or something?
Dave
Re: [Crash-utility] kmem -[sS] segfault on 2.6.25.17
by Dave Anderson
----- "Mike Snitzer" <snitzer(a)gmail.com> wrote:
> I'm getting a core when I try to show slab data (kmem -[sS]) on
> 2.6.25.17, both with a live crash session and with a saved vmcore.
>
> The core shows that the segv is coming from memset() via
> gather_cpudata_list_v2_nodes (memory.c:10119). This is with crash
> 4.0-7.4, but the same crash occurs with crash 4.0-6.3
> (memory.c:10108)
> and older.
>
> I've also seen kmem -[sS] segfaults with older kernels too (e.g.
> 2.6.22.x).
>
> Have others experienced this?  Would it be useful for me to provide
> my kernel config?
No, that won't help.
It's failing in the BZERO() here:
10117         for (i = 0; (i < ARRAY_LENGTH(kmem_cache_s_array)) &&
10118                      (cpudata[i]) && !(index); i++) {
10119                 BZERO(si->cpudata[i], sizeof(ulong) * vt->kmem_max_limit);
What is "i" equal to when it segfaults? If you have a crash core file,
print out the contents of the global "vm_table". In that structure
there is a "kmem_max_cpus" field. If "i" is greater or equal to that,
then that's one explanation.
Or you can bring up the dumpfile (or live system) and check the value of
kmem_max_cpus in the output of "help -v".
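To make the core-file suggestion concrete, this is standard gdb usage
against the crash binary and its core file; the paths and the frame number
are placeholders:

        $ gdb /usr/bin/crash core
        (gdb) print vm_table.kmem_max_cpus
        (gdb) print vm_table.kmem_max_limit
        (gdb) bt
        (gdb) frame <N>        # the gather_cpudata_list_v2_nodes frame
        (gdb) print i
        (gdb) print si->cpudata[i]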
Dave
kmem -[sS] segfault on 2.6.25.17
by Mike Snitzer
I'm getting a core when I try to show slab data (kmem -[sS]) on
2.6.25.17, both with a live crash session and with a saved vmcore.
The core shows that the segv is coming from memset() via
gather_cpudata_list_v2_nodes (memory.c:10119). This is with crash
4.0-7.4, but the same crash occurs with crash 4.0-6.3 (memory.c:10108)
and older.
I've also seen kmem -[sS] segfaults with older kernels too (e.g. 2.6.22.x).
Have others experienced this? Would it be useful for me to provide my
kernel config?
Mike
Re: [Crash-utility] [PATCH] Support linux-2.6.26 sparsemem kernel on i386.
by Dave Anderson
----- "Ken'ichi Ohmichi" <oomichi(a)mxs.nes.nec.co.jp> wrote:
> Hi Dave,
>
> When 'kmem -p'/'kmem -i' is executed on an i386 linux-2.6.26 sparsemem
> kernel, the current crash utility miscalculates the number of struct page
> entries in each section and references an invalid struct page.
>
> The cause is that SECTION_SIZE_BITS for PAE was changed from 30 to 29
> in linux-2.6.26 by the following:
>
> [PATCH 3 of 4] sparsemem: reduce i386 PAE section size
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0803.3/1882.html
>
> With the attached patch applied, the crash utility catches up with the
> above change.  It makes the results of both 'kmem -p' and 'kmem -i'
> correct.  The attached patch is for crash-4.0-7.4.
>
> There is the same problem in makedumpfile, and I will fix it :-)
>
>
> Thanks
> Ken'ichi Ohmichi
Thanks Ken'ichi -- it's queued for the next release.
Dave
>
> Signed-off-by: Ken'ichi Ohmichi <oomichi(a)mxs.nes.nec.co.jp>
> ---
> diff -rpuN crash-4.0-7.4.orig/defs.h crash-4.0-7.4/defs.h
> --- crash-4.0-7.4.orig/defs.h 2008-10-16 17:32:46.000000000 +0900
> +++ crash-4.0-7.4/defs.h 2008-10-16 17:46:30.000000000 +0900
> @@ -2069,7 +2069,8 @@ struct load_module {
> #define TIF_SIGPENDING (2)
>
> // CONFIG_X86_PAE
> -#define _SECTION_SIZE_BITS_PAE 30
> +#define _SECTION_SIZE_BITS_PAE_ORIG 30
> +#define _SECTION_SIZE_BITS_PAE_2_6_26 29
> #define _MAX_PHYSMEM_BITS_PAE 36
>
> // !CONFIG_X86_PAE
> diff -rpuN crash-4.0-7.4.orig/x86.c crash-4.0-7.4/x86.c
> --- crash-4.0-7.4.orig/x86.c 2008-10-16 17:32:46.000000000 +0900
> +++ crash-4.0-7.4/x86.c 2008-10-16 17:47:58.000000000 +0900
> @@ -1819,7 +1819,12 @@ x86_init(int when)
> }
>
> if (machdep->flags & PAE) {
> - machdep->section_size_bits = _SECTION_SIZE_BITS_PAE;
> + if (THIS_KERNEL_VERSION < LINUX(2,6,26))
> + machdep->section_size_bits =
> + _SECTION_SIZE_BITS_PAE_ORIG;
> + else
> + machdep->section_size_bits =
> + _SECTION_SIZE_BITS_PAE_2_6_26;
> machdep->max_physmem_bits = _MAX_PHYSMEM_BITS_PAE;
> } else {
> machdep->section_size_bits = _SECTION_SIZE_BITS;
> _
>
Re: [Crash-utility] "cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash
by Dave Anderson
----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
> Thought that perhaps the newest kexec-tools from git would help, with
> changelog entries like "Parse contents of /proc/iomem instead of
> hardcoding RAM ranges." No joy. :\
>
> Dave, one last question for you before I attempt to get assistance
> from kexec folks... I just noticed that crash reports my memory is
> 5GB. I have 4GB of physical memory. Am I correct in assuming this is
> just some cosmetic detail?
Pretty much. It's a PITA figuring it out for all the different
architectures and memory configurations, but in your kernel's case
you've got 1 "pglist_data" memory node descriptor, which covers
the whole range of your physical memory.  And that structure's
"node_present_pages" contains a count of 1310720 pages, which is
exactly 5GB. From your "crash.log" file, that value gets stored
by crash in its singular node_table[0] data structure:
    node_table[0]:
             id: 0
          pgdat: 403c6e80
           size: 1310720
        present: 1310720
        mem_map: 45000000
    start_paddr: 0
    start_mapnr: 0
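The arithmetic, assuming the usual 4KB i386 page size:

        1310720 pages * 4096 bytes/page = 5368709120 bytes = 5 * 2^30 = 5GB

so the "MEMORY: 5 GB" banner is simply node_present_pages converted to
bytes, not a measurement of installed RAM.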
Dave
>
> -----Original Message-----
> From: Worth, Kevin
> Sent: Wednesday, October 15, 2008 4:16 PM
> To: Discussion list for crash utility usage, maintenance and
> development
> Subject: RE: [Crash-utility] "cannot access vmalloc'd module memory"
> when loading kdump'ed vmcore in crash
>
> Tried version 2.0 of kexec-tools (released 7/19/2008) and still have
> the same problem with zero'ed out module info. Sounds like perhaps it
> is narrowed down to the 2.6.20 kernel kdump code (the most difficult
> part to change out, but the part that likely gets a lot of eyes on it
> and probably has the issue fixed in current versions). :\
>
> -Kevin
>
> -----Original Message-----
> From: crash-utility-bounces(a)redhat.com
> [mailto:crash-utility-bounces@redhat.com] On Behalf Of Worth, Kevin
> Sent: Wednesday, October 15, 2008 2:43 PM
> To: Discussion list for crash utility usage, maintenance and
> development
> Subject: RE: [Crash-utility] "cannot access vmalloc'd module memory"
> when loading kdump'ed vmcore in crash
>
> Sorry, please ignore the second paragraph... I am already running the
> most recent version in Ubuntu, 20070330 from Simon Horman's
> kexec-tools-testing and kernel 2.6.20. ... may try a newer version of
> kexec-tools-testing to see if anything changes.
>
> -Kevin
>
> -----Original Message-----
> From: Worth, Kevin
> Sent: Wednesday, October 15, 2008 2:31 PM
> To: Discussion list for crash utility usage, maintenance and
> development
> Subject: RE: [Crash-utility] "cannot access vmalloc'd module memory"
> when loading kdump'ed vmcore in crash
>
> So Dave, at this point am I correct in the assumption that it sounds like
> this is not a problem with crash, but with the dump file itself? I
> tried one more go at modifying the kexec-tools to have the correct
> PAGE_OFFSET defined and still got the same type of results (all zeroes
> at the module's address), so that doesn't seem to be it.
>
> Maybe this is a better question to take to the kexec mailing list, but
> do you know where the line is drawn between the kernel support or the
> userspace (kexec-tools)? I'm presuming that the kernel support is tied
> to each kernel (i.e. since I'm on 2.6.20 this issue could have been
> resolved in a more recent kernel). I'm wondering if I can pull a newer
> kexec-tools and that they might work with 2.6.20 and possibly have
> this issue resolved.
>
> -Kevin
>
> -----Original Message-----
> From: crash-utility-bounces(a)redhat.com
> [mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
> Sent: Wednesday, October 15, 2008 6:53 AM
> To: Discussion list for crash utility usage, maintenance and
> development
> Subject: Re: [Crash-utility] "cannot access vmalloc'd module memory"
> when loading kdump'ed vmcore in crash
>
>
> ----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
>
> > Hi Dave,
> >
> > Before you responded I noticed that a simple "make modules" didn't
> > work because my kernel wasn't exporting the symbol. Rather than do
> > anything risky/complex which might risk mucking up the
> troubleshooting
> > process, I just rebuilt the kernel. It built just fine and now I
> can
> > load crash and I see "DUMPFILE: /dev/crash" when I load up crash.
> Let
> > me try walking through the steps that you had me do previously,
> this
> > time using /dev/crash instead of /dev/mem and /dev/kmem
>
> You made one small error (but not totally fatal) in the suggested
> steps.
> See my comments below...
>
> >
> > From my limited understanding of what's going on here, it would
> > appear that the dump file is missing some data, or else crash is
> > looking in the wrong place for it.
>
> The crash utility is a slave to what is indicated in the PT_LOAD
> segments of the ELF header of the kdump vmcore. In the case of
> the physical memory chunk that starts at 4GB physical on your
> machine,
> this is what's in the ELF header (from your original "crash.log"
> file):
>
> Elf64_Phdr:
> p_type: 1 (PT_LOAD)
> p_offset: 3144876760 (bb7302d8)
> p_vaddr: ffffffffffffffff
> p_paddr: 100000000
> p_filesz: 1073741824 (40000000)
> p_memsz: 1073741824 (40000000)
> p_flags: 7 (PF_X|PF_W|PF_R)
> p_align: 0
>
>
> What that says is: for the range of physical memory starting
> at 0x100000000 (p_paddr), the vmcore contains a block of
> memory starting at file offset (p_offset) 3144876760/0xbb7302d8
> that is 1073741824/0x40000000 (p_filesz) bytes long.
>
> More simply put, the 1GB of physical memory from 4GB to 5GB
> can be found in the vmcore file starting at file offset 3144876760.
>
> So if a request for physical memory page 0x100000000 comes
> in, the crash utility reads from vmcore file offset 3144876760.
> If the next physical page were requested, i.e., at 0x100001000,
> it would read from vmcore file offset 3144876760+4096. It's
> as simple as that -- so when you suggest that "crash is looking
> in the wrong place for it", well, there's nothing that the
> crash utility can do differently.
>
> Now, back to the test sequence:
>
> > ---Live system---
> >
> > KERNEL: vmlinux-devcrash
> > DUMPFILE: /dev/crash
> > CPUS: 2
> > DATE: Tue Oct 14 16:08:28 2008
> > UPTIME: 00:02:07
> > LOAD AVERAGE: 0.17, 0.08, 0.03
> > TASKS: 97
> > NODENAME: test-machine
> > RELEASE: 2.6.20-17.39-custom2
> > VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> > MACHINE: i686 (2200 Mhz)
> > MEMORY: 5 GB
> > PID: 5628
> > COMMAND: "crash"
> > TASK: 5d4c2560 [THREAD_INFO: f3de6000]
> > CPU: 1
> > STATE: TASK_RUNNING (ACTIVE)
> >
> > crash> p modules
> > modules = $2 = {
> > next = 0xf8a3ea04,
> > prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> > state = MODULE_STATE_LIVE,
> > list = {
> > next = 0xf8d10484,
> > prev = 0x403c63a4
> > },
> > name =
> >
> "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
> >
> 000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
> >
> 000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > mkobj = {
> > kobj = {
> > k_name = 0xf8a3ea4c "crash",
> > name =
> > "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > kref = {
> > refcount = {
> > counter = 3
> > }
> > },
> > entry = {
> > next = 0x403c6068,
> > prev = 0xf8d104e4
> > },
> > parent = 0x403c6074
> > ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL PHYSICAL
> > f8a3ea00 116017a00
>
> OK -- so the physical memory location of the module data structure
> is at physical address 116017a00, but...
> >
> > PAGE DIRECTORY: 4044b000
> > PGD: 4044b018 => 6001
> > PMD: 6e28 => 1d51a067
> > PTE: 1d51a1f0 => 116017163
> > PAGE: 116017000
> >
> > PTE PHYSICAL FLAGS
> > 116017163 116017000 (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > 472c02e0 116017000 0 229173 1 80000000
> >
>
> You're reading from the beginning of the page, i.e., 116017000
> instead of where the module structure is at 116017a00:
>
> > crash> rd -p 116017000 30
> > 116017000: 53e58955 d089c389 4d8bca89 74c98508 U..S.......M...t
> > 116017010: 01e9831f b85b0d74 ffffffea ffffba5d ....t.[.....]...
> > 116017020: 03c3ffff 53132043 26b48d24 00000000 ....C .S$..&....
> > 116017030: 89204389 5d5b2453 26b48dc3 00000000 .C .S$[]...&....
> > 116017040: 83e58955 55892cec 08558be4 89f45d89 U....,.U..U..]..
> > 116017050: 7d89f875 ffeabffc 4d89ffff 8b028be0 u..}.......M....
> > 116017060: c3890452 ac0fd689 45890cf3 0ceec1ec R..........E....
> > 116017070: 5589c889 89d231f0 ...U.1..
> > crash>
> >
>
> So therefore you're not seeing the "crash" strings embedded in
> the raw physical data. Now, although it would have been "nice"
> if you could have shown the contents of the module structure via
> the physical address, the fact remains that since you used the
> /dev/crash driver, the "module 0xf8a3ea00" command required that
> the crash utility first translate the vmalloc address into its
> physical equivalent, and then read from there.
>
> In any case, you do have a dump of physical memory from 116017000
> which at least is in the same 4k page as the module data structure,
> so it should not change when read from the dumpfile.
>
> > ---Using dump file---
> >
> >
> > please wait... (gathering module symbol data)
> > WARNING: cannot access vmalloc'd module memory
> >
> > KERNEL: vmlinux-devcrash
> > DUMPFILE: /var/crash/vmcore
> > CPUS: 2
> > DATE: Tue Oct 14 16:09:32 2008
> > UPTIME: 00:03:12
> > LOAD AVERAGE: 0.09, 0.08, 0.02
> > TASKS: 97
> > NODENAME: test-machine
> > RELEASE: 2.6.20-17.39-custom2
> > VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> > MACHINE: i686 (2200 Mhz)
> > MEMORY: 5 GB
> > PANIC: "[ 192.148000] SysRq : Trigger a crashdump"
> > PID: 0
> > COMMAND: "swapper"
> > TASK: 403c0440 (1 of 2) [THREAD_INFO: 403f2000]
> > CPU: 0
> > STATE: TASK_RUNNING (SYSRQ)
> >
> > crash> p modules
> > modules = $2 = {
> > next = 0xf8a3ea04,
> > prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> > state = MODULE_STATE_LIVE,
> > list = {
> > next = 0x0,
> > prev = 0x0
> > },
> > name =
> >
> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> >
> 00\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> >
> 00\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> > 00\000",
> > mkobj = {
> > kobj = {
> > k_name = 0x0,
> > name =
> > "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> > 00\000\000",
> > kref = {
> > refcount = {
> > counter = 0
> > }
> > },
> > entry = {
> > next = 0x0,
> > prev = 0x0
> > ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL PHYSICAL
> > f8a3ea00 116017a00
> >
> > PAGE DIRECTORY: 4044b000
> > PGD: 4044b018 => 6001
> > PMD: 6e28 => 1d51a067
> > PTE: 1d51a1f0 => 116017163
> > PAGE: 116017000
> >
> > PTE PHYSICAL FLAGS
> > 116017163 116017000 (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > 472c02e0 116017000 0 229173 1 80000000
> >
> > crash> rd -p 116017000 30
> > 116017000: 00000000 00000000 00000000 00000000 ................
> > 116017010: 00000000 00000000 00000000 00000000 ................
> > 116017020: 00000000 00000000 00000000 00000000 ................
> > 116017030: 00000000 00000000 00000000 00000000 ................
> > 116017040: 00000000 00000000 00000000 00000000 ................
> > 116017050: 00000000 00000000 00000000 00000000 ................
> > 116017060: 00000000 00000000 00000000 00000000 ................
> > 116017070: 00000000 00000000 ........
> > crash>
>
> Now we're reading the same physical address as you did on
> the dumpfile, and it's returning all zeroes. And the
> "module 0xf8a3ea00" above shows all zeroes from a higher
> location in the page because the same vmalloc translation is
> done to turn it into a physical address before reading it
> from the vmcore file. But instead of using the /dev/crash driver
> to access the translated physical memory, the crash utility
> uses the information from the ELF header's PT_LOAD segments
> to find out where to find the page data in the vmcore file.
>
> So, anyway, the "rd -p 116017000 30" command that you did
> on both the live system and the dumpfile should yield the same
> data.
>
> It seems like in all examples to date, the file data read
> at the greater-than-4GB PT_LOAD segment returns zeroes.
>
> You can verify this from the crash utility's viewpoint by
> doing a "help -n" during runtime when running with the dumpfile,
> which will show you both the actual contents of the ELF header,
> as well as the manner in which the PT_LOAD data is stored for
> its use. (It's also shown with the "crash -d7 ..." output).
>
> So again, from your original "crash.log" file, here is what the
> ELF header's PT_LOAD segment contains:
>
> Elf64_Phdr:
> p_type: 1 (PT_LOAD)
> p_offset: 3144876760 (bb7302d8)
> p_vaddr: ffffffffffffffff
> p_paddr: 100000000
> p_filesz: 1073741824 (40000000)
> p_memsz: 1073741824 (40000000)
> p_flags: 7 (PF_X|PF_W|PF_R)
> p_align: 0
>
> And this is what the crash utility stored in its internal
> data structure for that particular segment:
>
> pt_load_segment[4]:
> file_offset: bb7302d8
> phys_start: 100000000
> phys_end: 140000000
> zero_fill: 0
>
> And when the physical memory read request comes in, it filters
> to this part of the crash utility's read_netdump() function in
> netdump.c:
>
>         for (i = offset = 0; i < nd->num_pt_load_segments; i++) {
>                 pls = &nd->pt_load_segments[i];
>                 if ((paddr >= pls->phys_start) &&
>                     (paddr < pls->phys_end)) {
>                         offset = (off_t)(paddr - pls->phys_start) +
>                                 pls->file_offset;
>                         break;
>                 }
>                 if (pls->zero_fill && (paddr >= pls->phys_end) &&
>                     (paddr < pls->zero_fill)) {
>                         memset(bufptr, 0, cnt);
>                         return cnt;
>                 }
>         }
>
> So for any physical address request between 100000000 and 140000000
> (4GB to 5GB), it will calculate the offset to seek to by subtracting
> 100000000 from the incoming physical address, and adding the
> difference to the starting file offset of the whole segment.
>
> So if you wanted to, you could put debug code just prior to the
> "break" above that shows the pls->file_offset for a given incoming
> physical address.  But this code has been in place forever, so it's
> hard to conceive that somehow it's not working in the case of this
> dumpfile.  But presuming that it *does* go to the correct file offset
> location in the vmcore, and it's getting bogus data from there, then
> there's nothing that the crash utility can do about it.
>
> Dave
>
Re: [Crash-utility] "cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash
by Dave Anderson
----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
> So Dave, at this point am I correct in the assumption that it sounds like
> this is not a problem with crash, but with the dump file itself?
Anything's possible, but since you have shown that reading the same
(static) physical address returns different contents on the live system
and on the resultant dumpfile, I cannot conceive of it being a crash
problem.
> I tried one more go at modifying the kexec-tools to have the correct
> PAGE_OFFSET defined and still got the same type of results (all zeroes
> at the module's address), so that doesn't seem to be it.
At best you should see the proper unity-mapped virtual addresses
in the ELF header's PT_LOAD segments. But as I mentioned way back,
the crash utility does not use the PT_LOAD segment's p_vaddr field
for the x86 architecture, and is only concerned with the physical
addresses for each segment.
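To make that concrete, here is a minimal sketch of the translation in
question -- it simply mirrors the read_netdump() logic quoted further
down, with the names abbreviated:

        /* p_vaddr is ignored on x86; only the physical range matters */
        if ((paddr >= pls->phys_start) && (paddr < pls->phys_end))
                offset = pls->file_offset + (off_t)(paddr - pls->phys_start);

so even a correct p_vaddr in the ELF header would not change where crash
reads from in the vmcore.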
>
> Maybe this is a better question to take to the kexec mailing list, but
> do you know where the line is drawn between the kernel support or the
> userspace (kexec-tools)? I'm presuming that the kernel support is tied
> to each kernel (i.e. since I'm on 2.6.20 this issue could have been
> resolved in a more recent kernel). I'm wondering if I can pull a newer
> kexec-tools and that they might work with 2.6.20 and possibly have
> this issue resolved.
Both the kernel-space kexec/kdump and the user-space kexec-tools
developers converse on the kexec-list mailing list, so they should
be able to answer that better than me.
Dave
> -Kevin
>
> -----Original Message-----
> From: crash-utility-bounces(a)redhat.com
> [mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
> Sent: Wednesday, October 15, 2008 6:53 AM
> To: Discussion list for crash utility usage, maintenance and
> development
> Subject: Re: [Crash-utility] "cannot access vmalloc'd module memory"
> when loading kdump'ed vmcore in crash
>
>
> ----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
>
> > Hi Dave,
> >
> > Before you responded I noticed that a simple "make modules" didn't
> > work because my kernel wasn't exporting the symbol. Rather than do
> > anything risky/complex which might risk mucking up the
> troubleshooting
> > process, I just rebuilt the kernel. It built just fine and now I
> can
> > load crash and I see "DUMPFILE: /dev/crash" when I load up crash.
> Let
> > me try walking through the steps that you had me do previously,
> this
> > time using /dev/crash instead of /dev/mem and /dev/kmem
>
> You made one small error (but not totally fatal) in the suggested
> steps.
> See my comments below...
>
> >
> > From my limited understanding of what's going on here, it would
> > appear that the dump file is missing some data, or else crash is
> > looking in the wrong place for it.
>
> The crash utility is a slave to what is indicated in the PT_LOAD
> segments of the ELF header of the kdump vmcore. In the case of
> the physical memory chunk that starts at 4GB physical on your
> machine,
> this is what's in the ELF header (from your original "crash.log"
> file):
>
> Elf64_Phdr:
> p_type: 1 (PT_LOAD)
> p_offset: 3144876760 (bb7302d8)
> p_vaddr: ffffffffffffffff
> p_paddr: 100000000
> p_filesz: 1073741824 (40000000)
> p_memsz: 1073741824 (40000000)
> p_flags: 7 (PF_X|PF_W|PF_R)
> p_align: 0
>
>
> What that says is: for the range of physical memory starting
> at 0x100000000 (p_paddr), the vmcore contains a block of
> memory starting at file offset (p_offset) 3144876760/0xbb7302d8
> that is 1073741824/0x40000000 (p_filesz) bytes long.
>
> More simply put, the 1GB of physical memory from 4GB to 5GB
> can be found in the vmcore file starting at file offset 3144876760.
>
> So if a request for physical memory page 0x100000000 comes
> in, the crash utility reads from vmcore file offset 3144876760.
> If the next physical page were requested, i.e., at 0x100001000,
> it would read from vmcore file offset 3144876760+4096. It's
> as simple as that -- so when you suggest that "crash is looking
> in the wrong place for it", well, there's nothing that the
> crash utility can do differently.
>
> Now, back to the test sequence:
>
> > ---Live system---
> >
> > KERNEL: vmlinux-devcrash
> > DUMPFILE: /dev/crash
> > CPUS: 2
> > DATE: Tue Oct 14 16:08:28 2008
> > UPTIME: 00:02:07
> > LOAD AVERAGE: 0.17, 0.08, 0.03
> > TASKS: 97
> > NODENAME: test-machine
> > RELEASE: 2.6.20-17.39-custom2
> > VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> > MACHINE: i686 (2200 Mhz)
> > MEMORY: 5 GB
> > PID: 5628
> > COMMAND: "crash"
> > TASK: 5d4c2560 [THREAD_INFO: f3de6000]
> > CPU: 1
> > STATE: TASK_RUNNING (ACTIVE)
> >
> > crash> p modules
> > modules = $2 = {
> > next = 0xf8a3ea04,
> > prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> > state = MODULE_STATE_LIVE,
> > list = {
> > next = 0xf8d10484,
> > prev = 0x403c63a4
> > },
> > name =
> >
> "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
> >
> 000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
> >
> 000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > mkobj = {
> > kobj = {
> > k_name = 0xf8a3ea4c "crash",
> > name =
> > "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > kref = {
> > refcount = {
> > counter = 3
> > }
> > },
> > entry = {
> > next = 0x403c6068,
> > prev = 0xf8d104e4
> > },
> > parent = 0x403c6074
> > ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL PHYSICAL
> > f8a3ea00 116017a00
>
> OK -- so the physical memory location of the module data structure
> is at physical address 116017a00, but...
> >
> > PAGE DIRECTORY: 4044b000
> > PGD: 4044b018 => 6001
> > PMD: 6e28 => 1d51a067
> > PTE: 1d51a1f0 => 116017163
> > PAGE: 116017000
> >
> > PTE PHYSICAL FLAGS
> > 116017163 116017000 (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > 472c02e0 116017000 0 229173 1 80000000
> >
>
> You're reading from the beginning of the page, i.e., 116017000
> instead of where the module structure is at 116017a00:
>
> > crash> rd -p 116017000 30
> > 116017000: 53e58955 d089c389 4d8bca89 74c98508 U..S.......M...t
> > 116017010: 01e9831f b85b0d74 ffffffea ffffba5d ....t.[.....]...
> > 116017020: 03c3ffff 53132043 26b48d24 00000000 ....C .S$..&....
> > 116017030: 89204389 5d5b2453 26b48dc3 00000000 .C .S$[]...&....
> > 116017040: 83e58955 55892cec 08558be4 89f45d89 U....,.U..U..]..
> > 116017050: 7d89f875 ffeabffc 4d89ffff 8b028be0 u..}.......M....
> > 116017060: c3890452 ac0fd689 45890cf3 0ceec1ec R..........E....
> > 116017070: 5589c889 89d231f0 ...U.1..
> > crash>
> >
>
> So therefore you're not seeing the "crash" strings embedded in
> the raw physical data. Now, although it would have been "nice"
> if you could have shown the contents of the module structure via
> the physical address, the fact remains that since you used the
> /dev/crash driver, the "module 0xf8a3ea00" command required that
> the crash utility first translate the vmalloc address into its
> physical equivalent, and then read from there.
>
> In any case, you do have a dump of physical memory from 116017000
> which at least is in the same 4k page as the module data structure,
> so it should not change when read from the dumpfile.
>
> > ---Using dump file---
> >
> >
> > please wait... (gathering module symbol data)
> > WARNING: cannot access vmalloc'd module memory
> >
> > KERNEL: vmlinux-devcrash
> > DUMPFILE: /var/crash/vmcore
> > CPUS: 2
> > DATE: Tue Oct 14 16:09:32 2008
> > UPTIME: 00:03:12
> > LOAD AVERAGE: 0.09, 0.08, 0.02
> > TASKS: 97
> > NODENAME: test-machine
> > RELEASE: 2.6.20-17.39-custom2
> > VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> > MACHINE: i686 (2200 Mhz)
> > MEMORY: 5 GB
> > PANIC: "[ 192.148000] SysRq : Trigger a crashdump"
> > PID: 0
> > COMMAND: "swapper"
> > TASK: 403c0440 (1 of 2) [THREAD_INFO: 403f2000]
> > CPU: 0
> > STATE: TASK_RUNNING (SYSRQ)
> >
> > crash> p modules
> > modules = $2 = {
> > next = 0xf8a3ea04,
> > prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> > state = MODULE_STATE_LIVE,
> > list = {
> > next = 0x0,
> > prev = 0x0
> > },
> > name =
> >
> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> >
> 00\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> >
> 00\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> > 00\000",
> > mkobj = {
> > kobj = {
> > k_name = 0x0,
> > name =
> > "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0
> > 00\000\000",
> > kref = {
> > refcount = {
> > counter = 0
> > }
> > },
> > entry = {
> > next = 0x0,
> > prev = 0x0
> > ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL PHYSICAL
> > f8a3ea00 116017a00
> >
> > PAGE DIRECTORY: 4044b000
> > PGD: 4044b018 => 6001
> > PMD: 6e28 => 1d51a067
> > PTE: 1d51a1f0 => 116017163
> > PAGE: 116017000
> >
> > PTE PHYSICAL FLAGS
> > 116017163 116017000 (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > 472c02e0 116017000 0 229173 1 80000000
> >
> > crash> rd -p 116017000 30
> > 116017000: 00000000 00000000 00000000 00000000 ................
> > 116017010: 00000000 00000000 00000000 00000000 ................
> > 116017020: 00000000 00000000 00000000 00000000 ................
> > 116017030: 00000000 00000000 00000000 00000000 ................
> > 116017040: 00000000 00000000 00000000 00000000 ................
> > 116017050: 00000000 00000000 00000000 00000000 ................
> > 116017060: 00000000 00000000 00000000 00000000 ................
> > 116017070: 00000000 00000000 ........
> > crash>
>
> Now we're reading the same physical address as you did on
> the dumpfile, and it's returning all zeroes. And the
> "module 0xf8a3ea00" above shows all zeroes from a higher
> location in the page because the same vmalloc translation is
> done to turn it into a physical address before reading it
> from the vmcore file. But instead of using the /dev/crash driver
> to access the translated physical memory, the crash utility
> uses the information from the ELF header's PT_LOAD segments
> to find out where to find the page data in the vmcore file.
>
> So, anyway, the "rd -p 116017000 30" command that you did
> on both the live system and the dumpfile should yield the same
> data.
>
> It seems like in all examples to date, the file data read
> at the greater-than-4GB PT_LOAD segment returns zeroes.
>
> You can verify this from the crash utility's viewpoint by
> doing a "help -n" during runtime when running with the dumpfile,
> which will show you both the actual contents of the ELF header,
> as well as the manner in which the PT_LOAD data is stored for
> its use. (It's also shown with the "crash -d7 ..." output).
>
> So again, from your original "crash.log" file, here is what the
> ELF header's PT_LOAD segment contains:
>
> Elf64_Phdr:
> p_type: 1 (PT_LOAD)
> p_offset: 3144876760 (bb7302d8)
> p_vaddr: ffffffffffffffff
> p_paddr: 100000000
> p_filesz: 1073741824 (40000000)
> p_memsz: 1073741824 (40000000)
> p_flags: 7 (PF_X|PF_W|PF_R)
> p_align: 0
>
> And this is what the crash utility stored in its internal
> data structure for that particular segment:
>
> pt_load_segment[4]:
> file_offset: bb7302d8
> phys_start: 100000000
> phys_end: 140000000
> zero_fill: 0
>
> And when the physical memory read request comes in, it filters
> to this part of the crash utility's read_netdump() function in
> netdump.c:
>
>         for (i = offset = 0; i < nd->num_pt_load_segments; i++) {
>                 pls = &nd->pt_load_segments[i];
>                 if ((paddr >= pls->phys_start) &&
>                     (paddr < pls->phys_end)) {
>                         offset = (off_t)(paddr - pls->phys_start) +
>                                 pls->file_offset;
>                         break;
>                 }
>                 if (pls->zero_fill && (paddr >= pls->phys_end) &&
>                     (paddr < pls->zero_fill)) {
>                         memset(bufptr, 0, cnt);
>                         return cnt;
>                 }
>         }
>
> So for any physical address request between 100000000 and 140000000
> (4GB to 5GB), it will calculate the offset to seek to by subtracting
> 100000000 from the incoming physical address, and adding the
> difference to the starting file offset of the whole segment.
>
> So if you wanted to, you could put debug code just prior to the
> "break" above that shows the pls->file_offset for a given incoming
> physical address.  But this code has been in place forever, so it's
> hard to conceive that somehow it's not working in the case of this
> dumpfile.  But presuming that it *does* go to the correct file offset
> location in the vmcore, and it's getting bogus data from there, then
> there's nothing that the crash utility can do about it.
>
> Dave
>
[PATCH] Support linux-2.6.26 sparsemem kernel on i386.
by Ken'ichi Ohmichi
Hi Dave,
When 'kmem -p'/'kmem -i' is executed on an i386 linux-2.6.26 sparsemem kernel,
the current crash utility miscalculates the number of struct page entries in
each section and references an invalid struct page.
The cause is that SECTION_SIZE_BITS for PAE was changed from 30 to 29 in
linux-2.6.26 by the following:
[PATCH 3 of 4] sparsemem: reduce i386 PAE section size
http://www.uwsg.iu.edu/hypermail/linux/kernel/0803.3/1882.html
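For context, the practical effect of that change -- a sketch assuming 4KB
pages (PAGE_SHIFT == 12), not part of the patch itself:

        /* pages per sparsemem section = 1 << (SECTION_SIZE_BITS - PAGE_SHIFT) */
        /* PAE, 2.6.25 and earlier:  1 << (30 - 12) = 262144 pages per section */
        /* PAE, 2.6.26 and later:    1 << (29 - 12) = 131072 pages per section */

So a crash binary still assuming 30 bits expects twice as many struct page
entries per section as the kernel actually provides, and walks off into
invalid struct page data.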
With the attached patch applied, the crash utility catches up with the
above change.  It makes the results of both 'kmem -p' and 'kmem -i'
correct.  The attached patch is for crash-4.0-7.4.
There is the same problem in makedumpfile, and I will fix it :-)
Thanks
Ken'ichi Ohmichi
Signed-off-by: Ken'ichi Ohmichi <oomichi(a)mxs.nes.nec.co.jp>
---
diff -rpuN crash-4.0-7.4.orig/defs.h crash-4.0-7.4/defs.h
--- crash-4.0-7.4.orig/defs.h 2008-10-16 17:32:46.000000000 +0900
+++ crash-4.0-7.4/defs.h 2008-10-16 17:46:30.000000000 +0900
@@ -2069,7 +2069,8 @@ struct load_module {
#define TIF_SIGPENDING (2)
// CONFIG_X86_PAE
-#define _SECTION_SIZE_BITS_PAE 30
+#define _SECTION_SIZE_BITS_PAE_ORIG 30
+#define _SECTION_SIZE_BITS_PAE_2_6_26 29
#define _MAX_PHYSMEM_BITS_PAE 36
// !CONFIG_X86_PAE
diff -rpuN crash-4.0-7.4.orig/x86.c crash-4.0-7.4/x86.c
--- crash-4.0-7.4.orig/x86.c 2008-10-16 17:32:46.000000000 +0900
+++ crash-4.0-7.4/x86.c 2008-10-16 17:47:58.000000000 +0900
@@ -1819,7 +1819,12 @@ x86_init(int when)
}
if (machdep->flags & PAE) {
- machdep->section_size_bits = _SECTION_SIZE_BITS_PAE;
+ if (THIS_KERNEL_VERSION < LINUX(2,6,26))
+ machdep->section_size_bits =
+ _SECTION_SIZE_BITS_PAE_ORIG;
+ else
+ machdep->section_size_bits =
+ _SECTION_SIZE_BITS_PAE_2_6_26;
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS_PAE;
} else {
machdep->section_size_bits = _SECTION_SIZE_BITS;
_
Question re: xen hypervisor backtrace problem
by Dave Anderson
Hello Oda-san,
I have a xen-syms vmcore that hits a path that the hypervisor-related
changes in lkcd_x86_trace.c cannot handle. When the back trace runs
into the "process_softirqs" text return address reference from
"xen/arch/x86/x86_32/entry.S", it cannot go any further. Therefore
the backtrace fails, and in the recovery code it incorrectly searches
for a (vmlinux) eframe:
crash> bt -a
PCPU: 0 VCPU: ffbc7080
bt: cannot resolve stack trace:
#0 [ff1d3ebc] elf_core_save_regs at ff10a810
#1 [ff1d3ec4] common_interrupt at ff1222ed
#2 [ff1d3ed0] do_nmi at ff1335bb
#3 [ff1d3ef0] handle_nmi_mce at ff17442e
#4 [ff1d3f24] csched_tick at ff110aa7
#5 [ff1d3f80] timer_softirq_action at ff1155d2
#6 [ff1d3fa0] do_softirq at ff1143fe
#7 [ff1d3fb0] process_softirqs at ff173f61
bt: text symbols on stack:
[ff1d3ebc] disable_local_APIC at ff11db75
[ff1d3ec0] crash_nmi_callback at ff13cc96
[ff1d3ec4] common_interrupt at ff1222f2
[ff1d3ed0] do_nmi at ff1335c1
[ff1d3ef0] handle_nmi_mce at ff174435
[ff1d3f18] csched_tick at ff110aa7
[ff1d3f80] timer_softirq_action at ff1155d4
[ff1d3fa0] do_softirq at ff114405
[ff1d3fb0] process_softirqs at ff173f66
bt: invalid structure size: task_struct
FILE: x86.c LINE: 1576 FUNCTION: x86_eframe_search()
[/usr/bin/crash] error trace: 816373b => 8164497 => 810c40c => 813ed94
813ed94: SIZE_verify+126
810c40c: x86_eframe_search+1075
8164497: handle_trace_error+692
816373b: lkcd_x86_back_trace+2370
bt: invalid structure size: task_struct
FILE: x86.c LINE: 1576 FUNCTION: x86_eframe_search()
crash>
Now, the bogus vmlinux eframe search can be avoided by doing this in
handle_trace_error():
--- lkcd_x86_trace.c.orig 2008-10-14 15:46:33.000000000 -0400
+++ lkcd_x86_trace.c 2008-10-14 16:09:26.000000000 -0400
@@ -2440,12 +2441,14 @@ handle_trace_error(struct bt_info *bt, i
bt->flags |= BT_TEXT_SYMBOLS_PRINT|BT_ERROR_MASK;
back_trace(bt);
- bt->flags = BT_EFRAME_COUNT;
- if ((cnt = machdep->eframe_search(bt))) {
- error(INFO, "possible exception frame%s:\n",
- cnt > 1 ? "s" : "");
- bt->flags &= ~(ulonglong)BT_EFRAME_COUNT;
- machdep->eframe_search(bt);
+ if (!XEN_HYPER_MODE()) {
+ bt->flags = BT_EFRAME_COUNT;
+ if ((cnt = machdep->eframe_search(bt))) {
+ error(INFO, "possible exception frame%s:\n",
+ cnt > 1 ? "s" : "");
+ bt->flags &= ~(ulonglong)BT_EFRAME_COUNT;
+ machdep->eframe_search(bt);
+ }
}
}
After doing the above, the bt -a shows this, and therefore does
not fail prematurely:
crash> bt -a
PCPU: 0 VCPU: ffbc7080
bt: cannot resolve stack trace:
#0 [ff1d3ebc] elf_core_save_regs at ff10a810
#1 [ff1d3ec4] common_interrupt at ff1222ed
#2 [ff1d3ed0] do_nmi at ff1335bb
#3 [ff1d3ef0] handle_nmi_mce at ff17442e
#4 [ff1d3f24] csched_tick at ff110aa7
#5 [ff1d3f80] timer_softirq_action at ff1155d2
#6 [ff1d3fa0] do_softirq at ff1143fe
#7 [ff1d3fb0] process_softirqs at ff173f61
bt: text symbols on stack:
[ff1d3ebc] disable_local_APIC at ff11db75
[ff1d3ec0] crash_nmi_callback at ff13cc96
[ff1d3ec4] common_interrupt at ff1222f2
[ff1d3ed0] do_nmi at ff1335c1
[ff1d3ef0] handle_nmi_mce at ff174435
[ff1d3f18] csched_tick at ff110aa7
[ff1d3f80] timer_softirq_action at ff1155d4
[ff1d3fa0] do_softirq at ff114405
[ff1d3fb0] process_softirqs at ff173f66
PCPU: 1 VCPU: ff1b6080
...
Carrying it one step further, the relevant part of the stack from above
looks like this:
crash> rd -s ff1d3ebc 84
ff1d3ebc: disable_local_APIC+5 crash_nmi_callback+38 common_interrupt+82 cpu0_stack+16076
ff1d3ecc: 0003d027 do_nmi+49 cpu0_stack+16120 00000000
ff1d3edc: ffbca000 ffbcbeb0 00000030 cpu0_stack+16308
ff1d3eec: 0000e010 handle_nmi_mce+91 cpu0_stack+16120 00000100
ff1d3efc: 00000005 000000ff 000005dc ffbdee88
ff1d3f0c: 00000000 00000960 00020000 csched_tick+1239
ff1d3f1c: 0000e008 00000083 ffbc7080 00000030
ff1d3f2c: 0003d027 80000003 000583a8 per_cpu__schedule_data
ff1d3f3c: c840ceb2 00000000 ffbfda80 00000000
ff1d3f4c: 00000000 00000000 00000100 00000960
ff1d3f5c: ffbdee80 00000246 000000ff csched_priv+4
ff1d3f6c: 00000000 ffbfda8c __per_cpu_data_end+54972 e4c5d8d9
ff1d3f7c: 0000008b timer_softirq_action+132 00000000 ffbc7080
ff1d3f8c: per_cpu__timers 00000000 cpu0_stack+16308 0000007b
ff1d3f9c: eaed7700 do_softirq+53 00000000 ffbc7080
ff1d3fac: 0000007b process_softirqs+6 eb396d84 00000002
ff1d3fbc: c0678470 c0678470 00000002 eaed7700
ff1d3fcc: 00000000 000d0000 c04011a7 00000061
ff1d3fdc: 00000202 eb396d48 00000069 0000007b
ff1d3fec: 0000007b 00000000 00000000 00000000
ff1d3ffc: ffbc7080 ffffffff ffffffff ffffffff
crash>
Clearly "process_softirqs" is the last text return address
reference that the backtrace code can work with. So to try
to clean up the backtrace, I added this:
--- lkcd_x86_trace.c.orig 2008-10-14 15:46:33.000000000 -0400
+++ lkcd_x86_trace.c 2008-10-14 16:09:26.000000000 -0400
@@ -1423,6 +1423,7 @@ find_trace(
if (XEN_HYPER_MODE()) {
func_name = kl_funcname(pc);
if (STREQ(func_name, "idle_loop") || STREQ(func_name, "hypercall")
+ || STREQ(func_name, "process_softirqs")
|| STREQ(func_name, "tracing_off")
|| STREQ(func_name, "handle_exception")) {
UPDATE_FRAME(func_name, pc, 0, sp, bp, asp, 0, 0, bp - sp, 0);
which shows:
crash> bt -a
PCPU: 0 VCPU: ffbc7080
#0 [ff1d3ebc] elf_core_save_regs at ff10a810
#1 [ff1d3ec4] common_interrupt at ff1222ed
#2 [ff1d3ed0] do_nmi at ff1335bb
#3 [ff1d3ef0] handle_nmi_mce at ff17442e
#4 [ff1d3f24] csched_tick at ff110aa7
#5 [ff1d3f80] timer_softirq_action at ff1155d2
#6 [ff1d3fa0] do_softirq at ff1143fe
#7 [ff1d3fb0] process_softirqs at ff173f61
PCPU: 1 VCPU: ff1b6080
...
The first patch, which avoids the eframe search, would not be needed if
the second patch were applied, but it seems that it should be left in place
for other unforeseen possibilities in the future.
Do you agree with these changes?
Thanks,
Dave
Re: [Crash-utility] "cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash
by Dave Anderson
----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
> Thanks, Dave. Is it valid to just do "make modules" since it appears
> we're just adding a module or does the modification to
> arch/i386/mm/init.c necessitate a rebuilt kernel?
You might be able to *build* crash.o with "make modules", but if
you try to install it, it's going to fail because it won't
be able to resolve the "page_is_ram" reference.
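For what it's worth, the modification to arch/i386/mm/init.c referred to
above is presumably an export along these lines (the exact export flavor
is an assumption):

        /* arch/i386/mm/init.c: make page_is_ram() visible to modules */
        EXPORT_SYMBOL_GPL(page_is_ram);

Without something like that in the base kernel, installing crash.ko fails
with an unresolved "page_is_ram" symbol.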
There may be some other way to export a symbol from
the base kernel without rebuilding the kernel. I have
seen some 3rd-party modules (i.e., non-Red Hat) that
load a "rogue" module that tinkers with its own internal
exported symbol list after it is installed by overwriting
its own exported symbols with the symbol name and address
of un-exported base kernel symbols. Then, after the rogue module
gets installed (and overwrites its own list of exported symbols),
a second "real" module gets installed -- and the real module
uses the illegally-exported (?) kernel symbols from the first
rogue module. Seems like a violation of the GPL, but anyway,
I don't have any examples of how they do it.
Dave
Re: [Crash-utility] "cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash
by Dave Anderson
----- "Kevin Worth" <kevin.worth(a)hp.com> wrote:
> Hi Dave,
>
> I tried changing the PAGE_OFFSET definition in kexec-tools. Didn't
> seem to affect it- crash still fails to load the vmalloc'ed memory. If
> that seems like it absolves kexec-tools of any sins then perhaps we
> can drop the kexec-ml off the CC list.
>
> Your statement "Theoretically, anything at and above 0xb8000000 should
> fail." was accurate, which I saw on my live system (with no dump
> involved). Hoping this provides some insight.
Right -- but when you did the "module 0xf9102280", it read legitimate
data, even though the translated vmalloc address of 0xf9102280 shows a
physical address of 119b76280, i.e. well beyond the physical limit of
0xb8000000:
> crash> module 0xf9102280
> struct module {
> state = MODULE_STATE_LIVE,
> list = {
> next = 0xf9073d84,
> prev = 0x403c63a4
> },
> name = "custom_lkm"
> ...
> crash> vtop 0xf9102280
> VIRTUAL PHYSICAL
> f9102280 119b76280
> ...
> crash> rd -p 119b76000 30
> rd: read error: physical address: 119b76000 type: "32-bit PHYSADDR"
> ...
That being the case, I just remembered something that I had completely
forgotten about -- because of yet another Red Hat-imposed restriction.
On RHEL systems, we have the restricted /dev/mem, but in addition to
that /dev/kmem has been completely removed:
$ ls -l /dev/mem /dev/kmem
ls: /dev/kmem: No such file or directory
crw-r----- 1 root kmem 1, 1 Oct 6 09:17 /dev/mem
$
However, the crash utility, if it realizes that it cannot access a physical
address because it's bigger than the high_memory limit, does this in
read_dev_mem():
        /*
         *  /dev/mem disallows anything >= __pa(high_memory)
         *
         *  However it will allow 64-bit lseeks to anywhere, and when followed
         *  by pulling a 32-bit address from the 64-bit file position, it
         *  quietly returns faulty data from the (wrapped-around) address.
         */
        if (vt->high_memory && (paddr >= (physaddr_t)(VTOP(vt->high_memory)))) {
                readcnt = 0;
                errno = 0;
                goto try_dev_kmem;
        }
So the vmalloc read of 0xf9102280, whose physical address 0x119b76280 is
greater than the VTOP of 0xf8000000 (i.e. b8000000), will goto try_dev_kmem.
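Illustrative arithmetic -- the PAGE_OFFSET of 0x40000000 is inferred from
the kernel addresses shown in this thread, not stated explicitly:

        VTOP(vaddr)      = vaddr - PAGE_OFFSET
        VTOP(0xf8000000) = 0xf8000000 - 0x40000000 = 0xb8000000   (the cutoff)
        vtop(0xf9102280) -> 0x119b76280 via the vmalloc page tables (above it)

so the /dev/mem read is refused up front and control falls through to the
/dev/kmem path below.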
First, do you have a /dev/kmem on your system? If so, the read attempt
continues like so if the passed-in address was a vmalloc address:
        if ((readcnt != cnt) && BITS32() && !readcnt && !errno &&
            IS_VMALLOC_ADDR(addr))
                readcnt = read_dev_kmem(addr, bufptr, cnt);
You'll have to debug crash's memory.c:read_dev_mem() function to determine
whether it followed that path and successfully called read_dev_kmem() above
to get the legitimate module data.  That would be a possible (the only?)
explanation for what you're seeing.
Now, that all being said, debugging the above offers nothing towards the
debugging of the kdump/dumpfile issue.
Dave