November 2009 - Crash-utility - Crash Utility List Archives

by Adrien Kunysz

Earlier today I was pointed to a truncated vmcore that made crash(8) crash, and this prompted me to do some fuzzing. Before going further I would like to know if there is interest in fixing this kind of bug and if I should report them to Bugzilla. After all, most of these crashes are unlikely to happen in real life as long as the vmcores have not been purposefully tampered with.

The most common crash by far in my tests is this one. Consider an x86_64 vmcore file taken with the snap plugin:

    00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
    00000010  04 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
    00000020  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
    00000030  00 00 00 00 40 00 38 00  03 00 00 00 00 00 00 00  |....@.8.........|
    00000040  04 00 00 00 00 00 00 00  e8 00 00 00 00 00 00 00  |................|

If we change byte 0x4e:

    00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
    00000010  04 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
    00000020  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
    00000030  00 00 00 00 40 00 38 00  03 00 00 00 00 00 00 00  |....@.8.........|
    00000040  04 00 00 00 00 00 00 00  e8 00 00 00 00 00 80 00  |................|

This makes crash(8) segfault:

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004f1bf4 in dump_Elf64_Nhdr (offset=36028797018964200, store=1) at netdump.c:1807
    1807            notesize = (uint64_t)note->n_namesz + (uint64_t)note->n_descsz;
    (gdb) bt full
    #0  0x00000000004f1bf4 in dump_Elf64_Nhdr (offset=36028797018964200, store=1) at netdump.c:1807
            i = 0
            lf = 0
            words = 0
            note = (Elf64_Nhdr *) 0x800000159520c8
            len = 140737175810672
            buf = '\0' <repeats 1499 times>
            ptr = 0x800000159520d4 <Address 0x800000159520d4 out of bounds>
            uptr = (ulonglong *) 0x100000000
            iptr = (int *) 0x0
            up = (ulong *) 0x6f0617
            xen_core = 0
            vmcoreinfo = 0
            remaining = 0
            notesize = 362094736
    #1  0x00000000004ed99a in is_netdump (file=0x7fffed5f1bee "vmcore-sample-small.x86_64", source_query=128) at netdump.c:335
            i = 2
            fd = 6
            swap = 0
            elf32 = (Elf32_Ehdr *) 0x7fffed5ef8b0
            load32 = (Elf32_Phdr *) 0x0
            elf64 = (Elf64_Ehdr *) 0x7fffed5ef8b0
            load64 = (Elf64_Phdr *) 0x7fffed5ef928
            eheader = [...]
            buf = [...]
            size = 760
            len = 0
            tot = 0
            offset32 = 32767
            offset64 = 36028797018964200
            tmp_flags = 64
            tmp_elf_header = 0x15951fe0 "\177ELF\002\001\001"
    #2  0x00000000004f3e3b in is_kdump (file=0x7fffed5f1bee "vmcore-sample-small.x86_64", source_query=128) at netdump.c:2383
    No locals.
    #3  0x000000000044c892 in main (argc=2, argv=0x7fffed5f0cb8) at main.c:401
            i = <value optimized out>
            c = <value optimized out>
            option_index = 0

It looks like it should do more sanity checks on p_offset, but I am unsure how to fix this properly.

This is crash-4.1.1-0. The sample vmcore is too large to send by mail or to attach to Bugzilla, and I am not sure the crash core itself would be of much use.
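One possible reading of "more sanity checks on p_offset" is a bounds check done before the note pointer is computed from the header buffer. The following is only a minimal sketch in C, not code from the crash utility: the helper name `note_segment_in_bounds` and its parameters are hypothetical, and it assumes the caller knows the length of the buffer that the ELF headers were read into.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of crash): returns 1 when the byte range
 * [p_offset, p_offset + p_filesz) lies inside a buffer of buf_size bytes,
 * and 0 otherwise. The comparisons are ordered so that no addition can
 * wrap around on a hostile p_offset or p_filesz.
 */
int note_segment_in_bounds(uint64_t p_offset, uint64_t p_filesz,
                           uint64_t buf_size)
{
    if (p_offset > buf_size)
        return 0;                   /* segment starts past the end */
    if (p_filesz > buf_size - p_offset)
        return 0;                   /* segment runs off the end */
    return 1;
}
```

With the values from this report, and assuming the 760 shown as `size` in the backtrace is the buffer length, the original offset 0xe8 would pass such a check while the fuzzed offset 0x00800000000000e8 (36028797018964200 in decimal) would be rejected.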


Heads up: possible 2.6.31 kdump and crash utility failures

by Dave Anderson

You may have seen this discussion re: 2.6.31 kdump failures on the kexec(a)lists.infradead.org mailing list:

    Kdump issue with percpu_alloc=lpage (Was:Re: crash_notes posted to kexec-tools)
    http://lists.infradead.org/pipermail/kexec/2009-October/003587.html

or saw Vivek's subsequent post to LKML to address it:

    [PATCH] Fix kdump failure if booted with percpu_alloc=page
    http://lkml.org/lkml/2009/11/19/214

Basically if a 2.6.31 or later kernel is:

    (1) configured with CONFIG_NEED_MULTIPLE_NODES, and
    (2) the system actually has multiple NUMA nodes,

then it will use vmalloc space for its percpu data. In that case, the 2.6.31 kernel uses the "lpage" percpu memory allocator (subsequently renamed the "page" allocator) instead of the traditional "embed" percpu memory allocator.

At least on x86_64, this will cause the crash utility to fail during initialization, because it tries to read vmalloc memory prior to having set itself up to be able to walk page tables.

Prior to 4.1.1, it would fail with this error message:

    crash: read error: kernel virtual address: ffffc9000000e2f8  type: cpu number (per_cpu)

With 4.1.1 -- which quietly accepts the readmem failure above -- it fails later on with these two error messages:

    crash: cannot determine idle task addresses from init_tasks[] or runqueues[]
    crash: cannot resolve "init_task_union"

I believe that this only affects x86_64. I am testing a fix for it, which I will put in a new crash release in short order.

Dave


[ANNOUNCE] crash version 4.1.1 is available

by Dave Anderson

- Fix for a potential session initialization failure when running against 2.6.30 or later x86_64 kernel dumpfiles whose pages have been filtered by the makedumpfile facility. Without the patch, the session may fail with the error message "crash: page excluded: kernel virtual address: <address> type: cpu number (per_cpu)", but will initialize OK if the "--zero_excluded" command line option is used.
  (anderson(a)redhat.com)

- Added "lsmod" as a built-in alias for the "mod" command.
  (anderson(a)redhat.com)

- Added a defensive mechanism to handle corrupt Elf32_Nhdr/Elf64_Nhdr structures in an ELF vmcore. The fix no longer presumes that all Elf32_Nhdr/Elf64_Nhdr structure contents are legitimate, and if an invalid Elf32_Nhdr or Elf64_Nhdr structure is encountered, it will be ignored and a warning message will be displayed showing the structure contents, and the crash session will continue on. Without the patch, it was possible that an invalid n_namesz or n_descsz value could cause a segmentation violation when attempting to read the bogus note contents.
  (anderson(a)redhat.com)

- Fix for "mach -c" command option on 2.6.30 and later x86_64 kernels in which the per-cpu array x8664_pda data structures were replaced with per-cpu variables. Without the patch, the command displays just the boot cpu's cpuinfo data structure and then fails with the error message: "mach: invalid structure name: x8664_pda".
  (anderson(a)redhat.com)

- Fix to properly set the DEBUG exception stack size and stack base address on 2.6.18 and later x86_64 kernels. Without the patch, the DEBUG exception stack was presumed to be the same size as all of the other exception stacks, so in the extremely rare occurrence that a kernel crash started while running on a per-cpu DEBUG stack, the backtrace code would not recognize it as such, and would either start the trace using stale starting stack hooks, typically from "schedule" while running on the process stack, or the backtrace attempt would fail with the error message "bt: cannot transition from exception stack to current process stack".
  (anderson(a)redhat.com)

- Related to the above, when the x86_64 "bt" is displaying a trace segment from one of the five exception stacks, change the output from showing just "--- <exception stack> ..." to showing which exception stack it's working from, for example, "--- <NMI exception stack> ---" or "--- <DEBUG exception stack> ---", etc.
  (anderson(a)redhat.com)

- Fix for a session initialization failure when running against 2.6.30 or later x86_64 kernels if the number of possible cpus equals the kernel's configured NR_CPUS. Without the patch, the session fails with the error message "crash: invalid kernel virtual address: cc08 type: cpu number (per_cpu)".
  (bob.montgomery(a)hp.com)

- Preparations in the top-level source code for the integration of gdb-7.0. The current embedded version remains gdb-6.1.
  (anderson(a)redhat.com)

Download from: http://people.redhat.com/anderson
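The defensive Elf32_Nhdr/Elf64_Nhdr handling described in the changelog could be pictured roughly as follows. This is a hedged sketch in C and not the actual 4.1.1 patch: `struct note_hdr`, `note_align4` and `note_fits` are hypothetical names, the struct only mirrors the three 32-bit words of an ELF note header, and the usual 4-byte padding of the name and descriptor is assumed.

```c
#include <stdint.h>

/* Same field layout as Elf32_Nhdr/Elf64_Nhdr: three 32-bit words. */
struct note_hdr {
    uint32_t n_namesz;  /* length of the note name, in bytes */
    uint32_t n_descsz;  /* length of the note descriptor, in bytes */
    uint32_t n_type;    /* note type */
};

/* Round a length up to the 4-byte padding assumed for name and descriptor. */
uint64_t note_align4(uint64_t len)
{
    return (len + 3) & ~(uint64_t)3;
}

/*
 * Hypothetical check: returns 1 if the whole note (header, padded name,
 * padded descriptor) fits in the `remaining` bytes of the note segment,
 * and 0 if the sizes are bogus and the note should be skipped.
 * The sum is done in 64 bits, so two 32-bit sizes cannot overflow it.
 */
int note_fits(const struct note_hdr *note, uint64_t remaining)
{
    uint64_t total = sizeof(*note)
                   + note_align4(note->n_namesz)
                   + note_align4(note->n_descsz);

    return total <= remaining;
}
```

A caller along these lines would print the warning and move on when `note_fits()` returns 0, instead of dereferencing the note contents.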


Re: [Crash-utility] kmap

by Dave Anderson

----- "Darrin Thompson" <darrinth(a)gmail.com> wrote:

> On Wed, Nov 18, 2009 at 3:21 PM, Dave Anderson <anderson(a)redhat.com> wrote:
> > Or for what it's worth, you can just read the data using the physical
> > address:
> >
> > crash> rd -p 2b0000 10
> > 2b0000: 0100c70000080805 fff0db31fb000000 ............1...
> > 2b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
>
> That's exactly what I'm looking for.
>
> When I'm looking at:
>
> kmem -p ffff810104ffd258
> PAGE             PHYSICAL  MAPPING INDEX CNT FLAGS
> ffff810104ffd258 16da9d000 0       0     1   168100000000061
>
> How do I get the translation of the flags? I've seen something useful
> in vtop but I can never tell if it's giving me flags for the page
> struct at the pointer I give or the page struct that would have
> pointed the address I gave.

The flags you see in the "vtop" output are PTE flags and not page flags. For the page flags, you'll have to look at the kernel source code in "include/linux/page-flags.h". The usage of that bit-field changes way too much for it to be hardwired into the crash utility.

Dave


Re: [Crash-utility] kmap

by Dave Anderson

----- "Darrin Thompson" <darrinth(a)gmail.com> wrote:

> I'm finding a problem struct page in a kdump. I want to trace down
> what that page is referring to. For instance, if I could execute
> kmap(page), and run rd the pointer returned, what would I find there?
> I realize that this may not always be possible. What is the right way
> to attempt it? This is x86_64 if it matters.

If it's an x86_64, then calling kmap(page) ends up doing this on the page struct address:

    __va(page_to_pfn(page) << PAGE_SHIFT);

So, I'm presuming that you know the page structure address, but you want to know how to access the page data via its kmap'd virtual address.

So for example, suppose I know that the page structure address is ffff8100006ef680, then "kmem -p <page-address>" shows the physical address of the referenced page:

    crash> kmem -p ffff8100006ef680
    PAGE             PHYSICAL MAPPING INDEX CNT FLAGS
    ffff8100006ef680 2b0000   0       0     1   400
    crash>

For x86_64, it's then simply a matter of changing the physical address into its unity-mapped kernel virtual address (i.e. as returned by the __va() macro):

    crash> ptov 2b0000
    VIRTUAL          PHYSICAL
    ffff8100002b0000 2b0000
    crash>

So kmap(0xffff8100006ef680) would return ffff8100002b0000, which you can "rd":

    crash> rd ffff8100002b0000 10
    ffff8100002b0000: 0100c70000080805 fff0db31fb000000 ............1...
    ffff8100002b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
    ffff8100002b0020: 03f8ba046a0c7a8b 8d4c2c24748b0000 .z.j.......t$,L.
    ffff8100002b0030: f5e800000090248c fffffb37e9ffffe4 .$..........7...
    ffff8100002b0040: 03398330244c8b48 798300000156860f H.L$0.9...V....y
    crash>

Or for what it's worth, you can just read the data using the physical address:

    crash> rd -p 2b0000 10
    2b0000: 0100c70000080805 fff0db31fb000000 ............1...
    2b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
    2b0020: 03f8ba046a0c7a8b 8d4c2c24748b0000 .z.j.......t$,L.
    2b0030: f5e800000090248c fffffb37e9ffffe4 .$..........7...
    2b0040: 03398330244c8b48 798300000156860f H.L$0.9...V....y
    crash>

I *think* that's what you're asking...

Dave
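The address arithmetic in the message above can be checked with a few lines of C. This is only an illustrative sketch: the `SKETCH_` names are made up, 4 KiB pages are assumed, and the unity-map base is inferred from the ptov output in this message (2b0000 maps to ffff8100002b0000), so it applies to that particular kernel and not to x86_64 kernels in general.

```c
#include <stdint.h>

/* Assumptions for this one dump: 4 KiB pages, and a unity-map base
 * inferred from the ptov output (2b0000 -> ffff8100002b0000). */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PAGE_OFFSET  0xffff810000000000ULL

/* pfn << PAGE_SHIFT: page frame number to physical address. */
uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
    return pfn << SKETCH_PAGE_SHIFT;
}

/* __va(): physical address to unity-mapped kernel virtual address. */
uint64_t sketch_va(uint64_t phys)
{
    return phys + SKETCH_PAGE_OFFSET;
}
```

Under those assumptions, page frame 0x2b0 gives physical address 0x2b0000, and `sketch_va(0x2b0000)` gives 0xffff8100002b0000, matching what "ptov" printed.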


Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

by Dave Anderson

----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:

> On Wed, 2009-11-11 at 18:54 +0000, Dave Anderson wrote:
>
> > > > But another question is in the (extremely) rare circumstance of a
> > > > non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> > > > would be all NULL, and the symbol_value("per_cpu__cpu_number")
> > > > call would return the qualified unity-mapped address. So the
> > > > virtual address calculation should work in x86_64_per_cpu_init(),
> > > > and the loop wouldn't even be entered in x86_64_get_smp_cpus()
> > > >
> > > > That being said, I don't think I've seen a recent x86_64 kernel
> > > > that was not compiled CONFIG_SMP, so I can't confirm that it's
> > > > ever been tested.
> > > >
> > > > So for sanity's sake, maybe your patch should also be applied,
> > > > but should also check if the "i" index is non-zero?
>
> Now I'm thinking that test won't be needed for the non-CONFIG_SMP
> kernel. If the array is full of 0x0s, the loop will compute the first
> address as (0x0 + symbol_value("per_cpu__cpu_number")) and read a
> cpunumber of 0. Then on the next iteration, it will calculate the very
> same address again, and read the same cpunumber of 0. But now the test
> is against cpus==1, so that test will fail and we'll drop out of the
> loop, right?

Right!

> In the real smp case, we'll still try to read the small offset (cc08)
> like an address, but be spared any embarrassment by the QUIET|
> RETURN_ON_ERROR fix.

Just to be clear, I think that we agree that:

    (1) the QUIET|RETURN_ON_ERROR be applied in both functions,
    (2) the kt->__per_cpu_offset[] NULL-check should be completely dropped in x86_64_per_cpu_init(), and
    (3) the kt->__per_cpu_offset[] NULL-check should still be applied in x86_64_get_smp_cpus() since that loop pre-requires that it's SMP.

Dave
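The loop behaviour argued over in this message can be traced with a small stand-alone model. Everything here is a toy written for illustration and not crash source code: the two reader functions stand in for readmem() with QUIET|RETURN_ON_ERROR, the memory layouts they model are invented, `SKETCH_SYM` is an arbitrary stand-in for the symbol value, and the `check_null` flag simply switches the zero-entry test on or off.

```c
#include <stddef.h>

#define SKETCH_SYM 0xcc08UL  /* toy stand-in for the per_cpu__cpu_number value */

/* Returns 1 and stores a cpu number on success, 0 if addr is unreadable. */
typedef int (*sketch_read_fn)(unsigned long addr, int *cpunumber);

/* Toy SMP dump: cpu k's area sits at offset (k + 1) * 0x10000, 4 cpus.
 * Any other address, including the bare symbol value, is unreadable. */
int sketch_read_smp(unsigned long addr, int *cpunumber)
{
    unsigned long off = addr - SKETCH_SYM;

    if (off == 0 || off % 0x10000UL != 0 || off / 0x10000UL > 4)
        return 0;
    *cpunumber = (int)(off / 0x10000UL) - 1;
    return 1;
}

/* Toy non-SMP dump: only the bare symbol address is readable, as cpu 0. */
int sketch_read_up(unsigned long addr, int *cpunumber)
{
    if (addr != SKETCH_SYM)
        return 0;
    *cpunumber = 0;
    return 1;
}

/* Count cpus until a zero entry (optional), a failed read, or a mismatch. */
int sketch_count_cpus(const unsigned long *per_cpu_offset, size_t nr_cpus,
                      int check_null, sketch_read_fn read_cpu)
{
    int cpus = 0, cpunumber = 0;
    size_t i;

    for (i = 0; i < nr_cpus; i++) {
        if (check_null && per_cpu_offset[i] == 0)
            break;              /* past what was loaded from the dump */
        if (!read_cpu(SKETCH_SYM + per_cpu_offset[i], &cpunumber))
            break;              /* unreadable address: quietly stop */
        if (cpunumber != cpus)
            break;              /* not the next cpu's per-cpu area */
        cpus++;
    }
    return cpus;
}
```

In this model a table of four valid offsets followed by zeros counts 4 cpus with or without the zero check, because the failed read also ends the loop. An all-zero table read through `sketch_read_up` counts 1 cpu when the zero check is off, which is the second-iteration exit described in the quoted text, and 0 when the check is on.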


kmap

by Darrin Thompson

I'm finding a problem struct page in a kdump. I want to trace down what that page is referring to. For instance, if I could execute kmap(page), and run rd the pointer returned, what would I find there? I realize that this may not always be possible. What is the right way to attempt it? This is x86_64 if it matters.

--
Darrin


Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

by Dave Anderson

----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:

> On Wed, 2009-11-11 at 14:52 +0000, Dave Anderson wrote:
> > ----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> >
> > > I have a dump from a 2.6.31-based x86_64 system where the number of
> > > "possible" cpus equals the system's NR_CPUS (32).
> > > On that system, the __per_cpu_offset table in the kernel consists of 32
> > > valid offset pointers.
>
> > I have a similar-but-different fix queued for this, but instead of
> > checking for a NULL kt->__per_cpu_offset[i] entry, it changes the
> > readmem() call to RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR
> > like this:
> >
> >     if (!readmem(symbol_value("per_cpu__cpu_number") +
> >         kt->__per_cpu_offset[i],
> >         KVADDR, &cpunumber, sizeof(int),
> >         "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
> >             break;
> >
> > That should prevent the failure you're seeing.
>
> I did that first, and thought it was sort of cheating :-)

Sort of. But at that point in time we're still kind of blindly wading around in the murk trying to figure out what we're running on...

> > But another question is in the (extremely) rare circumstance of a
> > non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> > would be all NULL, and the symbol_value("per_cpu__cpu_number")
> > call would return the qualified unity-mapped address. So the
> > virtual address calculation should work in x86_64_per_cpu_init(),
> > and the loop wouldn't even be entered in x86_64_get_smp_cpus()
> >
> > That being said, I don't think I've seen a recent x86_64 kernel
> > that was not compiled CONFIG_SMP, so I can't confirm that it's
> > ever been tested.
> >
> > So for sanity's sake, maybe your patch should also be applied,
> > but should also check if the "i" index is non-zero?
>
> So like this?
>
> +               if (i && (kt->__per_cpu_offset[i] == NULL))
> +                       break;

Yes.

> So it's always ok to try the readmem on the first element of
> the array. And the RETURN_ON_ERROR would deal with something going
> wrong with that, although that case would presumably be a real
> problem with the dump, right? (cpus == 0)

Most likely yes. The motivation for my fix was due to a failure attempting to readmem() a legitimate virtual address that was in an excluded page from a makedumpfile-generated dump. If I recall correctly, it was an in-house kexec-tools bugzilla, but I can't find it.

Dave


Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

by Dave Anderson

----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:

> I have a dump from a 2.6.31-based x86_64 system where the number of
> "possible" cpus equals the system's NR_CPUS (32).
> On that system, the __per_cpu_offset table in the kernel consists of 32
> valid offset pointers.
>
> When crash loads this table into its __per_cpu_offset[NR_CPUS=4096]
> array in struct kernel_table, it knows the length of the kernel's array
> (32*sizeof(long)), and copies the 32 pointers, leaving the rest of its
> (much longer) array full of 0x0s.
>
> (This happens in kernel.c)
>
>     if (symbol_exists("__per_cpu_offset")) {
>             if (LKCD_KERNTYPES())
>                     i = get_cpus_possible();
>             else
>                     i = get_array_length("__per_cpu_offset", NULL, 0);
>             get_symbol_data("__per_cpu_offset",
>                     sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
>                     &kt->__per_cpu_offset[0]);
>             kt->flags |= PER_CPU_OFF;
>     }
>
> Later, in a couple of places, crash checks for the maximum valid
> __per_cpu_offset by reading the cpu_number value out of each per_cpu
> area and comparing it to the expected number until the comparison fails.
> (Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and
> that's OK).
>
> From x86_64.c:
>
>     for (i = cpus = 0; i < NR_CPUS; i++) {
>             readmem(symbol_value("per_cpu__cpu_number") +
>                     kt->__per_cpu_offset[i], KVADDR,
>                     &cpunumber, sizeof(int),
>                     "cpu number (per_cpu)", FAULT_ON_ERROR);
>             if (cpunumber != cpus)
>                     break;
>             cpus++;
>     }
>
> This works well when the kernel's array has fewer real per_cpu_offsets
> than its own NR_CPUS, since the kernel preloads its array with a pointer
> (BOOT_PERCPU_OFFSET) and when this loop runs past the real
> per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it
> reads a bogus value for cpunumber and terminates.
>
> But when the kernel's table is full of valid per_cpu_offset pointers,
> this loop continues off the end of that into the part of crash's
> __per_cpu_offset array that has the 0x0 initial values, and dies with:
>
> crash: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
>
> The cc08 comes from the symbol_value of per_cpu__cpu_number:
> 000000000000cc08 D per_cpu__cpu_number
>
> Bottom line: Crash is assuming an insufficient array termination for
> the kernel's __per_cpu_offset array (a pointer that points to an invalid
> cpu_number).
>
> The included patch adds an additional loop termination so that crash
> doesn't run off the end of what it loaded from the dump. It just checks
> for a NULL 0x0 value in kt->__per_cpu_offset[i].
>
> Bob Montgomery,
> Working at HP

I have a similar-but-different fix queued for this, but instead of checking for a NULL kt->__per_cpu_offset[i] entry, it changes the readmem() call to RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR like this:

    if (!readmem(symbol_value("per_cpu__cpu_number") +
        kt->__per_cpu_offset[i],
        KVADDR, &cpunumber, sizeof(int),
        "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
            break;

That should prevent the failure you're seeing.

But another question is in the (extremely) rare circumstance of a non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array would be all NULL, and the symbol_value("per_cpu__cpu_number") call would return the qualified unity-mapped address. So the virtual address calculation should work in x86_64_per_cpu_init(), and the loop wouldn't even be entered in x86_64_get_smp_cpus().

That being said, I don't think I've seen a recent x86_64 kernel that was not compiled CONFIG_SMP, so I can't confirm that it's ever been tested.

So for sanity's sake, maybe your patch should also be applied, but should also check if the "i" index is non-zero?

Thanks,
Dave


invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

by Bob Montgomery

I have a dump from a 2.6.31-based x86_64 system where the number of "possible" cpus equals the system's NR_CPUS (32). On that system, the __per_cpu_offset table in the kernel consists of 32 valid offset pointers.

When crash loads this table into its __per_cpu_offset[NR_CPUS=4096] array in struct kernel_table, it knows the length of the kernel's array (32*sizeof(long)), and copies the 32 pointers, leaving the rest of its (much longer) array full of 0x0s.

(This happens in kernel.c)

    if (symbol_exists("__per_cpu_offset")) {
            if (LKCD_KERNTYPES())
                    i = get_cpus_possible();
            else
                    i = get_array_length("__per_cpu_offset", NULL, 0);
            get_symbol_data("__per_cpu_offset",
                    sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
                    &kt->__per_cpu_offset[0]);
            kt->flags |= PER_CPU_OFF;
    }

Later, in a couple of places, crash checks for the maximum valid __per_cpu_offset by reading the cpu_number value out of each per_cpu area and comparing it to the expected number until the comparison fails. (Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and that's OK).

From x86_64.c:

    for (i = cpus = 0; i < NR_CPUS; i++) {
            readmem(symbol_value("per_cpu__cpu_number") +
                    kt->__per_cpu_offset[i], KVADDR,
                    &cpunumber, sizeof(int),
                    "cpu number (per_cpu)", FAULT_ON_ERROR);
            if (cpunumber != cpus)
                    break;
            cpus++;
    }

This works well when the kernel's array has fewer real per_cpu_offsets than its own NR_CPUS, since the kernel preloads its array with a pointer (BOOT_PERCPU_OFFSET) and when this loop runs past the real per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it reads a bogus value for cpunumber and terminates.

But when the kernel's table is full of valid per_cpu_offset pointers, this loop continues off the end of that into the part of crash's __per_cpu_offset array that has the 0x0 initial values, and dies with:

    crash: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

The cc08 comes from the symbol_value of per_cpu__cpu_number:

    000000000000cc08 D per_cpu__cpu_number

Bottom line: Crash is assuming an insufficient array termination for the kernel's __per_cpu_offset array (a pointer that points to an invalid cpu_number).

The included patch adds an additional loop termination so that crash doesn't run off the end of what it loaded from the dump. It just checks for a NULL 0x0 value in kt->__per_cpu_offset[i].

Bob Montgomery,
Working at HP
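The table-loading step from kernel.c, and the zero-filled tail it leaves behind, can be modelled in isolation. The sketch below is not crash source code: the `sketch_` names are hypothetical, a plain memcpy stands in for get_symbol_data(), and the source array is assumed to hold at least as many entries as the function decides to copy.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS 4096  /* crash-side array length quoted in the message */

/* Mirrors the expression (i && (i <= NR_CPUS)) ? i : NR_CPUS. */
size_t sketch_entries_to_copy(size_t kernel_len)
{
    return (kernel_len && kernel_len <= SKETCH_NR_CPUS)
               ? kernel_len : SKETCH_NR_CPUS;
}

/*
 * Zero the crash-side table, then copy the kernel's entries into it.
 * `src` must hold at least sketch_entries_to_copy(kernel_len) entries.
 * Returns the number of entries copied; everything after them stays 0.
 */
size_t sketch_load_offsets(unsigned long *dst, const unsigned long *src,
                           size_t kernel_len)
{
    size_t n = sketch_entries_to_copy(kernel_len);

    memset(dst, 0, SKETCH_NR_CPUS * sizeof(*dst));
    memcpy(dst, src, n * sizeof(*src));
    return n;
}
```

For a 32-entry kernel table this copies 32 offsets and leaves entries 32 through 4095 at zero, which is the region the counting loop wanders into when all 32 entries are valid.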

