fuzzing crash(8)
by Adrien Kunysz
Earlier today I was pointed to a truncated vmcore that made crash(8) crash and this prompted me to do some fuzzing.
Before going further I would like to know if there is interest in fixing this kind of bug and whether I should report
them to Bugzilla. After all, most of these crashes are unlikely to happen in real life as long as the vmcores have not
been purposefully tampered with.
The most common crash by far in my tests is this one:
Consider an x86_64 vmcore file taken with the snap plugin:
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 04 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 |..>.............|
00000020 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |@...............|
00000030 00 00 00 00 40 00 38 00 03 00 00 00 00 00 00 00 |....@.8.........|
00000040 04 00 00 00 00 00 00 00 e8 00 00 00 00 00 00 00 |................|
If we change byte 0x4e:
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 04 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 |..>.............|
00000020 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |@...............|
00000030 00 00 00 00 40 00 38 00 03 00 00 00 00 00 00 00 |....@.8.........|
00000040 04 00 00 00 00 00 00 00 e8 00 00 00 00 00 80 00 |................|
This makes crash(8) segfault:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004f1bf4 in dump_Elf64_Nhdr (offset=36028797018964200, store=1) at netdump.c:1807
1807 notesize = (uint64_t)note->n_namesz + (uint64_t)note->n_descsz;
(gdb) bt full
#0 0x00000000004f1bf4 in dump_Elf64_Nhdr (offset=36028797018964200, store=1) at netdump.c:1807
i = 0
lf = 0
words = 0
note = (Elf64_Nhdr *) 0x800000159520c8
len = 140737175810672
buf = '\0' <repeats 1499 times>
ptr = 0x800000159520d4 <Address 0x800000159520d4 out of bounds>
uptr = (ulonglong *) 0x100000000
iptr = (int *) 0x0
up = (ulong *) 0x6f0617
xen_core = 0
vmcoreinfo = 0
remaining = 0
notesize = 362094736
#1 0x00000000004ed99a in is_netdump (file=0x7fffed5f1bee "vmcore-sample-small.x86_64",
source_query=128) at netdump.c:335
i = 2
fd = 6
swap = 0
elf32 = (Elf32_Ehdr *) 0x7fffed5ef8b0
load32 = (Elf32_Phdr *) 0x0
elf64 = (Elf64_Ehdr *) 0x7fffed5ef8b0
load64 = (Elf64_Phdr *) 0x7fffed5ef928
eheader = [...]
buf = [...]
size = 760
len = 0
tot = 0
offset32 = 32767
offset64 = 36028797018964200
tmp_flags = 64
tmp_elf_header = 0x15951fe0 "\177ELF\002\001\001"
#2 0x00000000004f3e3b in is_kdump (file=0x7fffed5f1bee "vmcore-sample-small.x86_64", source_query=128)
at netdump.c:2383
No locals.
#3 0x000000000044c892 in main (argc=2, argv=0x7fffed5f0cb8) at main.c:401
i = <value optimized out>
c = <value optimized out>
option_index = 0
It looks like crash should do more sanity checking on p_offset, but I am unsure how to fix this properly.
This is crash-4.1.1-0. The sample vmcore is too large to send by mail or to attach to Bugzilla and I am not sure the
crash core itself would be of much use.
14 years, 10 months
Heads up: possible 2.6.31 kdump and crash utility failures
by Dave Anderson
You may have seen this discussion re: 2.6.31 kdump failures on the
kexec(a)lists.infradead.org mailing list:
Kdump issue with percpu_alloc=lpage (Was:Re: crash_notes posted to kexec-tools)
http://lists.infradead.org/pipermail/kexec/2009-October/003587.html
or saw Vivek's subsequent post to LKML to address it:
[PATCH] Fix kdump failure if booted with percpu_alloc=page
http://lkml.org/lkml/2009/11/19/214
Basically if a 2.6.31 or later kernel is:
(1) configured with CONFIG_NEED_MULTIPLE_NODES, and
(2) the system actually has multiple NUMA nodes,
then it will use vmalloc space for its percpu data. In that case, the 2.6.31
kernel uses the "lpage" percpu memory allocator (subsequently renamed the
"page" allocator) instead of the traditional "embed" percpu memory allocator.
At least on x86_64, this will cause the crash utility to fail during
initialization, because it tries to read vmalloc memory prior to having
set itself up to be able to walk page tables.
Prior to 4.1.1, it would fail with this error message:
crash: read error: kernel virtual address: ffffc9000000e2f8 type: cpu number (per_cpu)
With 4.1.1 -- which quietly accepts the readmem failure above -- it fails later on
with these two error messages:
crash: cannot determine idle task addresses from init_tasks[] or runqueues[]
crash: cannot resolve "init_task_union"
I believe that this only affects x86_64. I am testing a fix for it, which
I will put in a new crash release in short order.
Dave
14 years, 10 months
[ANNOUNCE] crash version 4.1.1 is available
by Dave Anderson
- Fix for a potential session initialization failure when running
against 2.6.30 or later x86_64 kernel dumpfiles whose pages have been
filtered by the makedumpfile facility. Without the patch, the
session may fail with the error message "crash: page excluded: kernel
virtual address: <address> type: cpu number (per_cpu)", but will
initialize OK if the "--zero_excluded" command line option is used.
(anderson(a)redhat.com)
- Added "lsmod" as a built-in alias for the "mod" command.
(anderson(a)redhat.com)
- Added a defensive mechanism to handle corrupt Elf32_Nhdr/Elf64_Nhdr
structures in an ELF vmcore. The fix no longer presumes that all
Elf32_Nhdr/Elf64_Nhdr structure contents are legitimate, and if an
invalid Elf32_Nhdr or Elf64_Nhdr structure is encountered, it will
be ignored and a warning message will be displayed showing the
structure contents, and the crash session will continue on. Without
the patch, it was possible that an invalid n_namesz or n_descsz
value could cause a segmentation violation when attempting to read
the bogus note contents.
(anderson(a)redhat.com)
- Fix for "mach -c" command option on 2.6.30 and later x86_64 kernels
in which the per-cpu array x8664_pda data structures were replaced
with per-cpu variables. Without the patch, the command displays
just the boot cpu's cpuinfo data structure and then fails with the
error message: "mach: invalid structure name: x8664_pda".
(anderson(a)redhat.com)
- Fix to properly set the DEBUG exception stack size and stack base
address on 2.6.18 and later x86_64 kernels. Without the patch, the
DEBUG exception stack was presumed to be the same size as all of the
other exception stacks, so in the extremely rare occurrence that a
kernel crash started while running on a per-cpu DEBUG stack, the
backtrace code would not recognize it as such, and would either start
the trace using stale starting stack hooks, typically from "schedule"
while running on the process stack, or the backtrace attempt would
fail with the error message "bt: cannot transition from exception
stack to current process stack".
(anderson(a)redhat.com)
- Related to the above, when the x86_64 "bt" is displaying a trace
segment from one of the five exception stacks, change the output from
showing just "--- <exception stack> ..." to showing which exception
stack it's working from, for example, "--- <NMI exception stack> ---"
or "--- <DEBUG exception stack> ---", etc.
(anderson(a)redhat.com)
- Fix for a session initialization failure when running against 2.6.30
or later x86_64 kernels if the number of possible cpus equals the
kernel's configured NR_CPUS. Without the patch, the session fails
with the error message "crash: invalid kernel virtual address: cc08
type: cpu number (per_cpu)".
(bob.montgomery(a)hp.com)
- Preparations in the top-level source code for the integration of
gdb-7.0. The current embedded version remains gdb-6.1.
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
14 years, 10 months
Re: [Crash-utility] kmap
by Dave Anderson
----- "Darrin Thompson" <darrinth(a)gmail.com> wrote:
> On Wed, Nov 18, 2009 at 3:21 PM, Dave Anderson <anderson(a)redhat.com>
> wrote:
> > Or for what it's worth, you can just read the data using the physical
> > address:
> >
> > crash> rd -p 2b0000 10
> > 2b0000: 0100c70000080805 fff0db31fb000000 ............1...
> > 2b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
>
> That's exactly what I'm looking for.
>
> When I'm looking at:
>
> kmem -p ffff810104ffd258
> PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> ffff810104ffd258 16da9d000 0 0 1 168100000000061
>
> How do I get the translation of the flags? I've seen something useful
> in vtop but I can never tell if it's giving me flags for the page
> struct at the pointer I give or the page struct that would have
> pointed to the address I gave.
The flags you see in the "vtop" output are PTE flags and not page flags.
For the page flags, you'll have to look at the kernel source code
in "include/linux/page-flags.h". The usage of that bit-field changes
way too much for it to be hardwired into the crash utility.
Dave
14 years, 10 months
Re: [Crash-utility] kmap
by Dave Anderson
----- "Darrin Thompson" <darrinth(a)gmail.com> wrote:
> I'm finding a problem struct page in a kdump. I want to trace down
> what that page is referring to. For instance, if I could execute
> kmap(page), and run rd on the pointer returned, what would I find there?
> I realize that this may not always be possible. What is the right way
> to attempt it? This is x86_64 if it matters.
If it's an x86_64, then calling kmap(page) ends up doing this
on the page struct address:
__va(page_to_pfn(page) << PAGE_SHIFT);
So, I'm presuming that you know the page structure address, but you
want to know how to access the page data via its kmap'd virtual
address.
So for example, suppose I know that the page structure address
is ffff8100006ef680, then "kmem -p <page-address>" shows the
physical address of the referenced page:
crash> kmem -p ffff8100006ef680
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffff8100006ef680 2b0000 0 0 1 400
crash>
For x86_64, it's then simply a matter of changing the physical
address into its unity-mapped kernel virtual address (i.e. as
returned by the __va() macro):
crash> ptov 2b0000
VIRTUAL PHYSICAL
ffff8100002b0000 2b0000
crash>
So kmap(0xffff8100006ef680) would return ffff8100002b0000, which
you can "rd":
crash> rd ffff8100002b0000 10
ffff8100002b0000: 0100c70000080805 fff0db31fb000000 ............1...
ffff8100002b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
ffff8100002b0020: 03f8ba046a0c7a8b 8d4c2c24748b0000 .z.j.......t$,L.
ffff8100002b0030: f5e800000090248c fffffb37e9ffffe4 .$..........7...
ffff8100002b0040: 03398330244c8b48 798300000156860f H.L$0.9...V....y
crash>
Or for what it's worth, you can just read the data using the physical
address:
crash> rd -p 2b0000 10
2b0000: 0100c70000080805 fff0db31fb000000 ............1...
2b0010: 8b485500313e6b05 c931c03145302454 .k>1.UH.T$0E1.1.
2b0020: 03f8ba046a0c7a8b 8d4c2c24748b0000 .z.j.......t$,L.
2b0030: f5e800000090248c fffffb37e9ffffe4 .$..........7...
2b0040: 03398330244c8b48 798300000156860f H.L$0.9...V....y
crash>
I *think* that's what you're asking...
Dave
14 years, 10 months
Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
by Dave Anderson
----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> On Wed, 2009-11-11 at 18:54 +0000, Dave Anderson wrote:
>
> > > > But another question is in the (extremely) rare circumstance of a
> > > > non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> > > > would be all NULL, and the symbol_value("per_cpu__cpu_number")
> > > > call would return the qualified unity-mapped address. So the
> > > > virtual address calculation should work in x86_64_per_cpu_init(),
> > > > and the loop wouldn't even be entered in x86_64_get_smp_cpus()
> > > >
> > > > That being said, I don't think I've seen a recent x86_64 kernel
> > > > that was not compiled CONFIG_SMP, so I can't confirm that it's
> > > > ever been tested.
> > > >
> > > > So for sanity's sake, maybe your patch should also be applied,
> > > > but should also check if the "i" index is non-zero?
>
> Now I'm thinking that test won't be needed for the non-CONFIG_SMP
> kernel. If the array is full of 0x0s, the loop will compute the first
> address as (0x0 + symbol_value("per_cpu__cpu_number")) and read a
> cpunumber of 0. Then on the next iteration, it will calculate the very
> same address again, and read the same cpunumber of 0. But now the test
> is against cpus==1, so that test will fail and we'll drop out of the
> loop, right?
Right!
> In the real smp case, we'll still try to read the small offset (cc08)
> like an address, but be spared any embarrassment by the
> QUIET|RETURN_ON_ERROR fix.
Just to be clear, I think that we agree that:
(1) the QUIET|RETURN_ON_ERROR be applied in both functions,
(2) the kt->__per_cpu_offset[] NULL-check should be completely dropped
in x86_64_per_cpu_init(), and
(3) the kt->__per_cpu_offset[] NULL-check should still be applied in
x86_64_get_smp_cpus() since that loop pre-requires that it's SMP.
Dave
14 years, 10 months
kmap
by Darrin Thompson
I'm finding a problem struct page in a kdump. I want to trace down
what that page is referring to. For instance, if I could execute
kmap(page), and run rd on the pointer returned, what would I find there?
I realize that this may not always be possible. What is the right way
to attempt it? This is x86_64 if it matters.
--
Darrin
14 years, 10 months
Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
by Dave Anderson
----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> On Wed, 2009-11-11 at 14:52 +0000, Dave Anderson wrote:
> > ----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> >
> > > I have a dump from a 2.6.31-based x86_64 system where the number of
> > > "possible" cpus equals the system's NR_CPUS (32).
> > > On that system, the __per_cpu_offset table in the kernel consists of 32
> > > valid offset pointers.
>
> > I have a similar-but-different fix queued for this, but instead of
> > checking for a NULL kt->__per_cpu_offset[i] entry, it changes the
> > readmem() call to RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR
> > like this:
> >
> > if (!readmem(symbol_value("per_cpu__cpu_number") +
> > kt->__per_cpu_offset[i],
> > KVADDR, &cpunumber, sizeof(int),
> > "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
> > break;
>
> > That should prevent the failure you're seeing.
>
> I did that first, and thought it was sort of cheating :-)
Sort of. But at that point in time we're still kind of blindly
wading around in the murk trying to figure out what we're
running on...
>
> > But another question is in the (extremely) rare circumstance of a
> > non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> > would be all NULL, and the symbol_value("per_cpu__cpu_number")
> > call would return the qualified unity-mapped address. So the
> > virtual address calculation should work in x86_64_per_cpu_init(),
> > and the loop wouldn't even be entered in x86_64_get_smp_cpus()
> >
> > That being said, I don't think I've seen a recent x86_64 kernel
> > that was not compiled CONFIG_SMP, so I can't confirm that it's
> > ever been tested.
> >
> > So for sanity's sake, maybe your patch should also be applied,
> > but should also check if the "i" index is non-zero?
>
> So like this?
> + if (i && (kt->__per_cpu_offset[i] == NULL))
> + break;
Yes.
>
> So it's always ok to try the readmem on the first element of
> the array. And the RETURN_ON_ERROR would deal with something going
> wrong with that, although that case would presumably be a real
> problem with the dump, right? (cpus == 0)
Most likely yes. The motivation for my fix was due to a failure
attempting to readmem() a legitimate virtual address that was in an
excluded page from a makedumpfile-generated dump. If I recall
correctly, it was an in-house kexec-tools bugzilla, but I can't
find it.
Dave
14 years, 11 months
Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
by Dave Anderson
----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> I have a dump from a 2.6.31-based x86_64 system where the number of
> "possible" cpus equals the system's NR_CPUS (32).
> On that system, the __per_cpu_offset table in the kernel consists of 32
> valid offset pointers.
>
> When crash loads this table into its __per_cpu_offset[NR_CPUS=4096]
> array in struct kernel_table, it knows the length of the kernel's array
> (32*sizeof(long)), and copies the 32 pointers, leaving the rest of its
> (much longer) array full of 0x0s.
>
> (This happens in kernel.c)
>
>     if (symbol_exists("__per_cpu_offset")) {
>         if (LKCD_KERNTYPES())
>             i = get_cpus_possible();
>         else
>             i = get_array_length("__per_cpu_offset", NULL, 0);
>         get_symbol_data("__per_cpu_offset",
>             sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
>             &kt->__per_cpu_offset[0]);
>         kt->flags |= PER_CPU_OFF;
>     }
>
> Later, in a couple of places, crash checks for the maximum valid
> __per_cpu_offset by reading the cpu_number value out of each per_cpu
> area and comparing it to the expected number until the comparison fails.
> (Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and
> that's OK).
>
> From x86_64.c:
>
>     for (i = cpus = 0; i < NR_CPUS; i++) {
>         readmem(symbol_value("per_cpu__cpu_number") +
>             kt->__per_cpu_offset[i], KVADDR,
>             &cpunumber, sizeof(int),
>             "cpu number (per_cpu)", FAULT_ON_ERROR);
>         if (cpunumber != cpus)
>             break;
>         cpus++;
>     }
>
> This works well when the kernel's array has fewer real per_cpu_offsets
> than its own NR_CPUS, since the kernel preloads its array with a pointer
> (BOOT_PERCPU_OFFSET) and when this loop runs past the real
> per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it
> reads a bogus value for cpunumber and terminates.
>
> But when the kernel's table is full of valid per_cpu_offset pointers,
> this loop continues off the end of that into the part of crash's
> __per_cpu_offset array that has the 0x0 initial values, and dies with:
>
> crash: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
>
> The cc08 comes from the symbol_value of per_cpu__cpu_number:
> 000000000000cc08 D per_cpu__cpu_number
>
> Bottom line: Crash is assuming an insufficient array termination for
> the kernel's __per_cpu_offset array (a pointer that points to an invalid
> cpu_number).
>
> The included patch adds an additional loop termination so that crash
> doesn't run off the end of what it loaded from the dump. It just checks
> for a NULL 0x0 value in kt->__per_cpu_offset[i].
>
> Bob Montgomery,
> Working at HP
I have a similar-but-different fix queued for this, but instead of
checking for a NULL kt->__per_cpu_offset[i] entry, it changes the
readmem() call to RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR
like this:
    if (!readmem(symbol_value("per_cpu__cpu_number") +
        kt->__per_cpu_offset[i],
        KVADDR, &cpunumber, sizeof(int),
        "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
            break;
That should prevent the failure you're seeing.
But another question is in the (extremely) rare circumstance of a
non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
would be all NULL, and the symbol_value("per_cpu__cpu_number")
call would return the qualified unity-mapped address. So the
virtual address calculation should work in x86_64_per_cpu_init(),
and the loop wouldn't even be entered in x86_64_get_smp_cpus().
That being said, I don't think I've seen a recent x86_64 kernel
that was not compiled CONFIG_SMP, so I can't confirm that it's
ever been tested.
So for sanity's sake, maybe your patch should also be applied,
but should also check if the "i" index is non-zero?
Thanks,
Dave
14 years, 11 months
invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
by Bob Montgomery
I have a dump from a 2.6.31-based x86_64 system where the number of
"possible" cpus equals the system's NR_CPUS (32).
On that system, the __per_cpu_offset table in the kernel consists of 32
valid offset pointers.
When crash loads this table into its __per_cpu_offset[NR_CPUS=4096]
array in struct kernel_table, it knows the length of the kernel's array
(32*sizeof(long)), and copies the 32 pointers, leaving the rest of its
(much longer) array full of 0x0s.
(This happens in kernel.c)
    if (symbol_exists("__per_cpu_offset")) {
        if (LKCD_KERNTYPES())
            i = get_cpus_possible();
        else
            i = get_array_length("__per_cpu_offset", NULL, 0);
        get_symbol_data("__per_cpu_offset",
            sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
            &kt->__per_cpu_offset[0]);
        kt->flags |= PER_CPU_OFF;
    }
Later, in a couple of places, crash checks for the maximum valid
__per_cpu_offset by reading the cpu_number value out of each per_cpu
area and comparing it to the expected number until the comparison fails.
(Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and
that's OK).
From x86_64.c:
    for (i = cpus = 0; i < NR_CPUS; i++) {
        readmem(symbol_value("per_cpu__cpu_number") +
            kt->__per_cpu_offset[i], KVADDR,
            &cpunumber, sizeof(int),
            "cpu number (per_cpu)", FAULT_ON_ERROR);
        if (cpunumber != cpus)
            break;
        cpus++;
    }
This works well when the kernel's array has fewer real per_cpu_offsets
than its own NR_CPUS, since the kernel preloads its array with a pointer
(BOOT_PERCPU_OFFSET) and when this loop runs past the real
per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it
reads a bogus value for cpunumber and terminates.
But when the kernel's table is full of valid per_cpu_offset pointers,
this loop continues off the end of that into the part of crash's
__per_cpu_offset array that has the 0x0 initial values, and dies with:
crash: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"
The cc08 comes from the symbol_value of per_cpu__cpu_number:
000000000000cc08 D per_cpu__cpu_number
Bottom line: Crash is assuming an insufficient array termination for
the kernel's __per_cpu_offset array (a pointer that points to an invalid
cpu_number).
The included patch adds an additional loop termination so that crash
doesn't run off the end of what it loaded from the dump. It just checks
for a NULL 0x0 value in kt->__per_cpu_offset[i].
Bob Montgomery,
Working at HP
14 years, 11 months