Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Dave Anderson
----- "Sami Liedes" <sliedes(a)cc.hut.fi> wrote:
> On Fri, Oct 08, 2010 at 11:07:10AM -0400, Dave Anderson wrote:
> > How did you come up with the "131" size?
>
> Just by adding debugging prints and inspecting the qcow2 image. So it
> may be very much incorrect, except in this one case.
Damn -- this "borrowed usage" of the savevm format for virsh dump is
really getting to be a pain in the ass...
Looking at the qemu-kvm sources, it's not obvious to me what the size
of the "slirp" device would be in the dumpfile. And apparently
Red Hat kernels don't use that device or somebody else would have
bumped into it, but I'll check with Paolo Bonzini to verify the number.
Thanks,
Dave
14 years, 1 month
Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Sami Liedes
On Fri, Oct 08, 2010 at 11:07:10AM -0400, Dave Anderson wrote:
> How did you come up with the "131" size?
Just by adding debugging prints and inspecting the qcow2 image. So it
may be very much incorrect, except in this one case.
Sami
14 years, 1 month
Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Sami Liedes
On Fri, Oct 08, 2010 at 10:21:30AM -0400, Dave Anderson wrote:
> Try the attached patch.
Yup, that seems to fix the problem.
FWIW, I also added support for the "slirp" section in some
qemu-produced qcow2 images I had. I didn't read qemu source to
determine whether the section size is constant, so it might not be
correct; however the attached patch works for me in this one case.
Sami
14 years, 1 month
Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Dave Anderson
----- "Sami Liedes" <sliedes(a)cc.hut.fi> wrote:
> On Fri, Oct 08, 2010 at 09:31:02AM -0400, Dave Anderson wrote:
> > I don't think that this is associated with KVM, but rather the kernel
> > version used. It should be pretty easy to debug on your end, because it
> > boils down to these initializations at the top of x86_64_per_cpu_init()
> >
> > irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
> > cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");
> >
> > If it's a UP kernel, and if "irq_sp" does not get set, then isize would
> > be left uninitialized.
>
> It's a uniprocessor amd64 kernel. Neither irq_sp nor cpu_sp get set.
>
> I have
>
> crash> sym irq_stack_union
> ffffffff81a1c000 (D) irq_stack_union
> crash> sym cpu_number
> symbol not found: cpu_number
>
> It's not accepted by per_cpu_symbol_search() because its type is not
> 'V' and because it's not between __per_cpu_start and __per_cpu_end.
> __per_cpu_start and __per_cpu_end are the same; I don't know if
> there's something wrong with that.
Try the attached patch.
Dave
14 years, 1 month
Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Sami Liedes
On Fri, Oct 08, 2010 at 09:31:02AM -0400, Dave Anderson wrote:
> I don't think that this is associated with KVM, but rather the kernel
> version used. It should be pretty easy to debug on your end, because it
> boils down to these initializations at the top of x86_64_per_cpu_init()
>
> irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
> cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");
>
> If it's a UP kernel, and if "irq_sp" does not get set, then isize would
> be left uninitialized.
It's a uniprocessor amd64 kernel. Neither irq_sp nor cpu_sp get set.
I have
crash> sym irq_stack_union
ffffffff81a1c000 (D) irq_stack_union
crash> sym cpu_number
symbol not found: cpu_number
It's not accepted by per_cpu_symbol_search() because its type is not
'V' and because it's not between __per_cpu_start and __per_cpu_end.
__per_cpu_start and __per_cpu_end are the same; I don't know if
there's something wrong with that.
(gdb) b x86_64_per_cpu_init
Breakpoint 1 at 0x4eb49c: file x86_64.c, line 823.
(gdb) r
[...]
Breakpoint 1, x86_64_per_cpu_init () at x86_64.c:823
823 ms = machdep->machspec;
(gdb) n
825 irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
(gdb) s
per_cpu_symbol_search (symbol=0x8a46d7 "per_cpu__irq_stack_union") at symbols.c:4106
4106 if (STRNEQ(symbol, "per_cpu__")) {
(gdb) n
4107 if ((sp = symbol_search(symbol)))
(gdb)
4109 new = symbol + strlen("per_cpu__");
(gdb)
4110 if ((sp = symbol_search(new))) {
(gdb) print new
$1 = 0x8a46e0 "irq_stack_union"
(gdb) n
4111 if ((sp->type == 'V') ||
(gdb) l
4106 if (STRNEQ(symbol, "per_cpu__")) {
4107 if ((sp = symbol_search(symbol)))
4108 return sp;
4109 new = symbol + strlen("per_cpu__");
4110 if ((sp = symbol_search(new))) {
4111 if ((sp->type == 'V') ||
4112 ((sp->value >= st->__per_cpu_start) &&
4113 (sp->value < st->__per_cpu_end)))
4114 return sp;
4115 }
(gdb) print sp->type
$2 = 68 'D'
(gdb) print sp->value
$3 = 18446744071589445632
(gdb) p/x sp->value
$4 = 0xffffffff81a1c000
(gdb) p/x st->__per_cpu_start
$5 = 0xffffffff81ae7000
(gdb) p/x st->__per_cpu_end
$6 = 0xffffffff81ae7000
Sami
14 years, 1 month
Re: [Crash-utility] [patch] crash on a KVM-generated dump
by Dave Anderson
----- "Sami Liedes" <sliedes(a)cc.hut.fi> wrote:
> Hi,
>
> There's a bug in Debian bugzilla on crash crashing:
>
> http://bugs.debian.org/599353
>
> Attached is a message I sent to that bug which contains a patch that
> fixes the problem (but in a non-beautiful way).
>
> Is there a redhat bugzilla entry for crash, by the way? Finding
> applications there was kind of hard, especially given that the query
> would be "crash".
Yes, its bugzilla component is "crash", but it's pretty much for issues
associated with running crash against RHEL kernels, and I have not seen
this before (even with an Ubuntu vmlinux-2.6.31-17-server dumpfile
I have on hand). Reporting it here is the best thing to do.
I don't think that this is associated with KVM, but rather the kernel
version used. It should be pretty easy to debug on your end, because it
boils down to these initializations at the top of x86_64_per_cpu_init()
irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");
If it's a UP kernel, and if "irq_sp" does not get set, then isize would
be left uninitialized.
If it's an SMP kernel, and if either "irq_sp" or "cpu_sp" does not get set,
then isize would be left uninitialized.
But I can't understand why they wouldn't get initialized?
In a 2.6.36-rc1 kernel KVM dumpfile, I see this for their per-cpu
symbol values:
crash> sym irq_stack_union
0 (V) irq_stack_union
crash> sym cpu_number
e320 (V) cpu_number
crash>
Do you see something different with that kernel?
Dave
>
> Sami
>
>
> ----- Forwarded message from Sami Liedes <sliedes(a)cc.hut.fi> -----
>
> Date: Thu, 7 Oct 2010 21:50:22 +0300
> From: Sami Liedes <sliedes(a)cc.hut.fi>
> To: 599353(a)bugs.debian.org
> Subject: [patch] Hack to fix this crash
> User-Agent: Mutt/1.5.20 (2009-06-14)
>
> Hi,
>
> The crashing is pretty nondeterministic; today the existence of $HOME
> does not seem to have an effect (confirmed by Timo).
>
> It seems to be caused by heap corruption. The code at fault is in
> x86_64.c; on some core files (produced by KVM), the interrupt stack
> size (machdep->machspec->stkinfo.isize) is somehow calculated to be 0,
> and 0 is passed to malloc() in x86_64.c:342. Later data is written
> through that pointer.
>
> Here's a minimal patch (crude hack, not a real fix for the underlying
> problem) to make this work:
>
> ------------------------------------------------------------
> diff -ur crash-5.0.7/x86_64.c crash-5.0.7.patched//x86_64.c
> --- crash-5.0.7/x86_64.c 2010-08-27 20:36:18.000000000 +0300
> +++ crash-5.0.7.patched//x86_64.c 2010-10-07 21:23:16.079119657 +0300
> @@ -339,6 +339,9 @@
> x86_64_per_cpu_init();
> x86_64_ist_init();
> machdep->in_alternate_stack = x86_64_in_alternate_stack;
> + /* HACK */
> + if (machdep->machspec->stkinfo.isize == 0)
> + machdep->machspec->stkinfo.isize = 65536;
> if ((machdep->machspec->irqstack = (char *)
> malloc(machdep->machspec->stkinfo.isize)) == NULL)
> error(FATAL, "cannot malloc irqstack space.");
> ------------------------------------------------------------
>
> Here are the valgrind warnings produced (search for "invalid write" to
> find the fault causing this; not that the other problems would not be
> worth fixing):
>
> ------------------------------------------------------------
> $ valgrind crash vmlinux new.core
> ==10013== Memcheck, a memory error detector
> ==10013== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> ==10013== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
> ==10013== Command: crash vmlinux new.core
> ==10013==
>
> crash 5.0.7
> Copyright (C) 2002-2010 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x5079290: inflateReset2 (inflate.c:157)
> ==10013== by 0x507937F: inflateInit2_ (inflate.c:193)
> ==10013== by 0x4DB05B: read_in_kernel_config (kernel.c:6708)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C26BB7: __GI___rawmemchr (mc_replace_strmem.c:729)
> ==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
> ==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013==
> ==10013== Use of uninitialised value of size 8
> ==10013== at 0x5758FFF: _IO_vfscanf (vfscanf.c:600)
> ==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x5759014: _IO_vfscanf (vfscanf.c:602)
> ==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x577B789: _IO_sputbackc (genops.c:730)
> ==10013== by 0x5759042: _IO_vfscanf (vfscanf.c:602)
> ==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C26BAA: __GI___rawmemchr (mc_replace_strmem.c:729)
> ==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
> ==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013==
> ==10013== Use of uninitialised value of size 8
> ==10013== at 0x575B66C: _IO_vfscanf (vfscanf.c:2734)
> ==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> ==10013== Use of uninitialised value of size 8
> ==10013== at 0x575B70B: _IO_vfscanf (vfscanf.c:2734)
> ==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
> ==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
> ==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x46318F: whitespace (tools.c:222)
> ==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x463195: whitespace (tools.c:222)
> ==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4DB1B2: read_in_kernel_config (kernel.c:6747)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C2536A: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C25380: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C2537A: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> WARNING: cannot determine how modules are linked
> WARNING: no kernel module access
>
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A88: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e0 is 0 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A8C: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e1 is 1 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A94: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e2 is 2 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A99: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e3 is 3 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26AA9: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e8 is 8 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> KERNEL: vmlinux
> DUMPFILE: new.core
> CPUS: 1
> DATE: Fri Oct 1 21:26:15 2010
> UPTIME: 00:00:56
> LOAD AVERAGE: 0.14, 0.05, 0.02
> TASKS: 45
> NODENAME: fstest
> RELEASE: 2.6.35.6
> VERSION: #2 Wed Sep 29 15:05:49 EEST 2010
> MACHINE: x86_64 (2394 Mhz)
> ==10013== Source and destination overlap in strcpy(0x7fefffae2, 0x7fefffae4)
> ==10013== at 0x4C25918: strcpy (mc_replace_strmem.c:311)
> ==10013== by 0x46E9DE: pages_to_size (tools.c:4640)
> ==10013== by 0x49393F: get_memory_size (memory.c:11145)
> ==10013== by 0x4CFFC5: display_sys_stats (kernel.c:3927)
> ==10013== by 0x45D934: main_loop (main.c:581)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> MEMORY: 1 GB
> PANIC: ""
> PID: 0
> COMMAND: "swapper"
> TASK: ffffffff81a13040 [THREAD_INFO: ffffffff81a00000]
> CPU: 0
> STATE: TASK_RUNNING (ACTIVE)
> WARNING: panic task not found
>
> crash> q
> ==10013==
> ==10013== HEAP SUMMARY:
> ==10013== in use at exit: 53,444,536 bytes in 10,730 blocks
> ==10013== total heap usage: 396,156 allocs, 385,426 frees, 2,187,205,021 bytes allocated
> ==10013==
> ==10013== LEAK SUMMARY:
> ==10013== definitely lost: 6,414 bytes in 35 blocks
> ==10013== indirectly lost: 24 bytes in 1 blocks
> ==10013== possibly lost: 42,174,127 bytes in 8,022 blocks
> ==10013== still reachable: 11,263,971 bytes in 2,672 blocks
> ==10013== suppressed: 0 bytes in 0 blocks
> ==10013== Rerun with --leak-check=full to see details of leaked memory
> ==10013==
> ==10013== For counts of detected and suppressed errors, rerun with: -v
> ==10013== Use --track-origins=yes to see where uninitialised values come from
> ==10013== ERROR SUMMARY: 6710 errors from 21 contexts (suppressed: 4 from 4)
> ------------------------------------------------------------
>
> Sami
>
> ----- End forwarded message -----
>
14 years, 1 month
[patch] crash on a KVM-generated dump
by Sami Liedes
Hi,
There's a bug in Debian bugzilla on crash crashing:
http://bugs.debian.org/599353
Attached is a message I sent to that bug which contains a patch that
fixes the problem (but in a non-beautiful way).
Is there a redhat bugzilla entry for crash, by the way? Finding
applications there was kind of hard, especially given that the query
would be "crash".
Sami
----- Forwarded message from Sami Liedes <sliedes(a)cc.hut.fi> -----
Date: Thu, 7 Oct 2010 21:50:22 +0300
From: Sami Liedes <sliedes(a)cc.hut.fi>
To: 599353(a)bugs.debian.org
Subject: [patch] Hack to fix this crash
User-Agent: Mutt/1.5.20 (2009-06-14)
Hi,
The crashing is pretty nondeterministic; today the existence of $HOME
does not seem to have an effect (confirmed by Timo).
It seems to be caused by heap corruption. The code at fault is in
x86_64.c; on some core files (produced by KVM), the interrupt stack
size (machdep->machspec->stkinfo.isize) is somehow calculated to be 0,
and 0 is passed to malloc() in x86_64.c:342. Later data is written
through that pointer.
Here's a minimal patch (crude hack, not a real fix for the underlying
problem) to make this work:
------------------------------------------------------------
diff -ur crash-5.0.7/x86_64.c crash-5.0.7.patched//x86_64.c
--- crash-5.0.7/x86_64.c 2010-08-27 20:36:18.000000000 +0300
+++ crash-5.0.7.patched//x86_64.c 2010-10-07 21:23:16.079119657 +0300
@@ -339,6 +339,9 @@
x86_64_per_cpu_init();
x86_64_ist_init();
machdep->in_alternate_stack = x86_64_in_alternate_stack;
+ /* HACK */
+ if (machdep->machspec->stkinfo.isize == 0)
+ machdep->machspec->stkinfo.isize = 65536;
if ((machdep->machspec->irqstack = (char *)
malloc(machdep->machspec->stkinfo.isize)) == NULL)
error(FATAL, "cannot malloc irqstack space.");
------------------------------------------------------------
Here are the valgrind warnings produced (search for "invalid write" to
find the fault causing this; not that the other problems would not be
worth fixing):
------------------------------------------------------------
$ valgrind crash vmlinux new.core
==10013== Memcheck, a memory error detector
==10013== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==10013== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==10013== Command: crash vmlinux new.core
==10013==
crash 5.0.7
Copyright (C) 2002-2010 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x5079290: inflateReset2 (inflate.c:157)
==10013== by 0x507937F: inflateInit2_ (inflate.c:193)
==10013== by 0x4DB05B: read_in_kernel_config (kernel.c:6708)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C26BB7: __GI___rawmemchr (mc_replace_strmem.c:729)
==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x5758FFF: _IO_vfscanf (vfscanf.c:600)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x5759014: _IO_vfscanf (vfscanf.c:602)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x577B789: _IO_sputbackc (genops.c:730)
==10013== by 0x5759042: _IO_vfscanf (vfscanf.c:602)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C26BAA: __GI___rawmemchr (mc_replace_strmem.c:729)
==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x575B66C: _IO_vfscanf (vfscanf.c:2734)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x575B70B: _IO_vfscanf (vfscanf.c:2734)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x46318F: whitespace (tools.c:222)
==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x463195: whitespace (tools.c:222)
==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4DB1B2: read_in_kernel_config (kernel.c:6747)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C2536A: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C25380: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C2537A: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
WARNING: cannot determine how modules are linked
WARNING: no kernel module access
==10013== Invalid write of size 1
==10013== at 0x4C26A88: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e0 is 0 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A8C: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e1 is 1 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A94: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e2 is 2 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A99: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e3 is 3 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26AA9: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e8 is 8 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
KERNEL: vmlinux
DUMPFILE: new.core
CPUS: 1
DATE: Fri Oct 1 21:26:15 2010
UPTIME: 00:00:56
LOAD AVERAGE: 0.14, 0.05, 0.02
TASKS: 45
NODENAME: fstest
RELEASE: 2.6.35.6
VERSION: #2 Wed Sep 29 15:05:49 EEST 2010
MACHINE: x86_64 (2394 Mhz)
==10013== Source and destination overlap in strcpy(0x7fefffae2, 0x7fefffae4)
==10013== at 0x4C25918: strcpy (mc_replace_strmem.c:311)
==10013== by 0x46E9DE: pages_to_size (tools.c:4640)
==10013== by 0x49393F: get_memory_size (memory.c:11145)
==10013== by 0x4CFFC5: display_sys_stats (kernel.c:3927)
==10013== by 0x45D934: main_loop (main.c:581)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
MEMORY: 1 GB
PANIC: ""
PID: 0
COMMAND: "swapper"
TASK: ffffffff81a13040 [THREAD_INFO: ffffffff81a00000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
WARNING: panic task not found
crash> q
==10013==
==10013== HEAP SUMMARY:
==10013== in use at exit: 53,444,536 bytes in 10,730 blocks
==10013== total heap usage: 396,156 allocs, 385,426 frees, 2,187,205,021 bytes allocated
==10013==
==10013== LEAK SUMMARY:
==10013== definitely lost: 6,414 bytes in 35 blocks
==10013== indirectly lost: 24 bytes in 1 blocks
==10013== possibly lost: 42,174,127 bytes in 8,022 blocks
==10013== still reachable: 11,263,971 bytes in 2,672 blocks
==10013== suppressed: 0 bytes in 0 blocks
==10013== Rerun with --leak-check=full to see details of leaked memory
==10013==
==10013== For counts of detected and suppressed errors, rerun with: -v
==10013== Use --track-origins=yes to see where uninitialised values come from
==10013== ERROR SUMMARY: 6710 errors from 21 contexts (suppressed: 4 from 4)
------------------------------------------------------------
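The "Invalid write ... after a block of size 0" reports above come from
a memset() into the zero-sized buffer. As a sketch of the general guard
on the writing side (bounded_zero_fill() is a hypothetical helper, not
part of crash or kvmdump.c), a read routine can clamp the fill to the
destination's real capacity and report how much it wrote:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero-fill at most `capacity` bytes of `dst`, even if the caller asks
 * for more.  Returns the number of bytes actually written, so the caller
 * can detect a short fill and report an error.
 */
static size_t
bounded_zero_fill(void *dst, size_t capacity, size_t wanted)
{
	size_t n = wanted < capacity ? wanted : capacity;

	if (n > 0) {
		assert(dst != NULL);
		memset(dst, 0, n);
	}
	return n;
}
```

With a capacity of 0, as in the trace above, the helper writes nothing
and returns 0 instead of corrupting the heap.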
Sami
----- End forwarded message -----
14 years, 1 month
[ANNOUNCE] crash version 5.0.8 is available
by Dave Anderson
- Fix for the "bt" command on 2.6.30 and later x86_64 kernels that
may be seen when a System.map file is used on the command line.
Without the patch, the "bt" frame-by-frame output may be interspersed
with error messages indicating "bt: invalid kernel virtual address:
<address> type: call byte".
(anderson(a)redhat.com)
- Fix for the KVM error messages generated by store_mapfile_offset() and
load_mapfile_offset() when an invalid physical address is issued.
The errno translation displayed by both functions was irrelevant; and
load_mapfile_offset() has been changed to show its error message only
if CRASHDEBUG(1) is in effect, making its behaviour similar to the
read functions associated with the other dumpfile types.
(anderson(a)redhat.com)
- Fix for the "sig" command on 2.6.35 and later kernels to account for
the "signal_struct" member name change from "count" to "nr_threads".
Without the patch, the command would fail with the error message
"sig: invalid structure member offset: signal_struct_count".
(anderson(a)redhat.com)
- Fix for the "net -s" command option on 2.6.33 and later kernels
to account for the "inet_sock" structure member name changes from
"daddr", "rcv_saddr", "dport", "sport" and "num" to the equivalent
name preceded by "inet_". Without the patch, the command would fail
for tasks with open sockets with the error message "net: invalid
structure member offset: inet_opt_daddr".
(anderson(a)redhat.com)
- Fix for the "mod" command on 2.6.35 and later kernels to account
for the removal of the "owner" member from the "attribute" structure.
Without the patch, the "mod" command would fail with the error message
"mod: invalid structure member offset: attribute_owner".
(anderson(a)redhat.com)
- Fix for the "mount -f" command on 2.6.36 and later kernels to account
for the data type change of the super_block "s_files" member from
"struct list_head" to "struct list_head __percpu *". The open files
of a super_block are no longer contained on a single list, and are
now linked onto one of the per-cpu lists. Without the patch the
command would fail with the error message "mount: invalid kernel
virtual address: <percpu-offset> type: first list entry".
(anderson(a)redhat.com)
- Fix for the "files" command when the vfsmount pointer in the file
structure's "f_path" member is not suitable for the root vfsmount to
be used when reconstructing the full file pathname. Without the
patch, the pathnames of open files in the /dev directory may be
truncated and not show the "/dev" filename component.
(anderson(a)redhat.com)
- Change to the manner in which the cpu count is determined for x86_64
kernels. SLES11 2.6.32 kernels delay the call to crash_kexec() until
after smp_send_stop() is called by panic(), and so the cpu_online_map
cannot be used for determining the cpu count. With the patch, the
cpu_present_map is used.
(Jeffrey.Hagen(a)teradata.com)
- Fix for the "bt" command with 2.6.27 and later x86_64 kernels to
prevent the possible display of an invalid "vgettimeofday" frame
above the topmost "system_call_fastpath" frame, followed by two
read errors indicating "bt: read error: kernel virtual address:
ffffffffff600000 type: gdb_readmem_callback".
(anderson(a)redhat.com)
- Currently the "s390dbf" command uses KL_UINT() for reading pointers,
which works only if the pointers are below 4 GiB. To fix this issue
a new KL_ULONG() function has been added to read pointers correctly.
(holzheu(a)linux.vnet.ibm.com)
- Implement the capability of building crash as an x86 binary for ARM
dumpfiles on an x86_64 host. The initial ARM support only allowed
the building of an x86 binary for ARM dumpfiles to be done from an
x86 host. To build crash as an x86 binary on an x86_64 host, enter
"make target=ARM".
(Jan.Karlsson(a)sonyericsson.com)
- Simplify the ARM build procedure after an initial ARM build has
been completed in a crash source tree. With the patch, it is only
necessary to enter "make target=ARM" for the initial build; subsequent
builds can be done with "make" alone, which will continue to build
with ARM support.
(Jan.Karlsson(a)sonyericsson.com, anderson(a)redhat.com)
- Implemented the capability of building an X86 crash binary on an
X86_64 host, which can be done by entering "make target=X86". After
the initial build is complete, subsequent builds can be done by
entering "make" alone.
(anderson(a)redhat.com)
- Fix for a regression in get_text_init_space() due to logic added by
the ARM processor support. Without the patch, the function would not
recognize the failure to find the kernel's .text.init address for
non-ARM architectures.
(per.fransson.ml(a)gmail.com, anderson(a)redhat.com)
- Implemented support for SMP on the ARM architecture.
(per.xx.fransson(a)stericsson.com)
- Fix for the x86_64 "bt" command on 2.6.31 and later kernels when the
crash was generated by an "echo c > /proc/sysrq-trigger". Without
the patch, the backtrace starts at sysrq_handle_crash() and does
not display the exception frame from the forced oops. This is not
applicable to older kernels where crash_kexec() is called directly
from sysrq_handle_crash(), or if an actual alt-sysrq-c keystroke
sequence is entered.
(anderson(a)redhat.com)
- Fix to recognize module "init" symbols that are still valid, whose
vmalloc'd virtual memory has not been vfree'd by sys_init_module().
Without the patch those symbols are not visible by any of the "sym"
command options, nor by commands that try to translate their virtual
addresses to a symbol name, such as the "bt" command if the kernel
crashed during a module load.
(hutao(a)cn.fujitsu.com, anderson(a)redhat.com)
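Several of the fixes above ("sig", "net -s", "mod") deal with kernel
structure members that were renamed or removed between releases. The
general pattern of trying the newer member name first and falling back
to the older one can be sketched as follows. This is an illustration
only: struct member and member_offset_any() are hypothetical stand-ins
for a debuginfo lookup and are not crash's actual interfaces, and any
offsets fed to it in an example are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for debuginfo: member name -> offset. */
struct member {
	const char *name;
	long offset;
};

/*
 * Return the offset of the first candidate name found in `members`,
 * or -1 if none of the candidates exists.  Candidates are tried in
 * order, so the preferred (newer) name should be listed first.
 */
static long
member_offset_any(const struct member *members, size_t nmembers,
		  const char *const *candidates, size_t ncandidates)
{
	size_t c, m;

	assert(members != NULL && candidates != NULL);

	for (c = 0; c < ncandidates; c++)
		for (m = 0; m < nmembers; m++)
			if (strcmp(members[m].name, candidates[c]) == 0)
				return members[m].offset;
	return -1;
}
```

For the "sig" case, the candidate list would be { "nr_threads", "count" },
so that either kernel generation resolves to a usable offset and only a
structure with neither member produces an error.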
Download from: http://people.redhat.com/anderson
14 years, 1 month
Re: [Crash-utility] ARM SMP
by Dave Anderson
----- "Per Fransson" <per.fransson.ml(a)gmail.com> wrote:
> > Might be something to do with different crashdump formats supported
> > by crash or perhaps some historic reason? Dave, any comments on this?
> >
> > Looking history for the kernel/kexec.c shows that crash_notes has
> > been there from the early days of the kexec code.
> >
>
> Just to elaborate a little bit:
>
> The crash_notes which end up in the ELF notes, e.g. when collecting
> the core dump through /proc/vmcore on x86, are just copies of the ones
> in kernel/kexec.c. They are located by /proc/vmcore using the ELF
> header that kexec-tools creates. kexec-tools figures out the addresses
> to the crash_notes by reading
>
> /sys/devices/system/cpu/cpu%d/crash_notes
>
> or an equivalent file. But why go through all that trouble if they are
> available within the dump anyway?
It's probably a better question for the kexec mailing list,
but I presume it's simply a matter of following ELF conventions?
Dave
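For readers unfamiliar with the ELF note layout discussed above, here is
a minimal sketch of locating the descriptor of a "CORE"/NT_PRSTATUS
note. It assumes the usual 32-bit note header (three 32-bit words, the
same layout as Elf32_Nhdr) with name and descriptor padded to 4 bytes.
struct note_hdr and find_prstatus_desc() are hypothetical names used for
illustration, and the buffer is assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round x up to the next multiple of 4, as ELF notes require. */
#define NOTE_ALIGN(x) (((size_t)(x) + 3u) & ~(size_t)3u)

/* Same layout as Elf32_Nhdr; redefined here to avoid needing <elf.h>. */
struct note_hdr {
	uint32_t n_namesz;	/* length of name, including the NUL */
	uint32_t n_descsz;	/* length of descriptor */
	uint32_t n_type;	/* NT_PRSTATUS is 1 */
};

#define NOTE_TYPE_PRSTATUS 1u

/*
 * Locate the descriptor of a "CORE"/NT_PRSTATUS note held in buf.
 * Returns a pointer to the descriptor (elf_prstatus in a real dump) and
 * stores its length in *desc_len, or returns NULL if the note is
 * truncated or is not the expected kind.
 */
static const unsigned char *
find_prstatus_desc(const unsigned char *buf, size_t len, size_t *desc_len)
{
	struct note_hdr hdr;
	size_t name_off = sizeof(hdr);
	size_t desc_off;

	assert(buf != NULL && desc_len != NULL);

	if (len < sizeof(hdr))
		return NULL;
	memcpy(&hdr, buf, sizeof(hdr));	/* avoid unaligned access */

	if (hdr.n_type != NOTE_TYPE_PRSTATUS || hdr.n_namesz != 5)
		return NULL;
	if (len - name_off < hdr.n_namesz ||
	    memcmp(buf + name_off, "CORE", 5) != 0)
		return NULL;

	desc_off = NOTE_ALIGN(name_off + hdr.n_namesz);
	if (desc_off > len || len - desc_off < hdr.n_descsz)
		return NULL;

	*desc_len = hdr.n_descsz;
	return buf + desc_off;
}
```

The registers of the crashed task would then be read from the returned
descriptor at the offset of pr_reg within elf_prstatus.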
14 years, 1 month
ARM SMP
by Per Fransson
Hi Dave and Mika,
Thanks for your input. Here's attempt number two. I have:
- eliminated the leaks
- removed 'crash_task_pid'
- fixed the formatting
- not used gmail, since it corrupts the patch
- used malloc/free for panic_task_regs
Regards,
Per
diff --git a/arm.c b/arm.c
index 06b2f1c..b3841c0 100644
--- a/arm.c
+++ b/arm.c
@@ -73,7 +73,7 @@ struct arm_cpu_context_save {
/*
* Holds registers during the crash.
*/
-static struct arm_pt_regs panic_task_regs;
+static struct arm_pt_regs *panic_task_regs;
#define PGDIR_SIZE() (4 * PAGESIZE())
#define PGDIR_OFFSET(X) (((ulong)(X)) & (PGDIR_SIZE() - 1))
@@ -392,7 +392,6 @@ arm_dump_machdep_table(ulong arg)
fprintf(fp, " kernel_text_end: %lx\n", ms->kernel_text_end);
fprintf(fp, "exception_text_start: %lx\n", ms->exception_text_start);
fprintf(fp, " exception_text_end: %lx\n", ms->exception_text_end);
- fprintf(fp, " crash_task_pid: %ld\n", ms->crash_task_pid);
fprintf(fp, " crash_task_regs: %lx\n", (ulong)ms->crash_task_regs);
}
@@ -484,71 +483,104 @@ arm_get_crash_notes(void)
Elf32_Nhdr *note;
ulong ptr, offset;
char *buf, *p;
+ ulong *notes_ptrs;
+ ulong per_cpu_offsets_addr;
+ ulong *per_cpu_offsets;
+ ulong i;
if (!symbol_exists("crash_notes"))
return FALSE;
crash_notes = symbol_value("crash_notes");
- if (kt->cpus > 1)
- error(WARNING, "only one CPU is currently supported\n");
+ notes_ptrs = GETBUF(kt->cpus*sizeof(notes_ptrs[0]));
/*
* Read crash_notes for the first CPU. crash_notes are in standard ELF
* note format.
*/
- if (!readmem(crash_notes, KVADDR, &ptr, sizeof(ptr), "crash_notes",
+ if (!readmem(crash_notes, KVADDR, &notes_ptrs[kt->cpus-1], sizeof(notes_ptrs[kt->cpus-1]), "crash_notes",
RETURN_ON_ERROR)) {
error(WARNING, "cannot read crash_notes\n");
+ FREEBUF(notes_ptrs);
return FALSE;
}
+
+
+ if (symbol_exists("__per_cpu_offset")) {
+
+ /* Get the __per_cpu_offset array */
+ per_cpu_offsets_addr = symbol_value("__per_cpu_offset");
+
+ per_cpu_offsets = GETBUF(kt->cpus*sizeof(*per_cpu_offsets));
+
+ if (!readmem(per_cpu_offsets_addr, KVADDR, per_cpu_offsets, kt->cpus*sizeof(*per_cpu_offsets), "per_cpu_offsets",
+ RETURN_ON_ERROR)) {
+ error(WARNING, "cannot read per_cpu_offsets\n");
+ FREEBUF(per_cpu_offsets);
+ return FALSE;
+ }
+
+ /* Add __per_cpu_offset for each cpu to form the pointer to the notes */
+ for (i = 0; i<kt->cpus; i++) {
+ notes_ptrs[i] = notes_ptrs[kt->cpus-1] + per_cpu_offsets[i];
+ }
+ FREEBUF(per_cpu_offsets);
+ }
buf = GETBUF(SIZE(note_buf));
+ panic_task_regs = malloc(kt->cpus*sizeof(*panic_task_regs));
+
+ for (i=0;i<kt->cpus;i++) {
+
+ if (!readmem(notes_ptrs[i], KVADDR, buf, SIZE(note_buf), "note_buf_t",
+ RETURN_ON_ERROR)) {
+ error(WARNING, "failed to read note_buf_t\n");
+ goto fail;
+ }
- if (!readmem(ptr, KVADDR, buf, SIZE(note_buf), "note_buf_t",
- RETURN_ON_ERROR)) {
- error(WARNING, "failed to read note_buf_t\n");
- goto fail;
- }
+ /*
+ * Do some sanity checks for this note before reading registers from it.
+ */
+ note = (Elf32_Nhdr *)buf;
+ p = buf + sizeof(Elf32_Nhdr);
- /*
- * Do some sanity checks for this note before reading registers from it.
- */
- note = (Elf32_Nhdr *)buf;
- p = buf + sizeof(Elf32_Nhdr);
+ if (note->n_type != NT_PRSTATUS) {
+ error(WARNING, "invalid note (n_type != NT_PRSTATUS)\n");
+ goto fail;
+ }
+ if (p[0] != 'C' || p[1] != 'O' || p[2] != 'R' || p[3] != 'E') {
+ error(WARNING, "invalid note (name != \"CORE\"\n");
+ goto fail;
+ }
- if (note->n_type != NT_PRSTATUS) {
- error(WARNING, "invalid note (n_type != NT_PRSTATUS)\n");
- goto fail;
- }
- if (p[0] != 'C' || p[1] != 'O' || p[2] != 'R' || p[3] != 'E') {
- error(WARNING, "invalid note (name != \"CORE\"\n");
- goto fail;
- }
+ /*
+ * Find correct location of note data. This contains elf_prstatus
+ * structure which has registers etc. for the crashed task.
+ */
+ offset = sizeof(Elf32_Nhdr);
+ offset = roundup(offset + note->n_namesz, 4);
+ p = buf + offset; /* start of elf_prstatus */
- /*
- * Find correct location of note data. This contains elf_prstatus
- * structure which has registers etc. for the crashed task.
- */
- offset = sizeof(Elf32_Nhdr);
- offset = roundup(offset + note->n_namesz, 4);
- p = buf + offset; /* start of elf_prstatus */
+ BCOPY(p + OFFSET(elf_prstatus_pr_reg), &panic_task_regs[i],
+ sizeof(panic_task_regs[i]));
- BCOPY(p + OFFSET(elf_prstatus_pr_reg), &panic_task_regs,
- sizeof(panic_task_regs));
+ }
/*
- * And finally we have pid and registers for the crashed task. This is
+ * And finally we have the registers for the crashed task. This is
* used later on when dumping backtrace.
*/
- ms->crash_task_pid = *(ulong *)(p + OFFSET(elf_prstatus_pr_pid));
- ms->crash_task_regs = &panic_task_regs;
+ ms->crash_task_regs = panic_task_regs;
FREEBUF(buf);
+ FREEBUF(notes_ptrs);
return TRUE;
fail:
FREEBUF(buf);
+ FREEBUF(notes_ptrs);
+ free(panic_task_regs);
return FALSE;
}
@@ -996,20 +1028,20 @@ arm_get_dumpfile_stack_frame(struct bt_info *bt, ulong *nip, ulong *ksp)
if (!ms->crash_task_regs)
return FALSE;
- if (tt->panic_task != bt->task || bt->tc->pid != ms->crash_task_pid)
+ if (!is_task_active(bt->task))
return FALSE;
-
+
/*
* We got registers for panic task from crash_notes. Just return them.
*/
- *nip = ms->crash_task_regs->ARM_pc;
- *ksp = ms->crash_task_regs->ARM_sp;
+ *nip = ms->crash_task_regs[bt->tc->processor].ARM_pc;
+ *ksp = ms->crash_task_regs[bt->tc->processor].ARM_sp;
/*
* Also store pointer to all registers in case unwinding code needs
* to access LR.
*/
- bt->machdep = ms->crash_task_regs;
+ bt->machdep = &(ms->crash_task_regs[bt->tc->processor]);
return TRUE;
}
diff --git a/defs.h b/defs.h
index d431d6e..6e0c8cc 100755
--- a/defs.h
+++ b/defs.h
@@ -85,7 +85,7 @@
#define NR_CPUS (64)
#endif
#ifdef ARM
-#define NR_CPUS (1)
+#define NR_CPUS (4)
#endif
#define BUFSIZE (1500)
@@ -4062,7 +4062,6 @@ struct machine_specific {
ulong kernel_text_end;
ulong exception_text_start;
ulong exception_text_end;
- ulong crash_task_pid;
struct arm_pt_regs *crash_task_regs;
};
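The per-CPU address computation in the patch above (the crash_notes
pointer plus each CPU's __per_cpu_offset entry) can be sketched in
isolation as follows. percpu_addresses() is a hypothetical helper, not
part of the patch, and giving every CPU the base address when no offset
table is available is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill ptrs[0..ncpus-1] with the per-CPU addresses of a per-CPU
 * variable: the base address plus that CPU's entry in the offset table.
 * With a NULL offset table (no __per_cpu_offset symbol), every CPU gets
 * the base address.
 */
static void
percpu_addresses(unsigned long base, const unsigned long *offsets,
		 size_t ncpus, unsigned long *ptrs)
{
	size_t i;

	assert(ptrs != NULL);

	for (i = 0; i < ncpus; i++)
		ptrs[i] = base + (offsets ? offsets[i] : 0);
}
```

Keeping the base address in its own variable, rather than in the last
slot of the output array, avoids depending on the order in which the
slots are overwritten.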
14 years, 1 month