Re: [Crash-utility] Kernel Crash Analysis on Android
by Shankar, AmarX
Hi Dave,
Thanks for the information about the kexec tool.
I am unable to download kexec from the link below.
http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/kexec-too...
It returns HTTP 404 (Page Not Found).
Could you please guide me on this?
Thanks & Regards,
Amar Shankar
> On Wed, Mar 21, 2012 at 06:00:00PM +0000, Shankar, AmarX wrote:
>
> > I want to do kernel crash analysis on an Android Merrifield target.
> >
> > Could someone please help me with how to do it?
>
> Merrifield is very similar to Medfield, i.e., it has an x86 core. So I
> guess you can follow the instructions for setting up kdump on x86 (see
> Documentation/kdump/kdump.txt), unless you already have that configured.
>
> crash should support this directly, presuming you have vmlinux/vmcore files to
> feed it. You can configure crash to support x86 on an x86_64 host by running:
>
> % make target=X86
> % make
>
> (or something along those lines).
Right -- just the first make command will suffice, i.e., when running
on an x86_64 host:
$ wget http://people.redhat.com/anderson/crash-6.0.4.tar.gz
$ tar xzf crash-6.0.4.tar.gz
...
$ cd crash-6.0.4
$ make target=X86
...
$ ./crash <path-to>/vmlinux <path-to>/vmcore
Dave
From: Shankar, AmarX
Sent: Wednesday, March 21, 2012 11:30 PM
To: 'crash-utility(a)redhat.com'
Subject: Kernel Crash Analysis on Android
Hi,
I want to do kernel crash analysis on an Android Merrifield target.
Could someone please help me with how to do it?
Thanks & Regards,
Amar Shankar
1 year, 1 month
[PATCH] kmem, snap: iomem/ioport display and vmcore snapshot support
by HATAYAMA Daisuke
A few days ago I had to convert a vmcore in kvmdump format into ELF, because
an extension module we maintain locally works only with a relatively old
crash utility, around version 4, and a crash utility that old cannot handle
the kvmdump format.
To make the conversion easy, I modified the snap command so that it uses the
iomem information in the vmcore instead of the host's /proc/iomem. This patch
is a cleaned-up version of that work.
While doing this, I ended up writing an interface for accessing resource
objects, so together with the snap command patch I also extended the kmem
command with iomem/ioport support. Specifically:
kmem -r displays /proc/iomem
crash> kmem -r
00000000-0000ffff : reserved
00010000-0009dbff : System RAM
0009dc00-0009ffff : reserved
000c0000-000c7fff : Video ROM
...
and kmem -R displays /proc/ioports
crash> kmem -R
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
...
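For readers who want to post-process this kind of listing, here is a minimal sketch, assuming the "start-end : name" layout shown in the kmem -r sample above. The function names parse_resources and total_size are illustrative only and are not part of crash or of the patch.

```python
import re

# Matches lines such as "00010000-0009dbff : System RAM".
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def parse_resources(text):
    """Return a list of (start, end, name) tuples from iomem-style text."""
    resources = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip prompts, blank lines and "..." elisions
        resources.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return resources


def total_size(resources, name):
    """Sum the sizes of all ranges with the given name (end is inclusive)."""
    return sum(end - start + 1 for start, end, n in resources if n == name)


# The sample lines are copied from the kmem -r output in the message.
sample = """\
crash> kmem -r
00000000-0000ffff : reserved
00010000-0009dbff : System RAM
0009dc00-0009ffff : reserved
000c0000-000c7fff : Video ROM
"""

if __name__ == "__main__":
    for start, end, name in parse_resources(sample):
        print(f"{start:#010x}-{end:#010x} {end - start + 1:>8} bytes  {name}")
```

The end address is treated as inclusive, which is how /proc/iomem ranges are normally read.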
Looking back through old kernel source code, the resource structure
has been unchanged since linux-2.4.0. I borrowed the way of walking the
resource tree in this patch from the latest v3.3-rc series, but I
expect the logic also applies to old kernels. I am relying on Dave's
regression test suite to confirm this.
Also, there may be another command more suitable for iomem/ioport.
If necessary, I'll repost the patch.
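As an illustration of the resource-tree walk mentioned above, here is a minimal sketch of a pre-order traversal that uses only parent, sibling and child links, which are the link fields of the kernel's struct resource. It is not the code from the patch. The Resource class, add_child helper and the sample tree are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(eq=False)
class Resource:
    """Simplified stand-in for the kernel's struct resource."""
    start: int
    end: int
    name: str
    parent: Optional["Resource"] = None
    sibling: Optional["Resource"] = None
    child: Optional["Resource"] = None


def add_child(parent: Resource, child: Resource) -> Resource:
    """Append child to the end of parent's child list (helper for the demo)."""
    child.parent = parent
    if parent.child is None:
        parent.child = child
    else:
        last = parent.child
        while last.sibling is not None:
            last = last.sibling
        last.sibling = child
    return child


def next_resource(p: Resource) -> Optional[Resource]:
    """Pre-order successor: first child, else next sibling of nearest ancestor."""
    if p.child is not None:
        return p.child
    while p.sibling is None and p.parent is not None:
        p = p.parent
    return p.sibling


def walk(root: Resource) -> Iterator[Tuple[int, Resource]]:
    """Yield (depth, resource) for every descendant of a top-level root."""
    p = root.child
    while p is not None:
        depth, q = 0, p.parent
        while q is not None and q is not root:
            depth += 1
            q = q.parent
        yield depth, p
        p = next_resource(p)


if __name__ == "__main__":
    # Hypothetical tree; the names and ranges are for illustration only.
    root = Resource(0, 0xFFFFFFFF, "PCI mem")
    ram = add_child(root, Resource(0x10000, 0x9DBFF, "System RAM"))
    add_child(ram, Resource(0x20000, 0x2FFFF, "Kernel code"))
    add_child(root, Resource(0xC0000, 0xC7FFF, "Video ROM"))
    for depth, res in walk(root):
        print("  " * depth + f"{res.start:08x}-{res.end:08x} : {res.name}")
```

The depth returned by walk gives the indentation level that /proc/iomem uses for nested resources.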
---
HATAYAMA Daisuke (4):
Add vmcore snapshot support
Add kmem -r and -R options
Add dump iomem/ioport functions; a helper for resource objects
Add a helper function for iterating resource objects
defs.h | 9 ++++
extensions/snap.c | 54 ++++++++++++++++++++++-
help.c | 2 +
memory.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++--
4 files changed, 180 insertions(+), 7 deletions(-)
--
Thanks.
HATAYAMA Daisuke
1 year, 1 month
Re: [Crash-utility] question about phys_base
by Dave Anderson
----- Original Message -----
> >
> > OK, so then I don't understand what you mean by "may be the same"?
> >
> > You didn't answer my original question, but if I understand you correctly,
> > it would be impossible for the qemu host to create a PT_LOAD segment that
> > describes an x86_64 guest's __START_KERNEL_map region, because the host
> > doesn't know what kind of kernel the guest is running.
>
> Yes. Even if the guest is Linux, it is still impossible to do it, because
> the guest may be in the second kernel.
>
> qemu-dump walks all of the guest's page tables and collects the mapping
> between virtual and physical addresses. If a page is not used by the guest,
> its virtual address is set to 0. I create the PT_LOAD segments from this
> mapping. So if the guest is Linux, there may be a PT_LOAD segment that
> describes the __START_KERNEL_map region. But the information stored in that
> PT_LOAD may be for the second kernel. If crash uses it, crash will see the
> second kernel, not the first kernel.
Just to be clear -- what do you mean by the "second" kernel? Do you
mean that a guest kernel crashed and did a kdump operation,
that the second (kdump) kernel then failed somehow, and now you're trying
to do a "virsh dump" on the kdump kernel?
Dave
1 year, 1 month
question about phys_base
by Wen Congyang
Hi, Dave
I am implementing a new dump command in qemu. The vmcore's
format is ELF (like kdump), and I try to provide phys_base in
the PT_LOAD. But if the OS uses the first vcpu to do kdump, the
value of phys_base is wrong.
I found the function x86_64_virt_phys_base() in crash's code.
Is it OK to call this function first? If the function
succeeds, we do not calculate phys_base from the PT_LOAD.
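To make the PT_LOAD-based calculation concrete, here is a minimal sketch, assuming the usual x86_64 relation for the kernel text mapping: virtual = physical - phys_base + __START_KERNEL_map. It is not the code used by crash or qemu. The PtLoad type, the function name and the sample addresses are hypothetical, and treating every address at or above __START_KERNEL_map as kernel text is a simplification.

```python
from typing import Iterable, NamedTuple, Optional

# x86_64 __START_KERNEL_map, the base of the kernel text mapping.
START_KERNEL_MAP = 0xFFFFFFFF80000000


class PtLoad(NamedTuple):
    """The three PT_LOAD fields this sketch needs."""
    vaddr: int
    paddr: int
    memsz: int


def phys_base_from_pt_load(segments: Iterable[PtLoad]) -> Optional[int]:
    """Derive phys_base from the first segment mapped in the kernel text region.

    Returns None when no segment describes that region, for example when the
    dumper stored 0 as the virtual address of every segment.
    """
    for seg in segments:
        if seg.vaddr >= START_KERNEL_MAP:
            return seg.paddr - (seg.vaddr - START_KERNEL_MAP)
    return None


if __name__ == "__main__":
    # Hypothetical segments: one without a virtual address, one for kernel text.
    segments = [
        PtLoad(vaddr=0, paddr=0x1000, memsz=0x9F000),
        PtLoad(vaddr=0xFFFFFFFF81000000, paddr=0x2D000000, memsz=0x1000000),
    ]
    base = phys_base_from_pt_load(segments)
    print("no kernel text PT_LOAD found" if base is None else f"phys_base = {base:#x}")
```

As the thread points out, a value derived this way describes whichever kernel's page tables were walked, so it can describe the kdump kernel rather than the first kernel.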
Thanks
Wen Congyang
1 year, 1 month
[PATCH] runq: search current task's runqueue explicitly
by HATAYAMA Daisuke
Currently, the runq sub-command doesn't take into account that a CFS
runqueue's current task is removed from that CFS runqueue. Because of this,
the remaining CFS runqueues that follow the current task's are not
displayed. This patch fixes this by making the runq sub-command search the
current task's runqueue explicitly.
Note that a CFS runqueue exists for each task group, and so does a CFS
runqueue's current task, so the above search needs to be done
recursively.
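The effect described above can be modelled with a small sketch. It is an illustration under simplifying assumptions, not the patch itself. Entity and CfsRq are toy stand-ins for the kernel's sched_entity and cfs_rq, the rb-tree is reduced to a list, and the task names are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """A scheduling entity: a task, or a group when my_q is set."""
    name: str
    my_q: Optional["CfsRq"] = None


@dataclass
class CfsRq:
    """A CFS runqueue: queued entities plus the current one, held off the tree."""
    queued: List[Entity] = field(default_factory=list)
    curr: Optional[Entity] = None


def walk_tree_only(rq: CfsRq) -> List[str]:
    """Old behaviour: visit only queued entities, never the current one."""
    tasks: List[str] = []
    for se in rq.queued:
        if se.my_q is not None:
            tasks += walk_tree_only(se.my_q)
        else:
            tasks.append(se.name)
    return tasks


def walk_with_curr(rq: CfsRq) -> List[str]:
    """Fixed behaviour: also descend into the current entity's own runqueue."""
    tasks: List[str] = []
    if rq.curr is not None and rq.curr.my_q is not None:
        tasks += walk_with_curr(rq.curr.my_q)  # recursion, one level per group
    for se in rq.queued:
        if se.my_q is not None:
            tasks += walk_with_curr(se.my_q)
        else:
            tasks.append(se.name)
    return tasks


if __name__ == "__main__":
    # Group A is current on the root runqueue; inside A one task is running
    # and another one is still queued.
    a_rq = CfsRq(queued=[Entity("loop.A-2")], curr=Entity("loop.A-1"))
    root = CfsRq(queued=[Entity("loop.root")], curr=Entity("A", my_q=a_rq))
    print("tree only:", walk_tree_only(root))
    print("with curr:", walk_with_curr(root))
```

The running task itself is deliberately left out of both lists, matching the runq output, where it appears on the CURRENT line rather than under CFS RB_ROOT.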
Test
====
On vmcore I made 7 task groups:
root group --- A --- AA --- AAA
               +         +- AAB
               |
               +- AB --- ABA
                      +- ABB
and then I ran three CPU-bound tasks, each exactly the same as
int main(void) { for (;;) continue; return 0; }
in each task group, including the root group, for a total of 24 tasks. For
readability, I annotated each task name with the name of the group it belongs to.
For example, loop.ABA belongs to task group ABA.
Look at the CPU 0 output below. [before] lacks 8 tasks, while [after]
successfully shows all tasks on the runqueue, which is identical to
the result of [sched debug], which is expected to output the correct result.
I'll send this vmcore later.
[before]
crash> runq | cat
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND: "loop.ABA"
<cut>
[after]
crash_fix> runq
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND: "loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND: "loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND: "loop.ABB"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
<cut>
[sched debug]
crash> runq -d
CPU 0
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND: "loop.ABA"
[120] PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND: "loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND: "loop.ABB"
<cut>
Diff stat
=========
defs.h | 1 +
task.c | 37 +++++++++++++++++--------------------
2 files changed, 18 insertions(+), 20 deletions(-)
Thanks.
HATAYAMA, Daisuke
1 year, 1 month
[RFC] makedumpfile, crash: LZO compression support
by HATAYAMA Daisuke
Hello,
This is an RFC patch set that adds LZO compression support to
makedumpfile and the crash utility. LZO is about as good as zlib in
compressed size but far better in speed, which reduces downtime during
crash dump generation and refiltering.
How to build:
1. Get the LZO library, which is provided as the lzo-devel package on recent
Linux distributions and is also available on the author's website:
http://www.oberhumer.com/opensource/lzo/.
2. Apply the patch set to makedumpfile v1.4.0 and crash v6.0.0.
3. Build both using make. For crash, do the following for now:
$ make CFLAGS="-llzo2"
How to use:
I've added a new -l option for LZO compression in this patch. For
example:
$ makedumpfile -l vmcore dumpfile
$ crash vmlinux dumpfile
Request for a configure-like feature in the crash utility:
I would like a configure-like feature in the crash utility that lets users
select at build time whether to include the LZO feature,
that is: ./configure --enable-lzo or ./configure --disable-lzo.
The reason is that support staff often download and install the
latest version of the crash utility on machines where the lzo library is not
provided.
Looking at the source code, it seems that crash does some kind
of configuration processing in its own way, around configure.c,
and I guess it's difficult to use the autoconf tools directly.
Or is there a better way?
Performance Comparison:
Sample Data
Ideally, I would have measured the performance on a large enough number of
vmcores generated from machines that were actually running, but
I don't have enough sample vmcores at the moment, so I couldn't. This
comparison therefore doesn't answer the question of I/O time improvement.
This is a TODO for now.
Instead, I chose the worst and best cases with regard to compression
ratio and speed only. Specifically, the former is /dev/urandom and
the latter is /dev/zero.
I generated sample data of 10MB, 100MB and 1GB like this:
$ dd bs=4096 count=$((1024*1024*1024/4096)) if=/dev/urandom of=urandom.1GB
How to measure
Then I compressed each 4096-byte block separately and
measured the total compression time and output size. See the attached
mycompress.c.
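The attached mycompress.c is not reproduced here. As a rough illustration of the measurement method only, here is a minimal sketch that compresses data one 4096-byte block at a time and records the total time and output size. It uses zlib from the Python standard library. LZO is not in the standard library, so the compressor is passed in as a parameter, and the function name compress_blocks is an assumption made for the example.

```python
import os
import time
import zlib

BLOCK_SIZE = 4096  # block size used in the message


def compress_blocks(data, compress=zlib.compress):
    """Compress data block by block; return (elapsed_seconds, total_output_bytes)."""
    total_out = 0
    start = time.perf_counter()
    for off in range(0, len(data), BLOCK_SIZE):
        total_out += len(compress(data[off:off + BLOCK_SIZE]))
    return time.perf_counter() - start, total_out


if __name__ == "__main__":
    size = 1024 * 1024  # 1 MiB keeps the demo quick; the message used up to 1GB
    samples = (("zero", bytes(size)), ("urandom", os.urandom(size)))
    for label, data in samples:
        elapsed, out = compress_blocks(data)
        print(f"{label:8s} in={size} out={out} time={elapsed:.4f}s")
```

Any other callable that maps bytes to bytes can be passed as compress to compare compressors on the same input.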
Result
See attached file result.txt.
Discussion
For both kinds of data, lzo's compression was considerably quicker
than zlib's. lzo's compression time is about 37% of zlib's for urandom
data, and about 8.5% for zero data. The actual contents of physical memory
would fall between the two cases, so I guess the average
compression time ratio is between 8.5% and 37%.
Although beyond the topic of this patch set, we can estimate the worst-case
compression time for larger data sizes, since compression is performed
block by block and the compression time increases
linearly. The estimated worst-case time for 2TB of memory is about 15 hours
for lzo and about 40 hours for zlib. In this case, the compressed data
is larger than the original, so it is not actually used and the
compression time is entirely wasted. I think compression must be
done in parallel, and I'll post such a patch later.
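The linear estimate can be checked with a little arithmetic. The sketch below takes only the two figures quoted in the message (15 hours and 40 hours for 2TB), derives the throughput they imply, and scales it to another size. Reading "2TB" as 2 TiB is an assumption, and the function names are illustrative. Note that 15/40 = 37.5%, which agrees with the roughly 37% figure quoted for urandom data.

```python
TIB = 1024 ** 4  # assumption: "2TB" in the message is read as 2 TiB

# Worst-case figures quoted in the message for 2TB of memory.
LZO_HOURS_2TB = 15.0
ZLIB_HOURS_2TB = 40.0


def implied_throughput(total_bytes, hours):
    """Bytes per second implied by compressing total_bytes in the given hours."""
    return total_bytes / (hours * 3600.0)


def estimated_hours(total_bytes, bytes_per_second):
    """Linear extrapolation: time grows in proportion to the data size."""
    return total_bytes / bytes_per_second / 3600.0


if __name__ == "__main__":
    for label, hours in (("lzo", LZO_HOURS_2TB), ("zlib", ZLIB_HOURS_2TB)):
        rate = implied_throughput(2 * TIB, hours)
        print(f"{label:5s} {rate / 2**20:6.1f} MiB/s, "
              f"4 TiB would take {estimated_hours(4 * TIB, rate):.0f} hours")
    print(f"lzo/zlib time ratio: {LZO_HOURS_2TB / ZLIB_HOURS_2TB:.3f}")
```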
Diffstat
* makedumpfile
diskdump_mod.h | 3 +-
makedumpfile.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++------
makedumpfile.h | 12 +++++++
3 files changed, 101 insertions(+), 12 deletions(-)
* crash
defs.h | 1 +
diskdump.c | 20 +++++++++++++++++++-
diskdump.h | 3 ++-
3 files changed, 22 insertions(+), 2 deletions(-)
TODO
* evaluation including I/O time using actual vmcores
Thanks.
HATAYAMA, Daisuke
1 year, 1 month
Re: [Crash-utility] [RFI] Support Fujitsu's sadump dump format
by tachibana@mxm.nes.nec.co.jp
Hi Hatayama-san,
On 2011/06/29 12:12:18 +0900, HATAYAMA Daisuke <d.hatayama(a)jp.fujitsu.com> wrote:
> From: Dave Anderson <anderson(a)redhat.com>
> Subject: Re: [Crash-utility] [RFI] Support Fujitsu's sadump dump format
> Date: Tue, 28 Jun 2011 08:57:42 -0400 (EDT)
>
> >
> >
> > ----- Original Message -----
> >> Fujitsu has stand-alone dump mechanism based on firmware level
> >> functionality, which we call SADUMP, in short.
> >>
> >> We've maintained utility tools internally but now we're thinking that
> >> the best is for the crash utility and makedumpfile to support the sadump
> >> format, from the viewpoint of both portability and maintainability.
> >>
> >> We'll of course be responsible for its maintenance on a continuing
> >> basis. The sadump dump format is very similar to the diskdump format, and
> >> so to the kdump (compressed) format, so we estimate the patch set would be
> >> relatively small.
> >>
> >> Could you tell me whether crash utility and makedumpfile can support
> >> the sadump format? If OK, we'll start to make patchset.
I think it's not bad to support sadump in makedumpfile. However, I have
several questions.
- Do you want to use makedumpfile to shrink an existing file that sadump has
dumped?
- It isn't possible for sadump to use the same format as the
kdump-compressed format now, is it?
- When the information that makedumpfile reads from a note of /proc/vmcore
(or a header of kdump-compressed format) is extended by an extension of
makedumpfile, do you need to modify sadump?
Thanks
tachibana
> >
> > Sure, yes, the crash utility can always support another dumpfile format.
> >
>
> Thanks. It helps a lot.
>
> > It's unclear to me how similar SADUMP is to diskdump/compressed-kdump.
> > Does your internal version patch diskdump.c, or do you maintain your
> > own "sadump.c"? I ask because if your patchset is at all intrusive,
> > I'd prefer it be kept in its own file, primarily for maintainability,
> > but also because SADUMP is essentially a black-box to anybody outside
> > Fujitsu.
>
> What I meant by ``similar'' is both literally and
> logically. The format consists of a diskdump-like header, two
> kinds of bitmaps used for the same purpose as those in the diskdump format,
> and memory data. They can be handled in common with the existing data
> structure, diskdump_data, non-intrusively, so I hope they can be placed
> in diskdump.c.
>
> On the other hand, there is code that belongs in a sadump-specific
> place. sadump is triggered depending on kdump's progress, so the
> register values contained in the vmcore vary according to that
> progress: if crash_notes has been initialized when sadump is
> triggered, sadump packs the register values in crash_notes; if not,
> it packs the registers gathered by firmware. This is sadump-specific
> processing, so I think putting it in a separate sadump.c file is a
> natural and reasonable choice.
>
> Anyway, I have not made any patch set for this yet. I'll post a patch set
> when I complete it.
>
> Again, thanks a lot for the positive answer.
>
> Thanks.
> HATAYAMA, Daisuke
1 year, 1 month
gcore: Segmentation fault due to renaming of old_rsp symbol in kernel
by Eric Ewanco
I am trying to use gcore to generate a user application core from a kernel dump file. I compiled the latest crash-7.1.6 and crash-gcore-command-1.3.1 from https://people.redhat.com/anderson/. I installed a debug kernel (vmlinux-4.1.34-33-debug.gz from openSUSE Leap 42.1) and did a controlled (sysrq-trigger) crash. When I attempt to use gcore on the process in question, after reading <https://people.redhat.com/anderson/extensions/gcore_help_gcore.html>, I get a segmentation fault:
eje-code:~ # crash /boot/vmlinux-4.1.34-33-debug.gz /var/crash/2016-10-31-17\:01//vmcore
crash 7.1.6
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /boot/vmlinux-4.1.34-33-debug.gz
DUMPFILE: /var/crash/2016-10-31-17:01//vmcore
CPUS: 4
DATE: Mon Oct 31 13:01:36 2016
UPTIME: 02:12:08
LOAD AVERAGE: 0.00, 0.00, 0.00
TASKS: 204
NODENAME: eje-code
RELEASE: 4.1.34-33-debug
VERSION: #1 SMP Thu Oct 20 08:03:29 UTC 2016 (fe18aba)
MACHINE: x86_64 (2094 Mhz)
MEMORY: 4 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 3260
COMMAND: "crashtest"
TASK: ffff88011a020550 [THREAD_INFO: ffff8800bcd98000]
CPU: 3
STATE: TASK_RUNNING (SYSRQ)
crash> extend /usr/lib64/crash/extensions/gcore.so
/usr/lib64/crash/extensions/gcore.so: shared object loaded
crash> gcore -f 0 -v 7 3260
gcore: Opening file core.3260.crashtest ...
gcore: done.
gcore: Writing ELF header ...
gcore: done.
gcore: Retrieving and writing note information ...
Segmentation fault
Sixty-four bytes of core get written before the segmentation fault (I'm guessing that's the ELF header). I can gcore some other processes (although I get many "gcore: WARNING: page fault at 7ffca6a5d000" errors). I tried this both with an echo from bash from the command line and a custom test program that just does a controlled crash in a function nested four deep. The segmentation fault sometimes causes a hang (which I can end with Ctrl-C).
It does the same thing if I specify the task address (in this case, "gcore ffff88011a020550"). I've tried it without any options, too, and with different combinations.
I obtained a core dump of gcore and this is my debugging session:
eje-code:~ # gdb /usr/lib64/crash/extensions/gcore.so /var/core/core.eje-code-crash-3074
GNU gdb (GDB; openSUSE Leap 42.1) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib64/crash/extensions/gcore.so...done.
warning: core file may not match specified executable file. [Not sure why ...]
[New LWP 3074]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `crash /boot/vmlinux-4.1.34-33-debug.gz /var/crash/2016-10-31-17:01//vmcore'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.19-17.4.x86_64 liblzma5-debuginfo-5.0.5-3.5.x86_64 libncurses5-debuginfo-5.9-53.4.x86_64 libz1-debuginfo-1.2.8-6.4.x86_64
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007f1235eed4e4 in restore_regs_syscall_context (target=0x6939df8, regs=0xf6f280, active_regs=0x7ffefa968880)
at libgcore/gcore_x86.c:1656
#2 0x00007f1235eedcb6 in genregs_get (target=0x6939df8, regset=0x7f12360f6460 <x86_64_regsets>, size=216,
buf=0xf6f280) at libgcore/gcore_x86.c:1795
#3 0x00007f1235ee6438 in fill_write_thread_core_info (fp=0x59efb10, tc=0x6939df8, dump_tc=0x6939df8, info=0xf6ee80,
view=0x7f12360f5d80 <x86_64_regset_view>, offset=0x7ffefa968ab0, total=0xf6ee98) at libgcore/gcore_coredump.c:469
#4 0x00007f1235ee682c in fill_write_note_info (fp=0x59efb10, info=0xf6ee80, phnum=20, offset=0x7ffefa968ab0)
at libgcore/gcore_coredump.c:566
#5 0x00007f1235ee4dd1 in gcore_coredump () at libgcore/gcore_coredump.c:112
#6 0x00007f1235eeeb8b in do_gcore (arg=0x0) at gcore.c:317
#7 0x00007f1235eee926 in cmd_gcore () at gcore.c:253
#8 0x0000000000472b8c in ?? ()
#9 0x0000000000000000 in ?? ()
(gdb) up
#1 0x00007f1235eed4e4 in restore_regs_syscall_context (target=0x6939df8, regs=0xf6f280, active_regs=0x7ffefa968880)
at libgcore/gcore_x86.c:1656
1656 regs->sp = gxt->get_old_rsp(target->processor);
(gdb) print gxt
$1 = (struct gcore_x86_table *) 0x215ea0 <gcore_x86_table>
(gdb) print *target
$2 = {task = 18446612137045525840, thread_info = 18446612135482589184, pid = 3260, comm = "crashtest\000@XI\215u H",
processor = 3, ptask = 18446612137046565648, mm_struct = 18446612137048351232, tc_next = 0x0}
(gdb) print *regs
$3 = {r15 = 0, r14 = 2, r13 = 2, r12 = 34324496, bp = 2, bx = 4196186, r11 = 582, r10 = 140728806957456,
r9 = 140048302249728, r8 = 34324720, ax = 18446744073709551578, cx = 140048297135408, dx = 2, si = 140048302292992,
di = 3, orig_ax = 1, ip = 140048297135408, cs = 51, flags = 582, sp = 140728806957864, ss = 43, fs_base = 0,
gs_base = 0, ds = 0, es = 0, fs = 0, gs = 0}
(gdb) print *gxt
$4 = {get_old_rsp = 0x0, get_thread_struct_fpu = 0x0, get_thread_struct_fpu_size = 0x0, is_special_syscall = 0x0,
is_special_ia32_syscall = 0x0, tsk_used_math = 0x0}
=============================
So not only is get_old_rsp zero, all the fields in gxt are zero.
Looks like a kernel support issue. This field is filled in by gcore_x86_table_register_get_old_rsp() which looks up four symbols in various forms, none of which exist in my kernel:
eje-code:~ # fgrep old_rsp /proc/kallsyms
eje-code:~ # fgrep cpu_pda /proc/kallsyms
eje-code:~ #
old_rsp did exist in openSUSE 12.1 and 13.1 (3.11.10-29 for the latter).
According to http://lists.openwall.net/linux-kernel/2015/03/17/766 old_rsp was renamed rsp_scratch. I don't know if the semantics changed -- it doesn't appear so -- but I added code to accept this symbol as an alternative and the core dump generates and works (I can see a correct backtrace). I do not warrant the work though. :-) Someone may want to review my work, and check the other functions and see if they are supposed to be zero. Since they haven't been invoked I don't know if they are supposed to be non-zero or not.
Here is the diff:
--- gcore_x86.c~ 2014-11-06 04:58:47.000000000 -0500
+++ gcore_x86.c 2016-10-31 16:01:00.989025841 -0400
@@ -1351,6 +1351,26 @@ static ulong gcore_x86_64_get_old_rsp(in
 }
 /**
+ * gcore_x86_64_get_rsp_scratch() - get rsp at per-cpu area
+ *
+ * @cpu target CPU's CPU id
+ *
+ * Given a CPU id, returns a RSP value saved at per-cpu area for the
+ * CPU whose id is the given CPU id.
+ */
+static ulong gcore_x86_64_get_rsp_scratch(int cpu)
+{
+	ulong old_rsp;
+
+	readmem(symbol_value("rsp_scratch") + kt->__per_cpu_offset[cpu],
+		KVADDR, &old_rsp, sizeof(old_rsp),
+		"gcore_x86_64_get_rsp_scratch: rsp_scratch",
+		gcore_verbose_error_handle());
+
+	return old_rsp;
+}
+
+/**
  * gcore_x86_64_get_per_cpu__old_rsp() - get rsp at per-cpu area
  *
  * @cpu target CPU's CPU id
@@ -1834,6 +1854,11 @@ static void gcore_x86_table_register_get
 	else if (symbol_exists("_cpu_pda"))
 		gxt->get_old_rsp = gcore_x86_64_get_cpu__pda_oldrsp;
+
+	else if (symbol_exists("rsp_scratch"))
+		gxt->get_old_rsp = gcore_x86_64_get_rsp_scratch;
+
+	if (!gxt->get_old_rsp) printf ("Warning: NO gxt->get_old_rsp\n");
 }
 #endif
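The crash in the backtrace comes from calling a handler that was never registered. The sketch below models the same "pick a reader by which symbol exists" pattern and shows one way to fail with a clear error instead of jumping through a null pointer. It is a toy model, not the gcore extension's code. The symbol table is a plain dictionary of per-CPU values, the function names are hypothetical, and the candidate list contains only symbol names that appear in the message.

```python
# Symbol names taken from the message, tried in this order.
CANDIDATE_SYMBOLS = ("old_rsp", "_cpu_pda", "rsp_scratch")


def make_reader(symbols, name):
    """Return a function that reads the per-CPU value stored under name."""
    def read(cpu):
        return symbols[name][cpu]
    return read


def select_rsp_reader(symbols, candidates=CANDIDATE_SYMBOLS):
    """Pick a reader for the first candidate symbol the kernel provides."""
    for name in candidates:
        if name in symbols:
            return make_reader(symbols, name)
    return None


def get_user_rsp(symbols, cpu):
    """Read the saved user RSP, or raise instead of calling a missing handler."""
    reader = select_rsp_reader(symbols)
    if reader is None:
        raise LookupError(
            "no known symbol holds the saved RSP; tried: "
            + ", ".join(CANDIDATE_SYMBOLS))
    return reader(cpu)


if __name__ == "__main__":
    # Hypothetical per-CPU values for a kernel that only has rsp_scratch.
    newer_kernel = {"rsp_scratch": [0x7FFD0000, 0x7FFD1000]}
    print(hex(get_user_rsp(newer_kernel, 1)))
    try:
        get_user_rsp({}, 0)
    except LookupError as err:
        print("error:", err)
```

Checking the handler at the point of use catches a missing symbol regardless of which registration path left the table entry empty.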
8 years, 1 month
[ANNOUNCE] crash version 7.1.7 is available
by Dave Anderson
Download from: http://people.redhat.com/anderson
or
https://github.com/crash-utility/crash/releases
The github master branch serves as a development branch that will contain
all patches that are queued for the next release:
$ git clone git://github.com/crash-utility/crash.git
Changelog:
- Set the default 32-bit MIPS HZ value to 100 if the in-kernel config
data is unavailable, and have the "mach" command display the value.
(rabinv(a)axis.com)
- Enable SPARSEMEM support on 32-bit MIPS by setting SECTION_SIZE_BITS
and MAX_PHYSMEM_BITS.
(rabinv(a)axis.com)
- Fix for Linux 4.9-rc1 commits 15f4eae70d365bba26854c90b6002aaabb18c8aa
and c65eacbe290b8141554c71b2c94489e73ade8c8d, which have introduced a
new CONFIG_THREAD_INFO_IN_TASK configuration. This configuration
moves each task's thread_info structure from the base of its kernel
stack into its task_struct. Without the patch, the crash session
fails during initialization with the error "crash: invalid structure
member offset: thread_info_cpu".
(anderson(a)redhat.com)
- Fixes for the gathering of the active task registers from 32-bit MIPS
dumpfiles:
(1) If ELF notes are not available, read them from the kernel's
crash_notes.
(2) If an online CPU did not save its ELF notes, then adjust
the mapping of each ELF note to its CPU accordingly.
(rabinv(a)axis.com)
- Add support for "help -r" on 32-bit MIPS to display the registers
for each CPU from a dumpfile.
(rabinv(a)axis.com)
- Fix for Linux 4.9-rc1 commit 0100301bfdf56a2a370c7157b5ab0fbf9313e1cd,
which rewrote the x86_64 switch_to() code by embedding the call to
__switch_to() inside a new __switch_to_asm() assembly code ENTRY()
function. Without the patch, the message "crash: cannot determine
thread return address" gets displayed during initialization, and the
"bt" command shows frame #0 starting at "schedule" instead of
"__schedule".
(anderson(a)redhat.com)
- When each x86_64 per-cpu cpu_tss.x86_tss.ist[] array member (or in
older kernels, each per-cpu init_tss.x86_hw_tss.ist[] array member),
is compared with its associated per-cpu orig_ist.ist[] array member,
ensure that both exception stack pointers have been initialized
(non-NULL) before printing a WARNING message if they don't match.
(anderson(a)redhat.com)
- Fix for a possible segmentation violation when analyzing Linux 4.7
x86_64 kernels that are configured with CONFIG_RANDOMIZE_BASE.
Depending upon the randomized starting address of the kernel text
and static data, a segmentation violation may occur during session
initialization, just after the patching of the gdb minimal_symbol
values message.
(anderson(a)redhat.com)
- Restore the x86_64 "dis" command's symbolic translation of jump
or call target addresses if the kernel was configured with
CONFIG_RANDOMIZE_BASE.
(anderson(a)redhat.com)
- Fix for the 32-bit MIPS "bt" command to prevent an empty display
(task header only) for an active task if the epc register in its
exception frame contains 00000000.
(rabinv(a)axis.com)
- Fix for support of Linux 4.7 and later x86_64 ELF kdump vmcores from
kernels configured with CONFIG_RANDOMIZE_BASE. Without the patch,
the crash session may fail during initialization with the message
"crash: vmlinux and vmcore do not match!".
(anderson(a)redhat.com)
- Fix for the x86_64 "mach" command display of the vmemmap base address
in Linux 4.9 and later kernels configured with CONFIG_RANDOMIZE_BASE.
Without the patch, the command shows a value of ffffea0000000000 next
to "KERNEL VMEMMAP BASE".
(anderson(a)redhat.com)
- Since the Linux 3.10 release, the kernel has offered the ability to
create multiple independent ftrace buffers. At present, however,
the "trace.c" extension module is only able to extract the primary
buffer. This patch refactors the trace.c extension module so that
the global instance is passed around as a parameter rather than
accessing it directly, and then locates all of the available
instances and extracts the data from each of them.
(kyle.a.tomsic(a)gmail.com)
- Fix for the s390x "bt" command for active tasks. Since the commit
above in this crash-7.1.7 release that added support for the new
CONFIG_THREAD_INFO_IN_TASK configuration, the backtrace of active
tasks can be incomplete.
(holzheu(a)linux.vnet.ibm.com)
- In collaboration with an update to the /dev/crash kernel driver, fix
for Linux 4.6 commit a7f8de168ace487fa7b88cb154e413cf40e87fc6, which
allows the ARM64 kernel image to be loaded anywhere in physical
memory. Without the patch, attempting to run live on an ARM64
Linux 4.6 and later kernel may display the warning message "WARNING:
cannot read linux_banner string", and then fails with the message
"crash: vmlinux and /dev/crash do not match!". Version 1.3 of the
crash driver is required, which introduces a new ioctl command that
retrieves the ARM64-only "kimage_voffset" value that is required for
virtual-to-physical address translation.
(anderson(a)redhat.com)
- Update of the sample memory_driver/crash.c /dev/crash kernel driver
to version 1.3, which adds support for Linux 4.6 and later ARM64
kernels and for kernels configured with CONFIG_HARDENED_USERCOPY, and
on S390X kernels uses xlate_dev_mem_ptr() and unxlate_dev_mem_ptr()
instead of kmap() and kunmap().
(anderson(a)redhat.com)
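As a note on the ARM64 "kimage_voffset" entry above: for addresses inside the kernel image, the value is the difference between the virtual and the physical address, so translation is a single subtraction. The sketch below shows that relation only. The addresses are made up for illustration and the function name is not crash's.

```python
def kimg_to_phys(vaddr, kimage_voffset):
    """Translate a kernel-image virtual address to a physical address."""
    return vaddr - kimage_voffset


if __name__ == "__main__":
    # Hypothetical load addresses; a live session has to obtain
    # kimage_voffset from the running kernel.
    text_vaddr = 0xFFFF000008080000
    text_paddr = 0x0000000040080000
    voffset = text_vaddr - text_paddr

    symbol_vaddr = text_vaddr + 0x1234
    print(hex(kimg_to_phys(symbol_vaddr, voffset)))
```

Because the image can be loaded anywhere in physical memory, the offset cannot be assumed, which is why the changelog entry describes retrieving it through the updated driver.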
8 years, 1 month