Re:[RFC] Crash patch for DWARF CFI based unwind support
by Dave Anderson
> Hi Dave
>
> The following patch adds support for DWARF CFI based stack unwinding
> for crash. Since this method uses the call frame instructions for
> unwinding, it generates better backtraces than the existing backtrace
> mechanism. So when we have the unwind info available, this new method
> will be called, else we fall back to the existing mechanism.
>
> ... <this section moved below>
>
> Please provide your suggestions and comments.
>
> Thanks
> Rachita
Hi Rachita,
I've only been able to test this on a live system that has __start_unwind
and __end_unwind symbols, so I don't know what a backtrace with an
in-kernel exception frame, or a backtrace with a transition to the x86_64
IRQ stack or x86_64 exception stacks, would look like. If you have
an example, I'd be interested in seeing how they get handled.
But what's there, i.e., with only user-mode-to-kernel-mode exception
frames being displayed, looks pretty good!
I'd like to tinker with it a bit, and fold most of what you've
done into 4.0-3.8. Then you can use that as a source base to
continue.
I do have a few questions/comments.
This netdump.c patch doesn't make sense to me -- we really don't
want to be doing any ELFSTORE operations in the debug-only
netdump_memory_dump() function:
@@ -771,8 +772,11 @@ netdump_memory_dump(FILE *fp)
dump_Elf64_Phdr(nd->load64 + i, ELFREAD);
offset64 = nd->notes64->p_offset;
for (tot = 0; tot < nd->notes64->p_filesz; tot += len) {
+ if (has_unwind_info)
+ len = dump_Elf64_Nhdr(offset64, ELFSTORE);
+ else
len = dump_Elf64_Nhdr(offset64, ELFREAD);
- offset64 += len;
+ offset64 += len;
}
break;
}
This patch below looks to only be necessary in dumpfiles, but it seems
like, given that the x86_64 user_regs_struct is unavailable in 2.6
vmlinux files, that the initializations in get_netdump_regs_x86_64()
would never get done -- because VALID_STRUCT(user_regs_struct) would
fail, right?
@@ -1562,8 +1566,10 @@ get_netdump_regs_x86_64(struct bt_info *
if (is_task_active(bt->task))
bt->flags |= BT_DUMPFILE_SEARCH;
- if ((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
- VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) {
+ if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
+ VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
+ (KDUMP_DUMPFILE() && has_unwind_info && (bt->flags &
+ BT_DUMPFILE_SEARCH))) {
if (nd->num_prstatus_notes > 1)
note = (Elf64_Nhdr *)
nd->nt_prstatus_percpu[bt->tc->processor];
@@ -1574,9 +1580,21 @@ get_netdump_regs_x86_64(struct bt_info *
len = roundup(len + note->n_namesz, 4);
len = roundup(len + note->n_descsz, 4);
+ if KDUMP_DUMPFILE() {
+ ASSIGN_SIZE(user_regs_struct) = 27 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rsp) = 19 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rip) = 16 * sizeof(unsigned long);
+ }
user_regs = ((char *)note + len)
- SIZE(user_regs_struct) - sizeof(long);
+ if KDUMP_DUMPFILE() {
+ *rspp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rsp));
+ *ripp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rip));
+ if (*ripp && *rspp)
+ return;
+ }
+
But then again, perhaps you never needed the user_regs_struct_rsp and
user_regs_struct_rip offsets in your test scenario?
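For reference, the hard-coded numbers in the patch (27 words total, rip at word 16, rsp at word 19) agree with the usual x86_64 user_regs_struct field order. A minimal sketch, assuming that field order; the struct name user_regs_x86_64 and the read_reg() helper are local stand-ins for illustration, not crash or kernel identifiers:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the x86_64 register layout (assumed field order). */
struct user_regs_x86_64 {
    uint64_t r15, r14, r13, r12, rbp, rbx, r11, r10;
    uint64_t r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
    uint64_t rip, cs, eflags;
    uint64_t rsp, ss;
    uint64_t fs_base, gs_base;
    uint64_t ds, es, fs, gs;
};

/* Read one 64-bit register from a raw register block at a byte offset. */
uint64_t read_reg(const void *regs, size_t offset)
{
    uint64_t value;

    /* memcpy avoids alignment assumptions about the note buffer. */
    memcpy(&value, (const unsigned char *)regs + offset, sizeof(value));
    return value;
}
```

With this layout, offsetof(struct user_regs_x86_64, rip) is 16 * 8 and offsetof(struct user_regs_x86_64, rsp) is 19 * 8, which is what the ASSIGN_OFFSET() lines encode.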
There do seem to be some unnecessary "kernel-port" leftovers that
should be pruned. For example, the __get_user_nocheck(), __get_user_size()
and __get_user_asm() definitions are superfluous, since they're only
needed by __get_user(), which is not used.
I'll also make it compilable in other than x86_64 environments,
because the unwind_x86_64.c code should be #ifdef'd X86_64.
> There still are a couple of things which need to be done, viz
> 1. Extend to obtaining unwind info from modules as well(currently
> doing only for the kernel)
Shouldn't pose a major problem -- just requires following the links
from the kernel table AFAICT. But that leads to another question...
What happens, in dwarf_backtrace(), when it encounters a module
frame? My guess is that the call to unwind() will fail, and then
the loop will bail out prematurely, and end up truncating the
trace output? We'll need some type of error-handling there I
would think.
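One possible shape for that error handling, as a rough sketch only: the loop records where unwinding failed and reports the trace as truncated, so the caller can hand the remaining stack to the existing backtrace code. Every name here (unwind_fn, bt_result, demo_unwind, DEMO_MODULE_BASE) is hypothetical, and the toy unwinder simply fails on any pc in an assumed module address range:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MODULE_BASE 0xffffffff88000000ULL /* assumed module range */

enum unwind_status { UNWIND_OK, UNWIND_DONE, UNWIND_FAIL };

struct frame { uint64_t pc; size_t depth; };
struct bt_result { size_t frames; int truncated; uint64_t fail_pc; };

typedef enum unwind_status (*unwind_fn)(struct frame *f);

/* Toy call chain, innermost first; the middle entry is a module address. */
static const uint64_t demo_chain[] = {
    0xffffffff801d33a2ULL, 0xffffffff8817b724ULL, 0xffffffff801d3d91ULL,
};
#define DEMO_CHAIN_LEN (sizeof(demo_chain) / sizeof(demo_chain[0]))

/* Toy unwinder: has no unwind table for module text, so it fails there. */
enum unwind_status demo_unwind(struct frame *f)
{
    if (demo_chain[f->depth] >= DEMO_MODULE_BASE)
        return UNWIND_FAIL;
    if (f->depth + 1 == DEMO_CHAIN_LEN)
        return UNWIND_DONE;
    f->depth++;
    f->pc = demo_chain[f->depth];
    return UNWIND_OK;
}

/* Walk frames; on failure, note the pc so a fallback walker can resume. */
struct bt_result backtrace_with_fallback(struct frame f, unwind_fn unwind,
                                         size_t max_frames)
{
    struct bt_result r = { 0, 0, 0 };

    while (r.frames < max_frames) {
        r.frames++;                      /* current frame is displayed */
        enum unwind_status st = unwind(&f);
        if (st == UNWIND_DONE)
            break;
        if (st == UNWIND_FAIL) {         /* do not silently truncate */
            r.truncated = 1;
            r.fail_pc = f.pc;
            break;
        }
    }
    return r;
}

struct bt_result demo_backtrace(void)
{
    struct frame start = { demo_chain[0], 0 };
    return backtrace_with_fallback(start, demo_unwind, 64);
}
```

In this toy run the walk shows two frames, then stops at the module address with truncated set, which is the point where a real implementation could fall back to the older text-address scan.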
> 2. Currently reading the unwind info from eh_frame section only(ie
> __start_unwind to __end_unwind). Need to add facility to read from
> the .debug_frame(if .debug_frame is present in cases where .eh_frame
> is absent. Will have to read from the vmlinux if we want to read the
> .debug_frame info)
Definitely -- we've got a plethora of kernels that have the CFI stuff
in the vmlinux file, but not in the kernel.
> 3. Add FRAME_POINTER support.
Personally, I don't much care about this...
And don't forget about x86 support!
And we should probably -- for now anyway -- make it possible to
turn this capability on and off at will. Also, for now, it should
probably default to off until we pound on it a bit. For example,
things like "bt -l" are not supported in this scheme.
But, I can't emphasize this enough -- this is a nice piece of work
that you've done here.
I'll try to get something out in the next couple of days.
Thanks,
Dave
18 years
Re: [Crash-utility] xencrash: crash analisys tool for Xen hypervisor
by Dave Anderson
> Hello Itsuro,
>
> I have only briefly scanned your patch, and find it very
> interesting. I am not at all familiar with the hypervisor
> code, but I do have a couple simple questions at this point:
>
> 1. Is it necessary to create a separate "xencrash"
> binary? Would it be possible to create a single
> crash binary that can be used for both vmlinux
> and xen-syms sessions?
>
Sorry about this -- I missed the following two parts of
the Makefile and main.c patches!
+
+xencrash: crash
+ cp -f crash xencrash
+ if (STREQ(pc->program_name, "xencrash"))
+ pc->flags |= XEN_HYPER;
However, given the above, I wonder whether it would be possible
to postpone the setting of the XEN_HYPER flag from the
setup_environment() function until later on in main() when
the command line arguments are parsed? Wouldn't it be possible
to recognize that the xen-syms object file is the hypervisor
binary, and to set the XEN_HYPER flag at that point?
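As a rough illustration of that idea, the check could be made on each file argument instead of on the program name. This sketch assumes the hypervisor binary can be recognized by a "xen-syms" file-name prefix, which is only one possible test; XEN_HYPER_DEMO and both function names are hypothetical, not crash's own:

```c
#include <string.h>

#define XEN_HYPER_DEMO 0x1UL /* stand-in for the real XEN_HYPER flag */

/* Return the final path component of a file name. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/* Assumption: a namelist whose base name starts with "xen-syms"
 * is the hypervisor binary. */
int looks_like_xen_syms(const char *path)
{
    return strncmp(base_name(path), "xen-syms", strlen("xen-syms")) == 0;
}

/* Set the flag while command line arguments are being parsed. */
unsigned long flags_for_namelist(const char *path, unsigned long flags)
{
    if (looks_like_xen_syms(path))
        flags |= XEN_HYPER_DEMO;
    return flags;
}
```

A name-based test is fragile; checking for a symbol that only the hypervisor defines would be sturdier, but that needs the symbol table to be loaded first.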
Thanks,
Dave
>
> 2. Can you make it work with a live hypervisor system
> in the same way the crash works with a live Linux
> kernel?
>
> Thanks,
> Dave
>
18 years
Re:[RFC] Crash patch for DWARF CFI based unwind support
by Dave Anderson
Hi Rachita,
I've figured out why the x86_64 interrupt-stack-to-process-stack
transition is showing a bogus exception frame. It's not kdump
or jprobes -- I think it may have been introduced with the DWARF
CFI changes.
Anyway, in older x86_64 kernels, when an interrupt was taken,
the pt_regs exception frame would be laid down on the current stack,
and the rdi register would contain a pointer to it. Then the stack
pointer would be switched to the per-cpu interrupt stack. (Actually
it is switched to a point 64 bytes from the top of the interrupt
stack, presumably for cache line purposes). The first thing
done after having been switched to the interrupt stack is to push
the rdi register, which again, contains a pointer to the exception
frame on the other stack. Then it calls the interrupt handler.
Here's the "old" code, where the last 4 instructions in the macro
shown below perform the steps outlined above:
1. get the per-cpu interrupt stack address,
2. move it into rsp -- which effectively switches stacks,
3. then the rdi register is pushed,
4. and the interrupt handler called:
.macro interrupt func
CFI_STARTPROC simple
CFI_DEF_CFA rsp,(SS-RDI)
CFI_REL_OFFSET rsp,(RSP-ORIG_RAX)
CFI_REL_OFFSET rip,(RIP-ORIG_RAX)
cld
#ifdef CONFIG_DEBUG_INFO
SAVE_ALL
movq %rsp,%rdi
/*
* Setup a stack frame pointer. This allows gdb to trace
* back to the original stack.
*/
movq %rsp,%rbp
CFI_DEF_CFA_REGISTER rbp
#else
SAVE_ARGS
leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler
#endif
testl $3,CS(%rdi)
je 1f
swapgs
1: addl $1,%gs:pda_irqcount # RED-PEN should check preempt count
movq %gs:pda_irqstackptr,%rax
cmoveq %rax,%rsp
pushq %rdi # save old stack
call \func
.endm
However, in current x86_64 kernels, the interrupt macro has changed
to look like this:
.macro interrupt func
cld
SAVE_ARGS
leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler
pushq %rbp
CFI_ADJUST_CFA_OFFSET 8
CFI_REL_OFFSET rbp, 0
movq %rsp,%rbp
CFI_DEF_CFA_REGISTER rbp
testl $3,CS(%rdi)
je 1f
swapgs
1: incl %gs:pda_irqcount # RED-PEN should check preempt count
cmoveq %gs:pda_irqstackptr,%rsp
push %rbp # backlink for old unwinder
/*
* We entered an interrupt context - irqs are off:
*/
TRACE_IRQS_OFF
call \func
.endm
Note that rdi still contains the pt_regs pointer, as evidenced by
the "testl $3,CS(%rdi)" instruction, which is checking the CS register
contents in the pt_regs for whether it was operating in user-space
when the interrupt occurred. But more importantly, note that just
prior to calling the handler, it does a "push %rbp" instead of a
"pushq %rdi" like it used to.
I'm pretty sure it's being done purposely, because instead of
having the "old unwinder" dump kernel text addresses starting inside
of the pt_regs exception frame, it bumps the starting point up to
whatever's contained in $rbp, which is above the exception frame
on the old stack. So it would avoid dumping text return addresses
that happen to be sitting in the pt_regs register dump.
Just to verify, I patched the current kernel to push rdi instead
of rbp. Again, here's what the unpatched alt-sysrq-c backtrace
looks like:
crash> bt
PID: 0 TASK: ffff81003fe48100 CPU: 1 COMMAND: "swapper"
#0 [ffff81003fe6bb40] crash_kexec at ffffffff800ab798
#1 [ffff81003fe6bbc8] mwait_idle at ffffffff80055375
#2 [ffff81003fe6bc00] sysrq_handle_crashdump at ffffffff80192fdc
#3 [ffff81003fe6bc10] __handle_sysrq at ffffffff80192dae
#4 [ffff81003fe6bc50] kbd_event at ffffffff8018db52
#5 [ffff81003fe6bca0] input_event at ffffffff801e9b6d
#6 [ffff81003fe6bcd0] hidinput_hid_event at ffffffff801e4299
#7 [ffff81003fe6bd00] hid_process_event at ffffffff801df639
#8 [ffff81003fe6bd40] hid_input_report at ffffffff801df9a7
#9 [ffff81003fe6bdc0] hid_irq_in at ffffffff801e0d8e
#10 [ffff81003fe6bde0] usb_hcd_giveback_urb at ffffffff801d33a2
#11 [ffff81003fe6be10] uhci_giveback_urb at ffffffff8817b724
#12 [ffff81003fe6be50] uhci_scan_schedule at ffffffff8817be07
#13 [ffff81003fe6bed0] uhci_irq at ffffffff8817dc08
#14 [ffff81003fe6bf10] usb_hcd_irq at ffffffff801d3d91
#15 [ffff81003fe6bf20] handle_IRQ_event at ffffffff800106fd
#16 [ffff81003fe6bf50] __do_IRQ at ffffffff800b520c
#17 [ffff81003fe6bf58] __do_softirq at ffffffff80011bfa
#18 [ffff81003fe6bf90] do_IRQ at ffffffff8006a729
--- <IRQ stack> ---
#19 [ffff81003fe65e70] ret_from_intr at ffffffff8005ba89
[exception RIP: cpu_idle+149]
RIP: ffffffff800473a7 RSP: ffffffff8042e220 RFLAGS: ffffffff80074153
RAX: ffffffffffffff16 RBX: 0000000000000000 RCX: ffffffff80055375
RDX: 0000000000000010 RSI: 0000000000000246 RDI: ffff81003fe65ef0
RBP: ffff81003fe64000 R8: ffffffff8034e818 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003f
R13: ffff810037d0c008 R14: 0000000000000246 R15: 0000000000000001
ORIG_RAX: 0000000000000018 CS: 0020 SS: 0000
bt: WARNING: possibly bogus exception frame
crash>
And when the kernel is patched to push rdi instead, the
"old" behavior is emulated:
crash> bt
PID: 0 TASK: ffffffff8034ce60 CPU: 0 COMMAND: "swapper"
#0 [ffffffff8047eb40] crash_kexec at ffffffff800ab798
#1 [ffffffff8047ebc8] mwait_idle at ffffffff80055375
#2 [ffffffff8047ec00] sysrq_handle_crashdump at ffffffff80192fdc
#3 [ffffffff8047ec10] __handle_sysrq at ffffffff80192dae
#4 [ffffffff8047ec50] kbd_event at ffffffff8018db52
#5 [ffffffff8047eca0] input_event at ffffffff801e9b6d
#6 [ffffffff8047ecd0] hidinput_hid_event at ffffffff801e4299
#7 [ffffffff8047ecd8] ip_route_input at ffffffff8003662f
#8 [ffffffff8047ed00] hid_process_event at ffffffff801df639
#9 [ffffffff8047ed40] hid_input_report at ffffffff801df9a7
#10 [ffffffff8047edc0] hid_irq_in at ffffffff801e0d8e
#11 [ffffffff8047ede0] usb_hcd_giveback_urb at ffffffff801d33a2
#12 [ffffffff8047ee10] uhci_giveback_urb at ffffffff88126724
#13 [ffffffff8047ee50] uhci_scan_schedule at ffffffff88126e07
#14 [ffffffff8047eed0] uhci_irq at ffffffff88128c08
#15 [ffffffff8047ef10] usb_hcd_irq at ffffffff801d3d91
#16 [ffffffff8047ef20] handle_IRQ_event at ffffffff800106fd
#17 [ffffffff8047ef50] __do_IRQ at ffffffff800b520c
#18 [ffffffff8047ef58] __do_softirq at ffffffff80011bfa
#19 [ffffffff8047ef90] do_IRQ at ffffffff8006a729
--- <IRQ stack> ---
#20 [ffffffff80437ee8] ret_from_intr at ffffffff8005ba89
[exception RIP: mwait_idle+54]
RIP: ffffffff80055375 RSP: ffffffff80437f90 RFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000099000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8034e818
RBP: 0000000000099000 R8: ffffffff80436000 R9: 000000000000003e
R10: ffff810037d0c038 R11: ffff81003f48e580 R12: ffff810037fef7a0
R13: 0000000000000000 R14: ffffffff8034d050 R15: 0000000002246128
ORIG_RAX: ffffffffffffff16 CS: 0010 SS: 0018
#21 [ffffffff80437f90] cpu_idle at ffffffff800473a7
crash>
Anyway, we'll have to come up with a differentiator so that
both types of interrupt-stack-linkages are handled. It looks
like the rbp value is fixed with relationship to the exception
frame, so something can be done.
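A sketch of one possible differentiator follows. In the dump quoted elsewhere in this thread, the saved word is ffff81003fe65e70 while "bt -e" finds the real frame at ffff81003fe65e48, a difference of 0x28. The code assumes that delta holds for every kernel using the new macro, and that a sane CS value (0x10 for kernel mode, 0x33 or 0x23 for user mode) is enough to accept a frame. All names are hypothetical, and demo_read_cs() stands in for reading CS out of a pt_regs in the dumpfile:

```c
#include <stdint.h>

#define RBP_TO_PT_REGS 0x28 /* observed in one dump; an assumption */

typedef uint64_t (*read_cs_fn)(uint64_t pt_regs_addr);

/* Accept only the CS values expected in a valid exception frame. */
static int cs_plausible(uint64_t cs)
{
    return cs == 0x10 || cs == 0x33 || cs == 0x23;
}

/* Toy reader modelled on the dump: good frame at ...5e48, and the
 * bogus CS of 0x20 when the saved rbp is misread as a pt_regs. */
uint64_t demo_read_cs(uint64_t pt_regs_addr)
{
    if (pt_regs_addr == 0xffff81003fe65e48ULL)
        return 0x10;
    if (pt_regs_addr == 0xffff81003fe65e70ULL)
        return 0x20;
    return 0;
}

/* Return the exception frame address, or 0 if neither linkage fits. */
uint64_t resolve_exception_frame(uint64_t link_word, read_cs_fn read_cs)
{
    /* Old linkage: "pushq %rdi" stored the pt_regs pointer itself. */
    if (cs_plausible(read_cs(link_word)))
        return link_word;
    /* New linkage: "push %rbp" stored a value just above the frame. */
    if (cs_plausible(read_cs(link_word - RBP_TO_PT_REGS)))
        return link_word - RBP_TO_PT_REGS;
    return 0;
}
```

Trying the old interpretation first keeps existing dumpfiles working unchanged; the adjusted address is only used when the direct one fails the sanity check.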
Just FYI,
Dave
18 years, 1 month
xencrash: crash analisys tool for Xen hypervisor
by Itsuro ODA
Hi,
We are developing a crash analysis tool -- xencrash --
like "crash", but for the Xen hypervisor.
Although Linux and the Xen hypervisor are structured differently,
the requirements for an analysis tool are essentially the same.
I think "crash" can support analysis of the Xen hypervisor
with quite small modifications, and this is the best way to
avoid reinventing the wheel.
The attached patch is a prototype of xencrash, and it serves as
the proof of concept for the idea above.
(I think it is already useful for developers of the Xen
hypervisor, though it is a prototype and only supports x86.)
Please evaluate it if you are interested in crash analysis
for the Xen hypervisor. (We must analyze both the Linux kernel and
the Xen hypervisor, don't we?)
Any comments are welcome.
To use xencrash:
- cd crash-4.0-3.8
- zcat xencrash-4.0-3.8-0.1.patch.gz | patch -p1
- make xencrash
usage: xencrash xen-syms dump-file
where dump-file is either
- vmcore (whole memory dump image) obtained by kdump, or
- xendump (the Xen hypervisor part) cut out by dom0cut
xencrash supports only x86 now. (x86_64 and IA64 are under
development)
Thanks.
--
Itsuro ODA <oda(a)valinux.co.jp>
18 years, 1 month
kdump format may be updated
by Kazuo Moriwaka
Hello,
A new release of the kexec/kdump patches for x86 Xen has recently
been posted to the xen-devel mailing list:
http://lists.xensource.com/archives/html/xen-devel/2006-10/msg00611.html
To simplify the kernel and hypervisor code, the kdump format may be
changed in future releases. The idea is to make the kernel and
hypervisor code as simple as possible and put more knowledge into the
tools.
If you have any comments or recommendations now is a good time to step
up and give feedback to the thread on xen-devel.
--
Kazuo Moriwaka <moriwaka(a)valinux.co.jp>
18 years, 1 month
Re:[RFC] Crash patch for DWARF CFI based unwind support
by Dave Anderson
Hi Rachita,
I'm looking at an alt-sysrq-c generated crash on an x86_64 kernel,
and I am seeing something similar to what you are with respect
to the transition from the interrupt stack back to the process
stack.
Here's a "bt -a", where cpu 0 shows that it received a shutdown
NMI while in the idle loop, and the transition from the NMI
exception stack back to the process stack was clean. But on
the cpu which took the alt-sysrq-c keyboard interrupt, the
transition from the per-cpu interrupt stack back to the
process stack is similar to what you're seeing:
crash> bt -a
PID: 0 TASK: ffffffff8034ce60 CPU: 0 COMMAND: "swapper"
#0 [ffffffff80481f30] crash_nmi_callback at ffffffff8007742f
#1 [ffffffff80481f40] do_nmi at ffffffff80063c2c
#2 [ffffffff80481f50] nmi at ffffffff8006312f
[exception RIP: mwait_idle+54]
RIP: ffffffff800553b7 RSP: ffffffff80437f90 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffff80055381 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8034e838
RBP: 0000000000099000 R8: ffffffff80436000 R9: 000000000000003e
R10: ffff810037d0c038 R11: 0000000000000048 R12: 0000000000090000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <exception stack> ---
#3 [ffffffff80437f90] mwait_idle at ffffffff800553b7
#4 [ffffffff80437f90] cpu_idle at ffffffff800473be
PID: 0 TASK: ffff81003fe48100 CPU: 1 COMMAND: "swapper"
#0 [ffff81003fe6bb40] crash_kexec at ffffffff800ab7c4
#1 [ffff81003fe6bbc8] mwait_idle at ffffffff800553b7
#2 [ffff81003fe6bc00] sysrq_handle_crashdump at ffffffff8019301f
#3 [ffff81003fe6bc10] __handle_sysrq at ffffffff80192e1c
#4 [ffff81003fe6bc50] kbd_event at ffffffff8018dbc1
#5 [ffff81003fe6bca0] input_event at ffffffff801e9b9f
#6 [ffff81003fe6bcd0] hidinput_hid_event at ffffffff801e42cb
#7 [ffff81003fe6bd00] hid_process_event at ffffffff801df66b
#8 [ffff81003fe6bd40] hid_input_report at ffffffff801df9d9
#9 [ffff81003fe6bdc0] hid_irq_in at ffffffff801e0dc0
#10 [ffff81003fe6bde0] usb_hcd_giveback_urb at ffffffff801d33d8
#11 [ffff81003fe6be10] uhci_giveback_urb at ffffffff88168724
#12 [ffff81003fe6be50] uhci_scan_schedule at ffffffff88168e07
#13 [ffff81003fe6bed0] uhci_irq at ffffffff8816ac08
#14 [ffff81003fe6bf10] usb_hcd_irq at ffffffff801d3dc7
#15 [ffff81003fe6bf20] handle_IRQ_event at ffffffff80010704
#16 [ffff81003fe6bf50] __do_IRQ at ffffffff800b5238
#17 [ffff81003fe6bf58] __do_softirq at ffffffff80011c0b
#18 [ffff81003fe6bf90] do_IRQ at ffffffff8006a762
--- <IRQ stack> ---
#19 [ffff81003fe65e70] ret_from_intr at ffffffff8005bac9
[exception RIP: cpu_idle+149]
RIP: ffffffff800473be RSP: ffffffff8042e220 RFLAGS: ffffffff80074188
RAX: ffffffffffffff16 RBX: 0000000000000000 RCX: ffffffff800553b7
RDX: 0000000000000010 RSI: 0000000000000246 RDI: ffff81003fe65ef0
RBP: ffff81003fe64000 R8: ffffffff8034e838 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003f
R13: ffff810037d0c008 R14: 0000000000000246 R15: 0000000000000001
ORIG_RAX: 0000000000000018 CS: 0020 SS: 0000
bt: WARNING: possibly bogus exception frame
crash>
crash dutifully reports that the exception frame looks bogus
because of the CS value.
The end of the per-cpu interrupt stack looks like this:
crash> rd -s ffff81003fe68000 2048
...
ffff81003fe6bf30: 000000000000e900 00000000000000e9
ffff81003fe6bf40: ffff810037ca4bc0 irq_desc+59708
ffff81003fe6bf50: __do_IRQ+164 __do_softirq+94
ffff81003fe6bf60: 00000000000000e9 ffff81003fe65e48
ffff81003fe6bf70: 00000000000000ff cpu_data+256
ffff81003fe6bf80: 0000000000000100 cpu_core_map+32
ffff81003fe6bf90: do_IRQ+231 ffff81003fe65e70
ffff81003fe6bfa0: mwait_idle ffff81003fe65e70
ffff81003fe6bfb0: ret_from_intr ffff81003fe65e70
ffff81003fe6bfc0: 0000000000000000 0000000000000000
ffff81003fe6bfd0: 0000000000000000 0000000000000000
ffff81003fe6bfe0: 0000000000000000 0000000000000000
ffff81003fe6bff0: 0000000000000000 0000000000000000
crash>
...hence the supposed pointer to the generating exception frame
is presumed to be ffff81003fe65e70 (which is bogus).
Interestingly, though, if I do an exception frame search
for that task, I do find the "real" exception frame:
crash> bt -e
PID: 0 TASK: ffff81003fe48100 CPU: 1 COMMAND: "swapper"
KERNEL-MODE EXCEPTION FRAME AT: ffff81003fe65e48
RIP: ffffffff800553b7 RSP: ffff81003fe65ef0 RFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8034e838
RBP: 0000000000000000 R8: ffff81003fe64000 R9: 000000000000003f
R10: ffff810037d0c008 R11: 0000000000000246 R12: ffff810037fef040
R13: 0000000000000001 R14: ffff81003fe482f0 R15: 0000000002245f5e
ORIG_RAX: ffffffffffffff16 CS: 0010 SS: 0018
crash> sym ffffffff800553b7
ffffffff800553b7 (t) mwait_idle+0x36 include/asm/thread_info.h: 63
crash>
and if I just grep for that address within the per-cpu interrupt
stack, I see several references to it:
crash> rd -s ffff81003fe68000 2048 | grep ffff81003fe65e48
ffff81003fe6bb30: ffff81003ac7c000 ffff81003fe65e48
ffff81003fe6bc60: 00ffffff8001c1fd ffff81003fe65e48
ffff81003fe6bce0: ffff81003ec68000 ffff81003fe65e48
ffff81003fe6bd50: ffff81003fe65e48 0000000180087720
ffff81003fe6bda0: ffff810037cb7400 ffff81003fe65e48
ffff81003fe6bdb0: ffff810037cb7550 ffff81003fe65e48
ffff81003fe6be60: ffff81003cf37488 ffff81003fe65e48
ffff81003fe6bec0: ffff81003fe65e48 ffff81003fe65e48
ffff81003fe6bed0: uhci_irq+0x13f ffff81003fe65e48
ffff81003fe6bf00: ffff81003fe65e48 ffff81003fe65e48
ffff81003fe6bf60: 00000000000000e9 ffff81003fe65e48
crash>
But none of them are located at the "fixed" location of one
word below the 64-byte block at the top of the interrupt
stack.
So I don't know what's going on in this case...
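For concreteness, the "fixed" location can be computed from the stack base and size. This sketch assumes a 16 KB interrupt stack (the 2048 eight-byte words dumped above) and the 64-byte gap at the top; the function name is hypothetical:

```c
#include <stdint.h>

#define IRQ_STACK_TOP_GAP 64 /* bytes skipped at the top of the stack */

/* Address of the word pushed right after the switch to the IRQ stack. */
uint64_t irq_link_slot(uint64_t stack_base, uint64_t stack_size)
{
    uint64_t top = stack_base + stack_size;

    /* rsp starts 64 bytes below the top; the first push lands 8 lower. */
    return top - IRQ_STACK_TOP_GAP - sizeof(uint64_t);
}
```

For the stack dumped above, irq_link_slot(0xffff81003fe68000, 16384) gives ffff81003fe6bfb8, the second word on the "ffff81003fe6bfb0:" line, which holds ffff81003fe65e70.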
I don't ever recall seeing such a bogus interrupt-to-process
stack transition on any netdump or diskdump generated vmcores.
And all the "test" kdump kernel dumpfiles I've used have only
been generated by using "echo c > /proc/sysrq-trigger", so the
crash path never went off the process stack.
So without blatantly pointing the finger, I wonder whether there's
something that the kexec/kdump code path does that could possibly
be tinkering with the contents of the interrupt stack?
I also want to try to force another crash but with all cpus
forcibly running something other than the idle task, in case
there's something strange about the interrupt bringing the
cpu out of that "mwait" instruction? Grasping at straws a
bit here...
Also, that's why I'm always asking for back-trace tests that
do something "real" -- instead of just having the kernel
call panic(), or that do a sys_write() to /proc/sysrq-trigger
to force an oops on the process stack. At least an alt-sysrq-c
on the console keyboard generates an interrupt, as does your
forced jprobes deal...
BTW, I also note that the reading of the module unwind tables
returns invalid data, because it's being done before the
non-unity-mapped address translation can even work! In other
words, vmalloc addresses can only be read after vm_init() is
complete. So I've added another machdep_init() argument
(POST_VM) that is called just after vm_init(), and in the
case of x86_64 (and x86), can call init_unwind_table().
Thanks,
Dave
18 years, 1 month
crash failed on vmcore created by kdump at an IA64
by Jay Lan
Hi,
I saved a vmcore from a kexec'ed crash kernel on an IA64 Altix.
When I tried to run crash (4.0-3.5) on the vmcore, it failed.
Is there a known issue?
Thanks,
- jay
% crash System.map* vmlinuz-2.6.18-kdump vmcore
crash 4.0-3.5
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
crash(7419): unaligned access to 0x60000000001bf1cc, ip=0x400000000026d090
crash(7419): unaligned access to 0x60000000001bf1d4, ip=0x400000000026d090
crash(7419): unaligned access to 0x60000000001bf1dc, ip=0x400000000026d090
crash(7419): unaligned access to 0x60000000001bf1e4, ip=0x400000000026d090
crash(7419): unaligned access to 0x60000000001bf1ec, ip=0x400000000026d090
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-unknown-linux-gnu"...
please wait... (patching 23743 gdb minimal_symbol values)
crash: invalid (optional) structure member offsets:
pglist_data_node_next or pglist_data_pgdat_next
FILE: memory.c LINE: 11504 FUNCTION: node_table_init()
[/usr/people/jlan/bin/crash] error trace: => 4000000000231740
4000000000231740: OFFSET_option+432
18 years, 1 month
Re:[RFC] Crash patch for DWARF CFI based unwind support
by Dave Anderson
> There still are a couple of things which need to be done, viz
> 1. Extend to obtaining unwind info from modules as well(currently
> doing only for the kernel)
> 2. Currently reading the unwind info from eh_frame section only(ie
> __start_unwind to __end_unwind). Need to add facility to read from
> the .debug_frame(if .debug_frame is present in cases where .eh_frame
> is absent. Will have to read from the vmlinux if we want to read the
> .debug_frame info)
Hi Rachita,
I hope to be able to come up with a new crash version
for you to continue working with by tomorrow, Monday at
the latest.
Off the top of my head, here's what I've done with your
initial patch:
1. As Ben mentioned, it needs to be made compilable for
other architectures.
2. Renamed unwind_x86_64.c into unwind_x86_32_64.c,
because the unwind code should be architecture
neutral with respect to x86 and x86_64. It's currently
#ifdef'd to only be compiled if X86_64, but when a
new "unwind_x86.h" file is ready to go, it can be
made usable by both arches.
3. Made it capable of reading .eh_frame data from the
vmlinux file if it is not in memory.
4. Made it capable of reading all of the module's unwind
tables.
5. Restored the unwind() function to reflect the kernel
version in that it now uses a new find_table() routine,
which returns a pointer to the local copy of the unwind
table that contains the incoming pc.
6. Cleaned up a bunch of cruft...
A couple other notes:
I've restored the original x86_64_low_budget_back_trace_cmd()
function, and renamed your modified version of it. I can't
risk breaking the original, and having a new version that
can be swapped in dynamically to the machdep->back_trace
pointer will allow us to tinker with the new one. Ideally
the new one could continue to use the multiple stack walkers
of the original function, but as they walk through the stacks
from a "known good" starting point, they could use the new
unwind functionality to simply calculate the frame size,
and thereby skip over the leftovers -- as opposed to laboriously
printing every kernel text return address found -- and as
opposed to the way you send it off to dwarf_backtrace().
That being said, your dwarf_backtrace() function is an
excellent proof-of-concept!
The same thing goes for the x86 port. In a perfect scenario,
the back trace code can be left pretty much untouched
*except* for one key function, that being the get_framesize()
function in lkcd_x86_trace.c. As the backtrace code is
walking through the process and IRQ stacks, it starts with
the "known good" PC, and calls get_framesize() to determine
how much of the stack to skip over to find the next text
return reference. That code has been hacked to hell, and
as the compiler has gotten more and more clever, it's been
more and more prone to miscalculating the frame size. It
disassembles the code from the start of the containing function
of the passed-in PC up to the PC, counts pushes and pops,
subtractions of the stack pointer, tries to handle embedded
returns, etc, etc. -- it's a nightmare. But it seemingly
could take the passed-in PC, and just pass it to the unwind
code to get a "confident" frame size, and go from there.
That's what I'm hoping for, anyway...
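The frame-size idea can be sketched as follows. This is a toy model, not crash code: walk_stack(), framesize_fn and the demo table are hypothetical, the stack is an array of 64-bit words, and demo_framesize() stands in for a query against the unwind info:

```c
#include <stddef.h>
#include <stdint.h>

#define FRAMESIZE_UNKNOWN ((size_t)-1)

typedef size_t (*framesize_fn)(uint64_t pc);

/* Toy table standing in for an unwind-info lookup (sizes in bytes). */
size_t demo_framesize(uint64_t pc)
{
    switch (pc) {
    case 0x1000: return 16;
    case 0x2000: return 8;
    case 0x3000: return 0;
    default:     return FRAMESIZE_UNKNOWN;
    }
}

/* Starting from a known good pc, skip each frame's locals and pick up
 * the return address that follows them. Returns the number of pcs. */
size_t walk_stack(const uint64_t *stack, size_t nwords, size_t idx,
                  uint64_t pc, framesize_fn get_framesize,
                  uint64_t *pcs, size_t max_pcs)
{
    size_t n = 0;

    while (n < max_pcs) {
        pcs[n++] = pc;
        size_t fs = get_framesize(pc);
        if (fs == FRAMESIZE_UNKNOWN || fs % sizeof(uint64_t) != 0)
            break;                       /* no confident size: stop */
        idx += fs / sizeof(uint64_t);    /* skip leftovers in the frame */
        if (idx >= nwords)
            break;
        pc = stack[idx++];               /* next return address */
    }
    return n;
}

/* Toy stack: 0x9999 is a stale text address inside the first frame. */
static const uint64_t demo_stack[] = {
    0x9999, 0x0, 0x2000, 0xdead, 0x3000, 0x4000,
};

/* Run the walk from pc 0x1000 and return the i-th pc found, or 0. */
uint64_t demo_pc_at(size_t i)
{
    uint64_t pcs[8];
    size_t n = walk_stack(demo_stack, 6, 0, 0x1000, demo_framesize, pcs, 8);

    return i < n ? pcs[i] : 0;
}
```

The stale 0x9999 entry is never reported, because the walker jumps over the whole frame instead of printing every text address it passes.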
Anyway, the use of the new unwind info is turned "off" by
default, but you can turn it on during runtime by entering
"set unwind on", or turn it off by entering "set unwind off".
Of course it will not allow you to turn it on if neither the
kernel nor the vmlinux file contains any unwind info.
BTW, I also tried using the .debug_frame section of the vmlinux
file, but it apparently can not be used by the ported
kernel unwind code. The kernel unwind code presumes the
data is from the .eh_frame section; the .debug_frame section
must use a different layout.
This is looking good...
Thanks,
Dave
18 years, 1 month
struct structname.member1,member2,member3, ...
by Olivier Daudel
Hi Dave,
1) With this new patch, I think we are not far from your goals.
But I understand that changing something in struct, union, etc. is rather
"dangerous".
Perhaps it would be preferable to implement this attempt in an independent
module?
The commands cmd_struct(), cmd_union() and cmd_pointer() have been unified,
and "*" now supports -l as well.
That was not the case before (or did I miss something?).
To handle the differences between struct, union and *, we essentially have
a single instruction.
2) Nothing to do with the patch, but maybe there is still something wrong
with struct -o (in standard 3.8 and also in my new version):
crash> struct -o ipc_id_ary
struct ipc_id_ary {
[0] int size;
[4] struct kern_ipc_perm *p[0];
}
SIZE: 4
We should have SIZE greater than 4 ?
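For comparison, this is how a C compiler treats such a struct: a trailing array with no fixed length adds nothing to sizeof, so on common ABIs the size equals the offset of the array member. A small sketch, using the standard flexible array form p[] in place of the kernel's p[0]; the _demo names are hypothetical:

```c
#include <stddef.h>

struct kern_ipc_perm; /* opaque here; only pointers to it are stored */

struct ipc_id_ary_demo {
    int size;
    struct kern_ipc_perm *p[]; /* flexible array member */
};

/* Size of the fixed part of the struct. */
size_t demo_sizeof(void)
{
    return sizeof(struct ipc_id_ary_demo);
}

/* Byte offset at which the array starts. */
size_t demo_p_offset(void)
{
    return offsetof(struct ipc_id_ary_demo, p);
}

/* Bytes to allocate for a struct carrying n array entries. */
size_t demo_alloc_size(size_t n)
{
    return sizeof(struct ipc_id_ary_demo) + n * sizeof(struct kern_ipc_perm *);
}
```

On a 32-bit target both values are 4, which matches the [4] offset and SIZE: 4 shown above; the space for the entries comes from the allocation, not from the type.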
Olivier
crash> struct inode.i_uid,i_gid,i_alloc_sem f605b31c
i_uid = 0,
i_gid = 5,
i_alloc_sem = {
count = 0,
wait_lock = {
slock = 1,
magic = 3735899821,
break_lock = 0
},
wait_list = {
next = 0xf605b3c4,
prev = 0xf605b3c4
}
},
I don't think it would be very easy to get this result with grep, and we
don't have the -R option
as in task... (if I am correct).
crash> struct inode.i_uid,i_gid,i_alloc_sem f605b31c 3
i_uid = 0,
i_gid = 5,
i_alloc_sem = {
count = 0,
wait_lock = {
slock = 1,
magic = 3735899821,
break_lock = 0
},
wait_list = {
next = 0xf605b3c4,
prev = 0xf605b3c4
}
},
i_uid = 0,
i_gid = 0,
i_alloc_sem = {
count = 0,
wait_lock = {
slock = 1,
magic = 3735899821,
break_lock = 0
},
wait_list = {
next = 0xf605b53c,
prev = 0xf605b53c
}
},
i_uid = 0,
i_gid = 0,
i_alloc_sem = {
count = 0,
wait_lock = {
slock = 1,
magic = 3735899821,
break_lock = 0
},
wait_list = {
next = 0xf605b6b4,
prev = 0xf605b6b4
}
},
crash> struct inode.i_list f605b31c 3
i_list = {
next = 0xf605b49c,
prev = 0xf7b4e1cc
},
i_list = {
next = 0xf60cadb8,
prev = 0xf605b324
},
i_list = {
next = 0xf60e6458,
prev = 0xf60e6278
},
crash> struct inode -r f605b31c 3
f605b31c: 00000000 00000000 f605b49c f7b4e1cc ................
f605b32c: f7d76bfc f6b6903c f6d8ea60 f6d8ea60 .k..<...`...`...
f605b33c: 00000002 00000001 00002190 00000001 .........!......
f605b34c: 00000000 00000005 08800000 00000000 ................
f605b35c: 00000000 453bb0ec 0606fb28 453bb0ec ......;E(.....;E
f605b36c: 0606fb28 453ba008 2330b3e8 0000000a (.....;E..0#....
f605b37c: 00000400 00000000 00000000 00000000 ................
f605b38c: 00000001 dead4ead 00000000 00000001 .....N..........
f605b39c: 00000000 00000001 dead4ead 00000000 .........N......
f605b3ac: f605b3ac f605b3ac 00000000 00000001 ................
f605b3bc: dead4ead 00000000 f605b3c4 f605b3c4 .N..............
f605b3cc: c036d760 c036a7a0 f7d72200 00000000 `.6...6.."......
f605b3dc: f605b3e0 f605b31c 00000000 00000220 ............ ...
f605b3ec: 00000000 01000000 deaf1eed 00000000 ................
f605b3fc: 00000000 00000000 00010001 f605b408 ................
f605b40c: f605b408 00000001 dead4ead 00000000 .........N......
f605b41c: 00000000 00000000 00000000 c0485540 ............@UH.
f605b42c: 000000d2 c0369b88 00000001 dead4ead ......6......N..
f605b43c: 00000000 f605b440 f605b440 00000000 ....@...@.......
f605b44c: 00000000 00000000 f7da1440 f6b69164 ........@...d...
f605b45c: 00000000 00000000 f7da1404 00000000 ................
f605b46c: 00000000 00000000 00000000 00000000 ................
f605b47c: 00000000 00000000 00000001 00000000 ................
f605b48c: f60f9000 00000000 ........
f605b494: 00000000 00000000 f60cadb8 f605b324 ............$...
f605b4a4: f674832c f6b6990c f60493ac f60493ac ,.t.............
f605b4b4: 00002544 00000001 00001180 00000001 D%..............
f605b4c4: 00000000 00000000 00000000 00000000 ................
f605b4d4: 00000000 453ba008 20ce71a8 453ba008 ......;E.q. ..;E
f605b4e4: 20ce71a8 453ba008 20ce71a8 0000000a .q. ..;E.q. ....
f605b4f4: 00001000 00000000 00000000 00000000 ................
f605b504: 00000001 dead4ead 00000000 00000001 .....N..........
f605b514: 00000000 00000001 dead4ead 00000000 .........N......
f605b524: f605b524 f605b524 00000000 00000001 $...$...........
f605b534: dead4ead 00000000 f605b53c f605b53c .N......<...<...
f605b544: c04854e0 c036ab40 f7fcf200 00000000 .TH.@.6.........
f605b554: f605b558 f605b494 00000000 00000220 X........... ...
f605b564: 00000000 01000000 deaf1eed 00000000 ................
f605b574: 00000000 00000000 00010001 f605b580 ................
f605b584: f605b580 00000001 dead4ead 00000000 .........N......
f605b594: 00000000 00000000 00000000 c0485540 ............@UH.
f605b5a4: 000000d2 c0369b88 00000001 dead4ead ......6......N..
f605b5b4: 00000000 f605b5b8 f605b5b8 00000000 ................
f605b5c4: 00000000 00000000 f605b5cc f605b5cc ................
f605b5d4: f7aa7400 00000000 00000000 00000000 .t..............
f605b5e4: 00000000 00000000 00000000 00000007 ................
f605b5f4: 00000000 00000000 00000000 00000000 ................
f605b604: 00000000 00000000 ........
f605b60c: 00000000 00000000 f60e6458 f60e6278 ........Xd..xb..
f605b61c: f605b794 f609e3ac f6067430 f6067430 ........0t..0t..
f605b62c: 000022db 00000001 00008124 00000001 ."......$.......
f605b63c: 00000000 00000000 00000000 00001000 ................
f605b64c: 00000000 453b9fe3 292fe200 453b9fe3 ......;E../)..;E
f605b65c: 292fe200 453b9fe3 292fe200 0000000c ../)..;E../)....
f605b66c: 00001000 00000000 00000000 00000000 ................
f605b67c: 00000001 dead4ead 00000000 00000001 .....N..........
f605b68c: 00000000 00000001 dead4ead 00000000 .........N......
f605b69c: f605b69c f605b69c 00000000 00000001 ................
f605b6ac: dead4ead 00000000 f605b6b4 f605b6b4 .N..............
f605b6bc: c04854e0 c036d400 f7eaf400 00000000 .TH...6.........
f605b6cc: f605b6d0 f605b60c 00000000 00000220 ............ ...
f605b6dc: 00000000 01000000 deaf1eed 00000000 ................
f605b6ec: 00000000 00000000 00010001 f605b6f8 ................
f605b6fc: f605b6f8 00000001 dead4ead 00000000 .........N......
f605b70c: 00000000 00000000 00000000 c036d380 ..............6.
f605b71c: 000000d2 c036d3b0 00000001 dead4ead ......6......N..
f605b72c: 00000000 f605b730 f605b730 00000000 ....0...0.......
f605b73c: 00000000 00000000 f605b744 f605b744 ........D...D...
f605b74c: 00000000 00000000 00000000 00000000 ................
f605b75c: 00000000 00000000 00000000 00000000 ................
f605b76c: 00000000 00000000 00000000 00000000 ................
f605b77c: 00000000 00000000 ........
If you correlate the dump with the previous display, I think the dump is
correct.
crash> ps 1
PID PPID CPU TASK ST %MEM VSZ RSS COMM
1 0 0 f7e8caa0 IN 0.0 0 0 init
crash> struct task_struct.sibling
struct task_struct {
[176] struct list_head sibling;
}
crash> struct task_struct.pid,tgid,pending -l task_struct.sibling 0xf7e8c0d0
pid = 2,
tgid = 2,
pending = {
list = {
next = 0xf7e8c494,
prev = 0xf7e8c494
},
signal = {
sig = {0, 0}
}
},
crash> struct task_struct.pid,tgid,pending -l 176 0xf7e8c0d0
pid = 2,
tgid = 2,
pending = {
list = {
next = 0xf7e8c494,
prev = 0xf7e8c494
},
signal = {
sig = {0, 0}
}
},
I don't think you can do that with the "task" command (-l is not supported):
crash> * task_struct.pid,tgid,pending -l 176 0xf7e8c0d0
pid = 2,
tgid = 2,
pending = {
list = {
next = 0xf7e8c494,
prev = 0xf7e8c494
},
signal = {
sig = {0, 0}
}
},
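For reference, the arithmetic behind "-l" can be sketched in plain C: the address given is taken to be that of the embedded list_head, and the offset of that member is subtracted to get the start of the containing structure. This is only a sketch of the idea, not crash's implementation; demo_task is a hypothetical stand-in rather than the kernel's task_struct, and struct_base/task_of_sibling are names made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical structure for illustration only; NOT the real task_struct layout. */
struct demo_task {
    int pid;
    int tgid;
    struct list_head sibling;   /* embedded list node */
};

/*
 * Given the address of an embedded member and that member's offset,
 * return the address of the containing structure.
 */
static uintptr_t struct_base(uintptr_t member_addr, size_t member_offset)
{
    return member_addr - member_offset;
}

/* Convenience wrapper: recover the demo_task that embeds a given sibling node. */
static struct demo_task *task_of_sibling(struct list_head *node)
{
    return (struct demo_task *)struct_base((uintptr_t)node,
                                           offsetof(struct demo_task, sibling));
}
```

With the numbers from the session above, struct_base(0xf7e8c0d0, 176) is plain subtraction and gives 0xf7e8c020.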
[RFC] Crash patch for DWARF CFI based unwind support
by Rachita Kothiyal
Hi Dave
The following patch adds support for DWARF CFI based stack unwinding
for crash. Since this method uses the call frame instructions for
unwinding, it generates better backtraces than the existing backtrace
mechanism. So when we have the unwind info available, this new method
will be called, else we fall back to the existing mechanism.
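To make the idea concrete, here is a minimal sketch of a single unwind step once the CFI for the current PC has been evaluated. It assumes the simplest common x86_64 rule set (CFA = rsp + offset, return address saved at CFA - 8). The toy_frame and toy_stack types, the toy_read and toy_unwind_step helpers, and the array standing in for stack memory are all hypothetical and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register snapshot; the patch uses struct unwind_frame_info instead. */
struct toy_frame {
    uint64_t rip;
    uint64_t rsp;
};

/* Toy stack memory: word i lives at address base + 8*i. */
struct toy_stack {
    uint64_t base;
    const uint64_t *words;
    size_t nwords;
};

/* Read one 64-bit word; returns 0 on success, -1 if out of range or misaligned. */
static int toy_read(const struct toy_stack *st, uint64_t addr, uint64_t *out)
{
    if (addr < st->base || (addr - st->base) % 8 != 0)
        return -1;
    size_t idx = (size_t)((addr - st->base) / 8);
    if (idx >= st->nwords)
        return -1;
    *out = st->words[idx];
    return 0;
}

/*
 * One unwind step under the assumed rules:
 *   CFA        = rsp + cfa_offset
 *   caller rip = *(CFA - 8)
 *   caller rsp = CFA
 * The frame is left untouched if the return address cannot be read.
 */
static int toy_unwind_step(const struct toy_stack *st, struct toy_frame *f,
                           uint64_t cfa_offset)
{
    uint64_t cfa = f->rsp + cfa_offset;
    uint64_t ret;

    if (cfa < 8 || toy_read(st, cfa - 8, &ret) != 0)
        return -1;
    f->rip = ret;
    f->rsp = cfa;
    return 0;
}
```

In the real unwinder the offset and the register rules come from running the CIE and FDE instructions up to the current PC, which is what processCFI() in the patch does; the sketch hard-codes them to keep the step itself visible.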
There are still a few things which need to be done, viz.:
1. Extend this to obtain unwind info from modules as well (currently
done only for the kernel).
2. Unwind info is currently read from the .eh_frame section only (i.e.
__start_unwind to __end_unwind). Need to add a facility to read from
.debug_frame in cases where .debug_frame is present and .eh_frame
is absent. The .debug_frame info will have to be read from the
vmlinux file.
3. Add FRAME_POINTER support.
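As an illustration of item 2, the region between __start_unwind and __end_unwind can be treated as a sequence of length-prefixed records, where the 32-bit word after the length is zero for a CIE and non-zero for an FDE. The sketch below only counts the records in a buffer that has already been read into memory. The names count_cfi_records, cfi_counts and read_u32 are hypothetical, extended 64-bit lengths (0xffffffff) are rejected rather than handled, and the .debug_frame case (where a CIE is marked by 0xffffffff instead of 0) is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for the example. */
struct cfi_counts {
    unsigned cies;
    unsigned fdes;
};

/* Read a host-endian 32-bit word without assuming alignment. */
static uint32_t read_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * Walk an .eh_frame-style buffer and count CIEs and FDEs.
 * Returns 0 when the end of the buffer or a zero-length terminator
 * is reached cleanly, -1 on a malformed or unsupported record.
 */
static int count_cfi_records(const uint8_t *buf, size_t size,
                             struct cfi_counts *out)
{
    size_t pos = 0;

    out->cies = out->fdes = 0;
    while (size - pos >= 4) {
        uint32_t len = read_u32(buf + pos);

        if (len == 0)
            return 0;               /* terminator */
        if (len == 0xffffffffu)
            return -1;              /* extended length: not handled here */
        if (len < 4 || len > size - pos - 4)
            return -1;              /* record would run past the buffer */
        if (read_u32(buf + pos + 4) == 0)
            out->cies++;            /* CIE id is 0 in .eh_frame */
        else
            out->fdes++;            /* non-zero word is a CIE pointer */
        pos += 4 + (size_t)len;
    }
    return pos == size ? 0 : -1;
}
```

A .debug_frame reader would need the different CIE marker and absolute rather than self-relative CIE pointers, so it is probably cleaner as a separate walker than as a flag on this one.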
Please provide your suggestions and comments.
Thanks
Rachita
Signed-off-by: Rachita Kothiyal <rachita(a)in.ibm.com>
---
Makefile | 13 -
netdump.c | 24 +
unwind_x86_64.c | 699 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
unwind_x86_64.h | 135 ++++++++++
x86_64.c | 57 ++++
5 files changed, 919 insertions(+), 9 deletions(-)
diff -puN x86_64.c~crash-dwarf-unwind x86_64.c
--- crash-4.0-3.7/x86_64.c~crash-dwarf-unwind 2006-10-16 18:14:24.907025568 +0530
+++ crash-4.0-3.7-rachita/x86_64.c 2006-10-16 18:14:41.231543864 +0530
@@ -14,6 +14,7 @@
* GNU General Public License for more details.
*/
#include "defs.h"
+#include "unwind_x86_64.h"
#ifdef X86_64
@@ -81,6 +82,7 @@ static ulong x86_64_xen_kdump_page_mfn(u
static void x86_64_debug_dump_page(FILE *, char *, char *);
static void x86_64_get_xendump_regs(struct xendump_data *, struct bt_info *, ulong *, ulong *);
static ulong x86_64_xendump_panic_task(struct xendump_data *);
+static int dwarf_backtrace(struct bt_info *, ulong);
struct machine_specific x86_64_machine_specific = { 0 };
@@ -245,6 +247,9 @@ x86_64_init(int when)
STRUCT_SIZE_INIT(user_regs_struct, "user_regs_struct");
x86_64_cpu_pda_init();
x86_64_ist_init();
+ if (symbol_exists("__start_unwind") &&
+ symbol_exists("__end_unwind"))
+ init_unwind_table();
if ((machdep->machspec->irqstack = (char *)
malloc(machdep->machspec->stkinfo.isize)) == NULL)
error(FATAL, "cannot malloc irqstack space.");
@@ -2170,6 +2175,9 @@ in_exception_stack:
}
stacktop = bt->stacktop - SIZE(pt_regs);
+
+ if (has_unwind_info && !done)
+ done = dwarf_backtrace(bt, stacktop);
for (i = (rsp - bt->stackbase)/sizeof(ulong);
!done && (rsp < stacktop); i++, rsp += sizeof(ulong)) {
@@ -2253,6 +2261,9 @@ in_exception_stack:
stacktop = bt->stacktop - 64; /* from kernel code */
+ if (has_unwind_info && !done)
+ done = dwarf_backtrace(bt, stacktop);
+
for (i = (rsp - bt->stackbase)/sizeof(ulong);
!done && (rsp < stacktop); i++, rsp += sizeof(ulong)) {
@@ -2336,7 +2347,7 @@ in_exception_stack:
/*
* For a normally blocked task, hand-create the first level.
*/
- if (!done &&
+ if (!done && !has_unwind_info &&
!(bt->flags & (BT_TEXT_SYMBOLS|BT_EXCEPTION_STACK|BT_IRQSTACK)) &&
STREQ(closest_symbol(bt->instptr), "thread_return")) {
bt->flags |= BT_SCHEDULE;
@@ -2375,6 +2386,9 @@ in_exception_stack:
/*
* Walk the process stack.
*/
+ if (has_unwind_info && !done)
+ done = dwarf_backtrace(bt, bt->stacktop);
+
for (i = (rsp - bt->stackbase)/sizeof(ulong);
!done && (rsp < bt->stacktop); i++, rsp += sizeof(ulong)) {
@@ -4370,4 +4384,45 @@ generic:
}
}
}
+
+static int dwarf_backtrace(struct bt_info *bt, ulong stacktop)
+{
+ int n = 0;
+ unsigned long bp, offset;
+ struct syment *sp;
+ char *name;
+ struct unwind_frame_info *frame = malloc(sizeof(struct unwind_frame_info));
+
+ frame->regs.rsp = bt->stkptr;
+ frame->regs.rip = bt->instptr;
+
+ /* read rbp from stack for non active tasks */
+ if (!(bt->flags & BT_DUMPFILE_SEARCH) ) {
+ readmem(frame->regs.rsp, KVADDR, &bp,
+ sizeof(unsigned long), "reading bp", FAULT_ON_ERROR);
+ frame->regs.rbp = bp;
+ }
+
+ sp = value_search(UNW_PC(frame), &offset);
+ /*
+ * If offset is zero, it means we have crossed over to the next
+ * function. Recalculate by adjusting the text address
+ */
+ if (!offset)
+ sp = value_search(UNW_PC(frame) - 1, &offset);
+
+ name = sp->name;
+ fprintf(fp, "#0 [%016lx] %s at %016lx \n", UNW_SP(frame), name, UNW_PC(frame));
+
+ while ((UNW_SP(frame) < stacktop)
+ && !unwind(frame) && UNW_PC(frame)) {
+ n++;
+ sp = value_search(UNW_PC(frame), &offset);
+ name = sp->name;
+ fprintf(fp, "#%d [%016lx] %s at %016lx \n", n, UNW_SP(frame), name, UNW_PC(frame));
+ }
+ free(frame);
+ return TRUE;
+}
+
#endif /* X86_64 */
diff -puN /dev/null unwind_x86_64.c
--- /dev/null 2006-07-24 19:06:05.520445648 +0530
+++ crash-4.0-3.7-rachita/unwind_x86_64.c 2006-10-16 18:21:12.131118096 +0530
@@ -0,0 +1,699 @@
+/*
+ * Support for generating DWARF CFI based backtraces.
+ * Borrowed heavily from the kernel's implementation of unwinding using the
+ * DWARF CFI written by Jan Beulich
+ */
+
+#include "unwind_x86_64.h"
+#include "defs.h"
+#define MAX_STACK_DEPTH 8
+
+void *unwind_table;
+int unwind_table_size = 0;
+int has_unwind_info = 0;
+
+static const struct {
+ unsigned offs:BITS_PER_LONG / 2;
+ unsigned width:BITS_PER_LONG / 2;
+} reg_info[] = {
+ UNW_REGISTER_INFO
+};
+
+#undef PTREGS_INFO
+#undef EXTRA_INFO
+
+#ifndef REG_INVALID
+#define REG_INVALID(r) (reg_info[r].width == 0)
+#endif
+
+#define DW_CFA_nop 0x00
+#define DW_CFA_set_loc 0x01
+#define DW_CFA_advance_loc1 0x02
+#define DW_CFA_advance_loc2 0x03
+#define DW_CFA_advance_loc4 0x04
+#define DW_CFA_offset_extended 0x05
+#define DW_CFA_restore_extended 0x06
+#define DW_CFA_undefined 0x07
+#define DW_CFA_same_value 0x08
+#define DW_CFA_register 0x09
+#define DW_CFA_remember_state 0x0a
+#define DW_CFA_restore_state 0x0b
+#define DW_CFA_def_cfa 0x0c
+#define DW_CFA_def_cfa_register 0x0d
+#define DW_CFA_def_cfa_offset 0x0e
+#define DW_CFA_def_cfa_expression 0x0f
+#define DW_CFA_expression 0x10
+#define DW_CFA_offset_extended_sf 0x11
+#define DW_CFA_def_cfa_sf 0x12
+#define DW_CFA_def_cfa_offset_sf 0x13
+#define DW_CFA_val_offset 0x14
+#define DW_CFA_val_offset_sf 0x15
+#define DW_CFA_val_expression 0x16
+#define DW_CFA_lo_user 0x1c
+#define DW_CFA_GNU_window_save 0x2d
+#define DW_CFA_GNU_args_size 0x2e
+#define DW_CFA_GNU_negative_offset_extended 0x2f
+#define DW_CFA_hi_user 0x3f
+
+#define DW_EH_PE_FORM 0x07
+#define DW_EH_PE_native 0x00
+#define DW_EH_PE_leb128 0x01
+#define DW_EH_PE_data2 0x02
+#define DW_EH_PE_data4 0x03
+#define DW_EH_PE_data8 0x04
+#define DW_EH_PE_signed 0x08
+#define DW_EH_PE_ADJUST 0x70
+#define DW_EH_PE_abs 0x00
+#define DW_EH_PE_pcrel 0x10
+#define DW_EH_PE_textrel 0x20
+#define DW_EH_PE_datarel 0x30
+#define DW_EH_PE_funcrel 0x40
+#define DW_EH_PE_aligned 0x50
+#define DW_EH_PE_indirect 0x80
+#define DW_EH_PE_omit 0xff
+
+#define min(x,y) ({ \
+ typeof(x) _x = (x); \
+ typeof(y) _y = (y); \
+ (void) (&_x == &_y); \
+ _x < _y ? _x : _y; })
+
+#define max(x,y) ({ \
+ typeof(x) _x = (x); \
+ typeof(y) _y = (y); \
+ (void) (&_x == &_y); \
+ _x > _y ? _x : _y; })
+#define STACK_LIMIT(ptr) (((ptr) - 1) & ~(THREAD_SIZE - 1))
+
+typedef unsigned long uleb128_t;
+typedef signed long sleb128_t;
+
+struct unwind_item {
+ enum item_location {
+ Nowhere,
+ Memory,
+ Register,
+ Value
+ } where;
+ uleb128_t value;
+};
+
+struct unwind_state {
+ uleb128_t loc, org;
+ const u8 *cieStart, *cieEnd;
+ uleb128_t codeAlign;
+ sleb128_t dataAlign;
+ struct cfa {
+ uleb128_t reg, offs;
+ } cfa;
+ struct unwind_item regs[ARRAY_SIZE(reg_info)];
+ unsigned stackDepth:8;
+ unsigned version:8;
+ const u8 *label;
+ const u8 *stack[MAX_STACK_DEPTH];
+};
+
+static const struct cfa badCFA = { ARRAY_SIZE(reg_info), 1 };
+
+static uleb128_t get_uleb128(const u8 **pcur, const u8 *end)
+{
+ const u8 *cur = *pcur;
+ uleb128_t value;
+ unsigned shift;
+
+ for (shift = 0, value = 0; cur < end; shift += 7) {
+ if (shift + 7 > 8 * sizeof(value)
+ && (*cur & 0x7fU) >= (1U << (8 * sizeof(value) - shift))) {
+ cur = end + 1;
+ break;
+ }
+ value |= (uleb128_t)(*cur & 0x7f) << shift;
+ if (!(*cur++ & 0x80))
+ break;
+ }
+ *pcur = cur;
+
+ return value;
+}
+
+static sleb128_t get_sleb128(const u8 **pcur, const u8 *end)
+{
+ const u8 *cur = *pcur;
+ sleb128_t value;
+ unsigned shift;
+
+ for (shift = 0, value = 0; cur < end; shift += 7) {
+ if (shift + 7 > 8 * sizeof(value)
+ && (*cur & 0x7fU) >= (1U << (8 * sizeof(value) - shift))) {
+ cur = end + 1;
+ break;
+ }
+ value |= (sleb128_t)(*cur & 0x7f) << shift;
+ if (!(*cur & 0x80)) {
+ value |= -(*cur++ & 0x40) << shift;
+ break;
+ }
+ }
+ *pcur = cur;
+
+ return value;
+}
+
+static unsigned long read_pointer(const u8 **pLoc,
+ const void *end,
+ signed ptrType)
+{
+ unsigned long value = 0;
+ union {
+ const u8 *p8;
+ const u16 *p16u;
+ const s16 *p16s;
+ const u32 *p32u;
+ const s32 *p32s;
+ const unsigned long *pul;
+ } ptr;
+
+ if (ptrType < 0 || ptrType == DW_EH_PE_omit)
+ return 0;
+ ptr.p8 = *pLoc;
+ switch(ptrType & DW_EH_PE_FORM) {
+ case DW_EH_PE_data2:
+ if (end < (const void *)(ptr.p16u + 1))
+ return 0;
+ if(ptrType & DW_EH_PE_signed)
+ value = get_unaligned(ptr.p16s++);
+ else
+ value = get_unaligned(ptr.p16u++);
+ break;
+ case DW_EH_PE_data4:
+#ifdef CONFIG_64BIT
+ if (end < (const void *)(ptr.p32u + 1))
+ return 0;
+ if(ptrType & DW_EH_PE_signed)
+ value = get_unaligned(ptr.p32s++);
+ else
+ value = get_unaligned(ptr.p32u++);
+ break;
+ case DW_EH_PE_data8:
+ BUILD_BUG_ON(sizeof(u64) != sizeof(value));
+#else
+ BUILD_BUG_ON(sizeof(u32) != sizeof(value));
+#endif
+ case DW_EH_PE_native:
+ if (end < (const void *)(ptr.pul + 1))
+ return 0;
+ value = get_unaligned(ptr.pul++);
+ break;
+ case DW_EH_PE_leb128:
+ BUILD_BUG_ON(sizeof(uleb128_t) > sizeof(value));
+ value = ptrType & DW_EH_PE_signed
+ ? get_sleb128(&ptr.p8, end)
+ : get_uleb128(&ptr.p8, end);
+ if ((const void *)ptr.p8 > end)
+ return 0;
+ break;
+ default:
+ return 0;
+ }
+ switch(ptrType & DW_EH_PE_ADJUST) {
+ case DW_EH_PE_abs:
+ break;
+ case DW_EH_PE_pcrel:
+ value += (unsigned long)*pLoc;
+ break;
+ default:
+ return 0;
+ }
+
+/* TBD
+ if ((ptrType & DW_EH_PE_indirect)
+ && __get_user(value, (unsigned long *)value))
+ return 0;
+*/
+ *pLoc = ptr.p8;
+
+ return value;
+}
+
+static signed fde_pointer_type(const u32 *cie)
+{
+ const u8 *ptr = (const u8 *)(cie + 2);
+ unsigned version = *ptr;
+
+ if (version != 1)
+ return -1; /* unsupported */
+ if (*++ptr) {
+ const char *aug;
+ const u8 *end = (const u8 *)(cie + 1) + *cie;
+ uleb128_t len;
+
+ /* check if augmentation size is first (and thus present) */
+ if (*ptr != 'z')
+ return -1;
+ /* check if augmentation string is nul-terminated */
+ if ((ptr = memchr(aug = (const void *)ptr, 0, end - ptr)) == NULL)
+ return -1;
+ ++ptr; /* skip terminator */
+ get_uleb128(&ptr, end); /* skip code alignment */
+ get_sleb128(&ptr, end); /* skip data alignment */
+ /* skip return address column */
+ version <= 1 ? (void)++ptr : (void)get_uleb128(&ptr, end);
+ len = get_uleb128(&ptr, end); /* augmentation length */
+ if (ptr + len < ptr || ptr + len > end)
+ return -1;
+ end = ptr + len;
+ while (*++aug) {
+ if (ptr >= end)
+ return -1;
+ switch(*aug) {
+ case 'L':
+ ++ptr;
+ break;
+ case 'P': {
+ signed ptrType = *ptr++;
+
+ if (!read_pointer(&ptr, end, ptrType) || ptr > end)
+ return -1;
+ }
+ break;
+ case 'R':
+ return *ptr;
+ default:
+ return -1;
+ }
+ }
+ }
+ return DW_EH_PE_native|DW_EH_PE_abs;
+}
+
+static int advance_loc(unsigned long delta, struct unwind_state *state)
+{
+ state->loc += delta * state->codeAlign;
+
+ return delta > 0;
+}
+
+static void set_rule(uleb128_t reg,
+ enum item_location where,
+ uleb128_t value,
+ struct unwind_state *state)
+{
+ if (reg < ARRAY_SIZE(state->regs)) {
+ state->regs[reg].where = where;
+ state->regs[reg].value = value;
+ }
+}
+
+static int processCFI(const u8 *start,
+ const u8 *end,
+ unsigned long targetLoc,
+ signed ptrType,
+ struct unwind_state *state)
+{
+ union {
+ const u8 *p8;
+ const u16 *p16;
+ const u32 *p32;
+ } ptr;
+ int result = 1;
+
+ if (start != state->cieStart) {
+ state->loc = state->org;
+ result = processCFI(state->cieStart, state->cieEnd, 0, ptrType, state);
+ if (targetLoc == 0 && state->label == NULL)
+ return result;
+ }
+ for (ptr.p8 = start; result && ptr.p8 < end; ) {
+ switch(*ptr.p8 >> 6) {
+ uleb128_t value;
+
+ case 0:
+ switch(*ptr.p8++) {
+ case DW_CFA_nop:
+ break;
+ case DW_CFA_set_loc:
+ if ((state->loc = read_pointer(&ptr.p8, end,
+ ptrType)) == 0)
+ result = 0;
+ break;
+ case DW_CFA_advance_loc1:
+ result = ptr.p8 < end && advance_loc(*ptr.p8++, state);
+ break;
+ case DW_CFA_advance_loc2:
+ result = ptr.p8 <= end + 2
+ && advance_loc(*ptr.p16++, state);
+ break;
+ case DW_CFA_advance_loc4:
+ result = ptr.p8 <= end + 4
+ && advance_loc(*ptr.p32++, state);
+ break;
+ case DW_CFA_offset_extended:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Memory,
+ get_uleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_val_offset:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Value,
+ get_uleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_offset_extended_sf:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Memory,
+ get_sleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_val_offset_sf:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Value,
+ get_sleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_restore_extended:
+ case DW_CFA_undefined:
+ case DW_CFA_same_value:
+ set_rule(get_uleb128(&ptr.p8, end), Nowhere, 0, state);
+ break;
+ case DW_CFA_register:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Register,
+ get_uleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_remember_state:
+ if (ptr.p8 == state->label) {
+ state->label = NULL;
+ return 1;
+ }
+ if (state->stackDepth >= MAX_STACK_DEPTH)
+ return 0;
+ state->stack[state->stackDepth++] = ptr.p8;
+ break;
+ case DW_CFA_restore_state:
+ if (state->stackDepth) {
+ const uleb128_t loc = state->loc;
+ const u8 *label = state->label;
+
+ state->label = state->stack[state->stackDepth - 1];
+ memcpy(&state->cfa, &badCFA, sizeof(state->cfa));
+ memset(state->regs, 0, sizeof(state->regs));
+ state->stackDepth = 0;
+ result = processCFI(start, end, 0, ptrType, state);
+ state->loc = loc;
+ state->label = label;
+ } else
+ return 0;
+ break;
+ case DW_CFA_def_cfa:
+ state->cfa.reg = get_uleb128(&ptr.p8, end);
+ /*nobreak*/
+ case DW_CFA_def_cfa_offset:
+ state->cfa.offs = get_uleb128(&ptr.p8, end);
+ break;
+ case DW_CFA_def_cfa_sf:
+ state->cfa.reg = get_uleb128(&ptr.p8, end);
+ /*nobreak*/
+ case DW_CFA_def_cfa_offset_sf:
+ state->cfa.offs = get_sleb128(&ptr.p8, end)
+ * state->dataAlign;
+ break;
+ case DW_CFA_def_cfa_register:
+ state->cfa.reg = get_uleb128(&ptr.p8, end);
+ break;
+ /*todo case DW_CFA_def_cfa_expression: */
+ /*todo case DW_CFA_expression: */
+ /*todo case DW_CFA_val_expression: */
+ case DW_CFA_GNU_args_size:
+ get_uleb128(&ptr.p8, end);
+ break;
+ case DW_CFA_GNU_negative_offset_extended:
+ value = get_uleb128(&ptr.p8, end);
+ set_rule(value, Memory, (uleb128_t)0 -
+ get_uleb128(&ptr.p8, end), state);
+ break;
+ case DW_CFA_GNU_window_save:
+ default:
+ result = 0;
+ break;
+ }
+ break;
+ case 1:
+ result = advance_loc(*ptr.p8++ & 0x3f, state);
+ break;
+ case 2:
+ value = *ptr.p8++ & 0x3f;
+ set_rule(value, Memory, get_uleb128(&ptr.p8, end),
+ state);
+ break;
+ case 3:
+ set_rule(*ptr.p8++ & 0x3f, Nowhere, 0, state);
+ break;
+ }
+ if (ptr.p8 > end)
+ result = 0;
+ if (result && targetLoc != 0 && targetLoc < state->loc)
+ return 1;
+ }
+
+ return result
+ && ptr.p8 == end
+ && (targetLoc == 0
+ || (/*todo While in theory this should apply, gcc in practice omits
+ everything past the function prolog, and hence the location
+ never reaches the end of the function.
+ targetLoc < state->loc &&*/ state->label == NULL));
+}
+
+
+/* Unwind to the previous frame. Returns 0 if successful, negative
+ * number in case of an error. */
+int unwind(struct unwind_frame_info *frame)
+{
+#define FRAME_REG(r, t) (((t *)frame)[reg_info[r].offs])
+ const u32 *fde = NULL, *cie = NULL;
+ const u8 *ptr = NULL, *end = NULL;
+ unsigned long startLoc = 0, endLoc = 0, cfa;
+ unsigned i;
+ signed ptrType = -1;
+ uleb128_t retAddrReg = 0;
+ struct unwind_table *table;
+ struct unwind_state state;
+ u64 reg_ptr = 0;
+
+ if (UNW_PC(frame) == 0)
+ return -EINVAL;
+
+ unsigned long tableSize = unwind_table_size;
+
+ for (fde = unwind_table;
+ tableSize > sizeof(*fde) && tableSize - sizeof(*fde) >= *fde;
+ tableSize -= sizeof(*fde) + *fde,
+ fde += 1 + *fde / sizeof(*fde)) {
+ if (!*fde || (*fde & (sizeof(*fde) - 1)))
+ break;
+ if (!fde[1])
+ continue; /* this is a CIE */
+ if ((fde[1] & (sizeof(*fde) - 1))
+ || fde[1] > (unsigned long)(fde + 1)
+ - (unsigned long)unwind_table)
+ continue; /* this is not a valid FDE */
+ cie = fde + 1 - fde[1] / sizeof(*fde);
+ if (*cie <= sizeof(*cie) + 4
+ || *cie >= fde[1] - sizeof(*fde)
+ || (*cie & (sizeof(*cie) - 1))
+ || cie[1]
+ || (ptrType = fde_pointer_type(cie)) < 0) {
+ cie = NULL; /* this is not a (valid) CIE */
+ continue;
+ }
+ ptr = (const u8 *)(fde + 2);
+ startLoc = read_pointer(&ptr,
+ (const u8 *)(fde + 1) + *fde,
+ ptrType);
+ endLoc = startLoc
+ + read_pointer(&ptr,
+ (const u8 *)(fde + 1) + *fde,
+ ptrType & DW_EH_PE_indirect
+ ? ptrType
+ : ptrType & (DW_EH_PE_FORM|DW_EH_PE_signed));
+ if (UNW_PC(frame) >= startLoc && UNW_PC(frame) < endLoc)
+ break;
+ cie = NULL;
+ }
+
+ if (cie != NULL) {
+ memset(&state, 0, sizeof(state));
+ state.cieEnd = ptr; /* keep here temporarily */
+ ptr = (const u8 *)(cie + 2);
+ end = (const u8 *)(cie + 1) + *cie;
+ if ((state.version = *ptr) != 1)
+ cie = NULL; /* unsupported version */
+ else if (*++ptr) {
+ /* check if augmentation size is first (and thus present) */
+ if (*ptr == 'z') {
+ /* check for ignorable (or already handled)
+ * nul-terminated augmentation string */
+ while (++ptr < end && *ptr)
+ if (strchr("LPR", *ptr) == NULL)
+ break;
+ }
+ if (ptr >= end || *ptr)
+ cie = NULL;
+ }
+ ++ptr;
+ }
+ if (cie != NULL) {
+ /* get code alignment factor */
+ state.codeAlign = get_uleb128(&ptr, end);
+ /* get data alignment factor */
+ state.dataAlign = get_sleb128(&ptr, end);
+ if (state.codeAlign == 0 || state.dataAlign == 0 || ptr >= end)
+ cie = NULL;
+ else {
+ retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
+ /* skip augmentation */
+ if (((const char *)(cie + 2))[1] == 'z')
+ ptr += get_uleb128(&ptr, end);
+ if (ptr > end
+ || retAddrReg >= ARRAY_SIZE(reg_info)
+ || REG_INVALID(retAddrReg)
+ || reg_info[retAddrReg].width != sizeof(unsigned long))
+ cie = NULL;
+ }
+ }
+ if (cie != NULL) {
+ state.cieStart = ptr;
+ ptr = state.cieEnd;
+ state.cieEnd = end;
+ end = (const u8 *)(fde + 1) + *fde;
+ /* skip augmentation */
+ if (((const char *)(cie + 2))[1] == 'z') {
+ uleb128_t augSize = get_uleb128(&ptr, end);
+
+ if ((ptr += augSize) > end)
+ fde = NULL;
+ }
+ }
+ if (cie == NULL || fde == NULL)
+ return -ENXIO;
+
+ state.org = startLoc;
+ memcpy(&state.cfa, &badCFA, sizeof(state.cfa));
+ /* process instructions */
+ if (!processCFI(ptr, end, UNW_PC(frame), ptrType, &state)
+ || state.loc > endLoc
+ || state.regs[retAddrReg].where == Nowhere
+ || state.cfa.reg >= ARRAY_SIZE(reg_info)
+ || reg_info[state.cfa.reg].width != sizeof(unsigned long)
+ || state.cfa.offs % sizeof(unsigned long)) {
+ return -EIO;
+ }
+ /* update frame */
+ cfa = FRAME_REG(state.cfa.reg, unsigned long) + state.cfa.offs;
+ startLoc = min((unsigned long)UNW_SP(frame), cfa);
+ endLoc = max((unsigned long)UNW_SP(frame), cfa);
+ if (STACK_LIMIT(startLoc) != STACK_LIMIT(endLoc)) {
+ startLoc = min(STACK_LIMIT(cfa), cfa);
+ endLoc = max(STACK_LIMIT(cfa), cfa);
+ }
+#ifndef CONFIG_64BIT
+# define CASES CASE(8); CASE(16); CASE(32)
+#else
+# define CASES CASE(8); CASE(16); CASE(32); CASE(64)
+#endif
+ for (i = 0; i < ARRAY_SIZE(state.regs); ++i) {
+ if (REG_INVALID(i)) {
+ if (state.regs[i].where == Nowhere)
+ continue;
+ return -EIO;
+ }
+ switch(state.regs[i].where) {
+ default:
+ break;
+ case Register:
+ if (state.regs[i].value >= ARRAY_SIZE(reg_info)
+ || REG_INVALID(state.regs[i].value)
+ || reg_info[i].width > reg_info[state.regs[i].value].width){
+ return -EIO;
+ }
+ switch(reg_info[state.regs[i].value].width) {
+#define CASE(n) \
+ case sizeof(u##n): \
+ state.regs[i].value = FRAME_REG(state.regs[i].value, \
+ const u##n); \
+ break
+ CASES;
+#undef CASE
+ default:
+ return -EIO;
+ }
+ break;
+ }
+ }
+ for (i = 0; i < ARRAY_SIZE(state.regs); ++i) {
+ if (REG_INVALID(i))
+ continue;
+ switch(state.regs[i].where) {
+ case Nowhere:
+ if (reg_info[i].width != sizeof(UNW_SP(frame))
+ || &FRAME_REG(i, __typeof__(UNW_SP(frame)))
+ != &UNW_SP(frame))
+ continue;
+ UNW_SP(frame) = cfa;
+ break;
+ case Register:
+ switch(reg_info[i].width) {
+#define CASE(n) case sizeof(u##n): \
+ FRAME_REG(i, u##n) = state.regs[i].value; \
+ break
+ CASES;
+#undef CASE
+ default:
+ return -EIO;
+ }
+ break;
+ case Value:
+ if (reg_info[i].width != sizeof(unsigned long)){
+ return -EIO;}
+ FRAME_REG(i, unsigned long) = cfa + state.regs[i].value
+ * state.dataAlign;
+ break;
+ case Memory: {
+ unsigned long addr = cfa + state.regs[i].value
+ * state.dataAlign;
+ if ((state.regs[i].value * state.dataAlign)
+ % sizeof(unsigned long)
+ || addr < startLoc
+ || addr + sizeof(unsigned long) < addr
+ || addr + sizeof(unsigned long) > endLoc){
+ return -EIO;}
+ switch(reg_info[i].width) {
+#define CASE(n) case sizeof(u##n): \
+ readmem(addr, KVADDR, &reg_ptr,sizeof(u##n), "register", RETURN_ON_ERROR|QUIET); \
+ FRAME_REG(i, u##n) = (u##n)reg_ptr;\
+ break
+ CASES;
+#undef CASE
+ default:
+ return -EIO;
+ }
+ }
+ break;
+ }
+ }
+ return 0;
+#undef CASES
+#undef FRAME_REG
+}
+EXPORT_SYMBOL(unwind);
+
+void init_unwind_table()
+{
+ unwind_table_size = symbol_value("__end_unwind") - symbol_value("__start_unwind");
+ unwind_table = malloc(unwind_table_size);
+ if(readmem (symbol_value("__start_unwind"), KVADDR, unwind_table,
+ unwind_table_size, "unwind table", RETURN_ON_ERROR))
+ has_unwind_info = 1;
+}
+
+void free_unwind_table()
+{
+ free(unwind_table);
+}
diff -puN /dev/null unwind_x86_64.h
--- /dev/null 2006-07-24 19:06:05.520445648 +0530
+++ crash-4.0-3.7-rachita/unwind_x86_64.h 2006-10-16 18:18:45.955340192 +0530
@@ -0,0 +1,135 @@
+#define BITS_PER_LONG 64
+#define CONFIG_64BIT 1
+#define NULL ((void *)0)
+
+typedef unsigned long size_t;
+typedef unsigned char u8;
+typedef signed short s16;
+typedef unsigned short u16;
+typedef signed int s32;
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+struct pt_regs {
+ unsigned long r15;
+ unsigned long r14;
+ unsigned long r13;
+ unsigned long r12;
+ unsigned long rbp;
+ unsigned long rbx;
+/* arguments: non interrupts/non tracing syscalls only save upto here*/
+ unsigned long r11;
+ unsigned long r10;
+ unsigned long r9;
+ unsigned long r8;
+ unsigned long rax;
+ unsigned long rcx;
+ unsigned long rdx;
+ unsigned long rsi;
+ unsigned long rdi;
+ unsigned long orig_rax;
+/* end of arguments */
+/* cpu exception frame or undefined */
+ unsigned long rip;
+ unsigned long cs;
+ unsigned long eflags;
+ unsigned long rsp;
+ unsigned long ss;
+/* top of stack page */
+};
+
+struct unwind_frame_info
+{
+ struct pt_regs regs;
+};
+
+extern int unwind(struct unwind_frame_info *);
+extern void init_unwind_table(void);
+extern void free_unwind_table(void);
+extern void *unwind_table;
+extern int unwind_table_size;
+extern int has_unwind_info;
+
+#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
+#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
+#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
+#define get_unaligned(ptr) (*(ptr))
+#define __get_user(x,ptr) __get_user_nocheck((x),(ptr),sizeof(*(ptr)))
+#define THREAD_ORDER 1
+#define THREAD_SIZE (PAGE_SIZE << THREAD_ORDER)
+
+#define UNW_PC(frame) (frame)->regs.rip
+#define UNW_SP(frame) (frame)->regs.rsp
+#ifdef CONFIG_FRAME_POINTER
+ #define UNW_FP(frame) (frame)->regs.rbp
+ #define FRAME_RETADDR_OFFSET 8
+ #define FRAME_LINK_OFFSET 0
+ #define STACK_BOTTOM(tsk) (((tsk)->thread.rsp0 - 1) & ~(THREAD_SIZE - 1))
+ #define STACK_TOP(tsk) ((tsk)->thread.rsp0)
+#endif
+
+
+#define EXTRA_INFO(f) { BUILD_BUG_ON_ZERO(offsetof(struct unwind_frame_info, f) % FIELD_SIZEOF(struct unwind_frame_info, f)) + offsetof(struct unwind_frame_info, f)/ FIELD_SIZEOF(struct unwind_frame_info, f), FIELD_SIZEOF(struct unwind_frame_info, f) }
+
+#define PTREGS_INFO(f) EXTRA_INFO(regs.f)
+
+#define UNW_REGISTER_INFO \
+ PTREGS_INFO(rax),\
+ PTREGS_INFO(rdx),\
+ PTREGS_INFO(rcx),\
+ PTREGS_INFO(rbx), \
+ PTREGS_INFO(rsi), \
+ PTREGS_INFO(rdi), \
+ PTREGS_INFO(rbp), \
+ PTREGS_INFO(rsp), \
+ PTREGS_INFO(r8), \
+ PTREGS_INFO(r9), \
+ PTREGS_INFO(r10),\
+ PTREGS_INFO(r11), \
+ PTREGS_INFO(r12), \
+ PTREGS_INFO(r13), \
+ PTREGS_INFO(r14), \
+ PTREGS_INFO(r15), \
+ PTREGS_INFO(rip)
+
+#define __get_user_nocheck(x,ptr,size) \
+({ \
+ int __gu_err; \
+ unsigned long __gu_val; \
+ __get_user_size(__gu_val,(ptr),(size),__gu_err); \
+ (x) = (__typeof__(*(ptr)))__gu_val; \
+ __gu_err; \
+})
+
+#define __get_user_size(x,ptr,size,retval) \
+do { \
+ retval = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: __get_user_asm(x,ptr,retval,"b","b","=q",-EFAULT); break;\
+ case 2: __get_user_asm(x,ptr,retval,"w","w","=r",-EFAULT); break;\
+ case 4: __get_user_asm(x,ptr,retval,"l","k","=r",-EFAULT); break;\
+ case 8: __get_user_asm(x,ptr,retval,"q","","=r",-EFAULT); break;\
+ default: (x) = __get_user_bad(); \
+ } \
+} while (0)
+
+#define __get_user_asm(x, addr, err, itype, rtype, ltype, errno) \
+ __asm__ __volatile__( \
+ "1: mov"itype" %2,%"rtype"1\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov %3,%0\n" \
+ " xor"itype" %"rtype"1,%"rtype"1\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .align 8\n" \
+ " .quad 1b,3b\n" \
+ ".previous" \
+ : "=r"(err), ltype (x) \
+ : "m"(__m(addr)), "i"(errno), "0"(err))
+
+# define __chk_user_ptr(x) (void)0
diff -puN netdump.c~crash-dwarf-unwind netdump.c
--- crash-4.0-3.7/netdump.c~crash-dwarf-unwind 2006-10-16 18:19:14.388017768 +0530
+++ crash-4.0-3.7-rachita/netdump.c 2006-10-16 18:19:21.430947080 +0530
@@ -28,6 +28,7 @@ static void dump_Elf64_Phdr(Elf64_Phdr *
static size_t dump_Elf64_Nhdr(Elf64_Off offset, int);
static void get_netdump_regs_ppc64(struct bt_info *, ulong *, ulong *);
static physaddr_t xen_kdump_p2m(physaddr_t);
+extern int has_unwind_info;
#define ELFSTORE 1
#define ELFREAD 0
@@ -771,8 +772,11 @@ netdump_memory_dump(FILE *fp)
dump_Elf64_Phdr(nd->load64 + i, ELFREAD);
offset64 = nd->notes64->p_offset;
for (tot = 0; tot < nd->notes64->p_filesz; tot += len) {
+ if (has_unwind_info)
+ len = dump_Elf64_Nhdr(offset64, ELFSTORE);
+ else
len = dump_Elf64_Nhdr(offset64, ELFREAD);
- offset64 += len;
+ offset64 += len;
}
break;
}
@@ -1562,8 +1566,10 @@ get_netdump_regs_x86_64(struct bt_info *
if (is_task_active(bt->task))
bt->flags |= BT_DUMPFILE_SEARCH;
- if ((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
- VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) {
+ if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
+ VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
+ (KDUMP_DUMPFILE() && has_unwind_info && (bt->flags &
+ BT_DUMPFILE_SEARCH))) {
if (nd->num_prstatus_notes > 1)
note = (Elf64_Nhdr *)
nd->nt_prstatus_percpu[bt->tc->processor];
@@ -1574,9 +1580,21 @@ get_netdump_regs_x86_64(struct bt_info *
len = roundup(len + note->n_namesz, 4);
len = roundup(len + note->n_descsz, 4);
+ if KDUMP_DUMPFILE() {
+ ASSIGN_SIZE(user_regs_struct) = 27 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rsp) = 19 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rip) = 16 * sizeof(unsigned long);
+ }
user_regs = ((char *)note + len)
- SIZE(user_regs_struct) - sizeof(long);
+ if KDUMP_DUMPFILE() {
+ *rspp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rsp));
+ *ripp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rip));
+ if (*ripp && *rspp)
+ return;
+ }
+
if (CRASHDEBUG(1)) {
rsp = ULONG(user_regs + OFFSET(user_regs_struct_rsp));
rip = ULONG(user_regs + OFFSET(user_regs_struct_rip));
diff -puN Makefile~crash-dwarf-unwind Makefile
--- crash-4.0-3.7/Makefile~crash-dwarf-unwind 2006-10-16 18:20:04.651376576 +0530
+++ crash-4.0-3.7-rachita/Makefile 2006-10-16 18:20:10.967416392 +0530
@@ -25,7 +25,7 @@ PROGRAM=crash
# Supported targets: X86 ALPHA PPC IA64 PPC64
# TARGET will be configured automatically by configure
#
-TARGET=
+TARGET=X86_64
ARCH := $(shell uname -m | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ -e s/arm.*/arm/ -e s/sa110/arm/)
ifeq ($(ARCH), ppc64)
@@ -37,7 +37,7 @@ endif
#
GDB=gdb-6.1
GDB_FILES=${GDB_6.1_FILES}
-GDB_OFILES=
+GDB_OFILES=${GDB_6.1_OFILES}
GDB_PATCH_FILES=gdb-6.1.patch
@@ -69,7 +69,7 @@ LKCD_DUMP_HFILES=lkcd_vmdump_v1.h lkcd_v
lkcd_dump_v7.h lkcd_dump_v8.h lkcd_fix_mem.h
LKCD_TRACE_HFILES=lkcd_x86_trace.h
IBM_HFILES=ibm_common.h
-UNWIND_HFILES=unwind.h unwind_i.h rse.h
+UNWIND_HFILES=unwind.h unwind_i.h rse.h unwind_x86_64.h
CFILES=main.c tools.c global_data.c memory.c filesys.c help.c task.c \
kernel.c test.c gdb_interface.c configure.c net.c dev.c \
@@ -77,7 +77,7 @@ CFILES=main.c tools.c global_data.c memo
extensions.c remote.c va_server.c va_server_v1.c symbols.c cmdline.c \
lkcd_common.c lkcd_v1.c lkcd_v2_v3.c lkcd_v5.c lkcd_v7.c lkcd_v8.c\
lkcd_fix_mem.c s390_dump.c lkcd_x86_trace.c \
- netdump.c diskdump.c xendump.c unwind.c unwind_decoder.c
+ netdump.c diskdump.c xendump.c unwind.c unwind_decoder.c unwind_x86_64.c
SOURCE_FILES=${CFILES} ${GENERIC_HFILES} ${MCORE_HFILES} \
${REDHAT_CFILES} ${REDHAT_HFILES} ${UNWIND_HFILES} \
@@ -89,7 +89,7 @@ OBJECT_FILES=main.o tools.o global_data.
extensions.o remote.o va_server.o va_server_v1.o symbols.o cmdline.o \
lkcd_common.o lkcd_v1.o lkcd_v2_v3.o lkcd_v5.o lkcd_v7.o lkcd_v8.o \
lkcd_fix_mem.o s390_dump.o netdump.o diskdump.o xendump.o \
- lkcd_x86_trace.o unwind_v1.o unwind_v2.o unwind_v3.o
+ lkcd_x86_trace.o unwind_v1.o unwind_v2.o unwind_v3.o unwind_x86_64.o
# These are the current set of crash extensions sources. They are not built
# by default unless the third command line of the "all:" stanza is uncommented.
@@ -387,6 +387,9 @@ extensions.o: ${GENERIC_HFILES} extensio
lkcd_x86_trace.o: ${GENERIC_HFILES} ${LKCD_TRACE_HFILES} lkcd_x86_trace.c
cc -c ${CFLAGS} -DREDHAT lkcd_x86_trace.c ${WARNING_OPTIONS} ${WARNING_ERROR}
+unwind_x86_64.o: ${GENERIC_HFILES} ${UNWIND_HFILES} unwind_x86_64.c
+ cc -c ${CFLAGS} -DREDHAT -DUNWIND_V1 unwind_x86_64.c -o unwind_x86_64.o ${WARNING_OPTIONS} ${WARNING_ERROR}
+
unwind_v1.o: ${GENERIC_HFILES} ${UNWIND_HFILES} unwind.c unwind_decoder.c
cc -c ${CFLAGS} -DREDHAT -DUNWIND_V1 unwind.c -o unwind_v1.o ${WARNING_OPTIONS} ${WARNING_ERROR}
_
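For reference, the hard-coded user_regs_struct values in the get_netdump_regs_x86_64() hunk above (size 27 longs, rip at index 16, rsp at index 19) can be cross-checked with a small standalone sketch. This is not part of the patch. The struct user_regs_mirror and the helper reg_at() are hypothetical names used only for illustration, and the field order is assumed to follow the x86_64 user_regs_struct layout of that era.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memcpy */

/*
 * Hypothetical local mirror of the x86_64 user_regs_struct layout
 * (assumed field order; illustration only, not crash source code).
 */
struct user_regs_mirror {
	unsigned long r15, r14, r13, r12, rbp, rbx, r11, r10;
	unsigned long r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
	unsigned long rip, cs, eflags;
	unsigned long rsp, ss;
	unsigned long fs_base, gs_base, ds, es, fs, gs;
};

/* The three constants the patch assigns by hand. */
static_assert(sizeof(struct user_regs_mirror) == 27 * sizeof(unsigned long),
	      "user_regs_struct size");
static_assert(offsetof(struct user_regs_mirror, rip) == 16 * sizeof(unsigned long),
	      "rip offset");
static_assert(offsetof(struct user_regs_mirror, rsp) == 19 * sizeof(unsigned long),
	      "rsp offset");

/*
 * Hypothetical helper: fetch one register from a raw register block at a
 * byte offset, similar in spirit to the patch's
 * *(ulong *)(user_regs + OFFSET(...)) reads.
 */
unsigned long reg_at(const char *user_regs, size_t offset)
{
	unsigned long value;

	memcpy(&value, user_regs + offset, sizeof(value));
	return value;
}
```

If the assumed layout holds, the static assertions compile cleanly, which suggests the constants in the patch are at least internally consistent, although deriving them from the debuginfo would be more robust than hard-coding them.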