On Wed, Nov 08, 2006 at 09:58:53AM -0500, Dave Anderson wrote:
Rachita Kothiyal wrote:
>
>
> Hi Dave
>
> With 4.0-3.8 and older versions of crash, I used to see this message
> "possibly bogus exception frame" on starting crash. That seems to have
> gone now with crash-4.0-3.9. However, I am still getting this message
> when I do a bt on the latest crash (kdump-generated vmcore).
>
>
> On crash-4.0-3.9
>
> crash> bt
> PID: 0 TASK: ffffffff805564c0 CPU: 0 COMMAND: "swapper"
> #0 [ffffffff8064bce8] crash_kexec at ffffffff80152225
> #1 [ffffffff8064bd30] machine_kexec at ffffffff8011a739
> #2 [ffffffff8064bd70] crash_kexec at ffffffff80152241
> #3 [ffffffff8064bdf8] crash_kexec at ffffffff80152225
> #4 [ffffffff8064be20] bust_spinlocks at ffffffff8011fd6d
> #5 [ffffffff8064be30] panic at ffffffff80131420
> #6 [ffffffff8064bef8] hrtimer_run_queues at ffffffff80145f6e
> #7 [ffffffff8064bf20] handle_IRQ_event at ffffffff80154432
> #8 [ffffffff8064bf50] __do_IRQ at ffffffff8015451f
> #9 [ffffffff8064bf58] __do_softirq at ffffffff80136ba3
> #10 [ffffffff8064bf90] do_IRQ at ffffffff8010bda1
> --- <IRQ stack> ---
> #11 [ffffffff806f7f20] ret_from_intr at ffffffff80109b95
> [exception RIP: cpu_idle+149]
> RIP: ffffffff8010890f RSP: 000000000008e000 RFLAGS: ffffffff8070379c
> RAX: ffffffffffffffff RBX: 0000000000000000 RCX: ffffffff80108968
> RDX: 0000000000000010 RSI: 0000000000000246 RDI: ffffffff806f7fa0
> RBP: ffffffff806f6000 R8: ffffffff80557db8 R9: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffffffff803951dc R15: 000000000008e000
> ORIG_RAX: 0000000000000018 CS: 20800 SS: 0000
> bt: WARNING: possibly bogus exception frame
> #12 [ffffffff806f7fd0] x86_64_start_kernel at ffffffff80703296
>
> On doing a 'help -m' I find that irq_eframe_link is zero. Is that OK?
>
> Thanks
> Rachita
Clearly the exception frame is bogus (RSP and RFLAGS), so
if your kernel's ".macro interrupt func" pushes rbp instead
of rdi prior to calling the interrupt handler, then the
irq_eframe_link shouldn't be zero.
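
To illustrate why the RSP and RFLAGS values in that frame stand out, here is a minimal Python sketch of the kind of plausibility check one could apply by hand. It is not crash's own check; the function name and constants are made up for illustration. It assumes the interrupted context was in kernel mode (cpu_idle), so RSP should be a kernel address, and it relies on bits 22-63 of RFLAGS being reserved as zero and bit 1 always reading as one.

```python
# Hypothetical helper, not part of crash: flags exception-frame fields
# that cannot be right for an interrupt taken in kernel mode on x86_64.
KERNEL_BASE = 0xffff800000000000                        # canonical upper half starts here
RFLAGS_RESERVED = ((1 << 64) - 1) & ~((1 << 22) - 1)    # bits 22..63 must be zero
RFLAGS_ALWAYS_SET = 0x2                                 # bit 1 always reads as 1


def suspicious_fields(rsp, rflags):
    """Return the names of the fields that look implausible."""
    problems = []
    if rsp < KERNEL_BASE:
        problems.append("RSP")
    if (rflags & RFLAGS_RESERVED) or not (rflags & RFLAGS_ALWAYS_SET):
        problems.append("RFLAGS")
    return problems


# Values taken from the exception frame in the bt output above.
print(suspicious_fields(0x000000000008e000, 0xffffffff8070379c))
```

With the values from the frame, both fields are reported: RSP is far below any kernel address, and RFLAGS looks like a kernel text address rather than a flags word.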
Do a "dis common_interrupt" -- in a RHEL5 kernel it looks like
this:
crash> dis common_interrupt
0xffffffff8005b968 <common_interrupt>: cld
0xffffffff8005b969 <common_interrupt+1>: sub $0x48,%rsp
0xffffffff8005b96d <common_interrupt+5>: mov %rdi,0x40(%rsp)
0xffffffff8005b972 <common_interrupt+10>: mov %rsi,0x38(%rsp)
0xffffffff8005b977 <common_interrupt+15>: mov %rdx,0x30(%rsp)
0xffffffff8005b97c <common_interrupt+20>: mov %rcx,0x28(%rsp)
0xffffffff8005b981 <common_interrupt+25>: mov %rax,0x20(%rsp)
0xffffffff8005b986 <common_interrupt+30>: mov %r8,0x18(%rsp)
0xffffffff8005b98b <common_interrupt+35>: mov %r9,0x10(%rsp)
0xffffffff8005b990 <common_interrupt+40>: mov %r10,0x8(%rsp)
0xffffffff8005b995 <common_interrupt+45>: mov %r11,(%rsp)
0xffffffff8005b999 <common_interrupt+49>: lea 0xffffffffffffffd0(%rsp),%rdi
0xffffffff8005b99e <common_interrupt+54>: push %rbp
0xffffffff8005b99f <common_interrupt+55>: mov %rsp,%rbp
0xffffffff8005b9a2 <common_interrupt+58>: testl $0x3,0x88(%rdi)
0xffffffff8005b9ac <common_interrupt+68>: je 0xffffffff8005b9b1 <common_interrupt+73>
0xffffffff8005b9ae <common_interrupt+70>: invlpg %ax
0xffffffff8005b9b1 <common_interrupt+73>: incl %gs:0x28
0xffffffff8005b9b9 <common_interrupt+81>: cmove %gs:0x30,%rsp
0xffffffff8005b9c3 <common_interrupt+91>: push %rbp
0xffffffff8005b9c4 <common_interrupt+92>: callq 0xffffffff8006a57b <do_IRQ>
crash>
If "crash --machdep irq_eframe_link=40 ..." works, then
something in x86_64_irq_eframe_link_init() needs to be
looked at.
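
For what it's worth, the value 40 can be read off the disassembly above. The following Python sketch only redoes that arithmetic; the helper names are invented for illustration, and it assumes that %rdi (set by the lea) points at the start of the exception frame and that the %rbp value pushed on the IRQ stack is the link crash follows. How crash applies the offset internally is not shown here.

```python
# Hypothetical helpers, not crash code: derive the distance between the
# rbp value pushed on the IRQ stack and the frame pointer held in rdi.
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit unsigned displacement as two's-complement."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def link_offset(lea_disp_unsigned, push_bytes=8):
    rsp = 0                                    # rsp right after "sub $0x48,%rsp"
    rdi = rsp + to_signed64(lea_disp_unsigned) # lea 0xffffffffffffffd0(%rsp),%rdi
    rbp = rsp - push_bytes                     # push %rbp; mov %rsp,%rbp
    return rbp - rdi


print(link_offset(0xffffffffffffffd0))  # 40
```

The lea displacement is -0x30 and the push moves %rsp down by 8, so the saved %rbp sits 0x28 (40) bytes above the address in %rdi.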
Hi Dave
The "dis common_interrupt" output looks exactly like the above, and with
--machdep irq_eframe_link=40 on the command line I don't see the bogus
frames in the bt.
Thanks
Rachita
Dave