On Wed, Nov 08, 2006 at 09:58:53AM -0500, Dave Anderson wrote:
 Rachita Kothiyal wrote:
 
 >
 >
 > Hi Dave
 >
 > With 4.0-3.8 and older versions of crash, I used to see this message
 > "possibly bogus exception frame" on starting crash. That seems to have
 > gone now with crash-4.0-3.9. However, I am still getting this message
 > when I do a bt on the latest crash (kdump-generated vmcore).
 >
 >
 > On crash-4.0-3.9
 >
 > crash> bt
 > PID: 0      TASK: ffffffff805564c0  CPU: 0   COMMAND: "swapper"
 >  #0 [ffffffff8064bce8] crash_kexec at ffffffff80152225
 >  #1 [ffffffff8064bd30] machine_kexec at ffffffff8011a739
 >  #2 [ffffffff8064bd70] crash_kexec at ffffffff80152241
 >  #3 [ffffffff8064bdf8] crash_kexec at ffffffff80152225
 >  #4 [ffffffff8064be20] bust_spinlocks at ffffffff8011fd6d
 >  #5 [ffffffff8064be30] panic at ffffffff80131420
 >  #6 [ffffffff8064bef8] hrtimer_run_queues at ffffffff80145f6e
 >  #7 [ffffffff8064bf20] handle_IRQ_event at ffffffff80154432
 >  #8 [ffffffff8064bf50] __do_IRQ at ffffffff8015451f
 >  #9 [ffffffff8064bf58] __do_softirq at ffffffff80136ba3
 > #10 [ffffffff8064bf90] do_IRQ at ffffffff8010bda1
 > --- <IRQ stack> ---
 > #11 [ffffffff806f7f20] ret_from_intr at ffffffff80109b95
 >     [exception RIP: cpu_idle+149]
 >     RIP: ffffffff8010890f  RSP: 000000000008e000  RFLAGS: ffffffff8070379c
 >     RAX: ffffffffffffffff  RBX: 0000000000000000  RCX: ffffffff80108968
 >     RDX: 0000000000000010  RSI: 0000000000000246  RDI: ffffffff806f7fa0
 >     RBP: ffffffff806f6000   R8: ffffffff80557db8   R9: 0000000000000001
 >     R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
 >     R13: 0000000000000000  R14: ffffffff803951dc  R15: 000000000008e000
 >     ORIG_RAX: 0000000000000018  CS: 20800  SS: 0000
 > bt: WARNING: possibly bogus exception frame
 > #12 [ffffffff806f7fd0] x86_64_start_kernel at ffffffff80703296
 >
 > On doing a 'help -m' I find that irq_eframe_link is zero. Is that OK?
 >
 > Thanks
 > Rachita
 
 Clearly the exception frame is bogus (look at RSP and RFLAGS), so
 if your kernel's ".macro interrupt func" pushes rbp instead
 of rdi prior to calling the interrupt handler, then the
 irq_eframe_link shouldn't be zero.
 
 Do a "dis common_interrupt" -- in a RHEL5 kernel it looks like
 this:
 
 crash> dis common_interrupt
 0xffffffff8005b968 <common_interrupt>:  cld
 0xffffffff8005b969 <common_interrupt+1>:        sub    $0x48,%rsp
 0xffffffff8005b96d <common_interrupt+5>:        mov    %rdi,0x40(%rsp)
 0xffffffff8005b972 <common_interrupt+10>:       mov    %rsi,0x38(%rsp)
 0xffffffff8005b977 <common_interrupt+15>:       mov    %rdx,0x30(%rsp)
 0xffffffff8005b97c <common_interrupt+20>:       mov    %rcx,0x28(%rsp)
 0xffffffff8005b981 <common_interrupt+25>:       mov    %rax,0x20(%rsp)
 0xffffffff8005b986 <common_interrupt+30>:       mov    %r8,0x18(%rsp)
 0xffffffff8005b98b <common_interrupt+35>:       mov    %r9,0x10(%rsp)
 0xffffffff8005b990 <common_interrupt+40>:       mov    %r10,0x8(%rsp)
 0xffffffff8005b995 <common_interrupt+45>:       mov    %r11,(%rsp)
 0xffffffff8005b999 <common_interrupt+49>:       lea    0xffffffffffffffd0(%rsp),%rdi
 0xffffffff8005b99e <common_interrupt+54>:       push   %rbp
 0xffffffff8005b99f <common_interrupt+55>:       mov    %rsp,%rbp
 0xffffffff8005b9a2 <common_interrupt+58>:       testl  $0x3,0x88(%rdi)
 0xffffffff8005b9ac <common_interrupt+68>:       je     0xffffffff8005b9b1 <common_interrupt+73>
 0xffffffff8005b9ae <common_interrupt+70>:       invlpg %ax
 0xffffffff8005b9b1 <common_interrupt+73>:       incl   %gs:0x28
 0xffffffff8005b9b9 <common_interrupt+81>:       cmove  %gs:0x30,%rsp
 0xffffffff8005b9c3 <common_interrupt+91>:       push   %rbp
 0xffffffff8005b9c4 <common_interrupt+92>:       callq  0xffffffff8006a57b <do_IRQ>
 crash>
 
 If "crash --machdep irq_eframe_link=40 ..." works, then
 something in x86_64_irq_eframe_link_init() needs to be
 looked at. 
 
Hi Dave,

The "dis common_interrupt" output looks exactly like the above, and with
--machdep irq_eframe_link=40 on the command line I don't see the bogus
frames in the bt.

Thanks
Rachita
 Dave