Badari Pulavarty wrote:
On Tue, 2006-04-25 at 14:42 -0400, Dave Anderson wrote:
> Badari Pulavarty wrote:
>
> > Hi,
> >
> > I get the following crash warnings on an x86-64 machine. Wondering why?
> > Also, it's not showing stacks correctly.
> >
> > Thanks,
> > Badari
> >
> >  # ./crash /var/log/dump/2006-04-24-08:02/vmcore  /usr/src/linux/vmlinux
> >
> > crash 4.0-2.23
> > Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006  IBM Corporation
> > Copyright (C) 1999-2006  Hewlett-Packard Co
> > Copyright (C) 2005  Fujitsu Limited
> > Copyright (C) 2005  NEC Corporation
> > Copyright (C) 1999, 2002  Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions.  Enter "help copying" to see the conditions.
> > This program has absolutely no warranty.  Enter "help warranty" for details.
> >
> > GNU gdb 6.1
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "x86_64-unknown-linux-gnu"...
> >
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> > WARNING: possibly bogus exception frame
> >       KERNEL: /usr/src/linux/vmlinux
> >     DUMPFILE: /var/log/dump/2006-04-24-08:02/vmcore
> >         CPUS: 2
> >         DATE: Mon Apr 24 08:02:03 2006
> >       UPTIME: 00:06:31
> > LOAD AVERAGE: 0.00, 0.00, 0.00
> >        TASKS: 63
> >     NODENAME: elm3a242
> >      RELEASE: 2.6.16-20-smp
> >      VERSION: #1 SMP Mon Apr 10 04:51:13 UTC 2006
> >      MACHINE: x86_64  (3000 Mhz)
> >       MEMORY: 4.6 GB
> >        PANIC: "SysRq : Trigger a crashdump"
> >          PID: 0
> >      COMMAND: "swapper"
> >         TASK: ffffffff80335340  (1 of 2)  [THREAD_INFO: ffffffff8045c000]
> >          CPU: 0
> >        STATE: TASK_RUNNING (ACTIVE)
> >
> > crash> bt
> > PID: 0      TASK: ffffffff80335340  CPU: 0   COMMAND: "swapper"
> >  #0 [ffffffff8045dee8] schedule at ffffffff802cf6fa
>
> It's hard to debug this from here, but...
>
> Two things look strange, (1) it's not finding the proper starting
> point for the panicking (?) idle thread -- and possibly not even finding
> the correct panic task, and (2) the "possibly bogus exception
> frame" messages are due to the x86_64.c x86_64_eframe_verify()
> function finding something irregular in the exception frames (the
> pt_regs) of several processes while it made a search of all
> possible processes for the panic task.
>
> I would guess that if you do a "foreach bt", you will see the
> "possibly bogus" messages associated with the user-space
> exception frames of all user-space generated processes
> (i.e. not kernel threads).  It would be interesting to see what
> those frames look like, and why they are considered strange,
> probably a new cs or ss value that's never been used before?
>
> As far as the determination of the panic task, I'm presuming
> that this was generated from a kdump dumpfile.  The netdump.c
> get_netdump_panic_task() function, which has a bunch of
> kdump-specific code, is failing to find the panic task from the
> data in the ELF header notes.  Running "crash -d1 ..." will indicate
> how crash is trying to determine the panic task.  I don't know
> whether the idle task was even the one that took the sysrq,
> or whether it just defaulted to that task because it couldn't find
> any other likely suspects.  You'll have to debug it from your
> end, starting from get_netdump_panic_task().
>
> Dave

Dave,

I added a little debugging and found that x86_64_eframe_verify() returns
FALSE due to !(rflags & 0x2) (rflags = 0x200 in this dump).
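
To make that one failing test concrete, here is a minimal C sketch of
the bit check in isolation. It is not the actual x86_64_eframe_verify()
source: the helper name rflags_reserved_bit_ok() and the macro name are
hypothetical, and the real function looks at other fields of the frame
as well.

```c
#include <stdint.h>

/* Bit 1 of RFLAGS: documented as reserved, always read as one. */
#define RFLAGS_RESERVED_ONE 0x2ULL

/*
 * Hypothetical helper, not crash source: returns 1 when the
 * reserved bit is set, and 0 when a test of the form
 * !(rflags & 0x2) would reject the exception frame.
 */
static int rflags_reserved_bit_ok(uint64_t rflags)
{
    return (rflags & RFLAGS_RESERVED_ONE) != 0;
}
```

With rflags = 0x200 the helper returns 0, matching the FALSE seen in
this dump; a value such as 0x246 or 0x202 returns 1.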

Given that "crash" runs fine on a live machine, I am going to assume
that it's a problem with the kdump format for now :(
 

No -- wait -- please don't!    ;-)

Are you saying that when you do a "foreach bt", you see
an RFLAGS of 0x200 in the kernel-entry exception frame for
the user tasks?  And that everything else in the exception
frame looks "normal"?

For example, here's typical output:

crash> for bt | grep RFLAGS
    RIP: 0000003b92cc27c3  RSP: 00007ffffff8d898  RFLAGS: 00000246
    RIP: 00002b915e8457c3  RSP: 00007fffff94a858  RFLAGS: 00000246
    RIP: 00002b74ed07d812  RSP: 00007fffffdba4d0  RFLAGS: 00000202
    RIP: 00002b74eceaf436  RSP: 00000000409fffc0  RFLAGS: 00000246
    RIP: 00002b458542b7c3  RSP: 00007fffffd23fb8  RFLAGS: 00000246
    RIP: 00002b9f84957be0  RSP: 00007fffffb22f58  RFLAGS: 00000246
    RIP: 00002b7c74dc6e40  RSP: 00007fffffe348f8  RFLAGS: 00000246
    RIP: 00002b2a029c597f  RSP: 00007fffff8a4fe0  RFLAGS: 00000246
    RIP: 00002ab5bbbbe7c3  RSP: 00007fffffcedbf8  RFLAGS: 00000246
    RIP: 00002aef3c428693  RSP: 00007fffffb6b408  RFLAGS: 00000246
    RIP: 00002b94261777c3  RSP: 00007fffffe815e8  RFLAGS: 00000246
    RIP: 00002b24be49597f  RSP: 00007fffffde8230  RFLAGS: 00000246
    RIP: 0000003b92cc097f  RSP: 00007fffffca7ba0  RFLAGS: 00000246
    RIP: 0000003b94c0bcbb  RSP: 00007ffffff15b90  RFLAGS: 00000206
    RIP: 0000003b92cc27c3  RSP: 00007fffffdd5328  RFLAGS: 00000246
    RIP: 0000003b94c0be01  RSP: 00000000409fa180  RFLAGS: 00000246
    RIP: 00002af05a6197c3  RSP: 00007fffffa4ad58  RFLAGS: 00000246
    RIP: 00002ad6382227c3  RSP: 00007fffffc71918  RFLAGS: 00000246
    RIP: 00002b9b7166f7c3  RSP: 00007ffffff33e58  RFLAGS: 00000246
    RIP: 00002af1423fcdd0  RSP: 00007fffff965b48  RFLAGS: 00000246
    RIP: 0000003b92cc27c3  RSP: 00007fffffe534c8  RFLAGS: 00000246
    RIP: 00002b7269ec2e40  RSP: 00007fffff83fb78  RFLAGS: 00000246
    RIP: 0000003b92cc27c3  RSP: 00007ffffff15df8  RFLAGS: 00000246
    RIP: 00002afbd6a0fe40  RSP: 00007fffffed0f38  RFLAGS: 00000246
    RIP: 0000003b92cc09b6  RSP: 00007fffff82f820  RFLAGS: 00000206
    RIP: 0000003b94c0bebc  RSP: 00000000409ffd60  RFLAGS: 00000202
    RIP: 0000003b92cc097f  RSP: 00007fffff98ba60  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffff98ba38  RFLAGS: 00000246
    RIP: 0000003b92cc097f  RSP: 00007fffff8a4950  RFLAGS: 00000246
    RIP: 0000003b92cc097f  RSP: 00007fffffa83e30  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffff8cb188  RFLAGS: 00000246
    RIP: 0000003b92c91e40  RSP: 00007ffffff61628  RFLAGS: 00000246
    RIP: 0000003b92cc27c3  RSP: 00007ffffff51378  RFLAGS: 00000246
    RIP: 0000003b94c0c7d5  RSP: 00007fffffefb7c0  RFLAGS: 00000246
    RIP: 0000003b94c0b01d  RSP: 00007fffffefaef8  RFLAGS: 00000246
    RIP: 0000003b94c0bcbb  RSP: 00000000409ff490  RFLAGS: 00000206
    RIP: 0000003b94c0bcbb  RSP: 0000000041400490  RFLAGS: 00000206
    RIP: 0000003b92cc2812  RSP: 00007fffffb8af60  RFLAGS: 00000202
    RIP: 0000003b92cbbc1b  RSP: 0000000040a00160  RFLAGS: 00000202
    RIP: 0000003b94c09436  RSP: 0000000041e01660  RFLAGS: 00000246
    RIP: 0000003b94c0ba7b  RSP: 0000000042803160  RFLAGS: 00000202
    RIP: 00002b5a7b8377c3  RSP: 00007fffff82dc18  RFLAGS: 00000246
    RIP: 0000003b94c0bcbb  RSP: 0000000043202f20  RFLAGS: 00000206
    RIP: 0000003b94c0bcbb  RSP: 0000000043c03f20  RFLAGS: 00000206
    RIP: 0000003b92cbbbe0  RSP: 00007fffffe19c68  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffff826588  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffffe74ad8  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffffc476d8  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffff9ccee8  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffffc641d8  RFLAGS: 00000246
    RIP: 00002b7481a327c3  RSP: 00007fffffe25ff8  RFLAGS: 00000246
    RIP: 0000003b92c91a45  RSP: 00007fffffc42bc0  RFLAGS: 00000246
    RIP: 0000003b92cbbbe0  RSP: 00007fffff84ac48  RFLAGS: 00000246
crash>

At least the original AMD System Programming Guide indicates
that bit 1 of the RFLAGS register is "Reserved, Read as One".
But perhaps that's changed, or Intel uses it otherwise?  Can you
show the output of the above command?
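
For reading RFLAGS values like the ones listed above, here is a small C
sketch that names the low status and control bits. The bit positions
(CF 0, reserved 1, PF 2, AF 4, ZF 6, SF 7, TF 8, IF 9, DF 10, OF 11)
follow the architecture manuals; the function decode_rflags(), the
table, and the "RES1" label for the reserved bit are hypothetical names
made up for this illustration and are not part of crash.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
    uint64_t mask;
    const char *name;
};

/* Low 12 bits of RFLAGS; "RES1" is the reserved, read-as-one bit 1. */
static const struct flag_name rflags_names[] = {
    { 0x001, "CF" }, { 0x002, "RES1" }, { 0x004, "PF" },
    { 0x010, "AF" }, { 0x040, "ZF" },   { 0x080, "SF" },
    { 0x100, "TF" }, { 0x200, "IF" },   { 0x400, "DF" },
    { 0x800, "OF" },
};

/*
 * Write the space-separated names of the set bits into buf,
 * lowest bit first, and return buf. Output is cut short if
 * buf is too small.
 */
static char *decode_rflags(uint64_t rflags, char *buf, size_t len)
{
    size_t i, used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (i = 0; i < sizeof(rflags_names) / sizeof(rflags_names[0]); i++) {
        int n;

        if (!(rflags & rflags_names[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", rflags_names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* out of room: stop here */
        used += (size_t)n;
    }
    return buf;
}
```

Given a 64-byte buffer, 0x246 decodes to "RES1 PF ZF IF" and 0x202 to
"RES1 IF", while 0x200 decodes to just "IF", i.e. interrupts enabled
but the reserved bit clear.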

Thanks,
  Dave