Badari Pulavarty wrote:
Hi,
I get the following crash warnings on an x86-64 machine, and I'm wondering why.
Also, it's not showing stacks correctly.
Thanks,
Badari
# ./crash /var/log/dump/2006-04-24-08:02/vmcore /usr/src/linux/vmlinux
crash 4.0-2.23
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
KERNEL: /usr/src/linux/vmlinux
DUMPFILE: /var/log/dump/2006-04-24-08:02/vmcore
CPUS: 2
DATE: Mon Apr 24 08:02:03 2006
UPTIME: 00:06:31
LOAD AVERAGE: 0.00, 0.00, 0.00
TASKS: 63
NODENAME: elm3a242
RELEASE: 2.6.16-20-smp
VERSION: #1 SMP Mon Apr 10 04:51:13 UTC 2006
MACHINE: x86_64 (3000 Mhz)
MEMORY: 4.6 GB
PANIC: "SysRq : Trigger a crashdump"
PID: 0
COMMAND: "swapper"
TASK: ffffffff80335340 (1 of 2) [THREAD_INFO: ffffffff8045c000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash> bt
PID: 0 TASK: ffffffff80335340 CPU: 0 COMMAND: "swapper"
#0 [ffffffff8045dee8] schedule at ffffffff802cf6fa
It's hard to debug this from here, but two things look strange:
(1) crash is not finding the proper starting point for the backtrace
of the (possibly panicking) idle thread, and may not even be finding
the correct panic task; and
(2) the "possibly bogus exception frame" messages come from the
x86_64_eframe_verify() function in x86_64.c, which found something
irregular in the exception frames (the pt_regs) of several processes
while crash searched all possible processes for the panic task.
I would guess that if you do a "foreach bt", you will see the
"possibly bogus" messages associated with the user-space
exception frames of all processes that originated in user space
(i.e. not kernel threads). It would be interesting to see what
those frames look like and why they are considered strange.
Perhaps they contain a new cs or ss value that has never been
used before?
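As a rough illustration of the kind of check that could reject a frame
because of an unexpected cs or ss value, here is a minimal sketch. It is
not the x86_64_eframe_verify() code from crash. The selector constants are
assumed values for a 2.6.16-era x86_64 kernel, and the function names and
the KERNEL_BASE boundary are hypothetical, chosen only for this example.

```python
# Hypothetical sketch, NOT the real x86_64_eframe_verify() from crash.
# Selector values below are assumptions for a 2.6.16-era x86_64 kernel.
KERNEL_CS, KERNEL_DS = 0x10, 0x18
USER_CS, USER_DS = 0x33, 0x2b
USER32_CS = 0x23

# Assumed boundary: start of the upper (kernel) half of the canonical
# x86_64 address space.
KERNEL_BASE = 0xffff800000000000


def is_kernel_addr(addr):
    """True if addr lies in the kernel half of the address space."""
    return addr >= KERNEL_BASE


def eframe_looks_sane(cs, ss, rip, rsp):
    """Accept a saved frame only if cs/ss form a known pair and rip/rsp
    point into the matching half of the address space."""
    if cs == KERNEL_CS and ss == KERNEL_DS:
        return is_kernel_addr(rip) and is_kernel_addr(rsp)
    if cs in (USER_CS, USER32_CS) and ss == USER_DS:
        return not is_kernel_addr(rip) and not is_kernel_addr(rsp)
    # Any other selector combination is treated as suspicious.
    return False


if __name__ == "__main__":
    # A user-space frame with an unrecognized cs value is flagged.
    if not eframe_looks_sane(0x1b, 0x2b, 0x400000, 0x7fffffffe000):
        print("WARNING: possibly bogus exception frame")
```

With a check of this shape, a kernel that started using a selector value
the tool has never seen would trigger the warning for every user-space
frame, which would be consistent with seeing one warning per process.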
As for determining the panic task, I'm presuming that this
dumpfile was generated by kdump. The get_netdump_panic_task()
function in netdump.c, which has a bunch of kdump-specific code,
is failing to find the panic task from the data in the ELF header
notes. Running "crash -d1 ..." will show how crash is trying to
determine the panic task. I don't know whether the idle task was
even the one that took the sysrq, or whether crash just defaulted
to that task because it couldn't find any other likely suspects.
You'll have to debug it from your end, starting from
get_netdump_panic_task().
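To inspect the ELF notes independently of crash, something like the
following sketch may help. It walks the raw bytes of a PT_NOTE segment
using the standard ELF note layout and pulls pr_pid out of each
NT_PRSTATUS note. It is not how get_netdump_panic_task() works. The
helper names are hypothetical, the pr_pid offset of 32 is an assumption
about the x86_64 struct elf_prstatus layout, and extracting the note
segment from the vmcore is left out.

```python
import struct

NT_PRSTATUS = 1      # standard ELF note type for process/CPU status
PR_PID_OFFSET = 32   # assumed offset of pr_pid in x86_64 elf_prstatus


def align4(n):
    """Round n up to the next multiple of 4 (ELF note padding)."""
    return (n + 3) & ~3


def iter_notes(buf):
    """Yield (name, type, desc) for each note in a little-endian
    PT_NOTE segment."""
    off = 0
    while off + 12 <= len(buf):
        namesz, descsz, ntype = struct.unpack_from("<III", buf, off)
        off += 12
        name = buf[off:off + namesz].rstrip(b"\0")
        off += align4(namesz)
        desc = buf[off:off + descsz]
        off += align4(descsz)
        yield name, ntype, desc


def prstatus_pids(buf):
    """Return pr_pid from every NT_PRSTATUS note, in note order."""
    pids = []
    for _name, ntype, desc in iter_notes(buf):
        if ntype == NT_PRSTATUS and len(desc) >= PR_PID_OFFSET + 4:
            pids.append(struct.unpack_from("<i", desc, PR_PID_OFFSET)[0])
    return pids
```

The pids recovered this way are only candidates. Comparing them with the
task that crash reports as the panic task, together with the "crash -d1"
output, may show where the search goes wrong.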
Dave