Dave Anderson reached out and wrote:
----- Original Message -----
[root kvm7 127.0.0.1-2014-02-07-19:17:09]# crash /boot/System.map-2.6.32-220.el6.x86_64.debug /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64.debug/vmlinux vmcore
crash 5.1.8-1.el6
Copyright (C) 2002-2011 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for
details.
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
crash: page excluded: kernel virtual address: ffffffff81542000 type: "cpu_possible_mask"
I can get into minimal mode.
nm -Bn /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64.debug/vmlinux | grep _stext
ffffffff81000198 T _stext
cat /proc/kallsyms | grep _stext
ffffffff81000198 T _stext
If I use the System.map parameter I get this warning:
WARNING: kernels compiled by different gcc versions:
/usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64.debug/vmlinux: 4.4.5
vmcore kernel: 4.4.6
I would really like to understand why this system crashed. I know I'm a bit
behind on my kernel versions, but I should still be able to look at this
kernel, shouldn't I?
Thanks
Tory
It looks like the vmcore and vmlinux file don't match, like maybe the
crashing system was running the standard 2.6.32-220.el6.x86_64 kernel,
and you're trying to debug it using the 2.6.32-220.el6.x86_64.debug
kernel variant?
First thing -- *never* use a System.map file unless for some reason you
don't have the original kernel's vmlinux available *and* you feel that
the vmlinux file you have is very close to the crashing kernel's vmlinux.
But with any RHEL standard (unmodified) vmlinux/vmcore setup, the
System.map is completely useless.
So the first question is: what kernel generated the vmcore?
Do this:
$ strings vmcore | grep '2.6.32'
Dave
--
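The `strings vmcore | grep` check above can also be scripted. The following is only a rough sketch in Python: it assumes the dump contains a readable, newline-terminated `OSRELEASE=<release>` entry (as the `strings` output later in this thread shows for this vmcore). The function names `find_osrelease` and `scan_vmcore` are made up for illustration and are not part of crash or any kdump tool.

```python
import re

# Assumption: the release is stored as 'OSRELEASE=<printable text>' followed
# by a newline or NUL, which is what the strings output in this thread shows.
OSRELEASE_RE = re.compile(rb"OSRELEASE=([\x21-\x7e]+)[\n\x00]")


def find_osrelease(data: bytes):
    """Return the distinct release strings found in a block of bytes."""
    found = []
    for match in OSRELEASE_RE.findall(data):
        release = match.decode("ascii")
        if release not in found:
            found.append(release)
    return found


def scan_vmcore(path, chunk_size=1 << 20, overlap=256):
    """Scan a (possibly huge) dump file chunk by chunk (hypothetical helper).

    A short tail of already-read bytes is carried over so that an entry
    straddling a chunk boundary is still seen in one piece.
    """
    found = []
    tail = b""
    with open(path, "rb") as dump:
        while True:
            chunk = dump.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            for release in find_osrelease(window):
                if release not in found:
                    found.append(release)
            tail = window[-overlap:]
    return found


if __name__ == "__main__":
    sample = b"\x00\x00OSRELEASE=2.6.32-220.el6.x86_64\nPAGESIZE=4096\n"
    print(find_osrelease(sample))
```

On a real dump one would call `scan_vmcore("vmcore")` and compare the result with the kernel version in the vmlinux path handed to crash.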
Dave, you are right. I thought I had to use the debug kernel, but in fact
my system is not running that; it crashed with the standard
2.6.32-220.el6.x86_64 kernel.
[tblue@kvm7 127.0.0.1-2014-02-07-19:17:09]$ sudo strings vmcore | grep '2.6.32'
2.6.32-220.el6.x86_64
OSRELEASE=2.6.32-220.el6.x86_64
But crash won't take the vmlinuz from /boot:
crash: /boot/vmlinuz-2.6.32-220.el6.x86_64: not a supported file format
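The "not a supported file format" message fits the fact that /boot/vmlinuz-* is the compressed boot image, while crash wants the uncompressed ELF vmlinux (here, the one under /usr/lib/debug). A quick sanity check before handing a file to crash is to look for the ELF magic bytes. This is a minimal sketch; `is_elf` and `looks_like_vmlinux` are illustrative names, and passing the check does not prove the file carries debuginfo or matches the vmcore.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(header: bytes) -> bool:
    """True if the given leading bytes start with the ELF magic."""
    return header[:4] == ELF_MAGIC


def looks_like_vmlinux(path: str) -> bool:
    """Read the first four bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as candidate:
        return is_elf(candidate.read(4))


if __name__ == "__main__":
    print(is_elf(b"\x7fELF\x02\x01\x01"))  # ELF header prefix
    print(is_elf(b"MZ\x00\x00"))           # not ELF
```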
Yes sir, you were correct: I was using the wrong kernel!
please wait... (determining panic task)
WARNING: multiple active tasks have called die
KERNEL: /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64/vmlinux
DUMPFILE: /libvirt/crash/127.0.0.1-2014-02-07-19:17:09/vmcore [PARTIAL DUMP]
CPUS: 32
DATE: Fri Feb 7 18:16:05 2014
UPTIME: 226 days, 21:36:13
LOAD AVERAGE: 2.42, 2.68, 2.69
TASKS: 816
NODENAME: kvm7.domain.com
RELEASE: 2.6.32-220.el6.x86_64
VERSION: #1 SMP Tue Dec 6 19:48:22 GMT 2011
MACHINE: x86_64 (2200 Mhz)
MEMORY: 88 GB
PANIC: ""
PID: 0
COMMAND: "swapper"
TASK: ffff881665514b40 (1 of 32) [THREAD_INFO: ffff880c6124e000]
CPU: 19
STATE: TASK_RUNNING (PANIC)
Nothing stands out as a bug or reason to fail:
divide error: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu31/cache/index2/shared_cpu_map
CPU 19
Modules linked in: ext3 jbd ip6table_filter ip6_tables ebtable_nat ebtables
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables
sunrpc bridge stp llc bonding ipv6 vhost_net macvtap macvlan tun kvm_intel
kvm cdc_ether usbnet mii microcode i2c_i801 i2c_core iTCO_wdt
iTCO_vendor_support shpchp igb ioatdma dca ses enclosure sg ext4 mbcache
jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mirror
dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.32-220.el6.x86_64 #1 IBM System x3650 M4 -[7915AC1]-/00J6528
RIP: 0010:[<ffffffff81054ad5>] [<ffffffff81054ad5>] find_busiest_group+0x5c5/0xb20
RSP: 0018:ffff880028363c40 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880028363e64 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8800282cf540 RDI: ffff8800282d5fc0
RBP: ffff880028363dd0 R08: ffff8800282cf860 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffff01
R13: 0000000000015fc0 R14: ffffffffffffffff R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880028360000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f4e5215c000 CR3: 00000011bea54000 CR4: 00000000000426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff880c6124e000, task ffff881665514b40)
Stack:
ffff880028363d70 ffff880028363ce0 ffff880028363ca0 000000000000024d
<0> ffff8800282cf860 ffff880028363e58 0101881664b121a8 0000000600000000
<0> 0000000600000000 ffff8800282cf540 0000000123386cc0 0000000000000008
Call Trace:
<IRQ>
[<ffffffffa02e4669>] ? br_handle_frame_finish+0x179/0x2a0 [bridge]
[<ffffffff8105fc52>] rebalance_domains+0x1a2/0x5b0
[<ffffffff81060153>] run_rebalance_domains+0xf3/0x160
[<ffffffff8107c4f0>] ? get_next_timer_interrupt+0x1b0/0x250
[<ffffffff81072161>] __do_softirq+0xc1/0x1d0
[<ffffffff81097e0a>] ? sched_clock_idle_wakeup_event+0x1a/0x20
[<ffffffff8100c24c>] call_softirq+0x1c/0x30
[<ffffffff8100de85>] do_softirq+0x65/0xa0
[<ffffffff81071f45>] irq_exit+0x85/0x90
[<ffffffff8102a255>] smp_call_function_single_interrupt+0x35/0x40
[<ffffffff8100bdb3>] call_function_single_interrupt+0x13/0x20
<EOI>
[<ffffffff812c4a5e>] ? intel_idle+0xde/0x170
[<ffffffff812c4a41>] ? intel_idle+0xc1/0x170
[<ffffffff813f9f47>] cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e06>] cpu_idle+0xb6/0x110
[<ffffffff814e5f23>] start_secondary+0x202/0x245
Code: d0 b8 01 00 00 00 48 c1 ea 0a 48 85 d2 0f 45 c2 41 89 40 08 66 90 4c 8b 85 e0 fe ff ff 48 8b 45 a8 31 d2 41 8b 48 08 48 c1 e0 0a <48> f7 f1 48 8b 4d b0 48 89 45 a0 31 c0 48 85 c9 74 0c 48 8b 45
RIP [<ffffffff81054ad5>] find_busiest_group+0x5c5/0xb20
RSP <ffff880028363c40>
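One thing the oops above does show: in the Code: line the byte marked `<48>` is where RIP points, and the bytes `48 f7 f1` decode (REX.W + F7 /6) to `div %rcx`. RCX is 0 in the register dump, which is consistent with the "divide error" header, that is, a division by zero inside find_busiest_group. The sketch below pulls both facts out of the oops text. The helper names are invented for illustration, and it only extracts bytes and register values; it does not disassemble anything or explain why the divisor was zero.

```python
import re


def faulting_bytes(code_line: str):
    """Return the opcode bytes from the one marked <..> to the end of the line."""
    tokens = code_line.replace("Code:", "").split()
    for index, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            return [token.strip("<>")] + tokens[index + 1:]
    return []


def parse_registers(oops_text: str):
    """Collect general-purpose registers printed as 'RCX: 0000000000000000'.

    Segment-prefixed entries such as 'RSP: 0018:...' and control/debug
    registers (CR0, DR7, ...) are deliberately not matched.
    """
    registers = {}
    pattern = r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b"
    for name, value in re.findall(pattern, oops_text):
        registers[name] = int(value, 16)
    return registers


if __name__ == "__main__":
    # Excerpts copied from the oops in this thread.
    code = "Code: 48 8b 45 a8 31 d2 41 8b 48 08 48 c1 e0 0a <48> f7 f1 48 8b 4d b0"
    dump = (
        "RAX: 0000000000000000 RBX: ffff880028363e64 RCX: 0000000000000000\n"
        "RDX: 0000000000000000 RSI: ffff8800282cf540 RDI: ffff8800282d5fc0\n"
    )
    print(faulting_bytes(code)[:3])
    print(hex(parse_registers(dump)["RCX"]))
```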
Is there a forum that would help me figure out what exactly caused this
crash? It's not the first time this has happened across this series of
servers running KVM.
Thank you sir,
Tory