Hi Dave,
On Fri, Aug 27, 2010 at 08:40:17AM -0400, Dave Anderson wrote:
>
> ----- hutao(a)cn.fujitsu.com wrote:
>
> > Hi,
> >
> > I encountered a problem getting a backtrace from a KVM dumpfile created
> > with `virsh dump': the bt command did not produce a proper kernel backtrace.
> >
> > guest kernel: 2.6.32
> > crash: 5.0.6 patched with qemu_ram_version_4.patch(attached)
> >
> > steps to get dumpfile:
> >
> > 1. virsh start vm
> > 2. connect to vm, say by vnc
> > 3. On guest, build and run the code:
> >
> > int main(void)
> > {
> >         while (1);
> >
> >         return 0;
> > }
> >
> > 4. On host, run `virsh dump vm
> > /mnt/data/kernel-2.6.32.dump3-userspace-endless-loop'
> >
> > Then run crash:
> >
> > crash /mnt/data/kernel/linux-2.6.32/System.map
> > /mnt/data/kernel/linux-2.6.32/vmlinux
> > /mnt/data/kernel-2.6.32.dump3-userspace-endless-loop
> >
> > got the result:
> >
> > crash 5.0.6
> > Copyright (C) 2002-2010 Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006 IBM Corporation
> > Copyright (C) 1999-2006 Hewlett-Packard Co
> > Copyright (C) 2005, 2006 Fujitsu Limited
> > Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> > Copyright (C) 2005 NEC Corporation
> > Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions. Enter "help copying" to see the conditions.
> > This program has absolutely no warranty. Enter "help warranty" for
> > details.
> >
> > GNU gdb (GDB) 7.0
> > Copyright (C) 2009 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-unknown-linux-gnu"...
> >
> > SYSTEM MAP: /mnt/data/kernel/linux-2.6.32/System.map
> > DEBUG KERNEL: /mnt/data/kernel/linux-2.6.32/vmlinux (2.6.32)
> > DUMPFILE: /mnt/data/kernel-2.6.32.dump3-userspace-endless-loop
> > CPUS: 1
> > DATE: Fri Aug 27 05:18:12 2010
> > UPTIME: 00:00:51
> > LOAD AVERAGE: 0.44, 0.11, 0.03
> > TASKS: 67
> > NODENAME: localhost.localdomain
> > RELEASE: 2.6.32
> > VERSION: #2 SMP PREEMPT Wed Aug 25 15:26:48 CST 2010
> > MACHINE: x86_64 (2925 Mhz)
> > MEMORY: 511.6 MB
> > PANIC: "Oops: 0003 [#1] PREEMPT SMP " (check log for details)
> > PID: 0
> > COMMAND: "swapper"
> > TASK: ffffffff8158df70 [THREAD_INFO: ffffffff8154e000]
> > CPU: 0
> > STATE: TASK_RUNNING
> > WARNING: panic task not found
> >
> > crash> bt
> > PID: 0 TASK: ffffffff8158df70 CPU: 0 COMMAND: "swapper"
> > #0 [ffffffff8154fe28] schedule at ffffffff8138baa3
> > bt: invalid kernel virtual address: 41 type: "call byte"
> > bt: invalid kernel virtual address: 44e6835ad type: "call byte"
> > bt: load_memfile_offset: read: Success
> > bt: read error: kernel virtual address: fffffffffffffffc type: "call byte"
> > bt: invalid kernel virtual address: e7ab type: "call byte"
> > bt: invalid kernel virtual address: e273 type: "call byte"
> > bt: invalid kernel virtual address: 13a7b type: "call byte"
> > bt: invalid kernel virtual address: 935cb type: "call byte"
> > bt: load_memfile_offset: read: Success
> > bt: read error: kernel virtual address: fffffffffffffffb type: "call byte"
> > bt: invalid kernel virtual address: 935cb type: "call byte"
> > #1 [ffffffff8154fef0] cpu_idle at ffffffff8100ad1e
> > crash>
> >
> >
> > Note the output of the `bt' command. Without running the endless-loop
> > code, `bt' got:
> >
> >
> > crash> bt
> > PID: 0 TASK: ffffffff8158df70 CPU: 0 COMMAND: "swapper"
> > #0 [ffffffff8154fe28] schedule at ffffffff8138baa3
> > #1 [ffffffff8154fe48] apic_timer_interrupt at ffffffff8100c65e
> > #2 [ffffffff8154fed0] need_resched at ffffffff810125a8
> > #3 [ffffffff8154fee0] default_idle at ffffffff81012e03
> > #4 [ffffffff8154fef0] cpu_idle at ffffffff8100acd6
> > crash>
> >
> >
> > Any suggestions on how to solve the problem?
>
> Not really.
>
> If there's no kernel crash, then the selection of the current
> context defaults to the cpu 0 swapper task. I don't know
> what was happening to the "swapper" task at the time that the
> guest was paused.

Without System.map, `bt' indeed produced:

crash> bt
PID: 0 TASK: ffffffff8158df70 CPU: 0 COMMAND: "swapper"
#0 [ffffffff8154fe28] schedule at ffffffff8138baa3
#1 [ffffffff8154fef0] cpu_idle at ffffffff8100ad1e
crash>
>
> If you want to make the vmlinux/dumpfile available for me
> to download, I can take a look. (I don't know why you're
> using a System.map).
Do you still need the vmlinux/dumpfile pair? If so, I will make them
available to you.

Is the "non-System.map" bt output above from the dumpfile containing
the user-space endless-loop, or is it from the second one you tried?
If the backtrace above occurs with the user-space-endless-loop
dumpfile, then it's not necessary to send them to me. But if the
bt errors still occur with the dumpfile containing the user-space
endless-loop (and with no System.map), then yes, I would like to
see the vmlinux/dumpfile pair. (You can send the download details
off-list...)
Thanks,
Dave