Re: [Crash-utility] xendump image with full-virtualized domain
by Dave Anderson
Kazuo Moriwaka wrote:
> > No surprise here -- there's absolutely no crash utility support for
> > xendumps of fully-virtualized kernels.
> >
> > Much of the information that crash uses to find its way
> > around a xendump currently depends upon information
> > *inside* the para-virtualized kernel. In your attempt above,
> > it needs data structure information for the vcpu_guest_context
> > structure, in order to get a cr3 value -- which it uses to find the
> > phys_to_machine_mapping[] array built into the kernel.
>
> This header's vcpu_guest_context.ctrlreg just points to a dummy
> page table (in that file, mfn 12122).
>
> > But obviously there is no phys_to_machine_mapping[]
> > array in fully-virtualized kernels, so no pseudo-to-physical
> > address translations can be made.
>
> Yes. I read some of the code, and now I think this xendump image header
> doesn't have enough information to find the shadow page table. The shadow
> page table is pointed to by vcpu.arch.shadow.* in the hypervisor, but the
> xendump doesn't contain it. If there is a whole-machine dump, converting
> it could be one solution.
>
Hi Kazuo,
Yeah -- I finally got around to tinkering with your example
FV xendump, and I agree, the cr3 value points to a page of
zeroes.
Complicating matters even more is that you can run x86 FV
guests on x86_64 xen hosts -- which is what you did. That being
the case, the xendump header, purportedly of a 32-bit guest,
uses the x86_64 version of things like the vcpu_guest_context
structure, and all of the mfn's in the array are 64-bit values
instead of 32-bit. So when attempting to run crash against
an x86 kernel with essentially an x86_64 xendump, its bookkeeping
gets screwed up.
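To make the width problem concrete, here is a minimal sketch (not crash code) of fetching one entry from a raw mfn array. The helper name read_mfn() and the sample values are made up for illustration, and it assumes the dump and the host share the same byte order. Reading an array of 64-bit entries with a 32-bit stride returns halves of neighboring entries, which is the kind of bookkeeping mismatch described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a crash function): fetch entry `index` from a
 * raw mfn array whose entries are `width` bytes wide (4 or 8).
 * Assumes the dump and the host share the same byte order.
 */
static uint64_t read_mfn(const unsigned char *array, size_t index, size_t width)
{
    uint64_t wide = 0;
    uint32_t narrow = 0;

    assert(width == 4 || width == 8);

    if (width == 8) {
        /* x86_64-style header: each mfn occupies 8 bytes */
        memcpy(&wide, array + index * 8, sizeof(wide));
        return wide;
    }

    /* x86-style header: each mfn occupies 4 bytes */
    memcpy(&narrow, array + index * 4, sizeof(narrow));
    return narrow;
}
```

With the entries {100, 101, 102} stored as 64-bit values, read_mfn(raw, 1, 8) yields 101, while read_mfn(raw, 1, 4) does not, because the 4-byte stride lands inside entry 0.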
> Xen's roadmap says that it will support save/restore of fully-virtualized
> domains in a few months; as part of that support, the xendump format
> will be changed to contain enough info to rebuild the domain's
> pseudo-physical memory area. Just waiting for that is one way.
>
Doesn't their roadmap call for transitioning from their current
xendump format to an ELF vmcore format? That certainly would
be best...
Thanks,
Dave
17 years, 11 months
crash version 4.0-3.12 is available
by Dave Anderson
- For 2.6.14 and later ia64 kdumps, taken either as a result of the
INIT switch, or when an MCA exception has occurred, several problems
needed to be addressed. First, the "pseudo-task" that handles the
kdump operation due to an INIT or MCA was not being recognized as
the "panic" task. Secondly, the backtraces of the per-cpu INIT
or MCA handling pseudo-tasks only went back as far as their entry
onto their own per-cpu stacks, and did not show the backtrace of
the task that was running on that cpu when the INIT or MCA event
occurred. This version recognizes the pseudo-task that handles the
kdump operation; and for each cpu, the active tasks' backtraces now
also show a transition back to the task that was running on that cpu
when the INIT or MCA event occurred. (j-nomura(a)ce.jp.nec.com)
- To address the need to display per-cpu variables, the "p"
command has been modified to recognize "per_cpu__xxx" arguments
when the kernel is SMP, in order to prevent the attempt to display
the contents of a variable whose symbol value does not represent
the actual location of its data. In that case, the data type of
the per-cpu variable will be displayed, followed by the addresses
of each per-cpu instance. Given that information, a proper command
can be utilized in order to display the data. For example, to look
at the per-cpu buffer_head accounting for cpu 2:
crash> p per_cpu__bh_accounting
PER-CPU DATA TYPE:
struct bh_accounting per_cpu__bh_accounting;
PER-CPU ADDRESSES:
[0]: c5405a80
[1]: c540da80
[2]: c5415a80
[3]: c541da80
crash> bh_accounting c5415a80
struct bh_accounting {
nr = 434,
ratelimit = 2216
}
Note that "p" on the first command line above is optional, because
whenever a data variable is entered alone, crash will recognize it
as such, and pass it to the "p" command by default. I had thought
of putting this functionality into the "struct" command, but many
of the per-cpu variables are pointers, arrays, etc. So for the
non-structure cases, the "rd" command would be more appropriate,
or alternatively a cobbled-together gdb print command.
(anderson(a)redhat.com)
- A consolidated cleanup and minor fixes patch has been applied to
the experimental x86_64 dwarf CFI unwind facility.
(rachita(a)in.ibm.com)
- Also related to the experimental x86_64 dwarf CFI unwind facility,
fixed a problem where if a "set unwind on" was done, and followed
by a subsequent "set unwind off", then the "bt" output could either
cause a segmentation violation, or display backtrace data that was
different from the original. (anderson(a)redhat.com)
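As background for the per-cpu "p" change above, here is a minimal sketch of how a per-cpu instance address is formed in generic 2.6 SMP kernels: the "per_cpu__xxx" symbol value plus that cpu's entry in __per_cpu_offset[]. The function name per_cpu_address() is hypothetical, and whether crash derives its address list this way on every architecture is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the address of one per-cpu instance.
 *   symbol_value    value of the "per_cpu__xxx" symbol
 *   per_cpu_offset  copy of the kernel's __per_cpu_offset[] array
 *   ncpus           number of entries in per_cpu_offset
 *   cpu             cpu whose instance is wanted
 */
static unsigned long per_cpu_address(unsigned long symbol_value,
                                     const unsigned long *per_cpu_offset,
                                     size_t ncpus, size_t cpu)
{
    assert(per_cpu_offset != NULL);
    assert(cpu < ncpus);

    /* The symbol value alone is not where the data lives on SMP. */
    return symbol_value + per_cpu_offset[cpu];
}
```

With a made-up symbol value of 0xc0405a80 and made-up offsets 0x05000000, 0x05008000, 0x05010000 and 0x05018000, the function reproduces the four addresses listed in the example above; those inputs were chosen for illustration and are not taken from a real kernel.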
Download from: http://people.redhat.com/anderson
17 years, 11 months
Re: [RFC] Crash patch for DWARF CFI based unwind support
by Dave Anderson
> Hi Dave
>
> The following patch adds support for DWARF CFI based stack unwinding
> for crash. Since this method uses the call frame instructions for
> unwinding, it generates better backtraces than the existing backtrace
> mechanism. So when we have the unwind info available, this new method
> will be called, else we fall back to the existing mechanism.
>
> ... <this section moved below>
>
> Please provide your suggestions and comments.
>
> Thanks
> Rachita
Hi Rachita,
I've only been able to test this on a live system that has __start_unwind
and __end_unwind symbols, so I don't know what a backtrace with an
in-kernel exception frame, or a backtrace with a transition to the x86_64
IRQ stack or x86_64 exception stacks, would look like. If you have
an example, I'd be interested in seeing how they get handled.
But what's there, i.e., with only user-mode-to-kernel-mode exception
frames being displayed, looks pretty good!
I'd like to tinker with it a bit, and fold most of what you've
done into 4.0-3.8. Then you can use that as a source base to
continue.
I do have a few questions/comments.
This netdump.c patch doesn't make sense to me -- we really don't
want to be doing any ELFSTORE operations in the debug-only
netdump_memory_dump() debug function:
@@ -771,8 +772,11 @@ netdump_memory_dump(FILE *fp)
dump_Elf64_Phdr(nd->load64 + i, ELFREAD);
offset64 = nd->notes64->p_offset;
for (tot = 0; tot < nd->notes64->p_filesz; tot += len) {
+ if (has_unwind_info)
+ len = dump_Elf64_Nhdr(offset64, ELFSTORE);
+ else
len = dump_Elf64_Nhdr(offset64, ELFREAD);
- offset64 += len;
+ offset64 += len;
}
break;
}
This patch below looks to only be necessary in dumpfiles, but it seems
like, given that the x86_64 user_regs_struct is unavailable in 2.6
vmlinux files, that the initializations in get_netdump_regs_x86_64()
would never get done -- because VALID_STRUCT(user_regs_struct) would
fail, right?
@@ -1562,8 +1566,10 @@ get_netdump_regs_x86_64(struct bt_info *
if (is_task_active(bt->task))
bt->flags |= BT_DUMPFILE_SEARCH;
- if ((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
- VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) {
+ if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
+ VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
+ (KDUMP_DUMPFILE() && has_unwind_info && (bt->flags &
+ BT_DUMPFILE_SEARCH))) {
if (nd->num_prstatus_notes > 1)
note = (Elf64_Nhdr *)
nd->nt_prstatus_percpu[bt->tc->processor];
@@ -1574,9 +1580,21 @@ get_netdump_regs_x86_64(struct bt_info *
len = roundup(len + note->n_namesz, 4);
len = roundup(len + note->n_descsz, 4);
+ if KDUMP_DUMPFILE() {
+ ASSIGN_SIZE(user_regs_struct) = 27 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rsp) = 19 * sizeof(unsigned long);
+ ASSIGN_OFFSET(user_regs_struct_rip) = 16 * sizeof(unsigned long);
+ }
user_regs = ((char *)note + len)
- SIZE(user_regs_struct) - sizeof(long);
+ if KDUMP_DUMPFILE() {
+ *rspp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rsp));
+ *ripp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rip));
+ if (*ripp && *rspp)
+ return;
+ }
+
But then again, perhaps you never needed the user_regs_struct_rsp and
user_regs_struct_rip offsets in your test scenario?
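For reference, the hard-coded numbers in the patch (27, 19 and 16 longs) line up with the field order of the x86_64 user_regs_struct. The sketch below mirrors that layout with fixed-width fields so it can be checked on any build host; the struct and function names are made up for illustration and are not part of crash or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mirror of the x86_64 user_regs_struct field order, using fixed-width
 * 64-bit fields so the layout is the same regardless of the build host.
 * The struct name is hypothetical.
 */
struct x86_64_user_regs {
    uint64_t r15, r14, r13, r12, rbp, rbx, r11, r10;
    uint64_t r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
    uint64_t rip, cs, eflags;
    uint64_t rsp, ss;
    uint64_t fs_base, gs_base;
    uint64_t ds, es, fs, gs;
};

/* Check the three values that the patch assigns by hand. */
static void check_hardcoded_offsets(void)
{
    assert(sizeof(struct x86_64_user_regs) == 27 * sizeof(uint64_t));
    assert(offsetof(struct x86_64_user_regs, rsp) == 19 * sizeof(uint64_t));
    assert(offsetof(struct x86_64_user_regs, rip) == 16 * sizeof(uint64_t));
}
```

Calling check_hardcoded_offsets() completes without tripping an assertion, so the magic numbers are at least consistent with that register layout.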
There does seem to be some unnecessary "kernel-port" left-overs that
should be pruned. Like the __get_user_nocheck(), __get_user_size()
and __get_user_asm() definitions are superfluous, since they're only
needed by __get_user(), which is not used.
I'll also make it compilable in other than x86_64 environments,
because the unwind_x86_64.c code should be #ifdef'd X86_64.
> There still are a couple of things which need to be done, viz
> 1. Extend to obtaining unwind info from modules as well (currently
> done only for the kernel)
Shouldn't pose a major problem -- just requires following the links
from the kernel table AFAICT. But that leads to another question...
What happens, in dwarf_backtrace(), when it encounters a module
frame? My guess is that the call to unwind() will fail, and then
the loop will bail out prematurely, and end up truncating the
trace output? We'll need some type of error-handling there I
would think.
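One possible shape for that error handling, as a rough sketch only: when the CFI stepper cannot handle a frame, hand the rest of the walk to the existing backtrace method instead of stopping. Everything below (struct frame, step_fn, backtrace_with_fallback() and the demo steppers) is hypothetical and is not taken from the patch or from crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame state; the real patch's types are not shown here. */
struct frame {
    unsigned long pc;
    unsigned long sp;
};

/* Steps one frame up; returns false when it cannot handle frame->pc. */
typedef bool (*step_fn)(struct frame *frame);

/*
 * Walk the stack with the CFI stepper; on its first failure, switch to the
 * legacy stepper for the remaining frames rather than truncating the trace.
 * Returns the number of frames visited.
 */
static size_t backtrace_with_fallback(struct frame *frame, step_fn cfi_step,
                                      step_fn legacy_step, size_t max_frames)
{
    size_t count = 0;
    step_fn step = cfi_step;

    assert(frame != NULL && cfi_step != NULL && legacy_step != NULL);

    while (count < max_frames && frame->pc != 0) {
        count++;                    /* the current frame would be displayed */
        if (step(frame))
            continue;
        if (step == legacy_step)
            break;                  /* nothing left to fall back to */
        step = legacy_step;         /* CFI failed: retry with legacy method */
        if (!step(frame))
            break;
    }
    return count;
}

/* Stand-in steppers: each "unwinds" by decrementing pc. */
static bool demo_cfi_step(struct frame *f)
{
    if (f->pc == 3)                 /* pretend pc 3 is a module frame */
        return false;
    f->pc--;
    return true;
}

static bool demo_legacy_step(struct frame *f)
{
    f->pc--;
    return true;
}
```

Starting at pc 5, the demo visits all five frames when the legacy stepper is available as a fallback, and only three when it is not (passing demo_cfi_step for both arguments), which mirrors the truncation concern.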
> 2. Currently reading the unwind info from the .eh_frame section only (i.e.
> __start_unwind to __end_unwind). Need to add a facility to read from
> .debug_frame (if .debug_frame is present in cases where .eh_frame
> is absent). We will have to read from the vmlinux file if we want to
> read the .debug_frame info.
Definitely -- we've got a plethora of kernels that have the CFI stuff
in the vmlinux file, but not in the kernel.
> 3. Add FRAME_POINTER support.
Personally, I don't much care about this...
And don't forget about x86 support!
And we should probably -- for now anyway -- make it possible to
turn this capability on and off at will. Also, for now, it should
probably default to off until we pound on it a bit. For example,
things like "bt -l" are not supported in this scheme.
But, I can't emphasize this enough -- this is a nice piece of work
that you've done here.
I'll try to get something out in the next couple of days.
Thanks,
Dave
17 years, 11 months
xendump image with full-virtualized domain
by Kazuo Moriwaka
Hi,
I tried to analyze a fully-virtualized domain's dump image with crash.
It aborts with the following message.
$ crash System.map-2.6.8-2-386 vmlinux-2.6.8-2-386 2006-1110-1141.38-guest2.4.core
(snip)
crash: cannot determine vcpu_guest_context.ctrlreg offset
A fully-virtualized domain's kernel doesn't have any information about
the Xen hypervisor, so it also doesn't have struct vcpu_guest_context.
I'll put the kernels and xendump core files at the following location for reference.
http://people.valinux.co.jp/~moriwaka/domUcore/
host.tar.gz - xen hypervisor and dom0 kernel(for amd64)
full-virtualized-guest.tar.gz - domU kernel(for i386) and dump image
taken by 'xm dump-core' command.
Any ideas?
--
Kazuo Moriwaka <moriwaka(a)valinux.co.jp>
17 years, 11 months
Re: [Crash-utility] crash version 4.0-3.9 is available
by Dave Anderson
>
> > crash> dis common_interrupt
> > 0xffffffff80109b34 <common_interrupt>: cld
> > 0xffffffff80109b35 <common_interrupt+1>: sub $0x48,%rsp
> > 0xffffffff80109b39 <common_interrupt+5>: mov %rdi,0x40(%rsp)
> > 0xffffffff80109b3e <common_interrupt+10>: mov %rsi,0x38(%rsp)
> > 0xffffffff80109b43 <common_interrupt+15>: mov %rdx,0x30(%rsp)
> > 0xffffffff80109b48 <common_interrupt+20>: mov %rcx,0x28(%rsp)
> > 0xffffffff80109b4d <common_interrupt+25>: mov %rax,0x20(%rsp)
> > 0xffffffff80109b52 <common_interrupt+30>: mov %r8,0x18(%rsp)
> > 0xffffffff80109b57 <common_interrupt+35>: mov %r9,0x10(%rsp)
> > 0xffffffff80109b5c <common_interrupt+40>: mov %r10,0x8(%rsp)
> > 0xffffffff80109b61 <common_interrupt+45>: mov %r11,(%rsp)
> > 0xffffffff80109b65 <common_interrupt+49>:
> > lea 0xffffffffffffffd0(%rsp),%rdi
> > 0xffffffff80109b6a <common_interrupt+54>: push %rbp
> > 0xffffffff80109b6b <common_interrupt+55>: mov %rsp,%rbp
> >
> > Thanks
> > Rachita
>
> Unbelievable -- nice catch!
>
> I would have thought since the output of the disassembly
> was changed to a temporary file instead of stdout, that there
> wouldn't be any line-wrap applied by gdb behind the scenes.
>
> And as luck would have it, I did my testing in a window
> larger than 80-columns...
Hi Rachita,
Fixed in 4.0-3.11 (the only change from 4.0-3.9 is attached).
In fact, while testing in a 30-column window, I found another
gdb output problem where an asterisk (pointer) in a data structure
declaration got dropped for some reason, causing next_online_pgdat()
to fail. Anyway, setting the gdb line width to unlimited fixes
both issues.
Thanks again for catching this,
Dave
--- gdb_interface.c 11 Oct 2006 13:14:35 -0000 1.34
+++ gdb_interface.c 8 Nov 2006 21:57:06 -0000 1.35
@@ -239,6 +239,11 @@
sprintf(req->buf, "set height 0");
gdb_interface(req);
+ req->command = GNU_PASS_THROUGH;
+ req->name = NULL, req->flags = 0;
+ sprintf(req->buf, "set width 0");
+ gdb_interface(req);
+
/*
* Patch gdb's symbol values with the correct values from either
* the System.map or non-debug vmlinux, whichever is in effect.
17 years, 11 months
crash version 4.0-3.9 is available
by Dave Anderson
- Tentatively scheduled as base version for RHEL4-U5.
- The current 2.6.18 x86_64 kernel has changed the IRQ-stack-to-
process-stack linkage, where until now the link value was a pointer
to the exception frame on the process stack, but has been changed
to point to a location on the process stack above the exception
frame. Because of that, after displaying the trace data from the
IRQ stack, "bt" would then display an invalid exception frame,
which was reported as a "possibly bogus exception frame".
(anderson(a)redhat.com)
- Also in x86_64 kernels, fix for the "bt" command. When the backtrace
started on the NMI exception stack, it was displaying the correct
exception frame data, but was erroneously reporting that it was a
"possibly bogus exception frame". (anderson(a)redhat.com)
- And again in x86_64 kernels, fix for the "bt" command when making
the transition from the IRQ stack back to the process stack in cases
where the IRQ stack entry was made via the relatively new "call_softirq"
entry point. In that case, there is no exception frame on the
process stack, because it's essentially just a cross-stack call
from do_softirq(). However, a bogus exception frame was being
displayed, along with a "possibly bogus exception frame" message;
and if the RIP value in the truly bogus exception frame happened
to fall in the user virtual address range, the remainder of the
process stack trace was not displayed at all. (anderson(a)redhat.com)
- Fix for 2.6.18-era ia64 DISCONTIGMEM kernels, which would fail
during initialization with the error message: "crash: invalid
(optional) structure member offsets: pglist_data_node_next or
pglist_data_pgdat_next". (anderson(a)redhat.com)
- Adapted Olivier Daudel's nifty enhancement to the "struct" command,
which allows the single "struct.member" argument to optionally be
expressed in a "struct.member[,member,member]" format, in order to
display multiple members of a given structure. This also applies to
the "union" and "*" commands, as all three functions have now been
combined into one behind the scenes. Fixed the display for applying
a minus count, and given that it opened up the door to a number of
entry errors, I also added additional error-catching/handling to avoid
the display of incorrect structure data.
(olivier.daudel(a)u-paris10.fr, anderson(a)redhat.com)
- Fixed three sources of potential segmentation violations when using
the "bt" command when the experimental dwarf CFI unwind backtrace
facility was turned on. (anderson(a)redhat.com)
- Added a new machdep_init(POST_VM) call, which is currently only being
used by the x86_64 architecture; it calls init_unwind_table(), which
has to be done after vm_init() in order to access the unwind tables
of kernel modules. (anderson(a)redhat.com)
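To illustrate the "struct.member[,member,member]" argument format described above, here is a minimal parsing sketch. It is not the code that went into crash; parse_member_arg(), struct member_request and the MAX_MEMBERS cap are all made up for this example, and the error cases it rejects are only a guess at the kind of entry errors worth catching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MEMBERS 8   /* arbitrary cap for this sketch */

struct member_request {
    char *name;                  /* structure name, before the '.' */
    char *members[MAX_MEMBERS];  /* requested member names */
    size_t count;                /* number of valid members[] entries */
};

/*
 * Split "struct.member[,member,...]" in place.  Returns 0 on success, or
 * -1 on a malformed argument (no '.', empty structure name, empty member
 * name, or more than MAX_MEMBERS members).
 */
static int parse_member_arg(char *arg, struct member_request *req)
{
    char *dot, *p, *comma;

    assert(arg != NULL && req != NULL);

    req->count = 0;
    dot = strchr(arg, '.');
    if (dot == NULL || dot == arg)
        return -1;
    *dot = '\0';
    req->name = arg;

    for (p = dot + 1; ; p = comma + 1) {
        comma = strchr(p, ',');
        if (comma != NULL)
            *comma = '\0';
        if (*p == '\0' || req->count == MAX_MEMBERS)
            return -1;
        req->members[req->count++] = p;
        if (comma == NULL)
            break;
    }
    return 0;
}
```

For example, "task_struct.pid,comm,state" splits into the name "task_struct" and three members, while a trailing comma as in "task_struct.pid," is rejected.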
Download from: http://people.redhat.com/anderson
17 years, 11 months
another crash failed to start on SN vmcore
by Jay Lan
Hi Dave,
I have another vmcore that gdb (6.4) was able to display bt
but crash failed to come up. This copy of crash contains
the changes you suggested on my previous failure report (10/26).
I am not sure if this was caused by the fault of this vmcore, since
gdb only showed one thread. There should be another thread. But I
think I should let you know and let you decide if this is the
case.
Thanks,
- jay
###
# gdb vmlinux vmcore-nmi-10
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-suse-linux"...Using host libthread_db
library "/lib/libthread_db.so.1".
#0 crash_save_this_cpu () at arch/ia64/kernel/crash.c:57
warning: Source file is more recent than executable.
57 memcpy(buf, name, note->n_namesz);
(gdb) bt
#0 crash_save_this_cpu () at arch/ia64/kernel/crash.c:57
#1 0xa00000010005f580 in kdump_cpu_freeze (info=<value optimized out>,
arg=0x1) at arch/ia64/kernel/crash.c:166
#2 0xa00000010000c9f0 in unw_init_running () at include/linux/bitmap.h:237
#3 0xa00000010005ef70 in kdump_init_notifier (self=0xa000000100b47d78,
val=<value optimized out>, data=0xe000003007157b70)
at arch/ia64/kernel/crash.c:217
#4 0xa0000001000cf0d0 in notifier_call_chain (nl=0xe000003014bcb3f8,
val=13,
v=0xe000003007157b70) at kernel/sys.c:144
#5 0xa0000001000cf170 in atomic_notifier_call_chain
(nh=0xe000003014bcb3f0,
val=21, v=0xe000003007157b70) at kernel/sys.c:229
#6 0xa000000100048480 in ia64_init_handler (regs=0xe000003007157e40,
sw=<value optimized out>, sos=<value optimized out>)
at include/asm/kdebug.h:88
#7 0xa0000001000493a0 in ia64_os_init_virtual_begin ()
at include/asm/kdebug.h:88
(gdb)
###
# crsah vmlinux vmcore-nmi-10
CORRECT>crash vmlinux vmcore-nmi-10 (y|n|e|a)? no
crsah: Command not found.
(jackhammer,113) crash vmlinux vmcore-nmi-10
crash 4.0-3.5
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
crash(1203): unaligned access to 0x60000000001bf1cc, ip=0x400000000026d090
crash(1203): unaligned access to 0x60000000001bf1d4, ip=0x400000000026d090
crash(1203): unaligned access to 0x60000000001bf1dc, ip=0x400000000026d090
crash(1203): unaligned access to 0x60000000001bf1e4, ip=0x400000000026d090
crash(1203): unaligned access to 0x60000000001bf1ec, ip=0x400000000026d090
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-unknown-linux-gnu"...
crash: invalid (optional) structure member offsets:
pglist_data_node_next or pglist_data_pgdat_next
FILE: memory.c LINE: 11504 FUNCTION: node_table_init()
[/usr/people/jlan/bin/crash] error trace: => 4000000000231740
4000000000231740: OFFSET_option+432
WARNING: Because this kernel was compiled with gcc version 4.1.0, certain
commands or command options may fail unless crash is invoked with
the "--readnow" command line option.
#
17 years, 11 months
Re: [Crash-utility] xencrash: crash analysis tool for Xen hypervisor
by Dave Anderson
Hello Itsuro,
One additional request...
Can you rebuild your patched version of crash by doing this:
# touch defs.h; make warn
There's a bunch of cleanup that needs to be done in task.c, x86.c,
xen_hyper.c, xen_hyper_command.c and xen_hyper_global_data.c.
(You can ignore the complaints from cmdline.c -- they are generated
from the readline.h header file in the gdb sources.)
Thanks,
Dave
17 years, 11 months
Re: [Crash-utility] xencrash: crash analysis tool for Xen hypervisor
by Dave Anderson
> Hello Itsuro,
>
> I have only briefly scanned your patch, and find it very
> interesting. I am not at all familiar with the hypervisor
> code, but I do have a couple simple questions at this point:
>
> 1. Is it necessary to create a separate "xencrash"
> binary? Would it be possible to create a single
> crash binary that can be used for both vmlinux
> and xen-syms sessions?
>
Sorry about this -- I missed the following two parts of
the Makefile and main.c patches!
+
+xencrash: crash
+ cp -f crash xencrash
+ if (STREQ(pc->program_name, "xencrash"))
+ pc->flags |= XEN_HYPER;
However, given the above, I wonder whether it would be possible
to postpone the setting of the XEN_HYPER flag from the
setup_environment() function until later on in main() when
the command line arguments are parsed? Wouldn't it be possible
to recognize that the xen-syms object file is the hypervisor
binary, and to set the XEN_HYPER flag at that point?
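As a rough sketch of what argument-time detection could look like, the fragment below scans the command line for a namelist whose basename starts with "xen-syms". This filename test is only an assumed heuristic for illustration (a sturdier check would presumably look inside the object file), and the helper names are hypothetical rather than existing crash functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return the component after the last '/', or the whole string. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/*
 * Hypothetical check: treat an argument whose basename starts with
 * "xen-syms" as the hypervisor namelist.
 */
static bool is_xen_hypervisor_namelist(const char *path)
{
    assert(path != NULL);
    return strncmp(base_name(path), "xen-syms", strlen("xen-syms")) == 0;
}

/*
 * Scan the command line arguments (skipping argv[0]); returns true if the
 * XEN_HYPER flag should be set for this session.
 */
static bool args_select_xen_hyper(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        if (is_xen_hypervisor_namelist(argv[i]))
            return true;
    return false;
}
```

With this approach the decision no longer depends on the program name, so a single binary could serve both vmlinux and xen-syms sessions.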
Thanks,
Dave
>
> 2. Can you make it work with a live hypervisor system
> in the same way the crash works with a live Linux
> kernel?
>
> Thanks,
> Dave
>
17 years, 11 months