On 11/19/2015 12:36 AM, Nan Xiao wrote:
Hi David & Dave,
Executing "crash" on a physical machine (not VirtualBox):
# crash
crash 7.1.3
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
crash: /boot/xen-4.5.gz: original filename unknown
Use "-f /boot/xen-4.5.gz" on command line to prevent this message.
WARNING: machine type mismatch:
crash utility: X86_64
/var/tmp/xen-4.5.gz_VIOmfp: X86
crash: /boot/symtypes-3.12.49-6-default.gz: original filename unknown
Use "-f /boot/symtypes-3.12.49-6-default.gz" on command line to
prevent this message.
crash: /boot/symvers-3.12.49-6-default.gz: original filename unknown
Use "-f /boot/symvers-3.12.49-6-default.gz" on command line to
prevent this message.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
renders /dev/mem unusable as a live memory source.
crash: trying /proc/kcore as an alternative to /dev/mem
KERNEL: /boot/vmlinux-3.12.49-6-xen.gz
DEBUGINFO: /usr/lib/debug/boot/vmlinux-3.12.49-6-xen.debug
DUMPFILE: /proc/kcore
CPUS: 128
DATE: Thu Nov 19 09:37:49 2015
UPTIME: 00:34:57
LOAD AVERAGE: 1.77, 1.21, 1.02
TASKS: 1328
NODENAME: dl980-5
RELEASE: 3.12.49-6-xen
VERSION: #1 SMP Mon Oct 26 16:05:37 UTC 2015 (11560c3)
MACHINE: x86_64 (1995 Mhz)
MEMORY: 125.9 GB
PID: 39777
COMMAND: "crash"
TASK: ffff881eacaaa100 [THREAD_INFO: ffff881e8ff46000]
CPU: 3
STATE: TASK_RUNNING (ACTIVE)
crash>
I can see that crash uses "/proc/kcore" instead of "/dev/mem", so I
tried the same thing on VirtualBox:
# crash /boot/vmlinux-3.12.49-6-xen.gz /proc/kcore
crash 7.1.3
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /boot/vmlinux-3.12.49-6-xen.gz
DEBUGINFO: /usr/lib/debug/boot/vmlinux-3.12.49-6-xen.debug
DUMPFILE: /proc/kcore
CPUS: 1
DATE: Thu Nov 19 01:53:01 2015
UPTIME: 05:42:13
LOAD AVERAGE: 0.19, 0.06, 0.06
TASKS: 239
NODENAME: linux-6ev3
RELEASE: 3.12.49-6-xen
VERSION: #1 SMP Mon Oct 26 16:05:37 UTC 2015 (11560c3)
MACHINE: x86_64 (2594 Mhz)
MEMORY: 855.2 MB
PID: 3106
COMMAND: "crash"
TASK: ffff88002ec5c040 [THREAD_INFO: ffff88000b3e2000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash>
It seems OK now.
So my questions are:
(1) Is it OK to use "/proc/kcore" instead of "/dev/mem" as a workaround?
Are there any side effects?
As I read it, /proc/kcore presents the kernel's virtual address space and
/dev/mem presents the system's physical address space. It probably isn't
wise to debug the latter on any dom0 (whether or not it is nested inside
another virtualization layer) except in very acute cases. That may explain
the problem you had in the first place: either VirtualBox affects whether
/dev/mem is a true view of physical memory or, if it doesn't, VirtualBox
itself may affect those cases where, as I understand it, the kernel keeps
some things at constant physical addresses.
(2) Executing "crash -d8" on the physical machine makes the crash
utility dump core.
Using gdb to debug it:
# gdb /usr/bin/crash core-crash-11-0-0-40072-1447945769
GNU gdb (GDB; SUSE Linux Enterprise 12) 7.9.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/crash...Reading symbols from
/usr/lib/debug/usr/bin/crash.debug...done.
done.
[New LWP 40072]
Core was generated by `crash -d8'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5119001fd0 in get_cie_encoding (cie=0x7f5119004cd8) at
../../../libgcc/unwind-dw2-fde.c:272
272 ../../../libgcc/unwind-dw2-fde.c: No such file or directory.
(gdb)
(gdb) bt
#0 0x00007f5119001fd0 in get_cie_encoding (cie=0x7f5119004cd8) at
../../../libgcc/unwind-dw2-fde.c:272
#1 0x00007f5119002699 in get_fde_encoding (f=0x7f5119006050) at
../../../libgcc/unwind-dw2-fde.c:319
#2 _Unwind_IteratePhdrCallback (info=info@entry=0x7fff463c10e0,
size=size@entry=64, ptr=ptr@entry=0x7fff463c1160)
at ../../../libgcc/unwind-dw2-fde-dip.c:408
#3 0x00007f51196a3f3c in __GI___dl_iterate_phdr
(callback=callback@entry=0x7f5119002270 <_Unwind_IteratePhdrCallback>,
data=data@entry=0x7fff463c1160) at dl-iteratephdr.c:76
#4 0x00007f51190035c3 in _Unwind_Find_FDE (pc=0x7f5119001aa7
<_Unwind_Backtrace+55>, bases=bases@entry=0x7fff463c1498)
at ../../../libgcc/unwind-dw2-fde-dip.c:459
#5 0x00007f5118ffff86 in uw_frame_state_for
(context=context@entry=0x7fff463c13f0, fs=fs@entry=0x7fff463c1240)
at ../../../libgcc/unwind-dw2.c:1241
#6 0x00007f51190011d0 in uw_init_context_1
(context=context@entry=0x7fff463c13f0,
outer_cfa=outer_cfa@entry=0x7fff463c16a0,
outer_ra=0x7f511967bf46 <__GI___backtrace+86>) at
../../../libgcc/unwind-dw2.c:1562
#7 0x00007f5119001aa8 in _Unwind_Backtrace (trace=0x7f511967bdd0
<backtrace_helper>, trace_argument=0x7fff463c16a0)
at ../../../libgcc/unwind.inc:283
#8 0x00007f511967bf46 in __GI___backtrace
(array=array@entry=0x7fff463c1710, size=size@entry=4) at
../sysdeps/x86_64/backtrace.c:109
#9 0x000000000047add7 in __error (type=type@entry=1,
fmt=fmt@entry=0x853d38 "read(/dev/mem, %lx, %ld): %ld (%lx)\n") at
tools.c:52
#10 0x0000000000490c91 in read_dev_mem (fd=4, bufptr=0x7fff463c1f28,
cnt=8, addr=0, paddr=1052672) at memory.c:2298
#11 0x0000000000486398 in readmem (addr=1052672,
memtype=memtype@entry=4, buffer=buffer@entry=0x7fff463c1f28,
size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2198
#12 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#13 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c1fb8, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#14 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#15 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c2048, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#16 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#17 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c20d8, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#18 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#19 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c2168, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#20 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#21 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c21f8, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#22 0x000000000048704a in devmem_is_restricted () at memory.c:2414
#23 readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c2288, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
#24 0x000000000048704a in devmem_is_restricted () at memory.c:2414
......
Everything below repeats the same two frames:
readmem (addr=1052672, memtype=memtype@entry=4,
buffer=buffer@entry=0x7fff463c2288, size=size@entry=8,
type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
error_handle=error_handle@entry=6) at memory.c:2209
0x000000000048704a in devmem_is_restricted () at memory.c:2414
It looks like endless recursion, but I am not sure.
That reads like a bug in the code that decides whether to switch to
/proc/kcore. The code tests whether /dev/mem is usable for debugging,
based on this from the function comment:
* On x86 and x86_64, only the first 256 pages of physical memory
* are accessible:
/dev/mem is considered restricted if a sizeof(long) read from the start of
physical page 255 (address 1044480) succeeds and a sizeof(long) read from
the start of page 257 (address 1052672) fails. Note the first argument of
the nested readmem() calls in your backtrace.
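The two addresses follow directly from the page numbers. A minimal sketch
of the arithmetic, assuming 4 KiB pages; pfn_to_paddr() and
SKETCH_PAGE_SHIFT are names made up for this illustration and are not
crash identifiers:

```c
/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Physical address of the first byte of page frame number `pfn`. */
static unsigned long pfn_to_paddr(unsigned long pfn)
{
    return pfn << SKETCH_PAGE_SHIFT;
}
```

With these assumptions pfn_to_paddr(255) is 1044480 and pfn_to_paddr(257)
is 1052672, the latter matching the addr=1052672 and "pfn 257" seen in
every nested readmem() frame.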
Since the tests never check the value actually read, whether each read
succeeds or fails is all that decides if the memory file is considered
restricted. However, the actual implementation looks like this simplified
view:
readmem(addr, PHYSADDR, buffer, size...)
{
        /* Compute from arguments the memory file position to read from */
        .
        .
        .
        /* Perform the read of the open file descriptor for the memory file,
           returns errno for the read */
        switch (READMEM(fd, buffer, count...))
        {
        .
        .
        .
        case READ_ERROR:
                if (PRINT_ERROR_MESSAGE) {
                        if ((pc->flags & DEVMEM)
                            && (kt->flags & PRE_KERNEL_INIT)
                            && devmem_is_restricted()
                            && switch_to_proc_kcore())
                        {
                                return readmem(addr, memtype, buffer, size...);
                        }
                }
        }
}
devmem_is_restricted() contains two readmem() calls of its own, and I
can't see any protection against it nesting via the switch() above.
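To illustrate the nesting, here is a toy model in C. It is not crash's
code and not a proposed patch: every name in it (sketch_read,
sketch_probe_restricted, sketch_run, the in_probe flag, the depth cap) is
hypothetical, and the "read" is simulated so that anything at or above
page 256 fails. It only shows how a failing probe read can re-enter the
probe, and how a simple "already probing" flag would stop that.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_DEPTH_CAP 100    /* ends the unguarded demo before the stack runs out */

static int  depth;      /* current nesting depth of the probe */
static int  deepest;    /* deepest nesting observed in one run */
static bool in_probe;   /* hypothetical guard flag: a probe is in progress */
static bool use_guard;  /* whether the guard flag is honoured */

static bool sketch_probe_restricted(void);

/* Simulated read: only the first 256 pages are readable. */
static bool sketch_read(unsigned long paddr)
{
    if (paddr < 256UL * SKETCH_PAGE_SIZE)
        return true;

    /* Error path: this is where the probe gets (re-)entered. */
    if (use_guard && in_probe)
        return false;               /* already probing: just report failure */
    if (depth >= SKETCH_DEPTH_CAP)
        return false;               /* demo-only safety cap */
    (void)sketch_probe_restricted();
    return false;
}

/* Restricted if page 255 reads fine and page 257 does not. */
static bool sketch_probe_restricted(void)
{
    bool restricted;

    depth++;
    if (depth > deepest)
        deepest = depth;
    in_probe = true;
    restricted = sketch_read(255UL * SKETCH_PAGE_SIZE)
              && !sketch_read(257UL * SKETCH_PAGE_SIZE);
    in_probe = false;
    depth--;
    return restricted;
}

/* Trigger one failing read and report how deep the probe nested. */
static int sketch_run(bool guard)
{
    use_guard = guard;
    depth = 0;
    deepest = 0;
    in_probe = false;
    (void)sketch_read(257UL * SKETCH_PAGE_SIZE);
    return deepest;
}
```

In this model sketch_run(true) nests exactly once, while sketch_run(false)
keeps nesting until it hits SKETCH_DEPTH_CAP, which is the pattern the
backtrace shows. Whether a flag like this is the right fix for crash is a
separate question.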
FWIW, I need to spend more time before committing to a solution. You have a
support contract anyway, right? Like Dave said, SUSE is the main source
of Xen features in crash and what's needed now isn't really worth being
dragged out on the list. If you do have a support contract and could
report the issue it would probably help me a bit.
--
David.