----- Original Message -----
I have the new Odroid-C2 arm64 cortex-a53 board and have been trying to get
crash to work against the live kernel.
I think the key error is this:
linux_banner:
crash: /lib/modules/3.14.29+/build/vmlinux and /dev/mem do not match!
They should match, as I built the kernel myself and verified that the vmlinux
in /lib/modules is the one I'm booted on. What concerns me is that it does not
appear to be able to read anything from the vmlinux file:
<read_dev_mem: addr: ffffffc001c0dbac paddr: 2c0dbac cnt: 390>
utsname:
sysname: (not printable)
nodename:
release: J
version: (not printable)
machine: r
domainname:
base kernel version: 0.1.4
It's not reading the utsname data from the vmlinux file, but from /dev/mem.
And it's the reads from /dev/mem that are returning nonsense data.
The readmem() calls in your debug output are all from unity-mapped
virtual addresses, which get translated to their physical address
equivalents, which are passed to /dev/mem.
And in your output, all of the data returned from /dev/mem is obviously
bogus, so my best guess is that there is a fundamental problem
with the manner in which unity-mapped addresses get translated to
the physical addresses passed to /dev/mem. (as opposed to a problem
with the /dev/mem driver itself)
Unfortunately I don't have an ARM64 system where I can use /dev/mem,
because all RHEL kernels are configured with CONFIG_STRICT_DEVMEM.
So we use the /dev/crash "misc" driver that is built into RHEL
kernels.
I doubt it's an issue with /dev/mem itself, but for sanity's sake,
what happens if you enter "crash /proc/kcore"? It will use /proc/kcore
instead of /dev/mem for accessing kernel memory.
Anyway, the arm64 VTOP() macro used to translate virtual-to-physical
addresses in crash looks like this:
#define VTOP(X) \
((unsigned long)(X)-(machdep->machspec->page_offset)+(machdep->machspec->phys_offset))
You can watch the translation happen by running "crash --minimal" on your
system like this:
# crash --minimal
crash 7.1.4
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
NOTE: minimal mode commands: log, dis, rd, sym, eval, set, extend and exit
crash>
An easy way to check whether unity-mapped addresses are being translated
correctly is to read the kernel's "linux_banner" string:
crash> rd linux_banner 30
fffffe00007800b8: 65762078756e694c 2e34206e6f697372 Linux version 4.
fffffe00007800c8: 63722e302d302e34 376c652e30322e33 4.0-0.rc3.20.el7
fffffe00007800d8: 343668637261612e 75626b636f6d2820 .aarch64 (mockbu
fffffe00007800e8: 366d726140646c69 75622e3630302d34 ild@arm64-006.bu
fffffe00007800f8: 2e676e652e646c69 686465722e736f62 ild.eng.bos.redh
fffffe0000780108: 20296d6f632e7461 7265762063636728 at.com) (gcc ver
fffffe0000780118: 382e34206e6f6973 303531303220352e sion 4.8.5 20150
fffffe0000780128: 6465522820333236 382e342074614820 623 (Red Hat 4.8
fffffe0000780138: 47282029342d352e 3123202920294343 .5-4) (GCC) ) #1
fffffe0000780148: 64655720504d5320 3120322063654420 SMP Wed Dec 2 1
fffffe0000780158: 2038353a37353a34 3531303220545345 4:57:58 EST 2015
fffffe0000780168: 000000000000000a 745f6b6361706e75 ........unpack_t
fffffe0000780178: 7366746f6f725f6f 0000000000000000 o_rootfs........
fffffe0000780188: 0000000000000000 0000000000000000 ................
fffffe0000780198: 0000000000000000 0000000000000000 ................
crash>
If you do that on your system, I'm guessing that there is garbage in the
rightmost ASCII translation column.
Anyway, let's just take the first 64-bit word, and show the VTOP() in action:
crash> set debug 4
debug: 4
crash> rd linux_banner
<addr: fffffe00007800b8 count: 1 flag: 490 (KVADDR)>
<readmem: fffffe00007800b8, KVADDR, "64-bit KVADDR", 8, (FOE), 3ffc4c05fb8>
<read_memory_device: addr: fffffe00007800b8 paddr: 40007800b8 cnt: 8>
fffffe00007800b8: 65762078756e694c Linux ve
crash>
The VTOP() values used can be found like this:
crash> help -m | grep -e page_offset -e phys_offset
page_offset: fffffe0000000000
phys_offset: 4000000000
crash>
Your output will be different, because your page_offset is based upon a
VA_BITS value of 39 instead of my 42. So yours should show ffffffc000000000
as the page_offset, and 1000000 as the phys_offset (also shown in your debug log).
So for any unity-mapped virtual address, you would subtract the page_offset
value, and then add the phys_offset. In my example above, reading linux_banner
at fffffe00007800b8 does this:
0xfffffe00007800b8 - 0xfffffe0000000000 + 0x4000000000 = 0x40007800b8
where you can see the translated "paddr" physical address in this line of
the debug output above:
<read_memory_device: addr: fffffe00007800b8 paddr: 40007800b8 cnt: 8>
and which I can use as an alternative argument:
crash> rd -p 40007800b8 30
40007800b8: 65762078756e694c 2e34206e6f697372 Linux version 4.
40007800c8: 63722e302d302e34 376c652e30322e33 4.0-0.rc3.20.el7
40007800d8: 343668637261612e 75626b636f6d2820 .aarch64 (mockbu
40007800e8: 366d726140646c69 75622e3630302d34 ild@arm64-006.bu
40007800f8: 2e676e652e646c69 686465722e736f62 ild.eng.bos.redh
4000780108: 20296d6f632e7461 7265762063636728 at.com) (gcc ver
4000780118: 382e34206e6f6973 303531303220352e sion 4.8.5 20150
4000780128: 6465522820333236 382e342074614820 623 (Red Hat 4.8
4000780138: 47282029342d352e 3123202920294343 .5-4) (GCC) ) #1
4000780148: 64655720504d5320 3120322063654420 SMP Wed Dec 2 1
4000780158: 2038353a37353a34 3531303220545345 4:57:58 EST 2015
4000780168: 000000000000000a 745f6b6361706e75 ........unpack_t
4000780178: 7366746f6f725f6f 0000000000000000 o_rootfs........
4000780188: 0000000000000000 0000000000000000 ................
4000780198: 0000000000000000 0000000000000000 ................
crash>
In your debug log, taking the "init_uts_ns" read as an example, it takes the
virtual address ffffffc001c0dbac, subtracts the page_offset of
ffffffc000000000, and adds the phys_offset of 0x1000000, resulting in a
"paddr" of 2c0dbac:
<readmem: ffffffc001c0dbac, KVADDR, "init_uts_ns", 390, (ROE), b9606c>
<read_dev_mem: addr: ffffffc001c0dbac paddr: 2c0dbac cnt: 390>
But it's getting back garbage...
I don't know why it's failing to find legitimate data at that location.
The page_offset calculation of ffffffc000000000 and the phys_offset value
are based upon the symbol values themselves, and the phys_offset value
as found in /proc/iomem. (See the definition of ARM64_PAGE_OFFSET in defs.h,
and the arm64_calc_VA_BITS() function in arm64.c). Is there more than one
"System RAM" section in your /proc/iomem?
Dave
If I elfdump or objdump the vmlinux and grep banner I can see the symbol:
root@odroid64-pre:~/linux# readelf --syms vmlinux | grep banner
74463: ffffffc00186a090 149 OBJECT GLOBAL DEFAULT 4 linux_banner
75496: ffffffc00186a028 100 OBJECT GLOBAL DEFAULT 4 linux_proc_banner
root@odroid64-pre:~/linux# eu-nm -a vmlinux | grep banner
linux_banner |ffffffc00186a090|GLOBAL|OBJECT |0000000000000095| version.c:43|.rodata
linux_proc_banner |ffffffc00186a028|GLOBAL|OBJECT |0000000000000064| version.c:47|.rodata
I pulled the crash source and built it native on the arm64 box.
If I could get a pointer on where to start with debugging this it would help
(i.e. which error to focus on first)
===
The full dump of crash startup is below:
root@odroid64-pre:~/linux# /root/crash-7.1.4/crash -d 64
crash 7.1.4
Copyright (C) 2002-2015 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
find_booted_kernel: search for [Linux version 3.14.29+ (root@odroid64-pre)
(gcc version 5.3.1 20160225 (Ubuntu/Linaro 5.3.1-10ubuntu2) ) #1 SMP PREEMPT
Tue Mar 8 01:06:35 CST 2016]
searchdirs[8]: /usr/lib/debug/lib/modules/3.14.29+/
searchdirs[0]: /usr/src/linux/
searchdirs[1]: /boot/
searchdirs[2]: /boot/efi/redhat
searchdirs[3]: /boot/efi/EFI/redhat
searchdirs[4]: /
searchdirs[5]: /lib/modules/3.14.29+/build/
searchdirs[6]: /usr/src/redhat/BUILD/kernel-3.14.29/linux/
searchdirs[7]: /usr/src/redhat/BUILD/kernel-3.14.29/linux-3.14.29/
mount_points[0]: / (c46630)
mount_points[1]: /sys (c46650)
mount_points[2]: /proc (c46670)
mount_points[3]: /dev (c46690)
mount_points[4]: /dev/pts (c466b0)
mount_points[5]: /run (c466d0)
mount_points[6]: / (c466f0)
mount_points[7]: /sys/kernel/security (c46710)
mount_points[8]: /dev/shm (c46740)
mount_points[9]: /run/lock (c46760)
mount_points[10]: /sys/fs/cgroup (c46780)
mount_points[11]: /sys/fs/cgroup/systemd (c467b0)
mount_points[12]: /sys/fs/cgroup/devices (c467f0)
mount_points[13]: /sys/fs/cgroup/cpuset (c46830)
mount_points[14]: /sys/fs/cgroup/cpu,cpuacct (c46870)
mount_points[15]: /sys/fs/cgroup/blkio (c468b0)
mount_points[16]: /sys/fs/cgroup/debug (c468e0)
mount_points[17]: /sys/fs/cgroup/perf_event (c46910)
mount_points[18]: /sys/fs/cgroup/freezer (c46950)
mount_points[19]: /sys/fs/cgroup/net_cls (c46990)
mount_points[20]: /proc/sys/fs/binfmt_misc (c469d0)
mount_points[21]: /dev/mqueue (c46a10)
mount_points[22]: /sys/kernel/debug (c46a30)
mount_points[23]: /dev/hugepages (c46a60)
mount_points[24]: /run/rpc_pipefs (c46a90)
mount_points[25]: /sys/kernel/config (c46ac0)
mount_points[26]: /media/boot (c46af0)
mount_points[27]: /run/cgmanager/fs (c46b10)
mount_points[28]: /run/user/118 (c46b40)
mount_points[29]: /run/user/118/gvfs (c46b70)
mount_points[30]: /sys/fs/fuse/connections (c46ba0)
mount_points[31]: /run/user/0 (c46be0)
find_booted_kernel: check: /lib/modules/3.14.29+/build/vmlinux
find_booted_kernel: found: /lib/modules/3.14.29+/build/vmlinux
get_live_memory_source: /dev/mem
/proc/version:
Linux version 3.14.29+ (root@odroid64-pre) (gcc version 5.3.1 20160225
(Ubuntu/Linaro 5.3.1-10ubuntu2) ) #1 SMP PREEMPT Tue Mar 8 01:06:35 CST 2016
/lib/modules/3.14.29+/build/vmlinux:
Linux version 3.14.29+ (root@odroid64-pre) (gcc version 5.3.1 20160225
(Ubuntu/Linaro 5.3.1-10ubuntu2) ) #1 SMP PREEMPT Tue Mar 8 01:06:35 CST 2016
readmem: read_dev_mem() -> /dev/mem
VA_BITS: 39
using 1000000 as phys_offset
gdb /lib/modules/3.14.29+/build/vmlinux
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
GETBUF(248 -> 0)
GETBUF(1500 -> 1)
FREEBUF(1)
FREEBUF(0)
<readmem: ffffffc001874510, KVADDR, "kernel_config_data", 32768, (ROE), 17c47c0>
<read_dev_mem: addr: ffffffc001874510 paddr: 2874510 cnt: 2800>
<read_dev_mem: addr: ffffffc001875000 paddr: 2875000 cnt: 4096>
<read_dev_mem: addr: ffffffc001876000 paddr: 2876000 cnt: 4096>
<read_dev_mem: addr: ffffffc001877000 paddr: 2877000 cnt: 4096>
<read_dev_mem: addr: ffffffc001878000 paddr: 2878000 cnt: 4096>
<read_dev_mem: addr: ffffffc001879000 paddr: 2879000 cnt: 4096>
<read_dev_mem: addr: ffffffc00187a000 paddr: 287a000 cnt: 4096>
<read_dev_mem: addr: ffffffc00187b000 paddr: 287b000 cnt: 4096>
<read_dev_mem: addr: ffffffc00187c000 paddr: 287c000 cnt: 1296>
WARNING: could not find MAGIC_START!
GETBUF(248 -> 0)
FREEBUF(0)
GETBUF(8 -> 0)
<readmem: ffffffc00186fd80, KVADDR, "cpu_possible_mask", 8, (FOE), 7ffe1dfbd0>
<read_dev_mem: addr: ffffffc00186fd80 paddr: 286fd80 cnt: 8>
<readmem: 1600000102, KVADDR, "possible", 8, (ROE), bf8ae8>
crash: invalid kernel virtual address: 1600000102 type: "possible"
WARNING: cannot read cpu_possible_map
<readmem: ffffffc00186fd70, KVADDR, "cpu_present_mask", 8, (FOE), 7ffe1dfbd0>
<read_dev_mem: addr: ffffffc00186fd70 paddr: 286fd70 cnt: 8>
<readmem: 189a, KVADDR, "present", 8, (ROE), bf8ae8>
crash: invalid kernel virtual address: 189a type: "present"
WARNING: cannot read cpu_present_map
<readmem: ffffffc00186fd78, KVADDR, "cpu_online_mask", 8, (FOE), 7ffe1dfbd0>
<read_dev_mem: addr: ffffffc00186fd78 paddr: 286fd78 cnt: 8>
<readmem: 13a48, KVADDR, "online", 8, (ROE), bf8ae8>
crash: invalid kernel virtual address: 13a48 type: "online"
WARNING: cannot read cpu_online_map
<readmem: ffffffc00186fd68, KVADDR, "cpu_active_mask", 8, (FOE), 7ffe1dfbd0>
<read_dev_mem: addr: ffffffc00186fd68 paddr: 286fd68 cnt: 8>
<readmem: 1600000102, KVADDR, "active", 8, (ROE), bf8ae8>
crash: invalid kernel virtual address: 1600000102 type: "active"
WARNING: cannot read cpu_active_map
FREEBUF(0)
GETBUF(248 -> 0)
FREEBUF(0)
GETBUF(248 -> 0)
FREEBUF(0)
<readmem: ffffffc001d61238, KVADDR, "timekeeper xtime_sec", 8, (ROE), 7ffe1dfc98>
<read_dev_mem: addr: ffffffc001d61238 paddr: 2d61238 cnt: 8>
xtime timespec.tv_sec: 5f044158000e9068: (null)
<readmem: ffffffc001c0dbac, KVADDR, "init_uts_ns", 390, (ROE), b9606c>
<read_dev_mem: addr: ffffffc001c0dbac paddr: 2c0dbac cnt: 390>
utsname:
sysname: (not printable)
nodename:
release: J
version: (not printable)
machine: r
domainname:
base kernel version: 0.1.4
<readmem: ffffffc00186a090, KVADDR, "accessible check", 8, (ROE|Q), 7ffe1df350>
<read_dev_mem: addr: ffffffc00186a090 paddr: 286a090 cnt: 8>
<readmem: ffffffc00186a090, KVADDR, "read_string characters", 1499, (ROE|Q), 7ffe1df6c8>
<read_dev_mem: addr: ffffffc00186a090 paddr: 286a090 cnt: 1499>
/proc/version:
Linux version 3.14.29+ (root@odroid64-pre) (gcc version 5.3.1 20160225
(Ubuntu/Linaro 5.3.1-10ubuntu2) ) #1 SMP PREEMPT Tue Mar 8 01:06:35 CST 2016
linux_banner:
crash: /lib/modules/3.14.29+/build/vmlinux and /dev/mem do not match!
Usage:
crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS] (dumpfile form)
crash [OPTION]... [NAMELIST] (live system form)
Enter "crash -h" for details.