Thanks,
Tao Liu
On Fri, Sep 13, 2024 at 9:07 PM <mycomplexlove(a)gmail.com> wrote:
Hello, crash maintainers.
I found a problem: with crash built on gdb 10.2, the backtrace for one of my vmcores
shows frames that should not appear when the process stack is parsed.
I discussed this with liutgnu, then rebuilt crash from
https://github.com/liutgnu/crash-preview and tried again.
Unfortunately, that build, which is based on gdb 13.2, still has the problem.
Here is the output of my test:
-------------------
crash 8.0.4++
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
KERNEL: /root/hungtask/vmlinux [TAINTED]
DUMPFILE: /root/hungtask/2024_09_06_05_02_15.kernel_core [PARTIAL DUMP]
CPUS: 64
DATE: Fri Sep 6 05:01:47 CST 2024
UPTIME: 12:27:05
LOAD AVERAGE: 56.87, 25.40, 18.24
TASKS: 4319
NODENAME: host-047bcb37834d
RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
MACHINE: x86_64 (2499 Mhz)
MEMORY: 255.9 GB
PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
PID: 112450
COMMAND: "vtpstatd"
TASK: ffff88816ae80000 [THREAD_INFO: ffff88816ae80000]
CPU: 41
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 112450 TASK: ffff88816ae80000 CPU: 41 COMMAND: "vtpstatd"
#0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
#1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
#2 [ffff889e3fa87d30] panic at ffffffff9483ed43
#3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
#4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
#5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
#6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
#7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
--- <IRQ stack> ---
#8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
[exception RIP: copy_page_range+3681]
RIP: ffffffff9331d461 RSP: ffff8882631579e8 RFLAGS: 00000246
RAX: 1ffffd4018a95ad1 RBX: 8000003152b5a805 RCX: ffffea00c54ad688
RDX: ffffea00c54aee88 RSI: 00007f80d117f000 RDI: ffffffff956468e0
RBP: ffff8881c5bc2bf8 R8: fffff94018a2e22f R9: fffff94018a2e22f
R10: 0000000000000001 R11: fffff94018a2e22e R12: 0000000000000018
R13: dffffc0000000000 R14: ffffea00c54ad680 R15: 00007f80d117f000
ORIG_RAX: ffffffffffffff13 CS: 0010 SS: 0018
#9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
#10 [ffff888263157d20] __mutex_init at ffffffff92ed8dd5
#11 [ffff888263157d38] __alloc_file at ffffffff93458397
#12 [ffff888263157d60] alloc_empty_file at ffffffff934585d2
#13 [ffff888263157da8] __alloc_fd at ffffffff934b5ead
#14 [ffff888263157e38] _do_fork at ffffffff92dae7a1
#15 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
#16 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
RIP: 00007f80ec93641a RSP: 00007ffcb38bbd50 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007ffcb38bbd50 RCX: 00007f80ec93641a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffcb38bbde0 R8: 000000000001b742 R9: 00007f80ee1a0f80
R10: 00007f80ee1a1250 R11: 0000000000000246 R12: 000000000001b742
R13: 00007ffcb38bbd70 R14: 0000000000000000 R15: 00007ffcb38bbf00
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
-------------------
Frames #10 through #13 appear to be spurious.
The following is the analysis output based on gdb7.6 and the latest crash code:
-------------------
crash_805_gdb76 8.0.5++
Copyright (C) 2002-2024 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2024 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: kernel relocated [284MB]: patching 99408 gdb minimal_symbol values
crash_805_gdb76: gdb cannot find text block for address: dd_init_queue
KERNEL: vmlinux [TAINTED]
DUMPFILE: 2024_09_06_05_02_15.kernel_core [PARTIAL DUMP]
CPUS: 64
DATE: Fri Sep 6 05:01:47 CST 2024
UPTIME: 12:27:05
LOAD AVERAGE: 56.87, 25.40, 18.24
TASKS: 4319
NODENAME: host-047bcb37834d
RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
MACHINE: x86_64 (2499 Mhz)
MEMORY: 255.9 GB
PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
PID: 112450
COMMAND: "vtpstatd"
TASK: ffff88816ae80000 [THREAD_INFO: ffff88816ae80000]
CPU: 41
STATE: TASK_RUNNING (PANIC)
crash_805_gdb76> bt
PID: 112450 TASK: ffff88816ae80000 CPU: 41 COMMAND: "vtpstatd"
#0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
#1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
#2 [ffff889e3fa87d30] panic at ffffffff9483ed43
#3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
#4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
#5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
#6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
#7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
--- <IRQ stack> ---
#8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
[exception RIP: copy_page_range+3681]
RIP: ffffffff9331d461 RSP: ffff8882631579e8 RFLAGS: 00000246
RAX: 1ffffd4018a95ad1 RBX: 8000003152b5a805 RCX: ffffea00c54ad688
RDX: ffffea00c54aee88 RSI: 00007f80d117f000 RDI: ffffffff956468e0
RBP: ffff8881c5bc2bf8 R8: fffff94018a2e22f R9: fffff94018a2e22f
R10: 0000000000000001 R11: fffff94018a2e22e R12: 0000000000000018
R13: dffffc0000000000 R14: ffffea00c54ad680 R15: 00007f80d117f000
ORIG_RAX: ffffffffffffff13 CS: 0010 SS: 0018
#9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
#10 [ffff888263157e38] _do_fork at ffffffff92dae7a1
#11 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
#12 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
RIP: 00007f80ec93641a RSP: 00007ffcb38bbd50 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007ffcb38bbd50 RCX: 00007f80ec93641a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffcb38bbde0 R8: 000000000001b742 R9: 00007f80ee1a0f80
R10: 00007f80ee1a1250 R11: 0000000000000246 R12: 000000000001b742
R13: 00007ffcb38bbd70 R14: 0000000000000000 R15: 00007ffcb38bbf00
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
-------------------
The gdb 7.6 result looks more plausible. That binary was built by reverting the
commit that updated gdb
(https://github.com/crash-utility/crash/commit/9fab193).
I also tried the crash 7.3.2 and 8.0.1 releases (I had problems compiling 8.0.0),
and the results are consistent with the above: 7.3.2 parses the stack correctly, and
8.0.1 shows the problem.
In crash_805_gdb76, x86_64_framesize_cache[3].framesize = 624:
(gdb) p x86_64_framesize_cache[0]
$136 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
(gdb) p x86_64_framesize_cache[1]
$137 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
(gdb) p x86_64_framesize_cache[2]
$138 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
(gdb) p x86_64_framesize_cache[3]
$139 = {textaddr = 18446744071878401213, framesize = 624, exception = 0}
But in crash_805_gdb102, x86_64_framesize_cache[3].framesize = 0:
(gdb) p x86_64_framesize_cache[0]
$86 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
(gdb) p x86_64_framesize_cache[1]
$87 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
(gdb) p x86_64_framesize_cache[2]
$88 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
(gdb) p x86_64_framesize_cache[3]
$89 = {textaddr = 18446744071878401213, framesize = 0, exception = 0}
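The textaddr values above are printed in decimal, which makes them hard to match against the bt output. As a cross-check, here is a minimal Python sketch that converts them to hex and looks them up in a small table copied by hand from the backtrace. The table and the helper name `describe` are only for illustration; they are not part of crash.

```python
# Decimal textaddr values copied from the x86_64_framesize_cache dumps above.
CACHE_TEXTADDRS = [
    18446744071880546969,
    18446744071906258243,
    18446744071908104495,
    18446744071878401213,
]

# Return addresses and symbol names copied by hand from the bt output.
BT_SYMBOLS = {
    0xffffffff92fb9a99: "__crash_kexec",
    0xffffffff9483ed43: "panic",
    0xffffffff94a0192f: "apic_timer_interrupt",
    0xffffffff92dadcbd: "copy_process",
}


def describe(textaddr):
    """Return 'hex-address symbol' for one cached text address."""
    return f"{textaddr:#x} {BT_SYMBOLS.get(textaddr, '(not in bt)')}"


for index, textaddr in enumerate(CACHE_TEXTADDRS):
    print(f"cache[{index}]: {describe(textaddr)}")
```

Entry 3, the one whose framesize differs between the two builds, corresponds to ffffffff92dadcbd, the copy_process return address in frame #9.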
---------------------------------------------
In the "Walk the process stack." part of x86_64_low_budget_back_trace_cmd, the values
of *up at the following call are:
x86_64.c:4059 switch (x86_64_print_stack_entry(bt, ofp, level, i,*up))
The addresses returned by crash_805_gdb76:
0xffffffff92d059ab
0xffffffff92fb9a99
0xffffffff9483ed43
0xffffffff93052cf6
0xffffffff92f5e96e
0xffffffff92f5ffe7
0xffffffff94a03176
0xffffffff94a0192f
0xffffffff92dadcbd <- copy_process (exception RIP was in copy_page_range)
0xffffffff92dae7a1
0xffffffff92c085f4
0xffffffff94a000a4
The addresses returned by crash_805_gdb102:
0xffffffff92d059ab
0xffffffff92fb9a99
0xffffffff9483ed43
0xffffffff93052cf6
0xffffffff92f5e96e
0xffffffff92f5ffe7
0xffffffff94a03176
0xffffffff94a0192f
0xffffffff92dadcbd <- copy_process (exception RIP was in copy_page_range)
0xffffffff92ed8dd5 -------Parts that shouldn't appear
0xffffffff93458397
0xffffffff934585d2
0xffffffff934b5ead --------Parts that shouldn't appear
0xffffffff92dae7a1
0xffffffff92c085f4
0xffffffff94a000a4
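To make the difference between the two address lists explicit, the comparison can be sketched in a few lines of Python. The lists are copied from the output above; `extra_entries` is a hypothetical helper name, not a crash function.

```python
# Addresses returned by crash_805_gdb76, in order.
GDB76 = [
    0xffffffff92d059ab, 0xffffffff92fb9a99, 0xffffffff9483ed43,
    0xffffffff93052cf6, 0xffffffff92f5e96e, 0xffffffff92f5ffe7,
    0xffffffff94a03176, 0xffffffff94a0192f, 0xffffffff92dadcbd,
    0xffffffff92dae7a1, 0xffffffff92c085f4, 0xffffffff94a000a4,
]

# Addresses returned by crash_805_gdb102, in order.
GDB102 = [
    0xffffffff92d059ab, 0xffffffff92fb9a99, 0xffffffff9483ed43,
    0xffffffff93052cf6, 0xffffffff92f5e96e, 0xffffffff92f5ffe7,
    0xffffffff94a03176, 0xffffffff94a0192f, 0xffffffff92dadcbd,
    0xffffffff92ed8dd5, 0xffffffff93458397, 0xffffffff934585d2,
    0xffffffff934b5ead, 0xffffffff92dae7a1, 0xffffffff92c085f4,
    0xffffffff94a000a4,
]


def extra_entries(reference, candidate):
    """Return the addresses in candidate that never occur in reference."""
    seen = set(reference)
    return [addr for addr in candidate if addr not in seen]


extra = extra_entries(GDB76, GDB102)
for addr in extra:
    print(f"{addr:#x}")

# Dropping the extra entries from the gdb 10.2 list gives back the gdb 7.6 list.
remaining = [addr for addr in GDB102 if addr not in set(extra)]
print(remaining == GDB76)
```

The only difference is the four consecutive entries that follow 0xffffffff92dadcbd; everything before and after them is identical in both builds.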
The symbols for these addresses are:
0xffffffff92ed8dd5 __mutex_init+181
0xffffffff93458397 __alloc_file+407
0xffffffff934585d2 alloc_empty_file+146
0xffffffff934b5ead __alloc_fd+141
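One possible reading of these four entries, offered as a guess rather than a confirmed diagnosis: they are stale text addresses lying inside the copy_process stack frame, which the gdb 7.6 build skips because it knows framesize = 624. The Python sketch below checks only the arithmetic, using the stack addresses printed in brackets by bt. It assumes the unwinder steps over framesize bytes plus one 8-byte return-address slot; that step is my assumption and is not verified against x86_64.c here.

```python
# Stack addresses shown in brackets in the bt output.
COPY_PROCESS_SLOT = 0xffff888263157bc0  # frame #9 in both builds
DO_FORK_SLOT = 0xffff888263157e38       # _do_fork frame in both builds

FRAMESIZE_GDB76 = 624      # x86_64_framesize_cache[3].framesize with gdb 7.6
RETURN_ADDRESS_SIZE = 8    # assumption: one 8-byte return-address slot

# Stack addresses of the frames that only the gdb 10.2 / 13.2 builds print.
SPURIOUS_SLOTS = {
    0xffff888263157d20: "__mutex_init",
    0xffff888263157d38: "__alloc_file",
    0xffff888263157d60: "alloc_empty_file",
    0xffff888263157da8: "__alloc_fd",
}


def inside_copy_process_frame(slot):
    """True if slot lies between the copy_process and _do_fork entries."""
    return COPY_PROCESS_SLOT < slot < DO_FORK_SLOT


gap = DO_FORK_SLOT - COPY_PROCESS_SLOT
print(f"gap between the two frames: {gap} bytes ({gap:#x})")
print(f"framesize + return address: {FRAMESIZE_GDB76 + RETURN_ADDRESS_SIZE} bytes")

for slot, symbol in SPURIOUS_SLOTS.items():
    print(f"{slot:#x} {symbol}: inside={inside_copy_process_frame(slot)}")
```

The gap between the copy_process and _do_fork entries is 632 bytes, which equals 624 + 8, and all four extra entries fall inside that range. That is consistent with the framesize of 0 causing every stack word in between to be examined, but it does not explain why the newer gdb builds compute 0.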
The vmcore was generated with:
makedumpfile -l -d 31 /proc/vmcore [date].kernel_core
Unfortunately, I am not using a regular distribution; it is a deeply customized one.
The vmcore is available on Google Drive:
https://drive.google.com/file/d/1pDICRP6zQafe00c4LWRV-SklkM75971P/view