Thanks,
Tao Liu
On Fri, Sep 13, 2024 at 9:07 PM <mycomplexlove(a)gmail.com> wrote:
 Hello, crash maintainers.
 I found a problem: with crash built on gdb 10.2, the backtrace for one of my vmcores
 contains frames that should not appear when the process stack is parsed.
 After discussing this with liutgnu, I rebuilt crash from
 https://github.com/liutgnu/crash-preview and tried again.
 Unfortunately, that gdb 13.2-based build of crash still shows the problem.
 Here is the output of my test:
 -------------------
 crash 8.0.4++
 Copyright (C) 2002-2022  Red Hat, Inc.
 Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
 Copyright (C) 1999-2006  Hewlett-Packard Co
 Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
 Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
 Copyright (C) 2005, 2011, 2020-2022  NEC Corporation
 Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
 Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
 Copyright (C) 2015, 2021  VMware, Inc.
 This program is free software, covered by the GNU General Public License,
 and you are welcome to change it and/or distribute copies of it under
 certain conditions.  Enter "help copying" to see the conditions.
 This program has absolutely no warranty.  Enter "help warranty" for details.
 GNU gdb (GDB) 13.2
 Copyright (C) 2023 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.
 Type "show copying" and "show warranty" for details.
 This GDB was configured as "x86_64-pc-linux-gnu".
 Type "show configuration" for configuration details.
 Find the GDB manual and other documentation resources online at:
     <http://www.gnu.org/software/gdb/documentation/>.
 For help, type "help".
 Type "apropos word" to search for commands related to "word"...
       KERNEL: /root/hungtask/vmlinux  [TAINTED]
     DUMPFILE: /root/hungtask/2024_09_06_05_02_15.kernel_core  [PARTIAL DUMP]
         CPUS: 64
         DATE: Fri Sep  6 05:01:47 CST 2024
       UPTIME: 12:27:05
 LOAD AVERAGE: 56.87, 25.40, 18.24
        TASKS: 4319
     NODENAME: host-047bcb37834d
      RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
      VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
      MACHINE: x86_64  (2499 Mhz)
       MEMORY: 255.9 GB
        PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
          PID: 112450
      COMMAND: "vtpstatd"
         TASK: ffff88816ae80000  [THREAD_INFO: ffff88816ae80000]
          CPU: 41
        STATE: TASK_RUNNING (PANIC)
 crash> bt
 PID: 112450   TASK: ffff88816ae80000  CPU: 41   COMMAND: "vtpstatd"
  #0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
  #1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
  #2 [ffff889e3fa87d30] panic at ffffffff9483ed43
  #3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
  #4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
  #5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
  #6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
  #7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
 --- <IRQ stack> ---
  #8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
     [exception RIP: copy_page_range+3681]
     RIP: ffffffff9331d461  RSP: ffff8882631579e8  RFLAGS: 00000246
     RAX: 1ffffd4018a95ad1  RBX: 8000003152b5a805  RCX: ffffea00c54ad688
     RDX: ffffea00c54aee88  RSI: 00007f80d117f000  RDI: ffffffff956468e0
     RBP: ffff8881c5bc2bf8   R8: fffff94018a2e22f   R9: fffff94018a2e22f
     R10: 0000000000000001  R11: fffff94018a2e22e  R12: 0000000000000018
     R13: dffffc0000000000  R14: ffffea00c54ad680  R15: 00007f80d117f000
     ORIG_RAX: ffffffffffffff13  CS: 0010  SS: 0018
  #9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
 #10 [ffff888263157d20] __mutex_init at ffffffff92ed8dd5
 #11 [ffff888263157d38] __alloc_file at ffffffff93458397
 #12 [ffff888263157d60] alloc_empty_file at ffffffff934585d2
 #13 [ffff888263157da8] __alloc_fd at ffffffff934b5ead
 #14 [ffff888263157e38] _do_fork at ffffffff92dae7a1
 #15 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
 #16 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
     RIP: 00007f80ec93641a  RSP: 00007ffcb38bbd50  RFLAGS: 00000246
     RAX: ffffffffffffffda  RBX: 00007ffcb38bbd50  RCX: 00007f80ec93641a
     RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000001200011
     RBP: 00007ffcb38bbde0   R8: 000000000001b742   R9: 00007f80ee1a0f80
     R10: 00007f80ee1a1250  R11: 0000000000000246  R12: 000000000001b742
     R13: 00007ffcb38bbd70  R14: 0000000000000000  R15: 00007ffcb38bbf00
     ORIG_RAX: 0000000000000038  CS: 0033  SS: 002b
 -------------------
 Frames #10 through #13 seem redundant.
 For comparison, here is the output from the latest crash code built with gdb 7.6:
 -------------------
 crash_805_gdb76 8.0.5++
 Copyright (C) 2002-2024  Red Hat, Inc.
 Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
 Copyright (C) 1999-2006  Hewlett-Packard Co
 Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
 Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
 Copyright (C) 2005, 2011, 2020-2024  NEC Corporation
 Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
 Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
 This program is free software, covered by the GNU General Public License,
 and you are welcome to change it and/or distribute copies of it under
 certain conditions.  Enter "help copying" to see the conditions.
 This program has absolutely no warranty.  Enter "help warranty" for details.
 GNU gdb (GDB) 7.6
 Copyright (C) 2013 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
 and "show warranty" for details.
 This GDB was configured as "x86_64-unknown-linux-gnu"...
 WARNING: kernel relocated [284MB]: patching 99408 gdb minimal_symbol values
 crash_805_gdb76: gdb cannot find text block for address: dd_init_queue
       KERNEL: vmlinux  [TAINTED]
     DUMPFILE: 2024_09_06_05_02_15.kernel_core  [PARTIAL DUMP]
         CPUS: 64
         DATE: Fri Sep  6 05:01:47 CST 2024
       UPTIME: 12:27:05
 LOAD AVERAGE: 56.87, 25.40, 18.24
        TASKS: 4319
     NODENAME: host-047bcb37834d
      RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
      VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
      MACHINE: x86_64  (2499 Mhz)
       MEMORY: 255.9 GB
        PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
          PID: 112450
      COMMAND: "vtpstatd"
         TASK: ffff88816ae80000  [THREAD_INFO: ffff88816ae80000]
          CPU: 41
        STATE: TASK_RUNNING (PANIC)
 crash_805_gdb76> bt
 PID: 112450   TASK: ffff88816ae80000  CPU: 41   COMMAND: "vtpstatd"
  #0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
  #1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
  #2 [ffff889e3fa87d30] panic at ffffffff9483ed43
  #3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
  #4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
  #5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
  #6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
  #7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
 --- <IRQ stack> ---
  #8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
     [exception RIP: copy_page_range+3681]
     RIP: ffffffff9331d461  RSP: ffff8882631579e8  RFLAGS: 00000246
     RAX: 1ffffd4018a95ad1  RBX: 8000003152b5a805  RCX: ffffea00c54ad688
     RDX: ffffea00c54aee88  RSI: 00007f80d117f000  RDI: ffffffff956468e0
     RBP: ffff8881c5bc2bf8   R8: fffff94018a2e22f   R9: fffff94018a2e22f
     R10: 0000000000000001  R11: fffff94018a2e22e  R12: 0000000000000018
     R13: dffffc0000000000  R14: ffffea00c54ad680  R15: 00007f80d117f000
     ORIG_RAX: ffffffffffffff13  CS: 0010  SS: 0018
  #9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
 #10 [ffff888263157e38] _do_fork at ffffffff92dae7a1
 #11 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
 #12 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
     RIP: 00007f80ec93641a  RSP: 00007ffcb38bbd50  RFLAGS: 00000246
     RAX: ffffffffffffffda  RBX: 00007ffcb38bbd50  RCX: 00007f80ec93641a
     RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000001200011
     RBP: 00007ffcb38bbde0   R8: 000000000001b742   R9: 00007f80ee1a0f80
     R10: 00007f80ee1a1250  R11: 0000000000000246  R12: 000000000001b742
     R13: 00007ffcb38bbd70  R14: 0000000000000000  R15: 00007ffcb38bbf00
     ORIG_RAX: 0000000000000038  CS: 0033  SS: 002b
 -------------------
 The gdb 7.6 backtrace looks more convincing. That build was made by reverting
 the commit that updated gdb
 (https://github.com/crash-utility/crash/commit/9fab193).
 I also tried the crash 7.3.2 and 8.0.1 releases (I had problems compiling 8.0.0),
 and the results are consistent with the above: 7.3.2 parses the stack correctly,
 and 8.0.1 has the problem.
 In crash_805_gdb76, x86_64_framesize_cache[3].framesize is 624:
 (gdb) p x86_64_framesize_cache[0]
 $136 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
 (gdb) p x86_64_framesize_cache[1]
 $137 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
 (gdb) p x86_64_framesize_cache[2]
 $138 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
 (gdb) p x86_64_framesize_cache[3]
 $139 = {textaddr = 18446744071878401213, framesize = 624, exception = 0}
 But in crash_805_gdb102, x86_64_framesize_cache[3].framesize is 0:
 (gdb) p x86_64_framesize_cache[0]
 $86 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
 (gdb) p x86_64_framesize_cache[1]
 $87 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
 (gdb) p x86_64_framesize_cache[2]
 $88 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
 (gdb) p x86_64_framesize_cache[3]
 $89 = {textaddr = 18446744071878401213, framesize = 0, exception = 0}
 ---------------------------------------------
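 The cache entries above can be cross-checked against the backtraces. The following is only a sketch in Python, not crash code, and the variable names are illustrative. It converts the decimal textaddr of entry [3] to hex, and compares the stack addresses of frames #9 and #10 from the gdb 7.6 backtrace with framesize = 624, assuming the next frame sits one 8-byte return-address slot beyond the frame itself.

```python
# Values copied from the gdb output and the gdb 7.6 backtrace in this report.
TEXTADDR_DEC = 18446744071878401213   # x86_64_framesize_cache[3].textaddr
FRAME9_SP = 0xffff888263157bc0        # #9  copy_process
FRAME10_SP = 0xffff888263157e38       # #10 _do_fork (gdb 7.6 build)
FRAMESIZE_GDB76 = 624                 # framesize cached by the gdb 7.6 build

# The cache prints textaddr in decimal; hex makes it comparable with "bt".
textaddr_hex = hex(TEXTADDR_DEC)

# Distance on the stack between the two frames that gdb 7.6 reports.
stack_gap = FRAME10_SP - FRAME9_SP

print(textaddr_hex)                 # 0xffffffff92dadcbd
print(stack_gap)                    # 632
print(stack_gap - FRAMESIZE_GDB76)  # 8
```

 So entry [3] is the copy_process return address from frame #9, and 624 plus 8 bytes accounts for the distance to _do_fork. The four extra frames in the gdb 13.2 output (ffff888263157d20 through ffff888263157da8) all lie inside that 632-byte gap, which would be consistent with a framesize of 0 causing text addresses inside the copy_process frame to be reported, although that is not confirmed here.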
 After the "Walk the process stack." step of x86_64_low_budget_back_trace_cmd, I recorded
 the value of *up at each call to x86_64_print_stack_entry:
 x86_64.c:4059    switch (x86_64_print_stack_entry(bt, ofp, level, i,*up))
 The addresses from crash_805_gdb76:
 0xffffffff92d059ab
 0xffffffff92fb9a99
 0xffffffff9483ed43
 0xffffffff93052cf6
 0xffffffff92f5e96e
 0xffffffff92f5ffe7
 0xffffffff94a03176
 0xffffffff94a0192f
 0xffffffff92dadcbd <- copy_process (caller of copy_page_range)
 0xffffffff92dae7a1
 0xffffffff92c085f4
 0xffffffff94a000a4
 The addresses from crash_805_gdb102:
 0xffffffff92d059ab
 0xffffffff92fb9a99
 0xffffffff9483ed43
 0xffffffff93052cf6
 0xffffffff92f5e96e
 0xffffffff92f5ffe7
 0xffffffff94a03176
 0xffffffff94a0192f
 0xffffffff92dadcbd <- copy_process (caller of copy_page_range)
 0xffffffff92ed8dd5 <- should not appear
 0xffffffff93458397 <- should not appear
 0xffffffff934585d2 <- should not appear
 0xffffffff934b5ead <- should not appear
 0xffffffff92dae7a1
 0xffffffff92c085f4
 0xffffffff94a000a4
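 The difference between the two address lists can also be computed mechanically. This is a small Python sketch with the addresses copied from the two lists in this report; the list names are illustrative only.

```python
# Addresses passed to x86_64_print_stack_entry, copied from the two lists.
GDB76 = [
    0xffffffff92d059ab, 0xffffffff92fb9a99, 0xffffffff9483ed43,
    0xffffffff93052cf6, 0xffffffff92f5e96e, 0xffffffff92f5ffe7,
    0xffffffff94a03176, 0xffffffff94a0192f, 0xffffffff92dadcbd,
    0xffffffff92dae7a1, 0xffffffff92c085f4, 0xffffffff94a000a4,
]
GDB102 = [
    0xffffffff92d059ab, 0xffffffff92fb9a99, 0xffffffff9483ed43,
    0xffffffff93052cf6, 0xffffffff92f5e96e, 0xffffffff92f5ffe7,
    0xffffffff94a03176, 0xffffffff94a0192f, 0xffffffff92dadcbd,
    0xffffffff92ed8dd5, 0xffffffff93458397, 0xffffffff934585d2,
    0xffffffff934b5ead, 0xffffffff92dae7a1, 0xffffffff92c085f4,
    0xffffffff94a000a4,
]

# Addresses that only the gdb 10.2 build reports, in stack order.
known = set(GDB76)
extra = [addr for addr in GDB102 if addr not in known]

# Dropping the extras should reproduce the gdb 7.6 list exactly.
remaining = [addr for addr in GDB102 if addr in known]

for addr in extra:
    print(hex(addr))
print(remaining == GDB76)
```

 The two lists differ only in four consecutive addresses located between copy_process and _do_fork.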
 The symbols for these extra addresses are:
 0xffffffff92ed8dd5 __mutex_init+181
 0xffffffff93458397 __alloc_file+407
 0xffffffff934585d2 alloc_empty_file+146
 0xffffffff934b5ead __alloc_fd+141
 The vmcore was generated with:
 makedumpfile -l -d 31 /proc/vmcore  [date].kernel_core
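 For reference, -d 31 is a bitmask of the page types that makedumpfile leaves out, which is why crash reports [PARTIAL DUMP]. The sketch below decodes a dump level using the bit meanings from the makedumpfile man page; the helper name is hypothetical and the code is not part of makedumpfile.

```python
# Dump-level bits as documented in the makedumpfile man page.
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages (non-private)",
    4: "cache pages (private)",
    8: "user process data pages",
    16: "free pages",
}

def excluded_page_types(level):
    """Return the page types excluded by a makedumpfile -d level (hypothetical helper)."""
    return [name for bit, name in DUMP_LEVEL_BITS.items() if level & bit]

for name in excluded_page_types(31):
    print(name)
```

 Level 31 sets all five bits. None of these categories is kernel data, so the dump level by itself would not be expected to remove kernel stack pages.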
 Unfortunately, I am not using a regular distribution; it is a deeply customized one.
 The vmcore is available on Google Drive:
 https://drive.google.com/file/d/1pDICRP6zQafe00c4LWRV-SklkM75971P/view