Sorry for the late reply, and thank you for the update.
Date: Wed, 9 Aug 2023 02:03:17 +0530
From: Aditya Gupta <adityag@linux.ibm.com>
To: crash-utility@redhat.com
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>, Sourabh Jain
<sourabhjain@linux.ibm.com>, Hari Bathini <hbathini@linux.ibm.com>
Subject: [Crash-utility] [RFC PATCH v2 0/4] Improve stack unwind on
ppc64
Message-ID: <20230808203321.241732-1-adityag@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
The Problem:
============
Currently crash is unable to show function arguments and local variables, as
gdb can do. And functionality for moving between frames ('up'/'down') is not
working in crash.
That's true: we have to work out their values by hand from the stack and registers, because they may be stored in either place. This is not friendly to most kernel developers and debuggers.
Anyway, this is a good point. It would be even better if inline functions could also be displayed.
Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to
gdb not getting the register values from `crash_target::fetch_registers`,
which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64.
Proposed Solution:
==================
Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
This way, "gdb mode in crash" will support this feature for both ELF and
kdump-compressed vmcore formats, while "gdb" would only have supported ELF
format.
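To make the idea concrete, here is a minimal, self-contained sketch of what a per-architecture `get_cpu_reg` callback could look like. It is not the code from the patch series. The `demo_` names, the register snapshot layout, the register numbering (`DEMO_NIP_REGNUM`) and the callback signature are all assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbering; the numbering gdb really uses may differ. */
enum { DEMO_NUM_GPRS = 32, DEMO_NIP_REGNUM = 64 };

/* Hypothetical per-CPU register snapshot, loosely modelled on pt_regs. */
struct demo_ppc64_regs {
    uint64_t gpr[DEMO_NUM_GPRS];
    uint64_t nip;                 /* next instruction pointer (the PC) */
};

/* Hypothetical stand-in for the per-architecture callback table. */
struct demo_machdep {
    int (*get_cpu_reg)(int cpu, int regno, const char *name,
                       int size, void *value);
};

#define DEMO_NR_CPUS 2

/* In a real tool these would be filled from the vmcore; here they are a table. */
struct demo_ppc64_regs demo_saved_regs[DEMO_NR_CPUS];

/* Copy one register of one CPU into 'value'. Returns 1 on success, 0 otherwise. */
int demo_ppc64_get_cpu_reg(int cpu, int regno, const char *name,
                           int size, void *value)
{
    (void)name; /* unused in this sketch */

    if (cpu < 0 || cpu >= DEMO_NR_CPUS || value == NULL ||
        size != (int)sizeof(uint64_t))
        return 0;

    const struct demo_ppc64_regs *regs = &demo_saved_regs[cpu];

    if (regno >= 0 && regno < DEMO_NUM_GPRS) {
        memcpy(value, &regs->gpr[regno], sizeof(uint64_t));
        return 1;
    }
    if (regno == DEMO_NIP_REGNUM) {
        memcpy(value, &regs->nip, sizeof(uint64_t));
        return 1;
    }
    return 0; /* register not available: the caller reports <unavailable> */
}

/* Wiring the callback into the (hypothetical) table. */
struct demo_machdep demo_machdep = { .get_cpu_reg = demo_ppc64_get_cpu_reg };
```

The point of the sketch is only the shape of the call chain: the register fetch path asks the callback for one register of one CPU, and a return value of 0 is what would leave gdb with an unavailable PC and therefore no usable frame.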
Implications on Architectures:
==============================
No architecture other than PPC64 is affected, except for the 'frame'
command.
BTW: Can this feature be implemented on other architectures such as x86_64? Have you investigated?
As mentioned in patch #2, since 'frame' will no longer be prohibited, it will print:
crash> frame
#0 <unavailable> in ?? ()
instead of the previous 'prohibited' message:
crash> frame
crash: prohibited gdb command: frame
On PPC64, the default mode ("crash mode") will not have ANY OTHER changes,
other than 'frame' as mentioned above.
The major change is in 'gdb mode' on PPC64: it will print the frames and
local variables, instead of failing with errors saying there is no frame or
that the PC could not be read.
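The change to 'frame' described above amounts to dropping one entry from a list of rejected gdb commands. A rough sketch of that mechanism follows. The list contents and the `demo_` names are invented for illustration and are not taken from crash's gdb_interface.c.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative lists only; NULL marks the end of each list. */
const char *const demo_prohibited_before[] = { "run", "attach", "frame", NULL };
const char *const demo_prohibited_after[]  = { "run", "attach", NULL };

/* Return 1 if 'cmd' appears in the NULL-terminated 'list', else 0. */
int demo_is_prohibited(const char *const *list, const char *cmd)
{
    for (size_t i = 0; list[i] != NULL; i++) {
        if (strcmp(list[i], cmd) == 0)
            return 1;
    }
    return 0;
}
```

With the "before" list, 'frame' is rejected before it ever reaches gdb. With the "after" list it is passed through, which is why architectures without register support would see `#0 <unavailable> in ?? ()` rather than the prohibited message.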
Testing:
========
Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-rfc2
To test gdb passthroughs:
crash> set gdb on
gdb> thread 3 # or any other thread number to change context in gdb
gdb> bt
gdb> frame
gdb> up
gdb> down
gdb> info locals
I did a simple test, shown below (kernel commit: 99d99825fc07):
gdb> info threads
Id Target Id Frame
1 CPU 0 <unavailable> in ?? ()
2 CPU 1
gdb> thread 2
[Switching to thread 2 (CPU 1)]
#0 0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
69 ppc_save_regs(newregs);
gdb> bt
#0 0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:1064
#2 0xc00000000014e018 in panic (fmt=0xc000000001443d80 "sysrq triggered crash\n") at kernel/panic.c:359
#3 0xc0000000009b8978 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
#4 0xc0000000009b946c in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602
#5 0xc0000000009b9ce8 in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
#6 0xc0000000006919fc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000556fcc0) at fs/proc/inode.c:340
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
#8 0xc0000000005b7cb8 in vfs_write (file=file@entry=0xc000000036fa5f00, buf=buf@entry=0x10027835560 <error: Cannot access memory at address 0x10027835560>, count=count@entry=2, pos=pos@entry=0xc00000003dbd7de0) at fs/read_write.c:582
#9 0xc0000000005b83a4 in ksys_write (fd=<optimized out>, buf=0x10027835560 <error: Cannot access memory at address 0x10027835560>, count=2) at fs/read_write.c:637
#10 0xc000000000031454 in system_call_exception (regs=0xc00000003dbd7e80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#11 0xc00000000000cedc in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gdb> frame 7
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
352 rv = pde_write(pde, file, buf, count, ppos);
gdb> info rv
gdb: gdb request failed: info rv
gdb>
It seems that the 'info locals' command is not working as expected. I haven't investigated the details.
Known Issues:
=============
1. In gdb mode, 'info threads' might hang for a few seconds, and print only 2
threads
Hmm, it prints only 2 threads on my side, and one of them is unavailable. Can you dig into the details?
2. In gdb mode, 'bt' might fail to show a backtrace for a few vmcores collected
from older kernels. This is a known issue due to register mismatch, and
its fix has been merged upstream:
Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
TODO:
=====
1. Introduce automatic thread selection in gdb mode, to select the crashing
thread in gdb, eliminating the need to manually run "thread <id>" after
switching to gdb mode.
Changelog:
==========
RFC V2:
- removed patch implementing 'frame', 'up', 'down' in crash
- updated the cover letter by removing the mention of those commands other
than the respective gdb passthrough
In addition, get_dumpfile_regs() is not invoked in [patch 1], so I would suggest moving it into [patch 2]. This is just from a quick glance; I haven't looked at the patchset carefully.
Thanks.
Lianbo
Aditya Gupta (4):
add generic get_dumpfile_regs to read registers
ppc64: fix gdb passthrough by implementing machdep->get_cpu_reg
remove 'frame' from prohibited commands list
make cpu context change transparent to crash/gdb
defs.h | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
gdb-10.2.patch | 28 +++++++++++
gdb_interface.c | 2 +-
kernel.c | 33 +++++++++++++
ppc64.c | 105 ++++++++++++++++++++++++++++++++++++++--
tools.c | 12 +++--
6 files changed, 298 insertions(+), 7 deletions(-)
--
2.41.0