Hello Tao,
On Mon, Dec 25, 2023 at 10:25:00AM +0800, Tao Liu wrote:
Hi Aditya,
On Wed, Dec 20, 2023 at 4:52 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>
> The Problem:
> ============
>
> Currently crash is unable to show function arguments and local variables, as
> gdb can do. And functionality for moving between frames
('up'/'down') is not
> working in crash.
>
> Crash has 'gdb passthroughs' for things gdb can do, but the gdb
passthroughs
> 'bt', 'frame', 'info locals', 'up', 'down'
are not working either, due to
> gdb not getting the register values from `crash_target::fetch_registers`,
> which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64
>
> Proposed Solution:
> ==================
>
> Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for
PPC64.
> This way, "gdb mode in crash" will support this feature for both ELF and
> kdump-compressed vmcore formats, while "gdb" would only have supported
ELF
> format
>
> This way other features of 'gdb', such as seeing
> backtraces/registers/variables/arguments/local variables, moving up and
> down stack frames, can be used with any ppc64 vmcore, irrespective of
> being ELF format or kdump-compressed format.
>
> Note: This doesn't support live debugging on ppc64, since registers are not
> available to be read
>
> Implications on Architectures:
> ====================================
>
> No architecture other than PPC64 has been affected, other than in case of
> 'frame' command
>
> As mentioned in patch #2, since frame will not be prohibited, so it will print:
>
> crash> frame
> #0 <unavailable> in ?? ()
>
> Instead of before prohibited message:
>
> crash> frame
> crash: prohibited gdb command: frame
>
> Major change will be in 'gdb mode' on PPC64, that it will print the frames,
and
> local variables, instead of failing with errors showing no frame, or showing
> that couldn't get PC, it will be able to give all this information.
>
> Testing:
> ========
>
> Git tree with this patch series applied:
>
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v5-smaller-gran
>
> To test various gdb passthroughs:
>
> (crash) set
> (crash) set gdb on
> gdb> thread
> gdb> bt
> gdb> info threads
> gdb> info threads
> gdb> info locals
> gdb> info variables irq_rover_lock
> gdb> info args
> gdb> thread 2
> gdb> set gdb off
> (crash) set
> (crash) set -c 6
> (crash) gdb thread
> (crash) bt
> (crash) gdb bt
> (crash) frame
> (crash) up
> (crash) down
> (crash) info locals
>
It seems we cannot view arbitrary process stack unwinding right? I
mean, "gdb thread X" or "gdb bt" can only view the process which is
currently running on the specific CPU, but for other processes which
are not occupying CPU, we cannot specify it by "gdb bt <pid>" or
"gdb
thread <pid>". However drgn can unwind stack on arbitrary processes:
>>> prog.stack_trace(<pid>)[<frame_id>]["varibale_name"]
Yes, currently gdb features cannot be used with arbitrary PIDs, since GDB
doesn't know of all those threads.
GDB gets the details of threads, registers etc from a registered 'target'.
When we debug a vmcore and vmlinux with GDB (NOT gdb mode in crash), ie. with
`gdb <vmlinux-path> <vmcore-path>`, then the target is 'core_target'.
This target knows all the known PIDs in the vmcore.
While in gdb mode, the target we register with GDB is 'crash_target'. And in
'crash_target_init', we explicitly register only 'nr_cpus' threads, one
for each
CPU. Hence it does not know of any other threads.
It seems crash cmd bt need to be improved in order to view arbitrary
process stack unwinding like drgn right?
Yes, specifically 'crash_target' needs to be improved to support all those
arbitrary threads. crash bt can already print arbitrary PID backtraces, but
improving crash_target will allow printing arbitrary threads in gdb mode also.
Thanks,
Aditya Gupta
>
> Thanks,
> Tao Liu
>
>
> > Known Issues:
> > =============
> >
> > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores
collected
> > from older kernels. This is a known issue due to register mismatch, and
> > its fix has been merged upstream:
> >
> > This can also cause some 'invalid kernel virtual address' errors
during gdb
> > unwinding the stack registers
> >
> > Commit:
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
> >
> > Fixing GDB passthroughs on other architectures
> > ==============================================
> >
> > Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
> > thread', 'gdb info locals' etc. has been done by the patches
introducing
> > 'machdep->get_cpu_reg' and this series fixing some issues in that.
> >
> > Other architectures should be able to fix these gdb functionalities by
> > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> >
> > The reasoning behind that has been explained with a diagram in commit
> > description of patch #1
> >
> > I will assist with my findings/observations fixing it on ppc64 whenever needed.
> >
> > Changelog:
> > ==========
> >
> > V5:
> > + changes in patch #1: made ppc64_get_cpu_reg static, and remove unreachable
> > code
> > + changes in patch #3: fixed typo 'ppc64_renum' instead of
'ppc64_regnum',
> > remove unneeded if condition
> > + changes in patch #5: implement refresh regcache on per thread, instead of all
> > threads at once
> >
> > V4:
> > + fix segmentation fault in live debugging (change in patch #1)
> > + mention live debugging not supported in cover letter and patch #1
> > + fixed some checkpatch warnings (change in patch #5)
> >
> > V3:
> > + default gdb thread will be the crashing thread, instead of being
> > thread '0'
> > + synchronise crash cpu and gdb thread context
> > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> > output in some cases, such as info threads and extra output in info
> > variables
> > + fix 'info threads'
> >
> > RFC V2:
> > - removed patch implementing 'frame', 'up', 'down' in
crash
> > - updated the cover letter by removing the mention of those commands other
> > than the respective gdb passthrough
> >
> > Aditya Gupta (5):
> > ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> > remove 'frame' from prohibited commands list
> > synchronise cpu context changes between crash/gdb
> > fix gdb_interface: restore gdb's output streams at end of
> > gdb_interface
> > fix 'info threads' command
> >
> > crash_target.c | 44 ++++++++++++++++
> > defs.h | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> > gdb-10.2.patch | 110 +++++++++++++++++++++++++++++++++++++++-
> > gdb_interface.c | 2 +-
> > kernel.c | 47 +++++++++++++++--
> > ppc64.c | 95 +++++++++++++++++++++++++++++++++--
> > task.c | 14 ++++++
> > tools.c | 2 +-
> > 8 files changed, 434 insertions(+), 10 deletions(-)
> >
> > --
> > 2.41.0
> >
>