Hi Aditya,
On Thu, Feb 22, 2024 at 6:17 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
Hi Tao,
On Thu, Feb 22, 2024 at 04:57:23PM +0800, Tao Liu wrote:
> Hi Aditya,
>
> On Thu, Feb 22, 2024 at 1:23 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
> >
> > The Problem:
> > ============
> >
> > Currently crash is unable to show function arguments and local variables, as
> > gdb can do. And functionality for moving between frames
('up'/'down') is not
> > working in crash.
> >
> > Crash has 'gdb passthroughs' for things gdb can do, but the gdb
passthroughs
> > 'bt', 'frame', 'info locals', 'up',
'down' are not working either, due to
> > gdb not getting the register values from `crash_target::fetch_registers`,
> > which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64
> >
> > Proposed Solution:
> > ==================
> >
> > Fix the gdb passthroughs by implementing "machdep->get_cpu_reg"
for PPC64.
> > This way, "gdb mode in crash" will support this feature for both ELF
and
> > kdump-compressed vmcore formats, while "gdb" would only have
supported ELF
> > format
> >
> > This way other features of 'gdb', such as seeing
> > backtraces/registers/variables/arguments/local variables, moving up and
> > down stack frames, can be used with any ppc64 vmcore, irrespective of
> > being ELF format or kdump-compressed format.
> >
> > Note: This doesn't support live debugging on ppc64, since registers are
not
> > available to be read
> >
> > Implications on Architectures:
> > ====================================
> >
> > No architecture other than PPC64 has been affected, other than in case of
> > 'frame' command
> >
> > As mentioned in patch #2, since frame will not be prohibited, so it will
print:
> >
> > crash> frame
> > #0 <unavailable> in ?? ()
> >
> > Instead of before prohibited message:
> >
> > crash> frame
> > crash: prohibited gdb command: frame
> >
> > Major change will be in 'gdb mode' on PPC64, that it will print the
frames, and
> > local variables, instead of failing with errors showing no frame, or showing
> > that couldn't get PC, it will be able to give all this information.
> >
> > Testing:
> > ========
> >
> > Git tree with this patch series applied:
> >
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9
>
> I doubt the v9 patch will not work without my x86's trial patchset. I
> see your repo, you directly applied my "ppc64 arbitrary task stack
> unwind support" patch onto yours. However the patch has some
> dependency on my x86 trial patchset.
Sorry, and thanks for the quick mail.
>
> E.g. I added a new member "bool need_free" for defs.h:struct bt_info
> in the "x86 unwind support" patch of mine, however you didn't pick
the
> one. And in ppc64.c:ppc64_get_stack_frame() and
> ppc64.c:ppc64_get_cpu_reg(), the member will be used as:
>
> if (bt_info.need_free) {
> FREEBUF(pt_regs);
> bt_info.need_free = FALSE;
> }
>
> So I guess (not tried yet) that your patchset v9 will not work.
True, it will not work. Just checked, I had already applied your patches
also and then that patch, where I tested this series. My mistake, please
skip testing this one.
>
> Currently I'm still struggling with some failing cases of x86_64
> unwinding. So I didn't arrange my patchsets, along with the patch
> commit log well, since they are all "trial" patches.
>
> I agree the patch "ppc64 arbitrary task stack unwind support" is
> better to go with the ppc patch series. But I suggest we make some
> modifications for it:
>
> 1) I'm OK with it being a stand alone patch, or merging the code
> changes of this one into your previous patches, but I prefer the
> latter one :)
>
> 2) If you'd like to go with a stand alone patch, could you please
> rewrite a commit log and title for this one?
Sure, I am okay with either. Let's go with the latter one you suggested,
in that case I will add the line adding 'need_free' in 'bt_info'
structure also, to make that work. By the way, shouldn't it be
'FREEBUF(bt_info.stackbuf)' instead of 'FREEBUF(pt_regs)' ?
Also, should I add you as a co-author of the patch ?
Thanks for the reminder! I just found some bugs and made a little
improvement for the patch. So I guess "Suggested-by:" is better than
"co-author", what do you think?
Thanks,
Tao Liu
Thanks,
Aditya Gupta
>
> Thanks,
> Tao Liu
>
>
> >
> > To test various gdb passthroughs:
> >
> > (crash) set
> > (crash) set gdb on
> > gdb> thread
> > gdb> bt
> > gdb> info threads
> > gdb> info threads
> > gdb> info locals
> > gdb> info variables irq_rover_lock
> > gdb> info args
> > gdb> thread 2
> > gdb> set gdb off
> > (crash) set
> > (crash) set -c 6
> > (crash) gdb thread
> > (crash) bt
> > (crash) gdb bt
> > (crash) frame
> > (crash) gdb up
> > (crash) gdb down
> > (crash) info locals
> >
> > Known Issues:
> > =============
> >
> > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores
collected
> > from older kernels. This is a known issue due to register mismatch, and
> > its fix has been merged upstream:
> >
> > This can also cause some 'invalid kernel virtual address' errors
during gdb
> > unwinding the stack registers
> >
> > Commit:
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
> >
> > Fixing GDB passthroughs on other architectures
> > ==============================================
> >
> > Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
> > thread', 'gdb info locals' etc. has been done by the patches
introducing
> > 'machdep->get_cpu_reg' and this series fixing some issues in that.
> >
> > Other architectures should be able to fix these gdb functionalities by
> > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> >
> > The reasoning behind that has been explained with a diagram in commit
> > description of patch #1
> >
> > I will assist with my findings/observations fixing it on ppc64 whenever
needed.
> >
> > Changelog:
> > ==========
> >
> > V9:
> > + minor change in patch #5: sync gdb context on a 'set' and 'set
-p'
> > + add taoliu's patch for using current context, and fixes in
ppc64_get_cpu_reg
> >
> > V8:
> > + use get_active_task instead of depending on CURRENT_CONTEXT in
ppc64_get_cpu_reg
> > + rebase to upstream/master (5977936c0a91)
> >
> > V7:
> > + move changes in gdb-10.2.patch to the end (minor change in patch #3,4,5)
> > + fix a memory leak in ppc64_get_cpu_reg (minor change in patch #1)
> > + use ascii diagram in patch #1 description
> >
> > V6:
> > + changes in patch #5: fix bug introduced in v5 that caused initial gdb thread
> > to be thread 1
> >
> > V5:
> > + changes in patch #1: made ppc64_get_cpu_reg static, and remove unreachable
> > code
> > + changes in patch #3: fixed typo 'ppc64_renum' instead of
'ppc64_regnum',
> > remove unneeded if condition
> > + changes in patch #5: implement refresh regcache on per thread, instead of
all
> > threads at once
> >
> > V4:
> > + fix segmentation fault in live debugging (change in patch #1)
> > + mention live debugging not supported in cover letter and patch #1
> > + fixed some checkpatch warnings (change in patch #5)
> >
> > V3:
> > + default gdb thread will be the crashing thread, instead of being
> > thread '0'
> > + synchronise crash cpu and gdb thread context
> > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> > output in some cases, such as info threads and extra output in info
> > variables
> > + fix 'info threads'
> >
> > RFC V2:
> > - removed patch implementing 'frame', 'up', 'down' in
crash
> > - updated the cover letter by removing the mention of those commands other
> > than the respective gdb passthrough
> >
> > Aditya Gupta (5):
> > ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> > remove 'frame' from prohibited commands list
> > synchronise cpu context changes between crash/gdb
> > fix gdb_interface: restore gdb's output streams at end of
> > gdb_interface
> > fix 'info threads' command
> >
> > crash_target.c | 44 ++++++++++++++++
> > defs.h | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> > gdb-10.2.patch | 110 +++++++++++++++++++++++++++++++++++++++-
> > gdb_interface.c | 2 +-
> > kernel.c | 47 +++++++++++++++--
> > ppc64.c | 95 +++++++++++++++++++++++++++++++++--
> > task.c | 14 ++++++
> > tools.c | 2 +-
> > 8 files changed, 434 insertions(+), 10 deletions(-)
> >
> > --
> > 2.41.0
> >
>