Hi Lianbo,
On Thu, Feb 22, 2024 at 05:16:31PM +0800, lijiang wrote:
Hi, Aditya and Tao
Thank you for working on this.
I would suggest that the current feature can be splitted into several
steps, for example:
[1] complete a basic functionality on ppc64
[2] improve it based on [1], such known issues
[3] after finishing the [1](and [2]), Tao will implement it on X86 64
That can avoid confusion(or dependency) between ppc64 and X86 64 patches,
it may save some time and make it easier to review patches.
Sure, actually I made a mistake in v9, other than that there should be
no dependency in this patch series.
This is the summary:
Patch #1: PPC64 specific
Patch #2 - #5: Architecture Independent, with patch #4 fixing an
existing bug in gdb_interface, this can be an independent patch.
So based on your suggestions, let's split into it as:
1. This patch series having Patch #1,2,3,5 (removed 4)
2. Fix for existing bug in gdb_interface: Patch #4, It will be good if
this can be merged soon, as testing 'info threads' will not work due to
the bug
3. Tao's patches for x86_64
Also, even though i have kept #2-#5 as architecture-independent, but
then they should be merged after the ppc patches, for their description
and usage to make sense, so I still kept them in the series.
Do let me know, I will send a v10 accordingly :)
Thanks,
Aditya Gupta
>
> Thanks.
> Lianbo
>
> On Thu, Feb 22, 2024 at 4:58 PM Tao Liu <ltao(a)redhat.com> wrote:
>
> > Hi Aditya,
> >
> > On Thu, Feb 22, 2024 at 1:23 PM Aditya Gupta <adityag(a)linux.ibm.com>
> > wrote:
> > >
> > > The Problem:
> > > ============
> > >
> > > Currently crash is unable to show function arguments and local
> > variables, as
> > > gdb can do. And functionality for moving between frames
('up'/'down') is
> > not
> > > working in crash.
> > >
> > > Crash has 'gdb passthroughs' for things gdb can do, but the gdb
> > passthroughs
> > > 'bt', 'frame', 'info locals', 'up',
'down' are not working either, due to
> > > gdb not getting the register values from `crash_target::fetch_registers`,
> > > which then uses `machdep->get_cpu_reg`, which is not implemented for
> > PPC64
> > >
> > > Proposed Solution:
> > > ==================
> > >
> > > Fix the gdb passthroughs by implementing
"machdep->get_cpu_reg" for
> > PPC64.
> > > This way, "gdb mode in crash" will support this feature for both
ELF and
> > > kdump-compressed vmcore formats, while "gdb" would only have
supported
> > ELF
> > > format
> > >
> > > This way other features of 'gdb', such as seeing
> > > backtraces/registers/variables/arguments/local variables, moving up and
> > > down stack frames, can be used with any ppc64 vmcore, irrespective of
> > > being ELF format or kdump-compressed format.
> > >
> > > Note: This doesn't support live debugging on ppc64, since registers
are
> > not
> > > available to be read
> > >
> > > Implications on Architectures:
> > > ====================================
> > >
> > > No architecture other than PPC64 has been affected, other than in case of
> > > 'frame' command
> > >
> > > As mentioned in patch #2, since frame will not be prohibited, so it will
> > print:
> > >
> > > crash> frame
> > > #0 <unavailable> in ?? ()
> > >
> > > Instead of before prohibited message:
> > >
> > > crash> frame
> > > crash: prohibited gdb command: frame
> > >
> > > Major change will be in 'gdb mode' on PPC64, that it will print
the
> > frames, and
> > > local variables, instead of failing with errors showing no frame, or
> > showing
> > > that couldn't get PC, it will be able to give all this information.
> > >
> > > Testing:
> > > ========
> > >
> > > Git tree with this patch series applied:
> > >
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9
> >
> > I doubt the v9 patch will not work without my x86's trial patchset. I
> > see your repo, you directly applied my "ppc64 arbitrary task stack
> > unwind support" patch onto yours. However the patch has some
> > dependency on my x86 trial patchset.
> >
> > E.g. I added a new member "bool need_free" for defs.h:struct bt_info
> > in the "x86 unwind support" patch of mine, however you didn't pick
the
> > one. And in ppc64.c:ppc64_get_stack_frame() and
> > ppc64.c:ppc64_get_cpu_reg(), the member will be used as:
> >
> > if (bt_info.need_free) {
> > FREEBUF(pt_regs);
> > bt_info.need_free = FALSE;
> > }
> >
> > So I guess (not tried yet) that your patchset v9 will not work.
> >
> > Currently I'm still struggling with some failing cases of x86_64
> > unwinding. So I didn't arrange my patchsets, along with the patch
> > commit log well, since they are all "trial" patches.
> >
> > I agree the patch "ppc64 arbitrary task stack unwind support" is
> > better to go with the ppc patch series. But I suggest we make some
> > modifications for it:
> >
> > 1) I'm OK with it being a stand alone patch, or merging the code
> > changes of this one into your previous patches, but I prefer the
> > latter one :)
> >
> > 2) If you'd like to go with a stand alone patch, could you please
> > rewrite a commit log and title for this one?
> >
> > Thanks,
> > Tao Liu
> >
> >
> > >
> > > To test various gdb passthroughs:
> > >
> > > (crash) set
> > > (crash) set gdb on
> > > gdb> thread
> > > gdb> bt
> > > gdb> info threads
> > > gdb> info threads
> > > gdb> info locals
> > > gdb> info variables irq_rover_lock
> > > gdb> info args
> > > gdb> thread 2
> > > gdb> set gdb off
> > > (crash) set
> > > (crash) set -c 6
> > > (crash) gdb thread
> > > (crash) bt
> > > (crash) gdb bt
> > > (crash) frame
> > > (crash) gdb up
> > > (crash) gdb down
> > > (crash) info locals
> > >
> > > Known Issues:
> > > =============
> > >
> > > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores
> > collected
> > > from older kernels. This is a known issue due to register mismatch,
> > and
> > > its fix has been merged upstream:
> > >
> > > This can also cause some 'invalid kernel virtual address'
errors
> > during gdb
> > > unwinding the stack registers
> > >
> > > Commit:
> >
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
> > >
> > > Fixing GDB passthroughs on other architectures
> > > ==============================================
> > >
> > > Much of the work for making gdb passthroughs like 'gdb bt',
'gdb
> > > thread', 'gdb info locals' etc. has been done by the patches
introducing
> > > 'machdep->get_cpu_reg' and this series fixing some issues in
that.
> > >
> > > Other architectures should be able to fix these gdb functionalities by
> > > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> > >
> > > The reasoning behind that has been explained with a diagram in commit
> > > description of patch #1
> > >
> > > I will assist with my findings/observations fixing it on ppc64 whenever
> > needed.
> > >
> > > Changelog:
> > > ==========
> > >
> > > V9:
> > > + minor change in patch #5: sync gdb context on a 'set' and
'set -p'
> > > + add taoliu's patch for using current context, and fixes in
> > ppc64_get_cpu_reg
> > >
> > > V8:
> > > + use get_active_task instead of depending on CURRENT_CONTEXT in
> > ppc64_get_cpu_reg
> > > + rebase to upstream/master (5977936c0a91)
> > >
> > > V7:
> > > + move changes in gdb-10.2.patch to the end (minor change in patch
> > #3,4,5)
> > > + fix a memory leak in ppc64_get_cpu_reg (minor change in patch #1)
> > > + use ascii diagram in patch #1 description
> > >
> > > V6:
> > > + changes in patch #5: fix bug introduced in v5 that caused initial gdb
> > thread
> > > to be thread 1
> > >
> > > V5:
> > > + changes in patch #1: made ppc64_get_cpu_reg static, and remove
> > unreachable
> > > code
> > > + changes in patch #3: fixed typo 'ppc64_renum' instead of
> > 'ppc64_regnum',
> > > remove unneeded if condition
> > > + changes in patch #5: implement refresh regcache on per thread, instead
> > of all
> > > threads at once
> > >
> > > V4:
> > > + fix segmentation fault in live debugging (change in patch #1)
> > > + mention live debugging not supported in cover letter and patch #1
> > > + fixed some checkpatch warnings (change in patch #5)
> > >
> > > V3:
> > > + default gdb thread will be the crashing thread, instead of being
> > > thread '0'
> > > + synchronise crash cpu and gdb thread context
> > > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> > > output in some cases, such as info threads and extra output in info
> > > variables
> > > + fix 'info threads'
> > >
> > > RFC V2:
> > > - removed patch implementing 'frame', 'up',
'down' in crash
> > > - updated the cover letter by removing the mention of those commands
> > other
> > > than the respective gdb passthrough
> > >
> > > Aditya Gupta (5):
> > > ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> > > remove 'frame' from prohibited commands list
> > > synchronise cpu context changes between crash/gdb
> > > fix gdb_interface: restore gdb's output streams at end of
> > > gdb_interface
> > > fix 'info threads' command
> > >
> > > crash_target.c | 44 ++++++++++++++++
> > > defs.h | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> > > gdb-10.2.patch | 110 +++++++++++++++++++++++++++++++++++++++-
> > > gdb_interface.c | 2 +-
> > > kernel.c | 47 +++++++++++++++--
> > > ppc64.c | 95 +++++++++++++++++++++++++++++++++--
> > > task.c | 14 ++++++
> > > tools.c | 2 +-
> > > 8 files changed, 434 insertions(+), 10 deletions(-)
> > >
> > > --
> > > 2.41.0
> > >
> >
> >