Hi Tao,

On 04/03/24 09:49, Tao Liu wrote:
Hi Aditya,

<...snip...>
Will you post your ppc series v10? Because I think my x86 series is
almost ready, so I plan to rebase mine onto your v10, and post it
upstream for discussion. Please let me know your plans. Thanks in
advance!

Yes Tao, I will post it soon today.


Thanks,

Aditya Gupta

Thanks,
Tao Liu

Thanks,
Aditya Gupta

Thanks.
Lianbo

Thanks,
Aditya Gupta

Do let me know, I will send a v10 accordingly :)

Thanks,
Aditya Gupta

Thanks.
Lianbo

On Thu, Feb 22, 2024 at 4:58 PM Tao Liu <ltao@redhat.com> wrote:

Hi Aditya,

On Thu, Feb 22, 2024 at 1:23 PM Aditya Gupta <adityag@linux.ibm.com>
wrote:
The Problem:
============

Currently crash is unable to show function arguments and local
variables, as
gdb can do. And functionality for moving between frames
('up'/'down') is
not
working in crash.

Crash has 'gdb passthroughs' for things gdb can do, but the gdb
passthroughs
'bt', 'frame', 'info locals', 'up', 'down' are not working either,
due to
gdb not getting the register values from
`crash_target::fetch_registers`,
which then uses `machdep->get_cpu_reg`, which is not implemented
for
PPC64
Proposed Solution:
==================

Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for
PPC64.
This way, "gdb mode in crash" will support this feature for both
ELF and
kdump-compressed vmcore formats, while "gdb" would only have
supported
ELF
format

This way other features of 'gdb', such as seeing
backtraces/registers/variables/arguments/local variables, moving
up and
down stack frames, can be used with any ppc64 vmcore, irrespective
of
being ELF format or kdump-compressed format.

Note: This doesn't support live debugging on ppc64, since
registers are
not
available to be read

Implications on Architectures:
====================================

No architecture other than PPC64 has been affected, other than in
case of
'frame' command

As mentioned in patch #2, since frame will not be prohibited, so
it will
print:
        crash> frame
        #0  <unavailable> in ?? ()

Instead of before prohibited message:

        crash> frame
        crash: prohibited gdb command: frame

Major change will be in 'gdb mode' on PPC64, that it will print the
frames, and
local variables, instead of failing with errors showing no frame,
or
showing
that couldn't get PC, it will be able to give all this information.

Testing:
========

Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9
I doubt the v9 patch will not work without my x86's trial patchset. I
see your repo, you directly applied my "ppc64 arbitrary task stack
unwind support" patch onto yours. However the patch has some
dependency on my x86 trial patchset.

E.g. I added a new member "bool need_free" for defs.h:struct bt_info
in the "x86 unwind support" patch of mine, however you didn't pick
the
one. And in ppc64.c:ppc64_get_stack_frame() and
ppc64.c:ppc64_get_cpu_reg(), the member will be used as:

if (bt_info.need_free) {
     FREEBUF(pt_regs);
    bt_info.need_free = FALSE;
}

So I guess (not tried yet) that your patchset v9 will not work.

Currently I'm still struggling with some failing cases of x86_64
unwinding. So I didn't arrange my patchsets, along with the patch
commit log well, since they are all "trial" patches.

I agree the patch "ppc64 arbitrary task stack unwind support" is
better to go with the ppc patch series. But I suggest we make some
modifications for it:

1) I'm OK with it being a stand alone patch, or merging the code
changes of this one into your previous patches, but I prefer the
latter one :)

2) If you'd like to go with a stand alone patch, could you please
rewrite a commit log and title for this one?

Thanks,
Tao Liu


To test various gdb passthroughs:

        (crash) set
        (crash) set gdb on
        gdb> thread
        gdb> bt
        gdb> info threads
        gdb> info threads
        gdb> info locals
        gdb> info variables irq_rover_lock
        gdb> info args
        gdb> thread 2
        gdb> set gdb off
        (crash) set
        (crash) set -c 6
        (crash) gdb thread
        (crash) bt
        (crash) gdb bt
        (crash) frame
        (crash) gdb up
        (crash) gdb down
        (crash) info locals

Known Issues:
=============

1. In gdb mode, 'bt' might fail to show backtrace in few vmcores
collected
   from older kernels. This is a known issue due to register
mismatch,
and
   its fix has been merged upstream:

   This can also cause some 'invalid kernel virtual address' errors
during gdb
   unwinding the stack registers

Commit:

                
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
Fixing GDB passthroughs on other architectures
==============================================

Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
thread', 'gdb info locals' etc. has been done by the patches
introducing
'machdep->get_cpu_reg' and this series fixing some issues in that.

Other architectures should be able to fix these gdb
functionalities by
simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.

The reasoning behind that has been explained with a diagram in
commit
description of patch #1

I will assist with my findings/observations fixing it on ppc64
whenever
needed.
Changelog:
==========

V9:
+ minor change in patch #5: sync gdb context on a 'set' and 'set
-p'
+ add taoliu's patch for using current context, and fixes in
ppc64_get_cpu_reg
V8:
+ use get_active_task instead of depending on CURRENT_CONTEXT in
ppc64_get_cpu_reg
+ rebase to upstream/master (5977936c0a91)

V7:
+ move changes in gdb-10.2.patch to the end (minor change in patch
#3,4,5)
+ fix a memory leak in ppc64_get_cpu_reg (minor change in patch #1)
+ use ascii diagram in patch #1 description

V6:
+ changes in patch #5: fix bug introduced in v5 that caused
initial gdb
thread
  to be thread 1

V5:
+ changes in patch #1: made ppc64_get_cpu_reg static, and remove
unreachable
  code
+ changes in patch #3: fixed typo 'ppc64_renum' instead of
'ppc64_regnum',
  remove unneeded if condition
+ changes in patch #5: implement refresh regcache on per thread,
instead
of all
  threads at once

V4:
+ fix segmentation fault in live debugging (change in patch #1)
+ mention live debugging not supported in cover letter and patch #1
+ fixed some checkpatch warnings (change in patch #5)

V3:
+ default gdb thread will be the crashing thread, instead of being
  thread '0'
+ synchronise crash cpu and gdb thread context
+ fix bug in gdb_interface, that replaced gdb's output stream,
losing
  output in some cases, such as info threads and extra output in
info
  variables
+ fix 'info threads'

RFC V2:
  - removed patch implementing 'frame', 'up', 'down' in crash
  - updated the cover letter by removing the mention of those
commands
other
        than the respective gdb passthrough

Aditya Gupta (5):
  ppc64: correct gdb passthroughs by implementing
machdep->get_cpu_reg
  remove 'frame' from prohibited commands list
  synchronise cpu context changes between crash/gdb
  fix gdb_interface: restore gdb's output streams at end of
    gdb_interface
  fix 'info threads' command

 crash_target.c  |  44 ++++++++++++++++
 defs.h          | 130
+++++++++++++++++++++++++++++++++++++++++++++++-
 gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
 gdb_interface.c |   2 +-
 kernel.c        |  47 +++++++++++++++--
 ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
 task.c          |  14 ++++++
 tools.c         |   2 +-
 8 files changed, 434 insertions(+), 10 deletions(-)

--
2.41.0