Hi Aditya,
On Fri, Mar 1, 2024 at 5:53 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
Hi Lianbo and Tao,
On Tue, Feb 27, 2024 at 11:27:32AM +0800, lijiang wrote:
> Hi, Aditya
> On Mon, Feb 26, 2024 at 7:12 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
> > Hi Lianbo,
> > On Thu, Feb 22, 2024 at 03:59:36PM +0530, Aditya Gupta wrote:
> > > Hi Lianbo,
> > > On Thu, Feb 22, 2024 at 05:16:31PM +0800, lijiang wrote:
> > > > Hi, Aditya and Tao
> > > > Thank you for working on this.
> > > >
> > > > I would suggest that the current feature can be splitted into
several
> > > > steps, for example:
> > > > [1] complete a basic functionality on ppc64
> > > > [2] improve it based on [1], such known issues
> > > > [3] after finishing the [1](and [2]), Tao will implement it on X86
64
> > > >
> > > > That can avoid confusion(or dependency) between ppc64 and X86 64
> > patches,
> > > > it may save some time and make it easier to review patches.
> > >
> > > Sure, actually I made a mistake in v9, other than that there should be
> > > no dependency in this patch series.
> > > This is the summary:
> > >
> > > Patch #1: PPC64 specific
> > > Patch #2 - #5: Architecture Independent, with patch #4 fixing an
> > > existing bug in gdb_interface, this can be an independent patch.
> > >
> > > So based on your suggestions, let's split into it as:
> > >
> > > 1. This patch series having Patch #1,2,3,5 (removed 4)
> > > 2. Fix for existing bug in gdb_interface: Patch #4, It will be good if
> > > this can be merged soon, as testing 'info threads' will not work
due to
> > > the bug
> > > 3. Tao's patches for x86_64
> > >
> > > Also, even though i have kept #2-#5 as architecture-independent, but
> > > then they should be merged after the ppc patches, for their description
> > > and usage to make sense, so I still kept them in the series.
> >
> > Any comments on this ?
> >
> > I am still of the view it makes sense for all 5 patches to go in as a
> > single series, since independently they might not make much sense (such
> > as, even though patch #4 fixes an existing bug, but info threads will
> > not hit the bug unless the architecture supports gdb mode unwinding).
> >
> > But if splitting it makes more sense, I am okay with it.
> >
> >
> Sorry about this, I should explain it in detail.
>
> Earlier I saw you were discussing with Tao about the following changes:
>
> "E.g. I added a new member "bool need_free" for defs.h:struct
bt_info
> in the "x86 unwind support" patch of mine, however you didn't pick
the
> one. And in ppc64.c:ppc64_get_stack_frame() and
> ppc64.c:ppc64_get_cpu_reg(), the member will be used as:
>
> if (bt_info.need_free) {
> FREEBUF(pt_regs);
> bt_info.need_free = FALSE;
> }
> "
>
> So I would suggest avoiding confusion between ppc64 and x86 64 patches, and
> split into several steps. Also do not put X86 64 changes into ppc64 series.
Sorry for the confusion, I will keep the patch series to how it was at
v8, plus fix tao did in that. That will require 'need_free' part tao
added in patch #1, other than that I will not add any x86 changes into
ppc64.
Will you post your ppc series v10? Because I think my x86 series is
almost ready, so I plan to rebase mine onto your v10, and post it
upstream for discussion. Please let me know your plans. Thanks in
advance!
Thanks,
Tao Liu
Thanks,
Aditya Gupta
>
> Thanks.
> Lianbo
>
> Thanks,
> > Aditya Gupta
> >
> > >
> > > Do let me know, I will send a v10 accordingly :)
> > >
> > > Thanks,
> > > Aditya Gupta
> > >
> > > >
> > > > Thanks.
> > > > Lianbo
> > > >
> > > > On Thu, Feb 22, 2024 at 4:58 PM Tao Liu <ltao(a)redhat.com>
wrote:
> > > >
> > > > > Hi Aditya,
> > > > >
> > > > > On Thu, Feb 22, 2024 at 1:23 PM Aditya Gupta
<adityag(a)linux.ibm.com>
> > > > > wrote:
> > > > > >
> > > > > > The Problem:
> > > > > > ============
> > > > > >
> > > > > > Currently crash is unable to show function arguments and
local
> > > > > variables, as
> > > > > > gdb can do. And functionality for moving between frames
> > ('up'/'down') is
> > > > > not
> > > > > > working in crash.
> > > > > >
> > > > > > Crash has 'gdb passthroughs' for things gdb can do,
but the gdb
> > > > > passthroughs
> > > > > > 'bt', 'frame', 'info locals',
'up', 'down' are not working either,
> > due to
> > > > > > gdb not getting the register values from
> > `crash_target::fetch_registers`,
> > > > > > which then uses `machdep->get_cpu_reg`, which is not
implemented
> > for
> > > > > PPC64
> > > > > >
> > > > > > Proposed Solution:
> > > > > > ==================
> > > > > >
> > > > > > Fix the gdb passthroughs by implementing
"machdep->get_cpu_reg" for
> > > > > PPC64.
> > > > > > This way, "gdb mode in crash" will support this
feature for both
> > ELF and
> > > > > > kdump-compressed vmcore formats, while "gdb"
would only have
> > supported
> > > > > ELF
> > > > > > format
> > > > > >
> > > > > > This way other features of 'gdb', such as seeing
> > > > > > backtraces/registers/variables/arguments/local variables,
moving
> > up and
> > > > > > down stack frames, can be used with any ppc64 vmcore,
irrespective
> > of
> > > > > > being ELF format or kdump-compressed format.
> > > > > >
> > > > > > Note: This doesn't support live debugging on ppc64,
since
> > registers are
> > > > > not
> > > > > > available to be read
> > > > > >
> > > > > > Implications on Architectures:
> > > > > > ====================================
> > > > > >
> > > > > > No architecture other than PPC64 has been affected, other
than in
> > case of
> > > > > > 'frame' command
> > > > > >
> > > > > > As mentioned in patch #2, since frame will not be
prohibited, so
> > it will
> > > > > print:
> > > > > >
> > > > > > crash> frame
> > > > > > #0 <unavailable> in ?? ()
> > > > > >
> > > > > > Instead of before prohibited message:
> > > > > >
> > > > > > crash> frame
> > > > > > crash: prohibited gdb command: frame
> > > > > >
> > > > > > Major change will be in 'gdb mode' on PPC64, that
it will print the
> > > > > frames, and
> > > > > > local variables, instead of failing with errors showing no
frame,
> > or
> > > > > showing
> > > > > > that couldn't get PC, it will be able to give all this
information.
> > > > > >
> > > > > > Testing:
> > > > > > ========
> > > > > >
> > > > > > Git tree with this patch series applied:
> > > > > >
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9
> > > > >
> > > > > I doubt the v9 patch will not work without my x86's trial
patchset. I
> > > > > see your repo, you directly applied my "ppc64 arbitrary
task stack
> > > > > unwind support" patch onto yours. However the patch has
some
> > > > > dependency on my x86 trial patchset.
> > > > >
> > > > > E.g. I added a new member "bool need_free" for
defs.h:struct bt_info
> > > > > in the "x86 unwind support" patch of mine, however you
didn't pick
> > the
> > > > > one. And in ppc64.c:ppc64_get_stack_frame() and
> > > > > ppc64.c:ppc64_get_cpu_reg(), the member will be used as:
> > > > >
> > > > > if (bt_info.need_free) {
> > > > > FREEBUF(pt_regs);
> > > > > bt_info.need_free = FALSE;
> > > > > }
> > > > >
> > > > > So I guess (not tried yet) that your patchset v9 will not work.
> > > > >
> > > > > Currently I'm still struggling with some failing cases of
x86_64
> > > > > unwinding. So I didn't arrange my patchsets, along with the
patch
> > > > > commit log well, since they are all "trial" patches.
> > > > >
> > > > > I agree the patch "ppc64 arbitrary task stack unwind
support" is
> > > > > better to go with the ppc patch series. But I suggest we make
some
> > > > > modifications for it:
> > > > >
> > > > > 1) I'm OK with it being a stand alone patch, or merging the
code
> > > > > changes of this one into your previous patches, but I prefer
the
> > > > > latter one :)
> > > > >
> > > > > 2) If you'd like to go with a stand alone patch, could you
please
> > > > > rewrite a commit log and title for this one?
> > > > >
> > > > > Thanks,
> > > > > Tao Liu
> > > > >
> > > > >
> > > > > >
> > > > > > To test various gdb passthroughs:
> > > > > >
> > > > > > (crash) set
> > > > > > (crash) set gdb on
> > > > > > gdb> thread
> > > > > > gdb> bt
> > > > > > gdb> info threads
> > > > > > gdb> info threads
> > > > > > gdb> info locals
> > > > > > gdb> info variables irq_rover_lock
> > > > > > gdb> info args
> > > > > > gdb> thread 2
> > > > > > gdb> set gdb off
> > > > > > (crash) set
> > > > > > (crash) set -c 6
> > > > > > (crash) gdb thread
> > > > > > (crash) bt
> > > > > > (crash) gdb bt
> > > > > > (crash) frame
> > > > > > (crash) gdb up
> > > > > > (crash) gdb down
> > > > > > (crash) info locals
> > > > > >
> > > > > > Known Issues:
> > > > > > =============
> > > > > >
> > > > > > 1. In gdb mode, 'bt' might fail to show backtrace
in few vmcores
> > > > > collected
> > > > > > from older kernels. This is a known issue due to
register
> > mismatch,
> > > > > and
> > > > > > its fix has been merged upstream:
> > > > > >
> > > > > > This can also cause some 'invalid kernel virtual
address' errors
> > > > > during gdb
> > > > > > unwinding the stack registers
> > > > > >
> > > > > > Commit:
> > > > >
> >
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
> > > > > >
> > > > > > Fixing GDB passthroughs on other architectures
> > > > > > ==============================================
> > > > > >
> > > > > > Much of the work for making gdb passthroughs like 'gdb
bt', 'gdb
> > > > > > thread', 'gdb info locals' etc. has been done
by the patches
> > introducing
> > > > > > 'machdep->get_cpu_reg' and this series fixing
some issues in that.
> > > > > >
> > > > > > Other architectures should be able to fix these gdb
> > functionalities by
> > > > > > simply implementing 'machdep->get_cpu_reg (cpu,
regno, ...)'.
> > > > > >
> > > > > > The reasoning behind that has been explained with a diagram
in
> > commit
> > > > > > description of patch #1
> > > > > >
> > > > > > I will assist with my findings/observations fixing it on
ppc64
> > whenever
> > > > > needed.
> > > > > >
> > > > > > Changelog:
> > > > > > ==========
> > > > > >
> > > > > > V9:
> > > > > > + minor change in patch #5: sync gdb context on a
'set' and 'set
> > -p'
> > > > > > + add taoliu's patch for using current context, and
fixes in
> > > > > ppc64_get_cpu_reg
> > > > > >
> > > > > > V8:
> > > > > > + use get_active_task instead of depending on
CURRENT_CONTEXT in
> > > > > ppc64_get_cpu_reg
> > > > > > + rebase to upstream/master (5977936c0a91)
> > > > > >
> > > > > > V7:
> > > > > > + move changes in gdb-10.2.patch to the end (minor change
in patch
> > > > > #3,4,5)
> > > > > > + fix a memory leak in ppc64_get_cpu_reg (minor change in
patch #1)
> > > > > > + use ascii diagram in patch #1 description
> > > > > >
> > > > > > V6:
> > > > > > + changes in patch #5: fix bug introduced in v5 that
caused
> > initial gdb
> > > > > thread
> > > > > > to be thread 1
> > > > > >
> > > > > > V5:
> > > > > > + changes in patch #1: made ppc64_get_cpu_reg static, and
remove
> > > > > unreachable
> > > > > > code
> > > > > > + changes in patch #3: fixed typo 'ppc64_renum'
instead of
> > > > > 'ppc64_regnum',
> > > > > > remove unneeded if condition
> > > > > > + changes in patch #5: implement refresh regcache on per
thread,
> > instead
> > > > > of all
> > > > > > threads at once
> > > > > >
> > > > > > V4:
> > > > > > + fix segmentation fault in live debugging (change in patch
#1)
> > > > > > + mention live debugging not supported in cover letter and
patch #1
> > > > > > + fixed some checkpatch warnings (change in patch #5)
> > > > > >
> > > > > > V3:
> > > > > > + default gdb thread will be the crashing thread, instead
of being
> > > > > > thread '0'
> > > > > > + synchronise crash cpu and gdb thread context
> > > > > > + fix bug in gdb_interface, that replaced gdb's output
stream,
> > losing
> > > > > > output in some cases, such as info threads and extra
output in
> > info
> > > > > > variables
> > > > > > + fix 'info threads'
> > > > > >
> > > > > > RFC V2:
> > > > > > - removed patch implementing 'frame',
'up', 'down' in crash
> > > > > > - updated the cover letter by removing the mention of
those
> > commands
> > > > > other
> > > > > > than the respective gdb passthrough
> > > > > >
> > > > > > Aditya Gupta (5):
> > > > > > ppc64: correct gdb passthroughs by implementing
> > machdep->get_cpu_reg
> > > > > > remove 'frame' from prohibited commands list
> > > > > > synchronise cpu context changes between crash/gdb
> > > > > > fix gdb_interface: restore gdb's output streams at
end of
> > > > > > gdb_interface
> > > > > > fix 'info threads' command
> > > > > >
> > > > > > crash_target.c | 44 ++++++++++++++++
> > > > > > defs.h | 130
> > +++++++++++++++++++++++++++++++++++++++++++++++-
> > > > > > gdb-10.2.patch | 110
+++++++++++++++++++++++++++++++++++++++-
> > > > > > gdb_interface.c | 2 +-
> > > > > > kernel.c | 47 +++++++++++++++--
> > > > > > ppc64.c | 95 +++++++++++++++++++++++++++++++++--
> > > > > > task.c | 14 ++++++
> > > > > > tools.c | 2 +-
> > > > > > 8 files changed, 434 insertions(+), 10 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.41.0
> > > > > >
> > > > >
> > > > >
> >
> >