Hi Tao Liu,
On Wed, Feb 21, 2024 at 02:08:45PM +0800, Tao Liu wrote:
Hi Aditya,
On Tue, Feb 20, 2024 at 2:30 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>
> Hi Tao Liu,
>
> >
> > <...snip...>
> >
> > > For now it seems to work fine on ppc64, I will go through the code
> > > changes once. By the way, can you post a rebased patch series for x86 ?
> >
> > I noticed there is some problem with the ppc64 change thread context
> > support. Currently for x86_64, we use "set <pid>" cmd to switch
the
> > current task context to be the task <pid>'s, then we use cmd
"gdb bt"
> > to view the stack unwinding of task <pid>, by which we can view
> > arbitrary task's stack unwinding.
> >
> > However for ppc64 code, there is no "get arbitrary tasks register"
> > code. E.g. in ppc64_get_cpu_reg():
> >
> > task = get_active_task(cpu);
> > tc = task_to_context(task);
> >
> > So no matter how we set any pid, the registers are always the active
> > tasks' register, instead of <pid> tasks' register.
>
> Hmm. True, I will check that, true, gdb as of now will get active tasks
> registers only, as this feature was meant for that, since getting gdb to
> assist for the crashing process was the priority.
>
> Setting context to arbitrary pid ('set <pid>') and backtrace of any
> arbitrary pid works on ppc64, so there should be a way to get those
> registers, i will go through the code once to check this.
>
> But that arbitrary backtrace will be a very good plus.
>
I made some code change[1], and currently it seems to work for stack
unwinding for arbitrary tasks.
Thanks for the patch, and the steps. I was still looking at it, I am
trying to see if there was any bug still left in my code, as I found
one, where gdb's context was not synchronised when 'set <pid>' was
done.
Also, due to a change I made in v6 or v7 of this series, I changed it to
use 'get_active_task' instead of 'CURRENT_CONTEXT'.
I will try the patch.
Thanks,
Aditya Gupta
The usage is as follows:
crash> set <pid>
or
crash> set <task>
crash> gdb bt
which will view the specific task stack unwinding.
> >
> > It would be simple to get this fixed, however the more challenging
> > part, which is difficult for me, is how to get the inactive tasks
> > registers, So we can do similar things as in
> > x86_64.c:x86_64_get_stack_frame() of
> >
https://github.com/liutgnu/crash-dev?
> >
> > I see there is a structure as task_struct -> thread_struct -> struct
> > pt_regs *regs. But sometimes the regs will give NULL value.
>
> True, the registers are NULL atleast in case of the swapper processes.
It seems to me that, in order to support stack unwinding, we don't
need to get all registers, just nip & r1 (pc reg & stack pointer reg)
are enough for gdb unwinding. Anyway, could you please give the trial
patch a try, and see if it can work for you?
>
> Thanks,
> Tao Liu
>
> [1]:
https://github.com/liutgnu/crash-dev/commit/ef1cec3400bd7619ed9d3c9c4b50a...
>
> >
> > I will update on my findings.
> >
> > Thanks,
> > Aditya Gupta
> >
> > >
> > > Thanks,
> > > Tao Liu
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > Thanks,
> > > > Aditya Gupta
> > > >
> > > > >
> > > > > Thanks,
> > > > > Tao Liu
> > > > >
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Kazu
> > > > > >
> > > > > > On 2024/01/05 16:30, Aditya Gupta wrote:
> > > > > > > The Problem:
> > > > > > > ============
> > > > > > >
> > > > > > > Currently crash is unable to show function arguments
and local variables, as
> > > > > > > gdb can do. And functionality for moving between frames
('up'/'down') is not
> > > > > > > working in crash.
> > > > > > >
> > > > > > > Crash has 'gdb passthroughs' for things gdb can
do, but the gdb passthroughs
> > > > > > > 'bt', 'frame', 'info locals',
'up', 'down' are not working either, due to
> > > > > > > gdb not getting the register values from
`crash_target::fetch_registers`,
> > > > > > > which then uses `machdep->get_cpu_reg`, which is not
implemented for PPC64
> > > > > > >
> > > > > > > Proposed Solution:
> > > > > > > ==================
> > > > > > >
> > > > > > > Fix the gdb passthroughs by implementing
"machdep->get_cpu_reg" for PPC64.
> > > > > > > This way, "gdb mode in crash" will support
this feature for both ELF and
> > > > > > > kdump-compressed vmcore formats, while "gdb"
would only have supported ELF
> > > > > > > format
> > > > > > >
> > > > > > > This way other features of 'gdb', such as
seeing
> > > > > > > backtraces/registers/variables/arguments/local
variables, moving up and
> > > > > > > down stack frames, can be used with any ppc64 vmcore,
irrespective of
> > > > > > > being ELF format or kdump-compressed format.
> > > > > > >
> > > > > > > Note: This doesn't support live debugging on ppc64,
since registers are not
> > > > > > > available to be read
> > > > > > >
> > > > > > > Implications on Architectures:
> > > > > > > ====================================
> > > > > > >
> > > > > > > No architecture other than PPC64 has been affected,
other than in case of
> > > > > > > 'frame' command
> > > > > > >
> > > > > > > As mentioned in patch #2, since frame will not be
prohibited, so it will print:
> > > > > > >
> > > > > > > crash> frame
> > > > > > > #0 <unavailable> in ?? ()
> > > > > > >
> > > > > > > Instead of before prohibited message:
> > > > > > >
> > > > > > > crash> frame
> > > > > > > crash: prohibited gdb command: frame
> > > > > > >
> > > > > > > Major change will be in 'gdb mode' on PPC64,
that it will print the frames, and
> > > > > > > local variables, instead of failing with errors showing
no frame, or showing
> > > > > > > that couldn't get PC, it will be able to give all
this information.
> > > > > > >
> > > > > > > Testing:
> > > > > > > ========
> > > > > > >
> > > > > > > Git tree with this patch series applied:
> > > > > > >
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v6
> > > > > > >
> > > > > > > To test various gdb passthroughs:
> > > > > > >
> > > > > > > (crash) set
> > > > > > > (crash) set gdb on
> > > > > > > gdb> thread
> > > > > > > gdb> bt
> > > > > > > gdb> info threads
> > > > > > > gdb> info threads
> > > > > > > gdb> info locals
> > > > > > > gdb> info variables irq_rover_lock
> > > > > > > gdb> info args
> > > > > > > gdb> thread 2
> > > > > > > gdb> set gdb off
> > > > > > > (crash) set
> > > > > > > (crash) set -c 6
> > > > > > > (crash) gdb thread
> > > > > > > (crash) bt
> > > > > > > (crash) gdb bt
> > > > > > > (crash) frame
> > > > > > > (crash) up
> > > > > > > (crash) down
> > > > > > > (crash) info locals
> > > > > > >
> > > > > > > Known Issues:
> > > > > > > =============
> > > > > > >
> > > > > > > 1. In gdb mode, 'bt' might fail to show
backtrace in few vmcores collected
> > > > > > > from older kernels. This is a known issue due to
register mismatch, and
> > > > > > > its fix has been merged upstream:
> > > > > > >
> > > > > > > This can also cause some 'invalid kernel
virtual address' errors during gdb
> > > > > > > unwinding the stack registers
> > > > > > >
> > > > > > > Commit:
https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
> > > > > > >
> > > > > > > Fixing GDB passthroughs on other architectures
> > > > > > > ==============================================
> > > > > > >
> > > > > > > Much of the work for making gdb passthroughs like
'gdb bt', 'gdb
> > > > > > > thread', 'gdb info locals' etc. has been
done by the patches introducing
> > > > > > > 'machdep->get_cpu_reg' and this series
fixing some issues in that.
> > > > > > >
> > > > > > > Other architectures should be able to fix these gdb
functionalities by
> > > > > > > simply implementing 'machdep->get_cpu_reg (cpu,
regno, ...)'.
> > > > > > >
> > > > > > > The reasoning behind that has been explained with a
diagram in commit
> > > > > > > description of patch #1
> > > > > > >
> > > > > > > I will assist with my findings/observations fixing it
on ppc64 whenever needed.
> > > > > > >
> > > > > > > Changelog:
> > > > > > > ==========
> > > > > > >
> > > > > > > V6:
> > > > > > > + changes in patch #5: fix bug introduced in v5 that
caused initial gdb thread
> > > > > > > to be thread 1
> > > > > > >
> > > > > > > V5:
> > > > > > > + changes in patch #1: made ppc64_get_cpu_reg static,
and remove unreachable
> > > > > > > code
> > > > > > > + changes in patch #3: fixed typo 'ppc64_renum'
instead of 'ppc64_regnum',
> > > > > > > remove unneeded if condition
> > > > > > > + changes in patch #5: implement refresh regcache on
per thread, instead of all
> > > > > > > threads at once
> > > > > > >
> > > > > > > V4:
> > > > > > > + fix segmentation fault in live debugging (change in
patch #1)
> > > > > > > + mention live debugging not supported in cover letter
and patch #1
> > > > > > > + fixed some checkpatch warnings (change in patch #5)
> > > > > > >
> > > > > > > V3:
> > > > > > > + default gdb thread will be the crashing thread,
instead of being
> > > > > > > thread '0'
> > > > > > > + synchronise crash cpu and gdb thread context
> > > > > > > + fix bug in gdb_interface, that replaced gdb's
output stream, losing
> > > > > > > output in some cases, such as info threads and extra
output in info
> > > > > > > variables
> > > > > > > + fix 'info threads'
> > > > > > >
> > > > > > > RFC V2:
> > > > > > > - removed patch implementing 'frame',
'up', 'down' in crash
> > > > > > > - updated the cover letter by removing the mention
of those commands other
> > > > > > > than the respective gdb passthrough
> > > > > > >
> > > > > > > Aditya Gupta (5):
> > > > > > > ppc64: correct gdb passthroughs by implementing
machdep->get_cpu_reg
> > > > > > > remove 'frame' from prohibited commands
list
> > > > > > > synchronise cpu context changes between crash/gdb
> > > > > > > fix gdb_interface: restore gdb's output streams
at end of
> > > > > > > gdb_interface
> > > > > > > fix 'info threads' command
> > > > > > >
> > > > > > > crash_target.c | 44 ++++++++++++++++
> > > > > > > defs.h | 130
+++++++++++++++++++++++++++++++++++++++++++++++-
> > > > > > > gdb-10.2.patch | 110
+++++++++++++++++++++++++++++++++++++++-
> > > > > > > gdb_interface.c | 2 +-
> > > > > > > kernel.c | 47 +++++++++++++++--
> > > > > > > ppc64.c | 95
+++++++++++++++++++++++++++++++++--
> > > > > > > task.c | 14 ++++++
> > > > > > > tools.c | 2 +-
> > > > > > > 8 files changed, 434 insertions(+), 10 deletions(-)
> > > > > > >
> > > > >
> > > >
> > >
> >
>