Hi Tao,
On Wed, Mar 20, 2024 at 07:39:12PM +0800, Tao Liu wrote:
Hi Aditya & Alexey,
On Wed, Mar 20, 2024 at 1:57 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>
> Hi Tao,
>
> > From my view, I think we can keep the current "gdb thread/CPUs = nr_cpus", or
> > be more ambitious, leaving "gdb thread/CPUs = 1", in case there are
> > systems with thousands of cpus, and we only need this in task.c:
>
> Nice idea, it might break some expectations that 'info threads' list
> active CPUs, but I guess that was anyways not of much help earlier.
>
> It also simplifies many things, and removes much of the synchronisation
> between gdb <-> crash. If this happens, I guess we should just make
> 'info threads' behave like 'thread', i.e. just show the current frame, as
> "CPU 0" might not make sense.
>
Agreed, I have updated the "one gdb thread patch" in [1], along with a
trial patch of arm64 stack unwinding support. I'm not very familiar
with the synchronisation between gdb <-> crash, so I am not sure whether
I have cleaned it up completely or left some bugs behind. Could you please
test it to see if it works as expected? You can ignore the arm64 patch
when testing.
It works for me, in both live debug and offline debug.
Regarding synchronisation, no issues. I had some synchronisation code
in patch #4; I will remove it, or trim the patch down to drop the unneeded
portions, later.
Minor nitpick: the change to `pid_to_str` causes the 'gdb thread' command to
show something like this:
crash> gdb thread
[Current thread is 1 ( 1 systemd)]
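The extra space inside the parentheses presumably comes from the "%7ld"
field width in the patched pid_to_str, which pads the PID for the
'info threads' table. A minimal sketch of the two formats follows;
format_task_label is a hypothetical helper, not a crash or gdb function,
and it uses %lu because the PID is an unsigned long:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of crash or gdb): build the "PID COMM"
 * label that pid_to_str returns in the patch. 'padded' selects the
 * column-aligned "%7lu" form; otherwise the PID is printed unpadded.
 * Returns the snprintf() result (length of the formatted label).
 */
static int format_task_label(char *buf, size_t len, int padded,
                             unsigned long pid, const char *comm)
{
    assert(buf != NULL && len > 0);

    if (comm == NULL)
        comm = "?";   /* crash_get_task_info() can leave comm as NULL */

    if (padded)
        return snprintf(buf, len, "%7lu %s", pid, comm);
    return snprintf(buf, len, "%lu %s", pid, comm);
}
```

With pid 1 and comm "systemd", the padded form yields six spaces before
the "1", the unpadded form yields "1 systemd". One option would be to keep
the padding for 'info threads' only, if that distinction can be made at
the call site.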
Thanks,
Aditya Gupta
>
> [1]: https://github.com/liutgnu/crash-dev/tree/one-thread-v2
>
> > The idea seems nice to me too. Amazing. :)
>
> Thanks! :)
>
> >
> > >
> > > + gdb_refresh_regcache(0);
> > > - for (int i = 0; i < get_cpus_online(); i++)
> > > - gdb_refresh_regcache(i);
> >
> > Also, with Alexey's change of moving 'crash_target_init' to a later
> > time, even 'gdb_refresh_regcache(0)' might not be needed. It was needed
> > earlier due to crash_target_init being called before any architecture
> > gets a chance to set 'machdep->get_cpu_reg', and hence initialising
> > registers for CPU 0 had failed, 'gdb_refresh_regcache(0)' just caused it
> > to fetch registers again, after architectures have set
> > 'machdep->get_cpu_reg'.
>
> Thanks, I will get it removed.
>
> Thanks,
> Tao Liu
>
> > Haven't tested this yet.
> >
> > Thanks,
> > Aditya Gupta
> >
> > >
> > > Do you have any thoughts?
> > >
> > > [1]: https://github.com/liutgnu/crash-dev/commits/one-thread
> > >
> > > Thanks,
> > > Tao Liu
> > >
> > > > Thanks,
> > > > Tao Liu
> > > >
> > > >
> > > > >
> > > > > I can think of solving it in two ways:
> > > > > 1. Having an array in crash_target, which tells what CPU in gdb maps to
> > > > > which task in crash. This can be done without requiring changes to
> > > > > 'info threads', while we do the 'add_thread_silent'.
> > > > > 2. Having all tasks in gdb itself, so it's one-to-one, and `set_cpu`
> > > > > will then refer to the task ID. I will have to explore this more.
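
A rough sketch of what the first option (a lookup table from gdb thread
slot to crash task) could look like. Everything here is hypothetical:
the struct, the function names and the fixed pool size are illustrative
assumptions, not code from crash_target.c, and the table is keyed by PID
only for simplicity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GDB_THREADS 8   /* hypothetical fixed pool of gdb threads */

/* Hypothetical table: slot i holds the PID of the task gdb thread i shows. */
struct thread_task_map {
    unsigned long pid[MAX_GDB_THREADS];
    int in_use[MAX_GDB_THREADS];
};

static void map_init(struct thread_task_map *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < MAX_GDB_THREADS; i++) {
        m->pid[i] = 0;
        m->in_use[i] = 0;
    }
}

/* Bind a gdb thread slot to a task. Returns 1 on success, 0 if out of range. */
static int map_bind(struct thread_task_map *m, int gdb_thread,
                    unsigned long pid)
{
    if (gdb_thread < 0 || gdb_thread >= MAX_GDB_THREADS)
        return 0;
    m->pid[gdb_thread] = pid;
    m->in_use[gdb_thread] = 1;
    return 1;
}

/* Look up the task bound to a gdb thread slot. Returns 1 if one is bound. */
static int map_lookup(const struct thread_task_map *m, int gdb_thread,
                      unsigned long *pid)
{
    if (gdb_thread < 0 || gdb_thread >= MAX_GDB_THREADS ||
        !m->in_use[gdb_thread])
        return 0;
    *pid = m->pid[gdb_thread];
    return 1;
}
```

A register fetch would then translate the gdb thread id through
map_lookup() before asking crash for the task's registers.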
> > > > >
> > > > > Any comments ?
> > > > >
> > > > > Thanks,
> > > > > Aditya Gupta
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > --Alexey
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 14, 2024 at 9:22 PM Tao Liu <ltao(a)redhat.com> wrote:
> > > > > >
> > > > > > > Hi Alexey,
> > > > > > >
> > > > > > > On Thu, Mar 14, 2024 at 6:29 PM Alexey Makhalov
> > > > > > > <alexey.makhalov(a)broadcom.com> wrote:
> > > > > > > >
> > > > > > > > Support for GDB debugging of all tasks, active and inactive.
> > > > > > > > Before this commit only active tasks were listed by "info threads"
> > > > > > > > with "CPU #" as a Target Id.
> > > > > > > >
> > > > > > > > "info threads" will now show all tasks, similar to "ps", example:
> > > > > > > > crash> info threads
> > > > > > > > Id Target Id Frame
> > > > > > > > * 1 0 swapper/0 0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
> > > > > > > > 2 0 swapper/1 0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
> > > > > > > > 3 0 swapper/2 0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
> > > > > > > > 4 0 swapper/3 0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
> > > > > > > > 5 0 swapper/4 0xffffffffadb97292 in context_switch (rf=0xffffbaf0000f3e88, next=0xffff9ecb04908000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > ...
> > > > > > > > 730 970325 taskset 0xffffffffadb97292 in context_switch (rf=0xffffbaf006a0fd18, next=0xffff9ecb0aec0000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > 731 975217 sleep 0xffffffffadb97292 in context_switch (rf=0xffffbaf005743c20, next=0xffff9ecac0692880, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > 732 975228 sleep 0xffffffffadb97292 in context_switch (rf=0xffffbaf00696fb58, next=0xffff9ecac0690000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > ...
> > > > > > > > 876 976084 docker 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153dbd10, next=0xffff9ecac0645100, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > 877 976085 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153cbc58, next=0xffff9ecac0645100, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > 878 976086 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153e3c58, next=0xffffffffaec15a40 <init_task>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > 879 976087 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153ebc58, next=0xffffffffaec15a40 <init_task>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > > > > Where "Target ID" contains "PID COMM" of the task
> > > > > > > >
> > > > > > >
> > > > > > > I see that we chose different technical paths for viewing arbitrary
> > > > > > > tasks. You improved "info threads" as an alternative to the crash
> > > > > > > "ps" cmd, and use "thread X" to switch to task X and view its
> > > > > > > stack trace. Mine, however, uses the crash "ps" cmd first to get a
> > > > > > > specific task's pid or task struct, then uses the crash cmd
> > > > > > > "set <pid>" or "set <task_struct>" to switch to task X and view
> > > > > > > its stack trace.
> > > > > > >
> > > > > > > I notice there is one problem with the "info threads as ps" path:
> > > > > > > it is difficult to handle live debug, see the comment below:
> > > > > > >
> > > > > > > > Example of "731 975217 sleep" debugging, real case, trying to
> > > > > > > > figure out why sleep was stuck in uninterruptible sleep.
> > > > > > > > Backtrace using crash:
> > > > > > > > crash> ps | grep 975217
> > > > > > > > 975217 969797 3 ffff9ecb3956a880 UN 0.0
0 0
> > > > > > > sleep
> > > > > > > > crash> bt 975217
> > > > > > > > PID: 975217 TASK: ffff9ecb3956a880 CPU: 3
COMMAND: "sleep"
> > > > > > > > #0 [ffffbaf005743ba0] __schedule at
ffffffffadb97292
> > > > > > > > #1 [ffffbaf005743c60] schedule at
ffffffffadb982b8
> > > > > > > > #2 [ffffbaf005743c80] rwbase_write_lock at
ffffffffadb9aed7
> > > > > > > > #3 [ffffbaf005743cc0] down_write at
ffffffffadb9b133
> > > > > > > > #4 [ffffbaf005743cd0] unlink_file_vma at
ffffffffad2b0e2e
> > > > > > > > #5 [ffffbaf005743cf8] free_pgtables at
ffffffffad2a47b0
> > > > > > > > #6 [ffffbaf005743d88] exit_mmap at
ffffffffad2b3b8d
> > > > > > > > #7 [ffffbaf005743e80] mmput at ffffffffad08c81f
> > > > > > > > #8 [ffffbaf005743e98] do_exit at
ffffffffad09636c
> > > > > > > > #9 [ffffbaf005743ef8] do_group_exit at
ffffffffad096c78
> > > > > > > > RIP: 00007f111c70ddf9 RSP: 00007fff451817e8
RFLAGS: 00000246
> > > > > > > > RAX: ffffffffffffffda RBX: 00007f111c8089e0
RCX: 00007f111c70ddf9
> > > > > > > > RDX: 000000000000003c RSI: 00000000000000e7
RDI: 0000000000000000
> > > > > > > > RBP: 0000000000000000 R8: ffffffffffffff80
R9: 0000000000000000
> > > > > > > > R10: 00007fff451817b0 R11: 0000000000000246
R12: 00007f111c8089e0
> > > > > > > > R13: 00007f111c80e2e0 R14: 0000000000000002
R15: 00007f111c80e2c8
> > > > > > > > ORIG_RAX: 00000000000000e7 CS: 0033 SS:
002b
> > > > > > > >
> > > > > > > > Backtrace using gdb (pay attention, task must be selected by thread Id):
> > > > > > > > crash> thread 731
> > > > > > > > [Switching to thread 731 ( 975217 sleep)]
> > > > > > > > 5372 switch_to(prev, next, prev);
> > > > > > > > crash> gdb bt
> > > > > > > > #0 0xffffffffadb97292 in context_switch
(rf=0xffffbaf005743c20,
> > > > > > > next=0xffff9ecac0692880, prev=<optimized out>,
rq=<optimized out>) at
> > > > > > > kernel/sched/core.c:5372
> > > > > > > > #1 __schedule (sched_mode=sched_mode@entry=0)
at
> > > > > > > kernel/sched/core.c:6696
> > > > > > > > #2 0xffffffffadb982b8 in schedule () at
kernel/sched/core.c:6772
> > > > > > > > #3 0xffffffffadb9aed7 in rwbase_write_lock
(rwb=rwb@entry=0xffff9ecaf1831430,
> > > > > > > state=state@entry=2) at kernel/locking/rwbase_rt.c:259
> > > > > > > > #4 0xffffffffadb9b133 in __down_write
(sem=sem@entry=0xffff9ecaf1831430)
> > > > > > > at kernel/locking/rwsem.c:1474
> > > > > > > > #5 down_write (sem=sem@entry=0xffff9ecaf1831430)
at
> > > > > > > kernel/locking/rwsem.c:1574
> > > > > > > > #6 0xffffffffad2b0e2e in i_mmap_lock_write
(mapping=<optimized out>)
> > > > > > > at ./include/linux/fs.h:466
> > > > > > > > #7 unlink_file_vma
(vma=vma@entry=0xffff9ecadd566090) at mm/mmap.c:127
> > > > > > > > #8 0xffffffffad2a47b0 in free_pgtables
(tlb=tlb@entry=0xffffbaf005743dd0,
> > > > > > > mt=mt@entry=0xffff9ecb28e3b180, vma=0xffff9ecadd566090,
vma@entry=0xffff9ecadd566000,
> > > > > > > floor=floor@entry=0, ceiling=ceiling@entry=0) at
mm/memory.c:431
> > > > > > > > #9 0xffffffffad2b3b8d in exit_mmap
(mm=mm@entry=0xffff9ecb28e3b180)
> > > > > > > at mm/mmap.c:3237
> > > > > > > > #10 0xffffffffad08c81f in __mmput
(mm=0xffff9ecb28e3b180) at
> > > > > > > kernel/fork.c:1204
> > > > > > > > #11 mmput (mm=mm@entry=0xffff9ecb28e3b180) at
kernel/fork.c:1226
> > > > > > > > #12 0xffffffffad09636c in exit_mm () at
kernel/exit.c:563
> > > > > > > > #13 do_exit (code=code@entry=0) at
kernel/exit.c:856
> > > > > > > > #14 0xffffffffad096c78 in do_group_exit
(exit_code=0) at
> > > > > > > kernel/exit.c:1019
> > > > > > > > #15 0xffffffffad096cf8 in __do_sys_exit_group
(error_code=<optimized
> > > > > > > out>) at kernel/exit.c:1030
> > > > > > > > #16 __se_sys_exit_group (error_code=<optimized
out>) at
> > > > > > > kernel/exit.c:1028
> > > > > > > > #17 __x64_sys_exit_group (regs=<optimized
out>) at kernel/exit.c:1028
> > > > > > > > #18 0xffffffffadb8a327 in do_syscall_x64
(nr=<optimized out>,
> > > > > > > regs=0xffffbaf005743f58) at arch/x86/entry/common.c:51
> > > > > > > > #19 do_syscall_64 (regs=0xffffbaf005743f58,
nr=<optimized out>) at
> > > > > > > arch/x86/entry/common.c:81
> > > > > > > > #20 0xffffffffadc000dc in entry_SYSCALL_64 () at
> > > > > > > arch/x86/entry/entry_64.S:120
> > > > > > > > #21 0x00007f111c80e2c8 in ?? ()
> > > > > > > > #22 0x0000000000000002 in ?? ()
> > > > > > > > #23 0x00007f111c80e2e0 in ?? ()
> > > > > > > > #24 0x00007f111c8089e0 in ?? ()
> > > > > > > > #25 0x0000000000000000 in ?? ()
> > > > > > > > crash> f 3
> > > > > > > > 259 rwbase_schedule();
> > > > > > > > crash> p *rwb
> > > > > > > > $1 = {
> > > > > > > > readers = {
> > > > > > > > counter = 1
> > > > > > > > },
> > > > > > > > rtmutex = {
> > > > > > > > wait_lock = {
> > > > > > > > raw_lock = {
> > > > > > > > {
> > > > > > > > val = {
> > > > > > > > counter = 0
> > > > > > > > },
> > > > > > > > {
> > > > > > > > locked = 0 '\000',
> > > > > > > > pending = 0 '\000'
> > > > > > > > },
> > > > > > > > {
> > > > > > > > locked_pending = 0,
> > > > > > > > tail = 0
> > > > > > > > }
> > > > > > > > }
> > > > > > > > }
> > > > > > > > },
> > > > > > > > waiters = {
> > > > > > > > rb_root = {
> > > > > > > > rb_node = 0xffffbaf006977be0
> > > > > > > > },
> > > > > > > > rb_leftmost = 0xffffbaf00696fbe0
> > > > > > > > },
> > > > > > > > owner = 0xffff9ecb3956a881
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > Additional changes:
> > > > > > > > 1. Allow gdb "frame" command.
> > > > > > > > 2. Blacklist useless gdb "gcore" command. Use gcore plugin instead.
> > > > > > > > 3. Move crash_target_init() to later time as crash target requires a
> > > > > > > > list of tasks to be initialized.
> > > > > > > >
> > > > > > > > Known issues and TBD items:
> > > > > > > > 1. "info threads" may bail out the first time, throwing errors while
> > > > > > > > trying to access a userspace address during the unwind process.
> > > > > > > > Subsequent "info threads" invocations run without issues.
> > > > > > > > 2. To unwind the stack of an inactive task, only modern Linux versions,
> > > > > > > > which use inactive_task_frame, are supported, and only on the x86_64
> > > > > > > > architecture.
> > > > > > > > 3. gdb bt unwinder does not stop properly and may show invalid frames
> > > > > > > > (21-25 in the example above). Not a regression, existed before.
> > > > > > > > 4. gdb bt unwinder does not work on active tasks in userspace. Not a
> > > > > > > > regression, existed before.
> > > > > > > > 5. Only x86_64 architecture supported. machdep->get_task_reg() must be
> > > > > > > > implemented for others. Not a regression, existed before.
> > > > > > > > 6. Fetching registers of active tasks is implemented only for VMware
> > > > > > > > dumps, see x86_64_get_task_reg() for more details. Not a regression,
> > > > > > > > existed before.
> > > > > > > >
> > > > > > > > Signed-off-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> > > > > > > > ---
> > > > > > > > crash_target.c | 39 ++++++++++++++++++++---------
> > > > > > > > defs.h | 10 ++++++--
> > > > > > > > gdb-10.2.patch | 7 ++----
> > > > > > > > gdb_interface.c | 63 ++++++++++++++++++++++++++++++----------------
> > > > > > > > help.c | 1 +
> > > > > > > > main.c | 1 +
> > > > > > > > task.c | 1 +
> > > > > > > > x86_64.c | 66 +++++++++++++++++++++++++++++++++++++++++++------
> > > > > > > > 8 files changed, 140 insertions(+), 48 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/crash_target.c b/crash_target.c
> > > > > > > > index 4554806..2fdf203 100644
> > > > > > > > --- a/crash_target.c
> > > > > > > > +++ b/crash_target.c
> > > > > > > > @@ -2,6 +2,8 @@
> > > > > > > > * crash_target.c
> > > > > > > > *
> > > > > > > > * Copyright (c) 2021 VMware, Inc.
> > > > > > > > + * Copyright (c) 2024 Broadcom. All Rights
Reserved. The term "Broadcom"
> > > > > > > > + * refers to Broadcom Inc. and/or its
subsidiaries.
> > > > > > > > *
> > > > > > > > * This program is free software; you can
redistribute it and/or modify
> > > > > > > > * it under the terms of the GNU General Public
License as published by
> > > > > > > > @@ -13,7 +15,7 @@
> > > > > > > > * MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the
> > > > > > > > * GNU General Public License for more details.
> > > > > > > > *
> > > > > > > > - * Author: Alexey Makhalov
<amakhalov(a)vmware.com>
> > > > > > > > + * Author: Alexey Makhalov
<alexey.makhalov(a)broadcom.com>
> > > > > > > > */
> > > > > > > >
> > > > > > > > #include <defs.h>
> > > > > > > > @@ -23,11 +25,11 @@
> > > > > > > > #include "regcache.h"
> > > > > > > > #include "gdbarch.h"
> > > > > > > >
> > > > > > > > -void crash_target_init (void);
> > > > > > > > -
> > > > > > > > +extern "C" void crash_target_init
(void);
> > > > > > > > extern "C" int
gdb_readmem_callback(unsigned long, void *, int, int);
> > > > > > > > -extern "C" int
crash_get_nr_cpus(void);
> > > > > > > > -extern "C" int crash_get_cpu_reg (int
cpu, int regno, const char
> > > > > > > *regname,
> > > > > > > > +extern "C" int
crash_get_nr_tasks(void);
> > > > > > > > +extern "C" void crash_get_task_info(int
task_nr, unsigned long *pid,
> > > > > > > char **comm);
> > > > > > > > +extern "C" int crash_get_task_reg (int
task_nr, int regno, const char
> > > > > > > *regname,
> > > > > > > > int regsize,
void *val);
> > > > > > > >
> > > > > > > >
> > > > > > > > @@ -60,7 +62,13 @@ public:
> > > > > > > > bool has_registers () override { return true;
}
> > > > > > > > bool thread_alive (ptid_t ptid) override {
return true; }
> > > > > > > > std::string pid_to_str (ptid_t ptid) override
> > > > > > > > - { return string_printf ("CPU %ld",
ptid.tid ()); }
> > > > > > > > + {
> > > > > > > > + unsigned long pid;
> > > > > > > > + char *comm;
> > > > > > > > +
> > > > > > > > + crash_get_task_info(ptid.tid(), &pid,
&comm);
> > > > > > > > + return string_printf ("%7ld %s",
pid, comm);
> > > > > > > > + }
> > > > > > > >
> > > > > > > > };
> > > > > > > >
> > > > > > > > @@ -68,18 +76,25 @@ public:
> > > > > > > > void
> > > > > > > > crash_target::fetch_registers (struct regcache
*regcache, int regno)
> > > > > > > > {
> > > > > > > > + int r;
> > > > > > > > gdb_byte regval[16];
> > > > > > > > - int cpu = inferior_ptid.tid();
> > > > > > > > + int task_nr = inferior_ptid.tid();
> > > > > > > > struct gdbarch *arch = regcache->arch ();
> > > > > > > >
> > > > > > > > - for (int r = 0; r < gdbarch_num_regs (arch);
r++)
> > > > > > > > + if (regno >= 0) {
> > > > > > > > + r = regno;
> > > > > > > > + goto onetime;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + for (r = 0; regno == -1 && r <
gdbarch_num_regs (arch); r++)
> > > > > > > > {
> > > > > > > > +onetime:
> > > > > > > > const char *regname =
gdbarch_register_name(arch, r);
> > > > > > > > int regsize = register_size (arch, r);
> > > > > > > > if (regsize > sizeof (regval))
> > > > > > > > error (_("fatal error: buffer size
is not enough to fit
> > > > > > > register value"));
> > > > > > > >
> > > > > > > > - if (crash_get_cpu_reg (cpu, r, regname, regsize, (void *)&regval))
> > > > > > > > + if (crash_get_task_reg (task_nr, r, regname, regsize, (void *)&regval))
> > > > > > > > regcache->raw_supply (r, regval);
> > > > > > > > else
> > > > > > > > regcache->raw_supply (r, NULL);
> > > > > > > > @@ -107,10 +122,10 @@ crash_target::xfer_partial
(enum target_object
> > > > > > > object, const char *annex,
> > > > > > > >
> > > > > > > > #define CRASH_INFERIOR_PID 1
> > > > > > > >
> > > > > > > > -void
> > > > > > > > +extern "C" void
> > > > > > > > crash_target_init (void)
> > > > > > > > {
> > > > > > > > - int nr_cpus = crash_get_nr_cpus();
> > > > > > > > + int nr_tasks = crash_get_nr_tasks();
> > > > > > > > crash_target *target = new crash_target ();
> > > > > > > >
> > > > > > > > /* Own the target until it is successfully
pushed. */
> > > > > > > > @@ -119,7 +134,7 @@ crash_target_init (void)
> > > > > > > > push_target (std::move (target_holder));
> > > > > > > >
> > > > > > > > inferior_appeared (current_inferior (),
CRASH_INFERIOR_PID);
> > > > > > > > - for (int i = 0; i < nr_cpus; i++)
> > > > > > > > + for (int i = 0; i < nr_tasks; i++)
> > > > > > > > {
> > > > > > > > thread_info *thread = add_thread_silent
(target,
> > > > > > > >
ptid_t(CRASH_INFERIOR_PID, 0,
> > > > > > > i));
> > > > > > > > diff --git a/defs.h b/defs.h
> > > > > > > > index 98650e8..2b3f247 100644
> > > > > > > > --- a/defs.h
> > > > > > > > +++ b/defs.h
> > > > > > > > @@ -1080,7 +1080,7 @@ struct machdep_table {
> > > > > > > > void (*get_irq_affinity)(int);
> > > > > > > > void (*show_interrupts)(int, ulong *);
> > > > > > > > int (*is_page_ptr)(ulong, physaddr_t *);
> > > > > > > > - int (*get_cpu_reg)(int, int, const char *,
int, void *);
> > > > > > > > + int (*get_task_reg)(struct task_context *,
int, const char *,
> > > > > > > int, void *);
> > > > > > > > int (*is_cpu_prstatus_valid)(int cpu);
> > > > > > > > };
> > > > > > > >
> > > > > > > > @@ -2263,6 +2263,7 @@ struct size_table {
/* stash of
> > > > > > > commonly-used sizes */
> > > > > > > > long pt_regs;
> > > > > > > > long task_struct;
> > > > > > > > long thread_info;
> > > > > > > > + long inactive_task_frame;
> > > > > > > > long softirq_state;
> > > > > > > > long desc_struct;
> > > > > > > > long umode_t;
> > > > > > > > @@ -8001,9 +8002,14 @@ extern int
have_full_symbols(void);
> > > > > > > > #define XEN_HYPERVISOR_ARCH
> > > > > > > > #endif
> > > > > > > >
> > > > > > > > +/*
> > > > > > > > + * crash_target.c
> > > > > > > > + */
> > > > > > > > +extern void crash_target_init (void);
> > > > > > > > +
> > > > > > > > /*
> > > > > > > > * Register numbers must be in sync with
gdb/features/i386/64bit-core.c
> > > > > > > > - * to make crash_target->fetch_registers()
---> machdep->get_cpu_reg()
> > > > > > > > + * to make crash_target->fetch_registers()
---> machdep->get_task_reg()
> > > > > > > > * working properly.
> > > > > > > > */
> > > > > > > > enum x86_64_regnum {
> > > > > > > > diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> > > > > > > > index a7018a2..ecf673d 100644
> > > > > > > > --- a/gdb-10.2.patch
> > > > > > > > +++ b/gdb-10.2.patch
> > > > > > > > @@ -221,7 +221,7 @@ exit 0
> > > > > > > > warning (_("\
> > > > > > > > --- gdb-10.2/gdb/main.c.orig
> > > > > > > > +++ gdb-10.2/gdb/main.c
> > > > > > > > -@@ -392,6 +392,14 @@ start_event_loop ()
> > > > > > > > +@@ -392,6 +392,13 @@ start_event_loop ()
> > > > > > > > return;
> > > > > > > > }
> > > > > > > >
> > > > > > > > @@ -230,7 +230,6 @@ exit 0
> > > > > > > > +extern "C" void main_loop(void);
> > > > > > > > +extern "C" unsigned long
crash_get_kaslr_offset(void);
> > > > > > > > +extern "C" int console(const char *,
...);
> > > > > > > > -+void crash_target_init (void);
> > > > > > > > +#endif
> > > > > > > > +
> > > > > > > > /* Call command_loop. */
> > > > > > > > @@ -316,7 +315,7 @@ exit 0
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > -@@ -1242,6 +1274,16 @@ captured_main (void
*data)
> > > > > > > > +@@ -1242,6 +1274,14 @@ captured_main (void
*data)
> > > > > > > >
> > > > > > > > captured_main_1 (context);
> > > > > > > >
> > > > > > > > @@ -324,8 +323,6 @@ exit 0
> > > > > > > > + /* Relocate the vmlinux. */
> > > > > > > > + objfile_rebase (symfile_objfile,
crash_get_kaslr_offset());
> > > > > > > > +
> > > > > > > > -+ crash_target_init();
> > > > > > > > -+
> > > > > > > > + /* Back to crash. */
> > > > > > > > + main_loop();
> > > > > > > > +#endif
> > > > > > > > diff --git a/gdb_interface.c b/gdb_interface.c
> > > > > > > > index b14319c..03178f5 100644
> > > > > > > > --- a/gdb_interface.c
> > > > > > > > +++ b/gdb_interface.c
> > > > > > > > @@ -3,6 +3,8 @@
> > > > > > > > * Copyright (C) 1999, 2000, 2001, 2002 Mission
Critical Linux, Inc.
> > > > > > > > * Copyright (C) 2002-2015,2018-2019 David
Anderson
> > > > > > > > * Copyright (C) 2002-2015,2018-2019 Red Hat,
Inc. All rights reserved.
> > > > > > > > + * Copyright (c) 2024 Broadcom. All Rights
Reserved. The term "Broadcom"
> > > > > > > > + * refers to Broadcom Inc. and/or its
subsidiaries.
> > > > > > > > *
> > > > > > > > * This program is free software; you can
redistribute it and/or modify
> > > > > > > > * it under the terms of the GNU General Public
License as published by
> > > > > > > > @@ -711,7 +713,7 @@ static char *prohibited_list[]
= {
> > > > > > > > "watch", "rwatch",
"awatch", "attach", "continue", "c",
"fg",
> > > > > > > "detach",
> > > > > > > > "finish", "handle",
"interrupt", "jump", "kill", "next",
"nexti",
> > > > > > > > "signal", "step",
"s", "stepi", "target", "until",
"delete",
> > > > > > > > - "clear", "disable",
"enable", "condition", "ignore", "frame",
> > > > > > > "catch",
> > > > > > > > + "clear", "disable",
"enable", "condition", "ignore", "gcore",
> > > > > > > "catch",
> > > > > > > > "tcatch", "return",
"file", "exec-file", "core-file",
> > > > > > > "symbol-file",
> > > > > > > > "load", "si",
"ni", "shell", "sy",
> > > > > > > > NULL /* must be last */
> > > > > > > > @@ -877,6 +879,7 @@ gdb_readmem_callback(ulong
addr, void *buf, int len,
> > > > > > > int write)
> > > > > > > > switch (len)
> > > > > > > > {
> > > > > > > > case SIZEOF_8BIT:
> > > > > > > > + fprintf(fp, "%s\n",
pc->curcmd);
> > > > > > > > if (STREQ(pc->curcmd,
"bt")) {
> > > > > > > > if (readmem(addr, memtype,
buf, SIZEOF_8BIT,
> > > > > > > >
"gdb_readmem_callback", readflags))
> > > > > > > > @@ -1063,34 +1066,52 @@ get_frame_offset(ulong
pc)
> > > > > > > > unsigned long crash_get_kaslr_offset(void);
> > > > > > > > unsigned long crash_get_kaslr_offset(void)
> > > > > > > > {
> > > > > > > > - return kt->relocate * -1;
> > > > > > > > + return kt->relocate * -1;
> > > > > > > > }
> > > > > > > >
> > > > > > > > /* Callbacks for crash_target */
> > > > > > > > -int crash_get_nr_cpus(void);
> > > > > > > > -int crash_get_cpu_reg (int cpu, int regno, const
char *regname,
> > > > > > > > +int crash_get_nr_tasks(void);
> > > > > > > > +void crash_get_task_info(int task_nr, unsigned
long *pid, char **comm);
> > > > > > > > +int crash_get_task_reg (int task_nr, int regno,
const char *regname,
> > > > > > > > int regsize, void *val);
> > > > > > > >
> > > > > > > > -int crash_get_nr_cpus(void)
> > > > > > > > +int crash_get_nr_tasks(void)
> > > > > > > > {
> > > > > > > > - if (SADUMP_DUMPFILE())
> > > > > > > > - return sadump_get_nr_cpus();
> > > > > > > > - else if (DISKDUMP_DUMPFILE())
> > > > > > > > - return diskdump_get_nr_cpus();
> > > > > > > > - else if (KDUMP_DUMPFILE())
> > > > > > > > - return kdump_get_nr_cpus();
> > > > > > > > - else if (VMSS_DUMPFILE())
> > > > > > > > - return vmware_vmss_get_nr_cpus();
> > > > > > > > -
> > > > > > > > - /* Just CPU #0 */
> > > > > > > > - return 1;
> > > > > > > > + return RUNNING_TASKS();
> > > > > > > > }
> > > > > > >
> > > > > > > crash_get_nr_tasks() returns a fixed number of running tasks, i.e.
> > > > > > > the task state at the time crash loads. In live debug mode, however,
> > > > > > > tasks are constantly being created and exiting, so we need to keep
> > > > > > > the task count in sync with "info threads". I guess this is not
> > > > > > > easy, because it would involve:
> > > > > > >
> > > > > > > when crash loading:
> > > > > > > for (int i = 0; i < nr_tasks; i++)
> > > > > > > {
> > > > > > > thread_info *thread = add_thread_silent (target, ...
> > > > > > > }
> > > > > > >
> > > > > > > when new task create:
> > > > > > > thread_info *thread = add_thread_silent (target, ...
> > > > > > >
> > > > > > > when task exit:
> > > > > > > delete_thread(...)
> > > > > > >
> > > > > > > We could monitor those task create/exit events and do the thread
> > > > > > > add/delete in a callback function, or we could simply delete and
> > > > > > > re-add all tasks whenever "info threads" is invoked. But I guess
> > > > > > > neither of these is as simple as using the crash "ps" cmd and
> > > > > > > "set <pid>", which has no such problem, because crash "ps" always
> > > > > > > shows the current task status, and set <pid> switches the task
> > > > > > > context to it.
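
To make the cost of the "sync on every info threads" option concrete, here
is a minimal sketch of the reconcile step it would need. It is only an
illustration under stated assumptions: reconcile_tasks is a hypothetical
function, both PID snapshots are assumed to be sorted ascending, and the
places where real code would call add_thread_silent or delete_thread are
only marked in comments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reconcile step. 'before' is the PID snapshot gdb currently
 * knows about, 'now' is a fresh snapshot; both must be sorted ascending.
 * PIDs only in 'now' are written to 'added' (real code would call
 * add_thread_silent there); PIDs only in 'before' are written to 'removed'
 * (real code would call delete_thread). 'added' must have room for n_now
 * entries and 'removed' for n_before entries.
 */
static void reconcile_tasks(const unsigned long *before, size_t n_before,
                            const unsigned long *now, size_t n_now,
                            unsigned long *added, size_t *n_added,
                            unsigned long *removed, size_t *n_removed)
{
    size_t i = 0, j = 0;

    assert(n_added != NULL && n_removed != NULL);
    *n_added = 0;
    *n_removed = 0;

    while (i < n_before || j < n_now) {
        if (j == n_now || (i < n_before && before[i] < now[j])) {
            removed[(*n_removed)++] = before[i++];   /* task exited */
        } else if (i == n_before || now[j] < before[i]) {
            added[(*n_added)++] = now[j++];          /* task created */
        } else {
            i++;   /* present in both snapshots: keep the thread */
            j++;
        }
    }
}
```

Note that matching on PID alone is a simplification: on a live system a
PID can be reused by a different task between snapshots, so a real
implementation would probably need to compare task_struct addresses as
well.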
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Tao Liu
> > > > > > >
> > > > > > > >
> > > > > > > > -int crash_get_cpu_reg (int cpu, int regno, const
char *regname,
> > > > > > > > - int regsize, void *value)
> > > > > > > > +/* Get task information by its index number in TT
*/
> > > > > > > > +void crash_get_task_info(int task_nr, unsigned
long *pid, char **comm)
> > > > > > > > {
> > > > > > > > - if (!machdep->get_cpu_reg)
> > > > > > > > - return FALSE;
> > > > > > > > - return machdep->get_cpu_reg(cpu,
regno, regname, regsize,
> > > > > > > value);
> > > > > > > > + int i;
> > > > > > > > + struct task_context *tc;
> > > > > > > > +
> > > > > > > > + tc = FIRST_CONTEXT();
> > > > > > > > + for (i = 0; i < RUNNING_TASKS(); i++,
tc++)
> > > > > > > > + if (i == task_nr) {
> > > > > > > > + *pid = tc->pid;
> > > > > > > > + *comm = tc->comm;
> > > > > > > > + return;
> > > > > > > > + }
> > > > > > > > + *pid = 0;
> > > > > > > > + *comm = NULL;
> > > > > > > > + return;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int crash_get_task_reg (int task_nr, int regno,
const char *regname,
> > > > > > > > + int regsize, void *value)
> > > > > > > > +{
> > > > > > > > + int i;
> > > > > > > > + struct task_context *tc;
> > > > > > > > +
> > > > > > > > + if (!machdep->get_task_reg)
> > > > > > > > + return FALSE;
> > > > > > > > +
> > > > > > > > + tc = FIRST_CONTEXT();
> > > > > > > > + for (i = 0; i < RUNNING_TASKS(); i++,
tc++)
> > > > > > > > + if (i == task_nr) {
> > > > > > > > + return
machdep->get_task_reg(tc, regno, regname,
> > > > > > > regsize, value);
> > > > > > > > + }
> > > > > > > > + return FALSE;
> > > > > > > > }
> > > > > > > >
> > > > > > > > diff --git a/help.c b/help.c
> > > > > > > > index a9c4d30..85dbda5 100644
> > > > > > > > --- a/help.c
> > > > > > > > +++ b/help.c
> > > > > > > > @@ -8520,6 +8520,7 @@ char *version_info[] = {
> > > > > > > > "Copyright (C) 1999, 2002, 2007 Silicon
Graphics, Inc.",
> > > > > > > > "Copyright (C) 1999, 2000, 2001, 2002
Mission Critical Linux, Inc.",
> > > > > > > > "Copyright (C) 2015, 2021 VMware,
Inc.",
> > > > > > > > +"Copyright (C) 2024 Broadcom, Inc.",
> > > > > > > > "This program is free software, covered by
the GNU General Public
> > > > > > > License,",
> > > > > > > > "and you are welcome to change it and/or
distribute copies of it under",
> > > > > > > > "certain conditions. Enter \"help
copying\" to see the conditions.",
> > > > > > > > diff --git a/main.c b/main.c
> > > > > > > > index 0b6b927..13acd2d 100644
> > > > > > > > --- a/main.c
> > > > > > > > +++ b/main.c
> > > > > > > > @@ -794,6 +794,7 @@ main_loop(void)
> > > > > > > > } else
> > > > > > > > SIGACTION(SIGINT, restart,
&pc->sigaction, NULL);
> > > > > > > >
> > > > > > > > + crash_target_init();
> > > > > > > > /*
> > > > > > > > * Display system statistics and current
context.
> > > > > > > > */
> > > > > > > > diff --git a/task.c b/task.c
> > > > > > > > index ebdb5be..5d26c52 100644
> > > > > > > > --- a/task.c
> > > > > > > > +++ b/task.c
> > > > > > > > @@ -298,6 +298,7 @@ task_init(void)
> > > > > > > > tt->flags |= THREAD_INFO;
> > > > > > > > }
> > > > > > > >
> > > > > > > > + STRUCT_SIZE_INIT(inactive_task_frame,
"inactive_task_frame");
> > > > > > > > MEMBER_OFFSET_INIT(task_struct_state,
"task_struct", "state");
> > > > > > > > MEMBER_SIZE_INIT(task_struct_state,
"task_struct", "state");
> > > > > > > > if (INVALID_MEMBER(task_struct_state)) {
> > > > > > > > diff --git a/x86_64.c b/x86_64.c
> > > > > > > > index 502817d..b6e36a5 100644
> > > > > > > > --- a/x86_64.c
> > > > > > > > +++ b/x86_64.c
> > > > > > > > @@ -126,7 +126,7 @@ static int
x86_64_get_framesize(struct bt_info *,
> > > > > > > ulong, ulong, char *);
> > > > > > > > static void x86_64_framesize_debug(struct bt_info
*);
> > > > > > > > static void x86_64_get_active_set(void);
> > > > > > > > static int x86_64_get_kvaddr_ranges(struct
vaddr_range *);
> > > > > > > > -static int x86_64_get_cpu_reg(int, int, const
char *, int, void *);
> > > > > > > > +static int x86_64_get_task_reg(struct
task_context *, int, const char
> > > > > > > *, int, void *);
> > > > > > > > static int x86_64_verify_paddr(uint64_t);
> > > > > > > > static void GART_init(void);
> > > > > > > > static void x86_64_exception_stacks_init(void);
> > > > > > > > @@ -195,7 +195,7 @@ x86_64_init(int when)
> > > > > > > > machdep->machspec->irq_eframe_link = UNINITIALIZED;
> > > > > > > > machdep->machspec->irq_stack_gap = UNINITIALIZED;
> > > > > > > > machdep->get_kvaddr_ranges = x86_64_get_kvaddr_ranges;
> > > > > > > > - machdep->get_cpu_reg = x86_64_get_cpu_reg;
> > > > > > > > + machdep->get_task_reg = x86_64_get_task_reg;
> > > > > > > > if (machdep->cmdline_args[0])
> > > > > > > > parse_cmdline_args();
> > > > > > > > if ((string = pc->read_vmcoreinfo("relocate"))) {
> > > > > > > > @@ -891,7 +891,7 @@ x86_64_dump_machdep_table(ulong arg)
> > > > > > > > fprintf(fp, " is_page_ptr: x86_64_is_page_ptr()\n");
> > > > > > > > fprintf(fp, " verify_paddr: x86_64_verify_paddr()\n");
> > > > > > > > fprintf(fp, " get_kvaddr_ranges: x86_64_get_kvaddr_ranges()\n");
> > > > > > > > - fprintf(fp, " get_cpu_reg: x86_64_get_cpu_reg()\n");
> > > > > > > > + fprintf(fp, " get_task_reg: x86_64_get_task_reg()\n");
> > > > > > > > fprintf(fp, " init_kernel_pgd: x86_64_init_kernel_pgd()\n");
> > > > > > > > fprintf(fp, "clear_machdep_cache: x86_64_clear_machdep_cache()\n");
> > > > > > > > fprintf(fp, " xendump_p2m_create: %s\n", PVOPS_XEN() ?
> > > > > > > > @@ -6398,6 +6398,9 @@ x86_64_ORC_init(void)
> > > > > > > > };
> > > > > > > > struct ORC_data *orc;
> > > > > > > >
> > > > > > > > + MEMBER_OFFSET_INIT(inactive_task_frame_bp, "inactive_task_frame", "bp");
> > > > > > > > + MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr, "inactive_task_frame", "ret_addr");
> > > > > > > > +
> > > > > > > > if (machdep->flags & FRAMEPOINTER)
> > > > > > > > return;
> > > > > > > >
> > > > > > > > @@ -6455,9 +6458,6 @@ x86_64_ORC_init(void)
> > > > > > > > orc->__stop_orc_unwind = symbol_value("__stop_orc_unwind");
> > > > > > > > orc->orc_lookup = symbol_value("orc_lookup");
> > > > > > > >
> > > > > > > > - MEMBER_OFFSET_INIT(inactive_task_frame_bp, "inactive_task_frame", "bp");
> > > > > > > > - MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr, "inactive_task_frame", "ret_addr");
> > > > > > > > -
> > > > > > > > orc->has_signal = MEMBER_EXISTS("orc_entry", "signal"); /* added at 6.3 */
> > > > > > > > orc->has_end = MEMBER_EXISTS("orc_entry", "end"); /* removed at 6.4 */
> > > > > > > >
> > > > > > > > @@ -9070,14 +9070,64 @@ x86_64_get_kvaddr_ranges(struct vaddr_range *vrp)
> > > > > > > > }
> > > > > > > >
> > > > > > > > static int
> > > > > > > > -x86_64_get_cpu_reg(int cpu, int regno, const char *name,
> > > > > > > > +x86_64_get_task_reg(struct task_context *tc, int regno, const char *name,
> > > > > > > > int size, void *value)
> > > > > > > > {
> > > > > > > > if (regno >= LAST_REGNUM)
> > > > > > > > return FALSE;
> > > > > > > >
> > > > > > > > + /*
> > > > > > > > + * For inactive task, grab rip, rbp, rbx, r12, r13, r14 and r15 from
> > > > > > > > + * inactive_task_frame (see __switch_to_asm). Other regs saved on
> > > > > > > > + * regular frame.
> > > > > > > > + */
> > > > > > > > + if (!is_task_active(tc->task)) {
> > > > > > > > + int frame_size = STRUCT_SIZE("inactive_task_frame");
> > > > > > > > +
> > > > > > > > + /* Only modern kernels supported. */
> > > > > > > > + if (tt->flags & THREAD_INFO && frame_size == 7 * 8) {
> > > > > > > > + ulong rsp;
> > > > > > > > + int offset = 0;
> > > > > > > > + switch (regno) {
> > > > > > > > + case RSP_REGNUM:
> > > > > > > > + readmem(tc->task + OFFSET(task_struct_thread) +
> > > > > > > > + OFFSET(thread_struct_rsp), KVADDR,
> > > > > > > > + &rsp, sizeof(void *),
> > > > > > > > + "thread_struct rsp", FAULT_ON_ERROR);
> > > > > > > > + rsp += frame_size;
> > > > > > > > + memcpy(value, &rsp, size);
> > > > > > > > + return TRUE;
> > > > > > > > + case RIP_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case RBP_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case RBX_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case R12_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case R13_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case R14_REGNUM:
> > > > > > > > + offset += 8;
> > > > > > > > + case R15_REGNUM:
> > > > > > > > + readmem(tc->task + OFFSET(task_struct_thread) +
> > > > > > > > + OFFSET(thread_struct_rsp), KVADDR,
> > > > > > > > + &rsp, sizeof(void *),
> > > > > > > > + "thread_struct rsp", FAULT_ON_ERROR);
> > > > > > > > + readmem(rsp + offset, KVADDR, value, sizeof(void *),
> > > > > > > > + "inactive_thread_frame saved regs", FAULT_ON_ERROR);
> > > > > > > > + return TRUE;
> > > > > > > > + }
> > > > > > > > + }
> > > > > > > > + /* TBD: older kernels support. */
> > > > > > > > + return FALSE;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + /*
> > > > > > > > + * Task is active, grab CPU's registers
> > > > > > > > + */
> > > > > > > > if (VMSS_DUMPFILE())
> > > > > > > > - return vmware_vmss_get_cpu_reg(cpu, regno, name, size, value);
> > > > > > > > + return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
> > > > > > > >
> > > > > > > > return FALSE;
> > > > > > > > }
> > > > > > > > --
> > > > > > > > 2.39.0
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > >
> > >
> >
>
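
A note for anyone reading the fall-through switch in x86_64_get_task_reg()
in the quoted patch: each case adds 8 and falls into the next, so the offset
that accumulates is the position of the saved register inside the 7 * 8 byte
inactive_task_frame. Below is a minimal standalone sketch of that arithmetic.
It is not crash code. The names frame_sketch, sketch_reg and
sketch_frame_offset are made up for illustration, and the field order
(r15, r14, r13, r12, bx, bp, ret_addr) is my assumption about the x86_64
kernel layout, so please check it against the kernel you are debugging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the x86_64 inactive_task_frame (assumed order). */
struct frame_sketch {
	uint64_t r15, r14, r13, r12, bx, bp, ret_addr;
};

/* Illustrative register ids; these are NOT gdb's or crash's *_REGNUM values. */
enum sketch_reg {
	SK_R15, SK_R14, SK_R13, SK_R12, SK_RBX, SK_RBP, SK_RIP, SK_OTHER
};

/*
 * Return the byte offset of a saved register inside the frame, or -1 if the
 * register is not saved there. Same fall-through accumulation as the patch.
 */
static long sketch_frame_offset(enum sketch_reg reg)
{
	long offset = 0;

	switch (reg) {
	case SK_RIP: offset += 8; /* fall through */
	case SK_RBP: offset += 8; /* fall through */
	case SK_RBX: offset += 8; /* fall through */
	case SK_R12: offset += 8; /* fall through */
	case SK_R13: offset += 8; /* fall through */
	case SK_R14: offset += 8; /* fall through */
	case SK_R15:
		return offset;
	default:
		return -1;
	}
}

/* Cross-check the accumulated offsets against the struct layout itself. */
static void sketch_self_check(void)
{
	assert(sizeof(struct frame_sketch) == 7 * 8);
	assert(sketch_frame_offset(SK_R15) == (long)offsetof(struct frame_sketch, r15));
	assert(sketch_frame_offset(SK_R12) == (long)offsetof(struct frame_sketch, r12));
	assert(sketch_frame_offset(SK_RBX) == (long)offsetof(struct frame_sketch, bx));
	assert(sketch_frame_offset(SK_RBP) == (long)offsetof(struct frame_sketch, bp));
	assert(sketch_frame_offset(SK_RIP) == (long)offsetof(struct frame_sketch, ret_addr));
}
```

Calling sketch_self_check() from a small main() is enough to exercise it.
Under the assumed layout, rsp is the one register not read from the frame:
the patch reports it as the saved thread rsp plus the frame size, i.e. the
stack pointer as it would be once the frame has been popped.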