Hi Aditya,
On Mon, Mar 18, 2024 at 2:38 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
Hi Tao,
> > 'set 1' (let's assume it's an INACTIVE task on CPU 5)
> >
> > 'info threads'
> > -> when it switches to CPU 5, it asks crash to sync, ie. `set_cpu(5)`
> > -> crash will return CPU 5's active tasks registers
> > -> WRONG registers for CPU 5, since we wanted registers for PID 1 (which
> > is an inactive task on CPU 5)
>
> What do you mean by "WRONG registers"? Isn't CPU 5's regcache pid 1's
> context after 'set 1'? If it isn't, then it's a bug.
Yes, it was a bug in my series, caused by calling set_cpu in
'gdb_refresh_regcache'. The issue occurred because I hadn't considered
arbitrary tasks in my series, which your patches solve.
It was like this:
'set 1' (sets task context in crash to 'systemd')
-> 'change_gdb_cpu_context'
-> 'gdb_refresh_regcache (cpu 5)'
-> set_context (active task on cpu 5) # this was the bug
As a result, 'set' itself always switched to the active task.
When syncing from gdb, only the CPU number is known, so gdb asks crash to
set the context before reading registers. Crash then sets the context to
the active task on CPU 5, and gdb reads that task's registers instead of
PID 1's.
Your approach of using 'add_thread_silent' solves this: it indirectly
causes gdb to refresh the regcache, so there is no need to call
'gdb_refresh_regcache'.
Also, with Alexey's patch moving 'crash_target_init' to later in the
initialisation, we can also possibly remove 'gdb_refresh_regcache'
later, if CPU 0 isn't initialised at a time when crash has
machdep->get_cpu_reg = NULL.
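
The failure mode described above can be sketched as a toy model in C. This is only an illustration under stated assumptions: `struct toy_task`, `resolve_by_cpu` and `resolve_by_context` are hypothetical names, not crash or gdb code. It shows why a lookup keyed only by CPU number can never return an inactive task, and how honouring the already-selected context avoids that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy task record; not crash's struct task_context. */
struct toy_task {
    int pid;
    int cpu;     /* CPU the task last ran on */
    int active;  /* non-zero if the task is currently running on that CPU */
};

/*
 * Buggy sync path: only a CPU number is available, so the lookup can
 * only ever land on the task that is active on that CPU.
 */
const struct toy_task *
resolve_by_cpu(const struct toy_task *tasks, size_t n, int cpu)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].cpu == cpu && tasks[i].active)
            return &tasks[i];
    return NULL;
}

/*
 * Fixed sync path: if a task was already selected (e.g. via 'set <pid>')
 * and it belongs to this CPU, keep it; otherwise fall back to the
 * active task.
 */
const struct toy_task *
resolve_by_context(const struct toy_task *tasks, size_t n, int cpu,
                   const struct toy_task *selected)
{
    if (selected != NULL && selected->cpu == cpu)
        return selected;
    return resolve_by_cpu(tasks, n, cpu);
}
```

With PID 1 inactive on CPU 5, `resolve_by_cpu(tasks, n, 5)` returns whatever is running there, while `resolve_by_context(tasks, n, 5, &pid1)` returns PID 1.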
Really sorry Tao, for causing the confusion here.
No worries, and thank you for the explanation. Please raise any
concerns you encounter. The more discussion and testing we have, the
better the design and approach will be for the stack unwinding
feature.
Thanks,
Tao Liu
- Aditya Gupta
>
> After crash finished loading and we see crash> prompt, the cpu regcaches
> will be all active tasks', which are running on each cpu. That is what we saw
> using "info threads", like:
>
> crash> info threads
> Id Target Id Frame
> ....
> * 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> ....
> 14 CPU 13 native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> ....
>
> Then we use "set <pid>" to switch to task <pid>. Its cpu regcache will be
> flushed, and if we run "info threads" again, it will show the status
> after the flush. E.g. in the following, task pid 1's cpu is #13, so
> cpu #13 is flushed after "set 1":
>
> crash> ps
> PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 1 0 13 ffff880169b30000 IN 0.0 193052 4180 systemd
>
> crash> set 1
> crash> info threads
> Id Target Id Frame
> ....
> 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> ....
> * 14 CPU 13 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...
> ....
>
> In my design, I assume the CPUs/threads are like power slots: their number
> is constant, but their contents can change. We don't need to prepare as
> many "slots" as there are tasks. I don't know the maximum number of threads
> gdb can accept, but on a system with 65K tasks, for example, we would never
> switch to most of them to view their stack traces, so I guess it would be a
> waste to call add_thread_silent (target, ...) 65K times.
>
> Currently the number of gdb CPUs/threads (slots) equals the system's cpu
> count. But if a system has thousands of cpus and we think it is unnecessary
> to call add_thread_silent 1K times, we can even shrink the number of slots.
>
> Thanks,
> Tao Liu
>
>
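
The "power slots" idea in the message above can be sketched as follows. This is a minimal C model with hypothetical names (`struct toy_slots`, `toy_set_task`, `TOY_NR_CPUS`); it assumes one slot per CPU and counts flushes instead of touching a real gdb regcache.

```c
#include <assert.h>

#define TOY_NR_CPUS 16  /* hypothetical slot count: one slot per CPU */

/* Fixed set of slots; only their contents change over time. */
struct toy_slots {
    int pid_in_slot[TOY_NR_CPUS]; /* task whose registers the slot shows */
    int flushes[TOY_NR_CPUS];     /* how often the slot was flushed */
};

/* After loading, every slot shows the task active on its CPU. */
void toy_slots_init(struct toy_slots *s, const int *active_pids)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        s->pid_in_slot[cpu] = active_pids[cpu];
        s->flushes[cpu] = 0;
    }
}

/*
 * Model of 'set <pid>': reuse the slot of the CPU the task belongs to
 * and flush it if it currently shows another task.
 * Returns the slot index, or -1 if the CPU number is out of range.
 */
int toy_set_task(struct toy_slots *s, int pid, int cpu)
{
    if (cpu < 0 || cpu >= TOY_NR_CPUS)
        return -1;
    if (s->pid_in_slot[cpu] != pid) {
        s->pid_in_slot[cpu] = pid;
        s->flushes[cpu]++;
    }
    return cpu;
}
```

In this model, `toy_set_task(&s, 1, 13)` changes only slot 13, which corresponds to the `info threads` output above where only the CPU 13 entry differs after `set 1`.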
> >
> > I can think of two ways to solve it:
> > 1. Having an array in crash_target that records which CPU in gdb maps to
> > which task in crash. This can be done without requiring changes to 'info
> > threads', while we do the 'add_thread_silent'.
> > 2. Having all tasks in gdb itself, so the mapping is one-to-one and `set_cpu`
> > will then refer to the task ID. I will have to explore this more.
> >
> > Any comments ?
> >
> > Thanks,
> > Aditya Gupta
> >
> > >
> > > Thanks,
> > > --Alexey
> > >
> > >
> > > On Thu, Mar 14, 2024 at 9:22 PM Tao Liu <ltao(a)redhat.com> wrote:
> > >
> > > > Hi Alexey,
> > > >
> > > > On Thu, Mar 14, 2024 at 6:29 PM Alexey Makhalov
> > > > <alexey.makhalov(a)broadcom.com> wrote:
> > > > >
> > > > > Support for GDB debugging of all tasks active and inactive.
> > > > > Before this commit only active tasks were listed by "info threads"
> > > > > with "CPU #" as a Target Id.
> > > > >
> > > > > "info threads" will now show all tasks, similar to "ps", example:
> > > > > crash> info threads
> > > > > Id Target Id Frame
> > > > > * 1 0 swapper/0 0xffffffffadba19d4 in
default_idle () at
> > > > arch/x86/kernel/process.c:731
> > > > > 2 0 swapper/1 0xffffffffadba19d4 in
default_idle () at
> > > > arch/x86/kernel/process.c:731
> > > > > 3 0 swapper/2 0xffffffffadba19d4 in
default_idle () at
> > > > arch/x86/kernel/process.c:731
> > > > > 4 0 swapper/3 0xffffffffadba19d4 in
default_idle () at
> > > > arch/x86/kernel/process.c:731
> > > > > 5 0 swapper/4 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf0000f3e88, next=0xffff9ecb04908000, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > ...
> > > > > 730 970325 taskset 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf006a0fd18, next=0xffff9ecb0aec0000, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > 731 975217 sleep 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf005743c20, next=0xffff9ecac0692880, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > 732 975228 sleep 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf00696fb58, next=0xffff9ecac0690000, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > ...
> > > > > 876 976084 docker 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf0153dbd10, next=0xffff9ecac0645100, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > 877 976085 systemd-userwor 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf0153cbc58, next=0xffff9ecac0645100, prev=<optimized
out>,
> > > > rq=<optimized out>) at kernel/sched/core.c:5372
> > > > > 878 976086 systemd-userwor 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf0153e3c58, next=0xffffffffaec15a40 <init_task>,
> > > > prev=<optimized out>, rq=<optimized out>) at
kernel/sched/core.c:5372
> > > > > 879 976087 systemd-userwor 0xffffffffadb97292 in
context_switch
> > > > (rf=0xffffbaf0153ebc58, next=0xffffffffaec15a40 <init_task>,
> > > > prev=<optimized out>, rq=<optimized out>) at
kernel/sched/core.c:5372
> > > > > Where "Target ID" contains "PID COMM" of the task
> > > > >
> > > >
> > > > I see that we chose different technical paths for viewing arbitrary
> > > > tasks. You improved "info threads" as an alternative to the crash "ps"
> > > > cmd, and use "thread X" to switch to task X and view its stack trace.
> > > > Mine uses the crash "ps" cmd first to get a specific task's pid or
> > > > task struct, then uses the crash cmd "set <pid>" or "set <task_struct>"
> > > > to switch to task X and view its stack trace.
> > > >
> > > > I notice one problem with the "info threads as ps" path: it is
> > > > difficult to handle live debugging, see the comment below:
> > > >
> > > > > Example of "731 975217 sleep" debugging, real case, trying to
> > > > > figure out why sleep was stuck in uninterruptible sleep.
> > > > > Backtrace using crash:
> > > > > crash> ps | grep 975217
> > > > > 975217 969797 3 ffff9ecb3956a880 UN 0.0 0
0
> > > > sleep
> > > > > crash> bt 975217
> > > > > PID: 975217 TASK: ffff9ecb3956a880 CPU: 3 COMMAND:
"sleep"
> > > > > #0 [ffffbaf005743ba0] __schedule at ffffffffadb97292
> > > > > #1 [ffffbaf005743c60] schedule at ffffffffadb982b8
> > > > > #2 [ffffbaf005743c80] rwbase_write_lock at ffffffffadb9aed7
> > > > > #3 [ffffbaf005743cc0] down_write at ffffffffadb9b133
> > > > > #4 [ffffbaf005743cd0] unlink_file_vma at ffffffffad2b0e2e
> > > > > #5 [ffffbaf005743cf8] free_pgtables at ffffffffad2a47b0
> > > > > #6 [ffffbaf005743d88] exit_mmap at ffffffffad2b3b8d
> > > > > #7 [ffffbaf005743e80] mmput at ffffffffad08c81f
> > > > > #8 [ffffbaf005743e98] do_exit at ffffffffad09636c
> > > > > #9 [ffffbaf005743ef8] do_group_exit at ffffffffad096c78
> > > > > RIP: 00007f111c70ddf9 RSP: 00007fff451817e8 RFLAGS:
00000246
> > > > > RAX: ffffffffffffffda RBX: 00007f111c8089e0 RCX:
00007f111c70ddf9
> > > > > RDX: 000000000000003c RSI: 00000000000000e7 RDI:
0000000000000000
> > > > > RBP: 0000000000000000 R8: ffffffffffffff80 R9:
0000000000000000
> > > > > R10: 00007fff451817b0 R11: 0000000000000246 R12:
00007f111c8089e0
> > > > > R13: 00007f111c80e2e0 R14: 0000000000000002 R15:
00007f111c80e2c8
> > > > > ORIG_RAX: 00000000000000e7 CS: 0033 SS: 002b
> > > > >
> > > > > Backtrace using gdb (pay attention, task must be selected by
thread Id):
> > > > > crash> thread 731
> > > > > [Switching to thread 731 ( 975217 sleep)]
> > > > > 5372 switch_to(prev, next, prev);
> > > > > crash> gdb bt
> > > > > #0 0xffffffffadb97292 in context_switch
(rf=0xffffbaf005743c20,
> > > > next=0xffff9ecac0692880, prev=<optimized out>, rq=<optimized
out>) at
> > > > kernel/sched/core.c:5372
> > > > > #1 __schedule (sched_mode=sched_mode@entry=0) at
> > > > kernel/sched/core.c:6696
> > > > > #2 0xffffffffadb982b8 in schedule () at
kernel/sched/core.c:6772
> > > > > #3 0xffffffffadb9aed7 in rwbase_write_lock
(rwb=rwb@entry=0xffff9ecaf1831430,
> > > > state=state@entry=2) at kernel/locking/rwbase_rt.c:259
> > > > > #4 0xffffffffadb9b133 in __down_write
(sem=sem@entry=0xffff9ecaf1831430)
> > > > at kernel/locking/rwsem.c:1474
> > > > > #5 down_write (sem=sem@entry=0xffff9ecaf1831430) at
> > > > kernel/locking/rwsem.c:1574
> > > > > #6 0xffffffffad2b0e2e in i_mmap_lock_write
(mapping=<optimized out>)
> > > > at ./include/linux/fs.h:466
> > > > > #7 unlink_file_vma (vma=vma@entry=0xffff9ecadd566090) at
mm/mmap.c:127
> > > > > #8 0xffffffffad2a47b0 in free_pgtables
(tlb=tlb@entry=0xffffbaf005743dd0,
> > > > mt=mt@entry=0xffff9ecb28e3b180, vma=0xffff9ecadd566090,
vma@entry=0xffff9ecadd566000,
> > > > floor=floor@entry=0, ceiling=ceiling@entry=0) at mm/memory.c:431
> > > > > #9 0xffffffffad2b3b8d in exit_mmap
(mm=mm@entry=0xffff9ecb28e3b180)
> > > > at mm/mmap.c:3237
> > > > > #10 0xffffffffad08c81f in __mmput (mm=0xffff9ecb28e3b180) at
> > > > kernel/fork.c:1204
> > > > > #11 mmput (mm=mm@entry=0xffff9ecb28e3b180) at
kernel/fork.c:1226
> > > > > #12 0xffffffffad09636c in exit_mm () at kernel/exit.c:563
> > > > > #13 do_exit (code=code@entry=0) at kernel/exit.c:856
> > > > > #14 0xffffffffad096c78 in do_group_exit (exit_code=0) at
> > > > kernel/exit.c:1019
> > > > > #15 0xffffffffad096cf8 in __do_sys_exit_group
(error_code=<optimized
> > > > out>) at kernel/exit.c:1030
> > > > > #16 __se_sys_exit_group (error_code=<optimized out>) at
> > > > kernel/exit.c:1028
> > > > > #17 __x64_sys_exit_group (regs=<optimized out>) at
kernel/exit.c:1028
> > > > > #18 0xffffffffadb8a327 in do_syscall_x64 (nr=<optimized
out>,
> > > > regs=0xffffbaf005743f58) at arch/x86/entry/common.c:51
> > > > > #19 do_syscall_64 (regs=0xffffbaf005743f58, nr=<optimized
out>) at
> > > > arch/x86/entry/common.c:81
> > > > > #20 0xffffffffadc000dc in entry_SYSCALL_64 () at
> > > > arch/x86/entry/entry_64.S:120
> > > > > #21 0x00007f111c80e2c8 in ?? ()
> > > > > #22 0x0000000000000002 in ?? ()
> > > > > #23 0x00007f111c80e2e0 in ?? ()
> > > > > #24 0x00007f111c8089e0 in ?? ()
> > > > > #25 0x0000000000000000 in ?? ()
> > > > > crash> f 3
> > > > > 259 rwbase_schedule();
> > > > > crash> p *rwb
> > > > > $1 = {
> > > > > readers = {
> > > > > counter = 1
> > > > > },
> > > > > rtmutex = {
> > > > > wait_lock = {
> > > > > raw_lock = {
> > > > > {
> > > > > val = {
> > > > > counter = 0
> > > > > },
> > > > > {
> > > > > locked = 0 '\000',
> > > > > pending = 0 '\000'
> > > > > },
> > > > > {
> > > > > locked_pending = 0,
> > > > > tail = 0
> > > > > }
> > > > > }
> > > > > }
> > > > > },
> > > > > waiters = {
> > > > > rb_root = {
> > > > > rb_node = 0xffffbaf006977be0
> > > > > },
> > > > > rb_leftmost = 0xffffbaf00696fbe0
> > > > > },
> > > > > owner = 0xffff9ecb3956a881
> > > > > }
> > > > > }
> > > > >
> > > > > Additional changes:
> > > > > 1. Allow gdb "frame" command.
> > > > > 2. Blacklist useless gdb "gcore" command. Use gcore plugin instead.
> > > > > 3. Move crash_target_init() to later time as crash target requires a
> > > > >    list of tasks to be initialized.
> > > > >
> > > > > Known issues and TBD items:
> > > > > 1. "info threads" may bail out the first time, throwing errors while
> > > > >    trying to access userspace addresses during the unwind process.
> > > > >    Subsequent "info threads" invocations run without issues.
> > > > > 2. To unwind the stack of an inactive task, only modern Linux versions
> > > > >    that use inactive_task_frame are supported, and only on the x86_64
> > > > >    architecture.
> > > > > 3. gdb bt unwinder does not stop properly and may show invalid frames
> > > > >    (21-25 in the example above). Not a regression, existed before.
> > > > > 4. gdb bt unwinder does not work on active tasks in userspace. Not a
> > > > >    regression, existed before.
> > > > > 5. Only x86_64 architecture supported. machdep->get_task_reg() must be
> > > > >    implemented for others. Not a regression, existed before.
> > > > > 6. Active tasks registers fetching implemented only for VMware dumps,
> > > > >    see x86_64_get_task_reg() for more details. Not a regression,
> > > > >    existed before.
> > > > >
> > > > > Signed-off-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> > > > > ---
> > > > > crash_target.c | 39 ++++++++++++++++++++---------
> > > > > defs.h | 10 ++++++--
> > > > > gdb-10.2.patch | 7 ++----
> > > > > gdb_interface.c | 63
++++++++++++++++++++++++++++++----------------
> > > > > help.c | 1 +
> > > > > main.c | 1 +
> > > > > task.c | 1 +
> > > > > x86_64.c | 66
+++++++++++++++++++++++++++++++++++++++++++------
> > > > > 8 files changed, 140 insertions(+), 48 deletions(-)
> > > > >
> > > > > diff --git a/crash_target.c b/crash_target.c
> > > > > index 4554806..2fdf203 100644
> > > > > --- a/crash_target.c
> > > > > +++ b/crash_target.c
> > > > > @@ -2,6 +2,8 @@
> > > > > * crash_target.c
> > > > > *
> > > > > * Copyright (c) 2021 VMware, Inc.
> > > > > + * Copyright (c) 2024 Broadcom. All Rights Reserved. The term
"Broadcom"
> > > > > + * refers to Broadcom Inc. and/or its subsidiaries.
> > > > > *
> > > > > * This program is free software; you can redistribute it
and/or modify
> > > > > * it under the terms of the GNU General Public License as
published by
> > > > > @@ -13,7 +15,7 @@
> > > > > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
the
> > > > > * GNU General Public License for more details.
> > > > > *
> > > > > - * Author: Alexey Makhalov <amakhalov(a)vmware.com>
> > > > > + * Author: Alexey Makhalov
<alexey.makhalov(a)broadcom.com>
> > > > > */
> > > > >
> > > > > #include <defs.h>
> > > > > @@ -23,11 +25,11 @@
> > > > > #include "regcache.h"
> > > > > #include "gdbarch.h"
> > > > >
> > > > > -void crash_target_init (void);
> > > > > -
> > > > > +extern "C" void crash_target_init (void);
> > > > > extern "C" int gdb_readmem_callback(unsigned long,
void *, int, int);
> > > > > -extern "C" int crash_get_nr_cpus(void);
> > > > > -extern "C" int crash_get_cpu_reg (int cpu, int regno,
const char
> > > > *regname,
> > > > > +extern "C" int crash_get_nr_tasks(void);
> > > > > +extern "C" void crash_get_task_info(int task_nr,
unsigned long *pid,
> > > > char **comm);
> > > > > +extern "C" int crash_get_task_reg (int task_nr, int
regno, const char
> > > > *regname,
> > > > > int regsize, void *val);
> > > > >
> > > > >
> > > > > @@ -60,7 +62,13 @@ public:
> > > > > bool has_registers () override { return true; }
> > > > > bool thread_alive (ptid_t ptid) override { return true; }
> > > > > std::string pid_to_str (ptid_t ptid) override
> > > > > - { return string_printf ("CPU %ld", ptid.tid ()); }
> > > > > + {
> > > > > + unsigned long pid;
> > > > > + char *comm;
> > > > > +
> > > > > + crash_get_task_info(ptid.tid(), &pid, &comm);
> > > > > + return string_printf ("%7ld %s", pid, comm);
> > > > > + }
> > > > >
> > > > > };
> > > > >
> > > > > @@ -68,18 +76,25 @@ public:
> > > > > void
> > > > > crash_target::fetch_registers (struct regcache *regcache, int
regno)
> > > > > {
> > > > > + int r;
> > > > > gdb_byte regval[16];
> > > > > - int cpu = inferior_ptid.tid();
> > > > > + int task_nr = inferior_ptid.tid();
> > > > > struct gdbarch *arch = regcache->arch ();
> > > > >
> > > > > - for (int r = 0; r < gdbarch_num_regs (arch); r++)
> > > > > + if (regno >= 0) {
> > > > > + r = regno;
> > > > > + goto onetime;
> > > > > + }
> > > > > +
> > > > > + for (r = 0; regno == -1 && r < gdbarch_num_regs
(arch); r++)
> > > > > {
> > > > > +onetime:
> > > > > const char *regname = gdbarch_register_name(arch, r);
> > > > > int regsize = register_size (arch, r);
> > > > > if (regsize > sizeof (regval))
> > > > > error (_("fatal error: buffer size is not enough
to fit
> > > > register value"));
> > > > >
> > > > > - if (crash_get_cpu_reg (cpu, r, regname, regsize, (void *)&regval))
> > > > > + if (crash_get_task_reg (task_nr, r, regname, regsize, (void *)&regval))
> > > > > regcache->raw_supply (r, regval);
> > > > > else
> > > > > regcache->raw_supply (r, NULL);
> > > > > @@ -107,10 +122,10 @@ crash_target::xfer_partial (enum
target_object
> > > > object, const char *annex,
> > > > >
> > > > > #define CRASH_INFERIOR_PID 1
> > > > >
> > > > > -void
> > > > > +extern "C" void
> > > > > crash_target_init (void)
> > > > > {
> > > > > - int nr_cpus = crash_get_nr_cpus();
> > > > > + int nr_tasks = crash_get_nr_tasks();
> > > > > crash_target *target = new crash_target ();
> > > > >
> > > > > /* Own the target until it is successfully pushed. */
> > > > > @@ -119,7 +134,7 @@ crash_target_init (void)
> > > > > push_target (std::move (target_holder));
> > > > >
> > > > > inferior_appeared (current_inferior (), CRASH_INFERIOR_PID);
> > > > > - for (int i = 0; i < nr_cpus; i++)
> > > > > + for (int i = 0; i < nr_tasks; i++)
> > > > > {
> > > > > thread_info *thread = add_thread_silent (target,
> > > > >
ptid_t(CRASH_INFERIOR_PID, 0,
> > > > i));
> > > > > diff --git a/defs.h b/defs.h
> > > > > index 98650e8..2b3f247 100644
> > > > > --- a/defs.h
> > > > > +++ b/defs.h
> > > > > @@ -1080,7 +1080,7 @@ struct machdep_table {
> > > > > void (*get_irq_affinity)(int);
> > > > > void (*show_interrupts)(int, ulong *);
> > > > > int (*is_page_ptr)(ulong, physaddr_t *);
> > > > > - int (*get_cpu_reg)(int, int, const char *, int, void
*);
> > > > > + int (*get_task_reg)(struct task_context *, int, const
char *,
> > > > int, void *);
> > > > > int (*is_cpu_prstatus_valid)(int cpu);
> > > > > };
> > > > >
> > > > > @@ -2263,6 +2263,7 @@ struct size_table { /* stash of
> > > > commonly-used sizes */
> > > > > long pt_regs;
> > > > > long task_struct;
> > > > > long thread_info;
> > > > > + long inactive_task_frame;
> > > > > long softirq_state;
> > > > > long desc_struct;
> > > > > long umode_t;
> > > > > @@ -8001,9 +8002,14 @@ extern int have_full_symbols(void);
> > > > > #define XEN_HYPERVISOR_ARCH
> > > > > #endif
> > > > >
> > > > > +/*
> > > > > + * crash_target.c
> > > > > + */
> > > > > +extern void crash_target_init (void);
> > > > > +
> > > > > /*
> > > > > * Register numbers must be in sync with
gdb/features/i386/64bit-core.c
> > > > > - * to make crash_target->fetch_registers() --->
machdep->get_cpu_reg()
> > > > > + * to make crash_target->fetch_registers() --->
machdep->get_task_reg()
> > > > > * working properly.
> > > > > */
> > > > > enum x86_64_regnum {
> > > > > diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> > > > > index a7018a2..ecf673d 100644
> > > > > --- a/gdb-10.2.patch
> > > > > +++ b/gdb-10.2.patch
> > > > > @@ -221,7 +221,7 @@ exit 0
> > > > > warning (_("\
> > > > > --- gdb-10.2/gdb/main.c.orig
> > > > > +++ gdb-10.2/gdb/main.c
> > > > > -@@ -392,6 +392,14 @@ start_event_loop ()
> > > > > +@@ -392,6 +392,13 @@ start_event_loop ()
> > > > > return;
> > > > > }
> > > > >
> > > > > @@ -230,7 +230,6 @@ exit 0
> > > > > +extern "C" void main_loop(void);
> > > > > +extern "C" unsigned long
crash_get_kaslr_offset(void);
> > > > > +extern "C" int console(const char *, ...);
> > > > > -+void crash_target_init (void);
> > > > > +#endif
> > > > > +
> > > > > /* Call command_loop. */
> > > > > @@ -316,7 +315,7 @@ exit 0
> > > > > }
> > > > > }
> > > > >
> > > > > -@@ -1242,6 +1274,16 @@ captured_main (void *data)
> > > > > +@@ -1242,6 +1274,14 @@ captured_main (void *data)
> > > > >
> > > > > captured_main_1 (context);
> > > > >
> > > > > @@ -324,8 +323,6 @@ exit 0
> > > > > + /* Relocate the vmlinux. */
> > > > > + objfile_rebase (symfile_objfile, crash_get_kaslr_offset());
> > > > > +
> > > > > -+ crash_target_init();
> > > > > -+
> > > > > + /* Back to crash. */
> > > > > + main_loop();
> > > > > +#endif
> > > > > diff --git a/gdb_interface.c b/gdb_interface.c
> > > > > index b14319c..03178f5 100644
> > > > > --- a/gdb_interface.c
> > > > > +++ b/gdb_interface.c
> > > > > @@ -3,6 +3,8 @@
> > > > > * Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux,
Inc.
> > > > > * Copyright (C) 2002-2015,2018-2019 David Anderson
> > > > > * Copyright (C) 2002-2015,2018-2019 Red Hat, Inc. All rights
reserved.
> > > > > + * Copyright (c) 2024 Broadcom. All Rights Reserved. The term
"Broadcom"
> > > > > + * refers to Broadcom Inc. and/or its subsidiaries.
> > > > > *
> > > > > * This program is free software; you can redistribute it
and/or modify
> > > > > * it under the terms of the GNU General Public License as
published by
> > > > > @@ -711,7 +713,7 @@ static char *prohibited_list[] = {
> > > > > "watch", "rwatch",
"awatch", "attach", "continue", "c",
"fg",
> > > > "detach",
> > > > > "finish", "handle",
"interrupt", "jump", "kill", "next",
"nexti",
> > > > > "signal", "step", "s",
"stepi", "target", "until", "delete",
> > > > > - "clear", "disable",
"enable", "condition", "ignore", "frame",
> > > > "catch",
> > > > > + "clear", "disable",
"enable", "condition", "ignore", "gcore",
> > > > "catch",
> > > > > "tcatch", "return",
"file", "exec-file", "core-file",
> > > > "symbol-file",
> > > > > "load", "si", "ni",
"shell", "sy",
> > > > > NULL /* must be last */
> > > > > @@ -877,6 +879,7 @@ gdb_readmem_callback(ulong addr, void *buf,
int len,
> > > > int write)
> > > > > switch (len)
> > > > > {
> > > > > case SIZEOF_8BIT:
> > > > > + fprintf(fp, "%s\n", pc->curcmd);
> > > > > if (STREQ(pc->curcmd, "bt")) {
> > > > > if (readmem(addr, memtype, buf,
SIZEOF_8BIT,
> > > > > "gdb_readmem_callback",
readflags))
> > > > > @@ -1063,34 +1066,52 @@ get_frame_offset(ulong pc)
> > > > > unsigned long crash_get_kaslr_offset(void);
> > > > > unsigned long crash_get_kaslr_offset(void)
> > > > > {
> > > > > - return kt->relocate * -1;
> > > > > + return kt->relocate * -1;
> > > > > }
> > > > >
> > > > > /* Callbacks for crash_target */
> > > > > -int crash_get_nr_cpus(void);
> > > > > -int crash_get_cpu_reg (int cpu, int regno, const char
*regname,
> > > > > +int crash_get_nr_tasks(void);
> > > > > +void crash_get_task_info(int task_nr, unsigned long *pid, char
**comm);
> > > > > +int crash_get_task_reg (int task_nr, int regno, const char
*regname,
> > > > > int regsize, void *val);
> > > > >
> > > > > -int crash_get_nr_cpus(void)
> > > > > +int crash_get_nr_tasks(void)
> > > > > {
> > > > > - if (SADUMP_DUMPFILE())
> > > > > - return sadump_get_nr_cpus();
> > > > > - else if (DISKDUMP_DUMPFILE())
> > > > > - return diskdump_get_nr_cpus();
> > > > > - else if (KDUMP_DUMPFILE())
> > > > > - return kdump_get_nr_cpus();
> > > > > - else if (VMSS_DUMPFILE())
> > > > > - return vmware_vmss_get_nr_cpus();
> > > > > -
> > > > > - /* Just CPU #0 */
> > > > > - return 1;
> > > > > + return RUNNING_TASKS();
> > > > > }
> > > >
> > > > crash_get_nr_tasks() will return a fixed number of running tasks,
> > > > which reflects the task state when crash was loading. However, in live
> > > > debug mode, tasks are constantly exiting and being created, so we need
> > > > to keep the task count in sync with "info threads". I guess it is not
> > > > easy work because that will involve:
> > > >
> > > > when crash loading:
> > > > for (int i = 0; i < nr_tasks; i++)
> > > > {
> > > > thread_info *thread = add_thread_silent (target, ...
> > > > }
> > > >
> > > > when new task create:
> > > > thread_info *thread = add_thread_silent (target, ...
> > > >
> > > > when task exit:
> > > > delete_thread(...)
> > > >
> > > > We may monitor those task create/exit events and do the thread add/delete
> > > > in a callback function, or we can just delete and re-add all tasks
> > > > whenever "info threads" is invoked. But I guess neither of these is as
> > > > simple as using the crash "ps" cmd and "set <pid>", which won't have any
> > > > such problem, because crash "ps" always shows the current task status,
> > > > and "set <pid>" can switch the task context to it.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Tao Liu
> > > >
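
The bookkeeping that the comment above calls "not easy work" could look roughly like the following reconciliation step. This is a hedged sketch only: `toy_sync_threads` and its callbacks are hypothetical stand-ins for calls such as add_thread_silent() and delete_thread(), and it assumes both pid lists are sorted and free of duplicates. Keying by pid alone is also an assumption; a real implementation would have to cope with pid reuse on a live system.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sync_stats {
    size_t added;
    size_t deleted;
};

/*
 * Compare the thread list gdb already knows (old_pids) with the task
 * list crash sees now (new_pids). Both arrays must be sorted ascending
 * and contain no duplicates. on_add/on_delete may be NULL.
 */
struct toy_sync_stats
toy_sync_threads(const int *old_pids, size_t n_old,
                 const int *new_pids, size_t n_new,
                 void (*on_add)(int pid), void (*on_delete)(int pid))
{
    struct toy_sync_stats st = { 0, 0 };
    size_t i = 0, j = 0;

    while (i < n_old || j < n_new) {
        if (j == n_new || (i < n_old && old_pids[i] < new_pids[j])) {
            /* Task exited: only present in the old list. */
            if (on_delete)
                on_delete(old_pids[i]);
            st.deleted++;
            i++;
        } else if (i == n_old || new_pids[j] < old_pids[i]) {
            /* Task created: only present in the new list. */
            if (on_add)
                on_add(new_pids[j]);
            st.added++;
            j++;
        } else {
            /* Present in both lists: nothing to do. */
            i++;
            j++;
        }
    }
    return st;
}
```

Running this on every "info threads" invocation corresponds to the second option in the comment; the event-driven option would call the same add/delete hooks from task create/exit callbacks instead.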
> > > > >
> > > > > -int crash_get_cpu_reg (int cpu, int regno, const char *regname,
> > > > > - int regsize, void *value)
> > > > > +/* Get task information by its index number in TT */
> > > > > +void crash_get_task_info(int task_nr, unsigned long *pid, char **comm)
> > > > > {
> > > > > - if (!machdep->get_cpu_reg)
> > > > > - return FALSE;
> > > > > - return machdep->get_cpu_reg(cpu, regno, regname, regsize, value);
> > > > > + int i;
> > > > > + struct task_context *tc;
> > > > > +
> > > > > + tc = FIRST_CONTEXT();
> > > > > + for (i = 0; i < RUNNING_TASKS(); i++, tc++)
> > > > > + if (i == task_nr) {
> > > > > + *pid = tc->pid;
> > > > > + *comm = tc->comm;
> > > > > + return;
> > > > > + }
> > > > > + *pid = 0;
> > > > > + *comm = NULL;
> > > > > + return;
> > > > > +}
> > > > > +
> > > > > +int crash_get_task_reg (int task_nr, int regno, const char
*regname,
> > > > > + int regsize, void *value)
> > > > > +{
> > > > > + int i;
> > > > > + struct task_context *tc;
> > > > > +
> > > > > + if (!machdep->get_task_reg)
> > > > > + return FALSE;
> > > > > +
> > > > > + tc = FIRST_CONTEXT();
> > > > > + for (i = 0; i < RUNNING_TASKS(); i++, tc++)
> > > > > + if (i == task_nr) {
> > > > > + return machdep->get_task_reg(tc,
regno, regname,
> > > > regsize, value);
> > > > > + }
> > > > > + return FALSE;
> > > > > }
> > > > >
> > > > > diff --git a/help.c b/help.c
> > > > > index a9c4d30..85dbda5 100644
> > > > > --- a/help.c
> > > > > +++ b/help.c
> > > > > @@ -8520,6 +8520,7 @@ char *version_info[] = {
> > > > > "Copyright (C) 1999, 2002, 2007 Silicon Graphics,
Inc.",
> > > > > "Copyright (C) 1999, 2000, 2001, 2002 Mission Critical
Linux, Inc.",
> > > > > "Copyright (C) 2015, 2021 VMware, Inc.",
> > > > > +"Copyright (C) 2024 Broadcom, Inc.",
> > > > > "This program is free software, covered by the GNU General
Public
> > > > License,",
> > > > > "and you are welcome to change it and/or distribute copies
of it under",
> > > > > "certain conditions. Enter \"help copying\" to
see the conditions.",
> > > > > diff --git a/main.c b/main.c
> > > > > index 0b6b927..13acd2d 100644
> > > > > --- a/main.c
> > > > > +++ b/main.c
> > > > > @@ -794,6 +794,7 @@ main_loop(void)
> > > > > } else
> > > > > SIGACTION(SIGINT, restart,
&pc->sigaction, NULL);
> > > > >
> > > > > + crash_target_init();
> > > > > /*
> > > > > * Display system statistics and current context.
> > > > > */
> > > > > diff --git a/task.c b/task.c
> > > > > index ebdb5be..5d26c52 100644
> > > > > --- a/task.c
> > > > > +++ b/task.c
> > > > > @@ -298,6 +298,7 @@ task_init(void)
> > > > > tt->flags |= THREAD_INFO;
> > > > > }
> > > > >
> > > > > + STRUCT_SIZE_INIT(inactive_task_frame,
"inactive_task_frame");
> > > > > MEMBER_OFFSET_INIT(task_struct_state,
"task_struct", "state");
> > > > > MEMBER_SIZE_INIT(task_struct_state,
"task_struct", "state");
> > > > > if (INVALID_MEMBER(task_struct_state)) {
> > > > > diff --git a/x86_64.c b/x86_64.c
> > > > > index 502817d..b6e36a5 100644
> > > > > --- a/x86_64.c
> > > > > +++ b/x86_64.c
> > > > > @@ -126,7 +126,7 @@ static int x86_64_get_framesize(struct
bt_info *,
> > > > ulong, ulong, char *);
> > > > > static void x86_64_framesize_debug(struct bt_info *);
> > > > > static void x86_64_get_active_set(void);
> > > > > static int x86_64_get_kvaddr_ranges(struct vaddr_range *);
> > > > > -static int x86_64_get_cpu_reg(int, int, const char *, int, void
*);
> > > > > +static int x86_64_get_task_reg(struct task_context *, int,
const char
> > > > *, int, void *);
> > > > > static int x86_64_verify_paddr(uint64_t);
> > > > > static void GART_init(void);
> > > > > static void x86_64_exception_stacks_init(void);
> > > > > @@ -195,7 +195,7 @@ x86_64_init(int when)
> > > > > machdep->machspec->irq_eframe_link =
UNINITIALIZED;
> > > > > machdep->machspec->irq_stack_gap =
UNINITIALIZED;
> > > > > machdep->get_kvaddr_ranges =
x86_64_get_kvaddr_ranges;
> > > > > - machdep->get_cpu_reg = x86_64_get_cpu_reg;
> > > > > + machdep->get_task_reg = x86_64_get_task_reg;
> > > > > if (machdep->cmdline_args[0])
> > > > > parse_cmdline_args();
> > > > > if ((string =
pc->read_vmcoreinfo("relocate"))) {
> > > > > @@ -891,7 +891,7 @@ x86_64_dump_machdep_table(ulong arg)
> > > > > fprintf(fp, " is_page_ptr:
x86_64_is_page_ptr()\n");
> > > > > fprintf(fp, " verify_paddr:
x86_64_verify_paddr()\n");
> > > > > fprintf(fp, " get_kvaddr_ranges:
> > > > x86_64_get_kvaddr_ranges()\n");
> > > > > - fprintf(fp, " get_cpu_reg:
x86_64_get_cpu_reg()\n");
> > > > > + fprintf(fp, " get_task_reg:
x86_64_get_task_reg()\n");
> > > > > fprintf(fp, " init_kernel_pgd:
x86_64_init_kernel_pgd()\n");
> > > > > fprintf(fp, "clear_machdep_cache:
> > > > x86_64_clear_machdep_cache()\n");
> > > > > fprintf(fp, " xendump_p2m_create: %s\n",
PVOPS_XEN() ?
> > > > > @@ -6398,6 +6398,9 @@ x86_64_ORC_init(void)
> > > > > };
> > > > > struct ORC_data *orc;
> > > > >
> > > > > + MEMBER_OFFSET_INIT(inactive_task_frame_bp,
> > > > "inactive_task_frame", "bp");
> > > > > + MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr,
> > > > "inactive_task_frame", "ret_addr");
> > > > > +
> > > > > if (machdep->flags & FRAMEPOINTER)
> > > > > return;
> > > > >
> > > > > @@ -6455,9 +6458,6 @@ x86_64_ORC_init(void)
> > > > > orc->__stop_orc_unwind =
symbol_value("__stop_orc_unwind");
> > > > > orc->orc_lookup =
symbol_value("orc_lookup");
> > > > >
> > > > > - MEMBER_OFFSET_INIT(inactive_task_frame_bp,
> > > > "inactive_task_frame", "bp");
> > > > > - MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr,
> > > > "inactive_task_frame", "ret_addr");
> > > > > -
> > > > > orc->has_signal =
MEMBER_EXISTS("orc_entry", "signal"); /* added
> > > > at 6.3 */
> > > > > orc->has_end = MEMBER_EXISTS("orc_entry",
"end"); /*
> > > > removed at 6.4 */
> > > > >
> > > > > @@ -9070,14 +9070,64 @@ x86_64_get_kvaddr_ranges(struct
vaddr_range *vrp)
> > > > > }
> > > > >
> > > > > static int
> > > > > -x86_64_get_cpu_reg(int cpu, int regno, const char *name,
> > > > > +x86_64_get_task_reg(struct task_context *tc, int regno, const char *name,
> > > > > int size, void *value)
> > > > > {
> > > > > if (regno >= LAST_REGNUM)
> > > > > return FALSE;
> > > > >
> > > > > +     /*
> > > > > +      * For inactive task, grab rip, rbp, rbx, r12, r13, r14 and r15 from
> > > > > +      * inactive_task_frame (see __switch_to_asm). Other regs saved on
> > > > > +      * regular frame.
> > > > > +      */
> > > > > +     if (!is_task_active(tc->task)) {
> > > > > +         int frame_size = STRUCT_SIZE("inactive_task_frame");
> > > > > +
> > > > > +         /* Only modern kernels supported. */
> > > > > +         if (tt->flags & THREAD_INFO && frame_size == 7 * 8) {
> > > > > +             ulong rsp;
> > > > > +             int offset = 0;
> > > > > +             switch (regno) {
> > > > > +             case RSP_REGNUM:
> > > > > +                 readmem(tc->task + OFFSET(task_struct_thread) +
> > > > > +                     OFFSET(thread_struct_rsp), KVADDR,
> > > > > +                     &rsp, sizeof(void *),
> > > > > +                     "thread_struct rsp", FAULT_ON_ERROR);
> > > > > +                 rsp += frame_size;
> > > > > +                 memcpy(value, &rsp, size);
> > > > > +                 return TRUE;
> > > > > +             case RIP_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case RBP_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case RBX_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case R12_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case R13_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case R14_REGNUM:
> > > > > +                 offset += 8;
> > > > > +             case R15_REGNUM:
> > > > > +                 readmem(tc->task + OFFSET(task_struct_thread) +
> > > > > +                     OFFSET(thread_struct_rsp), KVADDR,
> > > > > +                     &rsp, sizeof(void *),
> > > > > +                     "thread_struct rsp", FAULT_ON_ERROR);
> > > > > +                 readmem(rsp + offset, KVADDR, value, sizeof(void *),
> > > > > +                     "inactive_thread_frame saved regs", FAULT_ON_ERROR);
> > > > > +                 return TRUE;
> > > > > +             }
> > > > > +         }
> > > > > +         /* TBD: older kernels support. */
> > > > > +         return FALSE;
> > > > > +     }
> > > > > +
> > > > > + /*
> > > > > + * Task is active, grab CPU's registers
> > > > > + */
> > > > > if (VMSS_DUMPFILE())
> > > > > - return vmware_vmss_get_cpu_reg(cpu, regno, name, size, value);
> > > > > + return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
> > > > >
> > > > > return FALSE;
> > > > > }
> > > > > --
> > > > > 2.39.0
> > > > >
> > > >
> > > >
> >
>
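
For readers following the fall-through `switch` in x86_64_get_task_reg() above, the offsets it accumulates can be written out explicitly. The sketch below is an illustration, not part of the patch: `struct toy_inactive_task_frame` is a local mirror of the seven 8-byte slots implied by the `frame_size == 7 * 8` check and the case order in the patch, and the `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Local mirror of the x86_64 frame layout assumed by the patch:
 * seven 8-byte slots, lowest address first.
 */
struct toy_inactive_task_frame {
    uint64_t r15;
    uint64_t r14;
    uint64_t r13;
    uint64_t r12;
    uint64_t bx;
    uint64_t bp;
    uint64_t ret_addr;  /* reported to gdb as rip */
};

/* Hypothetical register ids for this sketch only. */
enum toy_reg {
    TOY_R15, TOY_R14, TOY_R13, TOY_R12, TOY_RBX, TOY_RBP, TOY_RIP, TOY_OTHER
};

/* Byte offset of a saved register from the task's saved rsp, or -1. */
long toy_frame_offset(enum toy_reg reg)
{
    switch (reg) {
    case TOY_R15: return (long)offsetof(struct toy_inactive_task_frame, r15);
    case TOY_R14: return (long)offsetof(struct toy_inactive_task_frame, r14);
    case TOY_R13: return (long)offsetof(struct toy_inactive_task_frame, r13);
    case TOY_R12: return (long)offsetof(struct toy_inactive_task_frame, r12);
    case TOY_RBX: return (long)offsetof(struct toy_inactive_task_frame, bx);
    case TOY_RBP: return (long)offsetof(struct toy_inactive_task_frame, bp);
    case TOY_RIP: return (long)offsetof(struct toy_inactive_task_frame, ret_addr);
    default:      return -1; /* not saved in this frame */
    }
}

/* rsp as seen by the unwinder: the saved rsp with the frame skipped. */
uint64_t toy_unwound_rsp(uint64_t saved_rsp)
{
    return saved_rsp + sizeof(struct toy_inactive_task_frame);
}
```

The table form gives the same numbers as the fall-through version (R15 at 0 up to RIP at 48) and makes the dependency on the frame layout visible in one place.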