Hi Tao,

Thanks for the details. Agreed, my approach primarily targets offline debugging.
Monitoring task create/exit events sounds like overkill. On the other hand, maintaining a static list of threads (CPUs or tasks) does not work for live debugging.
Using "ps" followed by "set", so that GDB receives just one inferior/thread at a time, is probably the right universal approach.
Let me experiment with a combination of your patches and mine, and I will get back to you.

Thanks,
--Alexey


On Thu, Mar 14, 2024 at 9:22 PM Tao Liu <ltao@redhat.com> wrote:
Hi Alexey,

On Thu, Mar 14, 2024 at 6:29 PM Alexey Makhalov
<alexey.makhalov@broadcom.com> wrote:
>
> Support for GDB debugging of all tasks active and inactive.
> Before this commit only active tasks were listed by "info threads"
> with "CPU #" as a Target Id.
>
> "info threads" will now show all tasks, similar to "ps", example:
> crash> info threads
>   Id   Target Id               Frame
> * 1          0 swapper/0       0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
>   2          0 swapper/1       0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
>   3          0 swapper/2       0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
>   4          0 swapper/3       0xffffffffadba19d4 in default_idle () at arch/x86/kernel/process.c:731
>   5          0 swapper/4       0xffffffffadb97292 in context_switch (rf=0xffffbaf0000f3e88, next=0xffff9ecb04908000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> ...
>   730   970325 taskset         0xffffffffadb97292 in context_switch (rf=0xffffbaf006a0fd18, next=0xffff9ecb0aec0000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>   731   975217 sleep           0xffffffffadb97292 in context_switch (rf=0xffffbaf005743c20, next=0xffff9ecac0692880, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>   732   975228 sleep           0xffffffffadb97292 in context_switch (rf=0xffffbaf00696fb58, next=0xffff9ecac0690000, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> ...
>   876   976084 docker          0xffffffffadb97292 in context_switch (rf=0xffffbaf0153dbd10, next=0xffff9ecac0645100, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>   877   976085 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153cbc58, next=0xffff9ecac0645100, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>   878   976086 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153e3c58, next=0xffffffffaec15a40 <init_task>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>   879   976087 systemd-userwor 0xffffffffadb97292 in context_switch (rf=0xffffbaf0153ebc58, next=0xffffffffaec15a40 <init_task>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
> Where "Target ID" contains "PID COMM" of the task
>

I see that we chose different technical paths for viewing arbitrary tasks.
You improved "info threads" as an alternative to the crash "ps" cmd,
and use "thread X" to switch to task X and view its stack trace.
Mine instead uses the crash "ps" cmd first to get a specific task's pid
or task_struct, then uses the crash cmd "set <pid>" or "set
<task_struct>" to switch to task X and view its stack trace.

One problem I notice with the "info threads as ps" path is that it is
difficult to handle live debugging; see the comment below:

> Example of "731   975217 sleep" debugging, real case, trying to
> figure out why sleep was stuck in uninterruptible sleep.
> Backtrace using crash:
> crash> ps | grep 975217
>    975217  969797   3  ffff9ecb3956a880  UN   0.0        0        0  sleep
> crash> bt 975217
> PID: 975217   TASK: ffff9ecb3956a880  CPU: 3    COMMAND: "sleep"
>  #0 [ffffbaf005743ba0] __schedule at ffffffffadb97292
>  #1 [ffffbaf005743c60] schedule at ffffffffadb982b8
>  #2 [ffffbaf005743c80] rwbase_write_lock at ffffffffadb9aed7
>  #3 [ffffbaf005743cc0] down_write at ffffffffadb9b133
>  #4 [ffffbaf005743cd0] unlink_file_vma at ffffffffad2b0e2e
>  #5 [ffffbaf005743cf8] free_pgtables at ffffffffad2a47b0
>  #6 [ffffbaf005743d88] exit_mmap at ffffffffad2b3b8d
>  #7 [ffffbaf005743e80] mmput at ffffffffad08c81f
>  #8 [ffffbaf005743e98] do_exit at ffffffffad09636c
>  #9 [ffffbaf005743ef8] do_group_exit at ffffffffad096c78
>     RIP: 00007f111c70ddf9  RSP: 00007fff451817e8  RFLAGS: 00000246
>     RAX: ffffffffffffffda  RBX: 00007f111c8089e0  RCX: 00007f111c70ddf9
>     RDX: 000000000000003c  RSI: 00000000000000e7  RDI: 0000000000000000
>     RBP: 0000000000000000   R8: ffffffffffffff80   R9: 0000000000000000
>     R10: 00007fff451817b0  R11: 0000000000000246  R12: 00007f111c8089e0
>     R13: 00007f111c80e2e0  R14: 0000000000000002  R15: 00007f111c80e2c8
>     ORIG_RAX: 00000000000000e7  CS: 0033  SS: 002b
>
> Backtrace using gdb (pay attention, task must be selected by thread Id):
> crash> thread 731
> [Switching to thread 731 ( 975217 sleep)]
> 5372            switch_to(prev, next, prev);
> crash> gdb bt
>  #0  0xffffffffadb97292 in context_switch (rf=0xffffbaf005743c20, next=0xffff9ecac0692880, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:5372
>  #1  __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6696
>  #2  0xffffffffadb982b8 in schedule () at kernel/sched/core.c:6772
>  #3  0xffffffffadb9aed7 in rwbase_write_lock (rwb=rwb@entry=0xffff9ecaf1831430, state=state@entry=2) at kernel/locking/rwbase_rt.c:259
>  #4  0xffffffffadb9b133 in __down_write (sem=sem@entry=0xffff9ecaf1831430) at kernel/locking/rwsem.c:1474
>  #5  down_write (sem=sem@entry=0xffff9ecaf1831430) at kernel/locking/rwsem.c:1574
>  #6  0xffffffffad2b0e2e in i_mmap_lock_write (mapping=<optimized out>) at ./include/linux/fs.h:466
>  #7  unlink_file_vma (vma=vma@entry=0xffff9ecadd566090) at mm/mmap.c:127
>  #8  0xffffffffad2a47b0 in free_pgtables (tlb=tlb@entry=0xffffbaf005743dd0, mt=mt@entry=0xffff9ecb28e3b180, vma=0xffff9ecadd566090, vma@entry=0xffff9ecadd566000, floor=floor@entry=0, ceiling=ceiling@entry=0) at mm/memory.c:431
>  #9  0xffffffffad2b3b8d in exit_mmap (mm=mm@entry=0xffff9ecb28e3b180) at mm/mmap.c:3237
>  #10 0xffffffffad08c81f in __mmput (mm=0xffff9ecb28e3b180) at kernel/fork.c:1204
>  #11 mmput (mm=mm@entry=0xffff9ecb28e3b180) at kernel/fork.c:1226
>  #12 0xffffffffad09636c in exit_mm () at kernel/exit.c:563
>  #13 do_exit (code=code@entry=0) at kernel/exit.c:856
>  #14 0xffffffffad096c78 in do_group_exit (exit_code=0) at kernel/exit.c:1019
>  #15 0xffffffffad096cf8 in __do_sys_exit_group (error_code=<optimized out>) at kernel/exit.c:1030
>  #16 __se_sys_exit_group (error_code=<optimized out>) at kernel/exit.c:1028
>  #17 __x64_sys_exit_group (regs=<optimized out>) at kernel/exit.c:1028
>  #18 0xffffffffadb8a327 in do_syscall_x64 (nr=<optimized out>, regs=0xffffbaf005743f58) at arch/x86/entry/common.c:51
>  #19 do_syscall_64 (regs=0xffffbaf005743f58, nr=<optimized out>) at arch/x86/entry/common.c:81
>  #20 0xffffffffadc000dc in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
>  #21 0x00007f111c80e2c8 in ?? ()
>  #22 0x0000000000000002 in ?? ()
>  #23 0x00007f111c80e2e0 in ?? ()
>  #24 0x00007f111c8089e0 in ?? ()
>  #25 0x0000000000000000 in ?? ()
> crash> f 3
> 259                     rwbase_schedule();
> crash> p *rwb
> $1 = {
>   readers = {
>     counter = 1
>   },
>   rtmutex = {
>     wait_lock = {
>       raw_lock = {
>         {
>           val = {
>             counter = 0
>           },
>           {
>             locked = 0 '\000',
>             pending = 0 '\000'
>           },
>           {
>             locked_pending = 0,
>             tail = 0
>           }
>         }
>       }
>     },
>     waiters = {
>       rb_root = {
>         rb_node = 0xffffbaf006977be0
>       },
>       rb_leftmost = 0xffffbaf00696fbe0
>     },
>     owner = 0xffff9ecb3956a881
>   }
> }
>
> Additional changes:
> 1. Allow gdb "frame" command.
> 2. Blacklist useless gdb "gcore" command. Use gcore plugin instead.
> 3. Move crash_target_init() to later time as crash target requires a list of
>    tasks to be initialized.
>
> Known issues and TBD items:
> 1. "info threads" may bail out the first time, throwing errors while trying
>    to access a userspace address during the unwind process. Subsequent
>    "info threads" invocations run without issues.
> 2. To unwind a stack of inactive task, only modern Linux versions, which use
>    inactive_task_frame, are supported and only x86_64 architecture.
> 3. gdb bt unwinder does not stop properly and may show invalid frames (21-25
>    on example above). Not a regression, existed before.
> 4. gdb bt unwinder does not work on active tasks in userspace. Not a regression,
>    existed before.
> 5. Only x86_64 architecture supported. machdep->get_task_reg() must be
>    implemented for others. Not a regression, existed before.
> 6. Register fetching for active tasks is implemented only for VMware dumps, see
>    x86_64_get_task_reg() for more details. Not a regression, existed before.
>
> Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
> ---
>  crash_target.c  | 39 ++++++++++++++++++++---------
>  defs.h          | 10 ++++++--
>  gdb-10.2.patch  |  7 ++----
>  gdb_interface.c | 63 ++++++++++++++++++++++++++++++----------------
>  help.c          |  1 +
>  main.c          |  1 +
>  task.c          |  1 +
>  x86_64.c        | 66 +++++++++++++++++++++++++++++++++++++++++++------
>  8 files changed, 140 insertions(+), 48 deletions(-)
>
> diff --git a/crash_target.c b/crash_target.c
> index 4554806..2fdf203 100644
> --- a/crash_target.c
> +++ b/crash_target.c
> @@ -2,6 +2,8 @@
>   * crash_target.c
>   *
>   * Copyright (c) 2021 VMware, Inc.
> + * Copyright (c) 2024 Broadcom. All Rights Reserved. The term "Broadcom"
> + * refers to Broadcom Inc. and/or its subsidiaries.
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License as published by
> @@ -13,7 +15,7 @@
>   * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>   * GNU General Public License for more details.
>   *
> - * Author: Alexey Makhalov <amakhalov@vmware.com>
> + * Author: Alexey Makhalov <alexey.makhalov@broadcom.com>
>   */
>
>  #include <defs.h>
> @@ -23,11 +25,11 @@
>  #include "regcache.h"
>  #include "gdbarch.h"
>
> -void crash_target_init (void);
> -
> +extern "C" void crash_target_init (void);
>  extern "C" int gdb_readmem_callback(unsigned long, void *, int, int);
> -extern "C" int crash_get_nr_cpus(void);
> -extern "C" int crash_get_cpu_reg (int cpu, int regno, const char *regname,
> +extern "C" int crash_get_nr_tasks(void);
> +extern "C" void crash_get_task_info(int task_nr, unsigned long *pid, char **comm);
> +extern "C" int crash_get_task_reg (int task_nr, int regno, const char *regname,
>                                    int regsize, void *val);
>
>
> @@ -60,7 +62,13 @@ public:
>    bool has_registers () override { return true; }
>    bool thread_alive (ptid_t ptid) override { return true; }
>    std::string pid_to_str (ptid_t ptid) override
> -  { return string_printf ("CPU %ld", ptid.tid ()); }
> +  {
> +    unsigned long pid;
> +    char *comm;
> +
> +    crash_get_task_info(ptid.tid(), &pid, &comm);
> +    return string_printf ("%7ld %s", pid, comm);
> +  }
>
>  };
>
> @@ -68,18 +76,25 @@ public:
>  void
>  crash_target::fetch_registers (struct regcache *regcache, int regno)
>  {
> +  int r;
>    gdb_byte regval[16];
> -  int cpu = inferior_ptid.tid();
> +  int task_nr = inferior_ptid.tid();
>    struct gdbarch *arch = regcache->arch ();
>
> -  for (int r = 0; r < gdbarch_num_regs (arch); r++)
> +  if (regno >= 0) {
> +    r = regno;
> +    goto onetime;
> +  }
> +
> +  for (r = 0; regno == -1 && r < gdbarch_num_regs (arch); r++)
>      {
> +onetime:
>        const char *regname = gdbarch_register_name(arch, r);
>        int regsize = register_size (arch, r);
>        if (regsize > sizeof (regval))
>          error (_("fatal error: buffer size is not enough to fit register value"));
>
> -      if (crash_get_cpu_reg (cpu, r, regname, regsize, (void *)&regval))
> +      if (crash_get_task_reg (task_nr, r, regname, regsize, (void *)&regval))
>          regcache->raw_supply (r, regval);
>        else
>          regcache->raw_supply (r, NULL);
> @@ -107,10 +122,10 @@ crash_target::xfer_partial (enum target_object object, const char *annex,
>
>  #define CRASH_INFERIOR_PID 1
>
> -void
> +extern "C" void
>  crash_target_init (void)
>  {
> -  int nr_cpus = crash_get_nr_cpus();
> +  int nr_tasks = crash_get_nr_tasks();
>    crash_target *target = new crash_target ();
>
>    /* Own the target until it is successfully pushed.  */
> @@ -119,7 +134,7 @@ crash_target_init (void)
>    push_target (std::move (target_holder));
>
>    inferior_appeared (current_inferior (), CRASH_INFERIOR_PID);
> -  for (int i = 0; i < nr_cpus; i++)
> +  for (int i = 0; i < nr_tasks; i++)
>      {
>        thread_info *thread = add_thread_silent (target,
>                                          ptid_t(CRASH_INFERIOR_PID, 0, i));
> diff --git a/defs.h b/defs.h
> index 98650e8..2b3f247 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -1080,7 +1080,7 @@ struct machdep_table {
>          void (*get_irq_affinity)(int);
>          void (*show_interrupts)(int, ulong *);
>         int (*is_page_ptr)(ulong, physaddr_t *);
> -       int (*get_cpu_reg)(int, int, const char *, int, void *);
> +       int (*get_task_reg)(struct task_context *, int, const char *, int, void *);
>         int (*is_cpu_prstatus_valid)(int cpu);
>  };
>
> @@ -2263,6 +2263,7 @@ struct size_table {         /* stash of commonly-used sizes */
>         long pt_regs;
>         long task_struct;
>         long thread_info;
> +       long inactive_task_frame;
>         long softirq_state;
>         long desc_struct;
>         long umode_t;
> @@ -8001,9 +8002,14 @@ extern int have_full_symbols(void);
>  #define XEN_HYPERVISOR_ARCH
>  #endif
>
> +/*
> + *  crash_target.c
> + */
> +extern void crash_target_init (void);
> +
>  /*
>   * Register numbers must be in sync with gdb/features/i386/64bit-core.c
> - * to make crash_target->fetch_registers() ---> machdep->get_cpu_reg()
> + * to make crash_target->fetch_registers() ---> machdep->get_task_reg()
>   * working properly.
>   */
>  enum x86_64_regnum {
> diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> index a7018a2..ecf673d 100644
> --- a/gdb-10.2.patch
> +++ b/gdb-10.2.patch
> @@ -221,7 +221,7 @@ exit 0
>           warning (_("\
>  --- gdb-10.2/gdb/main.c.orig
>  +++ gdb-10.2/gdb/main.c
> -@@ -392,6 +392,14 @@ start_event_loop ()
> +@@ -392,6 +392,13 @@ start_event_loop ()
>     return;
>   }
>
> @@ -230,7 +230,6 @@ exit 0
>  +extern "C" void main_loop(void);
>  +extern "C" unsigned long crash_get_kaslr_offset(void);
>  +extern "C" int console(const char *, ...);
> -+void crash_target_init (void);
>  +#endif
>  +
>   /* Call command_loop.  */
> @@ -316,7 +315,7 @@ exit 0
>         }
>       }
>
> -@@ -1242,6 +1274,16 @@ captured_main (void *data)
> +@@ -1242,6 +1274,14 @@ captured_main (void *data)
>
>     captured_main_1 (context);
>
> @@ -324,8 +323,6 @@ exit 0
>  +  /* Relocate the vmlinux. */
>  +  objfile_rebase (symfile_objfile, crash_get_kaslr_offset());
>  +
> -+  crash_target_init();
> -+
>  +  /* Back to crash.  */
>  +  main_loop();
>  +#endif
> diff --git a/gdb_interface.c b/gdb_interface.c
> index b14319c..03178f5 100644
> --- a/gdb_interface.c
> +++ b/gdb_interface.c
> @@ -3,6 +3,8 @@
>   * Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>   * Copyright (C) 2002-2015,2018-2019 David Anderson
>   * Copyright (C) 2002-2015,2018-2019 Red Hat, Inc. All rights reserved.
> + * Copyright (c) 2024 Broadcom. All Rights Reserved. The term "Broadcom"
> + * refers to Broadcom Inc. and/or its subsidiaries.
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License as published by
> @@ -711,7 +713,7 @@ static char *prohibited_list[] = {
>         "watch", "rwatch", "awatch", "attach", "continue", "c", "fg", "detach",
>         "finish", "handle", "interrupt", "jump", "kill", "next", "nexti",
>         "signal", "step", "s", "stepi", "target", "until", "delete",
> -       "clear", "disable", "enable", "condition", "ignore", "frame", "catch",
> +       "clear", "disable", "enable", "condition", "ignore", "gcore", "catch",
>         "tcatch", "return", "file", "exec-file", "core-file", "symbol-file",
>         "load", "si", "ni", "shell", "sy",
>         NULL  /* must be last */
> @@ -877,6 +879,7 @@ gdb_readmem_callback(ulong addr, void *buf, int len, int write)
>         switch (len)
>         {
>         case SIZEOF_8BIT:
> +               fprintf(fp, "%s\n", pc->curcmd);
>                 if (STREQ(pc->curcmd, "bt")) {
>                         if (readmem(addr, memtype, buf, SIZEOF_8BIT,
>                             "gdb_readmem_callback", readflags))
> @@ -1063,34 +1066,52 @@ get_frame_offset(ulong pc)
>  unsigned long crash_get_kaslr_offset(void);
>  unsigned long crash_get_kaslr_offset(void)
>  {
> -        return kt->relocate * -1;
> +       return kt->relocate * -1;
>  }
>
>  /* Callbacks for crash_target */
> -int crash_get_nr_cpus(void);
> -int crash_get_cpu_reg (int cpu, int regno, const char *regname,
> +int crash_get_nr_tasks(void);
> +void crash_get_task_info(int task_nr, unsigned long *pid, char **comm);
> +int crash_get_task_reg (int task_nr, int regno, const char *regname,
>                         int regsize, void *val);
>
> -int crash_get_nr_cpus(void)
> +int crash_get_nr_tasks(void)
>  {
> -        if (SADUMP_DUMPFILE())
> -                return sadump_get_nr_cpus();
> -        else if (DISKDUMP_DUMPFILE())
> -                return diskdump_get_nr_cpus();
> -        else if (KDUMP_DUMPFILE())
> -                return kdump_get_nr_cpus();
> -        else if (VMSS_DUMPFILE())
> -                return vmware_vmss_get_nr_cpus();
> -
> -        /* Just CPU #0 */
> -        return 1;
> +       return RUNNING_TASKS();
>  }

crash_get_nr_tasks() returns a fixed number of running tasks, which
reflects the task state at the time crash is loaded. In live debug
mode, however, tasks are constantly being created and exiting, so the
task list would have to be kept in sync with "info threads". I guess
that is not easy, because it would involve:

when crash is loading:
for (int i = 0; i < nr_tasks; i++)
{
    thread_info *thread = add_thread_silent (target, ...
}

when a new task is created:
thread_info *thread = add_thread_silent (target, ...

when a task exits:
delete_thread(...)

We could monitor those task create/exit events and add/delete threads
in a callback function, or simply delete and re-add all tasks whenever
"info threads" is invoked. But I guess neither of these is as simple as
using the crash "ps" cmd and "set <pid>", which has no such problem:
crash "ps" always shows the current task status, and "set <pid>"
switches the task context to the selected task.
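
To make the cost of the "resync on every info threads" option concrete,
here is a minimal sketch of the bookkeeping it would need. It is not
taken from either patch set. It assumes both lists are sorted arrays of
task_struct addresses (the address is used as the key rather than the
PID, because the per-CPU swapper tasks all share PID 0), and the names
sync_threads, sync_result, on_add and on_del are hypothetical. The two
callbacks stand in for gdb's add_thread_silent() and delete_thread().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of one resync pass. */
struct sync_result {
	size_t added;    /* tasks that exist now but gdb has no thread for */
	size_t removed;  /* gdb threads whose task has exited */
};

/*
 * Compare the task list gdb already knows about with a fresh snapshot.
 * Both arrays hold task_struct addresses sorted in ascending order.
 * on_add/on_del are placeholders for the gdb-side thread add/delete
 * calls and may be NULL.
 */
struct sync_result
sync_threads(const unsigned long *known, size_t n_known,
	     const unsigned long *current, size_t n_current,
	     void (*on_add)(unsigned long task),
	     void (*on_del)(unsigned long task))
{
	struct sync_result res = { 0, 0 };
	size_t i = 0, j = 0;

	assert(known != NULL || n_known == 0);
	assert(current != NULL || n_current == 0);

	while (i < n_known || j < n_current) {
		if (j == n_current ||
		    (i < n_known && known[i] < current[j])) {
			/* Known to gdb, missing from the snapshot: exited. */
			if (on_del)
				on_del(known[i]);
			res.removed++;
			i++;
		} else if (i == n_known || current[j] < known[i]) {
			/* In the snapshot, unknown to gdb: newly created. */
			if (on_add)
				on_add(current[j]);
			res.added++;
			j++;
		} else {
			/* Present on both sides: nothing to do. */
			i++;
			j++;
		}
	}
	return res;
}
```

Even in this simplified form, every "info threads" has to rebuild and
sort a full task snapshot first, which is the overhead the "ps" plus
"set <pid>" path avoids.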

What do you think?

Thanks,
Tao Liu

>
> -int crash_get_cpu_reg (int cpu, int regno, const char *regname,
> -                       int regsize, void *value)
> +/* Get task information by its index number in TT */
> +void crash_get_task_info(int task_nr, unsigned long *pid, char **comm)
>  {
> -        if (!machdep->get_cpu_reg)
> -                return FALSE;
> -        return machdep->get_cpu_reg(cpu, regno, regname, regsize, value);
> +       int i;
> +       struct task_context *tc;
> +
> +       tc = FIRST_CONTEXT();
> +       for (i = 0; i < RUNNING_TASKS(); i++, tc++)
> +               if (i == task_nr) {
> +                       *pid = tc->pid;
> +                       *comm = tc->comm;
> +                       return;
> +               }
> +       *pid = 0;
> +       *comm = NULL;
> +       return;
> +}
> +
> +int crash_get_task_reg (int task_nr, int regno, const char *regname,
> +                       int regsize, void *value)
> +{
> +       int i;
> +       struct task_context *tc;
> +
> +       if (!machdep->get_task_reg)
> +               return FALSE;
> +
> +       tc = FIRST_CONTEXT();
> +       for (i = 0; i < RUNNING_TASKS(); i++, tc++)
> +               if (i == task_nr) {
> +                       return machdep->get_task_reg(tc, regno, regname, regsize, value);
> +               }
> +       return FALSE;
>  }
>
> diff --git a/help.c b/help.c
> index a9c4d30..85dbda5 100644
> --- a/help.c
> +++ b/help.c
> @@ -8520,6 +8520,7 @@ char *version_info[] = {
>  "Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.",
>  "Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.",
>  "Copyright (C) 2015, 2021  VMware, Inc.",
> +"Copyright (C) 2024  Broadcom, Inc.",
>  "This program is free software, covered by the GNU General Public License,",
>  "and you are welcome to change it and/or distribute copies of it under",
>  "certain conditions.  Enter \"help copying\" to see the conditions.",
> diff --git a/main.c b/main.c
> index 0b6b927..13acd2d 100644
> --- a/main.c
> +++ b/main.c
> @@ -794,6 +794,7 @@ main_loop(void)
>         } else
>                 SIGACTION(SIGINT, restart, &pc->sigaction, NULL);
>
> +       crash_target_init();
>          /*
>           *  Display system statistics and current context.
>           */
> diff --git a/task.c b/task.c
> index ebdb5be..5d26c52 100644
> --- a/task.c
> +++ b/task.c
> @@ -298,6 +298,7 @@ task_init(void)
>                 tt->flags |= THREAD_INFO;
>         }
>
> +       STRUCT_SIZE_INIT(inactive_task_frame, "inactive_task_frame");
>          MEMBER_OFFSET_INIT(task_struct_state, "task_struct", "state");
>         MEMBER_SIZE_INIT(task_struct_state, "task_struct", "state");
>         if (INVALID_MEMBER(task_struct_state)) {
> diff --git a/x86_64.c b/x86_64.c
> index 502817d..b6e36a5 100644
> --- a/x86_64.c
> +++ b/x86_64.c
> @@ -126,7 +126,7 @@ static int x86_64_get_framesize(struct bt_info *, ulong, ulong, char *);
>  static void x86_64_framesize_debug(struct bt_info *);
>  static void x86_64_get_active_set(void);
>  static int x86_64_get_kvaddr_ranges(struct vaddr_range *);
> -static int x86_64_get_cpu_reg(int, int, const char *, int, void *);
> +static int x86_64_get_task_reg(struct task_context *, int, const char *, int, void *);
>  static int x86_64_verify_paddr(uint64_t);
>  static void GART_init(void);
>  static void x86_64_exception_stacks_init(void);
> @@ -195,7 +195,7 @@ x86_64_init(int when)
>                 machdep->machspec->irq_eframe_link = UNINITIALIZED;
>                 machdep->machspec->irq_stack_gap = UNINITIALIZED;
>                 machdep->get_kvaddr_ranges = x86_64_get_kvaddr_ranges;
> -               machdep->get_cpu_reg = x86_64_get_cpu_reg;
> +               machdep->get_task_reg = x86_64_get_task_reg;
>                  if (machdep->cmdline_args[0])
>                          parse_cmdline_args();
>                 if ((string = pc->read_vmcoreinfo("relocate"))) {
> @@ -891,7 +891,7 @@ x86_64_dump_machdep_table(ulong arg)
>          fprintf(fp, "        is_page_ptr: x86_64_is_page_ptr()\n");
>          fprintf(fp, "       verify_paddr: x86_64_verify_paddr()\n");
>          fprintf(fp, "  get_kvaddr_ranges: x86_64_get_kvaddr_ranges()\n");
> -       fprintf(fp, "        get_cpu_reg: x86_64_get_cpu_reg()\n");
> +       fprintf(fp, "       get_task_reg: x86_64_get_task_reg()\n");
>          fprintf(fp, "    init_kernel_pgd: x86_64_init_kernel_pgd()\n");
>          fprintf(fp, "clear_machdep_cache: x86_64_clear_machdep_cache()\n");
>         fprintf(fp, " xendump_p2m_create: %s\n", PVOPS_XEN() ?
> @@ -6398,6 +6398,9 @@ x86_64_ORC_init(void)
>         };
>         struct ORC_data *orc;
>
> +       MEMBER_OFFSET_INIT(inactive_task_frame_bp, "inactive_task_frame", "bp");
> +       MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr, "inactive_task_frame", "ret_addr");
> +
>         if (machdep->flags & FRAMEPOINTER)
>                 return;
>
> @@ -6455,9 +6458,6 @@ x86_64_ORC_init(void)
>         orc->__stop_orc_unwind = symbol_value("__stop_orc_unwind");
>         orc->orc_lookup = symbol_value("orc_lookup");
>
> -       MEMBER_OFFSET_INIT(inactive_task_frame_bp, "inactive_task_frame", "bp");
> -       MEMBER_OFFSET_INIT(inactive_task_frame_ret_addr, "inactive_task_frame", "ret_addr");
> -
>         orc->has_signal = MEMBER_EXISTS("orc_entry", "signal"); /* added at 6.3 */
>         orc->has_end = MEMBER_EXISTS("orc_entry", "end");       /* removed at 6.4 */
>
> @@ -9070,14 +9070,64 @@ x86_64_get_kvaddr_ranges(struct vaddr_range *vrp)
>  }
>
>  static int
> -x86_64_get_cpu_reg(int cpu, int regno, const char *name,
> +x86_64_get_task_reg(struct task_context *tc, int regno, const char *name,
>                     int size, void *value)
>  {
>          if (regno >= LAST_REGNUM)
>                  return FALSE;
>
> +       /*
> +        * For inactive task, grab rip, rbp, rbx, r12, r13, r14 and r15 from
> +        * inactive_task_frame (see __switch_to_asm). Other regs saved on
> +        * regular frame.
> +        */
> +       if (!is_task_active(tc->task)) {
> +               int frame_size = STRUCT_SIZE("inactive_task_frame");
> +
> +               /* Only modern kernels supported. */
> +               if (tt->flags & THREAD_INFO && frame_size == 7 * 8) {
> +                       ulong rsp;
> +                       int offset = 0;
> +                       switch (regno) {
> +                               case RSP_REGNUM:
> +                                       readmem(tc->task + OFFSET(task_struct_thread) +
> +                                               OFFSET(thread_struct_rsp), KVADDR,
> +                                               &rsp, sizeof(void *),
> +                                               "thread_struct rsp", FAULT_ON_ERROR);
> +                                       rsp += frame_size;
> +                                       memcpy(value, &rsp, size);
> +                                       return TRUE;
> +                               case RIP_REGNUM:
> +                                       offset += 8;
> +                               case RBP_REGNUM:
> +                                       offset += 8;
> +                               case RBX_REGNUM:
> +                                       offset += 8;
> +                               case R12_REGNUM:
> +                                       offset += 8;
> +                               case R13_REGNUM:
> +                                       offset += 8;
> +                               case R14_REGNUM:
> +                                       offset += 8;
> +                               case R15_REGNUM:
> +                                       readmem(tc->task + OFFSET(task_struct_thread) +
> +                                               OFFSET(thread_struct_rsp), KVADDR,
> +                                               &rsp, sizeof(void *),
> +                                               "thread_struct rsp", FAULT_ON_ERROR);
> +                                       readmem(rsp + offset, KVADDR, value, sizeof(void *),
> +                                                       "inactive_thread_frame saved regs", FAULT_ON_ERROR);
> +                                       return TRUE;
> +                       }
> +               }
> +               /* TBD: older kernels support. */
> +               return FALSE;
> +       }
> +
> +       /*
> +        * Task is active, grab CPU's registers
> +        */
>          if (VMSS_DUMPFILE())
> -                return vmware_vmss_get_cpu_reg(cpu, regno, name, size, value);
> +                return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
>
>          return FALSE;
>  }
> --
> 2.39.0
>
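
For reference, the fall-through "offset += 8" chain in
x86_64_get_task_reg() above relies on the x86_64 layout of
inactive_task_frame. The sketch below spells that mapping out with
offsetof() instead of fall-through arithmetic. It assumes the 7 * 8
byte layout that the patch checks for; the struct and enum names
(inactive_task_frame_x86_64, saved_reg, saved_reg_offset) are
hypothetical and are not crash or kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86_64 layout: callee-saved registers, then the return address. */
struct inactive_task_frame_x86_64 {
	uint64_t r15;
	uint64_t r14;
	uint64_t r13;
	uint64_t r12;
	uint64_t bx;
	uint64_t bp;
	uint64_t ret_addr;
};

/* Same size the patch tests for before trusting the frame. */
static_assert(sizeof(struct inactive_task_frame_x86_64) == 7 * 8,
	      "unexpected inactive_task_frame size");

/* Hypothetical register selector, for illustration only. */
enum saved_reg {
	SAVED_R15, SAVED_R14, SAVED_R13, SAVED_R12,
	SAVED_RBX, SAVED_RBP, SAVED_RIP, SAVED_OTHER
};

/*
 * Byte offset of a register from the saved stack pointer of an
 * inactive task, or -1 if the register is not part of the frame.
 */
long saved_reg_offset(enum saved_reg reg)
{
	switch (reg) {
	case SAVED_R15:
		return (long)offsetof(struct inactive_task_frame_x86_64, r15);
	case SAVED_R14:
		return (long)offsetof(struct inactive_task_frame_x86_64, r14);
	case SAVED_R13:
		return (long)offsetof(struct inactive_task_frame_x86_64, r13);
	case SAVED_R12:
		return (long)offsetof(struct inactive_task_frame_x86_64, r12);
	case SAVED_RBX:
		return (long)offsetof(struct inactive_task_frame_x86_64, bx);
	case SAVED_RBP:
		return (long)offsetof(struct inactive_task_frame_x86_64, bp);
	case SAVED_RIP:
		return (long)offsetof(struct inactive_task_frame_x86_64, ret_addr);
	default:
		return -1;
	}
}
```

With this layout, rip is read from saved rsp + 48 and r15 from saved
rsp + 0, which matches what the switch statement in the patch computes.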