Hi Alexey,
On Fri, Mar 15, 2024 at 10:13 AM Tao Liu <ltao(a)redhat.com> wrote:
Hi Alexey,
On Fri, Mar 15, 2024 at 6:55 AM Alexey Makhalov <alexey.makhalov(a)broadcom.com> wrote:
>
> Please find a patch in the "crash_target: Support for GDB debugging of all
> tasks" mail thread. Use the latest version 3 of the patch.
>
> Thanks,
> --Alexey
>
> On Wed, Mar 13, 2024 at 10:41 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>>
>> Hi Alexey,
>>
>> On Wed, Mar 13, 2024 at 12:42:22PM -0700, Alexey Makhalov wrote:
>> > Hi folks,
>> >
>> > I'm working on an improvement in crash_target for gdb to backtrace and
>> > debug (including local frame variables) all tasks, not just active
>> > tasks/CPUs
>> > I will send the patch shortly.
>>
>> Thanks for the interest in this. Just in case, with both patch series
>> combined, it should be possible to backtrace and run info locals, etc,
>> on arbitrary tasks also. Interested in your patch though :)
>>
Yeah, with Aditya's "Improve stack unwind on ppc64" and my "x86_64 gdb
stack unwinding support" patch series, arbitrary tasks' stack
unwinding and local variables can be viewed in gdb. So we may have
duplicated some work, and I'd like to use your patch as a reference
and work together on this feature. In addition, I didn't find the
"crash_target: Support for GDB debugging of all tasks" patch; could
you please share a link to it?
Thanks,
Tao Liu
>> Thanks,
>> Aditya Gupta
>>
>> >
>> > Thanks,
>> > --Alexey
>> >
>> > On Wed, Mar 13, 2024 at 2:34 AM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>> >
>> > > Hi Tao,
>> > > On Wed, Mar 13, 2024 at 11:23:56AM +0800, Tao Liu wrote:
>> > > > Hi Aditya,
>> > > >
>> > > > On Tue, Mar 12, 2024 at 5:12 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>> > > > >
>> > > > > Hi Tao,
>> > > > >
>> > > > > > <...>
>> > > > > >
>> > > > > > +crash_target *target = NULL;
>> > > > > > +
>> > > > > > void
>> > > > > > crash_target_init (void)
>> > > > > > {
>> > > > > > int nr_cpus = crash_get_nr_cpus();
>> > > > > > - crash_target *target = new crash_target ();
>> > > > > > + target = new crash_target ();
>> > > > > >
>> > > > > >   /* Own the target until it is successfully pushed.  */
>> > > > > > target_ops_up target_holder (target);
>> > > > > > @@ -137,29 +139,35 @@ crash_target_init (void)
>> > > > > > reinit_frame_cache ();
>> > > > > > }
>> > > > > >
>> > > > > > -/*
>> > > > > > - * Change gdb's thread context to the thread on given CPU
>> > > > > > - **/
>> > > > > > extern "C" int
>> > > > > > -gdb_change_cpu_context(unsigned int cpu)
>> > > > > > +gdb_change_thread_context (ulong task)
>> > > > > > {
>> > > > > > - ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu);
>> > > > > > - inferior *inf = current_inferior ();
>> > > > > > - thread_info *tp = find_thread_ptid (inf, ptid);
>> > > > > > -
>> > > > > > - if (tp == nullptr)
>> > > > > > + int tried = 0;
>> > > > > > + inferior* inf = current_inferior();
>> > > > > > + int cpu = crash_set_thread(task);
>> > > > > > + if (cpu < 0)
>> > > > > > return FALSE;
>> > > > > >
>> > > > > > - /* Making sure that crash's context is same */
>> > > > > > - set_cpu(cpu, FALSE);
>> > > > > > -
>> > > > > > - /* Switch to the thread */
>> > > > > > - switch_to_thread(tp);
>> > > > > > -
>> > > > > > - /* Fetch/Refresh thread's registers */
>> > > > > > - gdb_refresh_regcache(cpu);
>> > > > > > + ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu);
>> > > > > >
>> > > > > > - return TRUE;
>> > > > > > +retry:
>> > > > > > + thread_info *tp = find_thread_ptid (inf, ptid);
>> > > > > > + if (tp == nullptr && !tried) {
>> > > > > > +      thread_info *thread = add_thread_silent(target,
>> > > > > > +                               ptid_t(CRASH_INFERIOR_PID, 0, cpu));
>> > > > >
>> > > > > A minor comment.
>> > > > > Is it possible to do it without needing 'target' as global? I see
>> > > > > there's 'inf->current_top_target()', but its return type is
>> > > > > target_ops, so I don't know if it can be used.
>> > > > >
>> > > > Thanks for your comments; however, I'm afraid that won't work for
>> > > > this code change. What we want is to add a thread to the target. I
>> > > > found one function, "target.c:new_thread", which could be used as
>> > > > "new_thread(current_inferior(), ptid_t(CRASH_INFERIOR_PID, 0, cpu));"
>> > > > to replace the "add_thread_silent()" code, except we would need to
>> > > > make "new_thread" non-static.
>> > > >
>> > > > I prefer to keep the code as it is, because I don't see any side
>> > > > effect of making target a global variable. What do you think?
>> > >
>> > > Then there doesn't seem to be an alternative; I guess we will have to
>> > > keep the global variable. Seems okay to me.
>> > >
>> > > >
>> > > > > > <...>
>> > > > > >
>> > > > > > /*
>> > > > > >  * Collect the irq_desc[] entry along with its associated handler and
>> > > > > > diff --git a/task.c b/task.c
>> > > > > > index a405b05..ef79f53 100644
>> > > > > > --- a/task.c
>> > > > > > +++ b/task.c
>> > > > > > @@ -715,7 +715,8 @@ task_init(void)
>> > > > > >   * crash_target::fetch_registers, so CPU 0's registers are shown as
>> > > > > >   * <unavailable> in gdb mode
>> > > > > >   * */
>> > > > > > -     gdb_refresh_regcache(0);
>> > > > > > +     for (int i = 0; i < get_cpus_online(); i++)
>> > > > > > +             gdb_refresh_regcache(i);
>> > > > >
>> > > > > Does x86 require all regcache(s) to be refreshed at init time?
>> > > > >
>> > > > > Till v4 of my ppc series, I was also refreshing all CPUs, but it's
>> > > > > better to do it when required, and there was a review comment on
>> > > > > v4, patch 5 from Lianbo, quoting it here:
>> > > > >
>> > > > > > In addition, I haven't tested how long it takes in a multi-core
>> > > > > > environment; is it possible to implement a smaller-granularity
>> > > > > > refresh operation? For example: when we need to switch to
>> > > > > > another cpu, execute the refresh operation for the corresponding
>> > > > > > regcache.
>> > > > >
>> > > > > CPU 0's refresh is required because gdb has already tried to
>> > > > > initialise registers before we initialised crash_target.
>> > > > >
>> > > > > This might add a small but noticeable delay on vmcores from big
>> > > > > systems.
>> > > > >
>> > > > > It adds around 4.6 seconds to crash init time for a ppc64 vmcore
>> > > > > with 64 CPUs, read on a fast 48-CPU Power8 system. The overhead is
>> > > > > smaller for systems with fewer CPUs, e.g. ~0.43 seconds on a
>> > > > > vmcore of an 8-CPU x86_64 system.
>> > > >
>> > > > Yes, thanks for the testing results. There is indeed a performance
>> > > > cost to refreshing all cpu caches. But this will give correct
>> > > > results when the user types "info threads" at the "crash>" prompt:
>> > > >
>> > > > With all cpu cache refreshed:
>> > > > crash> info threads
>> > > >   Id   Target Id   Frame
>> > > >   1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   2    CPU 1       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   3    CPU 2       <unavailable> in ?? ()
>> > > >   4    CPU 3       <unavailable> in ?? ()
>> > > >   5    CPU 4       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   6    CPU 5       <unavailable> in ?? ()
>> > > >   7    CPU 6       <unavailable> in ?? ()
>> > > > * 8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   9    CPU 8       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   10   CPU 9       <unavailable> in ?? ()
>> > > >   11   CPU 10      <unavailable> in ?? ()
>> > > >   12   CPU 11      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   13   CPU 12      <unavailable> in ?? ()
>> > > >   14   CPU 13      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   15   CPU 14      <unavailable> in ?? ()
>> > > >   16   CPU 15      <unavailable> in ?? ()
>> > > >   17   CPU 16      <unavailable> in ?? ()
>> > > >   18   CPU 17      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   19   CPU 18      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   20   CPU 19      <unavailable> in ?? ()
>> > > >   21   CPU 20      <unavailable> in ?? ()
>> > > >   22   CPU 21      <unavailable> in ?? ()
>> > > >   23   CPU 22      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   24   CPU 23      <unavailable> in ?? ()
>> > > > crash> q
>> > > >
>> > > > With only cpu0 refreshed:
>> > > > crash> info threads
>> > > >   Id   Target Id   Frame
>> > > >   1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
>> > > >   2    CPU 1       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   3    CPU 2       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   4    CPU 3       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   5    CPU 4       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   6    CPU 5       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   7    CPU 6       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > > * 8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   9    CPU 8       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   10   CPU 9       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   11   CPU 10      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   12   CPU 11      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   13   CPU 12      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   14   CPU 13      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   15   CPU 14      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   16   CPU 15      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   17   CPU 16      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   18   CPU 17      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   19   CPU 18      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   20   CPU 19      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   21   CPU 20      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   22   CPU 21      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   23   CPU 22      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >   24   CPU 23      blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
>> > > >
>> > > > As we can see, the other cpus [1-6, 8-23] just reuse cpu[7]'s reg
>> > > > cache, which is incorrect. And if users go further, e.g. "thread 20"
>> > > > and "gdb bt", they will also get incorrect stack traces.
>> > > >
>> > > > The cpu cache only gets refreshed once the user types "set <pid>";
>> > > > then the cpu cache is refreshed from the <pid> task's context.
>> > > >
>> > > > I doubt a user will understand all the details and constraints, so
>> > > > I'm afraid the user will be confused by the faulty output. But I
>> > > > have no objection if performance is the priority; basically it is a
>> > > > trade-off. In addition, since "info" and "thread" are commands
>> > > > provided by gdb, I currently don't know how to hook into them so
>> > > > that the cpu cache could be refreshed when "info threads" or
>> > > > "thread <num>" is invoked.
>> > > >
>> > > > Do you have any thoughts?
>> > >
>> > > I also faced that issue initially, i.e. the other CPUs reusing the
>> > > same regcache if all are not refreshed.
>> > > While iterating through all threads, gdb switches its context
>> > > temporarily, while crash's context remains the same, causing gdb to
>> > > get the same registers for all threads other than 0.
>> > >
>> > > This was solved in patch #3 ("synchronise cpu context changes between
>> > > crash/gdb") of the ppc 'Improve stack unwind on ppc64' series, by
>> > > syncing gdb's context with crash's.
>> > >
>> > > Can this change in thread.c in gdb-10.2.patch in patch #2 be
>> > > reverted? That will fix it.
>> > >
>> > > ```
>> > > ---- gdb-10.2/gdb/thread.c.orig
>> > > -+++ gdb-10.2/gdb/thread.c
>> > > -@@ -60,7 +60,7 @@ static thread_info *current_thread_;
>> > > -
>> > > - #ifdef CRASH_MERGE
>> > > - /* Function to set cpu, defined by crash-utility */
>> > > --extern "C" void set_cpu (int);
>> > > -+extern "C" void set_cpu(int cpu, int print_context);
>> > > - #endif
>> > > -
>> > > - /* RAII type used to increase / decrease the refcount of each
thread
>> > > -@@ -1098,6 +1098,9 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads,
>> > > - uiout->table_body ();
>> > > - }
>> > > -
>> > > -+#ifdef CRASH_MERGE
>> > > -+  int current_cpu = current_thread ? current_thread->ptid.tid() : 0;
>> > > -+#endif
>> > > - for (inferior *inf : all_inferiors ())
>> > > - for (thread_info *tp : inf->threads ())
>> > > - {
>> > > -@@ -1113,6 +1116,11 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads,
>> > > -
>> > > - ui_out_emit_tuple tuple_emitter (uiout, NULL);
>> > > -
>> > > -+ /* Switch to the thread's CPU in crash */
>> > > -+#ifdef CRASH_MERGE
>> > > -+ set_cpu(tp->ptid.tid(), FALSE);
>> > > -+#endif
>> > > -+
>> > > ```
>> > >
>> > > 'info threads' keeps changing the gdb context temporarily while it
>> > > prints each thread's frame and other information.
>> > > Correspondingly, this change ensured that crash's context also syncs
>> > > with changes in gdb's context.
>> > >
>> > > Then, when the user does 'info threads', it will refresh all
>> > > regcaches, since it has to go through each thread.
>> > >
>> > > In that case, the issue of the same frame being printed for all
>> > > threads should not happen.
>> > >
>> > > That should remove the need to refresh all regcaches at init time,
>> > > and leave it to later when the user needs it. Refreshing all
>> > > regcaches at init time is the surefire way to ensure they are all
>> > > correct, but I guess everyone will do an 'info threads' anyway, and
>> > > we can delay the overhead until then. What do you think?
>> > >
>> > > Thanks,
>> > > Aditya Gupta
>> > >
>> > > >
>> > > > Thanks,
>> > > > Tao Liu
>> > > >
>> > > >
>> > > >
>> > > > >
>> > > > > Thanks,
>> > > > > Aditya Gupta
>> > > > >
>> > > > > >
>> > > > > > tt->flags |= TASK_INIT_DONE;
>> > > > > > }
>> > > > > > @@ -5315,7 +5316,7 @@ set_context(ulong task, ulong pid, uint update_gdb_thread)
>> > > > > >
>> > > > > >      /* change the selected thread in gdb, according to current context */
>> > > > > >      if (update_gdb_thread)
>> > > > > > -            return gdb_change_cpu_context(tc->processor);
>> > > > > > +            return gdb_change_thread_context(tc->task);
>> > > > > > else
>> > > > > > return TRUE;
>> > > > > > } else {
>> > > > > > --
>> > > > > > 2.40.1
>> > > > > >
>> > > > >
>> > > >
>> > > --
>> > > Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io
>> > > To unsubscribe send an email to devel-leave(a)lists.crash-utility.osci.io
>> > > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
>> > > Contribution Guidelines: https://github.com/crash-utility/crash/wiki
>> > >