[Crash-utility] Re: [PATCH] Fix the "ps -m" command shows wrong duration of RU task

Monday, 12 May 2025

Hi Kazu & Kenneth,

Sorry for the late reply, and thanks for your fix and comments!

On Thu, May 8, 2025 at 12:20 PM HAGIO KAZUHITO(萩尾　一仁)
<k-hagio-ab(a)nec.com&gt; wrote:
...

 On 2025/05/07 16:16, HAGIO KAZUHITO(萩尾 一仁) wrote:
 > Hi,
 >
 > On 2025/04/28 19:38, Kenneth Yin wrote:
 >> The RU/TASK_RUNNING stat means the task is runnable.
 >> It is either currently running or on a run queue waiting to run.
 >>
 >> Currently, the crash tool uses the "rq_clock -
sched_info->last_arrival" formula to
 >> calculate the duration of task in RU state. This is for the scenario of a task
running on a CPU.
 >
 > The "ps -l" and "ps -m" options display what their help text
describes,
 > not the duration of task in RU state.  Please see "help ps".
 >
 > Also, tasks are sorted by the value, using different values for it could
 > make another confusion.
 >
 > The options have been used for a long time with the current code, if we
 > change the semantics of the options, it would be better to be careful.
 > The change might lose a kind of information instead of getting another
 > kind of information.
 >
 > On the other hand, I think that the duration of waiting in queue might
 > also be useful information.  I'm not sure how we should display them,
 > but for example, how about adding a new option or adding a column for
 > last_queued?

 I thought of that the "runq" command might be suitable to display the
 waiting duration, because only tasks in the run queues have it.  For
 example, extending the "runq -m" option or adding a new option.  just my
 thought.

 Thanks,
 Kazu

 >
 > What do you think, folks?
 >
 > Thanks,
 > Kazu
 >
 >>
 >> But for the scenario of a task waiting in the  CPU run queue (due to some
reason
 >> for example cfs/rt queue throttled), this formula could cause misunderstanding.
 >>
 >> For example:
 >> [ 220 10:36:38.026] [RU]  PID: 12345   TASK: ffff8d674ab6b180  CPU: 1       
COMMAND: "task"
 >>
 >> Looking closer:
 >>
 >> crash> rq.clock ffff8de438a5acc0
 >>     clock = 87029229985307234,
 >>
 >> crash> task -R sched_info,se.exec_start
 >> PID: 12345   TASK: ffff8d674ab6b180  CPU: 1  COMMAND: "task"
 >>     sched_info = {
 >>      pcount = 33,
 >>      run_delay = 0,
 >>      last_arrival = 67983031958439673,
 >>      last_queued = 87029224561119369
 >>     },
 >>     se.exec_start = 67983031958476937,
 >>
 >> 67983031         67983031                 87029224              87029229
 >> |<-   running on CPU  ->| <-      IN    ->|<-    waiting in queue
   ->|
 >>
 >> For this scenario, the "task" was waiting in the run queue of the CPU
only for 5 seconds,
 >> we should use the "rq_clock - sched_info->last_queued" formula.

Please check if my understanding is correct:

The result you saw is "rq_clock - sched_info->last_arrival == 87029229
- 67983031 == 19046198"
The expected result you want is: "rq_clock - sched_info->last_queued
== 87029229 - 87029224 == 5"

You think the 19046198 value is misleading and should be 5 which only
contains the waiting in queue duration, am I correct?

I agree with Kazu's idea, that we shouldn't change the existing ps
cmd's behaviour, and runq is a better alternative for the
waiting-in-queue duration display.

What do you think? Could you please improve your code as well as an
updated "help runq" doc for runq?

Thanks,
Tao Liu

...
 >>
 >> We can trust sched_info->last_queued as it is only set when the task enters
the CPU run queue.
 >> Furthermore, when the task hits/runs on a CPU or dequeues the CPU run queue, it
will be reset to 0.
 >>
 >> Therefore, my idea is simple:
 >>
 >> If a task in RU stat and sched_info->last_queued has value (!= 0),
 >> it means this task is waiting in the run queue, use "rq_clock -
sched_info->last_queued".
 >>
 >> Otherwise, if a task in RU stat and sched_info->last_queued = 0
 >> and sched_info->last_arrival has value (it must be), it means this task is
running on the CPU,
 >> use "rq_clock - sched_info->last_arrival".
 >>
 >> Signed-off-by: Kenneth Yin <kyin(a)redhat.com&gt;
 >> ---
 >>    defs.h    |  1 +
 >>    symbols.c |  2 ++
 >>    task.c    | 21 +++++++++++++++------
 >>    3 files changed, 18 insertions(+), 6 deletions(-)
 >>
 >> diff --git a/defs.h b/defs.h
 >> index 4cf169c..66f5ce4 100644
 >> --- a/defs.h
 >> +++ b/defs.h
 >> @@ -1787,6 +1787,7 @@ struct offset_table {                    /* stash of
commonly-used offsets */
 >>      long vcpu_struct_rq;
 >>      long task_struct_sched_info;
 >>      long sched_info_last_arrival;
 >> +    long sched_info_last_queued;
 >>      long page_objects;
 >>      long kmem_cache_oo;
 >>      long char_device_struct_cdev;
 >> diff --git a/symbols.c b/symbols.c
 >> index e30fafe..fb5035f 100644
 >> --- a/symbols.c
 >> +++ b/symbols.c
 >> @@ -9930,6 +9930,8 @@ dump_offset_table(char *spec, ulong makestruct)
 >>                    OFFSET(sched_rt_entity_run_list));
 >>      fprintf(fp, "       sched_info_last_arrival: %ld\n",
 >>                    OFFSET(sched_info_last_arrival));
 >> +    fprintf(fp, "       sched_info_last_queued: %ld\n",
 >> +            OFFSET(sched_info_last_queued));
 >>            fprintf(fp, "       task_struct_thread_info: %ld\n",
 >>                    OFFSET(task_struct_thread_info));
 >>            fprintf(fp, "             task_struct_stack: %ld\n",
 >> diff --git a/task.c b/task.c
 >> index 3bafe79..f5386ac 100644
 >> --- a/task.c
 >> +++ b/task.c
 >> @@ -332,9 +332,12 @@ task_init(void)
 >>            MEMBER_OFFSET_INIT(task_struct_last_run, "task_struct",
"last_run");
 >>            MEMBER_OFFSET_INIT(task_struct_timestamp, "task_struct",
"timestamp");
 >>            MEMBER_OFFSET_INIT(task_struct_sched_info, "task_struct",
"sched_info");
 >> -    if (VALID_MEMBER(task_struct_sched_info))
 >> +    if (VALID_MEMBER(task_struct_sched_info)) {
 >>              MEMBER_OFFSET_INIT(sched_info_last_arrival,
 >>                      "sched_info", "last_arrival");
 >> +            MEMBER_OFFSET_INIT(sched_info_last_queued,
 >> +                    "sched_info", "last_queued");
 >> +    }
 >>      if (VALID_MEMBER(task_struct_last_run) ||
 >>          VALID_MEMBER(task_struct_timestamp) ||
 >>          VALID_MEMBER(sched_info_last_arrival)) {
 >> @@ -6035,7 +6038,7 @@ ulonglong
 >>    task_last_run(ulong task)
 >>    {
 >>            ulong last_run;
 >> -    ulonglong timestamp;
 >> +    ulonglong timestamp,last_queued;
 >>
 >>      timestamp = 0;
 >>            fill_task_struct(task);
 >> @@ -6047,10 +6050,16 @@ task_last_run(ulong task)
 >>      } else if (VALID_MEMBER(task_struct_timestamp))
 >>              timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct
+
 >>                      OFFSET(task_struct_timestamp)) : 0;
 >> -    else if (VALID_MEMBER(sched_info_last_arrival))
 >> -            timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct
+
 >> -                    OFFSET(task_struct_sched_info) +
 >> -                    OFFSET(sched_info_last_arrival)) : 0;
 >> +    else if (VALID_MEMBER(sched_info_last_queued))
 >> +            last_queued = ULONGLONG(tt->task_struct +
 >> +                    OFFSET(task_struct_sched_info) +
 >> +                    OFFSET(sched_info_last_queued));
 >> +            if (last_queued != 0) {
 >> +                    timestamp = tt->last_task_read ? last_queued : 0;
 >> +            } else if (VALID_MEMBER(sched_info_last_arrival))
 >> +                            timestamp = tt->last_task_read ? 
ULONGLONG(tt->task_struct +
 >> +                            OFFSET(task_struct_sched_info) +
 >> +                            OFFSET(sched_info_last_arrival)) : 0;
 >>
 >>            return timestamp;
 >>    }
 > --
 > Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io
 > To unsubscribe send an email to devel-leave(a)lists.crash-utility.osci.io
 > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
 > Contribution Guidelines: https://github.com/crash-utility/crash/wiki
 --
 Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io
 To unsubscribe send an email to devel-leave(a)lists.crash-utility.osci.io
 https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
 Contribution Guidelines: https://github.com/crash-utility/crash/wiki 

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

[Crash-utility] Re: [PATCH] Fix the "ps -m" command shows wrong duration of RU task