On Tue, Mar 15, 2022 at 7:17 PM Austin Kim <austindh.kim@gmail.com> wrote:
Hello, Lijiang

2022년 3월 15일 (화) 오후 2:29, lijiang <lijiang@redhat.com>님이 작성:
>
> On Tue, Mar 15, 2022 at 1:08 PM lijiang <lijiang@redhat.com> wrote:
>>
>> On Thu, Mar 3, 2022 at 10:43 AM <crash-utility-request@redhat.com> wrote:
>>>
>>> Date: Wed, 2 Mar 2022 21:37:45 +0000
>>> From: Austin Kim <austindh.kim@gmail.com>
>>> To: k-hagio-ab@nec.com, lijiang@redhat.com, crash-utility@redhat.com
>>> Cc: kernel-team@lge.com, mikeseohyungjin@gmail.com
>>> Subject: [Crash-utility] [PATCH v3] ps: Add support to "ps -l|-m" to
>>>         properly display process
>>> Message-ID: <20220302213745.GA868@raspberrypi>
>>> Content-Type: text/plain; charset=us-ascii
>>>
>>> Sometimes kernel image is generated without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
>>> Where relevant commit id is f6db83479932 ("sched/stat: Simplify the sched_info accounting")
>>>
>>>   - CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
>>>   - CONFIG_SCHEDSTATS: KERNEL_VERSION < LINUX(4,2,0)
>>>
>>> Running crash-utility with above kernel image,
>>> "ps -l" option cannot display all processes sorted with most recently-run process.
>>> Also "ps -m" option cannot display all processes with timestamp.
>>>
>>> crash> ps -l or crash> ps -m
>>> ps: last-run timestamps do not exist in this kernel
>>> Usage: ps [-k|-u|-G] [-s]
>>>   [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
>>>      [pid | task | command] ...
>>> Enter "help ps" for details.
>>>
>>> This is because output of "ps -l|-m" depends on task_struct.sched_info.last_arrival.
>>>
>>> Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, 'sched_info' field is not included
>>> in task_struct.  In this case we make "ps -l|-m" option to access 'exec_start'
>>> field of sched_entity where 'exec_start' is task_struct.se.exec_start.
>>>
>>> The 'task_struct.se.exec_start' contains the most recently-executed timestamp
>>> when process is running in the below cases;
>>>
>>>  - enqueued to runqueue
>>>  - dequeued from runqueue
>>>  - scheduler tick is invoked
>>>  - etc
>>>
>>> 'task_struct.se.exec_start' could be one of statistics which indicates the most
>>> recently-run timestamp of process activity.
>>
>>
>> Thank you for the update,  Austin.
>>
>> If the task is migraged,
>
>                          ^^^^^^^^
>
> Sorry, it's a typo, replace it with the "migrated". Thanks.
>
>>
>> the value of the 'task_struct.se.exec_start ' will be set to zero and may be updated in the next scheduling with the new time. In this case, does the patch still work well as expected?

Thanks for feedback. Theoretically you are right.

https://elixir.bootlin.com/linux/v5.15.28/source/kernel/sched/fair.c
static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
{
..
/* We have migrated, no longer consider this task hot */
p->se.exec_start = 0;

update_scan_period(p, new_cpu);
}

If we set break point at 'update_scan_period(p, new_cpu)' and then
system stops at this point using debugger like GDB,
we can have misleading information as you said.

Without this condition, it is unlikely to see p->se.exec_start is 0
since 'p->se.exec_start' is quite updated very often(ex: scheduler
tick, etc) on real target device.
 
Thank you for the explanation, Austin.

In multi-core systems, the task migration may occur frequently due to the load balancing, which
is more common than breakpoints. Unless the task is bound to the cpu or running on a single cpu system.

The sched entity is associated with a specific cpu(cpu rq), that is to say, the "-C" option is used implicitly in this patch.  I would suggest pointing out this behavior in the help or outputting a warning explicitly for users. That is my concern.

Thanks.
Lianbo
 
As below link indicated, Kazu already mentioned that there are no
useful way to display all processes
with recently-run order using se.exec_start from crash-utility point
of view. Also I agreed the feedback.

https://listman.redhat.com/archives/crash-utility/2022-March/009615.html

Anyway thanks for feedback over patch in details. Hope to discuss
another patch I will propose.

BR,
Austin Kim

>>
>> Thanks.
>> Lianbo
>>
>>> With this patch, "ps -l|-m" option works well without CONFIG_SCHEDSTATS or
>>> CONFIG_SCHED_INFO.
>>>
>>> Signed-off-by: Austin Kim <austindh.kim@gmail.com>
>>> ---
>>>  defs.h    |  1 +
>>>  help.c    |  5 +++--
>>>  symbols.c |  2 ++
>>>  task.c    | 20 ++++++++++++++++----
>>>  4 files changed, 22 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/defs.h b/defs.h
>>> index bf2c59b..841bd0b 100644
>>> --- a/defs.h
>>> +++ b/defs.h
>>> @@ -2168,6 +2168,7 @@ struct offset_table {                    /* stash of commonly-used offsets */
>>>         long sbitmap_queue_min_shallow_depth;
>>>         long sbq_wait_state_wait_cnt;
>>>         long sbq_wait_state_wait;
>>> +       long sched_entity_exec_start;
>>>  };
>>>
>>>  struct size_table {         /* stash of commonly-used sizes */
>>> diff --git a/help.c b/help.c
>>> index 8347668..6ca7c92 100644
>>> --- a/help.c
>>> +++ b/help.c
>>> @@ -1442,7 +1442,8 @@ char *help_ps[] = {
>>>  "           and system times.",
>>>  "       -l  display the task's last-run timestamp value, using either the",
>>>  "           task_struct's last_run value, the task_struct's timestamp value",
>>> -"           or the task_struct's sched_entity last_arrival value, whichever",
>>> +"           the task_struct's sched_info last_arrival value",
>>> +"           or the task_struct's sched_entity exec_start value, whichever",
>>>  "           applies, of selected, or all, tasks; the list is sorted with the",
>>>  "           most recently-run task (with the largest timestamp) shown first,",
>>>  "           followed by the task's current state.",
>>> @@ -1621,7 +1622,7 @@ char *help_ps[] = {
>>>  "     >  9497      1   0  ffff880549ec2ab0  RU   0.0 42314692 138664  oracle",
>>>  " ",
>>>  "  Show all tasks sorted by their task_struct's last_run, timestamp, or",
>>> -"  sched_entity last_arrival timestamp value, whichever applies:\n",
>>> +"  sched_info last_arrival or sched_entity exec_start timestamp value, whichever applies:\n",
>>>  "    %s> ps -l",
>>>  "    [20811245123] [IN] PID: 37    TASK: f7153030  CPU: 2  COMMAND: \"events/2\"",
>>>  "    [20811229959] [IN] PID: 1756  TASK: f2a5a570  CPU: 2  COMMAND: \"ntpd\"",
>>> diff --git a/symbols.c b/symbols.c
>>> index ba5e274..1c40586 100644
>>> --- a/symbols.c
>>> +++ b/symbols.c
>>> @@ -10290,6 +10290,8 @@ dump_offset_table(char *spec, ulong makestruct)
>>>                 OFFSET(sched_entity_my_q));
>>>         fprintf(fp, "            sched_entity_on_rq: %ld\n",
>>>                 OFFSET(sched_entity_on_rq));
>>> +       fprintf(fp, "            sched_entity_exec_start: %ld\n",
>>> +               OFFSET(sched_entity_exec_start));
>>>         fprintf(fp, "             cfs_rq_nr_running: %ld\n",
>>>                 OFFSET(cfs_rq_nr_running));
>>>         fprintf(fp, "            cfs_rq_rb_leftmost: %ld\n",
>>> diff --git a/task.c b/task.c
>>> index 864c838..2c12196 100644
>>> --- a/task.c
>>> +++ b/task.c
>>> @@ -334,9 +334,15 @@ task_init(void)
>>>         if (VALID_MEMBER(task_struct_sched_info))
>>>                 MEMBER_OFFSET_INIT(sched_info_last_arrival,
>>>                         "sched_info", "last_arrival");
>>> +       MEMBER_OFFSET_INIT(task_struct_se, "task_struct", "se");
>>> +       if (VALID_MEMBER(task_struct_se)) {
>>> +               STRUCT_SIZE_INIT(sched_entity, "sched_entity");
>>> +               MEMBER_OFFSET_INIT(sched_entity_exec_start, "sched_entity", "exec_start");
>>> +       }
>>>         if (VALID_MEMBER(task_struct_last_run) ||
>>>             VALID_MEMBER(task_struct_timestamp) ||
>>> -           VALID_MEMBER(sched_info_last_arrival)) {
>>> +           VALID_MEMBER(sched_info_last_arrival) ||
>>> +           VALID_MEMBER(sched_entity_exec_start)) {
>>>                 char buf[BUFSIZE];
>>>                 strcpy(buf, "alias last ps -l");
>>>                 alias_init(buf);
>>> @@ -3559,7 +3565,8 @@ cmd_ps(void)
>>>                 case 'm':
>>>                         if (INVALID_MEMBER(task_struct_last_run) &&
>>>                             INVALID_MEMBER(task_struct_timestamp) &&
>>> -                           INVALID_MEMBER(sched_info_last_arrival)) {
>>> +                           INVALID_MEMBER(sched_info_last_arrival) &&
>>> +                           INVALID_MEMBER(sched_entity_exec_start)) {
>>>                                 error(INFO,
>>>                              "last-run timestamps do not exist in this kernel\n");
>>>                                 argerrs++;
>>> @@ -3574,7 +3581,8 @@ cmd_ps(void)
>>>                 case 'l':
>>>                         if (INVALID_MEMBER(task_struct_last_run) &&
>>>                             INVALID_MEMBER(task_struct_timestamp) &&
>>> -                           INVALID_MEMBER(sched_info_last_arrival)) {
>>> +                           INVALID_MEMBER(sched_info_last_arrival) &&
>>> +                           INVALID_MEMBER(sched_entity_exec_start)) {
>>>                                 error(INFO,
>>>                              "last-run timestamps do not exist in this kernel\n");
>>>                                 argerrs++;
>>> @@ -6020,7 +6028,11 @@ task_last_run(ulong task)
>>>                 timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct +
>>>                         OFFSET(task_struct_sched_info) +
>>>                         OFFSET(sched_info_last_arrival)) : 0;
>>> -
>>> +       else if (VALID_MEMBER(sched_entity_exec_start))
>>> +                       timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct +
>>> +                       OFFSET(task_struct_se) +
>>> +                       OFFSET(sched_entity_exec_start)) : 0;
>>> +
>>>          return timestamp;
>>>  }
>>>
>>> --
>>> 2.20.1