On Tue, Mar 15, 2022 at 7:17 PM Austin Kim <austindh.kim(a)gmail.com> wrote:
Hello, Lijiang
2022년 3월 15일 (화) 오후 2:29, lijiang <lijiang(a)redhat.xn--com>-4f21ay07k 작성:
>
> On Tue, Mar 15, 2022 at 1:08 PM lijiang <lijiang(a)redhat.com> wrote:
>>
>> On Thu, Mar 3, 2022 at 10:43 AM <crash-utility-request(a)redhat.com>
wrote:
>>>
>>> Date: Wed, 2 Mar 2022 21:37:45 +0000
>>> From: Austin Kim <austindh.kim(a)gmail.com>
>>> To: k-hagio-ab(a)nec.com, lijiang(a)redhat.com, crash-utility(a)redhat.com
>>> Cc: kernel-team(a)lge.com, mikeseohyungjin(a)gmail.com
>>> Subject: [Crash-utility] [PATCH v3] ps: Add support to "ps -l|-m"
to
>>> properly display process
>>> Message-ID: <20220302213745.GA868@raspberrypi>
>>> Content-Type: text/plain; charset=us-ascii
>>>
>>> Sometimes kernel image is generated without CONFIG_SCHEDSTATS or
CONFIG_SCHED_INFO.
>>> Where relevant commit id is f6db83479932 ("sched/stat: Simplify the
sched_info accounting")
>>>
>>> - CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
>>> - CONFIG_SCHEDSTATS: KERNEL_VERSION < LINUX(4,2,0)
>>>
>>> Running crash-utility with above kernel image,
>>> "ps -l" option cannot display all processes sorted with most
recently-run process.
>>> Also "ps -m" option cannot display all processes with timestamp.
>>>
>>> crash> ps -l or crash> ps -m
>>> ps: last-run timestamps do not exist in this kernel
>>> Usage: ps [-k|-u|-G] [-s]
>>> [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
>>> [pid | task | command] ...
>>> Enter "help ps" for details.
>>>
>>> This is because output of "ps -l|-m" depends on
task_struct.sched_info.last_arrival.
>>>
>>> Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, 'sched_info' field
is
not included
>>> in task_struct. In this case we make "ps -l|-m" option to access
'exec_start'
>>> field of sched_entity where 'exec_start' is
task_struct.se.exec_start.
>>>
>>> The 'task_struct.se.exec_start' contains the most recently-executed
timestamp
>>> when process is running in the below cases;
>>>
>>> - enqueued to runqueue
>>> - dequeued from runqueue
>>> - scheduler tick is invoked
>>> - etc
>>>
>>> 'task_struct.se.exec_start' could be one of statistics which
indicates
the most
>>> recently-run timestamp of process activity.
>>
>>
>> Thank you for the update, Austin.
>>
>> If the task is migraged,
>
> ^^^^^^^^
>
> Sorry, it's a typo, replace it with the "migrated". Thanks.
>
>>
>> the value of the 'task_struct.se.exec_start ' will be set to zero and
may be updated in the next scheduling with the new time. In this case, does
the patch still work well as expected?
Thanks for feedback. Theoretically you are right.
https://elixir.bootlin.com/linux/v5.15.28/source/kernel/sched/fair.c
static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
{
..
/* We have migrated, no longer consider this task hot */
p->se.exec_start = 0;
update_scan_period(p, new_cpu);
}
If we set break point at 'update_scan_period(p, new_cpu)' and then
system stops at this point using debugger like GDB,
we can have misleading information as you said.
Without this condition, it is unlikely to see p->se.exec_start is 0
since 'p->se.exec_start' is quite updated very often(ex: scheduler
tick, etc) on real target device.
Thank you for the explanation, Austin.
In multi-core systems, the task migration may occur frequently due to the
load balancing, which
is more common than breakpoints. Unless the task is bound to the cpu or
running on a single cpu system.
The sched entity is associated with a specific cpu(cpu rq), that is to say,
the "-C" option is used implicitly in this patch. I would suggest pointing
out this behavior in the help or outputting a warning explicitly for users.
That is my concern.
Thanks.
Lianbo
As below link indicated, Kazu already mentioned that there are no
useful way to display all processes
with recently-run order using se.exec_start from crash-utility point
of view. Also I agreed the feedback.
https://listman.redhat.com/archives/crash-utility/2022-March/009615.html
Anyway thanks for feedback over patch in details. Hope to discuss
another patch I will propose.
BR,
Austin Kim
>>
>> Thanks.
>> Lianbo
>>
>>> With this patch, "ps -l|-m" option works well without
CONFIG_SCHEDSTATS or
>>> CONFIG_SCHED_INFO.
>>>
>>> Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
>>> ---
>>> defs.h | 1 +
>>> help.c | 5 +++--
>>> symbols.c | 2 ++
>>> task.c | 20 ++++++++++++++++----
>>> 4 files changed, 22 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/defs.h b/defs.h
>>> index bf2c59b..841bd0b 100644
>>> --- a/defs.h
>>> +++ b/defs.h
>>> @@ -2168,6 +2168,7 @@ struct offset_table { /*
stash of commonly-used offsets */
>>> long sbitmap_queue_min_shallow_depth;
>>> long sbq_wait_state_wait_cnt;
>>> long sbq_wait_state_wait;
>>> + long sched_entity_exec_start;
>>> };
>>>
>>> struct size_table { /* stash of commonly-used sizes */
>>> diff --git a/help.c b/help.c
>>> index 8347668..6ca7c92 100644
>>> --- a/help.c
>>> +++ b/help.c
>>> @@ -1442,7 +1442,8 @@ char *help_ps[] = {
>>> " and system times.",
>>> " -l display the task's last-run timestamp value, using
either
the",
>>> " task_struct's last_run value, the task_struct's
timestamp
value",
>>> -" or the task_struct's sched_entity last_arrival value,
whichever",
>>> +" the task_struct's sched_info last_arrival
value",
>>> +" or the task_struct's sched_entity exec_start value,
whichever",
>>> " applies, of selected, or all, tasks; the list is sorted
with the",
>>> " most recently-run task (with the largest timestamp) shown
first,",
>>> " followed by the task's current state.",
>>> @@ -1621,7 +1622,7 @@ char *help_ps[] = {
>>> " > 9497 1 0 ffff880549ec2ab0 RU 0.0 42314692
138664
oracle",
>>> " ",
>>> " Show all tasks sorted by their task_struct's last_run,
timestamp,
or",
>>> -" sched_entity last_arrival timestamp value, whichever
applies:\n",
>>> +" sched_info last_arrival or sched_entity exec_start timestamp
value, whichever applies:\n",
>>> " %s> ps -l",
>>> " [20811245123] [IN] PID: 37 TASK: f7153030 CPU: 2 COMMAND:
\"events/2\"",
>>> " [20811229959] [IN] PID: 1756 TASK: f2a5a570 CPU: 2 COMMAND:
\"ntpd\"",
>>> diff --git a/symbols.c b/symbols.c
>>> index ba5e274..1c40586 100644
>>> --- a/symbols.c
>>> +++ b/symbols.c
>>> @@ -10290,6 +10290,8 @@ dump_offset_table(char *spec, ulong makestruct)
>>> OFFSET(sched_entity_my_q));
>>> fprintf(fp, " sched_entity_on_rq: %ld\n",
>>> OFFSET(sched_entity_on_rq));
>>> + fprintf(fp, " sched_entity_exec_start: %ld\n",
>>> + OFFSET(sched_entity_exec_start));
>>> fprintf(fp, " cfs_rq_nr_running: %ld\n",
>>> OFFSET(cfs_rq_nr_running));
>>> fprintf(fp, " cfs_rq_rb_leftmost: %ld\n",
>>> diff --git a/task.c b/task.c
>>> index 864c838..2c12196 100644
>>> --- a/task.c
>>> +++ b/task.c
>>> @@ -334,9 +334,15 @@ task_init(void)
>>> if (VALID_MEMBER(task_struct_sched_info))
>>> MEMBER_OFFSET_INIT(sched_info_last_arrival,
>>> "sched_info", "last_arrival");
>>> + MEMBER_OFFSET_INIT(task_struct_se, "task_struct",
"se");
>>> + if (VALID_MEMBER(task_struct_se)) {
>>> + STRUCT_SIZE_INIT(sched_entity, "sched_entity");
>>> + MEMBER_OFFSET_INIT(sched_entity_exec_start,
"sched_entity", "exec_start");
>>> + }
>>> if (VALID_MEMBER(task_struct_last_run) ||
>>> VALID_MEMBER(task_struct_timestamp) ||
>>> - VALID_MEMBER(sched_info_last_arrival)) {
>>> + VALID_MEMBER(sched_info_last_arrival) ||
>>> + VALID_MEMBER(sched_entity_exec_start)) {
>>> char buf[BUFSIZE];
>>> strcpy(buf, "alias last ps -l");
>>> alias_init(buf);
>>> @@ -3559,7 +3565,8 @@ cmd_ps(void)
>>> case 'm':
>>> if (INVALID_MEMBER(task_struct_last_run) &&
>>> INVALID_MEMBER(task_struct_timestamp)
&&
>>> - INVALID_MEMBER(sched_info_last_arrival)) {
>>> + INVALID_MEMBER(sched_info_last_arrival)
&&
>>> + INVALID_MEMBER(sched_entity_exec_start)) {
>>> error(INFO,
>>> "last-run timestamps do not exist in this
kernel\n");
>>> argerrs++;
>>> @@ -3574,7 +3581,8 @@ cmd_ps(void)
>>> case 'l':
>>> if (INVALID_MEMBER(task_struct_last_run) &&
>>> INVALID_MEMBER(task_struct_timestamp)
&&
>>> - INVALID_MEMBER(sched_info_last_arrival)) {
>>> + INVALID_MEMBER(sched_info_last_arrival)
&&
>>> + INVALID_MEMBER(sched_entity_exec_start)) {
>>> error(INFO,
>>> "last-run timestamps do not exist in this
kernel\n");
>>> argerrs++;
>>> @@ -6020,7 +6028,11 @@ task_last_run(ulong task)
>>> timestamp = tt->last_task_read ?
ULONGLONG(tt->task_struct +
>>> OFFSET(task_struct_sched_info) +
>>> OFFSET(sched_info_last_arrival)) : 0;
>>> -
>>> + else if (VALID_MEMBER(sched_entity_exec_start))
>>> + timestamp = tt->last_task_read ?
ULONGLONG(tt->task_struct +
>>> + OFFSET(task_struct_se) +
>>> + OFFSET(sched_entity_exec_start)) : 0;
>>> +
>>> return timestamp;
>>> }
>>>
>>> --
>>> 2.20.1