-----Original Message-----
> But I found a problem, is that the se.exec_start is from
rq->clock_task,
> not from rq->clock like last_arrival. The rq->clock_task may not contain
> irq/steal time, please see update_rq_clock_task().
>
I am understood after looking at below code.
static void sched_info_arrive(struct rq *rq, struct task_struct *t)
{
...
now = rq_clock(rq); //<<--
...
t->sched_info.last_arrival = now;
}
static void update_curr(struct cfs_rq *cfs_rq)
{
u64 now = rq_clock_task(rq_of(cfs_rq)); //<<--
...
curr->exec_start = now;
...
}
Yes, right.
> This causes the following issues with a vmcore generated on a
machine,
> which had run for a while (273 days):
>
Let me ask a question at this point.
The vmcore was generated from machine which has been running 273 days.
I wonder the kernel image of machine(running 273 days) is without
CONFIG_SCHEDSTATS/CONFIG_SCHED_INFO.
If output is based on vmcore without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO,
you can ignore above question.
oh, I should have explain it.
I have no vmcore without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO on hand, so
I dropped the code using last_arrival to simulate it and use the se.exec_start.
--- a/task.c
+++ b/task.c
@@ -6024,10 +6024,6 @@ task_last_run(ulong task)
} else if (VALID_MEMBER(task_struct_timestamp))
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
OFFSET(task_struct_timestamp)) : 0;
- else if (VALID_MEMBER(sched_info_last_arrival))
- timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
- OFFSET(task_struct_sched_info) +
- OFFSET(sched_info_last_arrival)) : 0;
else if (VALID_MEMBER(sched_entity_exec_start))
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct
+
OFFSET(task_struct_se) +
crash> help -o | grep -e task_struct_last_run -e task_struct_timestamp -e
sched_info_last_arrival -e sched_entity_exec_start
task_struct_last_run: -1
task_struct_timestamp: -1
sched_info_last_arrival: 16
sched_entity_exec_start: 72
crash> ps -l
[23601334315087174] [RU] PID: 4023608 TASK: ffff916f7c6b1840 CPU: 15 COMMAND:
"makedumpfile"
...
crash> task 4023608 | grep exec_start
exec_start = 23601334315087174,
I would like to remind that as commit description said, the patch is
**only** valid
without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
When kernel image is with CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO and then
"ps -l|-m" command is used, task_last_run() is called from its subroutines.
ulonglong
task_last_run(ulong task)
{
ulong last_run;
...
else if (VALID_MEMBER(sched_info_last_arrival)) // With
CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
OFFSET(task_struct_sched_info) +
OFFSET(sched_info_last_arrival)) : 0;
+ else if (VALID_MEMBER(sched_entity_exec_start)) // Without
CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO
+ timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
+ OFFSET(task_struct_se) +
+ OFFSET(sched_entity_exec_start)) : 0;
return timestamp;
}
With CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO enabled,
the statement under 'VALID_MEMBER(sched_entity_exec_start)' is not called.
> crash> ps -m | head
> [ 0 00:59:36.582] [RU] PID: 4023608 TASK: ffff916f7c6b1840 CPU: 15 COMMAND:
"makedumpfile"
> ^^^^^^^^^(1)
> [ 0 00:59:37.831] [ID] PID: 413 TASK: ffff916f772d3080 CPU: 15 COMMAND:
"kworker/15:1H"
> [ 0 00:59:39.765] [IN] PID: 3929504 TASK: ffff916f5f0748c0 CPU: 15 COMMAND:
"respawn_actlog"
> [ 0 00:59:40.650] [IN] PID: 1974 TASK: ffff91647dc53080 CPU: 15 COMMAND:
"CPU 2/KVM"
> [ 0 00:59:41.925] [IN] PID: 1297 TASK: ffff916f63c46100 CPU: 15 COMMAND:
"NetworkManager"
> [ 0 00:59:42.944] [ID] PID: 3763057 TASK: ffff9160c4519840 CPU: 15 COMMAND:
"kworker/15:0"
> [ 0 00:59:42.944] [IN] PID: 101 TASK: ffff916040c91840 CPU: 15 COMMAND:
"migration/15"
> [ 0 00:59:43.078] [IN] PID: 100 TASK: ffff916040c5b080 CPU: 15 COMMAND:
"watchdog/15"
> [ 0 00:59:47.533] [IN] PID: 1292 TASK: ffff916f63c43080 CPU: 15 COMMAND:
"lsmd"
> [ 0 00:59:49.089] [IN] PID: 113105 TASK: ffff9160412248c0 CPU: 15 COMMAND:
"kvm-nx-lpage-re"
> ^^^^^^^(2)
> (1) large difference from zero
> (2) large differences among CPUs (probably due to the differences of irq time)
>
> (1) might be solved with rq->clock_task, but (2) will be misleading and
confusing.
> So currently I'm thinking that the "ps -l|-m" options should not use
the se.exec_start.
> What do you think?
>
Agreed. (As you indicated) Since rq->clock cannot hold accurate
timestamp we can rely on,
"ps -l|-m" options should not access the se.exec_start in case of kernel image
without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
Thanks for your code review.
If there is something to improve, let me propose a patch again.
Sure.
I think that the se.exec_start itself may be somewhat useful, especially for
dumpfiles without CONFIG_SCHED{STATS,_INFO}, but not suitable for the current
"ps -l|-m" behavior. There may be another way to make good use of it.
Thanks,
Kazu