----- Original Message -----
Hello Dave,
Currently, runq command doesn't show tasks in throttled cfs_rqs/rt_rqs,
but sometimes we want to display them to help us find the problem. So
I made the patches to make them displayed.
Two patches are attached. and PATCH2 should be applied after PATCH1.
PATCH1: 0001-runq-make-tasks-in-cfs_rq-displayed-hierarchically.patch
this patch is used to change the original display of the cfs_rq tasks.
for example, before applying PATCH1, the display is:
CPU 2 RUNQUEUE: ffff88002c316680
CURRENT: PID: 4274 TASK: ffff88013aa2eae0 COMMAND: "sh"
RT PRIO_ARRAY: ffff88002c316808
[no tasks queued]
CFS RB_ROOT: ffff88002c316718
[120] PID: 4296 TASK: ffff88010916d540 COMMAND: "sh"
[120] PID: 4258 TASK: ffff880135c5eaa0 COMMAND: "sh"
[120] PID: 4270 TASK: ffff88013aeb6aa0 COMMAND: "sh"
[120] PID: 4268 TASK: ffff88013b6b4040 COMMAND: "sh"
[120] PID: 4272 TASK: ffff88013a918080 COMMAND: "sh"
[120] PID: 4280 TASK: ffff880109028080 COMMAND: "sh"
[120] PID: 4276 TASK: ffff8801388fa080 COMMAND: "sh"
[120] PID: 4278 TASK: ffff880109029540 COMMAND: "sh"
[120] PID: 4260 TASK: ffff88013aeeeae0 COMMAND: "sh"
[130] PID: 4335 TASK: ffff88011ac3d500 COMMAND: "sosreport"
[120] PID: 4284 TASK: ffff8801050a7540 COMMAND: "sh"
[120] PID: 4286 TASK: ffff8801050a6080 COMMAND: "sh"
[120] PID: 4282 TASK: ffff88011ac48aa0 COMMAND: "sh"
[120] PID: 4290 TASK: ffff8801041f7540 COMMAND: "sh"
[120] PID: 4288 TASK: ffff880121530aa0 COMMAND: "sh"
[120] PID: 4292 TASK: ffff8801041f6080 COMMAND: "sh"
[120] PID: 4262 TASK: ffff880137b21500 COMMAND: "sh"
[120] PID: 4266 TASK: ffff88011aeb4aa0 COMMAND: "sh"
[120] PID: 4264 TASK: ffff88010400b540 COMMAND: "sh"
after applying PATCH1, the display is changed to:
CPU 2 RUNQUEUE: ffff88002c316680
CURRENT: PID: 4274 TASK: ffff88013aa2eae0 COMMAND: "sh"
RT PRIO_ARRAY: ffff88002c316808
[no tasks queued]
CFS RB_ROOT: ffff88002c316718
[120] PID: 4296 TASK: ffff88010916d540 COMMAND: "sh"
[120] PID: 4258 TASK: ffff880135c5eaa0 COMMAND: "sh"
GROUP CFS RB_ROOT: ffff880138f50200 <test1>
GROUP CFS RB_ROOT: ffff88001f8d1e00 <test11>
[120] PID: 4270 TASK: ffff88013aeb6aa0 COMMAND: "sh"
[120] PID: 4268 TASK: ffff88013b6b4040 COMMAND: "sh"
[120] PID: 4272 TASK: ffff88013a918080 COMMAND: "sh"
GROUP CFS RB_ROOT: ffff88013a1ba200 <test12>
[120] PID: 4280 TASK: ffff880109028080 COMMAND: "sh"
[120] PID: 4276 TASK: ffff8801388fa080 COMMAND: "sh"
[120] PID: 4278 TASK: ffff880109029540 COMMAND: "sh"
[120] PID: 4260 TASK: ffff88013aeeeae0 COMMAND: "sh"
[130] PID: 4335 TASK: ffff88011ac3d500 COMMAND: "sosreport"
GROUP CFS RB_ROOT: ffff88013bb75400 <test2>
GROUP CFS RB_ROOT: ffff8801381f1000 <test21>
[120] PID: 4284 TASK: ffff8801050a7540 COMMAND: "sh"
[120] PID: 4286 TASK: ffff8801050a6080 COMMAND: "sh"
[120] PID: 4282 TASK: ffff88011ac48aa0 COMMAND: "sh"
GROUP CFS RB_ROOT: ffff88001f816000 <test22>
[120] PID: 4290 TASK: ffff8801041f7540 COMMAND: "sh"
[120] PID: 4288 TASK: ffff880121530aa0 COMMAND: "sh"
[120] PID: 4292 TASK: ffff8801041f6080 COMMAND: "sh"
[120] PID: 4262 TASK: ffff880137b21500 COMMAND: "sh"
[120] PID: 4266 TASK: ffff88011aeb4aa0 COMMAND: "sh"
[120] PID: 4264 TASK: ffff88010400b540 COMMAND: "sh"
PACTH2:
0002-runq-make-tasks-in-throttled-cfs_rqs-rt_rqs-displaye.patch
this patch is used to display tasks in throttled cfs_rqs and rt_rqs.
for example, before applying PATCH2:
CPU 1 RUNQUEUE: ffff88002c296680
CURRENT: PID: 3494 TASK: ffff88011bcaa080 COMMAND: "sh"
RT PRIO_ARRAY: ffff88002c296808
[no tasks queued]
CFS RB_ROOT: ffff88002c296718
[120] PID: 3516 TASK: ffff88011bfdd500 COMMAND: "sh"
[120] PID: 3514 TASK: ffff88011bef2ae0 COMMAND: "sh"
after applying PATCH2:
CPU 1 RUNQUEUE: ffff88002c296680
CURRENT: PID: 3494 TASK: ffff88011bcaa080 COMMAND: "sh"
RT PRIO_ARRAY: ffff88002c296808
[ 0] GROUP RT PRIO_ARRAY: ffff88013b351800 <test1> (THROTTLED)
[ 0] GROUP RT PRIO_ARRAY: ffff88013acdb800 <test11>
[ 0] PID: 3546 TASK: ffff88010d05e040 COMMAND: "rtloop99"
[ 1] PID: 3541 TASK: ffff88011bef2080 COMMAND: "rtloop98"
[ 54] PID: 3536 TASK: ffff88011bf28080 COMMAND: "rtloop45"
[ 98] PID: 3530 TASK: ffff88013ab1aaa0 COMMAND: "rtloop1"
[ 1] PID: 3540 TASK: ffff88011bf29540 COMMAND: "rtloop98"
[ 54] PID: 3534 TASK: ffff88013ab14040 COMMAND: "rtloop45"
[ 54] GROUP RT PRIO_ARRAY: ffff88013a207800 <test12>
[ 54] PID: 3537 TASK: ffff88013a378ae0 COMMAND: "rtloop45"
[ 98] PID: 3531 TASK: ffff88011bdae080 COMMAND: "rtloop1"
CFS RB_ROOT: ffff88002c296718
[120] PID: 3516 TASK: ffff88011bfdd500 COMMAND: "sh"
[120] PID: 3514 TASK: ffff88011bef2ae0 COMMAND: "sh"
Another example to show the throttled tasks in cfs_rqs:
CPU 1 RUNQUEUE: ffff880028296680
CURRENT: PID: 7814 TASK: ffff880117bab500 COMMAND: "sh"
RT PRIO_ARRAY: ffff880028296808
[no tasks queued]
CFS RB_ROOT: ffff880028296718
[120] PID: 2619 TASK: ffff88013812c040 COMMAND: "Xorg"
......
[120] PID: 7798 TASK: ffff88013a39eae0 COMMAND: "sh"
GROUP CFS RB_ROOT: ffff880135fc6800 <test1> (DEQUEUED)
GROUP CFS RB_ROOT: ffff88000ce5bc00 <test12> (THROTTLED)
[120] PID: 7820 TASK: ffff8801383ab500 COMMAND: "sh"
[120] PID: 7818 TASK: ffff88013ba14ae0 COMMAND: "sh"
[120] PID: 7816 TASK: ffff880117baa040 COMMAND: "sh"
GROUP CFS RB_ROOT: ffff880135fc6800 <test1> (DEQUEUED)
GROUP CFS RB_ROOT: ffff8800b0b32000 <test11> (THROTTLED)
[120] PID: 7810 TASK: ffff8800b0b96aa0 COMMAND: "sh"
*NOTE*
1. The patch also displays the group name that the cfs_rq/rt_rq belows to.
2. There are two statuses about the rqs: DEQUEUED and THROTTLED.
THROTTLED means this cfs_rq/rt_rq is throttled. And DEQUEUED means the
running entities in the rq are 0 because no running task on it and its
child rqs are throttled.
Thanks
Zhang Yanfei
Hello Zhang,
I tested your patch against a sample set of dumpfiles as
well as live on my 3.6.2-4.fc17 kernel. Here are my
results and comments.
First, when posting patches, please run the build with
"make warn" in order to catch all these types of complaints:
$ make warn
... [ cut ] ...
cc -c -g -DX86_64 -DGDB_7_3_1 task.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes
-fstack-protector
task.c: In function 'dump_task_group_name':
task.c:7545:7: warning: unused variable 'buf' [-Wunused-variable]
task.c: In function 'dump_tasks_in_cfs_rq':
task.c:7584:8: warning: unused variable 'p1' [-Wunused-variable]
task.c: In function 'dump_RT_prio_array':
task.c:8050:8: warning: unused variable 'p1h [-Wunused-variable]
task.c: In function 'cmd_runq':
task.c:8012:31: warning: 'root_task_group' may be used uninitialized in this
function [-Wmaybe-uninitialized]
task.c:7887:8: note: 'root_task_group' was declared here
...
Anyway, when I run this patch on a live Fedora 3.6.2-4.fc17.x86_64
kernel, it shows this *every* time on the active "crash" task:
crash> set
PID: 6825
COMMAND: "crash"
TASK: ffff8801df8bae20 [THREAD_INFO: ffff88020e55a000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash> runq
CPU 0 RUNQUEUE: ffff88021e213cc0
CURRENT: PID: 6825 TASK: ffff8801df8bae20 COMMAND: "crash"
RT PRIO_ARRAY: ffff88021e213e28
[no tasks queued]
CFS RB_ROOT: ffff88021e213d58
GROUP CFS RB_ROOT: ffff8801ebb48200runq: invalid kernel virtual address: 48 type:
"cgroup dentry"
crash>
So clearly that's a problem that needs addressing -- did you
test this patch on a live system?
I see similar issues on vmcores that were taken using the
"snap.so" snapshot extension module. Now, I admit that in
those cases it is certainly possible that the scheduler
infrastructure is undergoing changes *while* the snapshot
was being taken. However, I never see a problem with the
unpatched "runq" command, which always shows the active tasks
correctly. Here are a few examples:
On a 2.6.40.4-5.fc15 snapshot, here is a situation where
using the current runq command, it shows:
CPU 1 RUNQUEUE: ffff88003fc92540
CURRENT: PID: 1341 TASK: ffff880037409730 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc92690
[no tasks queued]
CFS RB_ROOT: ffff88003fc925d8
[no tasks queued]
with your patch, it results in this:
CPU 1 RUNQUEUE: ffff88003fc92540
CURRENT: PID: 1341 TASK: ffff880037409730 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc92690
[no tasks queued]
CFS RB_ROOT: ffff88003fc925d8
[no tasks queued]
[no tasks queued]
On a 2.6.29.4-167.fc11 snapshot, the current runq command shows:
crash> runq
CPU 0 RUNQUEUE: ffff88000101b300
CURRENT: PID: 0 TASK: ffffffff81584360 COMMAND: "swapper"
RT PRIO_ARRAY: ffff88000101b420
[no tasks queued]
CFS RB_ROOT: ffff88000101b398
[no tasks queued]
CPU 1 RUNQUEUE: ffff880001029300
CURRENT: PID: 19625 TASK: ffff8800764e5c00 COMMAND: "crash"
RT PRIO_ARRAY: ffff880001029420
[no tasks queued]
CFS RB_ROOT: ffff880001029398
[no tasks queued]
with your patch, it results in this:
crash> runq
runq: gdb request failed: print &((struct rt_rq *)0x0)->highest_prio.curr
crash>
This is from your patch:
sprintf(buf, "print &((struct rt_rq
*)0x0)->highest_prio.curr");
It is always preferable to use OFFSET() if pre-stored, or at least
use MEMBER_OFFSET() if not, instead of invoking gdb like this. You
already have "offset_table.rt_rq_highest_prio_curr" set up -- why aren't
you using it here? I also saw the same error as above in a 2.6.29.2-52.fc10
snapshot, so perhaps it's a kernel version dependency that you are not
accouting for? In any case, the failure mode above is unacceptable.
Here on another 3.2.6-3.fc16 snapshot, I see this:
CPU 0 RUNQUEUE: ffff88003fc13780
CURRENT: PID: 1383 TASK: ffff88003c932e40 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc13910
[no tasks queued]
CFS RB_ROOT: ffff88003fc13820
[no tasks queued]
With your patch:
CPU 0 RUNQUEUE: ffff88003fc13780
CURRENT: PID: 1383 TASK: ffff88003c932e40 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc13910
[no tasks queued]
CFS RB_ROOT: ffff88003fc13820
GROUP CFS RB_ROOT: ffff88003a432c00runq: invalid kernel virtual address: 38 type:
"cgroup dentry"
So anyway, your patch should be able to at least work as
well on live systems and snapshots as the current runq
command does.
Let me note here that the remainder of the examples below are
from actual crash dumps, i.e. not "snapshots".
First, to be honest about this, I wonder whether the additional task
group data is more, or less, understandable. For example, this example
is from a 2.6.32-220.el6 kernel.
With the current runq:
crash> runq
CPU 0 RUNQUEUE: ffff880133c15fc0
CURRENT: PID: 3144 TASK: ffff88022a446040 COMMAND: "qemu-kvm"
RT PRIO_ARRAY: ffff880133c16148
[no tasks queued]
CFS RB_ROOT: ffff880133c16058
[no tasks queued]
CPU 1 RUNQUEUE: ffff880028215fc0
CURRENT: PID: 2948 TASK: ffff88022af2a100 COMMAND: "bash"
RT PRIO_ARRAY: ffff880028216148
[no tasks queued]
CFS RB_ROOT: ffff880028216058
[120] PID: 3248 TASK: ffff88012a9d4100 COMMAND: "qemu-kvm"
...
With your patch:
crash>
CPU 0 RUNQUEUE: ffff880133c15fc0
CURRENT: PID: 3144 TASK: ffff88022a446040 COMMAND: "qemu-kvm"
RT PRIO_ARRAY: ffff880133c16148
[no tasks queued]
CFS RB_ROOT: ffff880133c16058
GROUP CFS RB_ROOT: ffff88022ac8f000 <libvirt>
GROUP CFS RB_ROOT: ffff88022c075000 <qemu>
GROUP CFS RB_ROOT: ffff88022af94c00 <guest1>
GROUP CFS RB_ROOT: ffff88022aab2600 <vcpu1>
[no tasks queued]
[no tasks queued]
[no tasks queued]
[no tasks queued]
[no tasks queued]
CPU 1 RUNQUEUE: ffff880028215fc0
CURRENT: PID: 2948 TASK: ffff88022af2a100 COMMAND: "bash"
RT PRIO_ARRAY: ffff880028216148
[no tasks queued]
CFS RB_ROOT: ffff880028216058
GROUP CFS RB_ROOT: ffff88012c5d1000 <libvirt>
GROUP CFS RB_ROOT: ffff88012c663e00 <qemu>
GROUP CFS RB_ROOT: ffff88012bb56000 <guest2>
GROUP CFS RB_ROOT: ffff88012b012000 <vcpu0>
[120] PID: 3248 TASK: ffff88012a9d4100 COMMAND:
"qemu-kvm"
...
On CPU 0, there are no other tasks queued, and so the current
runq command shows just that -- whereas your patch shows all of the
other empty RB_ROOT structures. Why bother showing them at all?
And on CPU 1, can you please condense the display a bit? Why does each
GROUP line have to get indented by 6 spaces? Why can't it be indented by
just 3 spaces like the first group?
Here's another 2.6.32-131.0.15.el6 vmcore. With the current runq
command these two cpus look like this:
crash> runq
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND: "loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND: "loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND: "loop.ABB"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
... [ cut ] ...
CPU 6 RUNQUEUE: ffff88000a395f80
CURRENT: PID: 28230 TASK: ffff8800373d2b40 COMMAND: "loop.A"
RT PRIO_ARRAY: ffff88000a396098
[no tasks queued]
CFS RB_ROOT: ffff88000a396010
[120] PID: 28270 TASK: ffff88007812ab40 COMMAND: "loop.ABB"
[120] PID: 28261 TASK: ffff880037cc5540 COMMAND: "loop.ABA"
[120] PID: 28244 TASK: ffff88007b4f6a80 COMMAND: "loop.AB"
[120] PID: 28259 TASK: ffff880075978080 COMMAND: "loop.AAB"
[120] PID: 28257 TASK: ffff8800780a0a80 COMMAND: "loop.AAB"
[120] PID: 28258 TASK: ffff880075979500 COMMAND: "loop.AAB"
[120] PID: 28254 TASK: ffff880037d2a040 COMMAND: "loop.AAA"
[120] PID: 28253 TASK: ffff88007b534100 COMMAND: "loop.AAA"
[120] PID: 28255 TASK: ffff880078628080 COMMAND: "loop.AAA"
[120] PID: 28231 TASK: ffff880037b14b40 COMMAND: "loop.A"
...
With your patch:
crash> runq
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND:
"loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND:
"loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND:
"loop.ABB"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
... [ cut ] ...
CPU 6 RUNQUEUE: ffff88000a395f80
CURRENT: PID: 28230 TASK: ffff8800373d2b40 COMMAND: "loop.A"
RT PRIO_ARRAY: ffff88000a396098
[no tasks queued]
CFS RB_ROOT: ffff88000a396010
[120] PID: 28270 TASK: ffff88007812ab40 COMMAND:
"loop.ABB"
[120] PID: 28261 TASK: ffff880037cc5540 COMMAND:
"loop.ABA"
[120] PID: 28244 TASK: ffff88007b4f6a80 COMMAND: "loop.AB"
[120] PID: 28259 TASK: ffff880075978080 COMMAND:
"loop.AAB"
[120] PID: 28257 TASK: ffff8800780a0a80 COMMAND:
"loop.AAB"
[120] PID: 28258 TASK: ffff880075979500 COMMAND:
"loop.AAB"
[120] PID: 28254 TASK: ffff880037d2a040 COMMAND:
"loop.AAA"
[120] PID: 28253 TASK: ffff88007b534100 COMMAND:
"loop.AAA"
[120] PID: 28255 TASK: ffff880078628080 COMMAND:
"loop.AAA"
[120] PID: 28231 TASK: ffff880037b14b40 COMMAND: "loop.A"
I'm not sure what the reason is, but that display is clearly unnacceptable.
On a 3.2.1-0.10.el7 vmcore, I see this with the current runq command:
crash> runq
... [ cut ] ...
CPU 3 RUNQUEUE: ffff8804271d43c0
CURRENT: PID: 11615 TASK: ffff88020c50a670 COMMAND: "runtest.sh"
RT PRIO_ARRAY: ffff8804271d4590
[no tasks queued]
CFS RB_ROOT: ffff8804271d44a0
[no tasks queued]
...
With your patch, the command aborts here:
crash> runq
... [ cut ] ...
CPU 3 RUNQUEUE: ffff8804271d43c0
CURRENT: PID: 11615 TASK: ffff88020c50a670 COMMAND: "runtest.sh"
RT PRIO_ARRAY: ffff8804271d4590
[no tasks queued]
CFS RB_ROOT: ffff8804271d44a0
GROUP CFS RB_ROOT: ffff88041e0d2760runq: invalid kernel virtual address: 38 type:
"cgroup dentry"
crash>
And lastly, on a 2.6.27-0.244 I saw this again:
crash> runq
runq: gdb request failed: print &((struct rt_rq *)0x0)->highest_prio.curr
crash>
Let me know if you would like to have any of the vmlinux/vmcore
pairs above. If they are not too huge, I can make them available
for you to download from my
people.redhat.com site. Although,
some of the dumpfiles were created and forwarded to me by your
compatriot Daisuke Hatayama... ;-)
And for that matter, I will need for you to make available to me
a sample vmcore that shows the (DEQUEUED) and (THROTTLED) status
so I can have it on hand for testing purposes.
Thanks,
Dave