Hello Dave,
Sorry for not testing the patch thoroughly enough. I also think we
should discuss the first patch further. I have done some tests with
the patch, and I have attached it. Could you please test it on your
box again?
Hello Zhang,
I tested your patch against a sample set of dumpfiles as
well as live on my 3.6.2-4.fc17 kernel. Here are my
results and comments.
First, when posting patches, please run the build with
"make warn" in order to catch all these types of complaints:
$ make warn
... [ cut ] ...
cc -c -g -DX86_64 -DGDB_7_3_1 task.c -Wall -O2 -Wstrict-prototypes
-Wmissing-prototypes -fstack-protector
task.c: In function 'dump_task_group_name':
task.c:7545:7: warning: unused variable 'buf' [-Wunused-variable]
task.c: In function 'dump_tasks_in_cfs_rq':
task.c:7584:8: warning: unused variable 'p1' [-Wunused-variable]
task.c: In function 'dump_RT_prio_array':
task.c:8050:8: warning: unused variable 'p1' [-Wunused-variable]
task.c: In function 'cmd_runq':
task.c:8012:31: warning: 'root_task_group' may be used uninitialized in this
function [-Wmaybe-uninitialized]
task.c:7887:8: note: 'root_task_group' was declared here
...
My neglect.
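For illustration, the -Wmaybe-uninitialized complaint goes away once
root_task_group gets a definite initial value on every path. A minimal
sketch, assuming the usual crash internals environment (defs.h) -- this
is not the actual patch code:

    /* sketch only: give root_task_group a definite initial value so
     * gcc cannot see a path that uses it before assignment */
    ulong root_task_group = 0;

    if (symbol_exists("root_task_group"))
            root_task_group = symbol_value("root_task_group");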
Anyway, when I run this patch on a live Fedora 3.6.2-4.fc17.x86_64
kernel, it shows this *every* time on the active "crash" task:
crash> set
PID: 6825
COMMAND: "crash"
TASK: ffff8801df8bae20 [THREAD_INFO: ffff88020e55a000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash> runq
CPU 0 RUNQUEUE: ffff88021e213cc0
CURRENT: PID: 6825 TASK: ffff8801df8bae20 COMMAND: "crash"
RT PRIO_ARRAY: ffff88021e213e28
[no tasks queued]
CFS RB_ROOT: ffff88021e213d58
GROUP CFS RB_ROOT: ffff8801ebb48200runq: invalid kernel virtual address: 48 type:
"cgroup dentry"
crash>
So clearly that's a problem that needs addressing -- did you
test this patch on a live system?
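For what it's worth, the fault address (48) looks like a member offset
applied to a NULL cgroup dentry pointer. A minimal sketch of the kind
of guard that avoids dying inside readmem() -- the "dentry" variable
and its surroundings are assumed here, not taken from your patch:

    /* sketch only: guard a possibly-NULL cgroup dentry before using
     * it, so a missing cgroup shows "<unknown>" instead of faulting
     * on "NULL + offset" */
    if (!dentry || !IS_KVADDR(dentry)) {
            fprintf(fp, " <unknown>\n");
            return;
    }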
I see similar issues on vmcores that were taken using the
"snap.so" snapshot extension module. Now, I admit that in
those cases it is certainly possible that the scheduler
infrastructure was undergoing changes *while* the snapshot
was being taken. However, I never see a problem with the
unpatched "runq" command, which always shows the active tasks
correctly. Here are a few examples:
On a 2.6.40.4-5.fc15 snapshot, here is what the current runq
command shows:
CPU 1 RUNQUEUE: ffff88003fc92540
CURRENT: PID: 1341 TASK: ffff880037409730 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc92690
[no tasks queued]
CFS RB_ROOT: ffff88003fc925d8
[no tasks queued]
with your patch, it results in this:
CPU 1 RUNQUEUE: ffff88003fc92540
CURRENT: PID: 1341 TASK: ffff880037409730 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc92690
[no tasks queued]
CFS RB_ROOT: ffff88003fc925d8
[no tasks queued]
[no tasks queued]
This is fixed in the new patch, I think.
On a 2.6.29.4-167.fc11 snapshot, the current runq command shows:
crash> runq
CPU 0 RUNQUEUE: ffff88000101b300
CURRENT: PID: 0 TASK: ffffffff81584360 COMMAND: "swapper"
RT PRIO_ARRAY: ffff88000101b420
[no tasks queued]
CFS RB_ROOT: ffff88000101b398
[no tasks queued]
CPU 1 RUNQUEUE: ffff880001029300
CURRENT: PID: 19625 TASK: ffff8800764e5c00 COMMAND: "crash"
RT PRIO_ARRAY: ffff880001029420
[no tasks queued]
CFS RB_ROOT: ffff880001029398
[no tasks queued]
with your patch, it results in this:
crash> runq
runq: gdb request failed: print &((struct rt_rq *)0x0)->highest_prio.curr
crash>
This will be fixed in patch2 later.
This is from your patch:
> sprintf(buf, "print &((struct rt_rq *)0x0)->highest_prio.curr");
It is always preferable to use OFFSET() if pre-stored, or at least
use MEMBER_OFFSET() if not, instead of invoking gdb like this. You
already have "offset_table.rt_rq_highest_prio_curr" set up -- why aren't
you using it here? I also saw the same error as above in a 2.6.29.2-52.fc10
snapshot, so perhaps it's a kernel version dependency that you are not
accounting for? In any case, the failure mode above is unacceptable.
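To illustrate what I mean -- a minimal sketch only, where "rt_rq" is
an assumed local holding the rt_rq address, not code from your patch:

    /* sketch only: prefer the pre-stored offset over composing a gdb
     * expression, and skip the read when the member doesn't exist
     * (older kernels, e.g. 2.6.29) instead of failing */
    int curr_prio = -1;

    if (VALID_MEMBER(rt_rq_highest_prio_curr))
            readmem(rt_rq + OFFSET(rt_rq_highest_prio_curr), KVADDR,
                    &curr_prio, sizeof(int),
                    "rt_rq highest_prio.curr", FAULT_ON_ERROR);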
Here on another 3.2.6-3.fc16 snapshot, I see this:
CPU 0 RUNQUEUE: ffff88003fc13780
CURRENT: PID: 1383 TASK: ffff88003c932e40 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc13910
[no tasks queued]
CFS RB_ROOT: ffff88003fc13820
[no tasks queued]
With your patch:
CPU 0 RUNQUEUE: ffff88003fc13780
CURRENT: PID: 1383 TASK: ffff88003c932e40 COMMAND: "crash"
RT PRIO_ARRAY: ffff88003fc13910
[no tasks queued]
CFS RB_ROOT: ffff88003fc13820
GROUP CFS RB_ROOT: ffff88003a432c00runq: invalid kernel virtual address: 38 type:
"cgroup dentry"
This is fixed in the new patch, I think.
So anyway, your patch should at least work as well on live
systems and snapshots as the current runq command does.
Let me note here that the remainder of the examples below are
from actual crash dumps, i.e. not "snapshots".
First, to be honest about this, I wonder whether the additional task
group data makes the output more, or less, understandable. For example,
the following output is from a 2.6.32-220.el6 kernel.
With the current runq:
crash> runq
CPU 0 RUNQUEUE: ffff880133c15fc0
CURRENT: PID: 3144 TASK: ffff88022a446040 COMMAND: "qemu-kvm"
RT PRIO_ARRAY: ffff880133c16148
[no tasks queued]
CFS RB_ROOT: ffff880133c16058
[no tasks queued]
CPU 1 RUNQUEUE: ffff880028215fc0
CURRENT: PID: 2948 TASK: ffff88022af2a100 COMMAND: "bash"
RT PRIO_ARRAY: ffff880028216148
[no tasks queued]
CFS RB_ROOT: ffff880028216058
[120] PID: 3248 TASK: ffff88012a9d4100 COMMAND: "qemu-kvm"
...
With your patch:
crash>
CPU 0 RUNQUEUE: ffff880133c15fc0
CURRENT: PID: 3144 TASK: ffff88022a446040 COMMAND: "qemu-kvm"
RT PRIO_ARRAY: ffff880133c16148
[no tasks queued]
CFS RB_ROOT: ffff880133c16058
GROUP CFS RB_ROOT: ffff88022ac8f000 <libvirt>
GROUP CFS RB_ROOT: ffff88022c075000 <qemu>
GROUP CFS RB_ROOT: ffff88022af94c00 <guest1>
GROUP CFS RB_ROOT: ffff88022aab2600 <vcpu1>
[no tasks queued]
[no tasks queued]
[no tasks queued]
[no tasks queued]
[no tasks queued]
This condition happens when a cfs_rq has only one task and that task is
chosen as "current": the only task is dequeued from the cfs_rq, but the
cfs_rq itself remains queued in its parent cfs_rq. That is what the
display above shows: vcpu1 has only one task, the qemu-kvm task, and
that task is current; vcpu1 is still queued in guest1.
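To illustrate the check involved -- a sketch only, with a hypothetical
helper; a real patch would pre-store the offset in the offset_table
rather than look it up on every call:

    /* sketch: a group cfs_rq can have an empty rb-tree yet still be
     * "active", because its lone sched_entity is currently on the
     * CPU; in that case cfs_rq.curr is non-NULL */
    static int
    cfs_rq_empty_but_running(ulong cfs_rq)
    {
            ulong curr;

            readmem(cfs_rq + MEMBER_OFFSET("cfs_rq", "curr"), KVADDR,
                    &curr, sizeof(ulong), "cfs_rq curr",
                    FAULT_ON_ERROR);

            return (curr != 0);
    }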
I have changed the display in the new patch,
and now it displays:
crash>
CPU 0 RUNQUEUE: ffff880133c15fc0
CURRENT: PID: 3144 TASK: ffff88022a446040 COMMAND: "qemu-kvm"
RT PRIO_ARRAY: ffff880133c16148
[no tasks queued]
CFS RB_ROOT: ffff880133c16058
GROUP CFS RB_ROOT: ffff88022ac8f000 <libvirt>
GROUP CFS RB_ROOT: ffff88022c075000 <qemu>
GROUP CFS RB_ROOT: ffff88022af94c00 <guest1>
GROUP CFS RB_ROOT: ffff88022aab2600 <vcpu1>
[no tasks queued]
What do you think about this? The reason I still display the RB_ROOTs,
like libvirt, qemu, and guest1, is that they are entities in their
parent cfs_rq, just like tasks are.
CPU 1 RUNQUEUE: ffff880028215fc0
CURRENT: PID: 2948 TASK: ffff88022af2a100 COMMAND: "bash"
RT PRIO_ARRAY: ffff880028216148
[no tasks queued]
CFS RB_ROOT: ffff880028216058
GROUP CFS RB_ROOT: ffff88012c5d1000 <libvirt>
GROUP CFS RB_ROOT: ffff88012c663e00 <qemu>
GROUP CFS RB_ROOT: ffff88012bb56000 <guest2>
GROUP CFS RB_ROOT: ffff88012b012000 <vcpu0>
[120] PID: 3248 TASK: ffff88012a9d4100 COMMAND:
"qemu-kvm"
...
On CPU 0, there are no other tasks queued, and so the current
runq command shows just that -- whereas your patch shows all of the
other empty RB_ROOT structures. Why bother showing them at all?
And on CPU 1, can you please condense the display a bit? Why does each
GROUP line have to get indented by 6 spaces? Why can't it be indented by
just 3 spaces like the first group?
OK, this is changed; each nesting level now gets a fixed 3-space
step, roughly as sketched below.
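A sketch of the indentation change only -- "depth" and "group_name"
are hypothetical locals, and fp is crash's output stream:

    /* sketch only: one fixed 3-space step per group nesting level,
     * instead of 6 spaces per level */
    fprintf(fp, "%*sGROUP CFS RB_ROOT: %lx <%s>\n",
            3 * (depth + 1), "", cfs_rq, group_name);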
Here's another 2.6.32-131.0.15.el6 vmcore. With the current runq
command these two cpus look like this:
crash> runq
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND: "loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND: "loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND: "loop.ABB"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
... [ cut ] ...
CPU 6 RUNQUEUE: ffff88000a395f80
CURRENT: PID: 28230 TASK: ffff8800373d2b40 COMMAND: "loop.A"
RT PRIO_ARRAY: ffff88000a396098
[no tasks queued]
CFS RB_ROOT: ffff88000a396010
[120] PID: 28270 TASK: ffff88007812ab40 COMMAND: "loop.ABB"
[120] PID: 28261 TASK: ffff880037cc5540 COMMAND: "loop.ABA"
[120] PID: 28244 TASK: ffff88007b4f6a80 COMMAND: "loop.AB"
[120] PID: 28259 TASK: ffff880075978080 COMMAND: "loop.AAB"
[120] PID: 28257 TASK: ffff8800780a0a80 COMMAND: "loop.AAB"
[120] PID: 28258 TASK: ffff880075979500 COMMAND: "loop.AAB"
[120] PID: 28254 TASK: ffff880037d2a040 COMMAND: "loop.AAA"
[120] PID: 28253 TASK: ffff88007b534100 COMMAND: "loop.AAA"
[120] PID: 28255 TASK: ffff880078628080 COMMAND: "loop.AAA"
[120] PID: 28231 TASK: ffff880037b14b40 COMMAND: "loop.A"
...
With your patch:
crash> runq
CPU 0 RUNQUEUE: ffff88000a215f80
CURRENT: PID: 28263 TASK: ffff880037aaa040 COMMAND: "loop.ABA"
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
CFS RB_ROOT: ffff88000a216010
[120] PID: 28262 TASK: ffff880037cc40c0 COMMAND:
"loop.ABA"
[120] PID: 28271 TASK: ffff8800787a8b40 COMMAND:
"loop.ABB"
[120] PID: 28272 TASK: ffff880037afd580 COMMAND:
"loop.ABB"
[120] PID: 28245 TASK: ffff8800785e8b00 COMMAND: "loop.AB"
[120] PID: 28246 TASK: ffff880078628ac0 COMMAND: "loop.AB"
[120] PID: 28241 TASK: ffff880078616b40 COMMAND: "loop.AA"
[120] PID: 28239 TASK: ffff8800785774c0 COMMAND: "loop.AA"
[120] PID: 28240 TASK: ffff880078617580 COMMAND: "loop.AA"
[120] PID: 28232 TASK: ffff880079b5d4c0 COMMAND: "loop.A"
... [ cut ] ...
CPU 6 RUNQUEUE: ffff88000a395f80
CURRENT: PID: 28230 TASK: ffff8800373d2b40 COMMAND: "loop.A"
RT PRIO_ARRAY: ffff88000a396098
[no tasks queued]
CFS RB_ROOT: ffff88000a396010
[120] PID: 28270 TASK: ffff88007812ab40 COMMAND:
"loop.ABB"
[120] PID: 28261 TASK: ffff880037cc5540 COMMAND:
"loop.ABA"
[120] PID: 28244 TASK: ffff88007b4f6a80 COMMAND: "loop.AB"
[120] PID: 28259 TASK: ffff880075978080 COMMAND:
"loop.AAB"
[120] PID: 28257 TASK: ffff8800780a0a80 COMMAND:
"loop.AAB"
[120] PID: 28258 TASK: ffff880075979500 COMMAND:
"loop.AAB"
[120] PID: 28254 TASK: ffff880037d2a040 COMMAND:
"loop.AAA"
[120] PID: 28253 TASK: ffff88007b534100 COMMAND:
"loop.AAA"
[120] PID: 28255 TASK: ffff880078628080 COMMAND:
"loop.AAA"
[120] PID: 28231 TASK: ffff880037b14b40 COMMAND: "loop.A"
I'm not sure what the reason is, but that display is clearly unacceptable.
This is fixed in the new patch, I think.
On a 3.2.1-0.10.el7 vmcore, I see this with the current runq command:
crash> runq
... [ cut ] ...
CPU 3 RUNQUEUE: ffff8804271d43c0
CURRENT: PID: 11615 TASK: ffff88020c50a670 COMMAND: "runtest.sh"
RT PRIO_ARRAY: ffff8804271d4590
[no tasks queued]
CFS RB_ROOT: ffff8804271d44a0
[no tasks queued]
...
With your patch, the command aborts here:
crash> runq
... [ cut ] ...
CPU 3 RUNQUEUE: ffff8804271d43c0
CURRENT: PID: 11615 TASK: ffff88020c50a670 COMMAND: "runtest.sh"
RT PRIO_ARRAY: ffff8804271d4590
[no tasks queued]
CFS RB_ROOT: ffff8804271d44a0
GROUP CFS RB_ROOT: ffff88041e0d2760runq: invalid kernel virtual address: 38 type:
"cgroup dentry"
crash>
This is fixed in the new patch, I think.
And lastly, on a 2.6.27-0.244 vmcore, I saw this again:
crash> runq
runq: gdb request failed: print &((struct rt_rq *)0x0)->highest_prio.curr
crash>
This will be fixed in patch2 later.
Let me know if you would like to have any of the vmlinux/vmcore
pairs above. If they are not too huge, I can make them available
for you to download from my people.redhat.com site -- although
some of the dumpfiles were created and forwarded to me by your
compatriot Daisuke Hatayama... ;-)
And for that matter, I will need you to make available to me
a sample vmcore that shows the (DEQUEUED) and (THROTTLED) status,
so I can have it on hand for testing purposes.
I will give you the sample after we finish the first patch, OK?
Thanks,
Zhang