Hi Dave,
I have cleaned up the code and added another change. The current
running task is not in the rb tree (rb_root), so run -q displays it like:
CURRENT: PID: 9048 TASK: ffff8808b07e4200 COMMAND: "actmain"
TASK_GROUP RT_RQ: ffff880002493820
RT PRIO_ARRAY: ffff880002493820
[no tasks queued]
TASK_GROUP CFS_RQ: ffff8800024936e0
CFS RB_ROOT: ffff880002493710
GROUP CFS RB_ROOT: ffff882d609ce830 <TDAT>
GROUP CFS RB_ROOT: ffff883f0bcbfa30 <User>
[no tasks queued]
I can understand why the current running task is not displayed.
However, the "-g" option displays all the task_groups the task
belongs to but at the end it shows "[no tasks queued]". That is
just strange. The new change is to display the task that is running like:
CURRENT: PID: 9048 CFS: ffff88039351a800 TASK: ffff8808b07e4200 COMMAND:
"actmain"
TASK_GROUP RT_RQ: ffff880002493820
RT PRIO_ARRAY: ffff880002493820
[no tasks queued]
TASK_GROUP CFS_RQ: ffff8800024936e0
CFS RB_ROOT: ffff880002493710
GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT: ffff882d609ce830
<TDAT> nr_running: 1 h_nr_running: 1
GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT: ffff883f0bcbfa30
<User> nr_running: 1 h_nr_running: 1
[120] PID: 9048 TASK: ffff8808b07e4200 COMMAND: "actmain"
Thanks,
Anthony
-----Original Message-----
From: crash-utility-bounces(a)redhat.com [mailto:crash-utility-
bounces(a)redhat.com] On Behalf Of Dave Anderson
Sent: Thursday, October 17, 2013 12:46 PM
To: Discussion list for crash utility usage, maintenance and development
Subject: Re: [Crash-utility] FW: patch for slight modification to runq -g
command
----- Original Message -----
>
>
>
>
> Hi,
>
>
>
> We’re debugging an in-house application that makes use of hard limit
> extensively. We ran into a lot of timing windows (all our own making)
and we
> use runq -g command a lot. The current runq -g display only the
RBROOT
> pointer. It really is a bit inconvenient to traverse the task_group
> hierarchy. It would be nice to have the command also display the
> corresponding task_group, cfs_rq pointer (at a minimum). Since the
way we
> crash the system by messing up the nr_running and h_nr_running, so
we also
> display those two fields at the same time. Here’s an example of before
and
> after.
>
>
>
> CPU 4
>
> CURRENT: PID: 0 TASK: ffff8840668c0380 COMMAND: "swapper"
>
> TASK_GROUP RT_RQ: ffff8800027d3820
>
> RT PRIO_ARRAY: ffff8800027d3820
>
> [no tasks queued]
>
> TASK_GROUP CFS_RQ: ffff8800027d36e0
>
> CFS RB_ROOT: ffff8800027d3710
>
> GROUP CFS RB_ROOT: ffff883ff69bcc30 <TDAT>
>
> GROUP CFS RB_ROOT: ffff884006290e30 <User>
>
> GROUP CFS RB_ROOT: ffff88400641c430 <TDWMVP1>
>
> GROUP CFS RB_ROOT: ffff88400646b030 <ServDown1:0>
>
> GROUP CFS RB_ROOT: ffff884006492630 <ServOrder1:1>
>
> GROUP CFS RB_ROOT: ffff883ff058fe30 <TDWMWD57> (THROTTLED)
>
> GROUP CFS RB_ROOT: ffff88047889ee30 <S:WD:3d:35fa>
>
> [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain"
>
> <<< more throttled groups removed >>>
>
>
>
> CPU 4
>
> CURRENT: PID: 0 CFS: ffff8800027d36e0 TASK: ffff8840668c0380
COMMAND:
> "swapper"
>
> TASK_GROUP RT_RQ: ffff8800027d3820
>
> RT PRIO_ARRAY: ffff8800027d3820
>
> [no tasks queued]
>
> TASK_GROUP CFS_RQ: ffff8800027d36e0
>
> CFS RB_ROOT: ffff8800027d3710
>
> GROUP: ffff88405394d000 CFS: ffff883ff69bcc00 RB_ROOT:
ffff883ff69bcc30
> <TDAT> (0 0)
>
> GROUP: ffff88405906c400 CFS: ffff884006290e00 RB_ROOT:
ffff884006290e30
> <User> (0 0)
>
> GROUP: ffff884055081000 CFS: ffff88400641c400 RB_ROOT:
ffff88400641c430
> <TDWMVP1> (0 0)
>
> GROUP: ffff884055081c00 CFS: ffff88400646b000 RB_ROOT:
ffff88400646b030
> <ServDown1:0> (0 0)
>
> GROUP: ffff8840580fd400 CFS: ffff884006492600 RB_ROOT:
ffff884006492630
> <ServOrder1:1> (0 0)
>
> GROUP: ffff884058f58c00 CFS: ffff883ff058fe00 RB_ROOT:
ffff883ff058fe30
> <TDWMWD57> (7 9) (THROTTLED)
>
> GROUP: ffff8808fb976000 CFS: ffff88047889ee00 RB_ROOT:
ffff88047889ee30
> <S:WD:3d:35fa> (1 1)
>
> [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain"
>
> <<< more throttled groups removed >>>
>
>
>
> I have attached the patch we use to display additional information.
Could you
> please take a look at my proposal to see if it is possible that you include
> this kind of display format.
>
>
>
> Thanks,
>
> Anthony
A couple quick observations...
Adding the CFS: address makes sense, but besides you and me and
whoever
reads the patch/code, how would anybody know what the two numbers
inside
the parentheses mean? It could be documented in the help page, but I
wonder if it could be made more obvious somehow?
Anyway, I ran a test on a sample set of vmcores, and this addition
will only work with relevant kernel versions. For example, in these
four dumpfiles (and therefore their kernel versions), the command
fails because the cfs_rq.h_nr_running member does not exist:
2.6.38.2-9.fc15:
CPU 0
CURRENT: PID: 1180 CFS: ffff880037ef1b00 TASK: ffff88003bea2e40
COMMAND: "crash"
TASK_GROUP RT_RQ: ffff88003fc13988
RT PRIO_ARRAY: ffff88003fc13988
[no tasks queued]
TASK_GROUP CFS_RQ: ffff88003fc138b0
CFS RB_ROOT: ffff88003fc138d8
GROUP: ffff88003c610a00 CFS: ffff880037ef1b00 RB_ROOT:
ffff880037ef1b28
runq: invalid structure member offset: cfs_rq_h_nr_running
FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
2.6.40.4-5.fc15:
CPU 1
CURRENT: PID: 1341 CFS: ffff880037592f00 TASK: ffff880037409730
COMMAND: "crash"
TASK_GROUP RT_RQ: ffff88003fc92690
RT PRIO_ARRAY: ffff88003fc92690
[no tasks queued]
TASK_GROUP CFS_RQ: ffff88003fc925b0
CFS RB_ROOT: ffff88003fc925d8
GROUP: ffff88003a84b800 CFS: ffff880037592f00 RB_ROOT:
ffff880037592f28
runq: invalid structure member offset: cfs_rq_h_nr_running
FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
2.6.32-131.0.15.el6:
CPU 0
CURRENT: PID: 28263 CFS: ffff8800794b7140 TASK: ffff880037aaa040
COMMAND: "loop.ABA"
TASK_GROUP RT_RQ: ffff88000a216098
RT PRIO_ARRAY: ffff88000a216098
[no tasks queued]
TASK_GROUP CFS_RQ: ffff88000a215fe8
CFS RB_ROOT: ffff88000a216010
GROUP: ffff8800785bc200 CFS: ffff8800784d7c80 RB_ROOT:
ffff8800784d7ca8 <A>
runq: invalid structure member offset: cfs_rq_h_nr_running
FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
3.1.7-1.fc16:
CPU 2
CURRENT: PID: 1495 CFS: ffff8800277f8500 TASK: ffff880037a60000
COMMAND: "crash"
TASK_GROUP RT_RQ: ffff88003e2532d0
RT PRIO_ARRAY: ffff88003e2532d0
[no tasks queued]
TASK_GROUP CFS_RQ: ffff88003e2531f0
CFS RB_ROOT: ffff88003e253218
GROUP: ffff88002722b000 CFS: ffff8800277f8500 RB_ROOT:
ffff8800277f8528
runq: invalid structure member offset: cfs_rq_h_nr_running
FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
So you'll need to check VALID_MEMBER(cfs_rq_h_nr_running) before
attempting
to print it, and maybe just show the cfs_rq.nr_running value.
When adding new items to the offset_table, they should be added to
the
end of the structure so that extension modules will continue to be able
to access any offsets as expected without having to be re-compiled.
But you can display the new item in dump_offset_table() wherever
you'd
like, typically grouped with similar/associated items -- although in your
patch you did this:
+ fprintf(fp, " cfs_rq_nr_running: %ld\n",
+ OFFSET(cfs_rq_nr_running));
+ fprintf(fp, " cfs_rq_h_nr_running: %ld\n",
+ OFFSET(cfs_rq_h_nr_running));
The previously-existing cfs_rq_nr_running display already exists,
so I suggest that you just move your new cfs_rq_h_nr_running display
underneath it. (and please line up the "help -o" output of the new
item as well).
Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility