Re: [Crash-utility] patch for slight modification to runq -g command

Thursday, 7 November 2013

Hi Anthony,

With respect to the nr_running and h_nr_running displays, since you
can "see" the number of tasks queued underneath each particular
group, I'm not convinced that it's worth displaying them.

In your first post you mentioned:

...
 Since the way we crash the system by messing up the nr_running and
 h_nr_running, so we also display those two fields at the same time.
 Here’s an example of before and after.

Are you saying that you purposely modify those two values in order to force
a crash?

Anyway, I bring this up because their display is kind of ugly, and also because
in the output logs of my test of your patch, I see this particular instance,
where I've got a 3.6.0 kernel where a crash was generated by entering 
"echo c > /proc/sysrq-trigger":

  crash> bt
  PID: 1212   TASK: ffff880035f60000  CPU: 1   COMMAND: "bash"
   #0 [ffff88007831fa20] machine_kexec at ffffffff8103e465
   #1 [ffff88007831fa90] crash_kexec at ffffffff810c6658
   #2 [ffff88007831fb60] oops_end at ffffffff815d5bf8
   #3 [ffff88007831fb90] no_context at ffffffff815c7dae
   #4 [ffff88007831fbf0] __bad_area_nosemaphore at ffffffff815c7f98
   #5 [ffff88007831fc40] bad_area at ffffffff815c81f0
   #6 [ffff88007831fc70] do_page_fault at ffffffff815d87d1
   #7 [ffff88007831fd80] page_fault at ffffffff815d5025
      [exception RIP: sysrq_handle_crash+22]
      RIP: ffffffff81388986  RSP: ffff88007831fe38  RFLAGS: 00010092
      RAX: 000000000000000f  RBX: ffffffff8192dc20  RCX: 00000000000014ff
      RDX: 000000000000332f  RSI: 0000000000000046  RDI: 0000000000000063
      RBP: ffff88007831fe38   R8: ffffffff81b26580   R9: 0000000000000397
      R10: 0000000000000002  R11: 0000000000000396  R12: 0000000000000063
      R13: 0000000000000286  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #8 [ffff88007831fe40] __handle_sysrq at ffffffff813890a7
   #9 [ffff88007831fe80] write_sysrq_trigger at ffffffff8138915a
  #10 [ffff88007831feb0] proc_reg_write at ffffffff811ea879
  #11 [ffff88007831ff00] vfs_write at ffffffff8118991c
  #12 [ffff88007831ff30] sys_write at ffffffff81189c4a
  #13 [ffff88007831ff80] system_call_fastpath at ffffffff815dcae9
      RIP: 00007f64d1a94530  RSP: 00007fffbb0c1248  RFLAGS: 00010246
      RAX: 0000000000000001  RBX: ffffffff815dcae9  RCX: 00000000fbad2a84
      RDX: 0000000000000002  RSI: 00007f64d23ab000  RDI: 0000000000000001
      RBP: 00007f64d23ab000   R8: 000000000000000a   R9: 00007f64d23a4740
      R10: 0000000000000001  R11: 0000000000000246  R12: 0000000000000002
      R13: 00007f64d1d61280  R14: 0000000000000002  R15: 00007f64d1d61280
      ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
  crash>

The "runq -g" output for that cpu looks like this:

  CPU 1
    CURRENT: PID: 1212  CFS: ffff880035cc2f00 TASK: ffff880035f60000  COMMAND: "bash"
    TASK_GROUP RT_RQ: ffff88007fa541e8
    RT PRIO_ARRAY: ffff88007fa541e8
       [no tasks queued]
    TASK_GROUP CFS_RQ: ffff88007fa540f0
    CFS RB_ROOT: ffff88007fa54118
       GROUP: ffff880078af7800 CFS_RQ: ffff880035cc2f00 RB_ROOT: ffff880035cc2f28
       nr_running: 4294967297 h_nr_running: 201908650262921217
          [120] PID: 1212   TASK: ffff880035f60000  COMMAND: "bash"

I don't understand where those values are coming from, because if
I look at the CFS_RQ, it shows this:

  crash> cfs_rq.nr_running,h_nr_running ffff880035cc2f00
    nr_running = 1
    h_nr_running = 1
  crash>

I also see this occurring on live "snapshot" dumps -- which I understand given
that the kernel's runqueue data structures are being changed while the dump
is being created.  But I don't understand why it's happening in the situation
above.
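One way to look at the odd values above is to split each one into 32-bit
halves. The sketch below does only that arithmetic. The helper name
split_u64 is made up for illustration, and reading the result as a hint that
a 32-bit field was fetched as a 64-bit quantity is an assumption, not
something confirmed in this thread.

```python
# Split the two values printed by the patched "runq -g" into 32-bit halves.
MASK32 = 0xFFFFFFFF


def split_u64(value):
    """Return (low 32 bits, high 32 bits) of a 64-bit unsigned value."""
    return value & MASK32, (value >> 32) & MASK32


# Values copied from the "runq -g" output for CPU 1 above.
NR_RUNNING = 4294967297
H_NR_RUNNING = 201908650262921217

for name, value in (("nr_running", NR_RUNNING), ("h_nr_running", H_NR_RUNNING)):
    low, high = split_u64(value)
    print(f"{name}: {value:#018x}  low32={low}  high32={high}")
```

Both values have 1 in their low 32 bits, which agrees with the
nr_running = 1 and h_nr_running = 1 shown by the cfs_rq display above.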

Dave

----- Original Message -----
...

 ----- Original Message -----
 > Hi Dave,
 > 
 > I have cleaned up the code and added another change.

 OK thanks -- the patch runs through my sample set of vmcores with no problem.

 > The current running task is not in the rb tree (rb_root), so runq -g
 > displays it like:
 > 
 >   CURRENT: PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
 >   TASK_GROUP RT_RQ: ffff880002493820
 >   RT PRIO_ARRAY: ffff880002493820
 >      [no tasks queued]
 >   TASK_GROUP CFS_RQ: ffff8800024936e0
 >   CFS RB_ROOT: ffff880002493710
 >      GROUP CFS RB_ROOT: ffff882d609ce830 <TDAT>
 >         GROUP CFS RB_ROOT: ffff883f0bcbfa30 <User>
 >                [no tasks queued]
 > 
 > I can understand why the current running task is not displayed.
 > However, the "-g" option displays all the task_groups the task
 > belongs to but at the end it shows "[no tasks queued]". That is
 > just strange.  The new change is to display the task that is running like:
 > 
 >   CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200
 >   COMMAND: "actmain"
 >   TASK_GROUP RT_RQ: ffff880002493820
 >   RT PRIO_ARRAY: ffff880002493820
 >      [no tasks queued]
 >   TASK_GROUP CFS_RQ: ffff8800024936e0
 >   CFS RB_ROOT: ffff880002493710
 >      GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT:
 >      ffff882d609ce830 <TDAT> nr_running: 1 h_nr_running: 1
 >         GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT:
 >         ffff883f0bcbfa30 <User> nr_running: 1 h_nr_running: 1
 >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND:
 >               "actmain"

 OK -- I guess I understand why it probably makes sense to duplicate the
 CURRENT task underneath its own GROUP list -- but if that is done, then
 why clutter the CURRENT line with the CFS_RQ address?  And it's not clear
 to me why in your example above, the CFS address of ffff88039351a800
 doesn't show up as the CFS_RQ address above the "actmain" line?

 Taking a simple example, I see this:

  crash> runq -g
  CPU 0
    CURRENT: PID: 0     CFS: ffff88000c7d6aa8 TASK: ffffffff8178ba60  COMMAND:
    "swapper"
    TASK_GROUP RT_RQ: ffff88000c7d6b58
    RT PRIO_ARRAY: ffff88000c7d6b58
       [no tasks queued]
    TASK_GROUP CFS_RQ: ffff88000c7d6aa8
    CFS RB_ROOT: ffff88000c7d6ad0
       [no tasks queued]

  CPU 1
    CURRENT: PID: 1268  CFS: ffff88000c9b5aa8 TASK: ffff88002f11c620  COMMAND:
    "bash"
    TASK_GROUP RT_RQ: ffff88000c9b5b58
    RT PRIO_ARRAY: ffff88000c9b5b58
       [no tasks queued]
    TASK_GROUP CFS_RQ: ffff88000c9b5aa8
    CFS RB_ROOT: ffff88000c9b5ad0
       [120] PID: 1268   TASK: ffff88002f11c620  COMMAND: "bash"

  crash>

 Where the newly-interspersed CFS address redundantly shows the TASK_GROUP
 CFS_RQ below.  But adding the CFS address to the "swapper" line doesn't seem
 to make much sense, or help in any way, since the idle task is a special case
 that never gets queued.  And since the CFS address in the "bash" line is
 redundant with the TASK_GROUP CFS_RQ below, why bother showing it?

 And in a more complicated example, with your patch, the "qemu-kvm" task also
 shows up underneath its group:

  CPU 0
    CURRENT: PID: 3144  CFS: ffff88022aab2600 TASK: ffff88022a446040  COMMAND:
    "qemu-kvm"
    TASK_GROUP RT_RQ: ffff880133c16148
    RT PRIO_ARRAY: ffff880133c16148
       [no tasks queued]
    TASK_GROUP CFS_RQ: ffff880133c16028
    CFS RB_ROOT: ffff880133c16058
       GROUP: ffff88012b880800 CFS_RQ: ffff88022ac8f000 RB_ROOT:
       ffff88022ac8f030 <libvirt> nr_running: 1 h_nr_running: 1
          GROUP: ffff88012c078000 CFS_RQ: ffff88022c075000 RB_ROOT:
          ffff88022c075030 <qemu> nr_running: 1 h_nr_running: 1
             GROUP: ffff88012b0fb400 CFS_RQ: ffff88022af94c00 RB_ROOT:
             ffff88022af94c30 <guest1> nr_running: 1 h_nr_running: 1
                GROUP: ffff88022c6bbc00 CFS_RQ: ffff88022aab2600 RB_ROOT:
                ffff88022aab2630 <vcpu1> nr_running: 1 h_nr_running: 1
                   [120] PID: 3144   TASK: ffff88022a446040  COMMAND:
                   "qemu-kvm"

 And note that its interspersed CFS address of ffff88022aab2600 is redundantly
 shown as the CFS_RQ of its GROUP down below.

 So I don't understand why your example shows different CFS addresses in the
 CURRENT line vs. the GROUP CFS_RQ address above the queued "actmain" task:

 >   CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200
 >   COMMAND: "actmain"
 >   TASK_GROUP RT_RQ: ffff880002493820
 >   RT PRIO_ARRAY: ffff880002493820
 >      [no tasks queued]
 >   TASK_GROUP CFS_RQ: ffff8800024936e0
 >   CFS RB_ROOT: ffff880002493710
 >      GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT:
 >      ffff882d609ce830 <TDAT> nr_running: 1 h_nr_running: 1
 >         GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT:
 >         ffff883f0bcbfa30 <User> nr_running: 1 h_nr_running: 1
 >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND:
 >               "actmain"

 Am I missing something?  Or is there a cut-and-paste error?
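
 The comparison being made here can be sketched as a small text check: pull
 the CFS address off the CURRENT line and compare it with the CFS_RQ of the
 innermost GROUP line. This is only an illustration against trimmed copies of
 the two outputs quoted above. The function and constant names are
 hypothetical, and the regular expressions assume the patched output format
 shown in this thread.

```python
import re


def current_matches_innermost_group(runq_text):
    """Compare the CURRENT line's CFS address with the last GROUP CFS_RQ.

    Returns True or False, or None if either address is missing.
    """
    current = re.search(r"CURRENT: .*?CFS: ([0-9a-f]+)", runq_text)
    # Only lines that start with GROUP: count; "TASK_GROUP CFS_RQ:" is skipped.
    groups = re.findall(
        r"^\s*GROUP: [0-9a-f]+ CFS_RQ: ([0-9a-f]+)", runq_text, re.MULTILINE
    )
    if current is None or not groups:
        return None
    return current.group(1) == groups[-1]


# Trimmed from the "qemu-kvm" example above.
QEMU_KVM = """
CURRENT: PID: 3144  CFS: ffff88022aab2600 TASK: ffff88022a446040
TASK_GROUP CFS_RQ: ffff880133c16028
   GROUP: ffff88012b880800 CFS_RQ: ffff88022ac8f000
      GROUP: ffff88012c078000 CFS_RQ: ffff88022c075000
         GROUP: ffff88012b0fb400 CFS_RQ: ffff88022af94c00
            GROUP: ffff88022c6bbc00 CFS_RQ: ffff88022aab2600
"""

# Trimmed from the "actmain" example above.
ACTMAIN = """
CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200
TASK_GROUP CFS_RQ: ffff8800024936e0
   GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800
      GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00
"""

print("qemu-kvm:", current_matches_innermost_group(QEMU_KVM))
print("actmain:", current_matches_innermost_group(ACTMAIN))
```

 The "qemu-kvm" sample matches and the "actmain" sample does not, which is the
 discrepancy in question.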

 Dave
