On Fri, 11 Oct 2013 15:54:54 -0400 (EDT)
Dave Anderson <anderson(a)redhat.com> wrote:
----- Original Message -----
> Hi all,
>
> a few people complained to me that the crash utility lacks a
> user-friendly interface for percpu variables, especially if they are
> dynamically allocated. I implemented a new command which should make
> these people happy.
>
> Tested on kernel 3.11, but I didn't really add anything
> version-dependent (instead I used the existing infrastructure).
>
> Comments welcome!
>
> Petr Tesarik
Hi Petr,
Hi Dave,
First let me preface this by saying that I like this concept a lot.
Glad you like it. I'm pretty sure we'll find a way to implement this
feature in a way that is likable by everybody.
A couple comments...
First, a minor nit -- "make warn" shows this:
$ make warn
...
cc -c -g -DX86_64 -DGDB_7_6 symbols.c -I./gdb-7.6/bfd -I./gdb-7.6/include -Wall -O2
-Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
symbols.c: In function 'cmd_percpu':
symbols.c:6071:2: warning: implicit declaration of function 'isdigit'
[-Wimplicit-function-declaration]
symbols.c:6143:6: warning: suggest parentheses around comparison in operand of
'&' [-Wparentheses]
...
Ah, yes, I forgot about "make warn" when updating the patch set.
Fixing these minor issues should be easy enough.
As a quick test, I created a command input file that looks like
this:
$ cat input
percpu -a irq_stack_union
percpu -a gdt_page
percpu -a exception_stacks
percpu -a cpu_llc_shared_map
percpu -a cpu_core_map
percpu -a cpu_sibling_map
percpu -a cpu_llc_id
percpu -a cpu_number
percpu -a x86_bios_cpu_apicid
percpu -a x86_cpu_to_apicid
percpu -a cpu_loops_per_jiffy
percpu -a xen_vcpu_info
percpu -a xen_vcpu
percpu -a idt_desc
percpu -a shadow_tls_desc
...
where I displayed all of the per-cpu variables that are seen in
the kernel's per-cpu symbol list, and it worked nicely.
Your help page shows this:
crash> help percpu
NAME
percpu - percpu variables
SYNOPSIS
percpu [-dx][-a] [-c cpu] [cpu]... [struct|union|*] [struct_name] [address|symbol]
So my test used the "[-a]" and "[symbol]" arguments.
But it's not clear to me when you would use the "[struct|union|*]",
"[struct_name]" and/or the "[address]" arguments? The remainder of
the help page give no explanation of what they are, nor does it
contain any examples of how they would be used. I'm presuming that
it has something to do with dynamically-allocated per-cpu variables?
There is a second example in the help:
Show a per-cpu variable on all processors:
crash> percpu -a
[0]: ffff88011e21a468
struct disk_stats {
sectors = {11197190, 7550896},
ios = {360193, 159113},
merges = {11723, 22075},
ticks = {6180943, 11137498},
io_ticks = 1781827,
time_in_queue = 16785949
}
[and so on...]
The address (0x1a468) was taken from the dkstats field of a struct
hd_struct on my system. The "struct", "union" and "*"
variants are
needed only to disambiguate from a symbol name (if needed) and they are
similar to the "struct", "union" and "*" commands.
Now, as to the functionality of the command, currently the
"p" command
does this when presented with a percpu symbol:
crash> p idle_threads
PER-CPU DATA TYPE:
struct task_struct *idle_threads;
PER-CPU ADDRESSES:
[0]: ffff88021e20e360
[1]: ffff88021e24e360
[2]: ffff88021e28e360
[3]: ffff88021e2ce360
Yes, I know. The crash utility is in fact not so bad when it comes to
global percpu variables.
[...]
Your command pulls out the calculated per-cpu address and prints
it exactly as the "p" command does, and then follows it with the
symbolic display:
crash> percpu -a idle_threads
[0]: ffff88021e20e360
(struct task_struct *) 0xffffffff81c13440 <init_task>
[1]: ffff88021e24e360
(struct task_struct *) 0xffff88021282d330
[2]: ffff88021e28e360
(struct task_struct *) 0xffff88021282dac0
[3]: ffff88021e2ce360
(struct task_struct *) 0xffff88021282e250
crash>
Since this patch essentially adds a command that does the
same thing as the "p" command does with regular symbols, I
wonder if it would be better suited to be merged with the
"p" command? Did you try doing something like that?
Yes, I thought about extending one of the existing commands.
These are the issues I encountered:
1. The "percpu" combines both "*" and "p". The "*"
functionality
(which you didn't test) is in fact primary, and I wanted to get it
with as little typing as possible. If you can propose an acceptably
short variant of "percpu 0 disk_stats 0x1a468" using the existing
commands, I will in fact prefer it to a standalone command.
2. Many options to the cmd_datatype_common were incompatible with the
"percpu" display (e.g. "-f"), some where meaningless (e.g.
"-u").
3. I'd have to implement percpu twice (once for "p" and once for
cmd_datatype_common.
4. Percpu without any argument uses the current CPU. Try something like
"set -c 0", "percpu current_task", then
"set -c 1", "percpu current_task"
This may not be needed so often, so I'm fine with making it an
option.
In general, I think it's a matter of taste, and if you dislike new
commands, it all boils down to finding a suitable syntax to extend the
existing commands. Unfortunately, "-c" (as CPU) is already taken for
count, and "-p" (as processor) is already taken for pointer
dereference. :-(
I can think of:
A. "-C" (but it requires a shift key, and two options that only
differ in case may be confusing)
B. any other random letter ("-a" is free).
If you find an option letter (let's mark it "-?" here), it could be
used like this:
p -? current_task # global var on the current CPU
p -?1-3 current_task # global var on selected CPUs
p -?a current_task # global var on all CPUs
struct -? disk_stats 0x1a468 # dynamic var
*disk_stats -?a 0x1a468 # dynamic var on all CPUs
Petr T