----- Original Message -----
Hi, Dave
When we investigate the problems of disk I/O, we want to get the disk's
gendisk address and request queue's address easily, and the requests num
is also important.
Tha attached patch introduce a new command diskio to display such
information.
Thanks
Wen Congyang
Hello Wen,
I built and tested this patch, and it certainly contains some useful
information. However, I do have a several suggestions.
First, there's no need to make it a unique command. It would be
far more appropriate to make it a "dev" command option, which for
example, already shows gendisk structure addresses for each of the
block devices.
So please move all of your functions from kernel.c to dev.c.
Then, for example, use "dev -d", and have cmd_dev() call your
command like this:
while ((c = getopt(argcnt, args, "dpi")) != EOF) {
switch(c)
{
case 'd':
diskio_option();
return;
...
Since all of the new offset_table and size_table entries are only
used by this command, you can avoid unnecessarily initializing
everything in kernel_init() by doing something like this:
static void
diskio_option(void)
{
if (INVALID_MEMBER(class_devices)) {
MEMBER_OFFSET_INIT(class_devices, "class", "class_devices");
if (INVALID_MEMBER(class_devices))
MEMBER_OFFSET_INIT(class_devices, "class", "devices");
MEMBER_OFFSET_INIT(class_p, "class", "p");
MEMBER_OFFSET_INIT(class_private_devices, "class_private",
"class_devices");
MEMBER_OFFSET_INIT(device_knode_class, "device", "knode_class");
MEMBER_OFFSET_INIT(device_node, "device", "node");
MEMBER_OFFSET_INIT(device_type, "device", "type");
MEMBER_OFFSET_INIT(gendisk_dev, "gendisk", "dev");
if (INVALID_MEMBER(gendisk_dev))
MEMBER_OFFSET_INIT(gendisk_dev, "gendisk", "__dev");
MEMBER_OFFSET_INIT(gendisk_kobj, "gendisk", "kobj");
MEMBER_OFFSET_INIT(gendisk_part0, "gendisk", "part0");
MEMBER_OFFSET_INIT(gendisk_queue, "gendisk", "queue");
MEMBER_OFFSET_INIT(hd_struct_dev, "hd_struct", "__dev");
MEMBER_OFFSET_INIT(klist_k_list, "klist", "k_list");
MEMBER_OFFSET_INIT(klist_node_n_klist, "klist_node", "n_klist");
MEMBER_OFFSET_INIT(klist_node_n_node, "klist_node", "n_node");
MEMBER_OFFSET_INIT(kobject_entry, "kobject", "entry");
MEMBER_OFFSET_INIT(kset_list, "kset", "list");
MEMBER_OFFSET_INIT(request_list_count, "request_list", "count");
MEMBER_OFFSET_INIT(request_queue_in_flight, "request_queue",
"in_flight");
MEMBER_OFFSET_INIT(request_queue_rq, "request_queue", "rq");
MEMBER_OFFSET_INIT(subsys_private_klist_devices, "subsys_private",
"klist_devices");
MEMBER_OFFSET_INIT(subsystem_kset, "subsystem", "kset");
STRUCT_SIZE_INIT(subsystem, "subsystem");
STRUCT_SIZE_INIT(class_private, "class_private");
MEMBER_SIZE_INIT(rq_in_flight, "request_queue", "in_flight");
MEMBER_SIZE_INIT(class_private_devices, "class_private",
"class_devices");
}
display_all_diskio();
}
Secondly, the whole READ/WRITE issue is confusing to the uninitiated
(like myself). After speaking with Jeff Moyer, we believe the output
of your command could be clarified.
Your help page indicates:
+" This command dumps I/O statistics of all disks:",
+" TOTAL: The total requests that have not been ended",
+" READ: The total read requests that have not been ended",
+" WRITE: The total write requests that have not been ended",
+" DRV: The total requests that have been in the driver, but not end",
+"",
+" Note: some kernel does not contain read/write requests, and the command",
+" will output '-----'"
+"\nEXAMPLES",
+" %s> diskio",
+" MAJOR GENDISK NAME RUQUEST QUEUE TOTAL READ WRITE
DRV",
+" 008 0xe00000010773ea80 sda 0x3000000109c9fbf0 12 0 12
0",
+" 008 0xe00000010773e680 sdb 0x3000000109c9f8a0 2 2 0
0",
+" 008 0xe000000107781d80 sdc 0x300000010c268050 6 0 6
6",
+" 008 0xe00000010773e080 sdd 0x300000010c26bbf0 0 0 0
0",
+" 008 0xe00000010773dc80 sde 0x300000010c3dd780 0 0 0
0",
+NULL
As Jeff explained to me, this is how it works:
(1) In older kernels, this enum did not exist:
enum {
BLK_RW_ASYNC = 0,
BLK_RW_SYNC = 1,
};
and in that case, the request_list.count[2] values are the
READ/WRITE values, as you show in the help page example above.
(2) In newer kernels, the enum does exist, and the meaning of the
request_list.count[2] values are ASYNC/SYNC requests. In that
case, you show "-----" under the READ and WRITE columns, which
I found *very* confusing.
What Jeff Moyer suggested is that -- in the case of new kernels -- you
should alternatively show the counts with ASYNC and SYNC columns like
this:
MAJOR GENDISK NAME REQUEST QUEUE TOTAL ASYNC SYNC DRV",
...
And the READ/WRITE vs. ASYNC/SYNC output display difference should be
referenced in the help page output.
Third, a minor nit -- note that you misspelled it as "RUQUEST" both in
the command and in the help page. Also, it appears that you used an IA64 as
the example, given the GENDISK addresses. But aren't the REQUEST QUEUE
addresses below beginning with "0x300..." user-space addresses for IA64?
+" MAJOR GENDISK NAME RUQUEST QUEUE TOTAL READ WRITE
DRV",
+" 008 0xe00000010773ea80 sda 0x3000000109c9fbf0 12 0 12
0",
+" 008 0xe00000010773e680 sdb 0x3000000109c9f8a0 2 2 0
0",
+" 008 0xe000000107781d80 sdc 0x300000010c268050 6 0 6
6",
+" 008 0xe00000010773e080 sdd 0x300000010c26bbf0 0 0 0
0",
+" 008 0xe00000010773dc80 sde 0x300000010c3dd780 0 0 0
0",
In any case, can you change it to use either x86_64 or x86 as an example?
Fourth, in testing this with a 2.6.25 kernel, the command hangs:
crash> dev -d
MAJOR GENDISK NAME REQUEST QUEUE TOTAL READ WRITE DRV
2 0xffff81012d8a5000 fd0 0xffff81012dc053c0 0 0 0 0
22 0xffff81012dc6b000 hdc 0xffff81012d8ae340 0 0 0 0
8 0xffff81012dd71000 sda 0xffff81012d8af040 0 0 0 0
< hangs here forever >
I'm not absolutely sure, but if I hit CTRL-C under gdb, it seems that it's
spinning in next_disk_list() in that "goto again" loop?
Fifth, I tried it on a much older RHEL3 kernel, which shows:
crash> dev -d
dev: invalid request_queue.in_flight's size
crash>
It's not really invalid size, but it's more the case that you
don't support that old a kernel version. In that case, instead
of doing this:
error(FATAL, "invalid request_queue.in_flight's size\n");
you should do this instead:
option_not_supported('d');
And finally, whenever adding fields to the offset_table or size_table,
please display their values in dump_offset_table() for the "help -o"
command.
Thanks,
Dave