----- Original Message -----
Sorry, I just realized that my email settings are not correct.
Resend patch file here.
> Dave,
>
> This patch adds -M and -m options to the files command, which allow dumping
> the page cache for a file.
>
> Please review and let me know your comments. Thanks!
Hello Oliver,
Before getting into the patch specifics, please make it apply to the current
git tree contents:
$ git clone git://github.com/crash-utility/crash.git
Cloning into 'crash'...
remote: Counting objects: 954, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 954 (delta 0), reused 0 (delta 0), pack-reused 951
Receiving objects: 100% (954/954), 2.08 MiB, done.
Resolving deltas: 100% (634/634), done.
$ cd crash
$ patch -p1 < ../0001-files-support-dump-file-page-caches.patch
patching file defs.h
Hunk #3 succeeded at 4755 (offset -16 lines).
patching file filesys.c
patching file memory.c
Hunk #1 succeeded at 134 with fuzz 1 (offset 2 lines).
Hunk #2 succeeded at 6467 (offset 305 lines).
patching file task.c
Hunk #1 succeeded at 5612 (offset 11 lines).
Hunk #2 succeeded at 5636 (offset 11 lines).
Hunk #3 succeeded at 6144 (offset 11 lines).
Hunk #4 succeeded at 6377 (offset 11 lines).
$
And make sure it compiles cleanly with "make warn":
$ make warn
... [ cut ] ...
cc -c -g -DX86_64 -DLZO -DSNAPPY -DGDB_7_6 memory.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
memory.c: In function 'dump_file_address_mappings':
memory.c:6477:22: warning: unused variable 'ret' [-Wunused-variable]
memory.c:6475:8: warning: unused variable 'radix_tree_rnode' [-Wunused-variable]
memory.c: At top level:
memory.c:6505:1: warning: no previous prototype for 'get_page_tree_count' [-Wmissing-prototypes]
memory.c: In function 'get_page_tree_count':
memory.c:6507:8: warning: unused variable 'radix_tree_rnode' [-Wunused-variable]
cc -c -g -DX86_64 -DLZO -DSNAPPY -DGDB_7_6 filesys.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
filesys.c: In function 'cmd_files':
filesys.c:2215:4: warning: implicit declaration of function 'dump_file_address_mappings' [-Wimplicit-function-declaration]
filesys.c: In function 'file_dump':
filesys.c:2894:4: warning: implicit declaration of function 'get_page_tree_count' [-Wimplicit-function-declaration]
...
I've only done some quick testing, but for starters, the PATH translation
for /dev files is not working the same way as the regular "files" command:
crash> files
PID: 19772 TASK: ffff810278593820 CPU: 7 COMMAND: "sshd"
ROOT: / CWD: /
FD FILE DENTRY INODE TYPE PATH
0 ffff8102777021c0 ffff81027cc06078 ffff81027f3efa18 CHR /dev/null
1 ffff8102777021c0 ffff81027cc06078 ffff81027f3efa18 CHR /dev/null
2 ffff81027760abc0 ffff81027cc06078 ffff81027f3efa18 CHR /dev/null
3 ffff810213616dc0 ffff8102130de738 ffff81026d575610 SOCK socket:/[72770]
4 ffff81027e5b5cc0 ffff8102130de8e8 ffff810274152ad0 SOCK socket:/[72980]
5 ffff810216a63e80 ffff810229a28228 ffff81021006a910 PIPE
6 ffff810276ba8c80 ffff810229a28228 ffff81021006a910 PIPE
7 ffff81027e0757c0 ffff8102130dec48 ffff81027f2eb110 SOCK socket:/[72991]
8 ffff8102136160c0 ffff810213079b70 ffff8102741525d0 SOCK socket:/[72992]
9 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR /dev/ptmx
10 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR /dev/ptmx
11 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR /dev/ptmx
crash> files -M
PID: 19772 TASK: ffff810278593820 CPU: 7 COMMAND: "sshd"
ROOT: / CWD: /
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
0 ffff81027f3efb28 0 ffff81027f3efa18 CHR /null
1 ffff81027f3efb28 0 ffff81027f3efa18 CHR /null
2 ffff81027f3efb28 0 ffff81027f3efa18 CHR /null
3 ffff81026d575720 0 ffff81026d575610 SOCK socket:/[72770]
4 ffff810274152be0 0 ffff810274152ad0 SOCK socket:/[72980]
5 ffff81021006aa20 0 ffff81021006a910 PIPE
6 ffff81021006aa20 0 ffff81021006a910 PIPE
7 ffff81027f2eb220 0 ffff81027f2eb110 SOCK socket:/[72991]
8 ffff8102741526e0 0 ffff8102741525d0 SOCK socket:/[72992]
9 ffff81027f3ef528 0 ffff81027f3ef418 CHR /ptmx
10 ffff81027f3ef528 0 ffff81027f3ef418 CHR /ptmx
11 ffff81027f3ef528 0 ffff81027f3ef418 CHR /ptmx
crash>
But more importantly, for "files -M", it's not clear to me what the
PGCACHE-PGS count should or does mean.
One might expect to pass any ADDR-SPACE address shown by "files -M" to the
"files -m <address-space>" option, and see PGCACHE-PGS worth of pages dumped.
But that's not always true.
For example:
crash> files -M
PID: 30700 TASK: ffff810876c8d7a0 CPU: 0 COMMAND: "_progres"
ROOT: / CWD: /home/TCusa
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
0 ffff81080e7095c0 0 ffff81080e7094b0 CHR /45
1 ffff81080e7095c0 0 ffff81080e7094b0 CHR /45
2 ffff81080e7095c0 0 ffff81080e7094b0 CHR /45
3 ffff810fe6da85c0 0 ffff810fe6da84b0 PIPE
4 ffff81021b28eb88 35 ffff81021b28ea78 REG /dlc/101c/convmap.cp
5 ffff811024532b88 6 ffff811024532a78 REG /ProgTemp/lbiVQS7cd
6 ffff810224fbd850 78 ffff810224fbd740 REG /usa/mfg/usa.lg
7 ffff810218e31220 12 ffff810218e31110 REG /usa/mfg/usa.db
8 ffff810218e31220 12 ffff810218e31110 REG /usa/mfg/usa.db
9 ffff810e371fe260 42937 ffff810e371fe150 REG /usa/mfg/usa.b1
10 ffff81102f12ee40 0 ffff81102f12ed30 REG /usa/mfg/usa.b2
11 ffff810224fbde40 490 ffff810224fbdd30 REG /usa/mfg/usa.d1
12 ffff810224fbde40 490 ffff810224fbdd30 REG /usa/mfg/usa.d1
...
Taking FD 7's address space structure, the 12 page cache pages can be dumped:
crash> files -m ffff810218e31220
Address Space ffff810218e31220 : 12 pages in page cache
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffff81010774d1f8 221609000 ffff810218e31220 0 1 22010000001006c
ffff81010485b378 14ac59000 ffff810218e31220 1 1 14810000001006c
ffff810107993a78 22bc79000 ffff810218e31220 2 1 228100000010028
ffff8101049d3660 1517d4000 ffff810218e31220 3 1 150100000010028
ffff810103b29670 10e742000 ffff810218e31220 4 1 108100000010028
ffff810106d51ba0 1f3bec000 ffff810218e31220 5 1 1f0100000010028
ffff810103ac95f0 10cbd2000 ffff810218e31220 6 1 108100000010028
ffff810106c8cc18 1f03a5000 ffff810218e31220 7 1 1f0100000010028
ffff8101077b1028 223293000 ffff810218e31220 8 1 220100000010028
ffff810106cc03b0 1f125a000 ffff810218e31220 9 1 1f0100000010028
ffff810107a04cd8 22dccd000 ffff810218e31220 a 1 228100000010028
ffff8101078741b8 226a51000 ffff810218e31220 b 1 220100000010028
crash>
So taking FD 11's ffff810224fbde40, you would expect all 490 pages plus
the 3 line header to be dumped:
crash> files -m ffff810224fbde40 | wc -l
46
crash>
Or taking FD 9's ffff810e371fe260, you would expect 42937 pages plus the header:
crash> files -m ffff810e371fe260 | wc -l
2444
crash>
So what does that mean exactly? Should the PGCACHE-PGS display show "x of y",
where "x" is the number of a file's "y" cached pages that are mapped into the
specified address space?
I haven't looked too deeply at the patch-set yet, but in my quick test, I ran
into this in your new dump_file_address_mappings() function:
+ /* Now walk the tree, counting all the pages in the tree */
+ for (index = 0; index <= count; index++) {
+ rtp.index = index;
+ if (do_radix_tree(root_rnode, RADIX_TREE_SEARCH, &rtp)) {
+ meminfo.spec_addr = (ulong)rtp.value;
+ meminfo.memtype = KVADDR;
+ meminfo.flags = ADDRESS_SPECIFIED;
+ dump_mem_map_SPARSEMEM(&meminfo);
+ }
+ }
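A note on the quoted loop: it probes indices 0 through count, which finds every page only when the cached pages sit at contiguous file offsets starting at 0. Assuming that is what explains the short "files -m" output above (an assumption, not something confirmed from the patch), the effect can be reproduced with a toy model. Everything here (toy_cache, toy_lookup, walk_by_count, walk_to_max_index) is hypothetical and stands in for the radix tree; none of it is crash code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space's page tree:
 * a sorted list of the file page indices that are cached. */
struct toy_cache {
	const unsigned long *indices;	/* populated indices, ascending */
	size_t nr;			/* number of cached pages */
};

/* Return 1 if 'index' is populated, 0 otherwise (binary search). */
int toy_lookup(const struct toy_cache *c, unsigned long index)
{
	size_t lo = 0, hi = c->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (c->indices[mid] == index)
			return 1;
		if (c->indices[mid] < index)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/* Same shape as the quoted loop: probe indices 0..count inclusive. */
size_t walk_by_count(const struct toy_cache *c)
{
	size_t found = 0;
	unsigned long index;

	for (index = 0; index <= c->nr; index++)
		if (toy_lookup(c, index))
			found++;
	return found;
}

/* Probe up to the highest populated index instead of up to the count. */
size_t walk_to_max_index(const struct toy_cache *c)
{
	size_t found = 0;
	unsigned long index, max;

	if (c->nr == 0)
		return 0;
	assert(c->indices != NULL);
	max = c->indices[c->nr - 1];
	for (index = 0; index <= max; index++)
		if (toy_lookup(c, index))
			found++;
	return found;
}
```

With cached indices {0, 1, 2, 100, 200, 300}, walk_by_count() probes 0..6 and finds 3 pages, while walk_to_max_index() finds all 6. A real fix would more likely walk only the populated slots of the tree than probe every index.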
Also, if the kernel is not configured with CONFIG_SPARSEMEM, the "files -m"
option fails like this, on a 2.6.9 RHEL4 x86_64 kernel (yes, the address space
virtual address is correct):
crash> files -m 1016fcaa668
Address Space 1016fcaa668 : 20 pages in page cache
files: cannot resolve "mem_section"
crash>
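The quoted loop calls dump_mem_map_SPARSEMEM() directly, and "mem_section" is a SPARSEMEM-only symbol, so one plausible remedy is to pick the page-dumping routine according to the kernel's memory model. A minimal sketch of that dispatch follows, with hypothetical names throughout (mem_model, dump_page_flat, dump_page_sparse, select_dump, dump_page); it does not use crash's real helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memory-model tag; a real tool would detect this from the dump. */
enum mem_model { MODEL_FLAT, MODEL_SPARSE };

typedef int (*dump_page_fn)(unsigned long page);

/* Stubs standing in for model-specific dump routines.
 * Each returns a distinct code so the dispatch is observable. */
int dump_page_flat(unsigned long page)
{
	(void)page;
	return 1;
}

int dump_page_sparse(unsigned long page)
{
	(void)page;
	return 2;
}

/* Choose the routine that matches the memory model. */
dump_page_fn select_dump(enum mem_model model)
{
	switch (model) {
	case MODEL_SPARSE:
		return dump_page_sparse;
	case MODEL_FLAT:
		return dump_page_flat;
	}
	return NULL;
}

/* Dump one page through whichever routine matches the model. */
int dump_page(enum mem_model model, unsigned long page)
{
	dump_page_fn fn = select_dump(model);

	assert(fn != NULL);
	return fn(page);
}
```

The caller never names the SPARSEMEM variant itself, so a kernel without "mem_section" would never reach code that tries to resolve it.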
For backwards compatibility, I did a quick check on a couple of older 32-bit x86 kernels,
and on a RHEL4 2.6.9-based x86 kernel, "files -M" fails every time:
crash> files -M
PID: 4846 TASK: c09de0b0 CPU: 0 COMMAND: "dmach7"
ROOT: / CWD: /home/m7istp.4.6.18.b4/m7istp/dmach7/bin
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
radix_tree_root at cf53a3ac:
struct radix_tree_root {
height = 0x220,
gfp_mask = 0x0,
rnode = 0x1d244b3c
}
files: height 544 is greater than height_to_maxindex[] index 7
crash>
I thought it might be a problem with really old kernels, but it happens on
a 32-bit RHEL5 2.6.18-128.2.1.el5 kernel:
crash> files -M
PID: 22328 TASK: dbdeb000 CPU: 0 COMMAND: "sushiremote"
ROOT: / CWD: /afs/cs.wisc.edu/u/s/u/sushi
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
radix_tree_root at de8cb798:
struct radix_tree_root {
height = 0x220,
gfp_mask = 0x0,
rnode = 0x1000000
}
files: height 544 is greater than height_to_maxindex[] index 7
crash>
And the same thing on the most recent 32-bit x86 kernel I have on hand, which
is 2.6.40.4-5.fc15.i686.PAE:
crash> files -M
PID: 3804 TASK: f466a5e0 CPU: 0 COMMAND: "crash"
ROOT: / CWD: /root/crash-5.1.8
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
radix_tree_root at e6d6f644:
struct radix_tree_root {
height = 0x20,
gfp_mask = 0x0,
rnode = 0x101
}
files: height 32 is greater than height_to_maxindex[] index 7
crash>
So then I tried it on a 32-bit ARM 3.10.17 kernel, which also fails:
crash> files -M
PID: 13429 TASK: db944580 CPU: 1 COMMAND: "AudioIn_5F8"
ROOT: / CWD: /
FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
radix_tree_root at db6b1a7c:
struct radix_tree_root {
height = 0x20,
gfp_mask = 0x0,
rnode = 0x0
}
files: height 32 is greater than height_to_maxindex[] index 7
crash>
So I'm guessing that the patch fails on all 32-bit kernels.
Dave
>
> Here is the usage,
>
> 1. Dump the page cache page counts for a process's open files. The default
> is crash; it also works with a given pid,
>
> crash> files -M
>
> PID: 22710 TASK: ffff8801077153e0 CPU: 1 COMMAND: "crash"
>
> ROOT: / CWD: /auto/home2/yango/workspace/crash
>
> FD ADDR-SPACE PGCACHE-PGS INODE TYPE PATH
>
> 0 ffff8801031edbe8 0 ffff8801031edaa0 CHR /2
>
> 1 ffff8801031edbe8 0 ffff8801031edaa0 CHR /2
>
> 2 ffff8801031edbe8 0 ffff8801031edaa0 CHR /2
>
> 3 ffff880139bf8950 0 ffff880139bf8808 CHR /null
>
> 4 ffff88011e561390 0 ffff88011e561248 CHR /crash
>
> 5 ffff88012f8345f0 37910 ffff88012f8344a8 REG /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
>
> [snipped..........................]
>
>
> 2. Dump the pages in a given addr-space; this example uses ffff88012f8345f0
> from the output above.
> The page flags can indicate dirty pages, which helps with fsync stress debugging,
>
> crash> files -m ffff88012f8345f0
>
> Address Space ffff88012f8345f0 : 37910 pages in page cache
>
> PAGE PHYSICAL MAPPING INDEX CNT FLAGS
>
> ffffea0001f5bc40 7d6f1000 ffff88012f8345f0 0 2 3ff0000000086c
> referenced,uptodate,lru,active,private
>
> ffffea0001f5bc80 7d6f2000 ffff88012f8345f0 1 2 3ff0000000082c
> referenced,uptodate,lru,private
>
>
> [snipped..........................]
>
> ffffea00016226c0 5889b000 ffff88012f8345f0 9414 2 3ff0000000086c
> referenced,uptodate,lru,active,private
>
> ffffea000224f480 893d2000 ffff88012f8345f0 9415 2 3ff0000000086c
> referenced,uptodate,lru,active,private
>
> 3. foreach files does not work with -m, but it works with -M
>
> crash> foreach files -m
>
> foreach: foreach files command does not support -m option
>
> So we can use foreach to find which processes or files have the most pages
> in the page cache,
>
> crash> foreach files -M | grep REG | sort -k3 -n | tail -10
>
> 20 ffff880137a70be0 2 ffff880137a70a98 REG /ffinLFoAy
>
> 4 ffff880037630de0 131 ffff880037630c98 REG /var/log/audit/audit.log
>
> 4 ffff880037630de0 131 ffff880037630c98 REG /var/log/audit/audit.log
>
> 36 ffff8801352e91d8 574 ffff8801352e9090 REG /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/user-530.journal
>
> 34 ffff8801352e81f8 590 ffff8801352e80b0 REG /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/user-42.journal
>
> 5 ffff8800a90219c8 9816 ffff8800a9021880 REG /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
>
> 13 ffff880135267198 14051 ffff880135267050 REG /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/system.journal
>
> 5 ffff88012f8345f0 37910 ffff88012f8344a8 REG /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
>
> 1 ffff8800704f3d80 59468 ffff8800704f3c38 REG /ws/irqstat/nohup.out
>
> 2 ffff8800704f3d80 59468 ffff8800704f3c38 REG /ws/irqstat/nohup.out
>
>
> With these commands, we can easily debug page cache flush stress issues
> and find out which process or files had the problem.
>
>
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility