Date: Thu, 18 Jun 2015 17:18:27 -0400
From: anderson(a)redhat.com
To: crash-utility(a)redhat.com
Subject: Re: [Crash-utility] [PATCH v2] files: support dump file pages from its address
space
Hi Oliver,>
I have only had a chance to look at the patch contents, and haven't been able
to fully test it. I'll try to do that tomorrow, or if not, next week.
In the meantime, I do have a few comments regarding the patch itself that need
to be addressed.
The major problem is maintaining the API for extension modules. Any changes
to defs.h such as the function prototypes or structure declarations that are exported
in that file should not require the re-compilation of pre-existing extension
modules.
I was not aware of that. Good to know it.
One simple example is this addition you make to the offset_table:
@@ -1416,6 +1417,7 @@ struct offset_table { /* stash of commonly-used
offsets */
long inode_i_flock;
long inode_i_fop;
long inode_i_mapping;
+ long address_space_page_tree;
long address_space_nrpages;
long vfsmount_mnt_next;
long vfsmount_mnt_devname;
By inserting the new address_space_page_tree member where you did, it will break
the usage of OFFSET() for all subsequent members. Note this comment at the
beginning of the structure:
struct offset_table { /* stash of commonly-used offsets */
long list_head_next; /* add new entries to end of table */
...
So please put address_space_page_tree at the end of the offset_table structure.
Accept. I will follow this way.
On the other hand, please add an entry to the dump_offset_table()
function for
use by "help -o", and in that function you can place the
address_space_page_tree
display right next to address_space_nrpages.
But more importantly with respect to the API is the change where you
added the callback
argument to the do_radix_tree() function. The do_radix_tree API cannot be changed
without breaking any pre-existing extension modules that currently use it.
I have a question about extension modules support.
What is the exact meaning of pre-existing extension modules?
Are they the modules in crash source tree or outside of the source tree?
If a extension modules are maintained out side of crash source code tree, do we still need
take care of it?
Here's what I suggest. Instead of using RADIX_TREE_DUMP, create
a new flag
RADIX_TREE_DUMP_CALLBACK, and you can store the callback function address inside
the general purpose pc->curcmd_private member.
I'm ok with use a different flag. Although I didn't find any source codes are
using RADIX_TREE_DUMP flag.
But I had some concerns for storing callback address in pc->curcmd_private.
I noticed that the pc->curcmd_private is already used by other crash code at many
locations.
If someday people want to use the RADIX_TREE_DUMP_CALLBACK at that context, it may cause
the troubles.
If we really don't want to add new args for do_radix_tree, maybe we can define a new
global function pointerfor this purpose.
If you don't like my suggestion, we also can consider to use RADIX_TREE_SEARCH.
And lastly (for now), I am a big believer in keeping things as simple
as possible.
You have added all of these constructs:
+ #define AS_PAGE_TREE (0x20000000)
+ #define IS_AS_PAGE_TREE() (vt->flags & AS_PAGE_TREE)
+ MEMBER_OFFSET_INIT(address_space_page_tree, "address_space",
"page_tree");
+ if (VALID_MEMBER(address_space_page_tree))
+ vt->flags |= AS_PAGE_TREE;
+ /*
+ * Check the availability of address space page tree
+ */
+ int
+ is_as_page_tree_supported(void)
+ {
+ return (IS_AS_PAGE_TREE() ? TRUE : FALSE);
+ }
and where your code calls it twice:
+ if (is_as_page_tree_supported()) {
+ if (is_as_page_tree_supported())
Please get rid of AS_PAGE_TREE, IS_AS_PAGE_TREE() and is_as_page_tree_supported()
and simply use "if (VALID_MEMBER(address_space_page_tree))". That's
precisely the
kind of situation that VALID_MEMBER() is meant to be used for.
Accept.
Thanks for your careful review. I do learn lot from your comments.
Thanks,
Dave
>
>
> files: support dump file pages from its address space
>
> Added two options in files command,
>
> 1. -M option, which allows dump address space and page number for each files
> 2. -m option, which could dump each pages in given address mapping
>
> The foreach command also could work with -M, so that we can easily
> find which processes/files hold most page cache pages within the system.
>
> Signed-off-by: Yong Yang <yangoliver(a)gmail.com>
> ---
> defs.h | 11 ++++++-
> filesys.c | 111
> +++++++++++++++++++++++++++++++++++++++++++++++++++-----------
> kernel.c | 4 +--
> memory.c | 73 +++++++++++++++++++++++++++++++++++++++++
> task.c | 25 +++++++++++---
> 5 files changed, 197 insertions(+), 27 deletions(-)
>
> diff --git a/defs.h b/defs.h
> index b25b505..608e09f 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -1111,6 +1111,7 @@ extern struct machdep_table *machdep;
> #define FOREACH_a_FLAG (0x4000000)
> #define FOREACH_G_FLAG (0x8000000)
> #define FOREACH_F_FLAG2 (0x10000000)
> +#define FOREACH_M_FLAG (0x20000000)
>
> #define FOREACH_PS_EXCLUSIVE \
>
(FOREACH_g_FLAG|FOREACH_a_FLAG|FOREACH_t_FLAG|FOREACH_c_FLAG|FOREACH_p_FLAG|FOREACH_l_FLAG|FOREACH_r_FLAG|FOREACH_m_FLAG)
> @@ -1416,6 +1417,7 @@ struct offset_table { /* stash of
> commonly-used offsets */
> long inode_i_flock;
> long inode_i_fop;
> long inode_i_mapping;
> + long address_space_page_tree;
> long address_space_nrpages;
> long vfsmount_mnt_next;
> long vfsmount_mnt_devname;
> @@ -2286,11 +2288,13 @@ struct vm_table { /* kernel VM-related
> data */
> #define PAGEFLAGS (0x4000000)
> #define SLAB_OVERLOAD_PAGE (0x8000000)
> #define SLAB_CPU_CACHE (0x10000000)
> +#define AS_PAGE_TREE (0x20000000)
>
> #define IS_FLATMEM() (vt->flags & FLATMEM)
> #define IS_DISCONTIGMEM() (vt->flags & DISCONTIGMEM)
> #define IS_SPARSEMEM() (vt->flags & SPARSEMEM)
> #define IS_SPARSEMEM_EX() (vt->flags & SPARSEMEM_EX)
> +#define IS_AS_PAGE_TREE() (vt->flags & AS_PAGE_TREE)
>
> #define COMMON_VADDR_SPACE() (vt->flags & COMMON_VADDR)
> #define PADDR_PRLEN (vt->paddr_prlen)
> @@ -2598,6 +2602,7 @@ struct load_module {
> #define PRINT_SINGLE_VMA (0x80)
> #define PRINT_RADIX_10 (0x100)
> #define PRINT_RADIX_16 (0x200)
> +#define PRINT_PAGES (0x400)
>
> #define MIN_PAGE_SIZE (4096)
>
> @@ -4707,6 +4712,9 @@ void alter_stackbuf(struct bt_info *);
> int vaddr_type(ulong, struct task_context *);
> char *format_stack_entry(struct bt_info *bt, char *, ulong, ulong);
> int in_user_stack(ulong, ulong);
> +void dump_file_address_mappings(ulong);
> +long get_page_tree_count(ulong i_mapping);
> +int is_as_page_tree_supported(void);
>
> /*
> * filesys.c
> @@ -4747,12 +4755,13 @@ struct radix_tree_pair {
> ulong index;
> void *value;
> };
> -ulong do_radix_tree(ulong, int, struct radix_tree_pair *);
> +ulong do_radix_tree(ulong, int, struct radix_tree_pair *, int (*)(ulong));
> int file_dump(ulong, ulong, ulong, int, int);
> #define DUMP_FULL_NAME 1
> #define DUMP_INODE_ONLY 2
> #define DUMP_DENTRY_ONLY 4
> #define DUMP_EMPTY_FILE 8
> +#define DUMP_FILE_PAGE 16
> #endif /* !GDB_COMMON */
> int same_file(char *, char *);
> #ifndef GDB_COMMON
> diff --git a/filesys.c b/filesys.c
> index 0573fe6..f0ec78b 100644
> --- a/filesys.c
> +++ b/filesys.c
> @@ -2187,11 +2187,12 @@ cmd_files(void)
> int subsequent;
> struct reference reference, *ref;
> char *refarg;
> + int open_flags = 0;
>
> ref = NULL;
> refarg = NULL;
>
> - while ((c = getopt(argcnt, args, "d:R:")) != EOF) {
> + while ((c = getopt(argcnt, args, "d:R:m:M")) != EOF) {
> switch(c)
> {
> case 'R':
> @@ -2209,7 +2210,20 @@ cmd_files(void)
> value = htol(optarg, FAULT_ON_ERROR, NULL);
> display_dentry_info(value);
> return;
> -
> + case 'm':
> + if (is_as_page_tree_supported()) {
> + value = htol(optarg, FAULT_ON_ERROR, NULL);
> + dump_file_address_mappings(value);
> + } else {
> + option_not_supported('m');
> + }
> + return;
> + case 'M':
> + if (is_as_page_tree_supported())
> + open_flags |= PRINT_PAGES;
> + else
> + option_not_supported('M');
> + break;
> default:
> argerrs++;
> break;
> @@ -2222,7 +2236,9 @@ cmd_files(void)
> if (!args[optind]) {
> if (!ref)
> print_task_header(fp, CURRENT_CONTEXT(), 0);
> - open_files_dump(CURRENT_TASK(), 0, ref);
> +
> + open_files_dump(CURRENT_TASK(), open_flags, ref);
> +
> return;
> }
>
> @@ -2241,7 +2257,7 @@ cmd_files(void)
> for (tc = pid_to_context(value); tc; tc =
> tc->tc_next) {
> if (!ref)
> print_task_header(fp, tc,
> subsequent);
> - open_files_dump(tc->task, 0, ref);
> + open_files_dump(tc->task, open_flags, ref);
> fprintf(fp, "\n");
> }
> break;
> @@ -2249,7 +2265,7 @@ cmd_files(void)
> case STR_TASK:
> if (!ref)
> print_task_header(fp, tc, subsequent);
> - open_files_dump(tc->task, 0, ref);
> + open_files_dump(tc->task, open_flags, ref);
> break;
>
> case STR_INVALID:
> @@ -2321,6 +2337,7 @@ open_files_dump(ulong task, int flags, struct reference
> *ref)
> char buf4[BUFSIZE];
> char root_pwd[BUFSIZE];
> int root_pwd_printed = 0;
> + int file_dump_flags = 0;
>
> BZERO(root_pathname, BUFSIZE);
> BZERO(pwd_pathname, BUFSIZE);
> @@ -2329,15 +2346,27 @@ open_files_dump(ulong task, int flags, struct
> reference *ref)
> fdtable_buf = GETBUF(SIZE(fdtable));
> fill_task_struct(task);
>
> - sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> - space(MINSPACE),
> - mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "FILE"),
> - space(MINSPACE),
> - mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "DENTRY"),
> - space(MINSPACE),
> - mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> - space(MINSPACE),
> - space(MINSPACE));
> + if (flags & PRINT_PAGES) {
> + sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> + space(MINSPACE),
> + mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "ADDR-SPACE"),
> + space(MINSPACE),
> + mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "PAGE-COUNT"),
> + space(MINSPACE),
> + mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> + space(MINSPACE),
> + space(MINSPACE));
> + } else {
> + sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> + space(MINSPACE),
> + mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "FILE"),
> + space(MINSPACE),
> + mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "DENTRY"),
> + space(MINSPACE),
> + mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> + space(MINSPACE),
> + space(MINSPACE));
> + }
>
> tc = task_to_context(task);
>
> @@ -2523,6 +2552,10 @@ open_files_dump(ulong task, int flags, struct
> reference *ref)
> return;
> }
>
> + file_dump_flags = DUMP_FULL_NAME | DUMP_EMPTY_FILE;
> + if (flags & PRINT_PAGES)
> + file_dump_flags |= DUMP_FILE_PAGE;
> +
> j = 0;
> for (;;) {
> unsigned long set;
> @@ -2539,8 +2572,7 @@ open_files_dump(ulong task, int flags, struct reference
> *ref)
>
> if (ref && file) {
> open_tmpfile();
> - if (file_dump(file, 0, 0, i,
> - DUMP_FULL_NAME|DUMP_EMPTY_FILE))
> {
> + if (file_dump(file, 0, 0, i,
> file_dump_flags)) {
> BZERO(buf4, BUFSIZE);
> rewind(pc->tmpfile);
> ret = fgets(buf4, BUFSIZE,
> @@ -2558,8 +2590,7 @@ open_files_dump(ulong task, int flags, struct reference
> *ref)
> fprintf(fp, "%s", files_header);
> header_printed = 1;
> }
> - file_dump(file, 0, 0, i,
> - DUMP_FULL_NAME|DUMP_EMPTY_FILE);
> + file_dump(file, 0, 0, i, file_dump_flags);
> }
> }
> i++;
> @@ -2754,6 +2785,8 @@ file_dump(ulong file, ulong dentry, ulong inode, int
> fd, int flags)
> char buf1[BUFSIZE];
> char buf2[BUFSIZE];
> char buf3[BUFSIZE];
> + ulong i_mapping = 0;
> + ulong count = 0;
>
> file_buf = NULL;
>
> @@ -2863,6 +2896,28 @@ file_dump(ulong file, ulong dentry, ulong inode, int
> fd, int flags)
> type,
> space(MINSPACE),
> pathname+1);
> + } else if (flags & DUMP_FILE_PAGE) {
> + i_mapping = ULONG(inode_buf + OFFSET(inode_i_mapping));
> + count = get_page_tree_count(i_mapping);
> +
> + fprintf(fp, "%3d%s%s%s%s%s%s%s%s%s%s\n",
> + fd,
> + space(MINSPACE),
> + mkstring(buf1, VADDR_PRLEN,
> + CENTER|RJUST|LONG_HEX,
> + MKSTR(i_mapping)),
> + space(MINSPACE),
> + mkstring(buf2, VADDR_PRLEN,
> + CENTER|RJUST|LONG_DEC,
> + MKSTR(count)),
> + space(MINSPACE),
> + mkstring(buf3, VADDR_PRLEN,
> + CENTER|RJUST|LONG_HEX,
> + MKSTR(inode)),
> + space(MINSPACE),
> + type,
> + space(MINSPACE),
> + pathname);
> } else {
> fprintf(fp, "%3d%s%s%s%s%s%s%s%s%s%s\n",
> fd,
> @@ -3877,9 +3932,13 @@ ulong RADIX_TREE_MAP_MASK = UNINITIALIZED;
> * RADIX_TREE_GATHER; the dimension (max count) of the array may
> * be stored in the index field of the first structure to avoid
> * any chance of an overrun.
> + *
> + * entry_ops: The operation against each of returned entry value.
> + * Only used by RADIX_TREE_DUMP.
> */
> ulong
> -do_radix_tree(ulong root, int flag, struct radix_tree_pair *rtp)
> +do_radix_tree(ulong root, int flag, struct radix_tree_pair *rtp,
> + int (*entry_ops)(ulong))
> {
> int i, ilen, height;
> long nlen;
> @@ -3970,7 +4029,19 @@ do_radix_tree(ulong root, int flag, struct
> radix_tree_pair *rtp)
> for (index = count = 0; index <= maxindex; index++) {
> if ((ret =
> radix_tree_lookup(root_rnode, index, height))) {
> - fprintf(fp, "[%ld] %lx\n", index, (ulong)ret);
> + if (entry_ops == NULL) {
> + /* Default operation */
> + fprintf(fp, "[%ld] %lx\n",
> + index, (ulong)ret);
> + } else {
> + /* Caller defined operation */
> + if (entry_ops((ulong)ret) != 0) {
> + error(FATAL, "do_radix_tree: "
> + "dump operation failed, "
> + "count: %ld\n", count);
> + return -EIO;
> + }
> + }
> count++;
> }
> }
> diff --git a/kernel.c b/kernel.c
> index cb8084a..53d2e1d 100644
> --- a/kernel.c
> +++ b/kernel.c
> @@ -5746,12 +5746,12 @@ get_irq_desc_addr(int irq)
> return addr;
>
> cnt = do_radix_tree(symbol_value("irq_desc_tree"),
> - RADIX_TREE_COUNT, NULL);
> + RADIX_TREE_COUNT, NULL, NULL);
> len = sizeof(struct radix_tree_pair) * (cnt+1);
> rtp = (struct radix_tree_pair *)GETBUF(len);
> rtp[0].index = cnt;
> cnt = do_radix_tree(symbol_value("irq_desc_tree"),
> - RADIX_TREE_GATHER, rtp);
> + RADIX_TREE_GATHER, rtp, NULL);
>
> if (kt->highest_irq == 0)
> kt->highest_irq = rtp[cnt-1].index;
> diff --git a/memory.c b/memory.c
> index 765732b..0102dbc 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -292,6 +292,7 @@ static void dump_per_cpu_offsets(void);
> static void dump_page_flags(ulonglong);
> static ulong kmem_cache_nodelists(ulong);
> static void dump_hstates(void);
> +static int dump_file_page(ulong);
>
> /*
> * Memory display modes specific to this file.
> @@ -476,6 +477,9 @@ vm_init(void)
> MEMBER_OFFSET_INIT(block_device_bd_list, "block_device",
"bd_list");
> MEMBER_OFFSET_INIT(block_device_bd_disk, "block_device",
"bd_disk");
> MEMBER_OFFSET_INIT(inode_i_mapping, "inode", "i_mapping");
> + MEMBER_OFFSET_INIT(address_space_page_tree, "address_space",
"page_tree");
> + if (VALID_MEMBER(address_space_page_tree))
> + vt->flags |= AS_PAGE_TREE;
> MEMBER_OFFSET_INIT(address_space_nrpages, "address_space",
"nrpages");
> if (INVALID_MEMBER(address_space_nrpages))
> MEMBER_OFFSET_INIT(address_space_nrpages, "address_space",
"__nrpages");
> @@ -6465,6 +6469,75 @@ translate_page_flags(char *buffer, ulong flags)
> }
>
> /*
> + * Page tree dump ops.
> + */
> +static int
> +dump_file_page(ulong page)
> +{
> + struct meminfo meminfo;
> +
> + BZERO(&meminfo, sizeof(struct meminfo));
> + meminfo.spec_addr = page;
> + meminfo.memtype = KVADDR;
> + meminfo.flags = ADDRESS_SPECIFIED;
> + dump_mem_map(&meminfo);
> +
> + return 0;
> +}
> +
> +/*
> + * The address space file mapping radix tree walker.
> + */
> +void
> +dump_file_address_mappings(ulong i_mapping)
> +{
> + ulong root_rnode;
> + ulong count;
> +
> + root_rnode = i_mapping + OFFSET(address_space_page_tree);
> + count = get_page_tree_count(i_mapping);
> + fprintf(fp, "Address Space %lx, page tree %lx, %ld pages\n\n",
> + i_mapping, root_rnode, count);
> +
> + /* Dump each pages in radix tree */
> + (void) do_radix_tree(root_rnode, RADIX_TREE_DUMP,
> + NULL, &dump_file_page);
> +
> + return;
> +}
> +
> +/*
> + * Get the page count for the specific mapping
> + */
> +long
> +get_page_tree_count(ulong i_mapping)
> +{
> + ulong address_space = i_mapping;
> + char *address_space_buf;
> + ulong nrpages = 0;
> +
> + address_space_buf = GETBUF(SIZE(address_space));
> +
> + readmem(address_space, KVADDR, address_space_buf,
> + SIZE(address_space), "address_space buffer",
> + FAULT_ON_ERROR);
> + nrpages = ULONG(address_space_buf + OFFSET(address_space_nrpages));
> +
> + FREEBUF(address_space_buf);
> +
> + return nrpages;
> +}
> +
> +/*
> + * Check the availability of address space page tree
> + */
> +int
> +is_as_page_tree_supported(void)
> +{
> + return (IS_AS_PAGE_TREE() ? TRUE : FALSE);
> +}
> +
> +/*
> * dump_page_hash_table() displays the entries in each page_hash_table.
> */
>
> diff --git a/task.c b/task.c
> index 45be68c..4c95259 100644
> --- a/task.c
> +++ b/task.c
> @@ -5612,7 +5612,7 @@ cmd_foreach(void)
> BZERO(&foreach_data, sizeof(struct foreach_data));
> fd = &foreach_data;
>
> - while ((c = getopt(argcnt, args, "R:vomlgersStTpukcfFxhdaG"))
!=
> EOF) {
> + while ((c = getopt(argcnt, args, "R:vomMlgersStTpukcfFxhdaG"))
!=
> EOF) {
> switch(c)
> {
> case 'R':
> @@ -5636,6 +5636,10 @@ cmd_foreach(void)
> fd->flags |= FOREACH_m_FLAG;
> break;
>
> + case 'M':
> + fd->flags |= FOREACH_M_FLAG;
> + break;
> +
> case 'l':
> fd->flags |= FOREACH_l_FLAG;
> break;
> @@ -6140,6 +6144,13 @@ foreach(struct foreach_data *fd)
> print_header = FALSE;
> break;
>
> + case FOREACH_FILES:
> + if (fd->flags & FOREACH_m_FLAG)
> + error(FATAL,
> + "foreach files command does not "
> + "support -m option\n");
> + break;
> +
> case FOREACH_TEST:
> break;
> }
> @@ -6366,9 +6377,15 @@ foreach(struct foreach_data *fd)
>
> case FOREACH_FILES:
> pc->curcmd = "files";
> - open_files_dump(tc->task,
> - fd->flags & FOREACH_i_FLAG ?
> - PRINT_INODES : 0,
> + cmdflags = 0;
> +
> + if (fd->flags & FOREACH_i_FLAG)
> + cmdflags |= PRINT_INODES;
> + if (fd->flags & FOREACH_M_FLAG)
> + cmdflags |= PRINT_PAGES;
> +
> + open_files_dump(tc->task,
> + cmdflags,
> fd->reference ? ref : NULL);
> break;
>
> --
> 1.9.1
>
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
>
https://www.redhat.com/mailman/listinfo/crash-utility
>
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility