> Date: Thu, 18 Jun 2015 17:18:27 -0400
> From: anderson@redhat.com
> To: crash-utility@redhat.com
> Subject: Re: [Crash-utility] [PATCH v2] files: support dump file pages from its address space

[Snipped here...]

> Hi Oliver,
>
> I have only had a chance to look at the patch contents, and haven't been able
> to fully test it. I'll try to do that tomorrow, or if not, next week.
>
> In the meantime, I do have a few comments regarding the patch itself that need
> to be addressed.
>
> The major problem is maintaining the API for extension modules. Any changes
> to defs.h such as the function prototypes or structure declarations that are exported
> in that file should not require the re-compilation of pre-existing extension
> modules. 

I was not aware of that. Good to know it.
 
>
> One simple example is this addition you make to the offset_table:
>
> @@ -1416,6 +1417,7 @@ struct offset_table { /* stash of commonly-used offsets */
> long inode_i_flock;
> long inode_i_fop;
> long inode_i_mapping;
> + long address_space_page_tree;
> long address_space_nrpages;
> long vfsmount_mnt_next;
> long vfsmount_mnt_devname;
>
> By inserting the new address_space_page_tree member where you did, it will break
> the usage of OFFSET() for all subsequent members. Note this comment at the
> beginning of the structure:
>
> struct offset_table { /* stash of commonly-used offsets */
> long list_head_next; /* add new entries to end of table */
> ...
>
> So please put address_space_page_tree at the end of the offset_table structure.

Accept. I will follow this way.

> On the other hand, please add an entry to the dump_offset_table() function for
> use by "help -o", and in that function you can place the address_space_page_tree
> display right next to address_space_nrpages.

Accept.

> But more importantly with respect to the API is the change where you added the callback
> argument to the do_radix_tree() function. The do_radix_tree API cannot be changed
> without breaking any pre-existing extension modules that currently use it.

I have a question about extension modules support.

What is the exact meaning of pre-existing extension modules? 

Are they the modules in crash source tree or outside of the source tree?

If a extension modules are maintained out side of crash source code tree, do we still need take care of it?

> Here's what I suggest. Instead of using RADIX_TREE_DUMP, create a new flag
> RADIX_TREE_DUMP_CALLBACK, and you can store the callback function address inside
> the general purpose pc->curcmd_private member.

I'm ok with use a different flag. Although I didn't find any source codes are using RADIX_TREE_DUMP flag.

But I had some concerns for storing callback address in pc->curcmd_private.

I noticed that the pc->curcmd_private is already used by other crash code at many locations.

If someday people want to use the RADIX_TREE_DUMP_CALLBACK at that context, it may cause the troubles.

If we really don't want to add new args for do_radix_tree, maybe we can define a new global function pointer
for this purpose. 

If you don't like my suggestion, we also can consider to use RADIX_TREE_SEARCH.

> And lastly (for now), I am a big believer in keeping things as simple as possible.
>
> You have added all of these constructs:
>
> + #define AS_PAGE_TREE (0x20000000)
>
> + #define IS_AS_PAGE_TREE() (vt->flags & AS_PAGE_TREE)
>
> + MEMBER_OFFSET_INIT(address_space_page_tree, "address_space", "page_tree");
> + if (VALID_MEMBER(address_space_page_tree))
> + vt->flags |= AS_PAGE_TREE;
>
> + /*
> + * Check the availability of address space page tree
> + */
> + int
> + is_as_page_tree_supported(void)
> + {
> + return (IS_AS_PAGE_TREE() ? TRUE : FALSE);
> + }
>
> and where your code calls it twice:
>
> + if (is_as_page_tree_supported()) {
>
> + if (is_as_page_tree_supported())
>
> Please get rid of AS_PAGE_TREE, IS_AS_PAGE_TREE() and is_as_page_tree_supported()
> and simply use "if (VALID_MEMBER(address_space_page_tree))". That's precisely the
> kind of situation that VALID_MEMBER() is meant to be used for.

Accept. 

Thanks for your careful review. I do learn lot from your comments.


> Thanks,
> Dave
>
>
> >
> >
> > files: support dump file pages from its address space
> >
> > Added two options in files command,
> >
> > 1. -M option, which allows dump address space and page number for each files
> > 2. -m option, which could dump each pages in given address mapping
> >
> > The foreach command also could work with -M, so that we can easily
> > find which processes/files hold most page cache pages within the system.
> >
> > Signed-off-by: Yong Yang <yangoliver@gmail.com>
> > ---
> > defs.h | 11 ++++++-
> > filesys.c | 111
> > +++++++++++++++++++++++++++++++++++++++++++++++++++-----------
> > kernel.c | 4 +--
> > memory.c | 73 +++++++++++++++++++++++++++++++++++++++++
> > task.c | 25 +++++++++++---
> > 5 files changed, 197 insertions(+), 27 deletions(-)
> >
> > diff --git a/defs.h b/defs.h
> > index b25b505..608e09f 100644
> > --- a/defs.h
> > +++ b/defs.h
> > @@ -1111,6 +1111,7 @@ extern struct machdep_table *machdep;
> > #define FOREACH_a_FLAG (0x4000000)
> > #define FOREACH_G_FLAG (0x8000000)
> > #define FOREACH_F_FLAG2 (0x10000000)
> > +#define FOREACH_M_FLAG (0x20000000)
> >
> > #define FOREACH_PS_EXCLUSIVE \
> > (FOREACH_g_FLAG|FOREACH_a_FLAG|FOREACH_t_FLAG|FOREACH_c_FLAG|FOREACH_p_FLAG|FOREACH_l_FLAG|FOREACH_r_FLAG|FOREACH_m_FLAG)
> > @@ -1416,6 +1417,7 @@ struct offset_table { /* stash of
> > commonly-used offsets */
> > long inode_i_flock;
> > long inode_i_fop;
> > long inode_i_mapping;
> > + long address_space_page_tree;
> > long address_space_nrpages;
> > long vfsmount_mnt_next;
> > long vfsmount_mnt_devname;
> > @@ -2286,11 +2288,13 @@ struct vm_table { /* kernel VM-related
> > data */
> > #define PAGEFLAGS (0x4000000)
> > #define SLAB_OVERLOAD_PAGE (0x8000000)
> > #define SLAB_CPU_CACHE (0x10000000)
> > +#define AS_PAGE_TREE (0x20000000)
> >
> > #define IS_FLATMEM() (vt->flags & FLATMEM)
> > #define IS_DISCONTIGMEM() (vt->flags & DISCONTIGMEM)
> > #define IS_SPARSEMEM() (vt->flags & SPARSEMEM)
> > #define IS_SPARSEMEM_EX() (vt->flags & SPARSEMEM_EX)
> > +#define IS_AS_PAGE_TREE() (vt->flags & AS_PAGE_TREE)
> >
> > #define COMMON_VADDR_SPACE() (vt->flags & COMMON_VADDR)
> > #define PADDR_PRLEN (vt->paddr_prlen)
> > @@ -2598,6 +2602,7 @@ struct load_module {
> > #define PRINT_SINGLE_VMA (0x80)
> > #define PRINT_RADIX_10 (0x100)
> > #define PRINT_RADIX_16 (0x200)
> > +#define PRINT_PAGES (0x400)
> >
> > #define MIN_PAGE_SIZE (4096)
> >
> > @@ -4707,6 +4712,9 @@ void alter_stackbuf(struct bt_info *);
> > int vaddr_type(ulong, struct task_context *);
> > char *format_stack_entry(struct bt_info *bt, char *, ulong, ulong);
> > int in_user_stack(ulong, ulong);
> > +void dump_file_address_mappings(ulong);
> > +long get_page_tree_count(ulong i_mapping);
> > +int is_as_page_tree_supported(void);
> >
> > /*
> > * filesys.c
> > @@ -4747,12 +4755,13 @@ struct radix_tree_pair {
> > ulong index;
> > void *value;
> > };
> > -ulong do_radix_tree(ulong, int, struct radix_tree_pair *);
> > +ulong do_radix_tree(ulong, int, struct radix_tree_pair *, int (*)(ulong));
> > int file_dump(ulong, ulong, ulong, int, int);
> > #define DUMP_FULL_NAME 1
> > #define DUMP_INODE_ONLY 2
> > #define DUMP_DENTRY_ONLY 4
> > #define DUMP_EMPTY_FILE 8
> > +#define DUMP_FILE_PAGE 16
> > #endif /* !GDB_COMMON */
> > int same_file(char *, char *);
> > #ifndef GDB_COMMON
> > diff --git a/filesys.c b/filesys.c
> > index 0573fe6..f0ec78b 100644
> > --- a/filesys.c
> > +++ b/filesys.c
> > @@ -2187,11 +2187,12 @@ cmd_files(void)
> > int subsequent;
> > struct reference reference, *ref;
> > char *refarg;
> > + int open_flags = 0;
> >
> > ref = NULL;
> > refarg = NULL;
> >
> > - while ((c = getopt(argcnt, args, "d:R:")) != EOF) {
> > + while ((c = getopt(argcnt, args, "d:R:m:M")) != EOF) {
> > switch(c)
> > {
> > case 'R':
> > @@ -2209,7 +2210,20 @@ cmd_files(void)
> > value = htol(optarg, FAULT_ON_ERROR, NULL);
> > display_dentry_info(value);
> > return;
> > -
> > + case 'm':
> > + if (is_as_page_tree_supported()) {
> > + value = htol(optarg, FAULT_ON_ERROR, NULL);
> > + dump_file_address_mappings(value);
> > + } else {
> > + option_not_supported('m');
> > + }
> > + return;
> > + case 'M':
> > + if (is_as_page_tree_supported())
> > + open_flags |= PRINT_PAGES;
> > + else
> > + option_not_supported('M');
> > + break;
> > default:
> > argerrs++;
> > break;
> > @@ -2222,7 +2236,9 @@ cmd_files(void)
> > if (!args[optind]) {
> > if (!ref)
> > print_task_header(fp, CURRENT_CONTEXT(), 0);
> > - open_files_dump(CURRENT_TASK(), 0, ref);
> > +
> > + open_files_dump(CURRENT_TASK(), open_flags, ref);
> > +
> > return;
> > }
> >
> > @@ -2241,7 +2257,7 @@ cmd_files(void)
> > for (tc = pid_to_context(value); tc; tc =
> > tc->tc_next) {
> > if (!ref)
> > print_task_header(fp, tc,
> > subsequent);
> > - open_files_dump(tc->task, 0, ref);
> > + open_files_dump(tc->task, open_flags, ref);
> > fprintf(fp, "\n");
> > }
> > break;
> > @@ -2249,7 +2265,7 @@ cmd_files(void)
> > case STR_TASK:
> > if (!ref)
> > print_task_header(fp, tc, subsequent);
> > - open_files_dump(tc->task, 0, ref);
> > + open_files_dump(tc->task, open_flags, ref);
> > break;
> >
> > case STR_INVALID:
> > @@ -2321,6 +2337,7 @@ open_files_dump(ulong task, int flags, struct reference
> > *ref)
> > char buf4[BUFSIZE];
> > char root_pwd[BUFSIZE];
> > int root_pwd_printed = 0;
> > + int file_dump_flags = 0;
> >
> > BZERO(root_pathname, BUFSIZE);
> > BZERO(pwd_pathname, BUFSIZE);
> > @@ -2329,15 +2346,27 @@ open_files_dump(ulong task, int flags, struct
> > reference *ref)
> > fdtable_buf = GETBUF(SIZE(fdtable));
> > fill_task_struct(task);
> >
> > - sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> > - space(MINSPACE),
> > - mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "FILE"),
> > - space(MINSPACE),
> > - mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "DENTRY"),
> > - space(MINSPACE),
> > - mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> > - space(MINSPACE),
> > - space(MINSPACE));
> > + if (flags & PRINT_PAGES) {
> > + sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> > + space(MINSPACE),
> > + mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "ADDR-SPACE"),
> > + space(MINSPACE),
> > + mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "PAGE-COUNT"),
> > + space(MINSPACE),
> > + mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> > + space(MINSPACE),
> > + space(MINSPACE));
> > + } else {
> > + sprintf(files_header, " FD%s%s%s%s%s%s%sTYPE%sPATH\n",
> > + space(MINSPACE),
> > + mkstring(buf1, VADDR_PRLEN, CENTER|LJUST, "FILE"),
> > + space(MINSPACE),
> > + mkstring(buf2, VADDR_PRLEN, CENTER|LJUST, "DENTRY"),
> > + space(MINSPACE),
> > + mkstring(buf3, VADDR_PRLEN, CENTER|LJUST, "INODE"),
> > + space(MINSPACE),
> > + space(MINSPACE));
> > + }
> >
> > tc = task_to_context(task);
> >
> > @@ -2523,6 +2552,10 @@ open_files_dump(ulong task, int flags, struct
> > reference *ref)
> > return;
> > }
> >
> > + file_dump_flags = DUMP_FULL_NAME | DUMP_EMPTY_FILE;
> > + if (flags & PRINT_PAGES)
> > + file_dump_flags |= DUMP_FILE_PAGE;
> > +
> > j = 0;
> > for (;;) {
> > unsigned long set;
> > @@ -2539,8 +2572,7 @@ open_files_dump(ulong task, int flags, struct reference
> > *ref)
> >
> > if (ref && file) {
> > open_tmpfile();
> > - if (file_dump(file, 0, 0, i,
> > - DUMP_FULL_NAME|DUMP_EMPTY_FILE))
> > {
> > + if (file_dump(file, 0, 0, i,
> > file_dump_flags)) {
> > BZERO(buf4, BUFSIZE);
> > rewind(pc->tmpfile);
> > ret = fgets(buf4, BUFSIZE,
> > @@ -2558,8 +2590,7 @@ open_files_dump(ulong task, int flags, struct reference
> > *ref)
> > fprintf(fp, "%s", files_header);
> > header_printed = 1;
> > }
> > - file_dump(file, 0, 0, i,
> > - DUMP_FULL_NAME|DUMP_EMPTY_FILE);
> > + file_dump(file, 0, 0, i, file_dump_flags);
> > }
> > }
> > i++;
> > @@ -2754,6 +2785,8 @@ file_dump(ulong file, ulong dentry, ulong inode, int
> > fd, int flags)
> > char buf1[BUFSIZE];
> > char buf2[BUFSIZE];
> > char buf3[BUFSIZE];
> > + ulong i_mapping = 0;
> > + ulong count = 0;
> >
> > file_buf = NULL;
> >
> > @@ -2863,6 +2896,28 @@ file_dump(ulong file, ulong dentry, ulong inode, int
> > fd, int flags)
> > type,
> > space(MINSPACE),
> > pathname+1);
> > + } else if (flags & DUMP_FILE_PAGE) {
> > + i_mapping = ULONG(inode_buf + OFFSET(inode_i_mapping));
> > + count = get_page_tree_count(i_mapping);
> > +
> > + fprintf(fp, "%3d%s%s%s%s%s%s%s%s%s%s\n",
> > + fd,
> > + space(MINSPACE),
> > + mkstring(buf1, VADDR_PRLEN,
> > + CENTER|RJUST|LONG_HEX,
> > + MKSTR(i_mapping)),
> > + space(MINSPACE),
> > + mkstring(buf2, VADDR_PRLEN,
> > + CENTER|RJUST|LONG_DEC,
> > + MKSTR(count)),
> > + space(MINSPACE),
> > + mkstring(buf3, VADDR_PRLEN,
> > + CENTER|RJUST|LONG_HEX,
> > + MKSTR(inode)),
> > + space(MINSPACE),
> > + type,
> > + space(MINSPACE),
> > + pathname);
> > } else {
> > fprintf(fp, "%3d%s%s%s%s%s%s%s%s%s%s\n",
> > fd,
> > @@ -3877,9 +3932,13 @@ ulong RADIX_TREE_MAP_MASK = UNINITIALIZED;
> > * RADIX_TREE_GATHER; the dimension (max count) of the array may
> > * be stored in the index field of the first structure to avoid
> > * any chance of an overrun.
> > + *
> > + * entry_ops: The operation against each of returned entry value.
> > + * Only used by RADIX_TREE_DUMP.
> > */
> > ulong
> > -do_radix_tree(ulong root, int flag, struct radix_tree_pair *rtp)
> > +do_radix_tree(ulong root, int flag, struct radix_tree_pair *rtp,
> > + int (*entry_ops)(ulong))
> > {
> > int i, ilen, height;
> > long nlen;
> > @@ -3970,7 +4029,19 @@ do_radix_tree(ulong root, int flag, struct
> > radix_tree_pair *rtp)
> > for (index = count = 0; index <= maxindex; index++) {
> > if ((ret =
> > radix_tree_lookup(root_rnode, index, height))) {
> > - fprintf(fp, "[%ld] %lx\n", index, (ulong)ret);
> > + if (entry_ops == NULL) {
> > + /* Default operation */
> > + fprintf(fp, "[%ld] %lx\n",
> > + index, (ulong)ret);
> > + } else {
> > + /* Caller defined operation */
> > + if (entry_ops((ulong)ret) != 0) {
> > + error(FATAL, "do_radix_tree: "
> > + "dump operation failed, "
> > + "count: %ld\n", count);
> > + return -EIO;
> > + }
> > + }
> > count++;
> > }
> > }
> > diff --git a/kernel.c b/kernel.c
> > index cb8084a..53d2e1d 100644
> > --- a/kernel.c
> > +++ b/kernel.c
> > @@ -5746,12 +5746,12 @@ get_irq_desc_addr(int irq)
> > return addr;
> >
> > cnt = do_radix_tree(symbol_value("irq_desc_tree"),
> > - RADIX_TREE_COUNT, NULL);
> > + RADIX_TREE_COUNT, NULL, NULL);
> > len = sizeof(struct radix_tree_pair) * (cnt+1);
> > rtp = (struct radix_tree_pair *)GETBUF(len);
> > rtp[0].index = cnt;
> > cnt = do_radix_tree(symbol_value("irq_desc_tree"),
> > - RADIX_TREE_GATHER, rtp);
> > + RADIX_TREE_GATHER, rtp, NULL);
> >
> > if (kt->highest_irq == 0)
> > kt->highest_irq = rtp[cnt-1].index;
> > diff --git a/memory.c b/memory.c
> > index 765732b..0102dbc 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -292,6 +292,7 @@ static void dump_per_cpu_offsets(void);
> > static void dump_page_flags(ulonglong);
> > static ulong kmem_cache_nodelists(ulong);
> > static void dump_hstates(void);
> > +static int dump_file_page(ulong);
> >
> > /*
> > * Memory display modes specific to this file.
> > @@ -476,6 +477,9 @@ vm_init(void)
> > MEMBER_OFFSET_INIT(block_device_bd_list, "block_device", "bd_list");
> > MEMBER_OFFSET_INIT(block_device_bd_disk, "block_device", "bd_disk");
> > MEMBER_OFFSET_INIT(inode_i_mapping, "inode", "i_mapping");
> > + MEMBER_OFFSET_INIT(address_space_page_tree, "address_space", "page_tree");
> > + if (VALID_MEMBER(address_space_page_tree))
> > + vt->flags |= AS_PAGE_TREE;
> > MEMBER_OFFSET_INIT(address_space_nrpages, "address_space", "nrpages");
> > if (INVALID_MEMBER(address_space_nrpages))
> > MEMBER_OFFSET_INIT(address_space_nrpages, "address_space", "__nrpages");
> > @@ -6465,6 +6469,75 @@ translate_page_flags(char *buffer, ulong flags)
> > }
> >
> > /*
> > + * Page tree dump ops.
> > + */
> > +static int
> > +dump_file_page(ulong page)
> > +{
> > + struct meminfo meminfo;
> > +
> > + BZERO(&meminfo, sizeof(struct meminfo));
> > + meminfo.spec_addr = page;
> > + meminfo.memtype = KVADDR;
> > + meminfo.flags = ADDRESS_SPECIFIED;
> > + dump_mem_map(&meminfo);
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * The address space file mapping radix tree walker.
> > + */
> > +void
> > +dump_file_address_mappings(ulong i_mapping)
> > +{
> > + ulong root_rnode;
> > + ulong count;
> > +
> > + root_rnode = i_mapping + OFFSET(address_space_page_tree);
> > + count = get_page_tree_count(i_mapping);
> > + fprintf(fp, "Address Space %lx, page tree %lx, %ld pages\n\n",
> > + i_mapping, root_rnode, count);
> > +
> > + /* Dump each pages in radix tree */
> > + (void) do_radix_tree(root_rnode, RADIX_TREE_DUMP,
> > + NULL, &dump_file_page);
> > +
> > + return;
> > +}
> > +
> > +/*
> > + * Get the page count for the specific mapping
> > + */
> > +long
> > +get_page_tree_count(ulong i_mapping)
> > +{
> > + ulong address_space = i_mapping;
> > + char *address_space_buf;
> > + ulong nrpages = 0;
> > +
> > + address_space_buf = GETBUF(SIZE(address_space));
> > +
> > + readmem(address_space, KVADDR, address_space_buf,
> > + SIZE(address_space), "address_space buffer",
> > + FAULT_ON_ERROR);
> > + nrpages = ULONG(address_space_buf + OFFSET(address_space_nrpages));
> > +
> > + FREEBUF(address_space_buf);
> > +
> > + return nrpages;
> > +}
> > +
> > +/*
> > + * Check the availability of address space page tree
> > + */
> > +int
> > +is_as_page_tree_supported(void)
> > +{
> > + return (IS_AS_PAGE_TREE() ? TRUE : FALSE);
> > +}
> > +
> > +/*
> > * dump_page_hash_table() displays the entries in each page_hash_table.
> > */
> >
> > diff --git a/task.c b/task.c
> > index 45be68c..4c95259 100644
> > --- a/task.c
> > +++ b/task.c
> > @@ -5612,7 +5612,7 @@ cmd_foreach(void)
> > BZERO(&foreach_data, sizeof(struct foreach_data));
> > fd = &foreach_data;
> >
> > - while ((c = getopt(argcnt, args, "R:vomlgersStTpukcfFxhdaG")) !=
> > EOF) {
> > + while ((c = getopt(argcnt, args, "R:vomMlgersStTpukcfFxhdaG")) !=
> > EOF) {
> > switch(c)
> > {
> > case 'R':
> > @@ -5636,6 +5636,10 @@ cmd_foreach(void)
> > fd->flags |= FOREACH_m_FLAG;
> > break;
> >
> > + case 'M':
> > + fd->flags |= FOREACH_M_FLAG;
> > + break;
> > +
> > case 'l':
> > fd->flags |= FOREACH_l_FLAG;
> > break;
> > @@ -6140,6 +6144,13 @@ foreach(struct foreach_data *fd)
> > print_header = FALSE;
> > break;
> >
> > + case FOREACH_FILES:
> > + if (fd->flags & FOREACH_m_FLAG)
> > + error(FATAL,
> > + "foreach files command does not "
> > + "support -m option\n");
> > + break;
> > +
> > case FOREACH_TEST:
> > break;
> > }
> > @@ -6366,9 +6377,15 @@ foreach(struct foreach_data *fd)
> >
> > case FOREACH_FILES:
> > pc->curcmd = "files";
> > - open_files_dump(tc->task,
> > - fd->flags & FOREACH_i_FLAG ?
> > - PRINT_INODES : 0,
> > + cmdflags = 0;
> > +
> > + if (fd->flags & FOREACH_i_FLAG)
> > + cmdflags |= PRINT_INODES;
> > + if (fd->flags & FOREACH_M_FLAG)
> > + cmdflags |= PRINT_PAGES;
> > +
> > + open_files_dump(tc->task,
> > + cmdflags,
> > fd->reference ? ref : NULL);
> > break;
> >
> > --
> > 1.9.1
> >
> > --
> > Crash-utility mailing list
> > Crash-utility@redhat.com
> > https://www.redhat.com/mailman/listinfo/crash-utility
> >
>
> --
> Crash-utility mailing list
> Crash-utility@redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility