On Fri, Nov 3, 2023 at 5:45 PM Shijie Huang <shijie@amperemail.onmicrosoft.com> wrote:

Hi Lianbo,

在 2023/11/3 17:27, lijiang 写道:
On Thu, Nov 2, 2023 at 9:35 AM Huang Shijie <shijie@os.amperecomputing.com> wrote:
In the NUMA machine, it is useful to know the memory distribution of
an inode page cache:
        How many pages in the node 0?
        How many pages in the node 1?

Add "files -n" command to get the memory distribution information:
        1.) Add new argument for dump_inode_page_cache_info()
        2.) make page_to_nid() a global function.
        3.) Add summary_inode_page() to check each page's node
            information.
        4.) Use print_inode_summary_info() to print the
            memory distribution information of an inode.

Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
---
v2 --> v3:
        1.) Always return 1 for summary_inode_page().
        2.) Add more comment for help_files.


Thank you for the update, Shijie.
 
---
 defs.h    |  1 +
 filesys.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 help.c    | 14 +++++++++++++-
 memory.c  |  3 +--
 4 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/defs.h b/defs.h
index 788f63a..1fe2d0b 100644
--- a/defs.h
+++ b/defs.h
@@ -5750,6 +5750,7 @@ int dump_inode_page(ulong);
 ulong valid_section_nr(ulong);
 void display_memory_from_file_offset(ulonglong, long, void *);
 void swap_info_init(void);
+int page_to_nid(ulong);

 /*
  *  filesys.c
diff --git a/filesys.c b/filesys.c
index 1d0ee7f..2c7cc74 100644
--- a/filesys.c
+++ b/filesys.c
@@ -49,7 +49,7 @@ static int match_file_string(char *, char *, char *);
 static ulong get_root_vfsmount(char *);
 static void check_live_arch_mismatch(void);
 static long get_inode_nrpages(ulong);
-static void dump_inode_page_cache_info(ulong);
+static void dump_inode_page_cache_info(ulong, void *callback);

 #define DENTRY_CACHE (20)
 #define INODE_CACHE  (20)
@@ -2192,8 +2192,31 @@ get_inode_nrpages(ulong i_mapping)
        return nrpages;
 }

+/* Used to collect the numa information for an inode */
+static ulong *numa_node;
+
+static void
+print_inode_summary_info(void)
+{
+       int i;
+
+       fprintf(fp, "     NODE           PAGES\n");
+       for (i = 0; i < vt->numnodes; i++)
+               fprintf(fp, "     %2d          %8ld\n", i, numa_node[i]);
+}
+
+static int
+summary_inode_page(ulong page)
+{
+       int node = page_to_nid(page);
+
+       if (0 <= node && node < vt->numnodes)
+               numa_node[node]++;
+       return 1;
+}

A clear error message would be nice when the "files -n" command fails. What's your opinion on the following changes?

ok, no problem.


 I did not met the "error" message with the /proc/kcore.

Maybe there are error message in the kernel crash core file.


+summary_inode_page(ulong page)
+{
+       int node;
+
+       if (!is_page_ptr(page, NULL))
+               error(FATAL, "Invalid inode page(0x%lx)\n", page);
+

Is this redandent?

The is_page_ptr() is called in page_to_nid();


It's true.

page_to_nid()->
page_to_phys()->
is_page_ptr() 

But it may be worth sacrificing a little performance for helping with the debugging, if there is no better approach. Or need to refactor the relevant code? Looks like another issue.

+       node = page_to_nid(page);
+       if (node < 0 || node >= vt->numnodes)
+               error(FATAL, "Invalid node(%d) for page(0x%lx)\n", node, page);
+
+       numa_node[node]++;
+
+       return 1;
+}
 

Without the above changes, it will print a lot of failures of "invalid page" once the corresponding inode page is invalid, unless that is expected behavior.

crash> files -n ffff8ea84c130938
     INODE        NRPAGES
ffff8ea84c130938    62527

files: page_to_nid: invalid page: 0
files: page_to_nid: invalid page: 0
files: page_to_nid: invalid page: 0
files: page_to_nid: invalid page: 10
files: page_to_nid: invalid page: 10
files: page_to_nid: invalid page: 10
files: page_to_nid: invalid page: 20
files: page_to_nid: invalid page: 20
files: page_to_nid: invalid page: 20
files: page_to_nid: invalid page: 30
files: page_to_nid: invalid page: 30
files: page_to_nid: invalid page: 30
files: page_to_nid: invalid page: 40
files: page_to_nid: invalid page: 40
files: page_to_nid: invalid page: 40
files: page_to_nid: invalid page: 50
files: page_to_nid: invalid page: 50
files: page_to_nid: invalid page: 50
files: page_to_nid: invalid page: 60
files: page_to_nid: invalid page: 60
...


And also says that in help.c: 
+"     -n inode   given a hexadecimal inode address, check all the pages",
+"                in the page cache, and display a NUMA node distribution",
+"                if the inode page is valid, otherwise it will fail.",

Just my suggestions, any thoughts?

I am fine with it.

I just curious why there are invalid pages in the page cache?


I tested it based on the live debugging(/proc/kcore), the steps are:
[1] files command to display the inode info
crash> files
PID: 293974   TASK: ffff8ea84be33380  CPU: 21   COMMAND: "crash"
ROOT: /    CWD: /home/test/crash
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR  /dev/pts/0
  1 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR  /dev/pts/0
  2 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR  /dev/pts/0
  3 ffff8ea88675ec00 ffff8ea87040e780 ffff8ea872112a18 CHR  /dev/null
  4 ffff8ea8bc9ac300 ffff8ea872e0c240 ffff8ea872ec64e0 REG  /proc/kcore
  5 ffff8ea8bc9ad800 ffff8ea8539c6540 ffff8ea84c130938 REG  /home/linux/vmlinux
...

[2] files -n ffff8ea84c130938

Thanks.
Lianbo

Thanks

Huang Shijie