Hi Lianbo,
在 2023/11/3 21:14, lijiang 写道:
[EXTERNAL EMAIL NOTICE: This email originated from an external
sender.
Please be mindful of safe email handling and proprietary information
protection practices.]
On Fri, Nov 3, 2023 at 5:45 PM Shijie Huang
<shijie(a)amperemail.onmicrosoft.com> wrote:
Hi Lianbo,
在 2023/11/3 17:27, lijiang 写道:
> On Thu, Nov 2, 2023 at 9:35 AM Huang Shijie
> <shijie(a)os.amperecomputing.com> wrote:
>
> In the NUMA machine, it is useful to know the memory
> distribution of
> an inode page cache:
> How many pages in the node 0?
> How many pages in the node 1?
>
> Add "files -n" command to get the memory distribution
> information:
> 1.) Add new argument for dump_inode_page_cache_info()
> 2.) make page_to_nid() a global function.
> 3.) Add summary_inode_page() to check each page's node
> information.
> 4.) Use print_inode_summary_info() to print the
> memory distribution information of an inode.
>
> Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
> ---
> v2 --> v3:
> 1.) Always return 1 for summary_inode_page().
> 2.) Add more comment for help_files.
>
>
> Thank you for the update, Shijie.
>
> ---
> defs.h | 1 +
> filesys.c | 53
> ++++++++++++++++++++++++++++++++++++++++++++++++-----
> help.c | 14 +++++++++++++-
> memory.c | 3 +--
> 4 files changed, 63 insertions(+), 8 deletions(-)
>
> diff --git a/defs.h b/defs.h
> index 788f63a..1fe2d0b 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -5750,6 +5750,7 @@ int dump_inode_page(ulong);
> ulong valid_section_nr(ulong);
> void display_memory_from_file_offset(ulonglong, long, void *);
> void swap_info_init(void);
> +int page_to_nid(ulong);
>
> /*
> * filesys.c
> diff --git a/filesys.c b/filesys.c
> index 1d0ee7f..2c7cc74 100644
> --- a/filesys.c
> +++ b/filesys.c
> @@ -49,7 +49,7 @@ static int match_file_string(char *, char
> *, char *);
> static ulong get_root_vfsmount(char *);
> static void check_live_arch_mismatch(void);
> static long get_inode_nrpages(ulong);
> -static void dump_inode_page_cache_info(ulong);
> +static void dump_inode_page_cache_info(ulong, void *callback);
>
> #define DENTRY_CACHE (20)
> #define INODE_CACHE (20)
> @@ -2192,8 +2192,31 @@ get_inode_nrpages(ulong i_mapping)
> return nrpages;
> }
>
> +/* Used to collect the numa information for an inode */
> +static ulong *numa_node;
> +
> +static void
> +print_inode_summary_info(void)
> +{
> + int i;
> +
> + fprintf(fp, " NODE PAGES\n");
> + for (i = 0; i < vt->numnodes; i++)
> + fprintf(fp, " %2d %8ld\n", i, numa_node[i]);
> +}
> +
> +static int
> +summary_inode_page(ulong page)
> +{
> + int node = page_to_nid(page);
> +
> + if (0 <= node && node < vt->numnodes)
> + numa_node[node]++;
> + return 1;
> +}
>
>
> A clear error message would be nice when the "files -n" command
> fails. What's your opinion on the following changes?
>
ok, no problem.
I did not met the "error" message with the /proc/kcore.
Maybe there are error message in the kernel crash core file.
> +summary_inode_page(ulong page)
> +{
> + int node;
> +
> + if (!is_page_ptr(page, NULL))
> + error(FATAL, "Invalid inode page(0x%lx)\n", page);
> +
Is this redandent?
The is_page_ptr() is called in page_to_nid();
It's true.
page_to_nid()->
page_to_phys()->
is_page_ptr()
But it may be worth sacrificing a little performance for helping with
the debugging, if there is no better approach. Or need to refactor the
relevant code? Looks
We may optimize some functions, such as page_to_phys().
but yes, it's another issue.
like another issue.
I guess all the invalid page is "0" which means an empty page cache
slot, so could you please try this:
+static int
+summary_inode_page(ulong page)
+{
+ int node;
+
+ if (page) {
+ node = page_to_nid(page);
+ if (0 <= node && node < vt->numnodes)
+ numa_node[node]++;
+ }
+ return 1;
+}
+
> + node = page_to_nid(page);
> + if (node < 0 || node >= vt->numnodes)
> + error(FATAL, "Invalid node(%d) for
> page(0x%lx)\n", node, page);
> +
> + numa_node[node]++;
> +
> + return 1;
> +}
>
> Without the above changes, it will print a lot of failures of
> "invalid page" once the corresponding inode page is invalid,
> unless that is expected behavior.
>
> crash> files -n ffff8ea84c130938
> INODE NRPAGES
> ffff8ea84c130938 62527
>
> files: page_to_nid: invalid page: 0
> files: page_to_nid: invalid page: 0
> files: page_to_nid: invalid page: 0
> files: page_to_nid: invalid page: 10
> files: page_to_nid: invalid page: 10
> files: page_to_nid: invalid page: 10
> files: page_to_nid: invalid page: 20
> files: page_to_nid: invalid page: 20
> files: page_to_nid: invalid page: 20
> files: page_to_nid: invalid page: 30
> files: page_to_nid: invalid page: 30
> files: page_to_nid: invalid page: 30
> files: page_to_nid: invalid page: 40
> files: page_to_nid: invalid page: 40
> files: page_to_nid: invalid page: 40
> files: page_to_nid: invalid page: 50
> files: page_to_nid: invalid page: 50
> files: page_to_nid: invalid page: 50
> files: page_to_nid: invalid page: 60
> files: page_to_nid: invalid page: 60
> ...
>
>
> And also says that in help.c:
> +" -n inode given a hexadecimal inode address, check all
> the pages",
> +" in the page cache, and display a NUMA node
> distribution",
> +" if the inode page is valid, otherwise it will
> fail.",
>
> Just my suggestions, any thoughts?
I am fine with it.
I just curious why there are invalid pages in the page cache?
I tested it based on the live debugging(/proc/kcore), the steps are:
[1] files command to display the inode info
crash> files
PID: 293974 TASK: ffff8ea84be33380 CPU: 21 COMMAND: "crash"
ROOT: / CWD: /home/test/crash
FD FILE DENTRY INODE TYPE PATH
0 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR /dev/pts/0
1 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR /dev/pts/0
2 ffff8ea8514d0d00 ffff8ea87f2c9200 ffff8ea844548ea0 CHR /dev/pts/0
3 ffff8ea88675ec00 ffff8ea87040e780 ffff8ea872112a18 CHR /dev/null
4 ffff8ea8bc9ac300 ffff8ea872e0c240 ffff8ea872ec64e0 REG /proc/kcore
5 ffff8ea8bc9ad800 ffff8ea8539c6540 ffff8ea84c130938 REG
/home/linux/vmlinux
...
[2] files -n ffff8ea84c130938
I also tested with it.
Thanks
Huang Shijie