On Thu, May 25, 2023 at 7:27 AM Rongwei Wang <rongwei.wang@linux.alibaba.com> wrote:
On 2023/5/24 22:18, lijiang wrote:
Hi, Rongwei
Thank you for the patch.
On Tue, May 16, 2023 at 8:00 PM <crash-utility-request@redhat.com> wrote:
Date: Tue, 16 May 2023 19:40:54 +0800
From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
To: crash-utility@redhat.com, k-hagio-ab@nec.com
Subject: [Crash-utility] [PATCH v3] arm64/x86_64: show zero pfn
information when using vtop
Message-ID: <20230516114054.63844-1-rongwei.wang@linux.alibaba.com>
Content-Type: text/plain; charset="US-ASCII"; x-default=true
Currently, vtop cannot tell us that a page is the zero pfn
when a PTE or PMD maps the ZERO PAGE. This patch makes
vtop show this information directly, for example:
crash> vtop -c 13674 ffff8917e000
VIRTUAL PHYSICAL
ffff8917e000 836e71000
PAGE DIRECTORY: ffff000802f8d000
PGD: ffff000802f8dff8 => 884e29003
PUD: ffff000844e29ff0 => 884e93003
PMD: ffff000844e93240 => 840413003
PTE: ffff000800413bf0 => 160000836e71fc3
PAGE: 836e71000 (ZERO PAGE)
...
If huge page found:
crash> vtop -c 14538 ffff95800000
VIRTUAL PHYSICAL
ffff95800000 910c00000
PAGE DIRECTORY: ffff000801fa0000
PGD: ffff000801fa0ff8 => 884f53003
PUD: ffff000844f53ff0 => 8426cb003
PMD: ffff0008026cb560 => 60000910c00fc1
PAGE: 910c00000 (2MB, ZERO PAGE)
...
I did some tests on x86_64 and aarch64 machines, and got the following results.
[1] On x86_64, it does not print "ZERO PAGE" when using 1G huge pages (but it works for 2M huge pages).
Hi, lijiang
Sorry, it seems I missed this part when replying to your previous email.
Do you mean that 2M hugetlb shows "ZERO PAGE" when testing my patch?
I re-read mm/hugetlb.c in the kernel, but still could not find any
place that allocates a zero page, so this is unexpected to me.
I also tested 2M hugetlb in my environment and did not see "ZERO PAGE":
crash> vtop -c 4400 ffff83200000
VIRTUAL PHYSICAL
ffff83200000 10ca00000
PAGE DIRECTORY: ffff0007f860e000
PGD: ffff0007f860eff8 => 1081c3003
PUD: ffff0000c81c3ff0 => 83860c003
PMD: ffff0007f860c0c8 => 6000010ca00fc1
PAGE: 10ca00000 (2MB) <---here
My hugetlb test case is below:
#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
	char *m;
	size_t s = (8UL * 1024 * 1024);	/* 8 MB, i.e. four 2M huge pages */
	unsigned long i;
	volatile char val;	/* volatile so the reads are not optimized away */

	/* Anonymous private mapping backed by hugetlb pages. */
	m = mmap(NULL, s, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (m == MAP_FAILED) {
		perror("map mem");
		return 1;
	}

	/* Only read from the mapping, so every fault is a read fault. */
	for (i = 0; i < s; i += 4096)
		val = *(m + i);

	printf("addr: 0x%lx\n", (unsigned long)m);
	printf("map_hugetlb ok, press ENTER to quit!\n");
	getchar();	/* keep the mapping alive while inspecting it */
	munmap(m, s);
	return 0;
}
Could you please double-check the behavior of 2M hugetlb?
crash> vtop -c 2763 7fdfc0000000
VIRTUAL PHYSICAL
7fdfc0000000 300000000
PGD: 23b9ae7f8 => 8000000235031067
PUD: 235031bf8 => 80000003000008e7
PAGE: 300000000 (1GB)
PTE PHYSICAL FLAGS
80000003000008e7 300000000 (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
VMA START END FLAGS FILE
ffff9d65fc8a85c0 7fdfc0000000 7fe000000000 84400fb /mnt/hugetlbfs/test
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffef30cc000000 300000000 ffff9d65f5c35850 0 2 57ffffc001000c uptodate,dirty,head
crash> help -v|grep zero
zero_paddr: 221a37000
huge_zero_paddr: 240000000
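To make the output above easier to check by hand, here is a minimal sketch (not the code from the patch) of the comparison that a "ZERO PAGE" annotation would rely on. The helper names huge_entry_paddr() and is_zero_paddr() are hypothetical, and the mask assumes the x86_64 layout where bits 12..51 of an entry hold the physical address:

```c
#include <stdbool.h>
#include <stdint.h>

/* x86_64: bits 12..51 of a page-table entry hold the physical address. */
#define X86_64_PHYS_MASK 0x000ffffffffff000ULL

/*
 * Hypothetical helper: physical base of a huge mapping described by a
 * PUD/PMD entry. page_size must be a power of two (2M or 1G here); the
 * extra mask drops low bits that are flags in a huge entry.
 */
static inline uint64_t huge_entry_paddr(uint64_t entry, uint64_t page_size)
{
	return entry & X86_64_PHYS_MASK & ~(page_size - 1);
}

/*
 * Hypothetical helper: true if paddr matches either zero page address
 * reported by "help -v" (zero_paddr / huge_zero_paddr).
 */
static inline bool is_zero_paddr(uint64_t paddr, uint64_t zero_paddr,
				 uint64_t huge_zero_paddr)
{
	return paddr == zero_paddr || paddr == huge_zero_paddr;
}
```

With the values above, the PUD entry 80000003000008e7 decodes to physical address 300000000, which matches neither zero_paddr (221a37000) nor huge_zero_paddr (240000000), so no "ZERO PAGE" annotation would be expected for this 1GB hugetlbfs mapping.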
[2] On aarch64, it does not print "ZERO PAGE":
crash> vtop -c 23390 ffff8d600000
VIRTUAL PHYSICAL
ffff8d600000 cc800000
PAGE DIRECTORY: ffff224ba02d9000
PGD: ffff224ba02d9ff8 => 80000017b38f003
PUD: ffff224b7b38fff0 => 80000017b38e003
PMD: ffff224b7b38e358 => e80000cc800f41
PAGE: cc800000 (2MB)
PTE PHYSICAL FLAGS
e80000cc800f41 cc800000 (VALID|USER|SHARED|AF|NG|PXN|UXN|DIRTY)
VMA START END FLAGS FILE
ffff224bb315f678 ffff8d600000 ffff8d800000 4400fb /mnt/hugetlbfs/test
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffc892b320000 cc800000 ffff224b5c48ac90 0 2 7ffff80001000c uptodate,dirty,head
crash> help -v|grep zero
zero_paddr: 142662000
huge_zero_paddr: 111400000
I have one question: can this patch print "ZERO PAGE" on x86_64 when using 1G huge pages? Or is the current output the expected behavior on x86_64?
And it does not work on my aarch64 machine. Did I miss anything else?
Hi, lijiang
I see you used '/mnt/hugetlbfs/test' to test this patch, but I have not tested this on hugetlbfs; the patch only supports THP (I
You are right, Rongwei. Because the z.c test case does not work on my machines, I wrote two test cases myself.
My test cases:
[1] Use mmap() to map the memory (without the *MAP_HUGETLB* flag):
    ptr = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
[2] Use madvise() as below:
    posix_memalign(&ptr, huge_page_size, n);
    madvise(ptr, n, MADV_HUGEPAGE);
Then I read data from ptr. Both [1] and [2] work on my x86_64 machines for 2M huge pages.
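For anyone who wants to reproduce this, here is a rough sketch that combines the two test cases above into one helper. This is an assumption about how they could look, not the exact programs used; the name map_and_read() is hypothetical. It maps read-only anonymous memory, requests THP with madvise(), and then only reads from the region so that all faults are read faults:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of read-only anonymous memory,
 * request THP for it, and read one byte from every 4k page.
 * Returns the mapping (NULL on failure) and stores the sum of the
 * bytes read in *sum; untouched anonymous memory reads as zero.
 */
static inline void *map_and_read(size_t len, long *sum)
{
	const volatile unsigned char *p;
	size_t i;
	void *m;

	m = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (m == MAP_FAILED)
		return NULL;

#ifdef MADV_HUGEPAGE
	/* Best effort: this may fail if THP is not enabled on the system. */
	(void)madvise(m, len, MADV_HUGEPAGE);
#endif

	p = m;
	*sum = 0;
	for (i = 0; i < len; i += 4096)
		*sum += p[i];	/* volatile read, not optimized away */

	return m;
}
```

The mapping returned by mmap() is not necessarily 2M-aligned, so len should be at least 4M to make sure it contains one fully aligned 2M range. The caller would print the address, keep the process alive (for example with getchar()), and then run vtop on a 2M-aligned address inside the region.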
But for 1G huge pages, I did not see a similar code change in your patch. Do you mean that this patch does not support 1G huge pages?
IMHO, the Linux kernel currently only supports 1G hugetlb (there is no 1G THP), and hugetlb does not use the ZERO PAGE.
If I'm wrong here, please let me know.
Thanks for your careful review.
I can confirm that 1G huge pages are supported on my x86_64 machine:
# cat /proc/meminfo |grep -i huge
AnonHugePages: 407552 kB
ShmemHugePages: 0 kB
FileHugePages: 63488 kB
HugePages_Total: 10
HugePages_Free: 10
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 10485760 kB
indeed ignored hugetlb when coding this function).
I have also read mm/hugetlb.c roughly and did not find any zero page handling for read page faults. It seems that hugetlb is not a relevant user of this function.
If the current patch does not support printing "ZERO PAGE" information when using hugetlbfs, it would be good to describe that in the patch log (and explain a bit *why* if possible).
OK, I can update the comments in the next version.
If I miss something, please let me know.
BTW: could you please check the z.c test case again? Or can you share your steps in detail? That would help me speed up the tests on x86_64 (4k/2M/1G) and aarch64 (4k or 64k/2M/512M/1G) machines.
# gcc -o z z.c
[root@hp-z640-01 crash]# ./z
zero: 7efe4cbff010 ---seems the address is problematic, not aligned
That's right, this address is returned directly by malloc().
You can use 7efe4cc00000 or 7efe4cc01000 to check.
In short, just use an aligned (4k or 2M) address for testing.
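As a small illustration of the alignment advice above, this sketch rounds an address to a 4k or 2M boundary. The helper names align_down() and align_up() are hypothetical and not taken from z.c or crash:

```c
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/* Round addr down to a multiple of size; size must be a power of two. */
static inline uint64_t align_down(uint64_t addr, uint64_t size)
{
	return addr & ~(size - 1);
}

/* Round addr up to a multiple of size; size must be a power of two. */
static inline uint64_t align_up(uint64_t addr, uint64_t size)
{
	return (addr + size - 1) & ~(size - 1);
}
```

For the malloc() address 7efe4cbff010, align_up() with either SZ_4K or SZ_2M gives 7efe4cc00000, and adding SZ_4K to that gives 7efe4cc01000, the two addresses suggested above.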
Thanks,
-wrw
Thanks.
Lianbo
Thanks for your time!
-wrw
Thanks
Lianbo