----- Original Message -----
I found Dave had already done the first phase of future support for
x86_64 5-level page tables (commit 307e7f35f510). When I asked him about
the state of this work, he gave me a more detailed answer and suggestions.
I followed his advice and did the following work.
1. Refine the original logic:
1) Create some new common functions for getting the offset of page table
2) Replace the PML4 and UPML with the common PGD:
machdep->machspec->pml4/upml ==> machdep->pgd
3) Use the PUD in x86_64
2. Add 5-level page tables support for x86_64_k/uvtop()
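As a rough illustration of what a common page-table offset helper in item
1.1) could look like, here is a minimal sketch. It assumes 4 KiB pages and
9 index bits per level, as on x86_64; the sk_* names are hypothetical and
are not the functions or macros used in the patch or in crash itself.

```c
#include <stdint.h>

/* Assumed x86_64 layout: 4 KiB pages, 512 entries (9 bits) per table. */
#define SK_PAGE_SHIFT  12
#define SK_INDEX_BITS  9
#define SK_INDEX_MASK  ((UINT64_C(1) << SK_INDEX_BITS) - 1)

/*
 * Index into the page table at a given level for a virtual address.
 * level 0 = PTE, 1 = PMD, 2 = PUD, 3 = PGD (4-level) or P4D (5-level),
 * level 4 = PGD (5-level only).
 */
static uint64_t sk_pt_index(uint64_t vaddr, int level)
{
    return (vaddr >> (SK_PAGE_SHIFT + SK_INDEX_BITS * level)) & SK_INDEX_MASK;
}

/* Index into the top-level table; "levels" is 4 or 5. */
static uint64_t sk_top_index(uint64_t vaddr, int levels)
{
    return sk_pt_index(vaddr, levels - 1);
}
```

With one helper parameterized by level, the 4-level and 5-level walks differ
only in the level at which they start.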
This patchset is the second phase of the work. As Dave said, we need a way
of determining very early on whether the kernel page tables are using
5-level and whether each user-space task is using 4- or 5-level page
tables. These will be done after this phase.
About testing:
I have tested this patchset with 4-level and 5-level page tables.
sadump, Xen, old Linux kernels, and RHEL4 have not been tested.
Hello Dou,
Thank you very much for the work you have done so far. I have not spent
any time looking at the patches in detail, but instead I first ran a quick
test of the patch on a set of ~250 kernels that I keep around for testing,
where I just ran the "mod" command to at least verify that kernel virtual
addresses could be translated.
Now, as always, backwards compatibility must be maintained. I do not have
any sadump dumpfiles, but obviously you (Fujitsu) can test those. However
I do have some older Xen and RHEL4-era kernels in my sample set.
As it turns out, *all* RHEL4 kernels failed (i.e. any kernel version
2.6.9 or earlier). They all report "WARNING: cannot access vmalloc'd
module memory" during session initialization, when trying to gather the
kernel module list. For example:
$ crash vmlinux-2.6.9-42.0.2.ELsmp.gz vmcore
crash 7.2.1rc26
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
please wait... (gathering module symbol data)
WARNING: cannot access vmalloc'd module memory
KERNEL: vmlinux-2.6.9-42.0.2.ELsmp.gz
DUMPFILE: vmcore
CPUS: 8
DATE: Tue Nov 21 19:14:17 2006
UPTIME: 6 days, 01:23:25
LOAD AVERAGE: 24.34, 7.89, 4.46
TASKS: 865
NODENAME: lonrs00268
RELEASE: 2.6.9-42.0.2.ELsmp
VERSION: #1 SMP Thu Aug 17 17:57:31 EDT 2006
MACHINE: x86_64 (2199 Mhz)
MEMORY: 16 GB
PANIC: "Kernel BUG at panic:75"
PID: 20046
COMMAND: "oracle"
TASK: 101c6b047f0 [THREAD_INFO: 101a428a000]
CPU: 7
STATE: TASK_RUNNING (NMI)
crash>
If I run the session with "crash -d4 vmlinux-2.6.9-42.0.2.ELsmp.gz vmcore",
you can see that it reads a "pud page", but then fails:
...
please wait... (gathering module symbol data)module: ffffffffa0634180
<readmem: ffffffffa0634180, KVADDR, "module struct", 1408, (ROE|Q), f73780>
<readmem: 4f8000, PHYSADDR, "pud page", 4096, (FOE), 2080b40>
<read_diskdump: addr: 4f8000 paddr: 4f8000 cnt: 4096>
crash: invalid kernel virtual address: ffffffffa0634180 type: "module struct"
WARNING: cannot access vmalloc'd module memory
...
Without the patch, the module virtual address translation succeeds:
...
please wait... (gathering module symbol data)module: ffffffffa0634180
<readmem: ffffffffa0634180, KVADDR, "module struct", 1408, (ROE|Q), f705e0>
<readmem: 103000, PHYSADDR, "pgd page", 4096, (FOE), 25d7b50>
<read_diskdump: addr: 103000 paddr: 103000 cnt: 4096>
<readmem: 105000, PHYSADDR, "pmd page", 4096, (FOE), 25d8b60>
<read_diskdump: addr: 105000 paddr: 105000 cnt: 4096>
<readmem: d9bfb0000, PHYSADDR, "page table", 4096, (FOE), 25d9b70>
<read_diskdump: addr: d9bfb0000 paddr: d9bfb0000 cnt: 4096>
<read_diskdump: addr: ffffffffa0634180 paddr: d9bfb3180 cnt: 1408>
...
So it appears to be reading from the wrong starting page table location,
i.e., from "pud page 4f8000" instead of "pgd page 103000".
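To make the last steps of the successful trace concrete, the sketch below
shows how the final physical address is composed once the walk reaches the
"page table" page. It assumes 4 KiB pages and 8-byte entries; the sk_* names
are hypothetical, and the page base 0xd9bfb3000 is inferred from the final
paddr d9bfb3180 in the trace rather than printed by crash.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT   12
#define SK_OFFSET_MASK  ((UINT64_C(1) << SK_PAGE_SHIFT) - 1)
#define SK_PTE_INDEX(v) (((v) >> SK_PAGE_SHIFT) & UINT64_C(0x1ff))

/* Physical address of the PTE slot for vaddr inside a page-table page. */
static uint64_t sk_pte_slot_paddr(uint64_t table_paddr, uint64_t vaddr)
{
    return table_paddr + SK_PTE_INDEX(vaddr) * sizeof(uint64_t);
}

/* Combine the page base taken from the PTE with the in-page offset. */
static uint64_t sk_final_paddr(uint64_t page_base, uint64_t vaddr)
{
    return (page_base & ~SK_OFFSET_MASK) | (vaddr & SK_OFFSET_MASK);
}
```

For ffffffffa0634180 the in-page offset is 0x180, so a PTE pointing at page
0xd9bfb3000 yields d9bfb3180, matching the last read_diskdump line. A walk
that starts from the wrong top-level page never reaches this step.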
Also, several Xen kernels failed with segmentation violations during
session initialization. They all fail here in x86_64_xendump_load_page(),
when "*pgd" gets referenced:
static char *
x86_64_xendump_load_page(ulong kvaddr, struct xendump_data *xd)
{
ulong mfn;
ulong *pgd, *pud, *pmd, *ptep;
pgd = ((ulong *)machdep->pgd) + pgd_index(kvaddr);
mfn = ((*pgd) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();
^^^^
Here is the relevant part of the gdb trace of a 2.6.18-based Xen kernel:
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000502748 in x86_64_xendump_load_page
(kvaddr=kvaddr@entry=18446744071568498888, xd=0xf521a0 <xendump_data>,
xd=0xf521a0 <xendump_data>) at x86_64.c:7003
7003 mfn = ((*pgd) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7.x86_64
libgcc-4.8.5-11.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64 lzo-2.06-8.el7.x86_64
ncurses-libs-5.9-13.20130511.el7.x86_64 snappy-1.1.0-3.el7.x86_64
xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x0000000000502748 in x86_64_xendump_load_page
(kvaddr=kvaddr@entry=18446744071568498888, xd=0xf521a0 <xendump_data>,
xd=0xf521a0 <xendump_data>) at x86_64.c:7003
#1 0x0000000000503191 in x86_64_xendump_p2m_create (xd=0xf521a0 <xendump_data>) at
x86_64.c:6749
#2 0x0000000000565d4e in xc_core_create_pfn_tables () at xendump.c:1258
#3 xc_core_read (addr=<optimized out>, paddr=7080864, cnt=32, bufptr=0xf70f80
<shared_bufs>) at xendump.c:168
#4 read_xendump (fd=<optimized out>, bufptr=0xf70f80 <shared_bufs>, cnt=32,
addr=<optimized out>, paddr=7080864) at xendump.c:836
#5 0x000000000047b038 in readmem (addr=18446744071569148832, memtype=memtype@entry=1,
buffer=buffer@entry=0xf70f80 <shared_bufs>,
size=size@entry=32, type=type@entry=0x94dcc3 "possible",
error_handle=error_handle@entry=2) at memory.c:2233
#6 0x00000000004ea33e in cpu_maps_init () at kernel.c:903
#7 kernel_init () at kernel.c:118
#8 0x0000000000467e5a in main_loop () at main.c:768
#9 0x000000000069dad3 in captured_command_loop (data=data@entry=0x0) at main.c:258
#10 0x000000000069c37a in catch_errors (func=func@entry=0x69dac0
<captured_command_loop>, func_args=func_args@entry=0x0,
errstring=errstring@entry=0x8e713f "", mask=mask@entry=6) at
exceptions.c:557
#11 0x000000000069ea66 in captured_main (data=data@entry=0x7ffd637c92a0) at main.c:1064
#12 0x000000000069c37a in catch_errors (func=func@entry=0x69dda0 <captured_main>,
func_args=func_args@entry=0x7ffd637c92a0,
errstring=errstring@entry=0x8e713f "", mask=mask@entry=6) at
exceptions.c:557
#13 0x000000000069edc7 in gdb_main (args=0x7ffd637c92a0) at main.c:1079
#14 gdb_main_entry (argc=<optimized out>, argv=argv@entry=0x7ffd637c9408) at
main.c:1099
#15 0x00000000004f0604 in gdb_main_loop (argc=<optimized out>, argc@entry=3,
argv=argv@entry=0x7ffd637c9408) at gdb_interface.c:76
#16 0x00000000004662c5 in main (argc=3, argv=0x7ffd637c9408) at main.c:707
(gdb) p pgd
$1 = (ulong *) 0xfffffffc054f4210
(gdb)
I haven't investigated further, but in all of the Xen cases, the
value of "pgd" was a kernel virtual address, as shown in the
example above.
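As a debugging aid, a pointer like this can be classified before it is
dereferenced. The sketch below is only a heuristic and is not part of
crash: it assumes 48-bit canonical addresses (4-level paging on the host
running crash), and sk_is_upper_half() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With 48-bit virtual addresses, a canonical upper-half (kernel-style)
 * address has bits 63..47 all set. A pointer into a buffer owned by the
 * crash process itself should fall in the lower half instead.
 */
static bool sk_is_upper_half(uint64_t addr)
{
    return (addr >> 47) == UINT64_C(0x1ffff);
}
```

Under that assumption, the faulting value 0xfffffffc054f4210 is classified
as upper-half, while a local buffer address such as 0x25d6c08 is not.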
However, without the patch, the function looks like this, and with
my debug printf of "pml4", the address is a user-space address as
expected:
static char *
x86_64_xendump_load_page(ulong kvaddr, struct xendump_data *xd)
{
ulong mfn;
ulong *pml4, *pgd, *pmd, *ptep;
pml4 = ((ulong *)machdep->machspec->pml4) + pml4_index(kvaddr);
mfn = ((*pml4) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();
fprintf(fp, "x86_64_xendump_load_page: pml4: %lx\n", pml4);
...
So for example, with the debug statement, I see this:
# crash vmlinux-2.6.18-1.2714.el5xen.gz xguest-crashdump
crash 7.2.1rc26
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
x86_64_xendump_load_page: pml4: 25d6c08
x86_64_xendump_load_page: pml4: 25d6c08
KERNEL: vmlinux-2.6.18-1.2714.el5xen.gz
DUMPFILE: xguest-crashdump
...
In a private email, I will send you a pointer to where I have temporarily
stored the 2 vmlinux/vmcore pairs shown above. I'm thinking that it will
probably be fairly easy for you to figure out what's happening in both cases.
Again, I very much appreciate the work you have undertaken here.
Thanks,
Dave