[RFC PATCH 00/15] Support module memory layout change on Linux 6.4
by HAGIO KAZUHITO(萩尾 一仁)
This patchset adds support for the module memory layout change
introduced in Linux 6.4 by kernel commit [1]. Without the patchset,
crash cannot even start a session and fails with an error message like
this:
crash: invalid structure member offset: module_core_size
FILE: kernel.c LINE: 3787 FUNCTION: module_init()
(With the current crash, you can use the "crash --no_modules" option,
which disables module functionality, to avoid the startup failure.)
This patchset is also located at GitHub [2].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit...
[2] https://github.com/k-hagio/crash/tree/6.4-module.wip2
Most of the module and symbol functions will probably work, though some
functions may still lack fixes and I may have misunderstood parts of the
crash code. (Some functions are also left unfixed because crash does not
use them for recent kernels.)
Please let me know about any bugs, and any comments on the design,
coding style, etc.
* The current patchset is a draft and a kind of PoC: fixes are piled up
and there is no code or performance optimization yet. I will rearrange
them later.
* Currently enum mod_mem_type is backported from the kernel as is,
because I'm not sure whether it is likely to change soon.
* The new module memory areas are scattered, and managed by the
following struct load_module members.
struct load_module {
...
/* For 6.4 module_memory */
struct module_memory mem[MOD_MEM_NUM_TYPES];
struct syment **symtable;
struct syment **symend;
struct syment *ext_symtable[MOD_MEM_NUM_TYPES];
struct syment *ext_symend[MOD_MEM_NUM_TYPES];
struct syment *load_symtable[MOD_MEM_NUM_TYPES];
struct syment *load_symend[MOD_MEM_NUM_TYPES];
int address_order[MOD_MEM_NUM_TYPES];
int nr_mems;
};
* For now, "sym -M" output is grouped per module, and the modules are
ordered by their text start address. So if you want all module symbols
in address order, you need to sort the output, but then the symbols of
different modules will be interleaved.
crash> sym -M | grep MODULE # displayed per module
...
ffffffffc046f000 MODULE TEXT START: dm_mirror
ffffffffc0472000 MODULE TEXT END: dm_mirror
ffffffffc0473000 MODULE DATA START: dm_mirror
ffffffffc0475000 MODULE DATA END: dm_mirror
ffffffffc0476000 MODULE RODATA START: dm_mirror
ffffffffc0478000 MODULE RODATA END: dm_mirror
ffffffffc044b000 MODULE RO_AFTER_INIT START: libata # lower than the previous
ffffffffc044c000 MODULE RO_AFTER_INIT END: libata
ffffffffc0479000 MODULE TEXT START: libata
ffffffffc049d000 MODULE TEXT END: libata
ffffffffc049e000 MODULE DATA START: libata
ffffffffc04c8000 MODULE DATA END: libata
...
crash> sym -M | grep MODULE | sort # displayed in address order
...
ffffffffc0468000 MODULE RODATA START: dm_region_hash
ffffffffc046a000 MODULE RODATA END: dm_region_hash
ffffffffc046b000 MODULE RODATA START: t10_pi
ffffffffc046c000 MODULE RODATA END: t10_pi
ffffffffc046d000 MODULE TEXT START: ghash_clmulni_intel
ffffffffc046e000 MODULE TEXT END: ghash_clmulni_intel
ffffffffc046f000 MODULE TEXT START: dm_mirror
ffffffffc0472000 MODULE TEXT END: dm_mirror
...
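The per-module address ordering discussed above could be computed along
the following lines. This is only a minimal sketch, not code from the
patchset: struct region, NR_TYPES, region_cmp() and
build_address_order() are hypothetical names, and the real code would
work on the load_module members listed earlier.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_TYPES 4	/* hypothetical; the patchset uses MOD_MEM_NUM_TYPES */

struct region {
	unsigned long long base;	/* start address of the region */
	unsigned long long size;	/* 0 if the region is unused */
};

/* Array being sorted; qsort()'s comparator has no context argument. */
static const struct region *sort_regions;

static int region_cmp(const void *a, const void *b)
{
	unsigned long long x = sort_regions[*(const int *)a].base;
	unsigned long long y = sort_regions[*(const int *)b].base;

	return (x > y) - (x < y);
}

/*
 * Fill order[] with the indexes of the used regions, lowest base first.
 * Returns the number of used regions.
 */
int build_address_order(const struct region *mem, int *order)
{
	int i, nr = 0;

	assert(mem != NULL && order != NULL);

	for (i = 0; i < NR_TYPES; i++)
		if (mem[i].size)
			order[nr++] = i;

	sort_regions = mem;
	qsort(order, nr, sizeof(order[0]), region_cmp);
	return nr;
}
```

With the libata regions shown above, the RO_AFTER_INIT region would sort
before TEXT and DATA. The file-scope pointer keeps the sketch short but
makes it non-reentrant.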
Kazuhito Hagio (15):
Add support for struct module_memory on Linux 6.4 and later
Support "sym -l|-M|-m" options
Make "sym -m" option print symbols in address order
Fix verify_module() and next_module_vaddr()
Fix {lowest,highest}_modules_address() and is_kernel_text()
Support "mod -s|-S" options
Support percpu symbols for "sym" options
Support "mod -d|-D" options
Support "sym -n" option
Support "sym -p" option
Fix module_symbol() and is_kernel_text()
Remove unused find_mod_etext() in store_module_symbols_v3()
Fix get_section, check_for_dups, symbol_query, symbol_name_count
Fix symbol_search_next, symbol_complete_match and get_syment_array
mod: Change "BASE" on header to "TEXT_BASE" to clarify
defs.h | 45 ++
gdb-10.2.patch | 16 +
kernel.c | 47 +-
memory.c | 36 +-
symbols.c | 1571 +++++++++++++++++++++++++++++++++++++++++++++---
5 files changed, 1616 insertions(+), 99 deletions(-)
--
2.31.1
1 year, 6 months
[Question] crash-arm64 cannot determine VA_BITS_ACTUAL for qemu dump-guest-memory
by Qiwu.Chen
Dear Maintainers,
I have a problem: the latest crash tool built for ARM64 cannot load the vmcore generated from a QEMU ARM64 guest OS.
1) The vmcore was captured with the "dump-guest-memory" command in QEMU monitor mode:
(qemu) dump-guest-memory vmcore
$ file vmcore
vmcore: ELF 64-bit LSB core file, ARM aarch64, version 1 (SYSV), SVR4-style
The qemu-system-aarch64 installed on my host OS is version 6.2.0:
$ qemu-system-aarch64 --version
QEMU emulator version 6.2.0
2) Loading fails (NG) for both Linux 5.0 and 5.15 with KASLR disabled, but loading a Linux 4.0 vmcore works.
Here's the error log from crash loading the linux-5.15 vmcore:
$ crash64 vmlinux vmcore
crash64 8.0.3++
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
crash64: cannot determine VA_BITS_ACTUAL
3) It seems crash cannot get vabits_actual from the vmcore, so I appended "vabits_actual=48" to the crash command line:
$ crash-arm64 vmlinux vmcore -m vabits_actual=48
The result shows seek errors for some symbols:
crash-arm64: seek error: kernel virtual address: ffff800011769918 type: "possible"
WARNING: cannot read cpu_possible_map
crash-arm64: seek error: kernel virtual address: ffff800011769958 type: "present"
WARNING: cannot read cpu_present_map
crash-arm64: seek error: kernel virtual address: ffff800011769938 type: "online"
WARNING: cannot read cpu_online_map
crash-arm64: seek error: kernel virtual address: ffff800011769978 type: "active"
WARNING: cannot read cpu_active_map
crash-arm64: seek error: kernel virtual address: ffff800011936390 type: "shadow_timekeeper xtime_sec"
xtime timespec.tv_sec: 55fcd060743a: Thu Feb 11 04:14:50 CST 2997960
crash-arm64: seek error: kernel virtual address: ffff800011771268 type: "init_uts_ns"
The attached file is the detailed log produced by appending "-d 1" to the crash command line.
Could you please help with this?
Thanks
1 year, 6 months
[PATCH v2] Output prompt when stdin is not a TTY.
by Hsin-Yi Wang
When stdin is not a TTY, the prompt ("crash> ") is not displayed. If
another process interacts with crash through piped stdin/stdout, it will
not get the prompt as a delimiter.
Compared to other debuggers like gdb, crash seems intended to print a
prompt in this case at the beginning of process_command_line(): if
pc->flags does NOT have any of
READLINE|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE set, a
prompt should be printed. However, the check is never true, since
READLINE is set unconditionally in setup_environment().
It makes more sense to change the READLINE flag in the check to TTY
instead. Besides this change, the prompt in process_command_line() should
only be printed when crash is not in the middle of processing an input
file while recovering from a previous FATAL command, because in that
case the prompt is displayed by exec_input_file().
Additionally, when stdin is not a TTY, echo the user's command line
after the prompt, which gives more context.
The prompt and command line can be suppressed with the silent (-s) flag.
Signed-off-by: Hsin-Yi Wang <hsinyi(a)chromium.org>
---
v1: https://listman.redhat.com/archives/crash-utility/2023-May/010740.html
v1->v2:
1. remove additional prompt when recovering from FATAL command from
file.
2. fix a few space/tab indent.
---
cmdline.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/cmdline.c b/cmdline.c
index ded6551..b7f919a 100644
--- a/cmdline.c
+++ b/cmdline.c
@@ -64,8 +64,8 @@ process_command_line(void)
fp = stdout;
BZERO(pc->command_line, BUFSIZE);
- if (!(pc->flags &
- (READLINE|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE)))
+ if (!pc->ifile_in_progress && !(pc->flags &
+ (TTY|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE)))
fprintf(fp, "%s", pc->prompt);
fflush(fp);
@@ -136,12 +136,16 @@ process_command_line(void)
add_history(pc->command_line);
check_special_handling(pc->command_line);
- } else {
- if (fgets(pc->command_line, BUFSIZE-1, stdin) == NULL)
+ } else {
+ if (fgets(pc->command_line, BUFSIZE-1, stdin) == NULL)
clean_exit(1);
+ if (!(pc->flags & SILENT)) {
+ fprintf(fp, "%s", pc->command_line);
+ fflush(fp);
+ }
clean_line(pc->command_line);
strcpy(pc->orig_line, pc->command_line);
- }
+ }
/*
* First clean out all linefeeds and leading/trailing spaces.
--
2.41.0.rc0.172.g3f132b7071-goog
1 year, 6 months
[Question] page excluded: kernel virtual address
by Qiwu.Chen
Dear Maintainers,
I have found a persistent problem with Linux 5.x ARM64 kdump: the error "page excluded: kernel virtual address: xxx" occurs when reading the address of ext4_super_block for an ext4 filesystem in the latest crash utility debugging environment.
1) Here are my steps to reproduce:
crash64> mount
MOUNT SUPERBLK TYPE DEVNAME DIRNAME
ffff000001e65180 ffff000001c1c000 rootfs none /
ffff00002ea66000 ffff00000288e000 ext4 /dev/root /
crash64> struct super_block.s_fs_info -x ffff00000288e000
s_fs_info = 0xffff000002885000,
crash64> struct ext4_sb_info.s_es -x 0xffff000002885000
s_es = 0xffff0000043c2400,
crash64> struct ext4_sb_info.s_es -x 0xffff0000043c2400
struct: page excluded: kernel virtual address: ffff0000043c2400 type: "gdb_readmem_callback"
crash64> rd 0xffff0000043c2400
rd: page excluded: kernel virtual address: ffff0000043c2400 type: "64-bit KVADDR"
crash64> kmem -p 0xffff0000043c2400
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffc000010f080 443c2000 ffff000002f93af0 0 2 ffff00000022036 referenced,uptodate,lru,active,private,mappedtodisk
crash64> vtop 0xffff0000043c2400
VIRTUAL PHYSICAL
ffff0000043c2400 443c2400
PAGE DIRECTORY: ffff80001163b000
PGD: ffff80001163b000 => 180000007fff9803
PUD: ffff00003fff9000 => 180000007fff8803
PMD: ffff00003fff8108 => 180000007ffe0803
PTE: ffff00003ffe0e10 => 680000443c2f07
PAGE: 443c2000
PTE PHYSICAL FLAGS
680000443c2f07 443c2000 (VALID|SHARED|AF|NG|PXN|UXN)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffc000010f080 443c2000 ffff000002f93af0 0 2 ffff00000022036 referenced,uptodate,lru,active,private,mappedtodisk
2) Here's the virtual kernel memory layout of my tested arm64 kernel, version 5.15; we can see that 0xffff0000043c2400 is in the kernel linear memory region:
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffff800008000000 - 0xffff800010000000 ( 128 MB)
[ 0.000000] vmalloc : 0xffff800010000000 - 0xfffffbfff0000000 (126975 GB)
[ 0.000000] .text : 0xffff800010000000 - 0xffff8000111a0000 ( 18048 KB)
[ 0.000000] .init : 0xffff800011640000 - 0xffff800011760000 ( 1152 KB)
[ 0.000000] .rodata : 0xffff8000111a0000 - 0xffff800011636000 ( 4696 KB)
[ 0.000000] .data : 0xffff800011760000 - 0xffff800011902200 ( 1673 KB)
[ 0.000000] .bss : 0xffff800011902200 - 0xffff8000119c6fb0 ( 788 KB)
[ 0.000000] fixed : 0xfffffbfffdbf9000 - 0xfffffbfffe000000 ( 4124 KB)
[ 0.000000] PCI I/O : 0xfffffbfffe800000 - 0xfffffbffff800000 ( 16 MB)
[ 0.000000] vmemmap : 0xfffffc0000000000 - 0xfffffe0000000000 ( 2048 GB maximum)
[ 0.000000] 0xfffffc0000000000 - 0xfffffc0001000000 ( 16 MB actual)
[ 0.000000] memory : 0xffff000000000000 - 0xffff000040000000 ( 1024 MB)
[ 0.000000] PAGE_OFFSET : 0xffff000000000000
[ 0.000000] KIMAGE_VADDR : 0xffff800010000000
[ 0.000000] kimage_voffset : 0xffff7fffcfe00000
[ 0.000000] PHYS_OFFSET : 0x40000000
[ 0.000000] start memory : 0x40000000
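The claim that the address lies in the linear map can be checked by hand
from the values printed above. The sketch below is only arithmetic on
those values, not crash code: the helper names are hypothetical, and the
PTE mask assumes the output address sits in bits [47:12] (4K pages,
48-bit physical addresses).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET	0xffff000000000000ULL	/* from the boot log above */
#define PHYS_OFFSET	0x40000000ULL		/* from the boot log above */
#define LINEAR_SIZE	0x40000000ULL		/* "memory" range: 1024 MB */

/* Nonzero if vaddr falls inside the "memory" (linear map) range. */
int in_linear_map(uint64_t vaddr)
{
	return vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LINEAR_SIZE;
}

/* Linear-map translation: physical = virtual - PAGE_OFFSET + PHYS_OFFSET. */
uint64_t linear_to_phys(uint64_t vaddr)
{
	assert(in_linear_map(vaddr));
	return vaddr - PAGE_OFFSET + PHYS_OFFSET;
}

/* Assumed layout: output address in PTE bits [47:12]. */
uint64_t pte_to_page(uint64_t pte)
{
	return pte & 0x0000fffffffff000ULL;
}
```

For 0xffff0000043c2400 this gives physical address 0x443c2400, and the
PTE value 680000443c2f07 gives page 0x443c2000, both of which agree with
the vtop output. So the address translation itself looks consistent.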
I am not sure whether this is a problem in the crash utility. Could anybody please help with this?
Thanks
1 year, 6 months
[PATCH] Output prompt when stdin is not a TTY.
by Hsin-Yi Wang
When stdin is not a TTY, the prompt ("crash> ") is not displayed. If
another process interacts with crash through piped stdin/stdout, it will
not get the prompt as a delimiter.
Compared to other debuggers like gdb, crash seems intended to print a
prompt in this case at the beginning of process_command_line(): if
pc->flags does NOT have any of
READLINE|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE set, a
prompt should be printed. However, the check is never true, since
READLINE is set unconditionally in setup_environment(). It makes more
sense to change the READLINE flag in the check to TTY instead.
Additionally, when stdin is not a TTY, echo the user's command line
after the prompt, which gives more context.
The prompt and command line can be suppressed with the silent (-s) flag.
Signed-off-by: Hsin-Yi Wang <hsinyi(a)chromium.org>
---
v1:
- The original discussion: https://listman.redhat.com/archives/crash-utility/2023-May/010710.html
- Kazu: provide the idea to print the command line as well.
---
cmdline.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/cmdline.c b/cmdline.c
index ded6551..9821d86 100644
--- a/cmdline.c
+++ b/cmdline.c
@@ -65,7 +65,7 @@ process_command_line(void)
BZERO(pc->command_line, BUFSIZE);
if (!(pc->flags &
- (READLINE|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE)))
+ (TTY|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE)))
fprintf(fp, "%s", pc->prompt);
fflush(fp);
@@ -139,6 +139,10 @@ process_command_line(void)
} else {
if (fgets(pc->command_line, BUFSIZE-1, stdin) == NULL)
clean_exit(1);
+ if (!(pc->flags & SILENT)) {
+ fprintf(fp, "%s", pc->command_line);
+ fflush(fp);
+ }
clean_line(pc->command_line);
strcpy(pc->orig_line, pc->command_line);
}
--
2.41.0.rc0.172.g3f132b7071-goog
1 year, 6 months
Question on prompt behavior
by Hsin-Yi Wang
Hi crash-utility community,
When stdin is not a TTY, but all the other flags remain the same, the
prompt ("crash> ") is not displayed. An example use case is when the
stdin of crash is replaced by a piped fd connected to another process.
In process_command_line()[1], crash checks whether pc->flags does NOT
have any of the flags
READLINE|SILENT|CMDLINE_IFILE|RCHOME_IFILE|RCLOCAL_IFILE set; if so, a
prompt should be printed.
But in setup_environment(), pc->flags is set to have the READLINE
flag[2], so the above check is never true.
Should READLINE be set for all cases in setup_environment()?
- If true, should the check in process_command_line() look for TTY
instead of READLINE? Since if pc->flags has TTY, [2] won't be true and
the prompt will be printed later in TTY's case[3].
- If false, what would be the proper place and conditions to set READLINE?
Or is the current behavior intended? I may not fully understand the
design logic. Any explanations are appreciated.
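To make the question concrete, here is a minimal sketch of the two
variants of the check. The bit values and the F_* and function names are
made up for illustration (crash defines the real flags in defs.h), and
F_IFILES stands in for the three *_IFILE flags.

```c
#include <assert.h>

/* Illustrative bit values only; not the ones crash uses. */
#define F_TTY		(1UL << 0)
#define F_READLINE	(1UL << 1)
#define F_SILENT	(1UL << 2)
#define F_IFILES	(1UL << 3)	/* stands in for the *_IFILE flags */

/* Current check: print the prompt only if none of these flags is set. */
int prompt_with_readline_check(unsigned long flags)
{
	return !(flags & (F_READLINE | F_SILENT | F_IFILES));
}

/* Alternative check: the same test with TTY in place of READLINE. */
int prompt_with_tty_check(unsigned long flags)
{
	return !(flags & (F_TTY | F_SILENT | F_IFILES));
}
```

If READLINE is always set, the first function returns 0 for every input,
while the second returns 1 for a piped stdin (READLINE set, TTY clear)
and 0 when TTY is set.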
Thanks!
[1] https://github.com/crash-utility/crash/blob/05a3a328fcd8920e49926b6d1c9c8...
[2] https://github.com/crash-utility/crash/blob/8246dce99dd23457e8c7a3fe9609c...
[3] https://github.com/crash-utility/crash/blob/05a3a328fcd8920e49926b6d1c9c8...
1 year, 6 months
[PATCH 1/2] diskdump/netdump: fix segmentation fault caused by failure of stopping CPUs
by HATAYAMA Daisuke
There's no NMI on ARM. Hence, stopping the non-panicking CPUs from the
panicking CPU via IPI can easily fail if interrupts are masked at that
moment. Moreover, crash_notes are not initialized for such unstopped
CPUs, and the corresponding NT_PRSTATUS notes are not attached to the
vmcore. However, the crash utility never takes such uninitialized
crash_notes into consideration and ends up mapping the wrong
NT_PRSTATUS notes to the CPUs that were actually not stopped. This
corrupt mapping can cause a segmentation fault in the crash utility in
operations that use the register values in NT_PRSTATUS notes.
For example:
crash> bt 1408
PID: 1408 TASK: ffff000003e22200 CPU: 2 COMMAND: "repro"
Segmentation fault (core dumped)
crash> help -D
diskdump_data:
filename: 127.0.0.1-2023-05-26-02:21:27/vmcore-ld1
flags: 46 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED|LZO_SUPPORTED)
...snip...
notes_buf: 1815df0
num_vmcoredd_notes: 0
num_prstatus_notes: 5
notes[0]: 1815df0 (NT_PRSTATUS)
si.signo: 0 si.code: 0 si.errno: 0
...snip...
PSTATE: 80400005 FPVALID: 00000000
notes[4]: 1808f10 (NT_PRSTATUS)
Segmentation fault (core dumped)
To fix this issue, map an NT_PRSTATUS note to a CPU only if the
corresponding crash_notes has been verified to be initialized.
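The effect of the extra condition can be illustrated with a standalone
sketch. It is not the patch code: map_notes() and its parameters are
hypothetical, the per-CPU arrays stand in for in_cpu_map() and
have_crash_notes(), and the bound check on the note index is an addition
made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assign the next note in the dump to each CPU that is online AND has
 * an initialized crash_notes entry. CPUs without a note keep NULL.
 * Returns the number of notes consumed.
 */
int map_notes(const int *online, const int *has_notes, int nr_cpus,
	      void **notes, int nr_notes, void **percpu)
{
	int i, j = 0;

	assert(nr_cpus >= 0 && nr_notes >= 0);

	for (i = 0; i < nr_cpus; i++) {
		percpu[i] = NULL;
		if (online[i] && has_notes[i] && j < nr_notes)
			percpu[i] = notes[j++];
	}
	return j;
}
```

With five online CPUs of which CPU 2 was never stopped, the four notes
go to CPUs 0, 1, 3 and 4 and CPU 2 stays unmapped, instead of CPU 2
receiving the note that belongs to CPU 3.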
Signed-off-by: HATAYAMA Daisuke <d.hatayama(a)fujitsu.com>
---
defs.h | 1 +
diskdump.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
netdump.c | 5 ++++-
3 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/defs.h b/defs.h
index 12ad6aa..72129a1 100644
--- a/defs.h
+++ b/defs.h
@@ -7111,6 +7111,7 @@ int dumpfile_is_split(void);
void show_split_dumpfiles(void);
void x86_process_elf_notes(void *, unsigned long);
void *diskdump_get_prstatus_percpu(int);
+int have_crash_notes(int cpu);
void map_cpus_to_prstatus_kdump_cmprs(void);
void diskdump_display_regs(int, FILE *);
void process_elf32_notes(void *, ulong);
diff --git a/diskdump.c b/diskdump.c
index cf5f5d9..11d29d3 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -101,6 +101,55 @@ int dumpfile_is_split(void)
return KDUMP_SPLIT();
}
+int have_crash_notes(int cpu)
+{
+ ulong crash_notes, notes_ptr;
+ char *buf, *p;
+ Elf64_Nhdr *note = NULL;
+
+ if (!readmem(symbol_value("crash_notes"),
+ KVADDR,
+ &crash_notes,
+ sizeof(crash_notes),
+ "crash_notes",
+ RETURN_ON_ERROR)) {
+ error(WARNING, "cannot read \"crash_notes\"\n");
+ return FALSE;
+ }
+
+ if (symbol_exists("__per_cpu_offset"))
+ notes_ptr = crash_notes + kt->__per_cpu_offset[cpu];
+ else
+ notes_ptr = crash_notes;
+
+ buf = GETBUF(SIZE(note_buf));
+
+ if (!readmem(notes_ptr,
+ KVADDR,
+ buf,
+ SIZE(note_buf),
+ "note_buf_t",
+ RETURN_ON_ERROR)) {
+ error(WARNING, "cpu %d: cannot read NT_PRSTATUS note\n", cpu);
+ return FALSE;
+ }
+
+ note = (Elf64_Nhdr *)buf;
+ p = buf + sizeof(Elf64_Nhdr);
+
+ if (note->n_type != NT_PRSTATUS) {
+ error(WARNING, "cpu %d: invalid NT_PRSTATUS note (n_type != NT_PRSTATUS)\n", cpu);
+ return FALSE;
+ }
+
+ if (!STRNEQ(p, "CORE")) {
+ error(WARNING, "cpu %d: invalid NT_PRSTATUS note (name != \"CORE\")\n", cpu);
+ return FALSE;
+ }
+
+ return TRUE;
+}
+
void
map_cpus_to_prstatus_kdump_cmprs(void)
{
@@ -131,7 +180,7 @@ map_cpus_to_prstatus_kdump_cmprs(void)
nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
for (i = 0, j = 0; i < nrcpus; i++) {
- if (in_cpu_map(ONLINE_MAP, i)) {
+ if (in_cpu_map(ONLINE_MAP, i) && have_crash_notes(i)) {
dd->nt_prstatus_percpu[i] = nt_ptr[j++];
dd->num_prstatus_notes =
MAX(dd->num_prstatus_notes, i+1);
diff --git a/netdump.c b/netdump.c
index 01af145..b272984 100644
--- a/netdump.c
+++ b/netdump.c
@@ -99,8 +99,11 @@ map_cpus_to_prstatus(void)
nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
for (i = 0, j = 0; i < nrcpus; i++) {
- if (in_cpu_map(ONLINE_MAP, i))
+ if (in_cpu_map(ONLINE_MAP, i) && have_crash_notes(i)) {
nd->nt_prstatus_percpu[i] = nt_ptr[j++];
+ nd->num_prstatus_notes =
+ MAX(nd->num_prstatus_notes, i+1);
+ }
}
FREEBUF(nt_ptr);
--
2.25.1
1 year, 6 months
[PATCH RFC] arm64: show zero pfn information when using vtop
by Rongwei Wang
Currently vtop cannot tell us that a page is the zero page
when a PTE or PMD maps the ZERO PAGE. This patch makes
vtop show this information directly, like:
crash> vtop -c 13674 ffff8917e000
VIRTUAL PHYSICAL
ffff8917e000 836e71000
PAGE DIRECTORY: ffff000802f8d000
PGD: ffff000802f8dff8 => 884e29003
PUD: ffff000844e29ff0 => 884e93003
PMD: ffff000844e93240 => 840413003
PTE: ffff000800413bf0 => 160000836e71fc3
PAGE: 836e71000 (ZERO PAGE)
PTE PHYSICAL FLAGS
160000836e71fc3 836e71000 (VALID|USER|RDONLY|SHARED|AF|NG|PXN|UXN|SPECIAL)
VMA START END FLAGS FILE
ffff000844f51860 ffff8917c000 ffff8957d000 100073
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffe001fbb9c40 836e71000 0 0 1 2ffffc000001000 reserved
If a huge page is found:
crash> vtop -c 14538 ffff95800000
VIRTUAL PHYSICAL
ffff95800000 910c00000
PAGE DIRECTORY: ffff000801fa0000
PGD: ffff000801fa0ff8 => 884f53003
PUD: ffff000844f53ff0 => 8426cb003
PMD: ffff0008026cb560 => 60000910c00fc1
PAGE: 910c00000 (2MB, ZERO PAGE)
PTE PHYSICAL FLAGS
60000910c00fc1 910c00000 (VALID|USER|RDONLY|SHARED|AF|NG|PXN|UXN)
VMA START END FLAGS FILE
ffff0000caa711e0 ffff956a9000 ffff95aaa000 100073
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffe0023230000 910c00000 0 0 1 6ffffc000010000 head
The output seems sensible with this patch.
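The comparison that the patch repeats at each page-table level can be
summarized in one helper. This is a sketch under assumptions, not part
of the patch: is_zero_page() and format_page_line() are hypothetical
names, and treating a pfn of 0 as "not available" replaces the
kernel_symbol_exists() test used in the patch.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned long long u64;

/* A pfn of 0 is taken to mean the kernel symbol was missing or unreadable. */
int is_zero_page(u64 page, u64 zero_pfn, int pageshift)
{
	return zero_pfn != 0 && page == (zero_pfn << pageshift);
}

/*
 * Format the "PAGE:" line. size_label is NULL for a base page, or a
 * string such as "2MB" or "512MB" for a section mapping; zero_pfn is
 * the matching (base or huge) zero pfn.
 */
int format_page_line(char *buf, size_t len, u64 page,
		     const char *size_label, u64 zero_pfn, int pageshift)
{
	int zero = is_zero_page(page, zero_pfn, pageshift);

	assert(buf != NULL && len > 0);

	if (size_label)
		return snprintf(buf, len, " PAGE: %llx (%s%s)", page,
				size_label, zero ? ", ZERO PAGE" : "");
	return snprintf(buf, len, " PAGE: %llx%s", page,
			zero ? " (ZERO PAGE)" : "");
}
```

A single helper of this kind would let each arm64_vtop_*() function keep
one fprintf() call per level instead of three.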
Signed-off-by: Rongwei Wang <rongwei.wang(a)linux.alibaba.com>
---
arm64.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++------
defs.h | 5 ++++
2 files changed, 88 insertions(+), 9 deletions(-)
diff --git a/arm64.c b/arm64.c
index 56fb841..264572d 100644
--- a/arm64.c
+++ b/arm64.c
@@ -419,7 +419,25 @@ arm64_init(int when)
/* use machdep parameters */
arm64_calc_phys_offset();
arm64_calc_physvirt_offset();
-
+
+ if (kernel_symbol_exists("zero_pfn")) {
+ ulong zero_pfn = 0;
+
+ if (readmem(symbol_value("zero_pfn"), KVADDR,
+ &zero_pfn, sizeof(zero_pfn),
+ "read zero_pfn", QUIET|RETURN_ON_ERROR))
+ machdep->zero_pfn = zero_pfn;
+ }
+
+ if (kernel_symbol_exists("huge_zero_pfn")) {
+ ulong huge_zero_pfn = 0;
+
+ if (readmem(symbol_value("huge_zero_pfn"), KVADDR,
+ &huge_zero_pfn, sizeof(huge_zero_pfn),
+ "read huge_zero_pfn", QUIET|RETURN_ON_ERROR))
+ machdep->huge_zero_pfn = huge_zero_pfn;
+ }
+
if (CRASHDEBUG(1)) {
if (machdep->flags & NEW_VMEMMAP)
fprintf(fp, "kimage_voffset: %lx\n",
@@ -1787,7 +1805,14 @@ arm64_vtop_2level_64k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if ((pgd_val & PMD_TYPE_MASK) == PMD_TYPE_SECT) {
ulong sectionbase = (pgd_val & SECTION_PAGE_MASK_512MB) & PHYS_MASK;
if (verbose) {
- fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
+ if (kernel_symbol_exists("huge_zero_pfn")) {
+ if (sectionbase == (HUGE_ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (512MB, ZERO PAGE)\n\n",
+ HUGE_ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
+ } else
+ fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
arm64_translate_pte(pgd_val, 0, 0);
}
*paddr = sectionbase + (vaddr & ~SECTION_PAGE_MASK_512MB);
@@ -1806,7 +1831,14 @@ arm64_vtop_2level_64k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if (pte_val & PTE_VALID) {
*paddr = (PAGEBASE(pte_val) & PHYS_MASK) + PAGEOFFSET(vaddr);
if (verbose) {
- fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ if (kernel_symbol_exists("zero_pfn")) {
+ if (PAGEBASE(*paddr) == (ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (ZERO PAGE)\n\n",
+ ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ } else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
arm64_translate_pte(pte_val, 0, 0);
}
} else {
@@ -1859,7 +1891,14 @@ arm64_vtop_3level_64k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if ((pmd_val & PMD_TYPE_MASK) == PMD_TYPE_SECT) {
ulong sectionbase = PTE_TO_PHYS(pmd_val) & SECTION_PAGE_MASK_512MB;
if (verbose) {
- fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
+ if (kernel_symbol_exists("huge_zero_pfn")) {
+ if (sectionbase == (HUGE_ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (512MB, ZERO PAGE)\n\n",
+ HUGE_ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
+ } else
+ fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
arm64_translate_pte(pmd_val, 0, 0);
}
*paddr = sectionbase + (vaddr & ~SECTION_PAGE_MASK_512MB);
@@ -1878,7 +1917,14 @@ arm64_vtop_3level_64k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if (pte_val & PTE_VALID) {
*paddr = PTE_TO_PHYS(pte_val) + PAGEOFFSET(vaddr);
if (verbose) {
- fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ if (kernel_symbol_exists("zero_pfn")) {
+ if (PAGEBASE(*paddr) == (ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (ZERO PAGE)\n\n",
+ ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ } else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
arm64_translate_pte(pte_val, 0, 0);
}
} else {
@@ -1940,7 +1986,14 @@ arm64_vtop_3level_4k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if ((pmd_val & PMD_TYPE_MASK) == PMD_TYPE_SECT) {
ulong sectionbase = (pmd_val & SECTION_PAGE_MASK_2MB) & PHYS_MASK;
if (verbose) {
- fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
+ if (kernel_symbol_exists("huge_zero_pfn")) {
+ if (sectionbase == (HUGE_ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (2MB, ZERO PAGE)\n\n",
+ HUGE_ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
+ } else
+ fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
arm64_translate_pte(pmd_val, 0, 0);
}
*paddr = sectionbase + (vaddr & ~SECTION_PAGE_MASK_2MB);
@@ -1959,7 +2012,14 @@ arm64_vtop_3level_4k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if (pte_val & PTE_VALID) {
*paddr = (PAGEBASE(pte_val) & PHYS_MASK) + PAGEOFFSET(vaddr);
if (verbose) {
- fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ if (kernel_symbol_exists("zero_pfn")) {
+ if (PAGEBASE(*paddr) == (ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (ZERO PAGE)\n\n",
+ ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ } else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
arm64_translate_pte(pte_val, 0, 0);
}
} else {
@@ -2029,7 +2089,14 @@ arm64_vtop_4level_4k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if ((pmd_val & PMD_TYPE_MASK) == PMD_TYPE_SECT) {
ulong sectionbase = (pmd_val & SECTION_PAGE_MASK_2MB) & PHYS_MASK;
if (verbose) {
- fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
+ if (kernel_symbol_exists("huge_zero_pfn")) {
+ if (sectionbase == (HUGE_ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (2MB, ZERO PAGE)\n\n",
+ HUGE_ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
+ } else
+ fprintf(fp, " PAGE: %lx (2MB)\n\n", sectionbase);
arm64_translate_pte(pmd_val, 0, 0);
}
*paddr = sectionbase + (vaddr & ~SECTION_PAGE_MASK_2MB);
@@ -2048,7 +2115,14 @@ arm64_vtop_4level_4k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
if (pte_val & PTE_VALID) {
*paddr = (PAGEBASE(pte_val) & PHYS_MASK) + PAGEOFFSET(vaddr);
if (verbose) {
- fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ if (kernel_symbol_exists("zero_pfn")) {
+ if (PAGEBASE(*paddr) == (ZEROPFN() << PAGESHIFT()))
+ fprintf(fp, " PAGE: %lx (ZERO PAGE)\n\n",
+ ZEROPFN() << PAGESHIFT());
+ else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+ } else
+ fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
arm64_translate_pte(pte_val, 0, 0);
}
} else {
diff --git a/defs.h b/defs.h
index 12ad6aa..4ed2d0a 100644
--- a/defs.h
+++ b/defs.h
@@ -1071,6 +1071,8 @@ struct machdep_table {
void (*show_interrupts)(int, ulong *);
int (*is_page_ptr)(ulong, physaddr_t *);
int (*get_cpu_reg)(int, int, const char *, int, void *);
+ ulong zero_pfn;
+ ulong huge_zero_pfn;
};
/*
@@ -2999,6 +3001,9 @@ struct load_module {
#define VIRTPAGEBASE(X) (((ulong)(X)) & (ulong)machdep->pagemask)
#define PHYSPAGEBASE(X) (((physaddr_t)(X)) & (physaddr_t)machdep->pagemask)
+#define ZEROPFN() (machdep->zero_pfn)
+#define HUGE_ZEROPFN() (machdep->huge_zero_pfn)
+
/*
* Sparse memory stuff
* These must follow the definitions in the kernel mmzone.h
--
2.27.0
1 year, 6 months
Re: [Crash-utility] [PATCH v3] arm64/x86_64: show zero pfn information when using vtop
by lijiang
Hi, Rongwei
Thank you for the patch.
On Tue, May 16, 2023 at 8:00 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Tue, 16 May 2023 19:40:54 +0800
> From: Rongwei Wang <rongwei.wang(a)linux.alibaba.com>
> To: crash-utility(a)redhat.com, k-hagio-ab(a)nec.com
> Subject: [Crash-utility] [PATCH v3] arm64/x86_64: show zero pfn
> information when using vtop
> Message-ID: <20230516114054.63844-1-rongwei.wang(a)linux.alibaba.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> Now vtop can not show us the page is zero pfn
> when PTE or PMD has attached ZERO PAGE. This
> patch supports show this information directly
> when using vtop, likes:
>
> crash> vtop -c 13674 ffff8917e000
> VIRTUAL PHYSICAL
> ffff8917e000 836e71000
>
> PAGE DIRECTORY: ffff000802f8d000
> PGD: ffff000802f8dff8 => 884e29003
> PUD: ffff000844e29ff0 => 884e93003
> PMD: ffff000844e93240 => 840413003
> PTE: ffff000800413bf0 => 160000836e71fc3
> PAGE: 836e71000 (ZERO PAGE)
> ...
>
> If huge page found:
>
> crash> vtop -c 14538 ffff95800000
> VIRTUAL PHYSICAL
> ffff95800000 910c00000
>
> PAGE DIRECTORY: ffff000801fa0000
> PGD: ffff000801fa0ff8 => 884f53003
> PUD: ffff000844f53ff0 => 8426cb003
> PMD: ffff0008026cb560 => 60000910c00fc1
> PAGE: 910c00000 (2MB, ZERO PAGE)
> ...
>
>
I did some tests on x86_64 and aarch64 machines, and got the following
results.
[1] On x86_64, it does not print "ZERO PAGE" when using 1G huge pages
(but it works for 2M huge pages):
crash> vtop -c 2763 7fdfc0000000
VIRTUAL PHYSICAL
7fdfc0000000 300000000
PGD: 23b9ae7f8 => 8000000235031067
PUD: 235031bf8 => 80000003000008e7
PAGE: 300000000 (1GB)
PTE PHYSICAL FLAGS
80000003000008e7 300000000 (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
VMA START END FLAGS FILE
ffff9d65fc8a85c0 7fdfc0000000 7fe000000000 84400fb /mnt/hugetlbfs/test
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffef30cc000000 300000000 ffff9d65f5c35850 0 2 57ffffc001000c uptodate,dirty,head
crash> help -v|grep zero
zero_paddr: 221a37000
huge_zero_paddr: 240000000
[2] On aarch64, it does not print "ZERO PAGE":
crash> vtop -c 23390 ffff8d600000
VIRTUAL PHYSICAL
ffff8d600000 cc800000
PAGE DIRECTORY: ffff224ba02d9000
PGD: ffff224ba02d9ff8 => 80000017b38f003
PUD: ffff224b7b38fff0 => 80000017b38e003
PMD: ffff224b7b38e358 => e80000cc800f41
PAGE: cc800000 (2MB)
PTE PHYSICAL FLAGS
e80000cc800f41 cc800000 (VALID|USER|SHARED|AF|NG|PXN|UXN|DIRTY)
VMA START END FLAGS FILE
ffff224bb315f678 ffff8d600000 ffff8d800000 4400fb /mnt/hugetlbfs/test
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffc892b320000 cc800000 ffff224b5c48ac90 0 2 7ffff80001000c uptodate,dirty,head
crash> help -v|grep zero
zero_paddr: 142662000
huge_zero_paddr: 111400000
I have one question: can this patch print "ZERO PAGE" on x86_64 when using
1G huge pages? Or is this the expected behavior on x86_64?
Also, it does not work on my aarch64 machine. Did I miss anything?
Thanks
Lianbo
1 year, 6 months
[PATCH] Fix "kmem -v" option displaying no regions on Linux 6.3 and later
by HAGIO KAZUHITO(萩尾 一仁)
Kernel commit 869176a09606 ("mm/vmalloc.c: add flags to mark vm_map_ram
area"), which is contained in Linux 6.3 and later, added "flags" member
to struct vmap_area. This was the revival of the "flags" member as
kernel commit 688fcbfc06e4 had eliminated it before.
As a result, crash started to use the old procedure using the member and
displays no vmalloc'd regions, because it does not have the same flag
value as the old one.
crash> kmem -v
VMAP_AREA VM_STRUCT ADDRESS RANGE SIZE
crash>
To fix this, also check if vmap_area.purge_list exists, which was
introduced with the flags and removed later, to determine that the flags
member is the old one.
Related vmap_area history:
v2.6.28 db64fe02258f introduced vmap_area with flags and purge_list
v5.4 688fcbfc06e4 removed flags
v5.11 96e2db456135 removed purge_list
v6.3 869176a09606 added flags again
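The history above reduces to a two-input decision, sketched here in
isolation. vmap_flags_is_legacy() is a hypothetical name; in crash the
inputs correspond to VALID_MEMBER(vmap_area_flags) and
VALID_MEMBER(vmap_area_purge_list).

```c
#include <assert.h>

/*
 * Decide whether vmap_area.flags is the old (pre-v5.4) member, for
 * which entries whose flags differ from VM_VM_AREA are skipped.
 * has_flags / has_purge_list say whether the member exists in the
 * kernel being analyzed.
 */
int vmap_flags_is_legacy(int has_flags, int has_purge_list)
{
	assert((has_flags == 0 || has_flags == 1) &&
	       (has_purge_list == 0 || has_purge_list == 1));
	return has_flags && has_purge_list;
}
```

Only the v2.6.28 to v5.3 range has both members, so only there is the
flags value checked; on v6.3 and later the flags member exists without
purge_list and the old check is bypassed.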
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
defs.h | 1 +
memory.c | 4 +++-
symbols.c | 1 +
3 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/defs.h b/defs.h
index 21cc760444d1..bfa07c3f5150 100644
--- a/defs.h
+++ b/defs.h
@@ -2216,6 +2216,7 @@ struct offset_table { /* stash of commonly-used offsets */
long in6_addr_in6_u;
long kset_kobj;
long subsys_private_subsys;
+ long vmap_area_purge_list;
};
struct size_table { /* stash of commonly-used sizes */
diff --git a/memory.c b/memory.c
index 953fc380c03c..15fa8b2f08f1 100644
--- a/memory.c
+++ b/memory.c
@@ -429,6 +429,7 @@ vm_init(void)
MEMBER_OFFSET_INIT(vmap_area_vm, "vmap_area", "vm");
if (INVALID_MEMBER(vmap_area_vm))
MEMBER_OFFSET_INIT(vmap_area_vm, "vmap_area", "private");
+ MEMBER_OFFSET_INIT(vmap_area_purge_list, "vmap_area", "purge_list");
STRUCT_SIZE_INIT(vmap_area, "vmap_area");
if (VALID_MEMBER(vmap_area_va_start) &&
VALID_MEMBER(vmap_area_va_end) &&
@@ -9063,7 +9064,8 @@ dump_vmap_area(struct meminfo *vi)
readmem(ld->list_ptr[i], KVADDR, vmap_area_buf,
SIZE(vmap_area), "vmap_area struct", FAULT_ON_ERROR);
- if (VALID_MEMBER(vmap_area_flags)) {
+ if (VALID_MEMBER(vmap_area_flags) &&
+ VALID_MEMBER(vmap_area_purge_list)) {
flags = ULONG(vmap_area_buf + OFFSET(vmap_area_flags));
if (flags != VM_VM_AREA)
continue;
diff --git a/symbols.c b/symbols.c
index f0721023816d..7b1d59203b90 100644
--- a/symbols.c
+++ b/symbols.c
@@ -9169,6 +9169,7 @@ dump_offset_table(char *spec, ulong makestruct)
OFFSET(vmap_area_vm));
fprintf(fp, " vmap_area_flags: %ld\n",
OFFSET(vmap_area_flags));
+ fprintf(fp, " vmap_area_purge_list: %ld\n", OFFSET(vmap_area_purge_list));
fprintf(fp, " module_size_of_struct: %ld\n",
OFFSET(module_size_of_struct));
--
2.31.1
1 year, 7 months