[PATCH v3] arm64: update the modules/vmalloc/vmemmap ranges
by Huang Shijie
< 1 > The background.
The current crash code is still based on kernel v4.20, while the latest kernel is now v5.17-rc4.
The MODULE/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
The changes from kernel v4.20 to v5.17 are listed below:
1.) The current crash code is based on kernel v4.20.
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
2.) In kernel v5.0, the following patch adds a new BPF JIT region:
"91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
The layout did not change until v5.4.
3.) In kernel v5.4, several patches change the layout, such as:
"ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
"14c127c957c1 arm64: mm: Flip kernel VA space"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
In v5.7, the patch:
"bbd6ec605c arm64/mm: Enable memory hot remove"
adds VMEMMAP_END.
4.) In kernel v5.11, several patches change the layout, such as:
"9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
"f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
5.) In kernel v5.17-rc1, after the patch
"b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
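The v5.17 macros above can be evaluated by hand. The following is only a sketch under stated assumptions: VA_BITS = VA_BITS_MIN = 48, 4K pages (page shift 12), a 64-byte struct page (shift 6), and MODULES_VSIZE = 128MB as used by the patch code. The struct and function names are hypothetical and are not part of crash or the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_128M (128ULL << 20)
#define SZ_256M (256ULL << 20)

/* Hypothetical container; the field names mirror the kernel macros. */
struct va_layout {
	uint64_t modules_vaddr, modules_end;
	uint64_t vmalloc_start, vmalloc_end;
	uint64_t vmemmap_start, vmemmap_end;
};

/* _PAGE_END(va): start of the upper half of a va-bit kernel address space. */
static uint64_t page_end(unsigned va_bits)
{
	return -(1ULL << (va_bits - 1));
}

/* Evaluate the v5.17 macros for one configuration (all inputs are assumptions). */
static struct va_layout layout_v5_17(unsigned va_bits, unsigned va_bits_min,
				     unsigned page_shift, unsigned struct_page_shift)
{
	struct va_layout l;
	unsigned vmemmap_shift;
	uint64_t page_offset, vmemmap_size;

	assert(va_bits_min >= 1 && va_bits_min <= va_bits && va_bits < 64);
	assert(page_shift > struct_page_shift);

	vmemmap_shift = page_shift - struct_page_shift;
	page_offset = -(1ULL << va_bits);
	vmemmap_size = (page_end(va_bits_min) - page_offset) >> vmemmap_shift;

	l.modules_vaddr = page_end(va_bits_min);
	l.modules_end = l.modules_vaddr + SZ_128M;
	l.vmalloc_start = l.modules_end;
	l.vmemmap_start = -(1ULL << (va_bits - vmemmap_shift));
	l.vmemmap_end = l.vmemmap_start + vmemmap_size;
	l.vmalloc_end = l.vmemmap_start - SZ_256M;
	return l;
}
```

Calling layout_v5_17(48, 48, 12, 6) places the modules region at ffff800000000000 and the vmemmap at fffffc0000000000 under these assumptions.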
< 2 > What does this patch do?
1.) Use arm64_get_struct_page_size() to get the size of struct page{} in PRE_GDB.
2.) If the above step succeeds, call arm64_get_va_range() to
get the proper kernel virtual ranges.
In arm64_get_va_range(), the ranges are calculated by hooks for
different kernel versions:
get_range: arm64_get_range_v5_17,
get_range: arm64_get_range_v5_11,
get_range: arm64_get_range_v5_4,
get_range: arm64_get_range_v5_0,
3.) If the above steps succeed, arm64_calc_virtual_memory_ranges()
is skipped. If they fail, arm64_calc_virtual_memory_ranges()
does its work as before.
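The version-based hook selection in step 2 can be sketched as a table lookup over half-open version ranges. This is an illustration only: KVER() and the enum are hypothetical stand-ins (crash has its own LINUX() macro and function pointers), and the version encoding is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical version encoding, assumed to order versions like LINUX(x,y,z). */
#define KVER(x, y, z) \
	(((unsigned long)(x) << 16) + ((unsigned long)(y) << 8) + (unsigned long)(z))

enum layout_gen { LAYOUT_NONE, LAYOUT_V5_0, LAYOUT_V5_4, LAYOUT_V5_11, LAYOUT_V5_17 };

struct range_handler {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	enum layout_gen gen;
};

static const struct range_handler handlers[] = {
	{ KVER(5, 17, 0), KVER(6, 0, 0),  LAYOUT_V5_17 },
	{ KVER(5, 11, 0), KVER(5, 17, 0), LAYOUT_V5_11 },
	{ KVER(5, 4, 0),  KVER(5, 11, 0), LAYOUT_V5_4 },
	{ KVER(5, 0, 0),  KVER(5, 4, 0),  LAYOUT_V5_0 },
};

/* Return the layout generation for a version, or LAYOUT_NONE to fall back. */
static enum layout_gen pick_layout(unsigned long version)
{
	size_t i;

	for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
		const struct range_handler *h = &handlers[i];

		assert(h->start < h->end);
		if (h->start <= version && version < h->end)
			return h->gen;
	}
	return LAYOUT_NONE;
}
```

A LAYOUT_NONE result corresponds to the case where the old arm64_calc_virtual_memory_ranges() path keeps doing the work.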
< 3 > Test this patch.
Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
(The CONFIG_KASAN is NOT set for this kernel.)
Before this patch, we get the wrong output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800048000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800040000000
modules_end: ffff800047ffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
After this patch, we can get the correct output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800010000000
vmalloc_end: fffffdffbfff0000
modules_vaddr: ffff800008000000
modules_end: ffff800010000000
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
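The "after" values can be cross-checked against the v5.4 macros listed in the background section. The sketch below assumes a configuration that is not stated in this mail but reproduces the output: 48-bit VA, 4K pages, a 64-byte struct page, PUD_SIZE = 1GB, a 128MB BPF JIT region and no KASAN. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K  0x00010000ULL
#define SZ_2M   0x00200000ULL
#define SZ_128M 0x08000000ULL
#define SZ_1G   0x40000000ULL

/* Hypothetical container for the v5.4 ranges (no KASAN). */
struct va_layout_v5_4 {
	uint64_t modules_vaddr, modules_end;
	uint64_t vmalloc_start, vmalloc_end;
	uint64_t vmemmap_start;
};

static struct va_layout_v5_4 layout_v5_4_nokasan(unsigned va_bits,
						 unsigned page_shift,
						 unsigned struct_page_shift,
						 uint64_t pud_size)
{
	struct va_layout_v5_4 l;
	uint64_t page_end, page_offset, vmemmap_size;

	assert(va_bits >= 1 && va_bits < 64);
	assert(page_shift > struct_page_shift);

	page_end = -(1ULL << (va_bits - 1));	/* _PAGE_END(VA_BITS_MIN) */
	page_offset = -(1ULL << va_bits);	/* PAGE_OFFSET after the VA flip */
	vmemmap_size = (page_end - page_offset) >> (page_shift - struct_page_shift);

	l.modules_vaddr = page_end + SZ_128M;	/* skip the BPF JIT region */
	l.modules_end = l.modules_vaddr + SZ_128M;
	l.vmalloc_start = l.modules_end;
	l.vmalloc_end = -pud_size - vmemmap_size - SZ_64K;
	l.vmemmap_start = -vmemmap_size - SZ_2M;
	return l;
}
```

With layout_v5_4_nokasan(48, 12, 6, SZ_1G) the five computed addresses match the modules, vmalloc and vmemmap_vaddr values printed above.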
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v2 --> v3:
Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
We should use ms->CONFIG_ARM64_VA_BITS to calculate the
vmemmap_vaddr, not ms->VA_BITS.
v1 --> v2:
The Crash code is based on v4.20 not v4.9.
Changed the commit message about it.
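To illustrate why the v3 fix matters, here is a small sketch of the VMEMMAP_START formula from v5.11 onwards. It assumes, hypothetically, that the configured VA size and the value in ms->VA_BITS can differ (for example 52 versus 48 with 64K pages); the function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* VMEMMAP_START = -(1 << (VA_BITS - VMEMMAP_SHIFT)), with
 * VMEMMAP_SHIFT = page shift - struct page shift. */
static uint64_t vmemmap_start(unsigned va_bits, unsigned page_shift,
			      unsigned struct_page_shift)
{
	unsigned vmemmap_shift;

	assert(page_shift > struct_page_shift);
	vmemmap_shift = page_shift - struct_page_shift;
	assert(va_bits > vmemmap_shift && va_bits - vmemmap_shift < 64);

	return -(1ULL << (va_bits - vmemmap_shift));
}
```

Under these assumptions, vmemmap_start(52, 16, 6) and vmemmap_start(48, 16, 6) give different addresses, so using the wrong field moves the whole vmemmap range.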
---
arm64.c | 362 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 354 insertions(+), 8 deletions(-)
diff --git a/arm64.c b/arm64.c
index de1038a..7d0884d 100644
--- a/arm64.c
+++ b/arm64.c
@@ -92,6 +92,13 @@ static void arm64_calc_VA_BITS(void);
static int arm64_is_uvaddr(ulong, struct task_context *);
static void arm64_calc_KERNELPACMASK(void);
+struct kernel_range {
+ unsigned long modules_vaddr, modules_end;
+ unsigned long vmalloc_start_addr, vmalloc_end;
+ unsigned long vmemmap_vaddr, vmemmap_end;
+};
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
+static void arm64_get_struct_page_size(void);
/*
* Do all necessary machine-specific setup here. This is called several times
@@ -219,6 +226,7 @@ arm64_init(int when)
machdep->pageoffset = machdep->pagesize - 1;
machdep->pagemask = ~((ulonglong)machdep->pageoffset);
+ arm64_get_struct_page_size();
arm64_calc_VA_BITS();
arm64_calc_KERNELPACMASK();
ms = machdep->machspec;
@@ -238,35 +246,47 @@ arm64_init(int when)
}
machdep->is_kvaddr = generic_is_kvaddr;
machdep->kvtop = arm64_kvtop;
+
+ /* The defaults */
+ ms->vmalloc_end = ARM64_VMALLOC_END;
+ ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
+ ms->vmemmap_end = ARM64_VMEMMAP_END;
+
if (machdep->flags & NEW_VMEMMAP) {
struct syment *sp;
+ struct kernel_range *r;
sp = kernel_symbol_search("_text");
ms->kimage_text = (sp ? sp->value : 0);
sp = kernel_symbol_search("_end");
ms->kimage_end = (sp ? sp->value : 0);
- if (ms->VA_BITS_ACTUAL) {
+ if (ASSIGN_SIZE(page) && (r = arm64_get_va_range(ms))) {
+ /* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
+ ms->modules_vaddr = r->modules_vaddr;
+ ms->modules_end = r->modules_end;
+ ms->vmalloc_start_addr = r->vmalloc_start_addr;
+ ms->vmalloc_end = r->vmalloc_end;
+ ms->vmemmap_vaddr = r->vmemmap_vaddr;
+ ms->vmemmap_end = r->vmemmap_end;
+ } else if (ms->VA_BITS_ACTUAL) {
ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
} else {
ms->modules_vaddr = ARM64_VA_START;
if (kernel_symbol_exists("kasan_init"))
ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
}
- ms->vmalloc_start_addr = ms->modules_end + 1;
-
arm64_calc_kimage_voffset();
} else {
ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
ms->modules_end = ARM64_PAGE_OFFSET - 1;
ms->vmalloc_start_addr = ARM64_VA_START;
}
- ms->vmalloc_end = ARM64_VMALLOC_END;
- ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
- ms->vmemmap_end = ARM64_VMEMMAP_END;
switch (machdep->pagesize)
{
@@ -387,7 +407,9 @@ arm64_init(int when)
break;
case POST_GDB:
- arm64_calc_virtual_memory_ranges();
+ /* Can we get the size of struct page before POST_GDB */
+ if (!ASSIGN_SIZE(page))
+ arm64_calc_virtual_memory_ranges();
arm64_get_section_size_bits();
if (!machdep->max_physmem_bits) {
@@ -494,6 +516,331 @@ arm64_init(int when)
}
}
+struct kernel_va_range_handler {
+ unsigned long kernel_versions_start; /* include */
+ unsigned long kernel_versions_end; /* exclude */
+ struct kernel_range *(*get_range)(struct machine_specific *);
+};
+
+static struct kernel_range tmp_range;
+#define _PAGE_END(va) (-(1UL << ((va) - 1)))
+#define SZ_64K 0x00010000
+#define SZ_2M 0x00200000
+
+/*
+ * Get the max shift of the size of struct page.
+ * Most of the time, it is 64 bytes, but not sure.
+ */
+static int arm64_get_struct_page_max_shift(void)
+{
+ unsigned long v = ASSIGN_SIZE(page);
+
+ if (16 < v && v <= 32)
+ return 5;
+ if (32 < v && v <= 64)
+ return 6;
+ if (64 < v && v <= 128)
+ return 7;
+
+ error(FATAL, "We should not have such struct page size:%d!\n", v);
+ return 0;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.17-rc1:
+ * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
+ */
+static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v);
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift();
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.11:
+ * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift();
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+static unsigned long arm64_get_pud_size(void)
+{
+ unsigned long PUD_SIZE = 0;
+
+ switch (machdep->pagesize) {
+ case 4096:
+ if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
+ PUD_SIZE = PUD_SIZE_L4_4K;
+ } else {
+ PUD_SIZE = PGDIR_SIZE_L3_4K;
+ }
+ break;
+
+ case 65536:
+ PUD_SIZE = PGDIR_SIZE_L2_64K;
+ default:
+ break;
+ }
+ return PUD_SIZE;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.4, such as:
+ * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
+ * "14c127c957c1 arm64: mm: Flip kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ char *string;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ /* See the arch/arm64/Kconfig*/
+ ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
+ if (ret != IKCONFIG_STR)
+ return NULL;
+ kasan_shadow_offset = atol(string);
+
+ r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset
+ + bpf_jit_size;
+ } else {
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift();
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
+ /*
+ * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
+ * adds the VMEMMAP_END.
+ */
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+ } else {
+ r->vmemmap_end = 0xffffffffffffffffUL;
+ }
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.0, such as:
+ * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
+ */
+static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, PUD_SIZE;
+ unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ unsigned long va_start, page_offset;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
+ page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
+ } else {
+ r->modules_vaddr = va_start + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift()));
+
+ r->vmemmap_vaddr = page_offset - vmemmap_size;
+ r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
+ return r;
+}
+
+static struct kernel_va_range_handler kernel_va_range_handlers[] = {
+ {
+ LINUX(5,17,0),
+ LINUX(6,0,0), /* Just a boundary, Change it later */
+ get_range: arm64_get_range_v5_17,
+ }, {
+ LINUX(5,11,0), LINUX(5,17,0),
+ get_range: arm64_get_range_v5_11,
+ }, {
+ LINUX(5,4,0), LINUX(5,11,0),
+ get_range: arm64_get_range_v5_4,
+ }, {
+ LINUX(5,0,0), LINUX(5,4,0),
+ get_range: arm64_get_range_v5_0,
+ },
+};
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+
+static unsigned long arm64_get_kernel_version(void)
+{
+ char *string;
+ char buf[BUFSIZE];
+ char *p1, *p2;
+
+ if (THIS_KERNEL_VERSION)
+ return THIS_KERNEL_VERSION;
+
+ string = pc->read_vmcoreinfo("OSRELEASE");
+ if (string) {
+ strcpy(buf, string);
+
+ p1 = p2 = buf;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[0] = atoi(p1);
+
+ p1 = ++p2;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[1] = atoi(p1);
+
+ p1 = ++p2;
+ while ((*p2 >= '0') && (*p2 <= '9'))
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[2] = atoi(p1);
+ }
+ free(string);
+ return THIS_KERNEL_VERSION;
+}
+
+/* Return NULL if we fail. */
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
+{
+ struct kernel_va_range_handler *h;
+ unsigned long kernel_version = THIS_KERNEL_VERSION;
+ int i;
+
+ if (!kernel_version) {
+ kernel_version = arm64_get_kernel_version();
+ if (!kernel_version)
+ return NULL;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
+ h = kernel_va_range_handlers + i;
+
+ /* Get the right kernel version */
+ if (h->kernel_versions_start <= kernel_version &&
+ kernel_version < h->kernel_versions_end) {
+
+ /* Get the correct virtual address ranges */
+ return h->get_range(ms);
+ }
+ }
+ return NULL;
+}
+
+/* Get the size of struct page {} */
+static void arm64_get_struct_page_size()
+{
+ char *string;
+
+ string = pc->read_vmcoreinfo("SIZE(page)");
+ if (string)
+ ASSIGN_SIZE(page) = atol(string);
+ free(string);
+}
+
/*
* Accept or reject a symbol from the kernel namelist.
*/
@@ -4255,7 +4602,6 @@ arm64_calc_VA_BITS(void)
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
-#define SZ_64K 0x00010000
static void
arm64_calc_virtual_memory_ranges(void)
--
2.30.2
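As a side note on arm64_get_kernel_version() in the patch above: the OSRELEASE string is split by hand there. A minimal alternative sketch using sscanf() is shown below; it is not what the patch does, and parse_osrelease() is a hypothetical helper. It rejects strings that do not start with three dot-separated numbers instead of scanning past the end of the buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse "major.minor.patch[suffix]" into ver[0..2].
 * Returns 0 on success, -1 if the string is missing or malformed.
 */
static int parse_osrelease(const char *s, int ver[3])
{
	assert(ver != NULL);

	if (s == NULL)
		return -1;
	if (sscanf(s, "%d.%d.%d", &ver[0], &ver[1], &ver[2]) != 3)
		return -1;
	return 0;
}
```

A trailing suffix such as "-rc4+" is simply ignored, because the third conversion stops at the first non-digit character.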
Re: [Crash-utility] [PATCH 1/2] ps: Add support to "ps -l" to properly display process list
by lijiang
Thank you for the patch, Austin.
On Fri, Feb 25, 2022 at 4:52 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Fri, 25 Feb 2022 07:19:32 +0000
> From: Austin Kim <austindh.kim(a)gmail.com>
> To: k-hagio-ab(a)nec.com, crash-utility(a)redhat.com
> Cc: kernel-team(a)lge.com, mikeseohyungjin(a)gmail.com
> Subject: [Crash-utility] [PATCH 1/2] ps: Add support to "ps -l" to
> properly display process list
> Message-ID: <20220225071932.GA1097@raspberrypi>
> Content-Type: text/plain; charset=us-ascii
>
> Sometimes kernel image is generated without CONFIG_SCHED_STAT or
> CONFIG_SCHED_INFO.
>
> Running crash-utility with above kernel image,
> "ps -l" options cannot display all processes sorted with most recently-run
> process
>
> crash> ps -l
> ps: last-run timestamps do not exist in this kernel
> Usage:
> ps [-k|-u|-G] [-s] [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
> [pid | task | command] ...
> Enter "help ps" for details.
>
> This is because output of 'ps -l' depends on
> task_struct.sched_info.last_arrival.
> Without CONFIG_SCHED_STAT or CONFIG_SCHED_INFO, 'sched_info' field is not
> included
> in task_struct.
>
> So we make the 'ps -l' option access the 'exec_start' field of sched_entity,
> where 'exec_start' is task_struct.se.exec_start.
>
> With this patch, "ps -l" option works well without CONFIG_SCHED_STAT or
> CONFIG_SCHED_INFO.
>
> The history of CONFIG_SCHED_INFO and CONFIG_SCHED_STAT is as below;
>
> - CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
> - CONFIG_SCHED_STAT: KERNEL_VERSION < LINUX(4,2,0)
>
>
Could you please add the kernel commit ID for the relevant changes?
In addition, I would suggest folding these two patches into one and
changing the subject to:
"ps: Add support to "ps -l|-m" to properly display process list". What do
you think?
> Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
> ---
> defs.h | 2 ++
> symbols.c | 2 ++
> task.c | 17 ++++++++++++++---
> 3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/defs.h b/defs.h
> index 7d386d2..ed2f5ca 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -1768,6 +1768,8 @@ struct offset_table { /* stash of
> commonly-used offsets */
> long vcpu_struct_rq;
> long task_struct_sched_info;
> long sched_info_last_arrival;
> + long task_struct_sched_entity;
> + long se_exec_start;
>
These can only be appended to the end of the offset_table.
For more details, refer to the section "writing patches" in the wiki:
https://github.com/crash-utility/crash/wiki
Thanks.
Lianbo
> long page_objects;
> long kmem_cache_oo;
> long char_device_struct_cdev;
> diff --git a/symbols.c b/symbols.c
> index 97fb778..5e2032a 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -8892,6 +8892,8 @@ dump_offset_table(char *spec, ulong makestruct)
> OFFSET(sched_rt_entity_run_list));
> fprintf(fp, " sched_info_last_arrival: %ld\n",
> OFFSET(sched_info_last_arrival));
> + fprintf(fp, " se_exec_start: %ld\n",
> + OFFSET(se_exec_start));
> fprintf(fp, " task_struct_thread_info: %ld\n",
> OFFSET(task_struct_thread_info));
> fprintf(fp, " task_struct_stack: %ld\n",
> diff --git a/task.c b/task.c
> index 864c838..e6fde74 100644
> --- a/task.c
> +++ b/task.c
> @@ -334,9 +334,15 @@ task_init(void)
> if (VALID_MEMBER(task_struct_sched_info))
> MEMBER_OFFSET_INIT(sched_info_last_arrival,
> "sched_info", "last_arrival");
> + MEMBER_OFFSET_INIT(task_struct_sched_entity, "task_struct", "se");
> + if (VALID_MEMBER(task_struct_sched_entity)) {
> + STRUCT_SIZE_INIT(sched_entity, "sched_entity");
> + MEMBER_OFFSET_INIT(se_exec_start, "sched_entity",
> "exec_start");
> + }
> if (VALID_MEMBER(task_struct_last_run) ||
> VALID_MEMBER(task_struct_timestamp) ||
> - VALID_MEMBER(sched_info_last_arrival)) {
> + VALID_MEMBER(sched_info_last_arrival) ||
> + VALID_MEMBER(se_exec_start)) {
> char buf[BUFSIZE];
> strcpy(buf, "alias last ps -l");
> alias_init(buf);
> @@ -3574,7 +3580,8 @@ cmd_ps(void)
> case 'l':
> if (INVALID_MEMBER(task_struct_last_run) &&
> INVALID_MEMBER(task_struct_timestamp) &&
> - INVALID_MEMBER(sched_info_last_arrival)) {
> + INVALID_MEMBER(sched_info_last_arrival) &&
> + INVALID_MEMBER(se_exec_start)) {
> error(INFO,
> "last-run timestamps do not exist in this
> kernel\n");
> argerrs++;
> @@ -6020,7 +6027,11 @@ task_last_run(ulong task)
> timestamp = tt->last_task_read ?
> ULONGLONG(tt->task_struct +
> OFFSET(task_struct_sched_info) +
> OFFSET(sched_info_last_arrival)) : 0;
> -
> + else if (VALID_MEMBER(se_exec_start))
> + timestamp = tt->last_task_read ?
> ULONGLONG(tt->task_struct +
> + OFFSET(task_struct_sched_entity) +
> + OFFSET(se_exec_start)) : 0;
> +
> return timestamp;
> }
>
> --
> 2.20.1
>
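The review comment about appending to offset_table can be illustrated with offsetof(). This is a sketch with three hypothetical cut-down structs, not the real offset_table; the assumption here is that separately built code (such as extension modules) relies on the existing member positions staying where they are.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the table before and after the change. */
struct table_old {
	long sched_info_last_arrival;
	long page_objects;
	long kmem_cache_oo;
};

struct table_inserted {		/* new members placed in the middle */
	long sched_info_last_arrival;
	long task_struct_sched_entity;
	long se_exec_start;
	long page_objects;
	long kmem_cache_oo;
};

struct table_appended {		/* new members placed at the end */
	long sched_info_last_arrival;
	long page_objects;
	long kmem_cache_oo;
	long task_struct_sched_entity;
	long se_exec_start;
};

/* How far page_objects moves, in bytes, when members are inserted. */
static long moved_by_insert(void)
{
	return (long)offsetof(struct table_inserted, page_objects) -
	       (long)offsetof(struct table_old, page_objects);
}

/* How far page_objects moves, in bytes, when members are appended. */
static long moved_by_append(void)
{
	return (long)offsetof(struct table_appended, page_objects) -
	       (long)offsetof(struct table_old, page_objects);
}
```

Inserting in the middle shifts every later member by two longs, while appending leaves all existing offsets unchanged.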
Re: [Crash-utility] [PATCH v2] Makefile: Change the behavior of target "cscope"
by lijiang
On Thu, Feb 24, 2022 at 10:32 AM <crash-utility-request(a)redhat.com> wrote:
> Date: Thu, 24 Feb 2022 10:23:56 +0000
> From: Huang Shijie <shijie(a)os.amperecomputing.com>
> To: k-hagio-ab(a)nec.com
> Cc: zwang(a)amperecomputing.com, patches(a)amperecomputing.com,
> lijiang(a)redhat.com, crash-utility(a)redhat.com
> Subject: [Crash-utility] [PATCH v2] Makefile: Change the behavior of
> target "cscope"
> Message-ID: <20220224102356.42157-1-shijie(a)os.amperecomputing.com>
> Content-Type: text/plain
>
> Make the "make cscope" only generate cscope index, not call the cscope.
>
> Also fix a typo:
> cscope_out --> cscope.out
>
> Acked-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
> Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
> ---
> v1 --> v2:
> Changed the title, added Kazu's Ack.
> ---
>
Thank you for the update, Shijie.
For v2: Applied.
Lianbo
> Makefile | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 2ca496d..007d030 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -639,10 +639,10 @@ ref:
> $(MAKE) ctags cscope
>
> cscope:
> - rm -f cscope.files cscope_out
> + rm -f cscope.files cscope.out
> for FILE in ${SOURCE_FILES}; do \
> echo $$FILE >> cscope.files; done
> - cscope
> + cscope -b -f cscope.out
>
> glink: make_configure
> @./configure -q -b
> --
> 2.30.2
>
Re: [Crash-utility] [PATCHv2] arm64: deduce the start address of kernel code, based on kernel version
by lijiang
On Fri, Feb 25, 2022 at 1:01 AM <crash-utility-request(a)redhat.com> wrote:
> Date: Thu, 24 Feb 2022 11:52:12 +0800
> From: Pingfan Liu <piliu(a)redhat.com>
> To: crash-utility(a)redhat.com
> Subject: [Crash-utility] [PATCHv2] arm64: deduce the start address of
> kernel code, based on kernel version
> Message-ID: <20220224035212.14186-1-piliu(a)redhat.com>
>
> After kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from
> permanent kernel mapping"), the range [_text, _stext] is reclaimed. But
> the current crash code still assumes kernel starting from "_text".
>
>
Thank you for the fix, Pingfan. Good findings.
The v2 looks good and the test is ok. Applied.
Lianbo
> This change only affects the vmalloced area on arm64 and may result a
> false in arm64_IS_VMALLOC_ADDR().
>
> Since vmcore has no extra information about this trival change, it can
> only be deduced from kernel version, which means ms->kimage_text can not
> be correctly initialized until kernel_init() finishes. Here on arm64, it
> can be done at the point machdep_init(POST_GDB). This is fine
> since there is no access to vmalloced area at this stage.
>
> Signed-off-by: Pingfan Liu <piliu(a)redhat.com>
> ---
> arm64.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/arm64.c b/arm64.c
> index de1038a..3ab8489 100644
> --- a/arm64.c
> +++ b/arm64.c
> @@ -92,6 +92,20 @@ static void arm64_calc_VA_BITS(void);
> static int arm64_is_uvaddr(ulong, struct task_context *);
> static void arm64_calc_KERNELPACMASK(void);
>
> +static void arm64_calc_kernel_start(void)
> +{
> + struct machine_specific *ms = machdep->machspec;
> + struct syment *sp;
> +
> + if (THIS_KERNEL_VERSION >= LINUX(5,11,0))
> + sp = kernel_symbol_search("_stext");
> + else
> + sp = kernel_symbol_search("_text");
> +
> + ms->kimage_text = (sp ? sp->value : 0);
> + sp = kernel_symbol_search("_end");
> + ms->kimage_end = (sp ? sp->value : 0);
> +}
>
> /*
> * Do all necessary machine-specific setup here. This is called several
> times
> @@ -241,6 +255,7 @@ arm64_init(int when)
> if (machdep->flags & NEW_VMEMMAP) {
> struct syment *sp;
>
> + /* It is finally decided in
> arm64_calc_kernel_start() */
> sp = kernel_symbol_search("_text");
> ms->kimage_text = (sp ? sp->value : 0);
> sp = kernel_symbol_search("_end");
> @@ -387,6 +402,8 @@ arm64_init(int when)
> break;
>
> case POST_GDB:
> + /* Rely on kernel version to decide the kernel start
> address */
> + arm64_calc_kernel_start();
> arm64_calc_virtual_memory_ranges();
> arm64_get_section_size_bits();
>
> --
> 2.31.1
>
[PATCH 2/2] ps: Add support to "ps -m" to display process list with timestamp
by Austin Kim
Sometimes the "ps -m" option cannot display the processes with their timestamp values:
crash> ps -m
ps: last-run timestamps do not exist in this kernel
Usage:
ps [-k|-u|-G] [-s] [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
[pid | task | command] ...
Enter "help ps" for details.
This is because the output of 'ps -m' depends on task_struct.sched_info.last_arrival.
Without CONFIG_SCHED_STAT or CONFIG_SCHED_INFO, the 'sched_info.last_arrival' field
is not included in task_struct.
So we make the 'ps -m' option access the 'exec_start' field of sched_entity,
where 'exec_start' is task_struct.se.exec_start.
With this patch, the "ps -m" option works without CONFIG_SCHED_STAT or
CONFIG_SCHED_INFO.
The history of CONFIG_SCHED_INFO and CONFIG_SCHED_STAT is as follows:
- CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
- CONFIG_SCHED_STAT: KERNEL_VERSION < LINUX(4,2,0)
Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
---
task.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/task.c b/task.c
index e6fde74..55e2312 100644
--- a/task.c
+++ b/task.c
@@ -3565,7 +3565,8 @@ cmd_ps(void)
case 'm':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(se_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
--
2.20.1
[PATCH 1/2] ps: Add support to "ps -l" to properly display process list
by Austin Kim
Sometimes a kernel image is built without CONFIG_SCHED_STAT or CONFIG_SCHED_INFO.
When running crash-utility against such a kernel image, the "ps -l" option
cannot display the processes sorted by most recently run:
crash> ps -l
ps: last-run timestamps do not exist in this kernel
Usage:
ps [-k|-u|-G] [-s] [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
[pid | task | command] ...
Enter "help ps" for details.
This is because the output of 'ps -l' depends on task_struct.sched_info.last_arrival.
Without CONFIG_SCHED_STAT or CONFIG_SCHED_INFO, the 'sched_info' field is not included
in task_struct.
So we make the 'ps -l' option access the 'exec_start' field of sched_entity,
where 'exec_start' is task_struct.se.exec_start.
With this patch, the "ps -l" option works without CONFIG_SCHED_STAT or
CONFIG_SCHED_INFO.
The history of CONFIG_SCHED_INFO and CONFIG_SCHED_STAT is as follows:
- CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
- CONFIG_SCHED_STAT: KERNEL_VERSION < LINUX(4,2,0)
Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
---
defs.h | 2 ++
symbols.c | 2 ++
task.c | 17 ++++++++++++++---
3 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/defs.h b/defs.h
index 7d386d2..ed2f5ca 100644
--- a/defs.h
+++ b/defs.h
@@ -1768,6 +1768,8 @@ struct offset_table { /* stash of commonly-used offsets */
long vcpu_struct_rq;
long task_struct_sched_info;
long sched_info_last_arrival;
+ long task_struct_sched_entity;
+ long se_exec_start;
long page_objects;
long kmem_cache_oo;
long char_device_struct_cdev;
diff --git a/symbols.c b/symbols.c
index 97fb778..5e2032a 100644
--- a/symbols.c
+++ b/symbols.c
@@ -8892,6 +8892,8 @@ dump_offset_table(char *spec, ulong makestruct)
OFFSET(sched_rt_entity_run_list));
fprintf(fp, " sched_info_last_arrival: %ld\n",
OFFSET(sched_info_last_arrival));
+ fprintf(fp, " se_exec_start: %ld\n",
+ OFFSET(se_exec_start));
fprintf(fp, " task_struct_thread_info: %ld\n",
OFFSET(task_struct_thread_info));
fprintf(fp, " task_struct_stack: %ld\n",
diff --git a/task.c b/task.c
index 864c838..e6fde74 100644
--- a/task.c
+++ b/task.c
@@ -334,9 +334,15 @@ task_init(void)
if (VALID_MEMBER(task_struct_sched_info))
MEMBER_OFFSET_INIT(sched_info_last_arrival,
"sched_info", "last_arrival");
+ MEMBER_OFFSET_INIT(task_struct_sched_entity, "task_struct", "se");
+ if (VALID_MEMBER(task_struct_sched_entity)) {
+ STRUCT_SIZE_INIT(sched_entity, "sched_entity");
+ MEMBER_OFFSET_INIT(se_exec_start, "sched_entity", "exec_start");
+ }
if (VALID_MEMBER(task_struct_last_run) ||
VALID_MEMBER(task_struct_timestamp) ||
- VALID_MEMBER(sched_info_last_arrival)) {
+ VALID_MEMBER(sched_info_last_arrival) ||
+ VALID_MEMBER(se_exec_start)) {
char buf[BUFSIZE];
strcpy(buf, "alias last ps -l");
alias_init(buf);
@@ -3574,7 +3580,8 @@ cmd_ps(void)
case 'l':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(se_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
@@ -6020,7 +6027,11 @@ task_last_run(ulong task)
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
OFFSET(task_struct_sched_info) +
OFFSET(sched_info_last_arrival)) : 0;
-
+ else if (VALID_MEMBER(se_exec_start))
+ timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
+ OFFSET(task_struct_sched_entity) +
+ OFFSET(se_exec_start)) : 0;
+
return timestamp;
}
--
2.20.1
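The fallback added to task_last_run() can be sketched in isolation. This is a simplified illustration, not the crash code: it models only the last two sources (sched_info.last_arrival, then se.exec_start), uses a hypothetical offsets struct with -1 meaning "member not present", and reads from a plain byte buffer standing in for the cached task_struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INVALID_OFFSET (-1L)

/* Hypothetical subset of the offset table used by this sketch. */
struct ts_offsets {
	long task_struct_sched_info;
	long sched_info_last_arrival;
	long task_struct_sched_entity;
	long se_exec_start;
};

/* Read a 64-bit value from the task buffer without alignment assumptions. */
static uint64_t read_u64(const unsigned char *buf, long off)
{
	uint64_t v;

	memcpy(&v, buf + off, sizeof(v));
	return v;
}

/* Prefer sched_info.last_arrival; fall back to se.exec_start; else 0. */
static uint64_t last_run(const unsigned char *task, const struct ts_offsets *o)
{
	assert(task != NULL && o != NULL);

	if (o->sched_info_last_arrival != INVALID_OFFSET)
		return read_u64(task, o->task_struct_sched_info +
				      o->sched_info_last_arrival);
	if (o->se_exec_start != INVALID_OFFSET)
		return read_u64(task, o->task_struct_sched_entity +
				      o->se_exec_start);
	return 0;
}
```

When neither member is available the sketch returns 0, which corresponds to the case where "ps -l" and "ps -m" still report that last-run timestamps do not exist.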
[PATCH] Fix sys command to display its help information correctly
by Lianbo Jiang
Sometimes, the sys command may be misused, but it doesn't display
the expected help information, for example:
Without the patch:
crash> sys kmem
NAME
kmem - kernel memory
SYNOPSIS
kmem [-f|-F|-c|-C|-i|-v|-V|-n|-z|-o|-h] [-p | -m member[,member]]
[[-s|-S|-S=cpu[s]|-r] [slab] [-I slab[,slab]]] [-g [flags]] [[-P] address]]
...
crash> sys abc
crash>
With the patch:
crash> sys kmem
Usage:
sys [-c [name|number]] [-t] [-i] config
Enter "help sys" for details.
crash> sys abc
Usage:
sys [-c [name|number]] [-t] [-i] config
Enter "help sys" for details.
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
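The before/after transcripts in the commit message can be modelled with a small sketch. It is only an illustration: `known_cmds`, `is_known_cmd()` and `help_target()` are hypothetical names, not crash's command table or the real `cmd_usage()` implementation. It assumes the old call looked up help by the argument string, while the patched call always reports on the running command.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command list; crash's real table is much larger. */
static const char *known_cmds[] = { "sys", "kmem", "ps", NULL };

/* Return 1 if name matches an entry in the hypothetical command list. */
static int is_known_cmd(const char *name)
{
	for (int i = 0; known_cmds[i]; i++)
		if (strcmp(known_cmds[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Return the command whose help text is shown when "sys" receives an
 * unrecognized argument, or NULL when nothing is printed.
 */
static const char *help_target(const char *curcmd, const char *arg, int patched)
{
	assert(curcmd != NULL && arg != NULL);

	if (patched)
		return curcmd;	/* always the command being run */

	/* Old behavior: keyed on the argument, silent if it is no command. */
	return is_known_cmd(arg) ? arg : NULL;
}
```

With this model, `help_target("sys", "kmem", 0)` yields `"kmem"` and `help_target("sys", "abc", 0)` yields `NULL`, matching the unpatched transcript, while the patched variant yields `"sys"` for both.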
kernel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel.c b/kernel.c
index 9c4aabffe580..1c6344735299 100644
--- a/kernel.c
+++ b/kernel.c
@@ -5476,7 +5476,7 @@ cmd_sys(void)
else if (STREQ(args[optind], "config"))
read_in_kernel_config(IKCFG_READ);
else
- cmd_usage(args[optind], COMPLETE_HELP);
+ cmd_usage(pc->curcmd, SYNOPSIS);
optind++;
} while (args[optind]);
}
--
2.20.1
[PATCH v2] Makefile: Change the behavior of target "cscope"
by Huang Shijie
Make "make cscope" generate only the cscope index, without launching cscope.
Also fix a typo:
cscope_out --> cscope.out
Acked-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v1 --> v2:
Changed the title, added Kazu's Ack.
---
Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index 2ca496d..007d030 100644
--- a/Makefile
+++ b/Makefile
@@ -639,10 +639,10 @@ ref:
$(MAKE) ctags cscope
cscope:
- rm -f cscope.files cscope_out
+ rm -f cscope.files cscope.out
for FILE in ${SOURCE_FILES}; do \
echo $$FILE >> cscope.files; done
- cscope
+ cscope -b -f cscope.out
glink: make_configure
@./configure -q -b
--
2.30.2
[PATCH] Makefile: fix the wrong target "cscope"
by Huang Shijie
"make cscope" should only generate the cscope index, not launch cscope.
Fix it.
Also fix a typo:
cscope_out --> cscope.out
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index 20b6e31..bf21c2a 100644
--- a/Makefile
+++ b/Makefile
@@ -642,10 +642,10 @@ ref:
$(MAKE) ctags cscope
cscope:
- rm -f cscope.files cscope_out
+ rm -f cscope.files cscope.out
for FILE in ${SOURCE_FILES}; do \
echo $$FILE >> cscope.files; done
- cscope
+ cscope -b -f cscope.out
glink: make_configure
@./configure -q -b
--
2.30.2
[PATCHv2] arm64: deduce the start address of kernel code, based on kernel version
by Pingfan Liu
After kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from
permanent kernel mapping"), the range [_text, _stext) is reclaimed, but
the current crash code still assumes that the kernel starts at "_text".
This change only affects the vmalloc area on arm64 and may cause
arm64_IS_VMALLOC_ADDR() to wrongly return false.
Since the vmcore carries no extra information about this trivial change, it
can only be deduced from the kernel version, which means ms->kimage_text
cannot be correctly initialized until kernel_init() finishes. On arm64, it
can be done at machdep_init(POST_GDB). This is fine since the vmalloc area
is not accessed at this stage.
Signed-off-by: Pingfan Liu <piliu(a)redhat.com>
---
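To illustrate the misclassification described in the commit message, here is a minimal standalone sketch. It assumes that the vmalloc check treats any address inside [kimage_text, kimage_end] as kernel image and therefore not vmalloc. `struct image_range`, `version_code()`, `image_start()` and `in_kernel_image()` are hypothetical names for illustration, not crash's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the kernel image bounds kept per machine. */
struct image_range {
	uint64_t kimage_text;	/* start of the mapped kernel image */
	uint64_t kimage_end;	/* end of the kernel image (_end) */
};

/* Pack a version triple so that versions compare as plain integers.
 * Assumes minor and patch are each below 256. */
static uint32_t version_code(unsigned major, unsigned minor, unsigned patch)
{
	return (major << 16) | (minor << 8) | patch;
}

/* Choose the image start by kernel version, following the patch's rule:
 * _stext from v5.11 on, _text before that. */
static uint64_t image_start(uint32_t version, uint64_t text, uint64_t stext)
{
	return version >= version_code(5, 11, 0) ? stext : text;
}

/* True when vaddr falls inside the range treated as kernel image. */
static bool in_kernel_image(const struct image_range *r, uint64_t vaddr)
{
	assert(r->kimage_text <= r->kimage_end);
	return vaddr >= r->kimage_text && vaddr <= r->kimage_end;
}
```

Under these assumptions, an address between `_text` and `_stext` is reported as kernel image when the start is taken from `_text`, but falls outside the image range once the start is taken from `_stext`, so a vmalloc check built on this range can accept it.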
arm64.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arm64.c b/arm64.c
index de1038a..3ab8489 100644
--- a/arm64.c
+++ b/arm64.c
@@ -92,6 +92,20 @@ static void arm64_calc_VA_BITS(void);
static int arm64_is_uvaddr(ulong, struct task_context *);
static void arm64_calc_KERNELPACMASK(void);
+static void arm64_calc_kernel_start(void)
+{
+ struct machine_specific *ms = machdep->machspec;
+ struct syment *sp;
+
+ if (THIS_KERNEL_VERSION >= LINUX(5,11,0))
+ sp = kernel_symbol_search("_stext");
+ else
+ sp = kernel_symbol_search("_text");
+
+ ms->kimage_text = (sp ? sp->value : 0);
+ sp = kernel_symbol_search("_end");
+ ms->kimage_end = (sp ? sp->value : 0);
+}
/*
* Do all necessary machine-specific setup here. This is called several times
@@ -241,6 +255,7 @@ arm64_init(int when)
if (machdep->flags & NEW_VMEMMAP) {
struct syment *sp;
+ /* It is finally decided in arm64_calc_kernel_start() */
sp = kernel_symbol_search("_text");
ms->kimage_text = (sp ? sp->value : 0);
sp = kernel_symbol_search("_end");
@@ -387,6 +402,8 @@ arm64_init(int when)
break;
case POST_GDB:
+ /* Rely on kernel version to decide the kernel start address */
+ arm64_calc_kernel_start();
arm64_calc_virtual_memory_ranges();
arm64_get_section_size_bits();
--
2.31.1