Re: [Crash-utility] [PATCH v6] arm64: update the modules/vmalloc/vmemmap ranges
by lijiang
Hi, ShiJie
Sorry for the late reply.
On Fri, Mar 4, 2022 at 3:21 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Fri, 4 Mar 2022 15:16:30 +0000
> From: Huang Shijie <shijie(a)os.amperecomputing.com>
> To: k-hagio-ab(a)nec.com
> Cc: lijiang(a)redhat.com, zwang(a)amperecomputing.com,
> darren(a)os.amperecomputing.com, patches(a)amperecomputing.com,
> crash-utility(a)redhat.com
> Subject: [Crash-utility] [PATCH v6] arm64: update the
> modules/vmalloc/vmemmap ranges
> Message-ID: <20220304151630.2364339-1-shijie(a)os.amperecomputing.com>
> Content-Type: text/plain
>
> < 1 > The background.
> The current crash code is still based on kernel v4.20, but the kernel
> is now at v5.17-rc4.
> The MODULE/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
>
> I list all the changes from kernel v4.20 to v5.17:
>
> 1.) The current crash code is based on kernel v4.20.
> The virtual memory layout looks like this:
>
> +--------------------------------------------------------------------+
> | KASAN | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
>
> 2.) In the kernel v5.0, a patch added a new BPF JIT region:
> "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
>
> The virtual memory layout looks like this:
>
> +--------------------------------------------------------------------+
> | KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
>
> The layout did not change until v5.4.
>
> 3.) In the kernel v5.4, several patches changed the layout, such as:
> "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
> "14c127c957c1 arm64: mm: Flip kernel VA space"
> and the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
>
> In the v5.7, the patch:
> "bbd6ec605c arm64/mm: Enable memory hot remove"
> adds the VMEMMAP_END.
>
> 4.) In the kernel v5.11, several patches changed the layout, such as:
> "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
> "f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
> and the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (VMEMMAP_START - SZ_256M)
>
> #define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> #define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
>
> 5.) In the kernel v5.17-rc1, after the patch
> "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
> the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (VMEMMAP_START - SZ_256M)
>
> #define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> #define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
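To make the v5.17 arithmetic concrete, a small standalone sketch (not crash code; VA_BITS=48, 4K pages, and a 64-byte struct page are assumed, and _PAGE_END is restated from the kernel headers) evaluates the macros above:

```c
#include <assert.h>
#include <stdint.h>

/* Standalone restatement of the v5.17 macros above. Assumed values:
 * VA_BITS = 48, 4K pages (page shift 12), sizeof(struct page) = 64
 * (struct-page shift 6), so VMEMMAP_SHIFT = 12 - 6 = 6. */
struct layout {
    uint64_t modules_vaddr, modules_end;
    uint64_t vmalloc_start, vmalloc_end;
    uint64_t vmemmap_start;
};

static uint64_t page_end(unsigned va_bits)
{
    return -(UINT64_C(1) << (va_bits - 1));      /* _PAGE_END(va_bits) */
}

static struct layout v5_17_layout(unsigned va_bits, unsigned page_shift,
                                  unsigned struct_page_shift)
{
    struct layout l;
    unsigned vmemmap_shift = page_shift - struct_page_shift;

    l.modules_vaddr = page_end(va_bits);                       /* MODULES_VADDR */
    l.modules_end   = l.modules_vaddr + (UINT64_C(128) << 20); /* + MODULES_VSIZE */
    l.vmemmap_start = -(UINT64_C(1) << (va_bits - vmemmap_shift));
    l.vmalloc_start = l.modules_end;                           /* VMALLOC_START */
    l.vmalloc_end   = l.vmemmap_start - (UINT64_C(256) << 20); /* - SZ_256M */
    return l;
}
```

With these assumptions the module region starts at 0xffff800000000000, i.e. _PAGE_END(48), and vmalloc runs from the end of the module region up to 256MB below the vmemmap base.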
>
> < 2 > What does this patch do?
> 1.) Use arm64_get_struct_page_size() to get the size of struct page{}
> in the PRE_GDB.
>
> 2.) If the above step succeeds, we will try to call arm64_get_va_range()
> to get the proper kernel virtual ranges.
>
> In arm64_get_va_range(), we calculate the ranges by the hooks of
> different kernel versions:
> get_range: arm64_get_range_v5_17,
> get_range: arm64_get_range_v5_11,
> get_range: arm64_get_range_v5_4,
> get_range: arm64_get_range_v5_0,
>
> 3.) If the above steps succeed, arm64_calc_virtual_memory_ranges()
> will be skipped. If they fail, arm64_calc_virtual_memory_ranges()
> will continue to do its work.
>
> < 3 > Test this patch.
> Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
> (The CONFIG_KASAN is NOT set for this kernel.)
>
> Before this patch, we get the wrong output from "help -m":
> ----------------------------------------------------------
> vmalloc_start_addr: ffff800048000000
> vmalloc_end: fffffdffbffeffff
> modules_vaddr: ffff800040000000
> modules_end: ffff800047ffffff
> vmemmap_vaddr: fffffdffffe00000
> vmemmap_end: ffffffffffffffff
> ----------------------------------------------------------
>
> After this patch, we can get the correct output from "help -m":
> ----------------------------------------------------------
> vmalloc_start_addr: ffff800010000000
> vmalloc_end: fffffdffbffeffff
> modules_vaddr: ffff800008000000
> modules_end: ffff80000fffffff
> vmemmap_vaddr: fffffdffffe00000
> vmemmap_end: ffffffffffffffff
> ----------------------------------------------------------
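These corrected numbers can be reproduced by hand from the v5.4 formulas above. A quick standalone sketch (not crash code; the config is an assumption matching the test kernel: VA_BITS=48, 4K pages, 64-byte struct page giving a vmem shift of 6, PUD_SIZE=1GB, CONFIG_KASAN off):

```c
#include <assert.h>
#include <stdint.h>

/* Reproduce the corrected "help -m" values from the v5.4 formulas.
 * Assumed config: VA_BITS = 48, 4K pages, sizeof(struct page) = 64
 * (so the vmem shift is 12 - 6 = 6), PUD_SIZE = 1GB, KASAN not set. */
#define SZ_64K_X  UINT64_C(0x10000)
#define SZ_2M_X   UINT64_C(0x200000)
#define BPF_JIT_X UINT64_C(0x8000000)            /* 128MB BPF JIT region */

static uint64_t v5_4_vmemmap_size(void)
{
    /* (_PAGE_END(48) - PAGE_OFFSET) >> 6 == 2^47 >> 6 == 2^41 */
    return (UINT64_C(1) << 47) >> 6;
}

static uint64_t v5_4_modules_vaddr(void)
{
    return -(UINT64_C(1) << 47) + BPF_JIT_X;     /* _PAGE_END(48) + BPF JIT */
}

static uint64_t v5_4_vmemmap_vaddr(void)
{
    return -v5_4_vmemmap_size() - SZ_2M_X;       /* (-VMEMMAP_SIZE - SZ_2M) */
}

static uint64_t v5_4_vmalloc_end(void)
{
    uint64_t pud_size = UINT64_C(1) << 30;       /* 1GB for 4K pages, 48-bit VA */
    return -pud_size - v5_4_vmemmap_size() - SZ_64K_X;
}
```

Note that crash stores inclusive end addresses, which is why the patch subtracts 1 before filling in modules_end/vmalloc_end.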
>
> Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
> ---
> v5 --> v6:
> 1.) Fixed a bug found in v5:
> We need to subtract 1 for Crash's modules_end/vmalloc_end/vmemmap_end.
>
> 2.) Change version limit to LINUX(99, 0, 0)
> 3.) Tested it again.
>
> v4 --> v5:
> 1.) Reset ms->struct_page_size to 0 if arm64_get_va_range() fails,
> so arm64_calc_virtual_memory_ranges() can continue its work.
>
> 2.) Tested again with the new code.
>
> v3 --> v4:
> 1.) Add struct_page_size to @ms.
> Change some functions, such as arm64_init() and arm64_get_struct_page_size().
> (Do not use the ASSIGN_SIZE, use the ms->struct_page_size instead.)
>
> 2.) Tested again with the new code.
>
> v2 --> v3:
> Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
> We should use the ms->CONFIG_ARM64_VA_BITS to calculate the
> vmemmap_vaddr, not the ms->VA_BITS.
>
> v1 --> v2:
> The Crash code is based on v4.20 not v4.9.
> Changed the commit message about it.
> ---
> arm64.c | 379 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> defs.h | 1 +
> 2 files changed, 369 insertions(+), 11 deletions(-)
>
> diff --git a/arm64.c b/arm64.c
> index 3ab8489..841016c 100644
> --- a/arm64.c
> +++ b/arm64.c
> @@ -92,6 +92,14 @@ static void arm64_calc_VA_BITS(void);
> static int arm64_is_uvaddr(ulong, struct task_context *);
> static void arm64_calc_KERNELPACMASK(void);
>
> +struct kernel_range {
> + unsigned long modules_vaddr, modules_end;
> + unsigned long vmalloc_start_addr, vmalloc_end;
> + unsigned long vmemmap_vaddr, vmemmap_end;
> +};
> +static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
> +static void arm64_get_struct_page_size(struct machine_specific *ms);
> +
> static void arm64_calc_kernel_start(void)
> {
> struct machine_specific *ms = machdep->machspec;
> @@ -233,9 +241,10 @@ arm64_init(int when)
> machdep->pageoffset = machdep->pagesize - 1;
> machdep->pagemask = ~((ulonglong)machdep->pageoffset);
>
> + ms = machdep->machspec;
> + arm64_get_struct_page_size(ms);
> arm64_calc_VA_BITS();
> arm64_calc_KERNELPACMASK();
> - ms = machdep->machspec;
>
> /* vabits_actual introduced after mm flip, so it should be flipped layout */
> if (ms->VA_BITS_ACTUAL) {
> @@ -252,8 +261,15 @@ arm64_init(int when)
> }
> machdep->is_kvaddr = generic_is_kvaddr;
> machdep->kvtop = arm64_kvtop;
> +
> + /* The defaults */
> + ms->vmalloc_end = ARM64_VMALLOC_END;
> + ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> + ms->vmemmap_end = ARM64_VMEMMAP_END;
> +
> if (machdep->flags & NEW_VMEMMAP) {
> struct syment *sp;
> + struct kernel_range *r;
>
> /* It is finally decided in arm64_calc_kernel_start() */
> sp = kernel_symbol_search("_text");
> @@ -261,27 +277,36 @@ arm64_init(int when)
> sp = kernel_symbol_search("_end");
> ms->kimage_end = (sp ? sp->value : 0);
>
> - if (ms->VA_BITS_ACTUAL) {
> + if (ms->struct_page_size && (r = arm64_get_va_range(ms))) {
> + /* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
> + ms->modules_vaddr = r->modules_vaddr;
> + ms->modules_end = r->modules_end - 1;
> + ms->vmalloc_start_addr = r->vmalloc_start_addr;
> + ms->vmalloc_end = r->vmalloc_end - 1;
> + ms->vmemmap_vaddr = r->vmemmap_vaddr;
> + if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0))
> + ms->vmemmap_end = r->vmemmap_end - 1;
> + else
> + ms->vmemmap_end = -1;
> +
> + } else if (ms->VA_BITS_ACTUAL) {
> ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
> ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
> + ms->vmalloc_start_addr = ms->modules_end + 1;
> } else {
> ms->modules_vaddr = ARM64_VA_START;
> if (kernel_symbol_exists("kasan_init"))
> ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
> ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
> + ms->vmalloc_start_addr = ms->modules_end + 1;
> }
>
> - ms->vmalloc_start_addr = ms->modules_end + 1;
> -
> arm64_calc_kimage_voffset();
> } else {
> ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
> ms->modules_end = ARM64_PAGE_OFFSET - 1;
> ms->vmalloc_start_addr = ARM64_VA_START;
> }
> - ms->vmalloc_end = ARM64_VMALLOC_END;
> - ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> - ms->vmemmap_end = ARM64_VMEMMAP_END;
>
> switch (machdep->pagesize)
> {
> @@ -404,7 +429,12 @@ arm64_init(int when)
> case POST_GDB:
> /* Rely on kernel version to decide the kernel start address */
> arm64_calc_kernel_start();
> - arm64_calc_virtual_memory_ranges();
> +
> + /* Can we get the size of struct page before POST_GDB */
> + ms = machdep->machspec;
> + if (!ms->struct_page_size)
> + arm64_calc_virtual_memory_ranges();
> +
> arm64_get_section_size_bits();
>
> if (!machdep->max_physmem_bits) {
> @@ -419,8 +449,6 @@ arm64_init(int when)
> machdep->max_physmem_bits =
> _MAX_PHYSMEM_BITS;
> }
>
> - ms = machdep->machspec;
> -
> if (CRASHDEBUG(1)) {
> if (ms->VA_BITS_ACTUAL) {
> fprintf(fp, "CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
> @@ -511,6 +539,336 @@ arm64_init(int when)
> }
> }
>
> +struct kernel_va_range_handler {
> + unsigned long kernel_versions_start; /* include */
> + unsigned long kernel_versions_end; /* exclude */
> + struct kernel_range *(*get_range)(struct machine_specific *);
> +};
> +
> +static struct kernel_range tmp_range;
> +#define _PAGE_END(va) (-(1UL << ((va) - 1)))
> +#define SZ_64K 0x00010000
> +#define SZ_2M 0x00200000
> +
> +/*
> + * Get the max shift of the size of struct page.
> + * Most of the time, it is 64 bytes, but not sure.
> + */
> +static int arm64_get_struct_page_max_shift(struct machine_specific *ms)
> +{
> + unsigned long v = ms->struct_page_size;
> +
> + if (16 < v && v <= 32)
> + return 5;
> + if (32 < v && v <= 64)
> + return 6;
> + if (64 < v && v <= 128)
> + return 7;
> +
> + error(FATAL, "We should not have such struct page size:%d!\n", v);
> + return 0;
> +}
>
If I understand the above function correctly, can it be replaced
by ceil(log2(v))? That would keep it consistent with the kernel, but the
drawback is having to include the header math.h in arm64.c. Do you have any
specific concerns about this?
For example:
    return ceil(log2(v));
> +
> +/*
> + * The change is caused by the kernel patch since v5.17-rc1:
> + * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
> + */
> +static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long vmem_shift, vmemmap_size;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + r->modules_vaddr = _PAGE_END(v);
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> + vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> + vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> + r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
> + return r;
> +}
> +
> +/*
> + * The change is caused by the kernel patch since v5.11:
> + * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
> + */
> +static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> + vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> + vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> + r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
> + return r;
> +}
> +
> +static unsigned long arm64_get_pud_size(void)
> +{
> + unsigned long PUD_SIZE = 0;
> +
> + switch (machdep->pagesize) {
> + case 4096:
> + if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
> + PUD_SIZE = PUD_SIZE_L4_4K;
> + } else {
> + PUD_SIZE = PGDIR_SIZE_L3_4K;
> + }
> + break;
> +
> + case 65536:
> + PUD_SIZE = PGDIR_SIZE_L2_64K;
> + default:
> + break;
> + }
> + return PUD_SIZE;
> +}
> +
> +/*
> + * The change is caused by the kernel patches since v5.4, such as:
> + * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
> + * "14c127c957c1 arm64: mm: Flip kernel VA space"
> + */
> +static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
> + unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
> + char *string;
> + int ret;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + if (kernel_symbol_exists("kasan_init")) {
> + /* See the arch/arm64/Makefile */
> + ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
> + if (ret == IKCONFIG_N)
> + return NULL;
> + kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
> +
> + /* See the arch/arm64/Kconfig*/
> + ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
> + if (ret != IKCONFIG_STR)
> + return NULL;
> + kasan_shadow_offset = atol(string);
> +
> + r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset + bpf_jit_size;
> + } else {
> + r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
> + }
> +
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> + vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> + vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> + r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
> + if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
> + /*
> + * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
> + * adds the VMEMMAP_END.
> + */
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> + } else {
> + r->vmemmap_end = 0xffffffffffffffffUL;
> + }
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + PUD_SIZE = arm64_get_pud_size();
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
> + return r;
> +}
> +
> +/*
> + * The change is caused by the kernel patches since v5.0, such as:
> + * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
> + */
> +static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long kasan_shadow_shift, PUD_SIZE;
> + unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
> + unsigned long va_start, page_offset;
> + int ret;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
> + page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + if (kernel_symbol_exists("kasan_init")) {
> + /* See the arch/arm64/Makefile */
> + ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
> + if (ret == IKCONFIG_N)
> + return NULL;
> + kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
> +
> + r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
> + } else {
> + r->modules_vaddr = va_start + bpf_jit_size;
> + }
> +
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> + vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift(ms)));
> +
> + r->vmemmap_vaddr = page_offset - vmemmap_size;
> + r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + PUD_SIZE = arm64_get_pud_size();
> +
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
> + return r;
> +}
> +
> +static struct kernel_va_range_handler kernel_va_range_handlers[] = {
> + {
> + LINUX(5,17,0),
> + LINUX(99,0,0), /* Just a boundary, Change it later */
> + get_range: arm64_get_range_v5_17,
> + }, {
> + LINUX(5,11,0), LINUX(5,17,0),
> + get_range: arm64_get_range_v5_11,
> + }, {
> + LINUX(5,4,0), LINUX(5,11,0),
> + get_range: arm64_get_range_v5_4,
> + }, {
> + LINUX(5,0,0), LINUX(5,4,0),
> + get_range: arm64_get_range_v5_0,
> + },
> +};
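The table above is selected by a half-open version-range scan. A minimal standalone illustration of that lookup (the KVER() packing below mirrors crash's LINUX() macro from defs.h, restated here as an assumption, and the handler name stands in for the get_range hook):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical restatement of crash's LINUX(a,b,c) version packing. */
#define KVER(a, b, c) (((unsigned long)(a) << 16) + ((unsigned long)(b) << 8) + (unsigned long)(c))

struct range_handler {
    unsigned long start;    /* include */
    unsigned long end;      /* exclude */
    const char *name;       /* stands in for the get_range hook */
};

static const struct range_handler handlers[] = {
    { KVER(5, 17, 0), KVER(99, 0, 0), "v5_17" },
    { KVER(5, 11, 0), KVER(5, 17, 0), "v5_11" },
    { KVER(5,  4, 0), KVER(5, 11, 0), "v5_4"  },
    { KVER(5,  0, 0), KVER(5,  4, 0), "v5_0"  },
};

/* Scan the table the same way arm64_get_va_range() does; return the
 * matching handler's name, or 0 when the caller must fall back to
 * arm64_calc_virtual_memory_ranges(). */
static const char *pick_handler(unsigned long kver)
{
    unsigned i;

    for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++)
        if (handlers[i].start <= kver && kver < handlers[i].end)
            return handlers[i].name;
    return 0;
}
```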
> +
> +#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
> +
> +static unsigned long arm64_get_kernel_version(void)
> +{
> + char *string;
> + char buf[BUFSIZE];
> + char *p1, *p2;
> +
> + if (THIS_KERNEL_VERSION)
> + return THIS_KERNEL_VERSION;
> +
> + string = pc->read_vmcoreinfo("OSRELEASE");
> + if (string) {
> + strcpy(buf, string);
> +
> + p1 = p2 = buf;
> + while (*p2 != '.')
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[0] = atoi(p1);
> +
> + p1 = ++p2;
> + while (*p2 != '.')
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[1] = atoi(p1);
> +
> + p1 = ++p2;
> + while ((*p2 >= '0') && (*p2 <= '9'))
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[2] = atoi(p1);
> + }
> + free(string);
> + return THIS_KERNEL_VERSION;
> +}
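The pointer walk above assumes OSRELEASE is well-formed ("a.b.c..."); for illustration, here is a standalone equivalent that stops at the end of the string, so a malformed value cannot walk past it (a sketch, not a drop-in replacement for the crash helper):

```c
#include <assert.h>

/* Parse an OSRELEASE string such as "5.4.119" or "5.17.0-rc4" into
 * major/minor/patch. Returns 1 on success, 0 on a malformed string;
 * any "-rc4" style suffix after the third number is ignored. */
static int parse_osrelease(const char *s, int out[3])
{
    int part;

    out[0] = out[1] = out[2] = 0;
    for (part = 0; part < 3; part++) {
        if (*s < '0' || *s > '9')
            return 0;                   /* expected a digit */
        while (*s >= '0' && *s <= '9')
            out[part] = out[part] * 10 + (*s++ - '0');
        if (part < 2 && *s++ != '.')
            return 0;                   /* expected a '.' separator */
    }
    return 1;
}

/* Pack the result the same way as LINUX(a,b,c): a*65536 + b*256 + c.
 * Returns -1 when the string cannot be parsed. */
static long osrelease_to_kver(const char *s)
{
    int v[3];

    if (!parse_osrelease(s, v))
        return -1;
    return (long)v[0] * 65536 + v[1] * 256 + v[2];
}
```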
> +
> +/* Return NULL if we fail. */
> +static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
> +{
> + struct kernel_va_range_handler *h;
> + unsigned long kernel_version = arm64_get_kernel_version();
> + struct kernel_range *r = NULL;
> + int i;
> +
> + if (!kernel_version)
> + goto range_failed;
> +
> + for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
> + h = kernel_va_range_handlers + i;
> +
> + /* Get the right hook for this kernel version */
> + if (h->kernel_versions_start <= kernel_version &&
> + kernel_version < h->kernel_versions_end) {
> +
> + /* Get the correct virtual address ranges */
> + r = h->get_range(ms);
> + if (!r)
> + goto range_failed;
> + return r;
> + }
> + }
> +
> +range_failed:
> + /* Reset ms->struct_page_size to 0 for arm64_calc_virtual_memory_ranges() */
> + ms->struct_page_size = 0;
> + return NULL;
> +}
> +
> +/* Get the size of struct page {} */
> +static void arm64_get_struct_page_size(struct machine_specific *ms)
> +{
> + char *string;
> +
> + string = pc->read_vmcoreinfo("SIZE(page)");
> + if (string)
> + ms->struct_page_size = atol(string);
> + free(string);
> +}
> +
> /*
> * Accept or reject a symbol from the kernel namelist.
> */
> @@ -4272,7 +4630,6 @@ arm64_calc_VA_BITS(void)
> #define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
> #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
> #define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
> -#define SZ_64K 0x00010000
>
> static void
> arm64_calc_virtual_memory_ranges(void)
> diff --git a/defs.h b/defs.h
> index bf2c59b..81ac049 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -3386,6 +3386,7 @@ struct machine_specific {
> ulong VA_START;
> ulong CONFIG_ARM64_KERNELPACMASK;
> ulong physvirt_offset;
> + ulong struct_page_size;
> };
>
Can you add this one to the arm64_dump_machdep_table()?
Thanks.
Lianbo
> struct arm64_stackframe {
> --
> 2.30.2
>
>
>
>
> ------------------------------
>
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility
>
> End of Crash-utility Digest, Vol 198, Issue 7
> *********************************************
>
>
Re: [Crash-utility] [PATCH 1/2] Fix memory leak in __sbitmap_for_each_set function
by lijiang
Thank you for the fix, Sergey.
On Wed, Mar 9, 2022 at 8:00 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Tue, 8 Mar 2022 23:27:09 +0300
> From: Sergey Samoylenko <s.samoylenko(a)yadro.com>
> To: <crash-utility(a)redhat.com>
> Cc: <linux(a)yadro.com>, Sergey Samoylenko <s.samoylenko(a)yadro.com>
> Subject: [Crash-utility] [PATCH 1/2] Fix memory leak in
> __sbitmap_for_each_set function
> Message-ID: <20220308202710.46668-2-s.samoylenko(a)yadro.com>
> Content-Type: text/plain
>
> Signed-off-by: Sergey Samoylenko <s.samoylenko(a)yadro.com>
> ---
> sbitmap.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/sbitmap.c b/sbitmap.c
> index 91a5274..4eaa0cc 100644
> --- a/sbitmap.c
> +++ b/sbitmap.c
>
This reminds me: if parse_line() or dump_struct_member() fails, is there a
potential risk of memory leaks in dump_struct_members()?
file: sbitmap.c
432 static void dump_struct_members(const char *s, ulong addr, unsigned
radix)
433 {
434 int i, argc;
435 char *p1, *p2;
436 char *structname, *members;
437 char *arglist[MAXARGS];
438
439 structname = GETBUF(strlen(s) + 1);
440 members = GETBUF(strlen(s) + 1);
441
442 strcpy(structname, s);
443 p1 = strstr(structname, ".") + 1;
444
445 p2 = strstr(s, ".") + 1;
446 strcpy(members, p2);
447 replace_string(members, ",", ' ');
448 argc = parse_line(members, arglist);
449
450 for (i = 0; i < argc; i++) {
451 *p1 = NULLCHAR;
452 strcat(structname, arglist[i]);
453 dump_struct_member(structname, addr, radix);
454 }
455
456 FREEBUF(structname);
457 FREEBUF(members);
458 }
I noticed that parse_line() has a return value, but dump_struct_member()
has none. Is there a good way to avoid the potential risks, or should we
leave it as is?
BTW: I saw similar issues in tools.c.
Thanks.
Lianbo
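One conventional answer to the question above is the same single-exit pattern this sbitmap patch itself adopts: funnel every return through one cleanup label so GETBUF'd buffers are always released. A standalone sketch with instrumented stand-ins (the counterfeit getbuf/freebuf counters below are not crash's real allocator; they just make the pattern checkable):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for crash's GETBUF/FREEBUF, instrumented so a test can
 * verify that every allocation is released on every exit path. */
static int live_bufs;

static char *getbuf(size_t n)
{
    live_bufs++;
    return calloc(1, n);
}

static void freebuf(char *p)
{
    live_bufs--;
    free(p);
}

/* Shape of dump_struct_members() with a single exit path: even if the
 * (simulated) per-member dump fails midway, both buffers are freed.
 * fail_at < 0 means "no failure"; returns 0 on success, -1 otherwise. */
static int dump_members_sketch(const char *s, int fail_at)
{
    int ret = -1, i;
    char *structname = getbuf(strlen(s) + 1);
    char *members = getbuf(strlen(s) + 1);

    if (!structname || !members)
        goto out;
    for (i = 0; i < 4; i++)
        if (i == fail_at)
            goto out;       /* simulated dump_struct_member() failure */
    ret = 0;
out:
    if (members)
        freebuf(members);
    if (structname)
        freebuf(structname);
    return ret;
}

/* Run one scenario and report how many buffers were leaked (should be 0). */
static int leak_after(int fail_at)
{
    live_bufs = 0;
    dump_members_sketch("sbitmap_queue.sb", fail_at);
    return live_bufs;
}
```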
@@ -272,7 +272,7 @@ static void __sbitmap_for_each_set(const struct
> sbitmap_context *sc,
> if (nr >= depth)
> break;
> if (!fn((index << sc->shift) + nr, data))
> - return;
> + goto exit;
>
> nr++;
> }
> @@ -282,6 +282,7 @@ next:
> index = 0;
> }
>
> +exit:
> FREEBUF(sbitmap_word_buf);
> }
>
> --
> 2.25.1
>
[PATCH v7] arm64: update the modules/vmalloc/vmemmap ranges
by Huang Shijie
< 1 > The background.
The current crash code is still based on kernel v4.20, but the kernel is now at v5.17-rc4.
The MODULE/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
I list all the changes from kernel v4.20 to v5.17:
1.) The current crash code is based on kernel v4.20.
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
2.) In the kernel v5.0, a patch added a new BPF JIT region:
"91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
The layout did not change until v5.4.
3.) In the kernel v5.4, several patches changed the layout, such as:
"ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
"14c127c957c1 arm64: mm: Flip kernel VA space"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
In the v5.7, the patch:
"bbd6ec605c arm64/mm: Enable memory hot remove"
adds the VMEMMAP_END.
4.) In the kernel v5.11, several patches changed the layout, such as:
"9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
"f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
5.) In the kernel v5.17-rc1, after the patch
"b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
< 2 > What does this patch do?
1.) Use arm64_get_struct_page_size() to get the size of struct page{} in the PRE_GDB.
2.) If the above step succeeds, we will try to call arm64_get_va_range() to
get the proper kernel virtual ranges.
In the arm64_get_va_range(), we calculate the ranges by the hooks of
different kernel versions:
get_range: arm64_get_range_v5_17,
get_range: arm64_get_range_v5_11,
get_range: arm64_get_range_v5_4,
get_range: arm64_get_range_v5_0,
3.) If the above steps succeed, arm64_calc_virtual_memory_ranges()
will be skipped. If they fail, arm64_calc_virtual_memory_ranges()
will continue to do its work.
< 3 > Test this patch.
Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
(The CONFIG_KASAN is NOT set for this kernel.)
Before this patch, we get the wrong output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800048000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800040000000
modules_end: ffff800047ffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
After this patch, we can get the correct output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800010000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800008000000
modules_end: ffff80000fffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v6 --> v7:
1.) Simplify the arm64_get_struct_page_max_shift().
2.) Add "struct_page_size" dump info in arm64_dump_machdep_table().
3.) Tested it again.
v5 --> v6:
1.) Fixed a bug found in v5:
We need to subtract 1 for Crash's modules_end/vmalloc_end/vmemmap_end.
2.) Change version limit to LINUX(99, 0, 0)
3.) Tested it again.
v4 --> v5:
1.) Reset ms->struct_page_size to 0 if arm64_get_va_range() fails,
so arm64_calc_virtual_memory_ranges() can continue its work.
2.) Tested again with the new code.
v3 --> v4:
1.) Add struct_page_size to @ms.
Change some functions, such as arm64_init() and arm64_get_struct_page_size().
(Do not use the ASSIGN_SIZE, use the ms->struct_page_size instead.)
2.) Tested again with the new code.
v2 --> v3:
Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
We should use the ms->CONFIG_ARM64_VA_BITS to calculate the
vmemmap_vaddr, not the ms->VA_BITS.
v1 --> v2:
The Crash code is based on v4.20 not v4.9.
Changed the commit message about it.
---
arm64.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
defs.h | 1 +
2 files changed, 362 insertions(+), 14 deletions(-)
diff --git a/arm64.c b/arm64.c
index 3ab8489..ac8d9e0 100644
--- a/arm64.c
+++ b/arm64.c
@@ -20,6 +20,7 @@
#include "defs.h"
#include <elf.h>
#include <endian.h>
+#include <math.h>
#include <sys/ioctl.h>
#define NOT_IMPLEMENTED(X) error((X), "%s: function not implemented\n", __func__)
@@ -92,6 +93,14 @@ static void arm64_calc_VA_BITS(void);
static int arm64_is_uvaddr(ulong, struct task_context *);
static void arm64_calc_KERNELPACMASK(void);
+struct kernel_range {
+ unsigned long modules_vaddr, modules_end;
+ unsigned long vmalloc_start_addr, vmalloc_end;
+ unsigned long vmemmap_vaddr, vmemmap_end;
+};
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
+static void arm64_get_struct_page_size(struct machine_specific *ms);
+
static void arm64_calc_kernel_start(void)
{
struct machine_specific *ms = machdep->machspec;
@@ -233,9 +242,10 @@ arm64_init(int when)
machdep->pageoffset = machdep->pagesize - 1;
machdep->pagemask = ~((ulonglong)machdep->pageoffset);
+ ms = machdep->machspec;
+ arm64_get_struct_page_size(ms);
arm64_calc_VA_BITS();
arm64_calc_KERNELPACMASK();
- ms = machdep->machspec;
/* vabits_actual introduced after mm flip, so it should be flipped layout */
if (ms->VA_BITS_ACTUAL) {
@@ -252,8 +262,15 @@ arm64_init(int when)
}
machdep->is_kvaddr = generic_is_kvaddr;
machdep->kvtop = arm64_kvtop;
+
+ /* The defaults */
+ ms->vmalloc_end = ARM64_VMALLOC_END;
+ ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
+ ms->vmemmap_end = ARM64_VMEMMAP_END;
+
if (machdep->flags & NEW_VMEMMAP) {
struct syment *sp;
+ struct kernel_range *r;
/* It is finally decided in arm64_calc_kernel_start() */
sp = kernel_symbol_search("_text");
@@ -261,27 +278,36 @@ arm64_init(int when)
sp = kernel_symbol_search("_end");
ms->kimage_end = (sp ? sp->value : 0);
- if (ms->VA_BITS_ACTUAL) {
+ if (ms->struct_page_size && (r = arm64_get_va_range(ms))) {
+ /* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
+ ms->modules_vaddr = r->modules_vaddr;
+ ms->modules_end = r->modules_end - 1;
+ ms->vmalloc_start_addr = r->vmalloc_start_addr;
+ ms->vmalloc_end = r->vmalloc_end - 1;
+ ms->vmemmap_vaddr = r->vmemmap_vaddr;
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0))
+ ms->vmemmap_end = r->vmemmap_end - 1;
+ else
+ ms->vmemmap_end = -1;
+
+ } else if (ms->VA_BITS_ACTUAL) {
ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
} else {
ms->modules_vaddr = ARM64_VA_START;
if (kernel_symbol_exists("kasan_init"))
ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
}
- ms->vmalloc_start_addr = ms->modules_end + 1;
-
arm64_calc_kimage_voffset();
} else {
ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
ms->modules_end = ARM64_PAGE_OFFSET - 1;
ms->vmalloc_start_addr = ARM64_VA_START;
}
- ms->vmalloc_end = ARM64_VMALLOC_END;
- ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
- ms->vmemmap_end = ARM64_VMEMMAP_END;
switch (machdep->pagesize)
{
@@ -404,7 +430,12 @@ arm64_init(int when)
case POST_GDB:
/* Rely on kernel version to decide the kernel start address */
arm64_calc_kernel_start();
- arm64_calc_virtual_memory_ranges();
+
+ /* Can we get the size of struct page before POST_GDB */
+ ms = machdep->machspec;
+ if (!ms->struct_page_size)
+ arm64_calc_virtual_memory_ranges();
+
arm64_get_section_size_bits();
if (!machdep->max_physmem_bits) {
@@ -419,8 +450,6 @@ arm64_init(int when)
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
}
- ms = machdep->machspec;
-
if (CRASHDEBUG(1)) {
if (ms->VA_BITS_ACTUAL) {
fprintf(fp, "CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
@@ -511,6 +540,326 @@ arm64_init(int when)
}
}
+struct kernel_va_range_handler {
+ unsigned long kernel_versions_start; /* include */
+ unsigned long kernel_versions_end; /* exclude */
+ struct kernel_range *(*get_range)(struct machine_specific *);
+};
+
+static struct kernel_range tmp_range;
+#define _PAGE_END(va) (-(1UL << ((va) - 1)))
+#define SZ_64K 0x00010000
+#define SZ_2M 0x00200000
+
+/*
+ * Get the max shift of the size of struct page.
+ * Most of the time, it is 64 bytes, but not sure.
+ */
+static int arm64_get_struct_page_max_shift(struct machine_specific *ms)
+{
+ return (int)ceil(log2(ms->struct_page_size));
+}
+
+/*
+ * The change is caused by the kernel patch since v5.17-rc1:
+ * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
+ */
+static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v);
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.11:
+ * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+static unsigned long arm64_get_pud_size(void)
+{
+ unsigned long PUD_SIZE = 0;
+
+ switch (machdep->pagesize) {
+ case 4096:
+ if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
+ PUD_SIZE = PUD_SIZE_L4_4K;
+ } else {
+ PUD_SIZE = PGDIR_SIZE_L3_4K;
+ }
+ break;
+
+ case 65536:
+ PUD_SIZE = PGDIR_SIZE_L2_64K;
+ default:
+ break;
+ }
+ return PUD_SIZE;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.4, such as:
+ * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
+ * "14c127c957c1 arm64: mm: Flip kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ char *string;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ /* See the arch/arm64/Kconfig*/
+ ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
+ if (ret != IKCONFIG_STR)
+ return NULL;
+ kasan_shadow_offset = atol(string);
+
+ r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset
+ + bpf_jit_size;
+ } else {
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
+ /*
+ * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
+ * adds the VMEMMAP_END.
+ */
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+ } else {
+ r->vmemmap_end = 0xffffffffffffffffUL;
+ }
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.0, such as:
+ * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
+ */
+static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, PUD_SIZE;
+ unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ unsigned long va_start, page_offset;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
+ page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
+ } else {
+ r->modules_vaddr = va_start + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift(ms)));
+
+ r->vmemmap_vaddr = page_offset - vmemmap_size;
+ r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
+ return r;
+}
+
+static struct kernel_va_range_handler kernel_va_range_handlers[] = {
+ {
+ LINUX(5,17,0),
+ LINUX(99,0,0), /* Just a boundary, Change it later */
+ get_range: arm64_get_range_v5_17,
+ }, {
+ LINUX(5,11,0), LINUX(5,17,0),
+ get_range: arm64_get_range_v5_11,
+ }, {
+ LINUX(5,4,0), LINUX(5,11,0),
+ get_range: arm64_get_range_v5_4,
+ }, {
+ LINUX(5,0,0), LINUX(5,4,0),
+ get_range: arm64_get_range_v5_0,
+ },
+};
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+
+static unsigned long arm64_get_kernel_version(void)
+{
+ char *string;
+ char buf[BUFSIZE];
+ char *p1, *p2;
+
+ if (THIS_KERNEL_VERSION)
+ return THIS_KERNEL_VERSION;
+
+ string = pc->read_vmcoreinfo("OSRELEASE");
+ if (string) {
+ strcpy(buf, string);
+
+ p1 = p2 = buf;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[0] = atoi(p1);
+
+ p1 = ++p2;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[1] = atoi(p1);
+
+ p1 = ++p2;
+ while ((*p2 >= '0') && (*p2 <= '9'))
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[2] = atoi(p1);
+ }
+ free(string);
+ return THIS_KERNEL_VERSION;
+}
+
+/* Return NULL if we fail. */
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
+{
+ struct kernel_va_range_handler *h;
+ unsigned long kernel_version = arm64_get_kernel_version();
+ struct kernel_range *r = NULL;
+ int i;
+
+ if (!kernel_version)
+ goto range_failed;
+
+ for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
+ h = kernel_va_range_handlers + i;
+
+ /* Get the right hook for this kernel version */
+ if (h->kernel_versions_start <= kernel_version &&
+ kernel_version < h->kernel_versions_end) {
+
+ /* Get the correct virtual address ranges */
+ r = h->get_range(ms);
+ if (!r)
+ goto range_failed;
+ return r;
+ }
+ }
+
+range_failed:
+ /* Reset ms->struct_page_size to 0 for arm64_calc_virtual_memory_ranges() */
+ ms->struct_page_size = 0;
+ return NULL;
+}
+
+/* Get the size of struct page {} */
+static void arm64_get_struct_page_size(struct machine_specific *ms)
+{
+ char *string;
+
+ string = pc->read_vmcoreinfo("SIZE(page)");
+ if (string)
+ ms->struct_page_size = atol(string);
+ free(string);
+}
+
/*
* Accept or reject a symbol from the kernel namelist.
*/
@@ -557,7 +906,7 @@ arm64_verify_symbol(const char *name, ulong value, char type)
void
arm64_dump_machdep_table(ulong arg)
{
- const struct machine_specific *ms;
+ const struct machine_specific *ms = machdep->machspec;
int others, i;
others = 0;
@@ -598,6 +947,7 @@ arm64_dump_machdep_table(ulong arg)
fprintf(fp, " pageshift: %d\n", machdep->pageshift);
fprintf(fp, " pagemask: %lx\n", (ulong)machdep->pagemask);
fprintf(fp, " pageoffset: %lx\n", machdep->pageoffset);
+ fprintf(fp, " struct_page_size: %ld\n", ms->struct_page_size);
fprintf(fp, " stacksize: %ld\n", machdep->stacksize);
fprintf(fp, " hz: %d\n", machdep->hz);
fprintf(fp, " mhz: %ld\n", machdep->mhz);
@@ -683,8 +1033,6 @@ arm64_dump_machdep_table(ulong arg)
machdep->cmdline_args[i] : "(unused)");
}
- ms = machdep->machspec;
-
fprintf(fp, " machspec: %lx\n", (ulong)ms);
fprintf(fp, " VA_BITS: %ld\n", ms->VA_BITS);
fprintf(fp, " CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
@@ -4272,7 +4620,6 @@ arm64_calc_VA_BITS(void)
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
-#define SZ_64K 0x00010000
static void
arm64_calc_virtual_memory_ranges(void)
diff --git a/defs.h b/defs.h
index bf2c59b..81ac049 100644
--- a/defs.h
+++ b/defs.h
@@ -3386,6 +3386,7 @@ struct machine_specific {
ulong VA_START;
ulong CONFIG_ARM64_KERNELPACMASK;
ulong physvirt_offset;
+ ulong struct_page_size;
};
struct arm64_stackframe {
--
2.30.2
Re: [Crash-utility] [PATCH v7] arm64: update the modules/vmalloc/vmemmap ranges
by lijiang
Thank you for the update and work, ShiJie.
For the V7:
Acked-by: Lianbo Jiang <lijiang(a)redhat.com>
On Fri, Mar 11, 2022 at 1:05 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Fri, 11 Mar 2022 13:00:59 +0000
> From: Huang Shijie <shijie(a)os.amperecomputing.com>
> To: k-hagio-ab(a)nec.com, lijiang(a)redhat.com
> Cc: crash-utility(a)redhat.com, zwang(a)amperecomputing.com,
> darren(a)os.amperecomputing.com, patches(a)amperecomputing.com, Huang
> Shijie <shijie(a)os.amperecomputing.com>
> Subject: [Crash-utility] [PATCH v7] arm64: update the
> modules/vmalloc/vmemmap ranges
> Message-ID: <20220311130059.266383-1-shijie(a)os.amperecomputing.com>
> Content-Type: text/plain
>
> < 1 > The background.
> The current crash code is still based at kernel v4.20, but the kernel
> is v5.17-rc4(now).
> The MODULE/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
>
> I list all the changes from kernel v4.20 to v5.17:
>
> 1.) The current crash code is based at kernel v4.20.
> The virtual memory layout looks like this:
>
> +--------------------------------------------------------------------+
> | KASAN | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
>         #define VMALLOC_END     (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
>
> 2.) In the kernel v5.0, a new BPF JIT region was added by the patch:
>     "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
>
> The virtual memory layout looks like this:
>
> +--------------------------------------------------------------------+
> | KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
>         #define VMALLOC_END     (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
>
> The layout does not changed until v5.4.
>
> 3.) In the kernel v5.4, several patches changes the layout, such as:
> "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
> "14c127c957c1 arm64: mm: Flip kernel VA space"
> and the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
>
> In the v5.7, the patch:
> "bbd6ec605c arm64/mm: Enable memory hot remove"
> adds the VMEMMAP_END.
>
> 4.) In the kernel v5.11, several patches changes the layout, such as:
> "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
>     "f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
> and the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (BPF_JIT_REGION_END)
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (VMEMMAP_START - SZ_256M)
>
> #define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> #define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
>
> 5.) In the kernel v5.17-rc1, after the patch
> "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
> the virtual memory layout looks like this:
>
>
> +--------------------------------------------------------------------+
> | MODULE | VMALLOC | .... | VMEMMAP |
> +--------------------------------------------------------------------+
>
> The macros are:
> #define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
> #define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (VMEMMAP_START - SZ_256M)
>
> #define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> #define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
>
> < 2 > What does this patch do?
> 1.) Use arm64_get_struct_page_size() to get the size of struct page{}
> in the PRE_GDB.
>
> 2.) If we succeed in the above step, we will try to call
>     arm64_get_va_range() to get the proper kernel virtual ranges.
>
>     In arm64_get_va_range(), we calculate the ranges via the hooks for
>     different kernel versions:
> get_range: arm64_get_range_v5_17,
> get_range: arm64_get_range_v5_11,
> get_range: arm64_get_range_v5_4,
> get_range: arm64_get_range_v5_0,
>
> 3.) If we succeed in the above steps, arm64_calc_virtual_memory_ranges()
>     will be skipped. If we fail, arm64_calc_virtual_memory_ranges()
>     will continue to do its work.
>
> < 3 > Test this patch.
> Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
> (The CONFIG_KASAN is NOT set for this kernel.)
>
> Before this patch, we get the wrong output from "help -m":
> ----------------------------------------------------------
> vmalloc_start_addr: ffff800048000000
> vmalloc_end: fffffdffbffeffff
> modules_vaddr: ffff800040000000
> modules_end: ffff800047ffffff
> vmemmap_vaddr: fffffdffffe00000
> vmemmap_end: ffffffffffffffff
> ----------------------------------------------------------
>
> After this patch, we can get the correct output from "help -m":
> ----------------------------------------------------------
> vmalloc_start_addr: ffff800010000000
> vmalloc_end: fffffdffbffeffff
> modules_vaddr: ffff800008000000
> modules_end: ffff80000fffffff
> vmemmap_vaddr: fffffdffffe00000
> vmemmap_end: ffffffffffffffff
> ----------------------------------------------------------
>
> Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
> ---
> v6 --> v7:
> 1.) Simplify the arm64_get_struct_page_max_shift().
> 2.) Add "struct_page_size" dump info in arm64_dump_machdep_table().
> 3.) Tested it again.
>
> v5 --> v6:
> 1.)
>         Fixed a bug found in v5:
>         We need to subtract 1 for Crash's
>         modules_end/vmalloc_end/vmemmap_end.
>
> 2.) Change version limit to LINUX(99, 0, 0)
> 3.) Tested it again.
>
> v4 --> v5:
> 1.) Reset ms->struct_page_size to 0 if arm64_get_va_range() fails,
> so arm64_calc_virtual_memory_ranges() can continue its work.
>
> 2.) Tested again with the new code.
>
> v3 --> v4:
> 1.) Add struct_page_size to @ms.
>         Change some functions, such as arm64_init() and arm64_get_struct_page_size().
>         (Do not use ASSIGN_SIZE; use ms->struct_page_size instead.)
>
> 2.) Tested again with the new code.
>
> v2 --> v3:
>         Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
>         We should use ms->CONFIG_ARM64_VA_BITS, not ms->VA_BITS, to
>         calculate the vmemmap_vaddr.
>
> v1 --> v2:
> The Crash code is based on v4.20 not v4.9.
> Changed the commit message about it.
> ---
> arm64.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
> defs.h | 1 +
> 2 files changed, 362 insertions(+), 14 deletions(-)
>
> diff --git a/arm64.c b/arm64.c
> index 3ab8489..ac8d9e0 100644
> --- a/arm64.c
> +++ b/arm64.c
> @@ -20,6 +20,7 @@
> #include "defs.h"
> #include <elf.h>
> #include <endian.h>
> +#include <math.h>
> #include <sys/ioctl.h>
>
> #define NOT_IMPLEMENTED(X) error((X), "%s: function not implemented\n", __func__)
> @@ -92,6 +93,14 @@ static void arm64_calc_VA_BITS(void);
> static int arm64_is_uvaddr(ulong, struct task_context *);
> static void arm64_calc_KERNELPACMASK(void);
>
> +struct kernel_range {
> + unsigned long modules_vaddr, modules_end;
> + unsigned long vmalloc_start_addr, vmalloc_end;
> + unsigned long vmemmap_vaddr, vmemmap_end;
> +};
> +static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
> +static void arm64_get_struct_page_size(struct machine_specific *ms);
> +
> static void arm64_calc_kernel_start(void)
> {
> struct machine_specific *ms = machdep->machspec;
> @@ -233,9 +242,10 @@ arm64_init(int when)
> machdep->pageoffset = machdep->pagesize - 1;
> machdep->pagemask = ~((ulonglong)machdep->pageoffset);
>
> + ms = machdep->machspec;
> + arm64_get_struct_page_size(ms);
> arm64_calc_VA_BITS();
> arm64_calc_KERNELPACMASK();
> - ms = machdep->machspec;
>
> 	/* vabits_actual introduced after mm flip, so it should be flipped layout */
> if (ms->VA_BITS_ACTUAL) {
> @@ -252,8 +262,15 @@ arm64_init(int when)
> }
> machdep->is_kvaddr = generic_is_kvaddr;
> machdep->kvtop = arm64_kvtop;
> +
> + /* The defaults */
> + ms->vmalloc_end = ARM64_VMALLOC_END;
> + ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> + ms->vmemmap_end = ARM64_VMEMMAP_END;
> +
> if (machdep->flags & NEW_VMEMMAP) {
> struct syment *sp;
> + struct kernel_range *r;
>
> 			/* It is finally decided in arm64_calc_kernel_start() */
> sp = kernel_symbol_search("_text");
> @@ -261,27 +278,36 @@ arm64_init(int when)
> sp = kernel_symbol_search("_end");
> ms->kimage_end = (sp ? sp->value : 0);
>
> -		if (ms->VA_BITS_ACTUAL) {
> +		if (ms->struct_page_size && (r = arm64_get_va_range(ms))) {
> +			/* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
> +			ms->modules_vaddr = r->modules_vaddr;
> +			ms->modules_end = r->modules_end - 1;
> +			ms->vmalloc_start_addr = r->vmalloc_start_addr;
> +			ms->vmalloc_end = r->vmalloc_end - 1;
> +			ms->vmemmap_vaddr = r->vmemmap_vaddr;
> +			if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0))
> +				ms->vmemmap_end = r->vmemmap_end - 1;
> +			else
> +				ms->vmemmap_end = -1;
> +
> +		} else if (ms->VA_BITS_ACTUAL) {
> 			ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
> 			ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
> +			ms->vmalloc_start_addr = ms->modules_end + 1;
> 		} else {
> 			ms->modules_vaddr = ARM64_VA_START;
> 			if (kernel_symbol_exists("kasan_init"))
> 				ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
> 			ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
> +			ms->vmalloc_start_addr = ms->modules_end + 1;
> 		}
> 
> -		ms->vmalloc_start_addr = ms->modules_end + 1;
> -
> 		arm64_calc_kimage_voffset();
> 	} else {
> 		ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
> 		ms->modules_end = ARM64_PAGE_OFFSET - 1;
> 		ms->vmalloc_start_addr = ARM64_VA_START;
> 	}
> -	ms->vmalloc_end = ARM64_VMALLOC_END;
> -	ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> -	ms->vmemmap_end = ARM64_VMEMMAP_END;
>
> switch (machdep->pagesize)
> {
> @@ -404,7 +430,12 @@ arm64_init(int when)
> case POST_GDB:
> 		/* Rely on kernel version to decide the kernel start address */
> arm64_calc_kernel_start();
> - arm64_calc_virtual_memory_ranges();
> +
> + /* Can we get the size of struct page before POST_GDB */
> + ms = machdep->machspec;
> + if (!ms->struct_page_size)
> + arm64_calc_virtual_memory_ranges();
> +
> arm64_get_section_size_bits();
>
> if (!machdep->max_physmem_bits) {
> @@ -419,8 +450,6 @@ arm64_init(int when)
> 				machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
> }
>
> - ms = machdep->machspec;
> -
> if (CRASHDEBUG(1)) {
> if (ms->VA_BITS_ACTUAL) {
> 				fprintf(fp, "CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
> @@ -511,6 +540,326 @@ arm64_init(int when)
> }
> }
>
> +struct kernel_va_range_handler {
> + unsigned long kernel_versions_start; /* include */
> + unsigned long kernel_versions_end; /* exclude */
> + struct kernel_range *(*get_range)(struct machine_specific *);
> +};
> +
> +static struct kernel_range tmp_range;
> +#define _PAGE_END(va) (-(1UL << ((va) - 1)))
> +#define SZ_64K 0x00010000
> +#define SZ_2M 0x00200000
> +
> +/*
> + * Get the max shift of the size of struct page.
> + * Most of the time, it is 64 bytes, but not sure.
> + */
> +static int arm64_get_struct_page_max_shift(struct machine_specific *ms)
> +{
> + return (int)ceil(log2(ms->struct_page_size));
> +}
> +
> +/*
> + * The change is caused by the kernel patch since v5.17-rc1:
> + * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
> + */
> +static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long vmem_shift, vmemmap_size;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + r->modules_vaddr = _PAGE_END(v);
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> +	vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> +	vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> +	r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
> + return r;
> +}
> +
> +/*
> + * The change is caused by the kernel patch since v5.11:
> + * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
> + */
> +static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> +	unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> +	vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> +	vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> +	r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
> + return r;
> +}
> +
> +static unsigned long arm64_get_pud_size(void)
> +{
> + unsigned long PUD_SIZE = 0;
> +
> + switch (machdep->pagesize) {
> + case 4096:
> + if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
> + PUD_SIZE = PUD_SIZE_L4_4K;
> + } else {
> + PUD_SIZE = PGDIR_SIZE_L3_4K;
> + }
> + break;
> +
> + case 65536:
> + PUD_SIZE = PGDIR_SIZE_L2_64K;
> + default:
> + break;
> + }
> + return PUD_SIZE;
> +}
> +
> +/*
> + * The change is caused by the kernel patches since v5.4, such as:
> + * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
> + * "14c127c957c1 arm64: mm: Flip kernel VA space"
> + */
> +static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
> +	unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
> + char *string;
> + int ret;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + if (v > 48)
> + v = 48;
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + if (kernel_symbol_exists("kasan_init")) {
> + /* See the arch/arm64/Makefile */
> + ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
> + if (ret == IKCONFIG_N)
> + return NULL;
> + kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
> +
> + /* See the arch/arm64/Kconfig*/
> +		ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
> + if (ret != IKCONFIG_STR)
> + return NULL;
> + kasan_shadow_offset = atol(string);
> +
> +		r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset
> + + bpf_jit_size;
> + } else {
> + r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
> + }
> +
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> +	vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
> + vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
> +
> + r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
> + if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
> + /*
> +		 * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
> + * adds the VMEMMAP_END.
> + */
> + r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
> + } else {
> + r->vmemmap_end = 0xffffffffffffffffUL;
> + }
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + PUD_SIZE = arm64_get_pud_size();
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
> + return r;
> +}
> +
> +/*
> + * The change is caused by the kernel patches since v5.0, such as:
> + * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module
> memory"
> + */
> +static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
> +{
> + struct kernel_range *r = &tmp_range;
> + unsigned long v = ms->CONFIG_ARM64_VA_BITS;
> + unsigned long kasan_shadow_shift, PUD_SIZE;
> + unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
> + unsigned long va_start, page_offset;
> + int ret;
> +
> + /* Not initialized yet */
> + if (v == 0)
> + return NULL;
> +
> + va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
> + page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
> +
> + /* Get the MODULES_VADDR ~ MODULES_END */
> + if (kernel_symbol_exists("kasan_init")) {
> + /* See the arch/arm64/Makefile */
> + ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
> + if (ret == IKCONFIG_N)
> + return NULL;
> + kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
> +
> +		r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
> + } else {
> + r->modules_vaddr = va_start + bpf_jit_size;
> + }
> +
> + r->modules_end = r->modules_vaddr + MEGABYTES(128);
> +
> + /* Get the VMEMMAP_START ~ VMEMMAP_END */
> +	vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift(ms)));
> +
> + r->vmemmap_vaddr = page_offset - vmemmap_size;
> +	r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
> +
> + /* Get the VMALLOC_START ~ VMALLOC_END */
> + PUD_SIZE = arm64_get_pud_size();
> +
> + r->vmalloc_start_addr = r->modules_end;
> + r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
> + return r;
> +}
> +
> +static struct kernel_va_range_handler kernel_va_range_handlers[] = {
> + {
> + LINUX(5,17,0),
> + LINUX(99,0,0), /* Just a boundary, Change it later */
> + get_range: arm64_get_range_v5_17,
> + }, {
> + LINUX(5,11,0), LINUX(5,17,0),
> + get_range: arm64_get_range_v5_11,
> + }, {
> + LINUX(5,4,0), LINUX(5,11,0),
> + get_range: arm64_get_range_v5_4,
> + }, {
> + LINUX(5,0,0), LINUX(5,4,0),
> + get_range: arm64_get_range_v5_0,
> + },
> +};
> +
> +#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
> +
> +static unsigned long arm64_get_kernel_version(void)
> +{
> + char *string;
> + char buf[BUFSIZE];
> + char *p1, *p2;
> +
> + if (THIS_KERNEL_VERSION)
> + return THIS_KERNEL_VERSION;
> +
> + string = pc->read_vmcoreinfo("OSRELEASE");
> + if (string) {
> + strcpy(buf, string);
> +
> + p1 = p2 = buf;
> + while (*p2 != '.')
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[0] = atoi(p1);
> +
> + p1 = ++p2;
> + while (*p2 != '.')
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[1] = atoi(p1);
> +
> + p1 = ++p2;
> + while ((*p2 >= '0') && (*p2 <= '9'))
> + p2++;
> + *p2 = NULLCHAR;
> + kt->kernel_version[2] = atoi(p1);
> + }
> + free(string);
> + return THIS_KERNEL_VERSION;
> +}
> +
> +/* Return NULL if we fail. */
> +static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
> +{
> + struct kernel_va_range_handler *h;
> + unsigned long kernel_version = arm64_get_kernel_version();
> + struct kernel_range *r = NULL;
> + int i;
> +
> + if (!kernel_version)
> + goto range_failed;
> +
> + for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
> + h = kernel_va_range_handlers + i;
> +
> + /* Get the right hook for this kernel version */
> + if (h->kernel_versions_start <= kernel_version &&
> + kernel_version < h->kernel_versions_end) {
> +
> + /* Get the correct virtual address ranges */
> + r = h->get_range(ms);
> + if (!r)
> + goto range_failed;
> + return r;
> + }
> + }
> +
> +range_failed:
> + /* Reset ms->struct_page_size to 0 for arm64_calc_virtual_memory_ranges() */
> + ms->struct_page_size = 0;
> + return NULL;
> +}
> +
> +/* Get the size of struct page {} */
> +static void arm64_get_struct_page_size(struct machine_specific *ms)
> +{
> + char *string;
> +
> + string = pc->read_vmcoreinfo("SIZE(page)");
> + if (string)
> + ms->struct_page_size = atol(string);
> + free(string);
> +}
> +
> /*
> * Accept or reject a symbol from the kernel namelist.
> */
> @@ -557,7 +906,7 @@ arm64_verify_symbol(const char *name, ulong value, char type)
> void
> arm64_dump_machdep_table(ulong arg)
> {
> - const struct machine_specific *ms;
> + const struct machine_specific *ms = machdep->machspec;
> int others, i;
>
> others = 0;
> @@ -598,6 +947,7 @@ arm64_dump_machdep_table(ulong arg)
> fprintf(fp, " pageshift: %d\n", machdep->pageshift);
> fprintf(fp, " pagemask: %lx\n", (ulong)machdep->pagemask);
> fprintf(fp, " pageoffset: %lx\n", machdep->pageoffset);
> + fprintf(fp, " struct_page_size: %ld\n", ms->struct_page_size);
> fprintf(fp, " stacksize: %ld\n", machdep->stacksize);
> fprintf(fp, " hz: %d\n", machdep->hz);
> fprintf(fp, " mhz: %ld\n", machdep->mhz);
> @@ -683,8 +1033,6 @@ arm64_dump_machdep_table(ulong arg)
> machdep->cmdline_args[i] : "(unused)");
> }
>
> - ms = machdep->machspec;
> -
> fprintf(fp, " machspec: %lx\n", (ulong)ms);
> fprintf(fp, " VA_BITS: %ld\n", ms->VA_BITS);
> fprintf(fp, " CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
> @@ -4272,7 +4620,6 @@ arm64_calc_VA_BITS(void)
> #define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
> #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
> #define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
> -#define SZ_64K 0x00010000
>
> static void
> arm64_calc_virtual_memory_ranges(void)
> diff --git a/defs.h b/defs.h
> index bf2c59b..81ac049 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -3386,6 +3386,7 @@ struct machine_specific {
> ulong VA_START;
> ulong CONFIG_ARM64_KERNELPACMASK;
> ulong physvirt_offset;
> + ulong struct_page_size;
> };
>
> struct arm64_stackframe {
> --
> 2.30.2
>
>
[PATCH 0/2] Fix memory leak in sbitmap.c
by Sergey Samoylenko
The patch set fixes memory leak for sbitmapq command.
Sergey Samoylenko (2):
Fix memory leak in __sbitmap_for_each_set function
Use readmem more carefully
sbitmap.c | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)
--
2.25.1
[PATCH v6] arm64: update the modules/vmalloc/vmemmap ranges
by Huang Shijie
< 1 > The background.
The current crash code is still based on kernel v4.20, but the kernel is now at v5.17-rc4.
The MODULES/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
Below are all the changes from kernel v4.20 to v5.17:
1.) The current crash code is based on kernel v4.20.
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
2.) In kernel v5.0, a patch added a new BPF JIT region:
"91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
The layout did not change again until v5.4.
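To make the v5.0 macros above concrete, the arithmetic can be evaluated standalone. The sketch below is illustrative only (not crash code); it assumes CONFIG_ARM64_VA_BITS=48, 4K pages, a 64-byte struct page, 4-level page tables (PUD_SIZE = 1GB) and no KASAN:

```c
/* Standalone sketch of the v5.0..v5.3 layout math; the assumed
 * configuration (VA_BITS=48, 4K pages, sizeof(struct page)=64,
 * 4-level tables, no KASAN) is hypothetical, not from the patch. */
typedef struct {
	unsigned long modules_vaddr, modules_end;
	unsigned long vmalloc_start, vmalloc_end;
	unsigned long vmemmap_start;
} va_ranges;

static va_ranges ranges_v5_0(void)
{
	const unsigned long va_bits = 48, page_shift = 12, struct_page_shift = 6;
	const unsigned long pud_size = 1UL << 30;	/* 4K pages, 4 levels */
	const unsigned long sz_64k = 0x10000;
	unsigned long va_start = -(1UL << va_bits);		/* VA_START */
	unsigned long page_offset = -(1UL << (va_bits - 1));	/* PAGE_OFFSET */
	unsigned long bpf_jit = 128UL << 20;			/* BPF JIT region */
	va_ranges r;

	r.modules_vaddr = va_start + bpf_jit;			/* BPF_JIT_REGION_END */
	r.modules_end = r.modules_vaddr + (128UL << 20);	/* + MODULES_VSIZE */
	/* VMEMMAP_SIZE = (1 << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page) */
	r.vmemmap_start = page_offset -
		(1UL << (va_bits - page_shift - 1 + struct_page_shift));
	r.vmalloc_start = r.modules_end;			/* VMALLOC_START */
	/* VMALLOC_END = PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K */
	r.vmalloc_end = r.vmemmap_start - pud_size - sz_64k;
	return r;
}
```

With these assumptions the modules region comes out as 0xffff000008000000–0xffff000010000000 and VMALLOC_END as 0xffff7dffbfff0000 — the same arithmetic the patch performs for pre-flip kernels.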
3.) In kernel v5.4, several patches changed the layout, such as:
"ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
"14c127c957c1 arm64: mm: Flip kernel VA space"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
In v5.7, the patch:
"bbd6ec605c arm64/mm: Enable memory hot remove"
adds the VMEMMAP_END.
4.) In kernel v5.11, several patches changed the layout, such as:
"9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
"f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
5.) In the kernel v5.17-rc1, after the patch
"b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
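As a sanity check on the v5.17 formulas above, here is a small standalone sketch (again an illustration, not crash code; assumed configuration: CONFIG_ARM64_VA_BITS=48, 4K pages, 64-byte struct page, so VMEMMAP_SHIFT = PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT = 6):

```c
/* Standalone sketch of the v5.17 (flipped) layout math; the assumed
 * configuration is hypothetical, not taken from the patch. */
typedef struct {
	unsigned long modules_vaddr, modules_end;
	unsigned long vmalloc_start, vmalloc_end;
	unsigned long vmemmap_start, vmemmap_end;
} va_ranges_v5_17;

static va_ranges_v5_17 ranges_v5_17(void)
{
	const unsigned long va_bits = 48, vmemmap_shift = 6;	/* 12 - log2(64) */
	unsigned long page_end = -(1UL << (va_bits - 1));	/* _PAGE_END(48) */
	unsigned long page_offset = -(1UL << va_bits);		/* PAGE_OFFSET */
	va_ranges_v5_17 r;

	r.modules_vaddr = page_end;				/* MODULES_VADDR */
	r.modules_end = r.modules_vaddr + (128UL << 20);	/* + MODULES_VSIZE */
	r.vmemmap_start = -(1UL << (va_bits - vmemmap_shift));	/* VMEMMAP_START */
	r.vmemmap_end = r.vmemmap_start +
		((page_end - page_offset) >> vmemmap_shift);	/* + VMEMMAP_SIZE */
	r.vmalloc_start = r.modules_end;			/* VMALLOC_START */
	r.vmalloc_end = r.vmemmap_start - (256UL << 20);	/* VMEMMAP_START - SZ_256M */
	return r;
}
```

Under these assumptions the modules region is 0xffff800000000000–0xffff800008000000 and VMALLOC_END is 0xfffffbfff0000000, term for term what the macros above produce.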
< 2 > What does this patch do?
1.) Use arm64_get_struct_page_size() to get the size of struct page{} at the PRE_GDB stage.
2.) If the above step succeeds, we try to call arm64_get_va_range() to
get the proper kernel virtual ranges.
In the arm64_get_va_range(), we calculate the ranges by the hooks of
different kernel versions:
get_range: arm64_get_range_v5_17,
get_range: arm64_get_range_v5_11,
get_range: arm64_get_range_v5_4,
get_range: arm64_get_range_v5_0,
3.) If the above steps succeed, arm64_calc_virtual_memory_ranges()
is skipped. If they fail, arm64_calc_virtual_memory_ranges()
continues to do its work.
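The per-version hooks listed above are selected from a half-open version-range table. The sketch below models just that dispatch (the LINUX() packing mirrors the crash macro; the enum values stand in for the real get_range hooks, which return a struct kernel_range):

```c
/* Minimal model of the kernel_va_range_handlers[] dispatch; the enum
 * values are placeholders for the real arm64_get_range_v5_* hooks. */
#define LINUX(a,b,c) (((unsigned long)(a) << 16) + ((b) << 8) + (c))

enum layout { LAYOUT_NONE = 0, LAYOUT_V5_0, LAYOUT_V5_4, LAYOUT_V5_11, LAYOUT_V5_17 };

static const struct {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	enum layout layout;
} handlers[] = {
	{ LINUX(5,17,0), LINUX(99,0,0), LAYOUT_V5_17 },
	{ LINUX(5,11,0), LINUX(5,17,0), LAYOUT_V5_11 },
	{ LINUX(5,4,0),  LINUX(5,11,0), LAYOUT_V5_4  },
	{ LINUX(5,0,0),  LINUX(5,4,0),  LAYOUT_V5_0  },
};

enum layout pick_layout(unsigned long version)
{
	unsigned long i;

	for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++)
		if (handlers[i].start <= version && version < handlers[i].end)
			return handlers[i].layout;
	/* no match: fall back to arm64_calc_virtual_memory_ranges() */
	return LAYOUT_NONE;
}
```

For example, a 5.10 kernel lands in the [5.4, 5.11) slot, a 6.x kernel still matches the open-ended [5.17, 99.0.0) slot, and anything older than 5.0 falls through to the legacy calculation.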
< 3 > Test this patch.
Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
(The CONFIG_KASAN is NOT set for this kernel.)
Before this patch, we get the wrong output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800048000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800040000000
modules_end: ffff800047ffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
After this patch, we can get the correct output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800010000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800008000000
modules_end: ffff80000fffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v5 --> v6:
1.) Found a bug in v5:
we need to subtract 1 for Crash's modules_end/vmalloc_end/vmemmap_end.
2.) Change version limit to LINUX(99, 0, 0)
3.) Tested it again.
v4 --> v5:
1.) Reset ms->struct_page_size to 0 if arm64_get_va_range() fails,
so arm64_calc_virtual_memory_ranges() can continue its work.
2.) Tested again with the new code.
v3 --> v4:
1.) Add struct_page_size to @ms.
Change some functions, such as arm64_init() and arm64_get_struct_page_size().
(Do not use the ASSIGN_SIZE, use the ms->struct_page_size instead.)
2.) Tested again with the new code.
v2 --> v3:
Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
we should use ms->CONFIG_ARM64_VA_BITS, not ms->VA_BITS, to calculate
the vmemmap_vaddr.
v1 --> v2:
The Crash code is based on v4.20 not v4.9.
Changed the commit message about it.
---
arm64.c | 379 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
defs.h | 1 +
2 files changed, 369 insertions(+), 11 deletions(-)
diff --git a/arm64.c b/arm64.c
index 3ab8489..841016c 100644
--- a/arm64.c
+++ b/arm64.c
@@ -92,6 +92,14 @@ static void arm64_calc_VA_BITS(void);
static int arm64_is_uvaddr(ulong, struct task_context *);
static void arm64_calc_KERNELPACMASK(void);
+struct kernel_range {
+ unsigned long modules_vaddr, modules_end;
+ unsigned long vmalloc_start_addr, vmalloc_end;
+ unsigned long vmemmap_vaddr, vmemmap_end;
+};
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
+static void arm64_get_struct_page_size(struct machine_specific *ms);
+
static void arm64_calc_kernel_start(void)
{
struct machine_specific *ms = machdep->machspec;
@@ -233,9 +241,10 @@ arm64_init(int when)
machdep->pageoffset = machdep->pagesize - 1;
machdep->pagemask = ~((ulonglong)machdep->pageoffset);
+ ms = machdep->machspec;
+ arm64_get_struct_page_size(ms);
arm64_calc_VA_BITS();
arm64_calc_KERNELPACMASK();
- ms = machdep->machspec;
/* vabits_actual introduced after mm flip, so it should be flipped layout */
if (ms->VA_BITS_ACTUAL) {
@@ -252,8 +261,15 @@ arm64_init(int when)
}
machdep->is_kvaddr = generic_is_kvaddr;
machdep->kvtop = arm64_kvtop;
+
+ /* The defaults */
+ ms->vmalloc_end = ARM64_VMALLOC_END;
+ ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
+ ms->vmemmap_end = ARM64_VMEMMAP_END;
+
if (machdep->flags & NEW_VMEMMAP) {
struct syment *sp;
+ struct kernel_range *r;
/* It is finally decided in arm64_calc_kernel_start() */
sp = kernel_symbol_search("_text");
@@ -261,27 +277,36 @@ arm64_init(int when)
sp = kernel_symbol_search("_end");
ms->kimage_end = (sp ? sp->value : 0);
- if (ms->VA_BITS_ACTUAL) {
+ if (ms->struct_page_size && (r = arm64_get_va_range(ms))) {
+ /* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
+ ms->modules_vaddr = r->modules_vaddr;
+ ms->modules_end = r->modules_end - 1;
+ ms->vmalloc_start_addr = r->vmalloc_start_addr;
+ ms->vmalloc_end = r->vmalloc_end - 1;
+ ms->vmemmap_vaddr = r->vmemmap_vaddr;
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0))
+ ms->vmemmap_end = r->vmemmap_end - 1;
+ else
+ ms->vmemmap_end = -1;
+
+ } else if (ms->VA_BITS_ACTUAL) {
ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
} else {
ms->modules_vaddr = ARM64_VA_START;
if (kernel_symbol_exists("kasan_init"))
ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
}
- ms->vmalloc_start_addr = ms->modules_end + 1;
-
arm64_calc_kimage_voffset();
} else {
ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
ms->modules_end = ARM64_PAGE_OFFSET - 1;
ms->vmalloc_start_addr = ARM64_VA_START;
}
- ms->vmalloc_end = ARM64_VMALLOC_END;
- ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
- ms->vmemmap_end = ARM64_VMEMMAP_END;
switch (machdep->pagesize)
{
@@ -404,7 +429,12 @@ arm64_init(int when)
case POST_GDB:
/* Rely on kernel version to decide the kernel start address */
arm64_calc_kernel_start();
- arm64_calc_virtual_memory_ranges();
+
+ /* Can we get the size of struct page before POST_GDB */
+ ms = machdep->machspec;
+ if (!ms->struct_page_size)
+ arm64_calc_virtual_memory_ranges();
+
arm64_get_section_size_bits();
if (!machdep->max_physmem_bits) {
@@ -419,8 +449,6 @@ arm64_init(int when)
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
}
- ms = machdep->machspec;
-
if (CRASHDEBUG(1)) {
if (ms->VA_BITS_ACTUAL) {
fprintf(fp, "CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
@@ -511,6 +539,336 @@ arm64_init(int when)
}
}
+struct kernel_va_range_handler {
+ unsigned long kernel_versions_start; /* include */
+ unsigned long kernel_versions_end; /* exclude */
+ struct kernel_range *(*get_range)(struct machine_specific *);
+};
+
+static struct kernel_range tmp_range;
+#define _PAGE_END(va) (-(1UL << ((va) - 1)))
+#define SZ_64K 0x00010000
+#define SZ_2M 0x00200000
+
+/*
+ * Get the max shift of the size of struct page.
+ * Most of the time, it is 64 bytes, but not sure.
+ */
+static int arm64_get_struct_page_max_shift(struct machine_specific *ms)
+{
+ unsigned long v = ms->struct_page_size;
+
+ if (16 < v && v <= 32)
+ return 5;
+ if (32 < v && v <= 64)
+ return 6;
+ if (64 < v && v <= 128)
+ return 7;
+
+ error(FATAL, "We should not have such struct page size:%ld!\n", v);
+ return 0;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.17-rc1:
+ * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
+ */
+static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v);
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.11:
+ * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+static unsigned long arm64_get_pud_size(void)
+{
+ unsigned long PUD_SIZE = 0;
+
+ switch (machdep->pagesize) {
+ case 4096:
+ if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
+ PUD_SIZE = PUD_SIZE_L4_4K;
+ } else {
+ PUD_SIZE = PGDIR_SIZE_L3_4K;
+ }
+ break;
+
+ case 65536:
+ PUD_SIZE = PGDIR_SIZE_L2_64K;
+ default:
+ break;
+ }
+ return PUD_SIZE;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.4, such as:
+ * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
+ * "14c127c957c1 arm64: mm: Flip kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ char *string;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ /* See the arch/arm64/Kconfig*/
+ ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
+ if (ret != IKCONFIG_STR)
+ return NULL;
+ kasan_shadow_offset = atol(string);
+
+ r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset
+ + bpf_jit_size;
+ } else {
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
+ /*
+ * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
+ * adds the VMEMMAP_END.
+ */
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+ } else {
+ r->vmemmap_end = 0xffffffffffffffffUL;
+ }
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.0, such as:
+ * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
+ */
+static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, PUD_SIZE;
+ unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ unsigned long va_start, page_offset;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
+ page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
+ } else {
+ r->modules_vaddr = va_start + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift(ms)));
+
+ r->vmemmap_vaddr = page_offset - vmemmap_size;
+ r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
+ return r;
+}
+
+static struct kernel_va_range_handler kernel_va_range_handlers[] = {
+ {
+ LINUX(5,17,0),
+ LINUX(99,0,0), /* Just a boundary, Change it later */
+ get_range: arm64_get_range_v5_17,
+ }, {
+ LINUX(5,11,0), LINUX(5,17,0),
+ get_range: arm64_get_range_v5_11,
+ }, {
+ LINUX(5,4,0), LINUX(5,11,0),
+ get_range: arm64_get_range_v5_4,
+ }, {
+ LINUX(5,0,0), LINUX(5,4,0),
+ get_range: arm64_get_range_v5_0,
+ },
+};
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+
+static unsigned long arm64_get_kernel_version(void)
+{
+ char *string;
+ char buf[BUFSIZE];
+ char *p1, *p2;
+
+ if (THIS_KERNEL_VERSION)
+ return THIS_KERNEL_VERSION;
+
+ string = pc->read_vmcoreinfo("OSRELEASE");
+ if (string) {
+ strcpy(buf, string);
+
+ p1 = p2 = buf;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[0] = atoi(p1);
+
+ p1 = ++p2;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[1] = atoi(p1);
+
+ p1 = ++p2;
+ while ((*p2 >= '0') && (*p2 <= '9'))
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[2] = atoi(p1);
+ }
+ free(string);
+ return THIS_KERNEL_VERSION;
+}
+
+/* Return NULL if we fail. */
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
+{
+ struct kernel_va_range_handler *h;
+ unsigned long kernel_version = arm64_get_kernel_version();
+ struct kernel_range *r = NULL;
+ int i;
+
+ if (!kernel_version)
+ goto range_failed;
+
+ for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
+ h = kernel_va_range_handlers + i;
+
+ /* Get the right hook for this kernel version */
+ if (h->kernel_versions_start <= kernel_version &&
+ kernel_version < h->kernel_versions_end) {
+
+ /* Get the correct virtual address ranges */
+ r = h->get_range(ms);
+ if (!r)
+ goto range_failed;
+ return r;
+ }
+ }
+
+range_failed:
+ /* Reset ms->struct_page_size to 0 for arm64_calc_virtual_memory_ranges() */
+ ms->struct_page_size = 0;
+ return NULL;
+}
+
+/* Get the size of struct page {} */
+static void arm64_get_struct_page_size(struct machine_specific *ms)
+{
+ char *string;
+
+ string = pc->read_vmcoreinfo("SIZE(page)");
+ if (string)
+ ms->struct_page_size = atol(string);
+ free(string);
+}
+
/*
* Accept or reject a symbol from the kernel namelist.
*/
@@ -4272,7 +4630,6 @@ arm64_calc_VA_BITS(void)
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
-#define SZ_64K 0x00010000
static void
arm64_calc_virtual_memory_ranges(void)
diff --git a/defs.h b/defs.h
index bf2c59b..81ac049 100644
--- a/defs.h
+++ b/defs.h
@@ -3386,6 +3386,7 @@ struct machine_specific {
ulong VA_START;
ulong CONFIG_ARM64_KERNELPACMASK;
ulong physvirt_offset;
+ ulong struct_page_size;
};
struct arm64_stackframe {
--
2.30.2
[PATCH v2] ps: Add support to "ps -l|-m" to properly display process list
by Austin Kim
Sometimes a kernel image is built without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
The relevant commit is f6db83479932 ("sched/stat: Simplify the sched_info accounting")
- CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
- CONFIG_SCHEDSTATS: KERNEL_VERSION < LINUX(4,2,0)
Running crash-utility against such a kernel image,
the "ps -l" option cannot display the process list sorted by most recently-run process,
and the "ps -m" option cannot display the processes with timestamps.
crash> ps -l or crash> ps -m
ps: last-run timestamps do not exist in this kernel
Usage: ps [-k|-u|-G] [-s]
[-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
[pid | task | command] ...
Enter "help ps" for details.
This is because the output of "ps -l|-m" depends on task_struct.sched_info.last_arrival.
Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, the 'sched_info' field is not included
in task_struct.
So we make the "ps -l|-m" options fall back to the 'exec_start' field of sched_entity,
where 'exec_start' is task_struct.se.exec_start.
With this patch, the "ps -l|-m" options work well without CONFIG_SCHEDSTATS or
CONFIG_SCHED_INFO.
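As I read the patch, the timestamp source becomes a priority chain ending in the new se.exec_start fallback. The sketch below models that selection; the flag parameters and names are hypothetical illustrations, whereas the real code in task_last_run() tests VALID_MEMBER()/INVALID_MEMBER() offsets:

```c
/* Illustrative model of the timestamp-source priority in task_last_run();
 * field availability is modeled with flags, not real offset checks. */
enum ts_source {
	TS_NONE = 0,		/* "last-run timestamps do not exist" error */
	TS_LAST_RUN,		/* task_struct.last_run (older kernels) */
	TS_TIMESTAMP,		/* task_struct.timestamp */
	TS_LAST_ARRIVAL,	/* task_struct.sched_info.last_arrival */
	TS_EXEC_START,		/* task_struct.se.exec_start (this patch) */
};

enum ts_source pick_timestamp_source(int has_last_run, int has_timestamp,
				     int has_last_arrival, int has_exec_start)
{
	if (has_last_run)
		return TS_LAST_RUN;
	if (has_timestamp)
		return TS_TIMESTAMP;
	if (has_last_arrival)
		return TS_LAST_ARRIVAL;
	if (has_exec_start)	/* fallback added by this patch */
		return TS_EXEC_START;
	return TS_NONE;
}
```

So a kernel with neither CONFIG_SCHEDSTATS nor CONFIG_SCHED_INFO, but with a sched_entity, resolves to se.exec_start instead of erroring out.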
Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
---
defs.h | 2 ++
symbols.c | 2 ++
task.c | 20 ++++++++++++++++----
3 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/defs.h b/defs.h
index bf2c59b..5dda176 100644
--- a/defs.h
+++ b/defs.h
@@ -2168,6 +2168,8 @@ struct offset_table { /* stash of commonly-used offsets */
long sbitmap_queue_min_shallow_depth;
long sbq_wait_state_wait_cnt;
long sbq_wait_state_wait;
+ long task_struct_sched_entity;
+ long se_exec_start;
};
struct size_table { /* stash of commonly-used sizes */
diff --git a/symbols.c b/symbols.c
index ba5e274..e5abe87 100644
--- a/symbols.c
+++ b/symbols.c
@@ -8892,6 +8892,8 @@ dump_offset_table(char *spec, ulong makestruct)
OFFSET(sched_rt_entity_run_list));
fprintf(fp, " sched_info_last_arrival: %ld\n",
OFFSET(sched_info_last_arrival));
+ fprintf(fp, " se_exec_start: %ld\n",
+ OFFSET(se_exec_start));
fprintf(fp, " task_struct_thread_info: %ld\n",
OFFSET(task_struct_thread_info));
fprintf(fp, " task_struct_stack: %ld\n",
diff --git a/task.c b/task.c
index 864c838..55e2312 100644
--- a/task.c
+++ b/task.c
@@ -334,9 +334,15 @@ task_init(void)
if (VALID_MEMBER(task_struct_sched_info))
MEMBER_OFFSET_INIT(sched_info_last_arrival,
"sched_info", "last_arrival");
+ MEMBER_OFFSET_INIT(task_struct_sched_entity, "task_struct", "se");
+ if (VALID_MEMBER(task_struct_sched_entity)) {
+ STRUCT_SIZE_INIT(sched_entity, "sched_entity");
+ MEMBER_OFFSET_INIT(se_exec_start, "sched_entity", "exec_start");
+ }
if (VALID_MEMBER(task_struct_last_run) ||
VALID_MEMBER(task_struct_timestamp) ||
- VALID_MEMBER(sched_info_last_arrival)) {
+ VALID_MEMBER(sched_info_last_arrival) ||
+ VALID_MEMBER(se_exec_start)) {
char buf[BUFSIZE];
strcpy(buf, "alias last ps -l");
alias_init(buf);
@@ -3559,7 +3565,8 @@ cmd_ps(void)
case 'm':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(se_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
@@ -3574,7 +3581,8 @@ cmd_ps(void)
case 'l':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(se_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
@@ -6020,7 +6028,11 @@ task_last_run(ulong task)
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
OFFSET(task_struct_sched_info) +
OFFSET(sched_info_last_arrival)) : 0;
-
+ else if (VALID_MEMBER(se_exec_start))
+ timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
+ OFFSET(task_struct_sched_entity) +
+ OFFSET(se_exec_start)) : 0;
+
return timestamp;
}
--
2.20.1
[PATCH v5] arm64: update the modules/vmalloc/vmemmap ranges
by Huang Shijie
< 1 > The background.
The current crash code is still based on kernel v4.20, but the kernel is now at v5.17-rc4.
The MODULES/VMALLOC/VMEMMAP ranges have not been updated since v4.20.
Below are all the changes from kernel v4.20 to v5.17:
1.) The current crash code is based on kernel v4.20.
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
2.) In kernel v5.0, a patch added a new BPF JIT region:
"91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
The virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (PAGE_OFFSET - VMEMMAP_SIZE)
The layout did not change again until v5.4.
3.) In kernel v5.4, several patches changed the layout, such as:
"ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
"14c127c957c1 arm64: mm: Flip kernel VA space"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| KASAN | BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M)
In the v5.7, the patch:
"bbd6ec605c arm64/mm: Enable memory hot remove"
adds the VMEMMAP_END.
4.) In kernel v5.11, several patches changed the layout, such as:
"9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
"f4693c2716b3 arm64: mm: extend linear region for 52-bit VA configurations"
and the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| BPF_JIT | MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (BPF_JIT_REGION_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
5.) In the kernel v5.17-rc1, after the patch
"b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
the virtual memory layout looks like this:
+--------------------------------------------------------------------+
| MODULE | VMALLOC | .... | VMEMMAP |
+--------------------------------------------------------------------+
The macros are:
#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (VMEMMAP_START - SZ_256M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
< 2 > What does this patch do?
1.) Use arm64_get_struct_page_size() to get the size of struct page{} at the PRE_GDB stage.
2.) If the above step succeeds, we try to call arm64_get_va_range() to
get the proper kernel virtual ranges.
In the arm64_get_va_range(), we calculate the ranges by the hooks of
different kernel versions:
get_range: arm64_get_range_v5_17,
get_range: arm64_get_range_v5_11,
get_range: arm64_get_range_v5_4,
get_range: arm64_get_range_v5_0,
3.) If the above steps succeed, arm64_calc_virtual_memory_ranges()
is skipped. If they fail, arm64_calc_virtual_memory_ranges()
continues to do its work.
< 3 > Test this patch.
Tested this patch with a vmcore produced by a 5.4.119 kernel panic.
(CONFIG_KASAN is NOT set for this kernel.)
Before this patch, we get the wrong output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800048000000
vmalloc_end: fffffdffbffeffff
modules_vaddr: ffff800040000000
modules_end: ffff800047ffffff
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
After this patch, we can get the correct output from "help -m":
----------------------------------------------------------
vmalloc_start_addr: ffff800010000000
vmalloc_end: fffffdffbfff0000
modules_vaddr: ffff800008000000
modules_end: ffff800010000000
vmemmap_vaddr: fffffdffffe00000
vmemmap_end: ffffffffffffffff
----------------------------------------------------------
Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v4 --> v5:
1.) Reset ms->struct_page_size to 0 if arm64_get_va_range() fails,
so arm64_calc_virtual_memory_ranges() can continue its work.
2.) Tested again with the new code.
v3 --> v4:
1.) Add struct_page_size to @ms.
Change some functions, such as arm64_init() and arm64_get_struct_page_size().
(Do not use the ASSIGN_SIZE, use the ms->struct_page_size instead.)
2.) Tested again with the new code.
v2 --> v3:
Found two bugs in arm64_get_range_v5_17/arm64_get_range_v5_11:
We should use ms->CONFIG_ARM64_VA_BITS to calculate the
vmemmap_vaddr, not ms->VA_BITS.
v1 --> v2:
The crash code is based on v4.20, not v4.9.
Changed the commit message accordingly.
---
arm64.c | 375 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
defs.h | 1 +
2 files changed, 365 insertions(+), 11 deletions(-)
diff --git a/arm64.c b/arm64.c
index 3ab8489..94203f0 100644
--- a/arm64.c
+++ b/arm64.c
@@ -92,6 +92,14 @@ static void arm64_calc_VA_BITS(void);
static int arm64_is_uvaddr(ulong, struct task_context *);
static void arm64_calc_KERNELPACMASK(void);
+struct kernel_range {
+ unsigned long modules_vaddr, modules_end;
+ unsigned long vmalloc_start_addr, vmalloc_end;
+ unsigned long vmemmap_vaddr, vmemmap_end;
+};
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms);
+static void arm64_get_struct_page_size(struct machine_specific *ms);
+
static void arm64_calc_kernel_start(void)
{
struct machine_specific *ms = machdep->machspec;
@@ -233,9 +241,10 @@ arm64_init(int when)
machdep->pageoffset = machdep->pagesize - 1;
machdep->pagemask = ~((ulonglong)machdep->pageoffset);
+ ms = machdep->machspec;
+ arm64_get_struct_page_size(ms);
arm64_calc_VA_BITS();
arm64_calc_KERNELPACMASK();
- ms = machdep->machspec;
/* vabits_actual introduced after mm flip, so it should be flipped layout */
if (ms->VA_BITS_ACTUAL) {
@@ -252,8 +261,15 @@ arm64_init(int when)
}
machdep->is_kvaddr = generic_is_kvaddr;
machdep->kvtop = arm64_kvtop;
+
+ /* The defaults */
+ ms->vmalloc_end = ARM64_VMALLOC_END;
+ ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
+ ms->vmemmap_end = ARM64_VMEMMAP_END;
+
if (machdep->flags & NEW_VMEMMAP) {
struct syment *sp;
+ struct kernel_range *r;
/* It is finally decided in arm64_calc_kernel_start() */
sp = kernel_symbol_search("_text");
@@ -261,27 +277,32 @@ arm64_init(int when)
sp = kernel_symbol_search("_end");
ms->kimage_end = (sp ? sp->value : 0);
- if (ms->VA_BITS_ACTUAL) {
+ if (ms->struct_page_size && (r = arm64_get_va_range(ms))) {
+ /* We can get all the MODULES/VMALLOC/VMEMMAP ranges now.*/
+ ms->modules_vaddr = r->modules_vaddr;
+ ms->modules_end = r->modules_end;
+ ms->vmalloc_start_addr = r->vmalloc_start_addr;
+ ms->vmalloc_end = r->vmalloc_end;
+ ms->vmemmap_vaddr = r->vmemmap_vaddr;
+ ms->vmemmap_end = r->vmemmap_end;
+ } else if (ms->VA_BITS_ACTUAL) {
ms->modules_vaddr = (st->_stext_vmlinux & TEXT_OFFSET_MASK) - ARM64_MODULES_VSIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
} else {
ms->modules_vaddr = ARM64_VA_START;
if (kernel_symbol_exists("kasan_init"))
ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
ms->modules_end = ms->modules_vaddr + ARM64_MODULES_VSIZE -1;
+ ms->vmalloc_start_addr = ms->modules_end + 1;
}
- ms->vmalloc_start_addr = ms->modules_end + 1;
-
arm64_calc_kimage_voffset();
} else {
ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
ms->modules_end = ARM64_PAGE_OFFSET - 1;
ms->vmalloc_start_addr = ARM64_VA_START;
}
- ms->vmalloc_end = ARM64_VMALLOC_END;
- ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
- ms->vmemmap_end = ARM64_VMEMMAP_END;
switch (machdep->pagesize)
{
@@ -404,7 +425,12 @@ arm64_init(int when)
case POST_GDB:
/* Rely on kernel version to decide the kernel start address */
arm64_calc_kernel_start();
- arm64_calc_virtual_memory_ranges();
+
+ /* Can we get the size of struct page before POST_GDB */
+ ms = machdep->machspec;
+ if (!ms->struct_page_size)
+ arm64_calc_virtual_memory_ranges();
+
arm64_get_section_size_bits();
if (!machdep->max_physmem_bits) {
@@ -419,8 +445,6 @@ arm64_init(int when)
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
}
- ms = machdep->machspec;
-
if (CRASHDEBUG(1)) {
if (ms->VA_BITS_ACTUAL) {
fprintf(fp, "CONFIG_ARM64_VA_BITS: %ld\n", ms->CONFIG_ARM64_VA_BITS);
@@ -511,6 +535,336 @@ arm64_init(int when)
}
}
+struct kernel_va_range_handler {
+ unsigned long kernel_versions_start; /* include */
+ unsigned long kernel_versions_end; /* exclude */
+ struct kernel_range *(*get_range)(struct machine_specific *);
+};
+
+static struct kernel_range tmp_range;
+#define _PAGE_END(va) (-(1UL << ((va) - 1)))
+#define SZ_64K 0x00010000
+#define SZ_2M 0x00200000
+
+/*
+ * Get the max shift of the size of struct page.
+ * Most of the time, it is 64 bytes, but not sure.
+ */
+static int arm64_get_struct_page_max_shift(struct machine_specific *ms)
+{
+ unsigned long v = ms->struct_page_size;
+
+ if (16 < v && v <= 32)
+ return 5;
+ if (32 < v && v <= 64)
+ return 6;
+ if (64 < v && v <= 128)
+ return 7;
+
+ error(FATAL, "We should not have such struct page size:%d!\n", v);
+ return 0;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.17-rc1:
+ * "b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs"
+ */
+static struct kernel_range *arm64_get_range_v5_17(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v);
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patch since v5.11:
+ * "9ad7c6d5e75b arm64: mm: tidy up top of kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_11(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-(1UL << (ms->CONFIG_ARM64_VA_BITS - vmem_shift)));
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = r->vmemmap_vaddr - MEGABYTES(256);
+ return r;
+}
+
+static unsigned long arm64_get_pud_size(void)
+{
+ unsigned long PUD_SIZE = 0;
+
+ switch (machdep->pagesize) {
+ case 4096:
+ if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L4_4K) {
+ PUD_SIZE = PUD_SIZE_L4_4K;
+ } else {
+ PUD_SIZE = PGDIR_SIZE_L3_4K;
+ }
+ break;
+
+ case 65536:
+ PUD_SIZE = PGDIR_SIZE_L2_64K;
+ default:
+ break;
+ }
+ return PUD_SIZE;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.4, such as:
+ * "ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE"
+ * "14c127c957c1 arm64: mm: Flip kernel VA space"
+ */
+static struct kernel_range *arm64_get_range_v5_4(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, kasan_shadow_offset, PUD_SIZE;
+ unsigned long vmem_shift, vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ char *string;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ if (v > 48)
+ v = 48;
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ /* See the arch/arm64/Kconfig*/
+ ret = get_kernel_config("CONFIG_KASAN_SHADOW_OFFSET", &string);
+ if (ret != IKCONFIG_STR)
+ return NULL;
+ kasan_shadow_offset = atol(string);
+
+ r->modules_vaddr = (1UL << (64 - kasan_shadow_shift)) + kasan_shadow_offset
+ + bpf_jit_size;
+ } else {
+ r->modules_vaddr = _PAGE_END(v) + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmem_shift = machdep->pageshift - arm64_get_struct_page_max_shift(ms);
+ vmemmap_size = (_PAGE_END(v) - PAGE_OFFSET) >> vmem_shift;
+
+ r->vmemmap_vaddr = (-vmemmap_size - SZ_2M);
+ if (THIS_KERNEL_VERSION >= LINUX(5, 7, 0)) {
+ /*
+ * In the v5.7, the patch: "bbd6ec605c arm64/mm: Enable memory hot remove"
+ * adds the VMEMMAP_END.
+ */
+ r->vmemmap_end = r->vmemmap_vaddr + vmemmap_size;
+ } else {
+ r->vmemmap_end = 0xffffffffffffffffUL;
+ }
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = (-PUD_SIZE - vmemmap_size - SZ_64K);
+ return r;
+}
+
+/*
+ * The change is caused by the kernel patches since v5.0, such as:
+ * "91fc957c9b1d arm64/bpf: don't allocate BPF JIT programs in module memory"
+ */
+static struct kernel_range *arm64_get_range_v5_0(struct machine_specific *ms)
+{
+ struct kernel_range *r = &tmp_range;
+ unsigned long v = ms->CONFIG_ARM64_VA_BITS;
+ unsigned long kasan_shadow_shift, PUD_SIZE;
+ unsigned long vmemmap_size, bpf_jit_size = MEGABYTES(128);
+ unsigned long va_start, page_offset;
+ int ret;
+
+ /* Not initialized yet */
+ if (v == 0)
+ return NULL;
+
+ va_start = (0xffffffffffffffffUL - (1UL << v) + 1);
+ page_offset = (0xffffffffffffffffUL - (1UL << (v - 1)) + 1);
+
+ /* Get the MODULES_VADDR ~ MODULES_END */
+ if (kernel_symbol_exists("kasan_init")) {
+ /* See the arch/arm64/Makefile */
+ ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL);
+ if (ret == IKCONFIG_N)
+ return NULL;
+ kasan_shadow_shift = (ret == IKCONFIG_Y) ? 4: 3;
+
+ r->modules_vaddr = va_start + (1UL << (v - kasan_shadow_shift)) + bpf_jit_size;
+ } else {
+ r->modules_vaddr = va_start + bpf_jit_size;
+ }
+
+ r->modules_end = r->modules_vaddr + MEGABYTES(128);
+
+ /* Get the VMEMMAP_START ~ VMEMMAP_END */
+ vmemmap_size = (1UL << (v - machdep->pageshift - 1 + arm64_get_struct_page_max_shift(ms)));
+
+ r->vmemmap_vaddr = page_offset - vmemmap_size;
+ r->vmemmap_end = 0xffffffffffffffffUL; /* this kernel does not have VMEMMAP_END */
+
+ /* Get the VMALLOC_START ~ VMALLOC_END */
+ PUD_SIZE = arm64_get_pud_size();
+
+ r->vmalloc_start_addr = r->modules_end;
+ r->vmalloc_end = page_offset - PUD_SIZE - vmemmap_size - SZ_64K;
+ return r;
+}
+
+static struct kernel_va_range_handler kernel_va_range_handlers[] = {
+ {
+ LINUX(5,17,0),
+ LINUX(6,0,0), /* Just a boundary, Change it later */
+ get_range: arm64_get_range_v5_17,
+ }, {
+ LINUX(5,11,0), LINUX(5,17,0),
+ get_range: arm64_get_range_v5_11,
+ }, {
+ LINUX(5,4,0), LINUX(5,11,0),
+ get_range: arm64_get_range_v5_4,
+ }, {
+ LINUX(5,0,0), LINUX(5,4,0),
+ get_range: arm64_get_range_v5_0,
+ },
+};
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+
+static unsigned long arm64_get_kernel_version(void)
+{
+ char *string;
+ char buf[BUFSIZE];
+ char *p1, *p2;
+
+ if (THIS_KERNEL_VERSION)
+ return THIS_KERNEL_VERSION;
+
+ string = pc->read_vmcoreinfo("OSRELEASE");
+ if (string) {
+ strcpy(buf, string);
+
+ p1 = p2 = buf;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[0] = atoi(p1);
+
+ p1 = ++p2;
+ while (*p2 != '.')
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[1] = atoi(p1);
+
+ p1 = ++p2;
+ while ((*p2 >= '0') && (*p2 <= '9'))
+ p2++;
+ *p2 = NULLCHAR;
+ kt->kernel_version[2] = atoi(p1);
+ }
+ free(string);
+ return THIS_KERNEL_VERSION;
+}
+
+/* Return NULL if we fail. */
+static struct kernel_range *arm64_get_va_range(struct machine_specific *ms)
+{
+ struct kernel_va_range_handler *h;
+ unsigned long kernel_version = arm64_get_kernel_version();
+ struct kernel_range *r = NULL;
+ int i;
+
+ if (!kernel_version)
+ goto range_failed;
+
+ for (i = 0; i < ARRAY_SIZE(kernel_va_range_handlers); i++) {
+ h = kernel_va_range_handlers + i;
+
+ /* Get the right hook for this kernel version */
+ if (h->kernel_versions_start <= kernel_version &&
+ kernel_version < h->kernel_versions_end) {
+
+ /* Get the correct virtual address ranges */
+ r = h->get_range(ms);
+ if (!r)
+ goto range_failed;
+ return r;
+ }
+ }
+
+range_failed:
+ /* Reset ms->struct_page_size to 0 for arm64_calc_virtual_memory_ranges() */
+ ms->struct_page_size = 0;
+ return NULL;
+}
+
+/* Get the size of struct page {} */
+static void arm64_get_struct_page_size(struct machine_specific *ms)
+{
+ char *string;
+
+ string = pc->read_vmcoreinfo("SIZE(page)");
+ if (string)
+ ms->struct_page_size = atol(string);
+ free(string);
+}
+
/*
* Accept or reject a symbol from the kernel namelist.
*/
@@ -4272,7 +4626,6 @@ arm64_calc_VA_BITS(void)
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
-#define SZ_64K 0x00010000
static void
arm64_calc_virtual_memory_ranges(void)
diff --git a/defs.h b/defs.h
index bf2c59b..81ac049 100644
--- a/defs.h
+++ b/defs.h
@@ -3386,6 +3386,7 @@ struct machine_specific {
ulong VA_START;
ulong CONFIG_ARM64_KERNELPACMASK;
ulong physvirt_offset;
+ ulong struct_page_size;
};
struct arm64_stackframe {
--
2.30.2
Re: [Crash-utility] [PATCH 1/2] ps: Add support to "ps -l" to properly display process list
by lijiang
On Tue, Mar 1, 2022 at 10:27 AM <crash-utility-request(a)redhat.com> wrote:
> Date: Tue, 1 Mar 2022 02:26:32 +0000
> From: HAGIO KAZUHITO(?????) <k-hagio-ab(a)nec.com>
> To: lijiang <lijiang(a)redhat.com>
> Cc: "Discussion list for crash utility usage, maintenance and
> development" <crash-utility(a)redhat.com>
> Subject: Re: [Crash-utility] [PATCH 1/2] ps: Add support to "ps -l" to
> properly display process list
> Message-ID:
> <
> TYYPR01MB6777F039D25FAFD0A6486D81DD029(a)TYYPR01MB6777.jpnprd01.prod.outlook.com
> >
>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Lianbo,
>
> -----Original Message-----
> > >> diff --git a/defs.h b/defs.h
> > >> index 7d386d2..ed2f5ca 100644
> > >> --- a/defs.h
> > >> +++ b/defs.h
> > >> @@ -1768,6 +1768,8 @@ struct offset_table { /*
> stash of commonly-used offsets */
> > >> long vcpu_struct_rq;
> > >> long task_struct_sched_info;
> > >> long sched_info_last_arrival;
> > >> + long task_struct_sched_entity;
> > >> + long se_exec_start;
> > >
> > >
> > > This can be only appended to the end of the offset_table.
> > > For more details, refer to the section "writing patches" in wiki:
> > > https://github.com/crash-utility/crash/wiki
>
> Seeing this exchange and thought of something like this:
>
> --- a/defs.h
> +++ b/defs.h
> @@ -1215,8 +1215,8 @@ struct reference {
> void *refp;
> };
>
> -struct offset_table { /* stash of commonly-used
> offsets */
> - long list_head_next; /* add new entries to end of
> table */
> +struct offset_table { /* NOTE: add new entries to end of table
> [1] */
> + long list_head_next; /* [1]
> https://github.com/crash-utility/crash/wiki */
> long list_head_prev;
> long task_struct_pid;
> long task_struct_state;
>
>
> With this patch, a patch trying to add an entry to the middle of the table:
>
> diff --git a/defs.h b/defs.h
> index 938e39ca4baf..c0814482abaa 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2007,6 +2007,7 @@ struct offset_table { /* NOTE: add new
> entries to end of table [1] */
> long mm_struct_mm_count;
> long task_struct_thread_reg29;
> long task_struct_thread_reg31;
> + long foo_bar;
> long pt_regs_regs;
> long pt_regs_cp0_badvaddr;
> long address_space_page_tree;
>
>
> Having this even only for the offset_table, size_table and array_table
> might be effective to let developers notice that rule.
That might help, but it may still be ignored.
What do you think? Attached a patch.
>
Tried adding the contribution guidelines at the end of the email, like this:
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki
Or do we have a script to check the patch rules, like checkpatch.pl
in the kernel?
Thanks.
Lianbo
[PATCH v3] ps: Add support to "ps -l|-m" to properly display process
by Austin Kim
Sometimes the kernel image is built without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
The relevant commit is f6db83479932 ("sched/stat: Simplify the sched_info accounting"):
- CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
- CONFIG_SCHEDSTATS: KERNEL_VERSION < LINUX(4,2,0)
Running crash-utility with such a kernel image,
the "ps -l" option cannot display all processes sorted with the most recently-run
process first, and the "ps -m" option cannot display all processes with timestamps.
crash> ps -l or crash> ps -m
ps: last-run timestamps do not exist in this kernel
Usage: ps [-k|-u|-G] [-s]
[-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
[pid | task | command] ...
Enter "help ps" for details.
This is because the output of "ps -l|-m" depends on task_struct.sched_info.last_arrival.
Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, the 'sched_info' field is not included
in task_struct. In this case we make the "ps -l|-m" options access the 'exec_start'
field of sched_entity, i.e. task_struct.se.exec_start.
'task_struct.se.exec_start' holds the most recently-executed timestamp, updated
while the process is running in the cases below:
- enqueued to runqueue
- dequeued from runqueue
- scheduler tick is invoked
- etc
'task_struct.se.exec_start' is therefore a statistic that indicates the most
recently-run timestamp of process activity.
With this patch, the "ps -l|-m" options work well without CONFIG_SCHEDSTATS or
CONFIG_SCHED_INFO.
Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
---
defs.h | 1 +
help.c | 5 +++--
symbols.c | 2 ++
task.c | 20 ++++++++++++++++----
4 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/defs.h b/defs.h
index bf2c59b..841bd0b 100644
--- a/defs.h
+++ b/defs.h
@@ -2168,6 +2168,7 @@ struct offset_table { /* stash of commonly-used offsets */
long sbitmap_queue_min_shallow_depth;
long sbq_wait_state_wait_cnt;
long sbq_wait_state_wait;
+ long sched_entity_exec_start;
};
struct size_table { /* stash of commonly-used sizes */
diff --git a/help.c b/help.c
index 8347668..6ca7c92 100644
--- a/help.c
+++ b/help.c
@@ -1442,7 +1442,8 @@ char *help_ps[] = {
" and system times.",
" -l display the task's last-run timestamp value, using either the",
" task_struct's last_run value, the task_struct's timestamp value",
-" or the task_struct's sched_entity last_arrival value, whichever",
+" the task_struct's sched_info last_arrival value",
+" or the task_struct's sched_entity exec_start value, whichever",
" applies, of selected, or all, tasks; the list is sorted with the",
" most recently-run task (with the largest timestamp) shown first,",
" followed by the task's current state.",
@@ -1621,7 +1622,7 @@ char *help_ps[] = {
" > 9497 1 0 ffff880549ec2ab0 RU 0.0 42314692 138664 oracle",
" ",
" Show all tasks sorted by their task_struct's last_run, timestamp, or",
-" sched_entity last_arrival timestamp value, whichever applies:\n",
+" sched_info last_arrival or sched_entity exec_start timestamp value, whichever applies:\n",
" %s> ps -l",
" [20811245123] [IN] PID: 37 TASK: f7153030 CPU: 2 COMMAND: \"events/2\"",
" [20811229959] [IN] PID: 1756 TASK: f2a5a570 CPU: 2 COMMAND: \"ntpd\"",
diff --git a/symbols.c b/symbols.c
index ba5e274..1c40586 100644
--- a/symbols.c
+++ b/symbols.c
@@ -10290,6 +10290,8 @@ dump_offset_table(char *spec, ulong makestruct)
OFFSET(sched_entity_my_q));
fprintf(fp, " sched_entity_on_rq: %ld\n",
OFFSET(sched_entity_on_rq));
+ fprintf(fp, " sched_entity_exec_start: %ld\n",
+ OFFSET(sched_entity_exec_start));
fprintf(fp, " cfs_rq_nr_running: %ld\n",
OFFSET(cfs_rq_nr_running));
fprintf(fp, " cfs_rq_rb_leftmost: %ld\n",
diff --git a/task.c b/task.c
index 864c838..2c12196 100644
--- a/task.c
+++ b/task.c
@@ -334,9 +334,15 @@ task_init(void)
if (VALID_MEMBER(task_struct_sched_info))
MEMBER_OFFSET_INIT(sched_info_last_arrival,
"sched_info", "last_arrival");
+ MEMBER_OFFSET_INIT(task_struct_se, "task_struct", "se");
+ if (VALID_MEMBER(task_struct_se)) {
+ STRUCT_SIZE_INIT(sched_entity, "sched_entity");
+ MEMBER_OFFSET_INIT(sched_entity_exec_start, "sched_entity", "exec_start");
+ }
if (VALID_MEMBER(task_struct_last_run) ||
VALID_MEMBER(task_struct_timestamp) ||
- VALID_MEMBER(sched_info_last_arrival)) {
+ VALID_MEMBER(sched_info_last_arrival) ||
+ VALID_MEMBER(sched_entity_exec_start)) {
char buf[BUFSIZE];
strcpy(buf, "alias last ps -l");
alias_init(buf);
@@ -3559,7 +3565,8 @@ cmd_ps(void)
case 'm':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(sched_entity_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
@@ -3574,7 +3581,8 @@ cmd_ps(void)
case 'l':
if (INVALID_MEMBER(task_struct_last_run) &&
INVALID_MEMBER(task_struct_timestamp) &&
- INVALID_MEMBER(sched_info_last_arrival)) {
+ INVALID_MEMBER(sched_info_last_arrival) &&
+ INVALID_MEMBER(sched_entity_exec_start)) {
error(INFO,
"last-run timestamps do not exist in this kernel\n");
argerrs++;
@@ -6020,7 +6028,11 @@ task_last_run(ulong task)
timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
OFFSET(task_struct_sched_info) +
OFFSET(sched_info_last_arrival)) : 0;
-
+ else if (VALID_MEMBER(sched_entity_exec_start))
+ timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct +
+ OFFSET(task_struct_se) +
+ OFFSET(sched_entity_exec_start)) : 0;
+
return timestamp;
}
--
2.20.1