June 2025 - Crash-utility - Crash Utility List Archives

[PATCH RFC][makedumpfile 00/10] btf/kallsyms based eppic extension for mm page filtering

by Tao Liu

A) This patchset introduces the following features to makedumpfile:

  1) Enable eppic scripts for memory page filtering.
  2) Enable btf and kallsyms for symbol type and address resolving.
  3) Port maple tree data structures and functions, primarily used for
     vma iteration.

B) The purposes of the features are:

  1) Currently makedumpfile filters mm pages based on page flags, because
     flags can help to determine a page's usage. But this
     page-flag-checking method lacks flexibility in certain cases, e.g.
     if we want to filter the mm pages occupied by the GPU during vmcore
     dumping because:

     a) The GPU may take a large amount of memory and contain sensitive
        data;
     b) GPU mm pages have no relation to the kernel crash and are useless
        for vmcore analysis.

     But there are no GPU-specific mm page flags, and apparently we don't
     need to create one just for kdump use. A programmable filtering tool
     is more suitable for such cases. In addition, different GPU vendors
     may allocate mm pages in different ways, so programmable filtering
     is better than hard-coding this GPU-specific logic into makedumpfile.

  2) makedumpfile already contains a programmable filtering tool, the
     eppic script, which allows users to write customized code for data
     erasing. However, it has the following drawbacks:

     a) It cannot do mm page filtering.
     b) It needs access to the debuginfo of both kernel and modules,
        which is not applicable in the 2nd kernel.
     c) Poor performance, making vmcore dumping time unacceptable (see
        the performance testing below).

     makedumpfile needs to resolve the dwarf data from debuginfo to get
     symbol types and addresses. In recent kernels there are dwarf
     alternatives such as btf/kallsyms which can be used for this
     purpose. The btf/kallsyms info is already packed within the vmcore,
     so we can use it directly.

  3) Maple tree data structures are used in recent kernels, e.g. for vma
     iteration, so maple tree porting is needed.

With these, this patchset introduces an upgraded eppic, which is based on
btf/kallsyms symbol resolving and is programmable for mm page filtering.
The following shows its usage and performance. Please note the tests were
performed in the 1st kernel:

$ time ./makedumpfile -d 31 -l /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore /tmp/dwarf.out -x /lib/debug/lib/modules/6.11.8-300.fc41.x86_64/vmlinux --eppic eppic_scripts/filter_amdgpu_mm_pages.c
real    14m6.894s
user    4m16.900s
sys     9m44.695s

$ time ./makedumpfile -d 31 -l /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore /tmp/btf.out --eppic eppic_scripts/filter_amdgpu_mm_pages.c
real    0m10.672s
user    0m9.270s
sys     0m1.130s

-rw------- 1 root root 367475074 Jun 10 18:06 btf.out
-rw------- 1 root root 367475074 Jun 10 21:05 dwarf.out
-rw-rw-rw- 1 root root 387181418 Jun 10 18:03 /var/crash/127.0.0.1-2025-06-10-18:03:12/vmcore

C) Discussion:

  1) GPU types: Currently only tested with amdgpu's mm page filtering;
     others are not tested.
  2) Code structure: There is some similar code shared by makedumpfile
     and crash, such as the maple tree data structures. I also plan to
     port the btf/kallsyms code to crash, so there will be code
     duplication between crash and makedumpfile. Since I haven't worked
     on the crash porting yet, code changes to btf/kallsyms are expected.
     How can we share the code: create a common library, or keep the
     duplication as it is?
  3) OS: The code can work on rhel-10+/rhel9.5+ on
     x86_64/arm64/s390/ppc64. Others are not tested.

D) Testing:

  1) If you don't want to create your own vmcore, you can find a vmcore
     which I created with amdgpu mm pages unfiltered [1]. The amdgpu mm
     pages are allocated by program [2]. You can use the vmcore in the
     1st kernel to filter the amdgpu mm pages with the performance
     testing command line above.

     To verify the pages are filtered in crash:

     Unfiltered:
     crash> search -c "!QAZXSW@#EDC"
     ffff96b7fa800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
     ffff96b87c800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
     crash> rd ffff96b7fa800000
     ffff96b7fa800000:  405753585a415121                    !QAZXSW@
     crash> rd ffff96b87c800000
     ffff96b87c800000:  405753585a415121                    !QAZXSW@

     Filtered:
     crash> search -c "!QAZXSW@#EDC"
     crash> rd ffff96b7fa800000
     rd: page excluded: kernel virtual address: ffff96b7fa800000  type: "64-bit KVADDR"
     crash> rd ffff96b87c800000
     rd: page excluded: kernel virtual address: ffff96b87c800000  type: "64-bit KVADDR"

  2) If no amdgpu vmcore/machine is available, you can run
     eppic_scripts/print_all_vma.c against an ordinary vmcore to test
     only the btf/kallsyms functions by outputting all VMAs.

[1]: https://people.redhat.com/~ltao/core/
[2]: https://gist.github.com/liutgnu/a8cbce1c666452f1530e1410d1f352df

Tao Liu (10):
  dwarf_info: Support kernel address randomization
  dwarf_info: Fix a infinite recursion bug for search_domain
  Add page filtering function
  Add btf/kallsyms support for symbol type/address resolving
  Export necessary btf/kallsyms functions to eppic extension
  Port the maple tree data structures and functions
  Supporting main() as the entry of eppic script
  Enable page filtering for dwarf eppic
  Enable page filtering for btf/kallsyms eppic
  Introducing 2 eppic scripts to test the dwarf/btf eppic extension

 Makefile                               |   6 +-
 btf.c                                  | 919 +++++++++++++++++++++++++
 btf.h                                  | 176 +++++
 dwarf_info.c                           |  15 +-
 eppic_maple.c                          | 431 ++++++++++++
 eppic_maple.h                          |   8 +
 eppic_scripts/filter_amdgpu_mm_pages.c |  36 +
 eppic_scripts/print_all_vma.c          |  29 +
 erase_info.c                           | 123 +++-
 erase_info.h                           |  22 +
 extension_btf.c                        | 218 ++++++
 extension_eppic.c                      |  41 +-
 extension_eppic.h                      |   6 +-
 kallsyms.c                             | 371 ++++++++++
 kallsyms.h                             |  42 ++
 makedumpfile.c                         |  21 +-
 makedumpfile.h                         |  11 +
 17 files changed, 2448 insertions(+), 27 deletions(-)
 create mode 100644 btf.c
 create mode 100644 btf.h
 create mode 100644 eppic_maple.c
 create mode 100644 eppic_maple.h
 create mode 100644 eppic_scripts/filter_amdgpu_mm_pages.c
 create mode 100644 eppic_scripts/print_all_vma.c
 create mode 100644 extension_btf.c
 create mode 100644 kallsyms.c
 create mode 100644 kallsyms.h

-- 
2.47.0

2 days


[PATCH] vmcoreinfo: read vmcoreinfo using 'vmcoreinfo_data' when unavailable in elf note

by Aditya Gupta

A few vmcores don't have a vmcoreinfo elf note, such as those created
using virsh-dump.

On architectures such as PowerPC64, vmcoreinfo is mandatory to fetch the
first_vmalloc_address for vmcores of upstream linux, since crash-utility
commit:

    commit 5b24e363a898 ("get vmalloc start address from vmcoreinfo")

Try reading from the 'vmcoreinfo_data' symbol instead, if the vmcoreinfo
that crash tries to read in the diskdump/netdump case is empty/missing.

The approach of reading 'vmcoreinfo_data' was already used for a live
kernel, and can be reused when the vmcoreinfo note is missing, as the
'vmcoreinfo_data' symbol is available in the vmcore too.

Note, though, that until the GDB interface is initialised, reading from
the vmcoreinfo_data symbol is not done, so the behaviour is the same as
previously with no vmcoreinfo (only until the GDB interface is
initialised).

Hence rename 'vmcoreinfo_read_string' in kernel.c to
'vmcoreinfo_read_from_memory', and use it in netdump.c and diskdump.c too.

Reported-by: Anushree Mathur <anushree.mathur(a)linux.ibm.com>
Reported-by: Kowshik Jois <kowsjois(a)linux.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur(a)linux.ibm.com>
Tested-by: Kowshik Jois <kowsjois(a)linux.ibm.com>
Signed-off-by: Aditya Gupta <adityag(a)linux.ibm.com>
---
 defs.h     |  1 +
 diskdump.c | 18 ++++++++++++++++++
 kernel.c   | 17 ++++++++++++-----
 netdump.c  | 19 +++++++++++++++++++
 4 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/defs.h b/defs.h
index 2fdb4db56a05..fbd09e19103f 100644
--- a/defs.h
+++ b/defs.h
@@ -6213,6 +6213,7 @@ void dump_kernel_table(int);
 void dump_bt_info(struct bt_info *, char *where);
 void dump_log(int);
 void parse_kernel_version(char *);
+char *vmcoreinfo_read_from_memory(const char *);
 
 #define LOG_LEVEL(v) ((v) & 0x07)
 #define SHOW_LOG_LEVEL (0x1)
diff --git a/diskdump.c b/diskdump.c
index ce3cbb7b12dd..3be56248c7a9 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -1041,6 +1041,13 @@ pfn_to_pos(ulong pfn)
 	return desc_pos;
 }
 
+/**
+ * Check if vmcoreinfo in vmcore is missing/empty
+ */
+static bool is_vmcoreinfo_empty(void)
+{
+	return (dd->sub_header_kdump->size_vmcoreinfo == 0);
+}
 
 /*
  * Determine whether a file is a diskdump creation, and if TRUE,
@@ -1088,6 +1095,17 @@ is_diskdump(char *file)
 
 	pc->read_vmcoreinfo = vmcoreinfo_read_string;
 
+	/*
+	 * vmcoreinfo can be empty in case of dump collected via virsh-dump
+	 *
+	 * check if vmcoreinfo is not available in vmcore, and try to read
+	 * the vmcoreinfo from memory, using "vmcoreinfo_data" symbol
+	 */
+	if (is_vmcoreinfo_empty()) {
+		error(WARNING, "vmcoreinfo is empty, will read from symbols\n");
+		pc->read_vmcoreinfo = vmcoreinfo_read_from_memory;
+	}
+
 	if ((pc->flags2 & GET_LOG) && KDUMP_CMPRS_VALID()) {
 		pc->dfd = dd->dfd;
 		pc->readmem = read_diskdump;
diff --git a/kernel.c b/kernel.c
index b8d3b7999974..b296487ea036 100644
--- a/kernel.c
+++ b/kernel.c
@@ -99,7 +99,6 @@ static ulong dump_audit_skb_queue(ulong);
 static ulong __dump_audit(char *);
 static void dump_audit(void);
 static void dump_printk_safe_seq_buf(int);
-static char *vmcoreinfo_read_string(const char *);
 static void check_vmcoreinfo(void);
 static int is_pvops_xen(void);
 static int get_linux_banner_from_vmlinux(char *, size_t);
@@ -11892,8 +11891,8 @@ dump_printk_safe_seq_buf(int msg_flags)
  * Returns a string (that has to be freed by the caller) that contains the
  * value for key or NULL if the key has not been found.
  */
-static char *
-vmcoreinfo_read_string(const char *key)
+char *
+vmcoreinfo_read_from_memory(const char *key)
 {
 	char *buf, *value_string, *p1, *p2;
 	size_t value_length;
@@ -11903,6 +11902,14 @@ vmcoreinfo_read_string(const char *key)
 
 	buf = value_string = NULL;
 
+	if (!(pc->flags & GDB_INIT)) {
+		/*
+		 * GDB interface hasn't been initialised yet, so can't
+		 * access vmcoreinfo_data
+		 */
+		return NULL;
+	}
+
 	switch (get_symbol_type("vmcoreinfo_data", NULL, NULL))
 	{
 	case TYPE_CODE_PTR:
@@ -11958,10 +11965,10 @@ check_vmcoreinfo(void)
 		switch (get_symbol_type("vmcoreinfo_data", NULL, NULL))
 		{
 		case TYPE_CODE_PTR:
-			pc->read_vmcoreinfo = vmcoreinfo_read_string;
+			pc->read_vmcoreinfo = vmcoreinfo_read_from_memory;
 			break;
 		case TYPE_CODE_ARRAY:
-			pc->read_vmcoreinfo = vmcoreinfo_read_string;
+			pc->read_vmcoreinfo = vmcoreinfo_read_from_memory;
 			break;
 		}
 	}
diff --git a/netdump.c b/netdump.c
index c7ff009e7f90..c9f0e4eaa580 100644
--- a/netdump.c
+++ b/netdump.c
@@ -111,6 +111,14 @@ map_cpus_to_prstatus(void)
 	FREEBUF(nt_ptr);
 }
 
+/**
+ * Check if vmcoreinfo in vmcore is missing/empty
+ */
+static bool is_vmcoreinfo_empty(void)
+{
+	return (nd->size_vmcoreinfo == 0);
+}
+
 /*
  * Determine whether a file is a netdump/diskdump/kdump creation,
  * and if TRUE, initialize the vmcore_data structure.
@@ -464,6 +472,17 @@ is_netdump(char *file, ulong source_query)
 
 	pc->read_vmcoreinfo = vmcoreinfo_read_string;
 
+	/*
+	 * vmcoreinfo can be empty in case of dump collected via virsh-dump
+	 *
+	 * check if vmcoreinfo is not available in vmcore, and try to read
+	 * the vmcoreinfo from memory, using "vmcoreinfo_data" symbol
+	 */
+	if (is_vmcoreinfo_empty()) {
+		error(WARNING, "vmcoreinfo is empty, will read from symbols\n");
+		pc->read_vmcoreinfo = vmcoreinfo_read_from_memory;
+	}
+
 	if ((source_query == KDUMP_LOCAL) &&
 	    (pc->flags2 & GET_OSRELEASE))
 		kdump_get_osrelease();
-- 
2.49.0
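The patch above relies on looking up a key in the vmcoreinfo text once it has been read from memory. As a rough illustration only, the sketch below shows one way a "KEY=value" lookup over a newline-separated blob could work. The function name vmcoreinfo_lookup() and the sample blob are hypothetical and are not taken from crash; the real vmcoreinfo_read_from_memory() also has to read the buffer through the vmcoreinfo_data symbol, which is not modelled here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not crash's implementation: look up "KEY=value"
 * in a newline-separated vmcoreinfo-style text blob. Returns a malloc'ed
 * copy of the value (the caller frees it) or NULL when the key is absent.
 */
char *vmcoreinfo_lookup(const char *blob, const char *key)
{
	size_t klen;
	const char *p = blob;

	assert(blob && key);
	klen = strlen(key);

	while (p && *p) {
		const char *eol = strchr(p, '\n');
		size_t llen = eol ? (size_t)(eol - p) : strlen(p);

		/* The key must start the line and be followed directly by '='. */
		if (llen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
			size_t vlen = llen - klen - 1;
			char *val = malloc(vlen + 1);

			if (!val)
				return NULL;
			memcpy(val, p + klen + 1, vlen);
			val[vlen] = '\0';
			return val;
		}
		p = eol ? eol + 1 : NULL;
	}
	return NULL;
}
```

Matching only at the start of a line avoids returning the value of a longer key that merely ends with the requested name.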

1 week, 5 days


[Crash-utility][PATCH] Fix crash initialization failure on LoongArch with recent GDB versions

by Ming Wang

The crash tool failed to initialize on LoongArch64 when using GDB 16.2
(and likely other recent GDB versions that have enhanced LoongArch
support) due to the error:

  "fatal error: buffer size is not enough to fit register value"

This occurs in supply_registers() because GDB now correctly reports the
size of LoongArch LASX (256-bit) vector registers (xr0-xr31) as 32 bytes.
The `regval` buffer in `crash_target.c` was previously fixed at 16 bytes.

This patch increases the `regval` buffer size to 32 bytes to accommodate
the largest LoongArch registers reported by GDB. This allows crash to
initialize successfully.

Signed-off-by: Ming Wang <wangming01(a)loongson.cn>
---
 crash_target.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crash_target.c b/crash_target.c
index 5966b7b..d93d58c 100644
--- a/crash_target.c
+++ b/crash_target.c
@@ -71,7 +71,7 @@ public:
 
 static void supply_registers(struct regcache *regcache, int regno)
 {
-  gdb_byte regval[16];
+  gdb_byte regval[32];
   struct gdbarch *arch = regcache->arch ();
   const char *regname = gdbarch_register_name(arch, regno);
   int regsize = register_size(arch, regno);
-- 
2.41.3
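The failure described above is a fixed-size buffer being smaller than the register size reported at run time. A minimal sketch of that kind of size check follows; copy_regval() and REGVAL_MAX are hypothetical names used for illustration and do not appear in crash_target.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit: 32 bytes holds a 256-bit vector register such as LASX xr0-xr31. */
#define REGVAL_MAX 32

/*
 * Hypothetical stand-in for the size check: copy a register value into a
 * fixed buffer only if it fits. Returns 0 on success, -1 if it is too large.
 */
int copy_regval(unsigned char dst[REGVAL_MAX], const void *src, size_t regsize)
{
	assert(dst && src);

	if (regsize > REGVAL_MAX)
		return -1;	/* the caller would report "buffer size is not enough" */
	memcpy(dst, src, regsize);
	return 0;
}
```

With a 16-byte limit the same check would reject a 32-byte register, which matches the error quoted in the commit message.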

2 weeks


[PATCH] Add blk_mq shared tags support for dev -d/-D

by Tao Liu

When blk_mq shared tags are enabled for devices like scsi, the IO status
is incorrect, e.g.:

    crash> dev -d
    MAJOR GENDISK            NAME       REQUEST_QUEUE      TOTAL ASYNC  SYNC
        8 ffff90528df86000   sda        ffff9052a3d61800     144   144     0
        8 ffff905280718c00   sdb        ffff9052a3d63c00      48    48     0

    crash> epython rqlist
    ffff90528e94a5c0 sda      is unknown, deadline:  89.992 (90) rq_alloc:   0.196
    ffff90528e92f700 sda      is unknown, deadline:  89.998 (90) rq_alloc:   0.202
    ffff90528e95ccc0 sda      is unknown, deadline:  89.999 (90) rq_alloc:   0.203
    ffff90528e968bc0 sdb      is unknown, deadline:  89.997 (90) rq_alloc:   0.201

The root cause is: in the shared tags case, only the shared tags should
be counted. Without this patch, the tags of all the hw_ctx are counted,
which is incorrect.

After applying the patch:

    crash> dev -d
    MAJOR GENDISK            NAME       REQUEST_QUEUE      TOTAL  READ WRITE
        8 ffff90528df86000   sda        ffff9052a3d61800       3     3     0
        8 ffff905280718c00   sdb        ffff9052a3d63c00       1     1     0

This patch makes the following modifications:

1) blk_mq shared tag support.
2) Function renaming: queue_for_each_hw_ctx -> blk_mq_queue_tag_busy_iter,
   because the latter is closer to the corresponding kernel function.
3) Extract a new queue_for_each_hw_ctx() function to be called for both
   the shared-tags case and the hw_ctx case.

Note:
The patch is safe for earlier kernels which have no blk_mq shared tags
implemented, because the blk_mq_is_shared_tags() check will exit safely.

Signed-off-by: Tao Liu <ltao(a)redhat.com>
---

Please discard the previous patch "Filter repeated rq for cmd dev -d/-D",
because filtering is an incorrect fix.

---
 defs.h    |  3 ++
 dev.c     | 96 ++++++++++++++++++++++++++++++++++++++-----------------
 symbols.c |  6 ++++
 3 files changed, 76 insertions(+), 29 deletions(-)

diff --git a/defs.h b/defs.h
index bbd6d4b..4fecb83 100644
--- a/defs.h
+++ b/defs.h
@@ -2271,6 +2271,9 @@ struct offset_table { /* stash of commonly-used offsets */
 	long task_struct_thread_context_x28;
 	long neigh_table_hash_heads;
 	long neighbour_hash;
+	long request_queue_tag_set;
+	long blk_mq_tag_set_flags;
+	long blk_mq_tag_set_shared_tags;
 };
 
 struct size_table { /* stash of commonly-used sizes */
diff --git a/dev.c b/dev.c
index 9d38aef..0a4d5c9 100644
--- a/dev.c
+++ b/dev.c
@@ -4326,6 +4326,12 @@ struct bt_iter_data {
 #define MQ_RQ_IN_FLIGHT 1
 #define REQ_OP_BITS 8
 #define REQ_OP_MASK ((1 << REQ_OP_BITS) - 1)
+#define BLK_MQ_F_TAG_HCTX_SHARED (1 << 3)
+
+static bool blk_mq_is_shared_tags(unsigned int flags)
+{
+	return flags & BLK_MQ_F_TAG_HCTX_SHARED;
+}
 
 static uint op_is_write(uint op)
 {
@@ -4403,43 +4409,72 @@ static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved, uint nr_r
 	sbitmap_for_each_set(&sc, bt_iter, &iter_data);
 }
 
-static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio *dio)
+static bool queue_for_each_hw_ctx(ulong q, ulong blk_mq_tags_ptr,
+				bool bitmap_tags_is_ptr, struct diskio *dio)
 {
-	uint i;
+	uint i, nr_reserved_tags = 0;
+	ulong tags = 0, addr = 0;
+	bool ret = FALSE;
+
+	if (!readmem(blk_mq_tags_ptr, KVADDR, &tags, sizeof(ulong),
+			"blk_mq_hw_ctx.tags", RETURN_ON_ERROR))
+		goto out;
+
+	addr = tags + OFFSET(blk_mq_tags_nr_reserved_tags);
+	if (!readmem(addr, KVADDR, &nr_reserved_tags, sizeof(uint),
+			"blk_mq_tags_nr_reserved_tags", RETURN_ON_ERROR))
+		goto out;
+
+	if (nr_reserved_tags) {
+		addr = tags + OFFSET(blk_mq_tags_breserved_tags);
+		if (bitmap_tags_is_ptr &&
+		    !readmem(addr, KVADDR, &addr, sizeof(ulong),
+				"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
+			goto out;
+		bt_for_each(q, tags, addr, 1, nr_reserved_tags, dio);
+	}
+	addr = tags + OFFSET(blk_mq_tags_bitmap_tags);
+	if (bitmap_tags_is_ptr &&
+	    !readmem(addr, KVADDR, &addr, sizeof(ulong),
+			"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
+		goto out;
+	bt_for_each(q, tags, addr, 0, nr_reserved_tags, dio);
+
+	ret = TRUE;
+out:
+	return ret;
+}
+
+/*
+ * Replica of kernel block/blk-mq-tag.c:blk_mq_queue_tag_busy_iter()
+*/
+static void blk_mq_queue_tag_busy_iter(ulong q, ulong *hctx, uint cnt,
+				struct diskio *dio)
+{
+	uint i, flags;
 	int bitmap_tags_is_ptr = 0;
+	ulong addr = 0;
 
 	if (MEMBER_TYPE("blk_mq_tags", "bitmap_tags") == TYPE_CODE_PTR)
 		bitmap_tags_is_ptr = 1;
 
-	for (i = 0; i < cnt; i++) {
-		ulong addr = 0, tags = 0;
-		uint nr_reserved_tags = 0;
+	readmem(q + OFFSET(request_queue_tag_set), KVADDR, &addr,
+		sizeof(ulong), "request_queue.tag_set", RETURN_ON_ERROR);
 
-		/* Tags owned by the block driver */
-		addr = hctx[i] + OFFSET(blk_mq_hw_ctx_tags);
-		if (!readmem(addr, KVADDR, &tags, sizeof(ulong),
-				"blk_mq_hw_ctx.tags", RETURN_ON_ERROR))
-			break;
+	readmem(addr + OFFSET(blk_mq_tag_set_flags), KVADDR,
+		&flags, sizeof(uint), "blk_mq_tag_set.flags", RETURN_ON_ERROR);
 
-		addr = tags + OFFSET(blk_mq_tags_nr_reserved_tags);
-		if (!readmem(addr, KVADDR, &nr_reserved_tags, sizeof(uint),
-				"blk_mq_tags_nr_reserved_tags", RETURN_ON_ERROR))
-			break;
+	if (blk_mq_is_shared_tags(flags)) {
+		addr = addr + OFFSET(blk_mq_tag_set_shared_tags);
+		queue_for_each_hw_ctx(q, addr, bitmap_tags_is_ptr, dio);
+		return;
+	}
 
-		if (nr_reserved_tags) {
-			addr = tags + OFFSET(blk_mq_tags_breserved_tags);
-			if (bitmap_tags_is_ptr &&
-			    !readmem(addr, KVADDR, &addr, sizeof(ulong),
-					"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
-				break;
-			bt_for_each(q, tags, addr, 1, nr_reserved_tags, dio);
-		}
-		addr = tags + OFFSET(blk_mq_tags_bitmap_tags);
-		if (bitmap_tags_is_ptr &&
-		    !readmem(addr, KVADDR, &addr, sizeof(ulong),
-				"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
-			break;
-		bt_for_each(q, tags, addr, 0, nr_reserved_tags, dio);
+	for (i = 0; i < cnt; i++) {
+		/* Tags owned by the block driver */
+		addr = hctx[i] + OFFSET(blk_mq_hw_ctx_tags);
+		if (queue_for_each_hw_ctx(q, addr, bitmap_tags_is_ptr, dio) == FALSE)
+			return;
 	}
 }
 
@@ -4489,7 +4524,7 @@ static void get_mq_diskio_from_hw_queues(ulong q, struct diskio *dio)
 		return;
 	}
 
-	queue_for_each_hw_ctx(q, hctx_array, cnt, dio);
+	blk_mq_queue_tag_busy_iter(q, hctx_array, cnt, dio);
 
 	FREEBUF(hctx_array);
 }
@@ -4914,6 +4949,9 @@ void diskio_init(void)
 	MEMBER_SIZE_INIT(class_private_devices, "class_private",
 		"class_devices");
 	MEMBER_OFFSET_INIT(disk_stats_in_flight, "disk_stats", "in_flight");
+	MEMBER_OFFSET_INIT(request_queue_tag_set, "request_queue", "tag_set");
+	MEMBER_OFFSET_INIT(blk_mq_tag_set_flags, "blk_mq_tag_set", "flags");
+	MEMBER_OFFSET_INIT(blk_mq_tag_set_shared_tags, "blk_mq_tag_set", "shared_tags");
 
 	dt->flags |= DISKIO_INIT;
 }
diff --git a/symbols.c b/symbols.c
index e30fafe..794519a 100644
--- a/symbols.c
+++ b/symbols.c
@@ -11487,6 +11487,12 @@ dump_offset_table(char *spec, ulong makestruct)
 		OFFSET(blk_mq_tags_nr_reserved_tags));
 	fprintf(fp, " blk_mq_tags_rqs: %ld\n",
 		OFFSET(blk_mq_tags_rqs));
+	fprintf(fp, " request_queue_tag_set: %ld\n",
+		OFFSET(request_queue_tag_set));
+	fprintf(fp, " blk_mq_tag_set_flags: %ld\n",
+		OFFSET(blk_mq_tag_set_flags));
+	fprintf(fp, " blk_mq_tag_set_shared_tags: %ld\n",
+		OFFSET(blk_mq_tag_set_shared_tags));
 	fprintf(fp, " subsys_private_subsys: %ld\n",
 		OFFSET(subsys_private_subsys));
 	fprintf(fp, " subsys_private_klist_devices: %ld\n",
-- 
2.47.0
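To make the over-counting concrete, here is a simplified, self-contained model of the logic in the patch above. It is only a sketch: struct tags_model, struct queue_model and count_busy() are hypothetical and replace the real readmem()/sbitmap walking with a plain busy counter. The value 48 for the number of hardware contexts used in the usage note is an assumption chosen so the totals match the report (144 = 48 x 3); the message itself does not state it.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the BLK_MQ_F_TAG_HCTX_SHARED value defined in the patch. */
#define TAG_HCTX_SHARED (1u << 3)

/* Hypothetical model: a tags object only records how many tags are busy. */
struct tags_model {
	unsigned busy;
};

struct queue_model {
	unsigned flags;				/* tag_set flags */
	const struct tags_model *shared_tags;	/* valid when the shared flag is set */
	const struct tags_model **hctx_tags;	/* one tags pointer per hw context */
	size_t nr_hctx;
};

/* Count busy tags the way the patched code does: shared set once, else per hctx. */
unsigned count_busy(const struct queue_model *q)
{
	unsigned total = 0;
	size_t i;

	assert(q);

	if (q->flags & TAG_HCTX_SHARED)
		return q->shared_tags->busy;	/* walk the shared tags exactly once */

	for (i = 0; i < q->nr_hctx; i++)
		total += q->hctx_tags[i]->busy;
	return total;
}
```

If 48 hardware contexts all point at one shared tags object with 3 busy requests, ignoring the flag yields 144 while honouring it yields 3, which is consistent with the before/after output for sda.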

2 weeks


[PATCH] Fix the issue of "page excluded" messages flooding

by Lianbo Jiang

The current issue is only observed on PPC64le machines when loading
crash, e.g.:

  ...
  crash: page excluded: kernel virtual address: c0000000022d6098 type: "gdb_readmem_callback"
  crash: page excluded: kernel virtual address: c0000000022d6098 type: "gdb_readmem_callback"
  ...
  crash>

This issue cannot be reproduced on crash 8; it only occurs after the
gdb-16.2 upgrade (see commit dfb2bb55e530). So far I haven't found out
why it reads the same address (an excluded page) many times. In any case,
the crash tool should first avoid flooding messages, so let's use the
same debug level (8) as read_diskdump() does (see diskdump.c).

Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index 0d8d89862383..58624bb5f44c 100644
--- a/memory.c
+++ b/memory.c
@@ -2504,7 +2504,7 @@ readmem(ulonglong addr, int memtype, void *buffer, long size,
 
 		case PAGE_EXCLUDED:
 			RETURN_ON_PARTIAL_READ();
-			if (PRINT_ERROR_MESSAGE)
+			if (CRASHDEBUG(8))
 				error(INFO, PAGE_EXCLUDED_ERRMSG, memtype_string(memtype, 0), addr, type);
 			goto readmem_error;
 
-- 
2.47.1
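The change above moves a message behind a debug-level check so it no longer floods normal sessions. The following sketch shows the general pattern under the assumption that the gate is a simple "current level >= threshold" comparison; debug_level, set_debug_level() and report_page_excluded() are hypothetical names, not crash's API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the tool's debug setting. */
static int debug_level;

void set_debug_level(int level)
{
	debug_level = level;
}

/*
 * Print the "page excluded" notice only when the current debug level is
 * at least `threshold`. Returns 1 if a message was printed, 0 otherwise.
 */
int report_page_excluded(FILE *out, unsigned long long addr, int threshold)
{
	assert(out);

	if (debug_level < threshold)
		return 0;	/* stay quiet in normal sessions */
	fprintf(out, "page excluded: kernel virtual address: %llx\n", addr);
	return 1;
}
```

Repeated reads of the same excluded page then cost nothing at the default level, while the message remains available when debugging.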

1 month, 1 week


[PATCH] Fix "kmem -p" option on Linux 6.16-rc1 and later kernels

by HAGIO KAZUHITO(萩尾　一仁)

Kernel commit acc53a0b4c156 ("mm: rename page->index to
page->__folio_index"), which is contained in Linux 6.16-rc1 and later
kernels, renamed the member.

Without the patch, the "kmem -p" option fails with the following error:

  kmem: invalid structure member offset: page_index
        FILE: memory.c  LINE: 6016  FUNCTION: dump_mem_map_SPARSEMEM()

Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
 memory.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/memory.c b/memory.c
index 0d8d89862383..5cb8b58e2181 100644
--- a/memory.c
+++ b/memory.c
@@ -531,6 +531,8 @@ vm_init(void)
 		ASSIGN_OFFSET(page_mapping) = MEMBER_OFFSET("page", "_mapcount") +
 			STRUCT_SIZE("atomic_t") + sizeof(ulong);
 	MEMBER_OFFSET_INIT(page_index, "page", "index");
+	if (INVALID_MEMBER(page_index))  /* 6.16 and later */
+		MEMBER_OFFSET_INIT(page_index, "page", "__folio_index");
 	if (INVALID_MEMBER(page_index))
 		ANON_MEMBER_OFFSET_INIT(page_index, "page", "index");
 	MEMBER_OFFSET_INIT(page_buffers, "page", "buffers");
-- 
2.31.1
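The fix above is an instance of "try the old member name, then fall back to the new one". A small sketch of that lookup pattern follows. The member table, the offsets in the usage note and the function first_member_offset() are hypothetical and stand in for the debuginfo queries that crash performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one structure member as known from debuginfo. */
struct member_info {
	const char *name;
	long offset;
};

/*
 * Return the offset of the first candidate name that exists in `members`,
 * or -1 if none of the candidates is present.
 */
long first_member_offset(const struct member_info *members, size_t n,
			 const char *const *candidates, size_t nc)
{
	size_t c, m;

	assert(members && candidates);

	for (c = 0; c < nc; c++)
		for (m = 0; m < n; m++)
			if (strcmp(members[m].name, candidates[c]) == 0)
				return members[m].offset;
	return -1;
}
```

Listing "index" before "__folio_index" keeps older kernels on their original path and only consults the new name when the old one is missing.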

1 month, 1 week


Re: [PATCH v4 0/5] gdb multi-stack unwinding support

by lijiang

On Wed, Jun 25, 2025 at 12:04 PM <devel-request(a)lists.crash-utility.osci.io> wrote:

> Date: Wed, 25 Jun 2025 16:01:58 +1200
> From: Tao Liu <ltao(a)redhat.com>
> Subject: [Crash-utility] [PATCH v4 0/5] gdb multi-stack unwinding
>         support
> To: devel(a)lists.crash-utility.osci.io
> Cc: Tao Liu <ltao(a)redhat.com>
> Message-ID: <20250625040203.60334-1-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> This patchset is based on Alexy's work [1], and is the follow-up of the
> previous "gdb stack unwinding support for crash utility" patchset.
>
> Currently gdb target analyzes only one task at a time and it backtraces
> only straight stack until end of the stack. If stacks were concatenated
> during exceptions or interrupts, gdb bt will show only the topmost one.
>
> This patchset will introduce multiple stacks support for gdb stack
> unwinding,
> which can be observed as a different threads from gdb perspective. A
> short usage is as follows:
>
> 'set <PID>' - to switch to a specific task
> 'gdb info threads' - to see list of in-kernel stacks of this task.
> 'gdb thread <ID>' - to switch to the stack.
> 'gdb bt' - to unwind it.
>
> E.g, with the patchset:
>
> crash> bt
> PID: 17636 TASK: ffff88032e0742c0 CPU: 11 COMMAND: "kworker/11:4"
>  #0 [ffff88037fca6b58] machine_kexec at ffffffff8103cef2
>  #1 [ffff88037fca6ba8] crash_kexec at ffffffff810c9aa3
>  #2 [ffff88037fca6c70] panic at ffffffff815f0444
> ...
>  #9 [ffff88037fca6ec8] do_nmi at ffffffff815fd980
> #10 [ffff88037fca6ef0] end_repeat_nmi at ffffffff815fcec1
>     [exception RIP: memcpy+13]
>     RIP: ffffffff812f5b1d RSP: ffff88034f2a9728 RFLAGS: 00010046
>     RAX: ffffc900139fe000 RBX: ffff880374b7a1b0 RCX: 0000000000000030
>     RBP: ffff88034f2a9778 R8: 000000007fffffff R9: 00000000ffffffff
> ...
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> --- <NMI exception stack> ---
> #11 [ffff88034f2a9728] memcpy at ffffffff812f5b1d
> #12 [ffff88034f2a9728] mga_dirty_update at ffffffffa024ad2b [mgag200]
> #13 [ffff88034f2a9780] mga_imageblit at ffffffffa024ae3f [mgag200]
> #14 [ffff88034f2a97a0] bit_putcs at ffffffff813424ef
> ...
>
> crash> info threads
>   Id Target Id Frame
> * 1 17636 kworker/11:4 (stack 0) crash_setup_regs (oldregs=0x0,
> newregs=0xffff88037fca6bb0)
>   2 17636 kworker/11:4 (stack 1) 0xffffffff812f5b1d in memcpy ()
>
> crash> thread 2
> crash> gdb bt
> #0 0xffffffff812f5b1d in memcpy () at arch/x86/lib/memcpy_64.S:69
> ...
>
> There are 2 stacks of the current task, and we can list/switch-to/unwind
> each stack.
>
> [1]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01204.html
>
> v2 -> v1: 1) Rebase this patchset onto gdb-16.2 [2].
>           2) Improved the silent_call_bt() to catch the error FATAL.
>
> [2]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01354.html
>
> v3 -> v2: 1) Rebase this patchset to crash v9.0.0.
>           2) Fix v2's segfault in cmd "bt -E".
>           3) Elimit repeat stacks by adding constraints before
>              gdb_add_substack().
>
> v4 -> v3: 1) Fix compiling warning of silent_call_bt()
>           2) Add known issue link found in ppc arch.
>

Thank you for the update, Tao.

For the v4: Ack

Thanks
Lianbo

> Tao Liu (5):
>   Add multi-threads support in crash target
>   Call cmd_bt silently after "set pid"
>   x86_64: Add gdb multi-stack unwind support
>   arm64: Add gdb multi-stack unwind support
>   ppc64: Add gdb multi-stack unwind support
>
>  arm64.c         | 102 +++++++++++++++++++++++++++++++++--
>  crash_target.c  |  49 +++++++++++++++--
>  defs.h          |   3 +-
>  gdb_interface.c |   6 +--
>  kernel.c        |  44 +++++++++++++++
>  ppc64.c         |  78 ++++++++++++++++++++++++---
>  task.c          |   4 +-
>  x86_64.c        | 138 +++++++++++++++++++++++++++++++++++++++++++++---
>  8 files changed, 394 insertions(+), 30 deletions(-)
>
> --
> 2.47.0
>

1 month, 1 week


[PATCH] Filter repeated rq for cmd dev -d/-D

by Tao Liu

CEE reported an issue that "dev -d/-D" reports incorrect read/write
values:

    crash> dev -d
    MAJOR GENDISK            NAME       REQUEST_QUEUE      TOTAL ASYNC  SYNC
        8 ffff90528df86000   sda        ffff9052a3d61800     144   144     0
        8 ffff905280718c00   sdb        ffff9052a3d63c00      48    48     0

    crash> epython rqlist
    ffff90528e94a5c0 sda      is unknown, deadline:  89.992 (90) rq_alloc:   0.196
    ffff90528e92f700 sda      is unknown, deadline:  89.998 (90) rq_alloc:   0.202
    ffff90528e95ccc0 sda      is unknown, deadline:  89.999 (90) rq_alloc:   0.203
    ffff90528e968bc0 sdb      is unknown, deadline:  89.997 (90) rq_alloc:   0.201

The value of 144 ASYNC is incorrect, and epython rqlist only shows 3
items for sda. The reason is that mq_check_inflight() may get the same rq
multiple times during iteration, so they are counted repeatedly. This
patch adds a rq repetition check.

After applying the patch:

    crash> dev -d
    MAJOR GENDISK            NAME       REQUEST_QUEUE      TOTAL  READ WRITE
        8 ffff90528df86000   sda        ffff9052a3d61800       3     3     0
        8 ffff905280718c00   sdb        ffff9052a3d63c00       1     1     0

Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
 dev.c | 43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/dev.c b/dev.c
index 9d38aef..b0434cf 100644
--- a/dev.c
+++ b/dev.c
@@ -4316,6 +4316,9 @@ struct bt_iter_data {
 	ulong tags;
 	uint reserved;
 	uint nr_reserved_tags;
+	ulong **rq_list;
+	int *rq_list_len;
+	int *rq_list_cap;
 	busy_tag_iter_fn *fn;
 	void *data;
 };
@@ -4381,10 +4384,30 @@ static bool bt_iter(uint bitnr, void *data)
 	if (!readmem(addr, KVADDR, &rq, sizeof(ulong), "blk_mq_tags.rqs[]", RETURN_ON_ERROR))
 		return FALSE;
 
+	for (int i = 0; i < *iter_data->rq_list_len; i++) {
+		/* Skip the handled rq */
+		if ((*iter_data->rq_list)[i] == rq)
+			return TRUE;
+	}
+	/* Mark the rq is handled */
+	(*iter_data->rq_list)[(*iter_data->rq_list_len)++] = rq;
+	if (*iter_data->rq_list_len > *iter_data->rq_list_cap / 2) {
+		*iter_data->rq_list_cap <<= 1;
+		ulong *tmp = reallocarray(*iter_data->rq_list,
+				*iter_data->rq_list_cap, sizeof(ulong));
+		if (!tmp) {
+			free(*iter_data->rq_list);
+			error(FATAL, "cannot reallocarray rq_list array");
+		}
+		*iter_data->rq_list = tmp;
+	}
+
 	return iter_data->fn(rq, iter_data->data);
 }
 
-static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved, uint nr_resvd_tags, struct diskio *dio)
+static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved,
+			uint nr_resvd_tags, ulong **rq_list, int *rq_list_len,
+			int *rq_list_cap, struct diskio *dio)
 {
 	struct sbitmap_context sc = {0};
 	struct mq_inflight mi = {
@@ -4395,6 +4418,9 @@ static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved, uint nr_r
 		.tags = tags,
 		.reserved = reserved,
 		.nr_reserved_tags = nr_resvd_tags,
+		.rq_list = rq_list,
+		.rq_list_len = rq_list_len,
+		.rq_list_cap = rq_list_cap,
 		.fn = mq_check_inflight,
 		.data = &mi,
 	};
@@ -4407,10 +4433,18 @@ static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio
 {
 	uint i;
 	int bitmap_tags_is_ptr = 0;
+	ulong *rq_list;
+	int rq_list_len = 0;
+	int rq_list_cap = 1;
 
 	if (MEMBER_TYPE("blk_mq_tags", "bitmap_tags") == TYPE_CODE_PTR)
 		bitmap_tags_is_ptr = 1;
 
+	rq_list = calloc(rq_list_cap, sizeof(ulong));
+	if (!rq_list) {
+		error(FATAL, "cannot malloc rq_list array");
+	}
+
 	for (i = 0; i < cnt; i++) {
 		ulong addr = 0, tags = 0;
 		uint nr_reserved_tags = 0;
@@ -4432,15 +4466,18 @@ static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio
 			    !readmem(addr, KVADDR, &addr, sizeof(ulong),
 					"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
 				break;
-			bt_for_each(q, tags, addr, 1, nr_reserved_tags, dio);
+			bt_for_each(q, tags, addr, 1, nr_reserved_tags, &rq_list,
+					&rq_list_len, &rq_list_cap, dio);
 		}
 		addr = tags + OFFSET(blk_mq_tags_bitmap_tags);
 		if (bitmap_tags_is_ptr &&
 		    !readmem(addr, KVADDR, &addr, sizeof(ulong),
 				"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
 			break;
-		bt_for_each(q, tags, addr, 0, nr_reserved_tags, dio);
+		bt_for_each(q, tags, addr, 1, nr_reserved_tags, &rq_list,
+				&rq_list_len, &rq_list_cap, dio);
 	}
+	free(rq_list);
 }
 
 static void get_mq_diskio_from_hw_queues(ulong q, struct diskio *dio)
-- 
2.47.0
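The repetition check in this patch keeps a growing list of request addresses that were already handled. As a standalone sketch of that idea only (the later blk_mq shared tags patch on this list asks for this filtering patch to be discarded), the code below uses hypothetical names, struct seen_list and seen_add(), and a doubling growth policy chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable list of request addresses that were already counted. */
struct seen_list {
	unsigned long long *items;
	size_t len;
	size_t cap;
};

/*
 * Record `addr` if it has not been seen before.
 * Returns 1 if newly added, 0 if it was a duplicate, -1 on allocation failure.
 */
int seen_add(struct seen_list *s, unsigned long long addr)
{
	size_t i;

	assert(s);

	for (i = 0; i < s->len; i++)
		if (s->items[i] == addr)
			return 0;	/* already handled, skip it */

	if (s->len == s->cap) {
		size_t ncap = s->cap ? s->cap * 2 : 4;
		unsigned long long *tmp = realloc(s->items, ncap * sizeof(*tmp));

		if (!tmp)
			return -1;	/* the old buffer stays valid for the caller */
		s->items = tmp;
		s->cap = ncap;
	}
	s->items[s->len++] = addr;
	return 1;
}
```

A linear scan is acceptable for a handful of in-flight requests; a hash set would be the usual choice if the list could grow large.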

1 month, 1 week


Crash tool failed to parse vmcore from Linux v6.15 on RISCV

by Pnina Feder

Hi, When parsing a vmcore generated by Linux 6.15 on RISC-V, the crash tool fails. We identified the root cause: the tool is unable to read memory addresses that are marked as reserved in the /proc/iomem map. These addresses are missing from the vmcore, yet certain kernel structures (e.g., IRQ pointers) reference them. This issue did not occur with Linux 6.14, where the same addresses were not marked as reserved in /proc/iomem and were correctly included in the vmcore. Did anybody see something like that? Thanks, Pnina

1 month, 1 week


Crash tool fails to boot with vmcore from linux 6.15 on riscv64 (the same works on linux 6.14)

by pnina.feder＠mobileye.com

crash: CONFIG_NR_CPUS: 32
crash: CONFIG_HZ: 1000
crash: # CONFIG_DEBUG_INFO_REDUCED is not set
cpu_possible_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_present_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_online_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_active_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
xtime timespec.tv_sec: 0: Thu Jan 1 02:00:00 IST 1970
utsname:
  sysname: Linux
  nodename: buildroot
  release: 6.15.0
  version: #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
  machine: riscv64
  domainname: (none)
base kernel version: 6.15.0
verify_namelist:
dumpfile /proc/version:
Linux version 6.15.0 (pfeder@epgd034) (riscv64-mti-linux-gnu-gcc (MIPS GNU Tools v1.13 for RISC-V Linux) 11.2.0, GNU ld (MIPS GNU Tools v1.13 for RISC-V Linux) 2.42) #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
/localdrive/users/pfeder/EQ7_22_06/open_src/eyeq7/buildroot/out-eq7-qemu/build/linux-custom/vmlinux:
Linux version 6.15.0 (pfeder@epgd034) (riscv64-mti-linux-gnu-gcc (MIPS GNU Tools v1.13 for RISC-V Linux) 11.2.0, GNU ld (MIPS GNU Tools v1.13 for RISC-V Linux) 2.42) #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
crash: get_cpus_present: present: 24
crash: get_cpus_present: present: 24
hypervisor: (undetermined)
irq_stack_ptr:
  type: 1, TYPE_CODE_PTR
  target_typecode: 17, other
  target_length: 8
  length: 8
IRQ stack pointer[0] is ffffffd6fbdce068
crash: read error: kernel virtual address: ffffffd6fbdce068 type: "IRQ stack pointer"
IRQ stack pointer[1] is ffffffd6fbde3068
crash: read error: kernel virtual address: ffffffd6fbde3068 type: "IRQ stack pointer"
IRQ stack pointer[2] is ffffffd6fbdf8068
crash: read error: kernel virtual address: ffffffd6fbdf8068 type: "IRQ stack pointer"
IRQ stack pointer[3] is ffffffd6fbe0d068
crash: read error: kernel virtual address: ffffffd6fbe0d068 type: "IRQ stack pointer"
IRQ stack pointer[4] is ffffffd6fbe22068
crash: read error: kernel virtual address: ffffffd6fbe22068 type: "IRQ stack pointer"
IRQ stack pointer[5] is ffffffd6fbe37068
crash: read error: kernel virtual address: ffffffd6fbe37068 type: "IRQ stack pointer"
IRQ stack pointer[6] is ffffffd6fbe4c068
crash: read error: kernel virtual address: ffffffd6fbe4c068 type: "IRQ stack pointer"
IRQ stack pointer[7] is ffffffd6fbe61068
crash: read error: kernel virtual address: ffffffd6fbe61068 type: "IRQ stack pointer"
IRQ stack pointer[8] is ffffffd6fbe76068
crash: read error: kernel virtual address: ffffffd6fbe76068 type: "IRQ stack pointer"
IRQ stack pointer[9] is ffffffd6fbe8b068
crash: read error: kernel virtual address: ffffffd6fbe8b068 type: "IRQ stack pointer"
IRQ stack pointer[10] is ffffffd6fbea0068
crash: read error: kernel virtual address: ffffffd6fbea0068 type: "IRQ stack pointer"
IRQ stack pointer[11] is ffffffd6fbeb5068
crash: read error: kernel virtual address: ffffffd6fbeb5068 type: "IRQ stack pointer"
IRQ stack pointer[12] is ffffffd6fbeca068
crash: read error: kernel virtual address: ffffffd6fbeca068 type: "IRQ stack pointer"
IRQ stack pointer[13] is ffffffd6fbedf068
crash: read error: kernel virtual address: ffffffd6fbedf068 type: "IRQ stack pointer"
IRQ stack pointer[14] is ffffffd6fbef4068
crash: read error: kernel virtual address: ffffffd6fbef4068 type: "IRQ stack pointer"
IRQ stack pointer[15] is ffffffd6fbf09068
crash: read error: kernel virtual address: ffffffd6fbf09068 type: "IRQ stack pointer"
IRQ stack pointer[16] is ffffffd6fbf1e068
crash: read error: kernel virtual address: ffffffd6fbf1e068 type: "IRQ stack pointer"
IRQ stack pointer[17] is ffffffd6fbf33068
crash: read error: kernel virtual address: ffffffd6fbf33068 type: "IRQ stack pointer"
IRQ stack pointer[18] is ffffffd6fbf48068
crash: read error: kernel virtual address: ffffffd6fbf48068 type: "IRQ stack pointer"
IRQ stack pointer[19] is ffffffd6fbf5d068
crash: read error: kernel virtual address: ffffffd6fbf5d068 type: "IRQ stack pointer"
IRQ stack pointer[20] is ffffffd6fbf72068
crash: read error: kernel virtual address: ffffffd6fbf72068 type: "IRQ stack pointer"
IRQ stack pointer[21] is ffffffd6fbf87068
crash: read error: kernel virtual address: ffffffd6fbf87068 type: "IRQ stack pointer"
IRQ stack pointer[22] is ffffffd6fbf9c068
crash: read error: kernel virtual address: ffffffd6fbf9c068 type: "IRQ stack pointer"
IRQ stack pointer[23] is ffffffd6fbfb1068
crash: read error: kernel virtual address: ffffffd6fbfb1068 type: "IRQ stack pointer"
overflow_stack:
  type: 2, TYPE_CODE_ARRAY
  target_typecode: 8, TYPE_CODE_INT
  target_length: 8
  length: 4096
kernel NR_CPUS: 32
node_online_map: [1] -> nodes online: 1
node_table[0]:
  id: 0
  pgdat: ffffffff80ebb980
  size: 1048576
  present: 1048576
  mem_map: ffffffd6fbfc9200
  start_paddr: 800000000
  start_mapnr: 8388608
NOTE: page_hash_table does not exist in this kernel
please wait... (gathering kmem slab cache data)
kmem_cache_downsize: 192 to 192
pageflags from pageflag_names:
  00000001 locked
  00000080 waiters
  00000004 referenced
  00000008 uptodate
  00000010 dirty
  00000020 lru
  00000100 active
  00000200 workingset
  00000400 owner_priv_1
  00000800 owner_2
  00001000 arch_1
  00002000 reserved
  00004000 private
  00008000 private_2
  00000002 writeback
  00000040 head
  00010000 reclaim
  00020000 swapbacked
  00040000 unevictable
  00080000 dropbehind
  00100000 mlocked
crash: read error: kernel virtual address: ffffffd6fbddee00 type: "note_buf_t"
WARNING: cannot find NT_PRSTATUS note for cpu: 0
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: struct module_memory detected.
crash: read error: kernel virtual address: ffffffd6fbdd8880 type: "runqueues entry (per_cpu)"


