Crash-utility August 2023

devel@lists.crash-utility.osci.io

10 participants
26 discussions

[RFC PATCH v2 0/4] Improve stack unwind on ppc64

by Aditya Gupta

The Problem:
============

Currently crash is unable to show function arguments and local variables, as
gdb can do. Moving between frames ('up'/'down') is also not working in crash.

Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
'bt', 'frame', 'info locals', 'up', 'down' are not working either, because gdb
does not get the register values from `crash_target::fetch_registers`, which
uses `machdep->get_cpu_reg`, which is not implemented for PPC64.

Proposed Solution:
==================

Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
This way, "gdb mode in crash" will support this feature for both ELF and
kdump-compressed vmcore formats, while "gdb" would only have supported the ELF
format.

Implications on Architectures:
====================================

No architecture other than PPC64 has been affected, other than in the case of
the 'frame' command. As mentioned in patch #2, since 'frame' will no longer be
prohibited, it will print:

    crash> frame
    #0  <unavailable> in ?? ()

Instead of the previous 'prohibited' message:

    crash> frame
    crash: prohibited gdb command: frame

On PPC64, the default mode ("crash mode") will not have ANY OTHER changes,
other than 'frame' as mentioned above.

The major change will be in 'gdb mode' on PPC64: it will print the frames and
local variables, instead of failing with errors showing no frame, or showing
that it couldn't get the PC.

Testing:
========

Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-rfc2

To test gdb passthroughs:

    crash> set gdb on
    gdb> thread 3    # or any other thread number to change context in gdb
    gdb> bt
    gdb> frame
    gdb> up
    gdb> down
    gdb> info locals

Known Issues:
=============

1. In gdb mode, 'info threads' might hang for a few seconds, and print only 2
   threads.
2. In gdb mode, 'bt' might fail to show a backtrace in a few vmcores collected
   from older kernels. This is a known issue due to register mismatch, and its
   fix has been merged upstream:

   Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...

TODO:
=====

1. Introduce automatic thread selection in gdb mode, to select the crashing
   thread in gdb, eliminating the need to manually run "thread <id>" after
   switching to gdb mode.

Changelog:
==========

RFC V2:
  - removed patch implementing 'frame', 'up', 'down' in crash
  - updated the cover letter by removing the mention of those commands other
    than the respective gdb passthrough

Aditya Gupta (4):
  add generic get_dumpfile_regs to read registers
  ppc64: fix gdb passthrough by implementing machdep->get_cpu_reg
  remove 'frame' from prohibited commands list
  make cpu context change transparent to crash/gdb

 defs.h          | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
 gdb-10.2.patch  |  28 +++++++++++
 gdb_interface.c |   2 +-
 kernel.c        |  33 +++++++++++++
 ppc64.c         | 105 ++++++++++++++++++++++++++++++++++++++--
 tools.c         |  12 +++--
 6 files changed, 298 insertions(+), 7 deletions(-)

--
2.41.0
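The cover letter attributes the failing passthroughs to a per-architecture register hook that PPC64 does not provide. The following is only a minimal sketch of that dispatch pattern, not crash's actual code: `struct machdep_sketch`, `sk_ppc64_get_cpu_reg`, `sk_fetch_register`, the hook signature, and the register values are all hypothetical names and data invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an architecture hook table; not crash's real struct. */
struct machdep_sketch {
	/*
	 * Returns 1 and fills *value on success, 0 on failure.
	 * Left NULL when the architecture does not implement the hook.
	 */
	int (*get_cpu_reg)(int cpu, int regno, uint64_t *value);
};

enum { SK_NCPUS = 2, SK_NREGS = 3 };	/* illustrative sizes only */

/* Made-up register contents standing in for values read from a dump file. */
static const uint64_t sk_saved_regs[SK_NCPUS][SK_NREGS] = {
	{ UINT64_C(0xc000000000010000), UINT64_C(0xc000000002a0f000), 1 },
	{ UINT64_C(0xc000000000020000), UINT64_C(0xc000000002b0f000), 2 },
};

/* Hypothetical architecture hook: look a register up in the saved set. */
static int sk_ppc64_get_cpu_reg(int cpu, int regno, uint64_t *value)
{
	if (cpu < 0 || cpu >= SK_NCPUS || regno < 0 || regno >= SK_NREGS)
		return 0;
	*value = sk_saved_regs[cpu][regno];
	return 1;
}

/*
 * Debugger-side fetch: without a hook there is no register value,
 * so an unwinder has no PC or stack pointer to start from.
 */
static int sk_fetch_register(const struct machdep_sketch *md, int cpu,
			     int regno, uint64_t *value)
{
	if (md->get_cpu_reg == NULL)
		return 0;
	return md->get_cpu_reg(cpu, regno, value);
}
```

With the hook left as NULL, `sk_fetch_register` always fails, which mirrors the "no frame" symptom described above; installing `sk_ppc64_get_cpu_reg` makes the same call succeed for valid CPU and register numbers.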

1 year, 11 months


RISCV64: Use va_kernel_pa_offset in VTOP()

by Song Shuai

Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use PUD/P4D/PGD pages
for the linear mapping") changes phys_ram_base from kernel_map.phys_addr to the
start of DRAM.

Crash's VTOP() still uses phys_ram_base and kernel_map.virt_addr to translate
kernel virtual addresses, which makes Crash fail to start with Linux v6.4 and
later versions.

Let Linux export kernel_map.va_kernel_pa_offset in v6.5, so that Crash can use
"va_kernel_pa_offset" to translate kernel virtual addresses in VTOP() correctly.

Signed-off-by: Song Shuai <suagrfillet(a)gmail.com>
---
You can check/test the Linux changes from this link:
https://github.com/sugarfillet/linux/commits/6.5-rc3-crash
I'll send the Linux changes to riscv/for-next if you're OK with this patch.
---
 defs.h    |  4 ++--
 riscv64.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/defs.h b/defs.h
index 358f365..46b9857 100644
--- a/defs.h
+++ b/defs.h
@@ -3662,8 +3662,7 @@ typedef signed int s32;
 	ulong _X = X;								\
 	(THIS_KERNEL_VERSION >= LINUX(5,13,0) &&				\
 		(_X) >= machdep->machspec->kernel_link_addr) ?			\
-		(((unsigned long)(_X)-(machdep->machspec->kernel_link_addr)) +	\
-		 machdep->machspec->phys_base):					\
+		((unsigned long)(_X)-(machdep->machspec->va_kernel_pa_offset)):	\
 		(((unsigned long)(_X)-(machdep->kvbase)) +			\
 		 machdep->machspec->phys_base);					\
 	})
@@ -7021,6 +7020,7 @@ struct machine_specific {
 	ulong modules_vaddr;
 	ulong modules_end;
 	ulong kernel_link_addr;
+	ulong va_kernel_pa_offset;

 	ulong _page_present;
 	ulong _page_read;
diff --git a/riscv64.c b/riscv64.c
index 6b9a688..b9e50b4 100644
--- a/riscv64.c
+++ b/riscv64.c
@@ -418,6 +418,27 @@ error:
 	error(FATAL, "cannot get vm layout\n");
 }

+static void
+riscv64_get_va_kernel_pa_offset(struct machine_specific *ms)
+{
+	unsigned long kernel_version = riscv64_get_kernel_version();
+
+	/*
+	 * va_kernel_pa_offset is defined in Linux kernel since 6.5.
+	 */
+	if (kernel_version >= LINUX(6,5,0)) {
+		char *string;
+		if ((string = pc->read_vmcoreinfo("NUMBER(va_kernel_pa_offset)"))) {
+			ms->va_kernel_pa_offset = htol(string, QUIET, NULL);
+			free(string);
+		} else
+			error(FATAL, "cannot read va_kernel_pa_offset\n");
+	} else if (kernel_version >= LINUX(6,4,0))
+		error(FATAL, "cannot determine va_kernel_pa_offset since Linux 6.4\n");
+	else
+		ms->va_kernel_pa_offset = ms->kernel_link_addr - ms->phys_base;
+}
+
 static int
 riscv64_is_kvaddr(ulong vaddr)
 {
@@ -1352,6 +1373,7 @@ riscv64_init(int when)
 		riscv64_get_struct_page_size(machdep->machspec);
 		riscv64_get_va_bits(machdep->machspec);
 		riscv64_get_va_range(machdep->machspec);
+		riscv64_get_va_kernel_pa_offset(machdep->machspec);

 		pt_level_alloc(&machdep->pgd, "cannot malloc pgd space.");
 		pt_level_alloc(&machdep->machspec->p4d, "cannot malloc p4d space.");
--
2.20.1
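To make the arithmetic in the VTOP() change concrete, here is a small worked sketch of the two translation formulas from the hunk above. It is an illustration under stated assumptions, not code from the patch: the struct, the function names, and every address are hypothetical (in particular the kernel load address 0x80200000 and DRAM start 0x80000000 are made-up example values), and the sketch omits the macro's kernel-version check.

```c
#include <stdint.h>

/* Hypothetical snapshot of the fields the VTOP() macro reads. */
struct sk_riscv64_layout {
	uint64_t kernel_link_addr;	/* VA where the kernel image is linked */
	uint64_t va_kernel_pa_offset;	/* kernel image VA minus its PA */
	uint64_t kvbase;		/* start of the linear mapping */
	uint64_t phys_base;		/* physical base used for the linear mapping */
};

/* Pre-patch formula: kernel-image addresses are rebased onto phys_base. */
static uint64_t sk_vtop_old(const struct sk_riscv64_layout *l, uint64_t va)
{
	if (va >= l->kernel_link_addr)
		return (va - l->kernel_link_addr) + l->phys_base;
	return (va - l->kvbase) + l->phys_base;
}

/* Patched formula: kernel-image addresses use va_kernel_pa_offset instead. */
static uint64_t sk_vtop_new(const struct sk_riscv64_layout *l, uint64_t va)
{
	if (va >= l->kernel_link_addr)
		return va - l->va_kernel_pa_offset;
	return (va - l->kvbase) + l->phys_base;
}

/* Example layout with illustrative addresses only. */
static struct sk_riscv64_layout sk_example_layout(void)
{
	struct sk_riscv64_layout l;
	uint64_t kernel_load_pa = UINT64_C(0x80200000);	/* assumed load address */

	l.kernel_link_addr = UINT64_C(0xffffffff80000000);
	l.kvbase = UINT64_C(0xff60000000000000);
	l.phys_base = UINT64_C(0x80000000);	/* assumed start of DRAM */
	l.va_kernel_pa_offset = l.kernel_link_addr - kernel_load_pa;
	return l;
}
```

In this example, once phys_base points at the start of DRAM rather than at the kernel's load address, the old formula places a kernel-image address 0x200000 bytes too low, while the offset-based formula recovers the assumed load address. Linear-mapping addresses translate identically in both versions.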

1 year, 11 months


[RFC PATCH 0/5] Improve stack unwind on ppc64

by Aditya Gupta

The Problem:
============

Currently crash is unable to show function arguments and local variables, as
gdb can do. Moving between frames ('up'/'down') is also not working in crash.

Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
'bt', 'frame', 'info locals', 'up', 'down' are not working either, because gdb
does not get the register values from `crash_target::fetch_registers`, which
uses `machdep->get_cpu_reg`, which is not implemented for PPC64.

Proposed Solution:
==================

Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
This way, "gdb mode in crash" will support this feature for both ELF and
kdump-compressed vmcore formats, while "gdb" would only have supported the ELF
format.

Also, the backtrace can be slightly different in gdb and crash (because gdb can
also print inline frames), so it can cause confusion when 'bt' works in crash
context while 'frame'/'up'/'down' work as gdb passthroughs and show different
frames. This is explained in patch #4.

To prevent the confusion mentioned above, implement 'frame', 'up', 'down' as
commands in default crash mode also, instead of having them work via gdb
passthroughs as they do currently. Now, in default mode,
'bt', 'frame', 'up', 'down' will be consistent with each other.

Implications on Architectures:
====================================

No architecture other than PPC64 has been affected, other than in the case of
the 'frame', 'up', 'down' commands.

1. frame: As mentioned in patch #2, 'frame' will no longer be prohibited, and
   will print:

       crash> frame
       #0  <unavailable> in ?? ()

   Instead of the previous 'prohibited' message:

       crash> frame
       crash: prohibited gdb command: frame

2. up/down: These commands will now be run as native crash commands by default
   instead of showing:

       crash> up
       crash: ambiguous command: up (symbol and gdb command)
       crash> down
       crash: ambiguous command: down (symbol and gdb command)

On PPC64, the default mode ("crash mode") will not have ANY OTHER changes,
other than 'frame', 'up', 'down' as mentioned above.

The major change will be in 'gdb mode' on PPC64: it will print the frames and
local variables, instead of failing with errors showing no frame, or showing
that it couldn't get the PC.

Testing:
========

Git tree with this patch series applied:
https://github.ibm.com/adityag/crash (replace this link with github.com later)

To test 'frame'/'up'/'down' in crash (implemented in patch #3):

    crash> bt
    crash> frame
    crash> up
    crash> down
    crash> up 4

To test 'bt'/'frame'/'up'/'down'/'info locals' gdb passthroughs:

    crash> set gdb on
    gdb> thread 3    # or any other thread number to change context in gdb
    gdb> bt
    gdb> frame
    gdb> up
    gdb> down
    gdb> info locals

Known Issues:
=============

1. In gdb mode, 'info threads' might hang for a few seconds, and print only 2
   threads.
2. In gdb mode, 'bt' might fail to show a backtrace in a few vmcores collected
   from older kernels. This is a known issue due to register mismatch, and its
   fix has been accepted upstream:

   Commit: https://github.com/linuxppc/linux/commit/b684c09f09e7a6af3794d4233ef78581...

TODO:
=====

1. Introduce automatic thread selection in gdb mode, to select the crashing
   thread in gdb, eliminating the need to manually run "thread <id>" after
   switching to gdb mode.

Aditya Gupta (5):
  add generic get_dumpfile_regs to read registers
  ppc64: fix gdb passthrough by implementing machdep->get_cpu_reg
  remove 'frame' from prohibited commands list
  implement 'frame', 'up', 'down' inside crash
  make cpu context change transparent to crash/gdb

 defs.h          | 135 ++++++++++++++++++++++++++
 gdb-10.2.patch  |  28 ++++++
 gdb_interface.c |   2 +-
 global_data.c   |   3 +
 help.c          |  34 +++++++
 kernel.c        | 183 +++++++++++++++++++++++++++++++++++
 ppc64.c         | 250 +++++++++++++++++++++++++++++++++++++++++++++++-
 task.c          |   1 +
 tools.c         |  12 ++-
 9 files changed, 641 insertions(+), 7 deletions(-)

--
2.41.0
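For readers unfamiliar with frame selection, the 'up'/'down' behaviour exercised in the testing steps (including `up 4`) can be pictured as moving a cursor over the backtrace. The sketch below is purely illustrative and is not taken from the patch series: `struct sk_frame_cursor`, `sk_frame_up`, and `sk_frame_down` are hypothetical names, and clamping at either end of the stack is an assumption about reasonable behaviour rather than a description of what patch #3 does.

```c
/* Hypothetical cursor over a backtrace; frame 0 is the innermost frame. */
struct sk_frame_cursor {
	int selected;	/* currently selected frame number */
	int depth;	/* number of frames in the backtrace, at least 1 */
};

/*
 * Move up to n frames toward the outermost frame.
 * Returns how many frames the cursor actually moved.
 */
static int sk_frame_up(struct sk_frame_cursor *c, int n)
{
	int room = (c->depth - 1) - c->selected;
	int step = n < room ? n : room;

	if (step < 0)
		step = 0;
	c->selected += step;
	return step;
}

/*
 * Move up to n frames toward the innermost frame.
 * Returns how many frames the cursor actually moved.
 */
static int sk_frame_down(struct sk_frame_cursor *c, int n)
{
	int step = n < c->selected ? n : c->selected;

	if (step < 0)
		step = 0;
	c->selected -= step;
	return step;
}
```

Keeping one cursor shared by 'bt', 'frame', 'up', and 'down' is what makes the commands agree on frame numbering, which is the consistency concern the cover letter raises about mixing crash's 'bt' with gdb's frame commands.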

1 year, 11 months


[PATCH v2] Fix the "foreach DE" task identifier displays incorrect state tasks.

by Lianbo Jiang

Currently, the "foreach DE ps -m" command may display "DE" as well as "ZO"
state tasks, as below:

  crash> foreach DE ps -m
  ...
  [0 00:00:00.040] [ZO]  PID: 11458  TASK: ffff91c75680d280  CPU: 7   COMMAND: "ora_w01o_p01mci"
  [0 00:00:00.044] [ZO]  PID: 49118  TASK: ffff91c7bf3e8000  CPU: 19  COMMAND: "oracle_49118_p0"
  [0 00:00:00.050] [ZO]  PID: 28748  TASK: ffff91a7cbde3180  CPU: 2   COMMAND: "ora_imr0_p01sci"
  [0 00:00:00.050] [DE]  PID: 28405  TASK: ffff91a7c8eb0000  CPU: 27  COMMAND: "ora_vktm_p01sci"
  [0 00:00:00.051] [ZO]  PID: 31716  TASK: ffff91a7f7192100  CPU: 6   COMMAND: "ora_p001_p01sci"
  ...

That is not the expected behavior; the "foreach" command needs to handle such
cases. Let's add a check to determine whether a task state identifier is
specified and whether it is equal to the actual task state identifier, so that
tasks in other states are filtered out.

With the patch:

  crash> foreach DE ps -m
  [0 00:00:00.050] [DE]  PID: 28405  TASK: ffff91a7c8eb0000  CPU: 27  COMMAND: "ora_vktm_p01sci"
  crash>

Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 defs.h |  2 +-
 task.c | 52 +++++++++++++++++++---------------------------------
 2 files changed, 20 insertions(+), 34 deletions(-)

diff --git a/defs.h b/defs.h
index 358f365585cf..5ee60f1eb3a5 100644
--- a/defs.h
+++ b/defs.h
@@ -1203,7 +1203,7 @@ struct foreach_data {
 		char *pattern;
 		regex_t regex;
 	} regex_info[MAX_REGEX_ARGS];
-	ulong state;
+	const char *state;
 	char *reference;
 	int keys;
 	int pids;
diff --git a/task.c b/task.c
index b9076da35565..20a9ce3aa40b 100644
--- a/task.c
+++ b/task.c
@@ -6636,39 +6636,42 @@ cmd_foreach(void)
 		    STREQ(args[optind], "NE") ||
 		    STREQ(args[optind], "SW")) {

+			ulong state = TASK_STATE_UNINITIALIZED;
+
 			if (fd->flags & FOREACH_STATE)
 				error(FATAL,
 				    "only one task state allowed\n");

 			if (STREQ(args[optind], "RU"))
-				fd->state = _RUNNING_;
+				state = _RUNNING_;
 			else if (STREQ(args[optind], "IN"))
-				fd->state = _INTERRUPTIBLE_;
+				state = _INTERRUPTIBLE_;
 			else if (STREQ(args[optind], "UN"))
-				fd->state = _UNINTERRUPTIBLE_;
+				state = _UNINTERRUPTIBLE_;
 			else if (STREQ(args[optind], "ST"))
-				fd->state = _STOPPED_;
+				state = _STOPPED_;
 			else if (STREQ(args[optind], "TR"))
-				fd->state = _TRACING_STOPPED_;
+				state = _TRACING_STOPPED_;
 			else if (STREQ(args[optind], "ZO"))
-				fd->state = _ZOMBIE_;
+				state = _ZOMBIE_;
 			else if (STREQ(args[optind], "DE"))
-				fd->state = _DEAD_;
+				state = _DEAD_;
 			else if (STREQ(args[optind], "SW"))
-				fd->state = _SWAPPING_;
+				state = _SWAPPING_;
 			else if (STREQ(args[optind], "PA"))
-				fd->state = _PARKED_;
+				state = _PARKED_;
 			else if (STREQ(args[optind], "WA"))
-				fd->state = _WAKING_;
+				state = _WAKING_;
 			else if (STREQ(args[optind], "ID"))
-				fd->state = _UNINTERRUPTIBLE_|_NOLOAD_;
+				state = _UNINTERRUPTIBLE_|_NOLOAD_;
 			else if (STREQ(args[optind], "NE"))
-				fd->state = _NEW_;
+				state = _NEW_;

-			if (fd->state == TASK_STATE_UNINITIALIZED)
+			if (state == TASK_STATE_UNINITIALIZED)
 				error(FATAL,
 				    "invalid task state for this kernel: %s\n",
 					args[optind]);

+			fd->state = args[optind];
 			fd->flags |= FOREACH_STATE;

 			optind++;
@@ -7039,26 +7042,9 @@ foreach(struct foreach_data *fd)
 		if ((fd->flags & FOREACH_KERNEL) && !is_kernel_thread(tc->task))
 			continue;

-		if (fd->flags & FOREACH_STATE) {
-			if (fd->state == _RUNNING_) {
-				if (task_state(tc->task) != _RUNNING_)
-					continue;
-			} else if (fd->state & _UNINTERRUPTIBLE_) {
-				if (!(task_state(tc->task) & _UNINTERRUPTIBLE_))
-					continue;
-
-				if (valid_task_state(_NOLOAD_)) {
-					if (fd->state & _NOLOAD_) {
-						if (!(task_state(tc->task) & _NOLOAD_))
-							continue;
-					} else {
-						if ((task_state(tc->task) & _NOLOAD_))
-							continue;
-					}
-				}
-			} else if (!(task_state(tc->task) & fd->state))
-				continue;
-		}
+		if ((fd->flags & FOREACH_STATE) &&
+		    (!STRNEQ(task_state_string(tc->task, buf, 0), fd->state)))
+			continue;

 		if (specified) {
 			for (j = 0; j < fd->tasks; j++) {
--
2.37.1
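The patch replaces a bitmask test with a comparison of state identifier strings. The following sketch shows, under explicit assumptions, how a mask test can match more tasks than an identifier comparison. Everything here is hypothetical: the bit values, the idea that a zombie task also carries the "dead" bit, and the helper names are invented for illustration and are not taken from crash or the kernel. The sketch also uses a plain `strcmp` rather than crash's `STRNEQ` macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state bits; the overlap below is an assumption for illustration. */
#define SK_DEAD		0x80u
#define SK_ZOMBIE	0x20u

struct sk_task {
	unsigned int state;	/* combined state bits of one task */
};

/* Map combined state bits to a single two-letter identifier. */
static const char *sk_state_id(unsigned int state)
{
	if (state & SK_ZOMBIE)
		return "ZO";
	if (state & SK_DEAD)
		return "DE";
	return "RU";
}

/* Mask-style filter: any shared bit counts as a match. */
static size_t sk_count_by_mask(const struct sk_task *t, size_t n,
			       unsigned int wanted)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (t[i].state & wanted)
			hits++;
	return hits;
}

/* Identifier-style filter: the displayed identifier must equal the request. */
static size_t sk_count_by_id(const struct sk_task *t, size_t n,
			     const char *wanted)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (strcmp(sk_state_id(t[i].state), wanted) == 0)
			hits++;
	return hits;
}
```

Given one task with both bits set and one with only the dead bit, the mask filter for `SK_DEAD` selects both, whereas the identifier filter for "DE" selects only the second. Comparing against the same string that is displayed guarantees the filter and the output column cannot disagree.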

1 year, 11 months


[RFC][PATCH 0/1] add loongarch64 platform support.

by Ming Wang

This patch is for the Crash-utility tool. It makes crash support the
loongarch64 architecture and the common commands (bt, p, rd, mod, log, set,
dis, and so on).

The upstream GDB code supports the loongarch64 architecture from version 13.1.
See: https://sourceware.org/gdb/download/ANNOUNCEMENT

But Crash-utility depends on gdb-10.2, and gdb-10.2 does NOT support
loongarch64, so we need a patch (gdb-10.2-loongarch.patch) to support it.
I don't have a better way to deal with this problem at the moment.

I tested this patch on the Loongson 3C50000 processor platform.

...
      KERNEL: /usr/lib/debug/lib/modules/5.10.0-60.102.0.128.oe2203.loongarch64/vmlinux
    DUMPFILE: /proc/kcore
        CPUS: 16
        DATE: Thu Jul 27 19:51:21 CST 2023
      UPTIME: 06:35:11
LOAD AVERAGE: 0.15, 0.03, 0.01
       TASKS: 257
    NODENAME: localhost.localdomain
     RELEASE: 5.10.0-60.102.0.128.oe2203.loongarch64
     VERSION: #1 SMP Fri Jul 14 04:17:09 UTC 2023
     MACHINE: loongarch64  (2200 Mhz)
      MEMORY: 64 GB
         PID: 2964
     COMMAND: "crash"
        TASK: 9000000098805500  [THREAD_INFO: 9000000094d48000]
         CPU: 6
       STATE: TASK_RUNNING (ACTIVE)

crash>
crash> dis -l start_kernel
/linux-5.10.0-60.102.0.128.oe2203.loongarch64/init/main.c: 883
0x9000000001030818 <start_kernel>:      0x0141ee40
/linux-5.10.0-60.102.0.128.oe2203.loongarch64/init/main.c: 879
0x900000000103081c <start_kernel+4>:    0x90000000
/linux-5.10.0-60.102.0.128.oe2203.loongarch64/init/main.c: 883
0x9000000001030820 <start_kernel+8>:    addu16i.d       $zero, $t8, 8179(0x1ff3)
/linux-5.10.0-60.102.0.128.oe2203.loongarch64/init/main.c: 879
...

About the LoongArch64 Architecture:
https://www.kernel.org/doc/html/latest/loongarch/index.html

After this RFC, I will split this big patch into many small patches by
function, like the RISCV64 patch sets.

Ming Wang (1):
  loongarch64: Support loongarch64 architecture and common commands

 Makefile                 |     9 +-
 README                   |     4 +-
 configure.c              |    27 +-
 crash.8                  |     2 +-
 defs.h                   |   161 +-
 diskdump.c               |    24 +-
 gdb-10.2-loongarch.patch | 15207 +++++++++++++++++++++++++++++++++++++
 gdb_interface.c          |     1 -
 help.c                   |     9 +-
 lkcd_vmdump_v1.h         |     2 +-
 lkcd_vmdump_v2_v3.h      |     5 +-
 loongarch64.c            |  1347 ++++
 main.c                   |     3 +-
 netdump.c                |    26 +-
 ramdump.c                |     2 +
 symbols.c                |    26 +-
 16 files changed, 16832 insertions(+), 23 deletions(-)
 create mode 100644 gdb-10.2-loongarch.patch
 create mode 100644 loongarch64.c

base-commit: c74f375e0ef7cd9b593fa1d73c47505822c8f2a0
--
2.39.2

1 year, 11 months


[PATCH] Fix the "foreach DE" task identifier displays incorrect state tasks.

by Lianbo Jiang

1 year, 11 months

