Crash tool failed to parse vmcore from Linux v6.15 on RISCV
by Pnina Feder
Hi,
When parsing a vmcore generated by Linux 6.15 on RISC-V, the crash tool fails.
We identified the root cause: the tool is unable to read memory addresses that are marked as reserved in the /proc/iomem map.
These addresses are missing from the vmcore, yet certain kernel structures (e.g., IRQ pointers) reference them.
This issue did not occur with Linux 6.14, where the same addresses were not marked as reserved in /proc/iomem and were correctly included in the vmcore.
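To make the symptom concrete, here is a minimal sketch (not crash or kernel code) of checking whether a physical address falls inside a range that a /proc/iomem-style map labels "Reserved". The helper name paddr_is_reserved() is an assumption for illustration, and translating a kernel virtual address to a physical one is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true if `paddr` lies inside a range that
 * the given /proc/iomem-style text labels "Reserved".
 * Each line is expected to look like "  80000000-8003ffff : Reserved". */
static bool paddr_is_reserved(const char *iomem_text, unsigned long long paddr)
{
	const char *line = iomem_text;

	assert(iomem_text != NULL);
	while (line && *line) {
		unsigned long long start, end;
		char name[64];

		/* sscanf skips the leading indentation; the name may contain spaces. */
		if (sscanf(line, " %llx-%llx : %63[^\n]", &start, &end, name) == 3 &&
		    strcmp(name, "Reserved") == 0 &&
		    paddr >= start && paddr <= end)
			return true;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return false;
}
```

Feeding it the contents of /proc/iomem from the 6.14 and 6.15 kernels, together with the physical address behind one of the failing reads, would show whether that address moved into a "Reserved" range.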
Has anybody else seen something like this?
Thanks,
Pnina
2 hours, 28 minutes
Re: [PATCH v4 0/5] gdb multi-stack unwinding support
by lijiang
On Wed, Jun 25, 2025 at 12:04 PM <devel-request(a)lists.crash-utility.osci.io>
wrote:
> Date: Wed, 25 Jun 2025 16:01:58 +1200
> From: Tao Liu <ltao(a)redhat.com>
> Subject: [Crash-utility] [PATCH v4 0/5] gdb multi-stack unwinding
> support
> To: devel(a)lists.crash-utility.osci.io
> Cc: Tao Liu <ltao(a)redhat.com>
> Message-ID: <20250625040203.60334-1-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> This patchset is based on Alexey's work [1], and is the follow-up of the
> previous "gdb stack unwinding support for crash utility" patchset.
>
> Currently the gdb target analyzes only one task at a time, and it unwinds
> a single contiguous stack until it reaches the end of that stack. If
> stacks were chained during exceptions or interrupts, gdb bt shows only
> the topmost one.
>
> This patchset introduces multiple-stack support for gdb stack unwinding.
> Each stack is presented as a separate thread from gdb's perspective. A
> short usage is as follows:
>
> 'set <PID>' - to switch to a specific task
> 'gdb info threads' - to see list of in-kernel stacks of this task.
> 'gdb thread <ID>' - to switch to the stack.
> 'gdb bt' - to unwind it.
>
> E.g., with the patchset:
>
> crash> bt
> PID: 17636 TASK: ffff88032e0742c0 CPU: 11 COMMAND: "kworker/11:4"
> #0 [ffff88037fca6b58] machine_kexec at ffffffff8103cef2
> #1 [ffff88037fca6ba8] crash_kexec at ffffffff810c9aa3
> #2 [ffff88037fca6c70] panic at ffffffff815f0444
> ...
> #9 [ffff88037fca6ec8] do_nmi at ffffffff815fd980
> #10 [ffff88037fca6ef0] end_repeat_nmi at ffffffff815fcec1
> [exception RIP: memcpy+13]
> RIP: ffffffff812f5b1d RSP: ffff88034f2a9728 RFLAGS: 00010046
> RAX: ffffc900139fe000 RBX: ffff880374b7a1b0 RCX: 0000000000000030
> RBP: ffff88034f2a9778 R8: 000000007fffffff R9: 00000000ffffffff
> ...
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> --- <NMI exception stack> ---
> #11 [ffff88034f2a9728] memcpy at ffffffff812f5b1d
> #12 [ffff88034f2a9728] mga_dirty_update at ffffffffa024ad2b [mgag200]
> #13 [ffff88034f2a9780] mga_imageblit at ffffffffa024ae3f [mgag200]
> #14 [ffff88034f2a97a0] bit_putcs at ffffffff813424ef
> ...
>
> crash> info threads
> Id Target Id Frame
> * 1 17636 kworker/11:4 (stack 0) crash_setup_regs (oldregs=0x0,
> newregs=0xffff88037fca6bb0)
> 2 17636 kworker/11:4 (stack 1) 0xffffffff812f5b1d in memcpy ()
>
> crash> thread 2
> crash> gdb bt
> #0 0xffffffff812f5b1d in memcpy () at arch/x86/lib/memcpy_64.S:69
> ...
>
> There are 2 stacks of the current task, and we can list/switch-to/unwind
> each stack.
>
> [1]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01204.html
>
> v2 -> v1: 1) Rebase this patchset onto gdb-16.2 [2].
> 2) Improved the silent_call_bt() to catch the error FATAL.
>
> [2]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01354.html
>
> v3 -> v2: 1) Rebase this patchset to crash v9.0.0.
> 2) Fix v2's segfault in cmd "bt -E".
> 3) Eliminate repeated stacks by adding constraints before
> gdb_add_substack().
>
> v4 -> v3: 1) Fix the compiler warning for silent_call_bt().
> 2) Add a link to the known issue found on the ppc arch.
>
>
Thank you for the update, Tao.
For the v4: Ack
Thanks
Lianbo
> Tao Liu (5):
> Add multi-threads support in crash target
> Call cmd_bt silently after "set pid"
> x86_64: Add gdb multi-stack unwind support
> arm64: Add gdb multi-stack unwind support
> ppc64: Add gdb multi-stack unwind support
>
> arm64.c | 102 +++++++++++++++++++++++++++++++++--
> crash_target.c | 49 +++++++++++++++--
> defs.h | 3 +-
> gdb_interface.c | 6 +--
> kernel.c | 44 +++++++++++++++
> ppc64.c | 78 ++++++++++++++++++++++++---
> task.c | 4 +-
> x86_64.c | 138 +++++++++++++++++++++++++++++++++++++++++++++---
> 8 files changed, 394 insertions(+), 30 deletions(-)
>
> --
> 2.47.0
>
3 hours, 12 minutes
[PATCH] Filter repeated rq for cmd dev -d/-D
by Tao Liu
CEE reported that "dev -d/-D" reports incorrect read/write
counts:
crash> dev -d
MAJOR GENDISK NAME REQUEST_QUEUE TOTAL ASYNC SYNC
8 ffff90528df86000 sda ffff9052a3d61800 144 144 0
8 ffff905280718c00 sdb ffff9052a3d63c00 48 48 0
crash> epython rqlist
ffff90528e94a5c0 sda is unknown, deadline: 89.992 (90) rq_alloc: 0.196
ffff90528e92f700 sda is unknown, deadline: 89.998 (90) rq_alloc: 0.202
ffff90528e95ccc0 sda is unknown, deadline: 89.999 (90) rq_alloc: 0.203
ffff90528e968bc0 sdb is unknown, deadline: 89.997 (90) rq_alloc: 0.201
The ASYNC value of 144 is incorrect: epython rqlist shows only 3 items for sda.
The reason is that mq_check_inflight() may get the same rq multiple times during
iteration, so it is counted repeatedly.
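The idea of skipping an rq that has already been counted can be sketched in isolation as follows. This is a simplified stand-in rather than the code in the patch: the names seen_set, seen_set_add() and count_unique() are hypothetical, and the list grows only when it is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container for request addresses that were already counted. */
struct seen_set {
	unsigned long *items;
	size_t len;
	size_t cap;
};

/* Returns true if `rq` is new (and records it); returns false if it was
 * seen before or if memory for the list could not be obtained. */
static bool seen_set_add(struct seen_set *s, unsigned long rq)
{
	/* Linear scan: fine for the small number of in-flight requests. */
	for (size_t i = 0; i < s->len; i++)
		if (s->items[i] == rq)
			return false;

	/* Double the capacity when the list is full. */
	if (s->len == s->cap) {
		size_t new_cap = s->cap ? s->cap * 2 : 4;
		unsigned long *tmp = realloc(s->items, new_cap * sizeof(*tmp));

		if (!tmp)
			return false;
		s->items = tmp;
		s->cap = new_cap;
	}
	assert(s->len < s->cap);
	s->items[s->len++] = rq;
	return true;
}

/* Counts distinct values in `rqs`, the way a de-duplicating iterator would. */
static size_t count_unique(const unsigned long *rqs, size_t n)
{
	struct seen_set s = {0};
	size_t unique = 0;

	for (size_t i = 0; i < n; i++)
		if (seen_set_add(&s, rqs[i]))
			unique++;
	free(s.items);
	return unique;
}
```

With an input in which the same three addresses repeat many times, count_unique() reports 3, which matches the kind of correction shown for sda.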
This patch adds a check for repeated rqs. After applying the patch:
crash> dev -d
MAJOR GENDISK NAME REQUEST_QUEUE TOTAL READ WRITE
8 ffff90528df86000 sda ffff9052a3d61800 3 3 0
8 ffff905280718c00 sdb ffff9052a3d63c00 1 1 0
Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
dev.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/dev.c b/dev.c
index 9d38aef..b0434cf 100644
--- a/dev.c
+++ b/dev.c
@@ -4316,6 +4316,9 @@ struct bt_iter_data {
ulong tags;
uint reserved;
uint nr_reserved_tags;
+ ulong **rq_list;
+ int *rq_list_len;
+ int *rq_list_cap;
busy_tag_iter_fn *fn;
void *data;
};
@@ -4381,10 +4384,30 @@ static bool bt_iter(uint bitnr, void *data)
if (!readmem(addr, KVADDR, &rq, sizeof(ulong), "blk_mq_tags.rqs[]", RETURN_ON_ERROR))
return FALSE;
+ for (int i = 0; i < *iter_data->rq_list_len; i++) {
+ /* Skip the handled rq */
+ if ((*iter_data->rq_list)[i] == rq)
+ return TRUE;
+ }
+ /* Mark the rq is handled */
+ (*iter_data->rq_list)[(*iter_data->rq_list_len)++] = rq;
+ if (*iter_data->rq_list_len > *iter_data->rq_list_cap / 2) {
+ *iter_data->rq_list_cap <<= 1;
+ ulong *tmp = reallocarray(*iter_data->rq_list,
+ *iter_data->rq_list_cap, sizeof(ulong));
+ if (!tmp) {
+ free(*iter_data->rq_list);
+ error(FATAL, "cannot reallocarray rq_list array");
+ }
+ *iter_data->rq_list = tmp;
+ }
+
return iter_data->fn(rq, iter_data->data);
}
-static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved, uint nr_resvd_tags, struct diskio *dio)
+static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved,
+ uint nr_resvd_tags, ulong **rq_list, int *rq_list_len,
+ int *rq_list_cap, struct diskio *dio)
{
struct sbitmap_context sc = {0};
struct mq_inflight mi = {
@@ -4395,6 +4418,9 @@ static void bt_for_each(ulong q, ulong tags, ulong sbq, uint reserved, uint nr_r
.tags = tags,
.reserved = reserved,
.nr_reserved_tags = nr_resvd_tags,
+ .rq_list = rq_list,
+ .rq_list_len = rq_list_len,
+ .rq_list_cap = rq_list_cap,
.fn = mq_check_inflight,
.data = &mi,
};
@@ -4407,10 +4433,18 @@ static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio
{
uint i;
int bitmap_tags_is_ptr = 0;
+ ulong *rq_list;
+ int rq_list_len = 0;
+ int rq_list_cap = 1;
if (MEMBER_TYPE("blk_mq_tags", "bitmap_tags") == TYPE_CODE_PTR)
bitmap_tags_is_ptr = 1;
+ rq_list = calloc(rq_list_cap, sizeof(ulong));
+ if (!rq_list) {
+ error(FATAL, "cannot malloc rq_list array");
+ }
+
for (i = 0; i < cnt; i++) {
ulong addr = 0, tags = 0;
uint nr_reserved_tags = 0;
@@ -4432,15 +4466,18 @@ static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio
!readmem(addr, KVADDR, &addr, sizeof(ulong),
"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
break;
- bt_for_each(q, tags, addr, 1, nr_reserved_tags, dio);
+ bt_for_each(q, tags, addr, 1, nr_reserved_tags, &rq_list,
+ &rq_list_len, &rq_list_cap, dio);
}
addr = tags + OFFSET(blk_mq_tags_bitmap_tags);
if (bitmap_tags_is_ptr &&
!readmem(addr, KVADDR, &addr, sizeof(ulong),
"blk_mq_tags.bitmap_tags", RETURN_ON_ERROR))
break;
- bt_for_each(q, tags, addr, 0, nr_reserved_tags, dio);
+ bt_for_each(q, tags, addr, 0, nr_reserved_tags, &rq_list,
+ &rq_list_len, &rq_list_cap, dio);
}
+ free(rq_list);
}
static void get_mq_diskio_from_hw_queues(ulong q, struct diskio *dio)
--
2.47.0
11 hours, 43 minutes
Crash tool fails to start with vmcore from Linux 6.15 on riscv64 (the same works on Linux 6.14)
by pnina.feder@mobileye.com
crash: CONFIG_NR_CPUS: 32
crash: CONFIG_HZ: 1000
crash: # CONFIG_DEBUG_INFO_REDUCED is not set
cpu_possible_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_present_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_online_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpu_active_mask: cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
xtime timespec.tv_sec: 0: Thu Jan 1 02:00:00 IST 1970
utsname:
sysname: Linux
nodename: buildroot
release: 6.15.0
version: #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
machine: riscv64
domainname: (none)
base kernel version: 6.15.0
verify_namelist:
dumpfile /proc/version:
Linux version 6.15.0 (pfeder@epgd034) (riscv64-mti-linux-gnu-gcc (MIPS GNU Tools v1.13 for RISC-V Linux) 11.2.0, GNU ld (MIPS GNU Tools v1.13 for RISC-V Linux) 2.42) #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
/localdrive/users/pfeder/EQ7_22_06/open_src/eyeq7/buildroot/out-eq7-qemu/build/linux-custom/vmlinux:
Linux version 6.15.0 (pfeder@epgd034) (riscv64-mti-linux-gnu-gcc (MIPS GNU Tools v1.13 for RISC-V Linux) 11.2.0, GNU ld (MIPS GNU Tools v1.13 for RISC-V Linux) 2.42) #1 SMP PREEMPT_RT Mon Jun 23 15:09:03 IDT 2025
crash: get_cpus_present: present: 24
crash: get_cpus_present: present: 24
hypervisor: (undetermined)
irq_stack_ptr:
type: 1, TYPE_CODE_PTR
target_typecode: 17, other
target_length: 8
length: 8
IRQ stack pointer[0] is ffffffd6fbdce068
crash: read error: kernel virtual address: ffffffd6fbdce068 type: "IRQ stack pointer"
IRQ stack pointer[1] is ffffffd6fbde3068
crash: read error: kernel virtual address: ffffffd6fbde3068 type: "IRQ stack pointer"
IRQ stack pointer[2] is ffffffd6fbdf8068
crash: read error: kernel virtual address: ffffffd6fbdf8068 type: "IRQ stack pointer"
IRQ stack pointer[3] is ffffffd6fbe0d068
crash: read error: kernel virtual address: ffffffd6fbe0d068 type: "IRQ stack pointer"
IRQ stack pointer[4] is ffffffd6fbe22068
crash: read error: kernel virtual address: ffffffd6fbe22068 type: "IRQ stack pointer"
IRQ stack pointer[5] is ffffffd6fbe37068
crash: read error: kernel virtual address: ffffffd6fbe37068 type: "IRQ stack pointer"
IRQ stack pointer[6] is ffffffd6fbe4c068
crash: read error: kernel virtual address: ffffffd6fbe4c068 type: "IRQ stack pointer"
IRQ stack pointer[7] is ffffffd6fbe61068
crash: read error: kernel virtual address: ffffffd6fbe61068 type: "IRQ stack pointer"
IRQ stack pointer[8] is ffffffd6fbe76068
crash: read error: kernel virtual address: ffffffd6fbe76068 type: "IRQ stack pointer"
IRQ stack pointer[9] is ffffffd6fbe8b068
crash: read error: kernel virtual address: ffffffd6fbe8b068 type: "IRQ stack pointer"
IRQ stack pointer[10] is ffffffd6fbea0068
crash: read error: kernel virtual address: ffffffd6fbea0068 type: "IRQ stack pointer"
IRQ stack pointer[11] is ffffffd6fbeb5068
crash: read error: kernel virtual address: ffffffd6fbeb5068 type: "IRQ stack pointer"
IRQ stack pointer[12] is ffffffd6fbeca068
crash: read error: kernel virtual address: ffffffd6fbeca068 type: "IRQ stack pointer"
IRQ stack pointer[13] is ffffffd6fbedf068
crash: read error: kernel virtual address: ffffffd6fbedf068 type: "IRQ stack pointer"
IRQ stack pointer[14] is ffffffd6fbef4068
crash: read error: kernel virtual address: ffffffd6fbef4068 type: "IRQ stack pointer"
IRQ stack pointer[15] is ffffffd6fbf09068
crash: read error: kernel virtual address: ffffffd6fbf09068 type: "IRQ stack pointer"
IRQ stack pointer[16] is ffffffd6fbf1e068
crash: read error: kernel virtual address: ffffffd6fbf1e068 type: "IRQ stack pointer"
IRQ stack pointer[17] is ffffffd6fbf33068
crash: read error: kernel virtual address: ffffffd6fbf33068 type: "IRQ stack pointer"
IRQ stack pointer[18] is ffffffd6fbf48068
crash: read error: kernel virtual address: ffffffd6fbf48068 type: "IRQ stack pointer"
IRQ stack pointer[19] is ffffffd6fbf5d068
crash: read error: kernel virtual address: ffffffd6fbf5d068 type: "IRQ stack pointer"
IRQ stack pointer[20] is ffffffd6fbf72068
crash: read error: kernel virtual address: ffffffd6fbf72068 type: "IRQ stack pointer"
IRQ stack pointer[21] is ffffffd6fbf87068
crash: read error: kernel virtual address: ffffffd6fbf87068 type: "IRQ stack pointer"
IRQ stack pointer[22] is ffffffd6fbf9c068
crash: read error: kernel virtual address: ffffffd6fbf9c068 type: "IRQ stack pointer"
IRQ stack pointer[23] is ffffffd6fbfb1068
crash: read error: kernel virtual address: ffffffd6fbfb1068 type: "IRQ stack pointer"
overflow_stack:
type: 2, TYPE_CODE_ARRAY
target_typecode: 8, TYPE_CODE_INT
target_length: 8
length: 4096
kernel NR_CPUS: 32
node_online_map: [1] -> nodes online: 1
node_table[0]:
id: 0
pgdat: ffffffff80ebb980
size: 1048576
present: 1048576
mem_map: ffffffd6fbfc9200
start_paddr: 800000000
start_mapnr: 8388608
NOTE: page_hash_table does not exist in this kernel
please wait... (gathering kmem slab cache data)
kmem_cache_downsize: 192 to 192
pageflags from pageflag_names:
00000001 locked
00000080 waiters
00000004 referenced
00000008 uptodate
00000010 dirty
00000020 lru
00000100 active
00000200 workingset
00000400 owner_priv_1
00000800 owner_2
00001000 arch_1
00002000 reserved
00004000 private
00008000 private_2
00000002 writeback
00000040 head
00010000 reclaim
00020000 swapbacked
00040000 unevictable
00080000 dropbehind
00100000 mlocked
crash: read error: kernel virtual address: ffffffd6fbddee00 type: "note_buf_t"
WARNING: cannot find NT_PRSTATUS note for cpu: 0
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: get_cpus_online: online: 24
crash: struct module_memory detected.
crash: read error: kernel virtual address: ffffffd6fbdd8880 type: "runqueues entry (per_cpu)"
1 day, 2 hours
[PATCH v4 0/5] gdb multi-stack unwinding support
by Tao Liu
This patchset is based on Alexey's work [1], and is the follow-up of the
previous "gdb stack unwinding support for crash utility" patchset.
Currently the gdb target analyzes only one task at a time, and it unwinds
a single contiguous stack until it reaches the end of that stack. If
stacks were chained during exceptions or interrupts, gdb bt shows only
the topmost one.
This patchset introduces multiple-stack support for gdb stack unwinding.
Each stack is presented as a separate thread from gdb's perspective. A
short usage is as follows:
'set <PID>' - to switch to a specific task
'gdb info threads' - to see list of in-kernel stacks of this task.
'gdb thread <ID>' - to switch to the stack.
'gdb bt' - to unwind it.
E.g., with the patchset:
crash> bt
PID: 17636 TASK: ffff88032e0742c0 CPU: 11 COMMAND: "kworker/11:4"
#0 [ffff88037fca6b58] machine_kexec at ffffffff8103cef2
#1 [ffff88037fca6ba8] crash_kexec at ffffffff810c9aa3
#2 [ffff88037fca6c70] panic at ffffffff815f0444
...
#9 [ffff88037fca6ec8] do_nmi at ffffffff815fd980
#10 [ffff88037fca6ef0] end_repeat_nmi at ffffffff815fcec1
[exception RIP: memcpy+13]
RIP: ffffffff812f5b1d RSP: ffff88034f2a9728 RFLAGS: 00010046
RAX: ffffc900139fe000 RBX: ffff880374b7a1b0 RCX: 0000000000000030
RBP: ffff88034f2a9778 R8: 000000007fffffff R9: 00000000ffffffff
...
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#11 [ffff88034f2a9728] memcpy at ffffffff812f5b1d
#12 [ffff88034f2a9728] mga_dirty_update at ffffffffa024ad2b [mgag200]
#13 [ffff88034f2a9780] mga_imageblit at ffffffffa024ae3f [mgag200]
#14 [ffff88034f2a97a0] bit_putcs at ffffffff813424ef
...
crash> info threads
Id Target Id Frame
* 1 17636 kworker/11:4 (stack 0) crash_setup_regs (oldregs=0x0, newregs=0xffff88037fca6bb0)
2 17636 kworker/11:4 (stack 1) 0xffffffff812f5b1d in memcpy ()
crash> thread 2
crash> gdb bt
#0 0xffffffff812f5b1d in memcpy () at arch/x86/lib/memcpy_64.S:69
...
There are 2 stacks of the current task, and we can list/switch-to/unwind
each stack.
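As a rough illustration of the bookkeeping this implies, the sketch below records one register snapshot per additional stack and hands out a 1-based id for each. All names here (stack_regs, extra_stacks_add(), MAX_EXTRA_STACKS and its value) are assumptions made for the example, and the duplicate check is a simplified stand-in for the constraints used in the series.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_EXTRA_STACKS 7	/* illustrative limit, not taken from crash */

/* Hypothetical register snapshot for one in-kernel stack. */
struct stack_regs {
	unsigned long pc;
	unsigned long sp;
};

static struct stack_regs *extra_stacks[MAX_EXTRA_STACKS];
static size_t extra_stacks_count;

/* Forget the stacks recorded for the previously selected task. */
static void extra_stacks_reset(void)
{
	for (size_t i = 0; i < extra_stacks_count; i++) {
		free(extra_stacks[i]);
		extra_stacks[i] = NULL;
	}
	extra_stacks_count = 0;
}

/* Record a stack found while unwinding. Returns its 1-based id, or 0 if
 * the stack was skipped (same as the primary one, no free slot, or no
 * memory). `primary` may be NULL when no primary registers are known. */
static size_t extra_stacks_add(unsigned long pc, unsigned long sp,
			       const struct stack_regs *primary)
{
	struct stack_regs *slot;

	if (primary && primary->pc == pc && primary->sp == sp)
		return 0;
	if (extra_stacks_count == MAX_EXTRA_STACKS)
		return 0;

	slot = malloc(sizeof(*slot));
	if (!slot)
		return 0;
	slot->pc = pc;
	slot->sp = sp;
	extra_stacks[extra_stacks_count++] = slot;
	assert(extra_stacks_count <= MAX_EXTRA_STACKS);
	return extra_stacks_count;
}
```

The snapshots are allocated with malloc() so that they can outlive the command that found them, and extra_stacks_reset() would run whenever another task is selected.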
[1]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01204.html
v2 -> v1: 1) Rebase this patchset onto gdb-16.2 [2].
2) Improved the silent_call_bt() to catch the error FATAL.
[2]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01354.html
v3 -> v2: 1) Rebase this patchset to crash v9.0.0.
2) Fix v2's segfault in cmd "bt -E".
3) Eliminate repeated stacks by adding constraints before
gdb_add_substack().
v4 -> v3: 1) Fix the compiler warning for silent_call_bt().
2) Add a link to the known issue found on the ppc arch.
Tao Liu (5):
Add multi-threads support in crash target
Call cmd_bt silently after "set pid"
x86_64: Add gdb multi-stack unwind support
arm64: Add gdb multi-stack unwind support
ppc64: Add gdb multi-stack unwind support
arm64.c | 102 +++++++++++++++++++++++++++++++++--
crash_target.c | 49 +++++++++++++++--
defs.h | 3 +-
gdb_interface.c | 6 +--
kernel.c | 44 +++++++++++++++
ppc64.c | 78 ++++++++++++++++++++++++---
task.c | 4 +-
x86_64.c | 138 +++++++++++++++++++++++++++++++++++++++++++++---
8 files changed, 394 insertions(+), 30 deletions(-)
--
2.47.0
1 day, 7 hours
"make target=ARM" fails on Ubuntu24 LTS
by Naveen Kumar Chaudhary
I am trying to build crash-9.0.0 on Ubuntu 24 LTS, but it always fails with
the error below:
configure: error: Building GDB requires GMP 4.2+, and MPFR 3.1.0+.
Try the --with-gmp and/or --with-mpfr options to specify
$ pkg-config --modversion gmp
6.3.0
$ pkg-config --modversion mpfr
4.2.1
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble
The packages are properly installed and the header files are also present:
$ ls /usr/include/x86_64-linux-gnu/gmp.h
/usr/include/x86_64-linux-gnu/gmp.h
$ ls /usr/include/mpfr.h
/usr/include/mpfr.h
I even tried this:
$ CPPFLAGS="-I/usr/include/x86_64-linux-gnu -I/usr/include" \
  LDFLAGS="-L/usr/lib/x86_64-linux-gnu" make target=ARM
but I still get the same error. Can someone please give me any pointers?
Regards,
Naveen
5 days, 7 hours
Re: [PATCH v3 5/5] ppc64: Add gdb multi-stack unwind support
by lijiang
On a ppc64 machine, I noticed that gdb bt may not work as expected, for
example:
crash> set 2
PID: 2
COMMAND: "kthreadd"
TASK: c000000004797f80 [THREAD_INFO: c000000004797f80]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> bt
PID: 2 TASK: c000000004797f80 CPU: 0 COMMAND: "kthreadd"
#0 [c00000000484fbc0] _end at c00000000484fd70 (unreliable)
#1 [c00000000484fd70] __switch_to at c00000000001fabc
#2 [c00000000484fdd0] __schedule at c0000000011ca9dc
#3 [c00000000484feb0] schedule at c0000000011caeb0
#4 [c00000000484ff20] kthreadd at c0000000001af6c4
#5 [c00000000484ffe0] start_kernel_thread at c00000000000ded8
crash> gdb bt
#0 0xc00000000484fd70 in ?? ()
gdb: gdb request failed: bt
crash>
crash> sys|grep RELEASE
RELEASE: 6.12.0
Is that the expected behavior? (I do not remember whether I mentioned a similar
issue in the previous patch review.)
I have no more comments on the other changes, just three issues (see the other
two comments).
Thanks
Lianbo
On Tue, Jun 3, 2025 at 1:18 PM <devel-request(a)lists.crash-utility.osci.io>
wrote:
> Date: Tue, 3 Jun 2025 17:11:38 +1200
> From: Tao Liu <ltao(a)redhat.com>
> Subject: [Crash-utility] [PATCH v3 5/5] ppc64: Add gdb multi-stack
> unwind support
> To: devel(a)lists.crash-utility.osci.io
> Cc: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> Message-ID: <20250603051138.59896-6-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> Co-developed-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> Co-developed-by: Tao Liu <ltao(a)redhat.com>
> Signed-off-by: Tao Liu <ltao(a)redhat.com>
> ---
> ppc64.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 64 insertions(+), 6 deletions(-)
>
> diff --git a/ppc64.c b/ppc64.c
> index 532eb3f..d1a5067 100644
> --- a/ppc64.c
> +++ b/ppc64.c
> @@ -2053,6 +2053,7 @@ ppc64_back_trace_cmd(struct bt_info *bt)
> char buf[BUFSIZE];
> struct gnu_request *req;
> extern void print_stack_text_syms(struct bt_info *, ulong, ulong);
> + extra_stacks_idx = 0;
>
> bt->flags |= BT_EXCEPTION_FRAME;
>
> @@ -2071,6 +2072,29 @@ ppc64_back_trace_cmd(struct bt_info *bt)
> req->pc = bt->instptr;
> req->sp = bt->stkptr;
>
> + if (is_task_active(bt->task)) {
> + if (!extra_stacks_regs[extra_stacks_idx]) {
> + extra_stacks_regs[extra_stacks_idx] =
> + (struct user_regs_bitmap_struct *)
> + malloc(sizeof(struct
> user_regs_bitmap_struct));
> + }
> + memset(extra_stacks_regs[extra_stacks_idx], 0,
> + sizeof(struct user_regs_bitmap_struct));
> + extra_stacks_regs[extra_stacks_idx]->ur.nip = req->pc;
> + extra_stacks_regs[extra_stacks_idx]->ur.gpr[1] = req->sp;
> + SET_BIT(extra_stacks_regs[extra_stacks_idx]->bitmap,
> + REG_SEQ(ppc64_pt_regs, nip));
> + SET_BIT(extra_stacks_regs[extra_stacks_idx]->bitmap,
> + REG_SEQ(ppc64_pt_regs, gpr[0]) + 1);
> + if (!bt->machdep ||
> + (extra_stacks_regs[extra_stacks_idx]->ur.gpr[1] !=
> + ((struct user_regs_bitmap_struct
> *)(bt->machdep))->ur.gpr[1] &&
> + extra_stacks_regs[extra_stacks_idx]->ur.nip !=
> + ((struct user_regs_bitmap_struct
> *)(bt->machdep))->ur.nip)) {
> + gdb_add_substack (extra_stacks_idx++);
> + }
> + }
> +
> if (bt->flags &
> (BT_TEXT_SYMBOLS|BT_TEXT_SYMBOLS_PRINT|BT_TEXT_SYMBOLS_NOPRINT)) {
> if (!INSTACK(req->sp, bt))
> @@ -2512,6 +2536,28 @@ ppc64_print_eframe(char *efrm_str, struct
> ppc64_pt_regs *regs,
> fprintf(fp, " %s [%lx] exception frame:\n", efrm_str, regs->trap);
> ppc64_print_regs(regs);
> ppc64_print_nip_lr(regs, 1);
> +
> + if (!((regs->msr >> MSR_PR_LG) & 0x1) &&
> + !(bt->flags & BT_EFRAME_SEARCH)) {
> + if (!extra_stacks_regs[extra_stacks_idx]) {
> + extra_stacks_regs[extra_stacks_idx] =
> + (struct user_regs_bitmap_struct *)
> + malloc(sizeof(struct
> user_regs_bitmap_struct));
> + }
> + memset(extra_stacks_regs[extra_stacks_idx], 0,
> + sizeof(struct user_regs_bitmap_struct));
> + memcpy(&extra_stacks_regs[extra_stacks_idx]->ur, regs,
> + sizeof(struct ppc64_pt_regs));
> + for (int i = 0; i < sizeof(struct
> ppc64_pt_regs)/sizeof(ulong); i++)
> +
> SET_BIT(extra_stacks_regs[extra_stacks_idx]->bitmap, i);
> + if (!bt->machdep ||
> + (extra_stacks_regs[extra_stacks_idx]->ur.gpr[1] !=
> + ((struct user_regs_bitmap_struct
> *)(bt->machdep))->ur.gpr[1] &&
> + extra_stacks_regs[extra_stacks_idx]->ur.nip !=
> + ((struct user_regs_bitmap_struct
> *)(bt->machdep))->ur.nip)) {
> + gdb_add_substack (extra_stacks_idx++);
> + }
> + }
> }
>
> static int
> @@ -2552,6 +2598,12 @@ ppc64_get_current_task_reg(int regno, const char
> *name, int size,
> tc = CURRENT_CONTEXT();
> if (!tc)
> return FALSE;
> +
> + if (sid && sid <= extra_stacks_idx) {
> + ur_bitmap = extra_stacks_regs[sid - 1];
> + goto get_sub;
> + }
> +
> BZERO(&bt_setup, sizeof(struct bt_info));
> clone_bt_info(&bt_setup, &bt_info, tc);
> fill_stackbuf(&bt_info);
> @@ -2570,39 +2622,45 @@ ppc64_get_current_task_reg(int regno, const char
> *name, int size,
> goto get_all;
> }
>
> +get_sub:
> switch (regno) {
> case PPC64_R0_REGNUM ... PPC64_R31_REGNUM:
> if (!NUM_IN_BITMAP(ur_bitmap->bitmap,
> REG_SEQ(ppc64_pt_regs, gpr[0]) + regno -
> PPC64_R0_REGNUM)) {
> - FREEBUF(ur_bitmap);
> + if (!sid)
> + FREEBUF(ur_bitmap);
> return FALSE;
> }
> break;
> case PPC64_PC_REGNUM:
> if (!NUM_IN_BITMAP(ur_bitmap->bitmap,
> REG_SEQ(ppc64_pt_regs, nip))) {
> - FREEBUF(ur_bitmap);
> + if (!sid)
> + FREEBUF(ur_bitmap);
> return FALSE;
> }
> break;
> case PPC64_MSR_REGNUM:
> if (!NUM_IN_BITMAP(ur_bitmap->bitmap,
> REG_SEQ(ppc64_pt_regs, msr))) {
> - FREEBUF(ur_bitmap);
> + if (!sid)
> + FREEBUF(ur_bitmap);
> return FALSE;
> }
> break;
> case PPC64_LR_REGNUM:
> if (!NUM_IN_BITMAP(ur_bitmap->bitmap,
> REG_SEQ(ppc64_pt_regs, link))) {
> - FREEBUF(ur_bitmap);
> + if (!sid)
> + FREEBUF(ur_bitmap);
> return FALSE;
> }
> break;
> case PPC64_CTR_REGNUM:
> if (!NUM_IN_BITMAP(ur_bitmap->bitmap,
> REG_SEQ(ppc64_pt_regs, ctr))) {
> - FREEBUF(ur_bitmap);
> + if (!sid)
> + FREEBUF(ur_bitmap);
> return FALSE;
> }
> break;
> @@ -2645,7 +2703,7 @@ get_all:
> ret = TRUE;
> break;
> }
> - if (bt_info.need_free) {
> + if (!sid && bt_info.need_free) {
> FREEBUF(ur_bitmap);
> bt_info.need_free = FALSE;
> }
> --
> 2.47.0
>
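The quoted patch records, per saved stack, which registers actually hold valid values. For the active-task case it marks only nip and gpr[1]. A minimal sketch of that bitmap idea follows, with hypothetical names (reg_snapshot, reg_set(), reg_get()) and a reduced register set; it is not the layout used in ppc64.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 32
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical snapshot: 32 general registers plus a validity bitmap. */
struct reg_snapshot {
	unsigned long gpr[NUM_REGS];
	unsigned long valid[1];	/* one bit per register; enough for 32 */
};

/* Store a register value and mark it as valid. */
static void reg_set(struct reg_snapshot *s, size_t n, unsigned long value)
{
	assert(n < NUM_REGS);
	s->gpr[n] = value;
	s->valid[n / BITS_PER_WORD] |= 1UL << (n % BITS_PER_WORD);
}

/* Fetch a register value; returns false if it was never recorded. */
static bool reg_get(const struct reg_snapshot *s, size_t n, unsigned long *out)
{
	if (n >= NUM_REGS ||
	    !(s->valid[n / BITS_PER_WORD] & (1UL << (n % BITS_PER_WORD))))
		return false;
	*out = s->gpr[n];
	return true;
}
```

A consumer such as a debugger back end can then report "register unavailable" for anything reg_get() rejects, instead of handing out a stale zero.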
6 days, 2 hours
Re: [PATCH v3 2/5] Call cmd_bt silently after "set pid"
by lijiang
Thank you for working on this.
On Tue, Jun 3, 2025 at 1:15 PM <devel-request(a)lists.crash-utility.osci.io>
wrote:
> Date: Tue, 3 Jun 2025 17:11:35 +1200
> From: Tao Liu <ltao(a)redhat.com>
> Subject: [Crash-utility] [PATCH v3 2/5] Call cmd_bt silently after
> "set pid"
> To: devel(a)lists.crash-utility.osci.io
> Cc: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> Message-ID: <20250603051138.59896-3-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> The bt command lists the multiple stacks of a task. After "set <pid>"
> switches context to a task, a bt call is first needed to detect those
> stacks. However, we don't want any console output from it, so nullfp is
> used to receive the output. The silent bt call is triggered only once, as
> part of the task context switch done by the set command.
>
> An array of user_regs pointers is reserved for each supported arch. If an
> extra stack is found, a user_regs structure is allocated to store the
> register values of that stack.
>
> Co-developed-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
> Co-developed-by: Tao Liu <ltao(a)redhat.com>
> Signed-off-by: Tao Liu <ltao(a)redhat.com>
> ---
> arm64.c | 4 ++++
> crash_target.c | 7 +++++++
> kernel.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> ppc64.c | 4 ++++
> task.c | 4 ++--
> x86_64.c | 3 +++
> 6 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/arm64.c b/arm64.c
> index 1cdde5f..8291301 100644
> --- a/arm64.c
> +++ b/arm64.c
> @@ -126,6 +126,10 @@ struct user_regs_bitmap_struct {
> ulong bitmap[32];
> };
>
> +#define MAX_EXCEPTION_STACKS 7
> +ulong extra_stacks_idx = 0;
> +struct user_regs_bitmap_struct *extra_stacks_regs[MAX_EXCEPTION_STACKS] =
> {0};
> +
> static inline bool is_mte_kvaddr(ulong addr)
> {
> /* check for ARM64_MTE enabled */
> diff --git a/crash_target.c b/crash_target.c
> index 71998ef..ad1480c 100644
> --- a/crash_target.c
> +++ b/crash_target.c
> @@ -31,6 +31,9 @@ extern "C" int crash_get_current_task_reg (int regno,
> const char *regname,
> extern "C" int gdb_change_thread_context (void);
> extern "C" int gdb_add_substack (int);
> extern "C" void crash_get_current_task_info(unsigned long *pid, char
> **comm);
> +#if defined (X86_64) || defined (ARM64) || defined (PPC64)
> +extern "C" void silent_call_bt(void);
> +#endif
>
> /* The crash target. */
>
> @@ -164,6 +167,10 @@ gdb_change_thread_context (void)
> /* 3rd, refresh regcache for tid 0 */
> target_fetch_registers(get_thread_regcache(inferior_thread()), -1);
> reinit_frame_cache();
> +#if defined (X86_64) || defined (ARM64) || defined (PPC64)
> + /* 4th, invoke bt silently to refresh the additional stacks */
> + silent_call_bt();
> +#endif
> return TRUE;
> }
>
> diff --git a/kernel.c b/kernel.c
> index b8d3b79..b15c32c 100644
> --- a/kernel.c
> +++ b/kernel.c
> @@ -12002,3 +12002,46 @@ int get_linux_banner_from_vmlinux(char *buf,
> size_t size)
>
> return TRUE;
> }
> +
> +#if defined(X86_64) || defined(ARM64) || defined(PPC64)
> +extern ulong extra_stacks_idx;
> +extern void *extra_stacks_regs[];
> +void silent_call_bt(void)
> +{
> + jmp_buf main_loop_env_save;
> + unsigned long long flags_save = pc->flags;
> + FILE *fp_save = fp;
> + FILE *error_fp_save = pc->error_fp;
> + /* Redirect all cmd_bt() outputs into null */
> + fp = pc->nullfp;
> + pc->error_fp = pc->nullfp;
> +
> + for (int i = 0; i < extra_stacks_idx; i++) {
> + /* Note: GETBUF/FREEBUF is not applicable for
> extra_stacks_regs,
> + because we are reserving extra_stacks_regs by cmd_bt()
> + for later use. But GETBUF/FREEBUF is designed for use
> only
> + within one cmd. See process_command_line() ->
> restore_sanity()
> + -> free_all_bufs(). So we use malloc/free instead. */
> + free(extra_stacks_regs[i]);
> + extra_stacks_regs[i] = NULL;
> + }
> + /* Prepare args used by cmd_bt() */
> + sprintf(pc->command_line, "bt\n");
> + argcnt = parse_line(pc->command_line, args);
> + optind = 1;
> + pc->flags |= RUNTIME;
>
Is this necessary? The RUNTIME flag is set only once during
initialization (see main_loop()) and is never cleared.
Again, a warning is observed:
gcc -c -g -DX86_64 -DLZO -DGDB_16_2 kernel.c -I./gdb-16.2/bfd
-I./gdb-16.2/include -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes
-fstack-protector -Wformat-security
kernel.c:12009:6: warning: no previous prototype for ‘silent_call_bt’
[-Wmissing-prototypes]
12009 | void silent_call_bt(void)
| ^~~~~~~~~~~~~~
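For reference, GCC emits this warning under -Wmissing-prototypes when a non-static function is defined without a prototype in scope. A minimal sketch of the usual remedy, using a hypothetical function name; where the declaration should live in crash is not specified here:

```c
/* Declaration visible before the definition. In a larger project this
 * line would normally sit in a header shared with the callers. */
void silent_helper(void);	/* hypothetical name */

static int silent_helper_calls;

/* The definition now has a previous prototype, so -Wmissing-prototypes
 * has nothing to report. */
void silent_helper(void)
{
	silent_helper_calls++;
}
```

Making the function static would also avoid the warning, but that is not an option when it is called from another translation unit.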
Thanks
Lianbo
> +
> + /* Catch error FATAL generated by cmd_bt() if any */
> + memcpy(&main_loop_env_save, &pc->main_loop_env, sizeof(jmp_buf));
> + if (setjmp(pc->main_loop_env)) {
> + goto out;
> + }
> + cmd_bt();
> +out:
> + /* Restore all */
> + memcpy(&pc->main_loop_env, &main_loop_env_save, sizeof(jmp_buf));
> + pc->flags = flags_save;
> + fp = fp_save;
> + pc->error_fp = error_fp_save;
> +}
> +#endif
> diff --git a/ppc64.c b/ppc64.c
> index 7ac12fe..532eb3f 100644
> --- a/ppc64.c
> +++ b/ppc64.c
> @@ -80,6 +80,10 @@ struct user_regs_bitmap_struct {
> ulong bitmap[32];
> };
>
> +#define MAX_EXCEPTION_STACKS 7
> +ulong extra_stacks_idx = 0;
> +struct user_regs_bitmap_struct *extra_stacks_regs[MAX_EXCEPTION_STACKS] =
> {0};
> +
> static int is_opal_context(ulong sp, ulong nip)
> {
> uint64_t opal_start, opal_end;
> diff --git a/task.c b/task.c
> index e07b479..ec04b55 100644
> --- a/task.c
> +++ b/task.c
> @@ -3062,7 +3062,7 @@ sort_context_array(void)
> curtask = CURRENT_TASK();
> qsort((void *)tt->context_array, (size_t)tt->running_tasks,
> sizeof(struct task_context), sort_by_pid);
> - set_context(curtask, NO_PID, TRUE);
> + set_context(curtask, NO_PID, FALSE);
>
> sort_context_by_task();
> }
> @@ -3109,7 +3109,7 @@ sort_context_array_by_last_run(void)
> curtask = CURRENT_TASK();
> qsort((void *)tt->context_array, (size_t)tt->running_tasks,
> sizeof(struct task_context), sort_by_last_run);
> - set_context(curtask, NO_PID, TRUE);
> + set_context(curtask, NO_PID, FALSE);
>
> sort_context_by_task();
> }
> diff --git a/x86_64.c b/x86_64.c
> index a46fb9d..ee23d8b 100644
> --- a/x86_64.c
> +++ b/x86_64.c
> @@ -160,6 +160,9 @@ struct user_regs_bitmap_struct {
> ulong bitmap[32];
> };
>
> +ulong extra_stacks_idx = 0;
> +struct user_regs_bitmap_struct *extra_stacks_regs[MAX_EXCEPTION_STACKS] =
> {0};
> +
> /*
> * Do all necessary machine-specific setup here. This is called several
> * times during initialization.
> --
> 2.47.0
>
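The error-catching part of the quoted silent_call_bt() follows a common setjmp pattern: save the caller's recovery point, install a temporary one, run the command, and restore the original on both the normal and the error path. A standalone sketch with hypothetical names (main_env, run_contained(), failing_command()) and no crash internals:

```c
#include <setjmp.h>
#include <stdbool.h>
#include <string.h>

static jmp_buf main_env;	/* stand-in for a program-wide recovery point */

/* Hypothetical command that reports a fatal error by jumping to main_env. */
static void failing_command(void)
{
	longjmp(main_env, 1);
}

/* Hypothetical command that completes normally. */
static void ok_command(void)
{
}

/* Runs `cmd` and absorbs a longjmp to main_env.
 * Returns true if `cmd` returned normally, false if it bailed out. */
static bool run_contained(void (*cmd)(void))
{
	jmp_buf saved;
	volatile bool completed = false;

	/* Keep the caller's recovery point so it can be restored later. */
	memcpy(&saved, &main_env, sizeof(jmp_buf));
	if (setjmp(main_env) == 0) {
		cmd();
		completed = true;
	}
	/* Reached on both paths: put the original recovery point back. */
	memcpy(&main_env, &saved, sizeof(jmp_buf));
	return completed;
}
```

Restoring main_env unconditionally is the important step; otherwise a later fatal error would jump back into a function that has already returned.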
6 days, 5 hours
[PATCH] Fix "kmem -p" option on Linux 6.16-rc1 and later kernels
by HAGIO KAZUHITO(萩尾 一仁)
Kernel commit acc53a0b4c156 ("mm: rename page->index to
page->__folio_index"), which is contained in Linux 6.16-rc1 and later
kernels, renamed the member. Without the patch, the "kmem -p" option
fails with the following error:
kmem: invalid structure member offset: page_index
FILE: memory.c LINE: 6016 FUNCTION: dump_mem_map_SPARSEMEM()
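The fix amounts to trying the old member name first and falling back to the new one. A generic sketch of that lookup order follows; the table-based member_offset() helper and the offsets in the usage note are made up for illustration and do not reflect crash's MEMBER_OFFSET_INIT machinery.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical debug-info entry: member name and byte offset. */
struct member_info {
	const char *name;
	long offset;
};

/* Returns the offset of `name`, or -1 if there is no such member. */
static long member_offset(const struct member_info *members, size_t n,
			  const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(members[i].name, name) == 0)
			return members[i].offset;
	return -1;
}

/* Tries each candidate name in order and returns the first valid offset,
 * or -1 if none of the names exists. */
static long member_offset_any(const struct member_info *members, size_t n,
			      const char *const *candidates, size_t ncand)
{
	for (size_t i = 0; i < ncand; i++) {
		long off = member_offset(members, n, candidates[i]);

		if (off >= 0)
			return off;
	}
	return -1;
}
```

Called with the candidates "index" and "__folio_index", the same code resolves the member on kernels before and after the rename.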
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
memory.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/memory.c b/memory.c
index 0d8d89862383..5cb8b58e2181 100644
--- a/memory.c
+++ b/memory.c
@@ -531,6 +531,8 @@ vm_init(void)
ASSIGN_OFFSET(page_mapping) = MEMBER_OFFSET("page", "_mapcount") +
STRUCT_SIZE("atomic_t") + sizeof(ulong);
MEMBER_OFFSET_INIT(page_index, "page", "index");
+ if (INVALID_MEMBER(page_index)) /* 6.16 and later */
+ MEMBER_OFFSET_INIT(page_index, "page", "__folio_index");
if (INVALID_MEMBER(page_index))
ANON_MEMBER_OFFSET_INIT(page_index, "page", "index");
MEMBER_OFFSET_INIT(page_buffers, "page", "buffers");
--
2.31.1
1 week
[PATCH] Fix the issue of "page excluded" messages flooding
by Lianbo Jiang
The current issue is observed only on PPC64le machines when loading crash,
e.g.:
...
crash: page excluded: kernel virtual address: c0000000022d6098 type: "gdb_readmem_callback"
crash: page excluded: kernel virtual address: c0000000022d6098 type: "gdb_readmem_callback"
...
crash>
This issue cannot be reproduced on crash 8; it only occurs after the
gdb-16.2 upgrade (see commit dfb2bb55e530).
So far I haven't found out why it reads the same address (an excluded
page) many times. In any case, the crash tool should first avoid flooding
messages, so let's use the same debug level (8) as read_diskdump() does
(see diskdump.c).
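The effect of gating a message on a debug level can be sketched as below. The names debug_level, DEBUG_AT_LEAST() and debug_msg() are hypothetical; the sketch only assumes that the check is a simple "current level >= required level" comparison.

```c
#include <stdarg.h>
#include <stdio.h>

static int debug_level;			/* hypothetical global debug setting */
static unsigned long messages_emitted;	/* counts messages actually printed */

#define DEBUG_AT_LEAST(n) (debug_level >= (n))

/* Prints the message only when the debug level is at least `min_level`. */
static void debug_msg(int min_level, const char *fmt, ...)
{
	va_list ap;

	if (!DEBUG_AT_LEAST(min_level))
		return;
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	messages_emitted++;
}
```

At the default level, repeated reads of the same excluded page print nothing, while a user who raises the level to 8 still sees every message.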
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/memory.c b/memory.c
index 0d8d89862383..58624bb5f44c 100644
--- a/memory.c
+++ b/memory.c
@@ -2504,7 +2504,7 @@ readmem(ulonglong addr, int memtype, void *buffer, long size,
case PAGE_EXCLUDED:
RETURN_ON_PARTIAL_READ();
- if (PRINT_ERROR_MESSAGE)
+ if (CRASHDEBUG(8))
error(INFO, PAGE_EXCLUDED_ERRMSG, memtype_string(memtype, 0), addr, type);
goto readmem_error;
--
2.47.1
1 week