[PATCH 0/5] LoongArch64: Fix register notes and exception unwinding
by Ming Wang
This series fixes several LoongArch64 crash utility issues seen when
reading CPU register state and unwinding active tasks from vmcore files.
The first two patches fix register extraction from NT_PRSTATUS/crash
notes and initialize the pt_regs.regs offset used for active task register
fallback. The remaining patches improve backtracing across LoongArch
exception frames, fix the stack top boundary check for pt_regs, and avoid
a null eframe_search callback when bt -e is requested.
The NT_PRSTATUS register layout follows the LoongArch kernel
elf_gregset_t/struct pt_regs order: 32 GPRs followed by orig_a0, csr_era,
csr_badvaddr, csr_crmd, csr_prmd, csr_euen, csr_ecfg and csr_estat.
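As a rough illustration of that ordering, the layout can be written as a C struct. This is only a sketch: the names la_prstatus_regs and la_note_era are hypothetical, and every slot is assumed to be a 64-bit value; only the field order is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the register order described above
 * (assumes every slot is a 64-bit value). */
struct la_prstatus_regs {
	uint64_t regs[32];      /* 32 general-purpose registers */
	uint64_t orig_a0;
	uint64_t csr_era;
	uint64_t csr_badvaddr;
	uint64_t csr_crmd;
	uint64_t csr_prmd;
	uint64_t csr_euen;
	uint64_t csr_ecfg;
	uint64_t csr_estat;
};

/* Hypothetical helper: fetch csr_era from a raw register payload.
 * memcpy() is used so the buffer does not need to be aligned. */
uint64_t la_note_era(const void *note_regs)
{
	uint64_t era;

	assert(note_regs != NULL);
	memcpy(&era,
	       (const char *)note_regs + offsetof(struct la_prstatus_regs, csr_era),
	       sizeof(era));
	return era;
}
```

With this layout csr_era sits at slot 33 (after 32 GPRs and orig_a0), which is the kind of offset a reader of the note has to get right.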
The exception unwinding changes only treat entry symbols that leave a
saved pt_regs at sp as exception frames.
Tested:
- LoongArch 3C6000 platform with vmcore
- make -C crash -j$(nproc) on x86_64 host
- gcc -DLOONGARCH64 -c -o /tmp/loongarch64-final.o loongarch64.c
- gcc -c -o /tmp/loongarch64-final-generic.o loongarch64.c
- git diff --check origin/master..HEAD
Ming Wang (5):
LoongArch64: Fix CPU registers reading from dump notes
LoongArch64: Fix pt_regs initialization for active tasks
LoongArch64: Support backtracing across exception boundaries
LoongArch64: Fix stack frame loop bounds for exception frames
LoongArch64: Add dummy eframe_search to avoid bt -e segfault
loongarch64.c | 78 +++++++++++++++++++++++++++++++++++++++------------
1 file changed, 60 insertions(+), 18 deletions(-)
--
2.43.0
1 day, 9 hours
[PATCH v2] Fix unwinding with 32k stacks in ppc64le RHEL 9.4
by Lucas Oakley
In RHEL 9.4 ppc64le, the stack size was increased from 16k to
32k. As a result, ppc64_back_trace() can bail prematurely when
checking whether the stack pointer lies within the range of the
irq stacks, since SIZE(irq_ctx), used in ppc64_in_irqstack(), is
still set to 16k. This patch ensures that SIZE(irq_ctx) is updated
to 32k if a 32k stack size is used.
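The effect of the stale size can be sketched in isolation. The helpers below are hypothetical stand-ins, not the crash functions themselves: sp_in_stack() models a plain range check, and effective_irq_ctx_size() models the adjustment described above, where -1 is assumed to mean "size unknown".

```c
#include <assert.h>
#include <stdint.h>

#define KB(x) ((uint64_t)(x) * 1024)

/* Return 1 if sp lies inside [base, base + size), else 0. */
int sp_in_stack(uint64_t base, uint64_t size, uint64_t sp)
{
	assert(base + size >= base);    /* range must not wrap around */
	return sp >= base && sp < base + size;
}

/* Size to use for the irq stack range: never smaller than the task
 * stack size.  A recorded size of -1 is left untouched. */
long effective_irq_ctx_size(long irq_ctx_size, long stack_size)
{
	if (irq_ctx_size != -1 && stack_size > irq_ctx_size)
		return stack_size;
	return irq_ctx_size;
}
```

A stack pointer 20k above the base of a 32k irq stack fails the check when the range is assumed to be 16k and passes once the size is raised to 32k.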
Tested against:
el6 x86_64
el7 ppc64le s390x x86_64
el8 aarch64 ppc64le s390x x86_64
el9 aarch64 ppc64le s390x x86_64
el10 aarch64 ppc64le s390x x86_64
Without the commit:
crash> bt -c 3
PID: 17524 TASK: c0000000b2c0e400 CPU: 3 COMMAND: "xyz"
cannot find the stack info.
With the commit:
crash> bt -c 3
PID: 17524 TASK: c0000000b2c0e400 CPU: 3 COMMAND: "xyz"
#0 [c000001dff7d7c10] smp_call_function_single_async at c00000000028dd38
#1 [c000001dff7d7d30] _raw_spin_lock_irqsave at c000000001023f1c
#2 [c000001dff7d7d60] ibmvscsi_handle_crq at c0080000044635ec [ibmvscsi]
#3 [c000001dff7d7de0] ibmvscsi_task at c008000004463804 [ibmvscsi]
#4 [c000001dff7d7e30] tasklet_action_common.constprop.0 at c0000000001624cc
#5 [c000001dff7d7e90] __do_softirq at c0000000010244cc
#6 [c000001dff7d7f90] do_softirq_own_stack at c000000000016480
#7 [c000000140d67700] __irq_exit_rcu at c0000000001613b8
#8 [c000000140d67730] irq_exit at c000000000162170
#9 [c000000140d67750] do_IRQ at c000000000015fa4
#10 [c000000140d67780] hardware_interrupt_common_virt at c000000000009080
Hardware Interrupt [500] exception frame:
R0: c000000001023de0 R1: c000000140d67a90 R2: c000000002c02500
R3: c00800000b30269c R4: 0000000000000001 R5: 0000000000000001
R6: ffffffffffffffff R7: 0000000000000000 R8: 0000000000000000
R9: fffffffffffe0000 R10: 0000000000000002 R11: 0000000048422824
R12: c000000001023d70 R13: c000001dffffd480 R14: 00007ffd6ef6d238
R15: 0000000000000028 R16: c000001dfc1e2280 R17: c000001dfc1e2280
R18: 00000000ab97fa48 R19: 0000000000000000 R20: 0000000000000001
R21: 0000001df9ff0000 R22: 0000000000000028 R23: c0000000021f2280
R24: 0000000000000000 R25: c0000000021f2280 R26: c0000000021f2380
R27: 0000000000000000 R28: c00800000b30269c R29: 0000000000000000
R30: c000000002c47190 R31: 000000000020000b
NIP: c0000000000aea14 MSR: 800000000280b033 OR3: c0000000000ae944
CTR: c000000001023d70 LR: c000000001023de0 XER: 0000000020040001
CCR: 0000000088422824 MQ: 0000000000000000 DAR: 0000000000000001
DSISR: c0080000073b1e94 Syscall Result: 0000000000000000
[NIP : queued_spin_lock_slowpath+1204]
[LR : _raw_spin_lock+112]
#11 [c000000140d67a90] queued_spin_lock_slowpath at c0000000000aea14
#12 [c000000140d67bb0] _raw_spin_lock at c000000001023de0 (unreliable)
#13 [c000000140d67bd0] dm_blk_close at c00800000b2c6850 [dm_mod]
#14 [c000000140d67c00] blkdev_put_whole at c0000000007d2738
#15 [c000000140d67c30] bdev_release at c0000000007d3a38
#16 [c000000140d67c90] blkdev_release at c0000000007d4224
#17 [c000000140d67cb0] __fput at c0000000005d2e98
#18 [c000000140d67d00] task_work_run at c00000000018fb14
#19 [c000000140d67d50] do_notify_resume at c000000000020bd4
#20 [c000000140d67d80] interrupt_exit_user_prepare_main at c00000000002ed98
#21 [c000000140d67de0] syscall_exit_prepare at c00000000002f240
#22 [c000000140d67e10] system_call_vectored_common at c00000000000bff4
Signed-off-by: Lucas Oakley <soakley(a)redhat.com>
---
task.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/task.c b/task.c
index ec04b55..a398a8d 100644
--- a/task.c
+++ b/task.c
@@ -739,6 +739,18 @@ irqstacks_init(void)
if (!(tt->softirq_tasks = (ulong *)calloc(NR_CPUS, sizeof(ulong))))
error(FATAL, "cannot malloc softirq_tasks space.");
+ /*
+ * RHEL 9.4.z adjusted the stack size from 16k to 32k for
+ * ppc64le only. We need to ensure that SIZE(irq_ctx) is
+ * correctly set so the unwinder doesn't prematurely bail
+ * when switching between the kernel stack and irq stacks.
+ * The stack size is updated in task_init(), which calls
+ * this routine, irqstacks_init() after checking for the
+ * existence of irq_ctx.
+ */
+ if ((SIZE(irq_ctx) != -1) && (STACKSIZE() > SIZE(irq_ctx)))
+ ASSIGN_SIZE(irq_ctx) = STACKSIZE();
+
thread_info_buf = GETBUF(SIZE(irq_ctx));
if ((hard_sp = per_cpu_symbol_search("per_cpu__hardirq_ctx")) ||
--
2.52.0
2 days, 22 hours
[PATCH] crash: symbols: Optimize symbol string matching overhead in numeric_forward()
by Rui Qi
The numeric_forward() function serves as the comparator for qsort() when
sorting kernel symbols. For a modern Linux kernel containing hundreds of
thousands of symbols (N), qsort() performs O(N log N) comparisons,
meaning this function is invoked millions of times during the startup
phase of the crash utility.
During these millions of comparisons, it frequently checks for specific
symbol names (e.g., "_stext", "kaslr_get_random_long") using the STREQ()
macro. STREQ() expands to string_exists() checks followed by a full
strcmp(), so every check pays function call overhead that the compiler
cannot optimize away.
By explicitly checking the first character of the symbol name
(e.g., x->name[0] == '_') before invoking STREQ(), we introduce a
lightweight "early reject" mechanism. Since most symbols do not start
with the character being tested, this short-circuits the evaluation for
the vast majority of symbols and avoids the string_exists() and strcmp()
calls entirely.
Additionally, since x->name could potentially be NULL, we must safely
guard the character access with an explicit non-null check (x->name &&)
to prevent segmentation faults.
Because the saving applies to each of the O(N log N) comparisons, it
yields a measurable improvement in symbol sorting speed, and the benefit
grows with the size of the kernel symbol table.
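The early-reject idea can be shown on its own with a small helper. This is a sketch, not code from the patch: name_matches() is a hypothetical function, and it assumes the target is a non-empty string literal.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compare name against a non-empty target,
 * rejecting on NULL or a first-character mismatch before strcmp(). */
int name_matches(const char *name, const char *target)
{
	assert(target != NULL && target[0] != '\0');
	if (name == NULL || name[0] != target[0])
		return 0;       /* early reject: strcmp() is never called */
	return strcmp(name, target) == 0;
}
```

For example, name_matches("schedule", "_stext") returns 0 after a single character comparison, while name_matches("_etext", "_stext") still falls through to strcmp() because the first characters agree.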
Signed-off-by: Rui Qi <qirui.001(a)bytedance.com>
---
symbols.c | 40 ++++++++++++++++++++--------------------
1 file changed, 20 insertions(+), 20 deletions(-)
diff --git a/symbols.c b/symbols.c
index 8eb8b37abc23..736d9d96d606 100644
--- a/symbols.c
+++ b/symbols.c
@@ -14401,16 +14401,16 @@ numeric_forward(const void *P_x, const void *P_y)
error(FATAL, "bfd_minisymbol_to_symbol failed\n");
if (st->_stext_vmlinux == UNINITIALIZED) {
- if (STREQ(x->name, "_stext"))
+ if (x->name && x->name[0] == '_' && STREQ(x->name, "_stext"))
st->_stext_vmlinux = valueof(x);
- else if (STREQ(y->name, "_stext"))
+ else if (y->name && y->name[0] == '_' && STREQ(y->name, "_stext"))
st->_stext_vmlinux = valueof(y);
}
if (kt->flags2 & KASLR_CHECK) {
- if (STREQ(x->name, "kaslr_get_random_long") ||
- STREQ(y->name, "kaslr_get_random_long") ||
- STREQ(x->name, "module_load_offset") ||
- STREQ(y->name, "module_load_offset")) {
+ if ((x->name && x->name[0] == 'k' && STREQ(x->name, "kaslr_get_random_long")) ||
+ (y->name && y->name[0] == 'k' && STREQ(y->name, "kaslr_get_random_long")) ||
+ (x->name && x->name[0] == 'm' && STREQ(x->name, "module_load_offset")) ||
+ (y->name && y->name[0] == 'm' && STREQ(y->name, "module_load_offset"))) {
kt->flags2 &= ~KASLR_CHECK;
kt->flags2 |= (RELOC_AUTO|KASLR);
}
@@ -14418,36 +14418,36 @@ numeric_forward(const void *P_x, const void *P_y)
if (SADUMP_DUMPFILE() || QEMU_MEM_DUMP_NO_VMCOREINFO() || VMSS_DUMPFILE()) {
/* Need for kaslr_offset and phys_base */
- if (STREQ(x->name, "divide_error") ||
- STREQ(x->name, "asm_exc_divide_error"))
+ if ((x->name && x->name[0] == 'd' && STREQ(x->name, "divide_error")) ||
+ (x->name && x->name[0] == 'a' && STREQ(x->name, "asm_exc_divide_error")))
st->divide_error_vmlinux = valueof(x);
- else if (STREQ(y->name, "divide_error") ||
- STREQ(y->name, "asm_exc_divide_error"))
+ else if ((y->name && y->name[0] == 'd' && STREQ(y->name, "divide_error")) ||
+ (y->name && y->name[0] == 'a' && STREQ(y->name, "asm_exc_divide_error")))
st->divide_error_vmlinux = valueof(y);
- if (STREQ(x->name, "idt_table"))
+ if (x->name && x->name[0] == 'i' && STREQ(x->name, "idt_table"))
st->idt_table_vmlinux = valueof(x);
- else if (STREQ(y->name, "idt_table"))
+ else if (y->name && y->name[0] == 'i' && STREQ(y->name, "idt_table"))
st->idt_table_vmlinux = valueof(y);
- if (STREQ(x->name, "kaiser_init"))
+ if (x->name && x->name[0] == 'k' && STREQ(x->name, "kaiser_init"))
st->kaiser_init_vmlinux = valueof(x);
- else if (STREQ(y->name, "kaiser_init"))
+ else if (y->name && y->name[0] == 'k' && STREQ(y->name, "kaiser_init"))
st->kaiser_init_vmlinux = valueof(y);
- if (STREQ(x->name, "linux_banner"))
+ if (x->name && x->name[0] == 'l' && STREQ(x->name, "linux_banner"))
st->linux_banner_vmlinux = valueof(x);
- else if (STREQ(y->name, "linux_banner"))
+ else if (y->name && y->name[0] == 'l' && STREQ(y->name, "linux_banner"))
st->linux_banner_vmlinux = valueof(y);
- if (STREQ(x->name, "pti_init"))
+ if (x->name && x->name[0] == 'p' && STREQ(x->name, "pti_init"))
st->pti_init_vmlinux = valueof(x);
- else if (STREQ(y->name, "pti_init"))
+ else if (y->name && y->name[0] == 'p' && STREQ(y->name, "pti_init"))
st->pti_init_vmlinux = valueof(y);
- if (STREQ(x->name, "saved_command_line"))
+ if (x->name && x->name[0] == 's' && STREQ(x->name, "saved_command_line"))
st->saved_command_line_vmlinux = valueof(x);
- else if (STREQ(y->name, "saved_command_line"))
+ else if (y->name && y->name[0] == 's' && STREQ(y->name, "saved_command_line"))
st->saved_command_line_vmlinux = valueof(y);
}
--
2.20.1
3 days, 4 hours
vtop returns stale page table entries due to FILL_PUD caching logic
by Anderson Nascimento
Hello,
I have been using the crash tool to teach paging. It is an excellent
tool for simplifying page table walks for students. However, I have
encountered a persistent issue regarding stale data when inspecting
mappings that are populated mid-session.
When using the vtop command on a mapping that is not yet populated,
and then running it again after a memory operation has occurred, vtop
continues to return NULL or stale entries. This happens because the
FILL_PUD (and similar) macros check if the current PUD address matches
the last_pud_read address. If they match, the tool skips the readmem()
call, even if the underlying physical memory has changed.
In the debugging session below, I demonstrate that rd -p showed the
populated PUD entry, but vtop still reported 0. I was able to resolve
this by manually forcing a re-read in GDB by resetting the cache
variable:
(gdb) set machdep->last_pud_read=1
#define IS_LAST_PUD_READ(pud) ((ulong)(pud) == machdep->last_pud_read)
...
#define FILL_PUD(PUD, TYPE, SIZE)                                       \
    if (!IS_LAST_PUD_READ(PUD)) {                                       \
        readmem((ulonglong)((ulong)(PUD)), TYPE, machdep->pud,          \
                SIZE, "pud page", FAULT_ON_ERROR);                      \
        machdep->last_pud_read = (ulong)(PUD);                          \
    }
Steps to Reproduce:
1) Run vtop on an unpopulated user address.
2) Trigger a page fault/memory access in the target process to
populate the entry.
3) Run vtop again; it will still show "(not mapped)" despite the
physical memory being updated.
Is this caching behavior intended for performance, or should a way to
invalidate this cache for live sessions be implemented?
crash> vtop -c 5084 0x41414000
[Detaching after fork from child process 5085]
VIRTUAL PHYSICAL
41414000 (not mapped)
PGD: 6e80000 => 7fc8067
PUD: 7fc8008 => 0
VMA START END FLAGS FILE
ffff88801ec382b8 41414000 41415000 8100073
crash> rd -p 7fc8008
[Detaching after fork from child process 5086]
7fc8008: 0000000000000000 ........
crash> vtop -c 5084 0x41414000
[Detaching after fork from child process 5087]
VIRTUAL PHYSICAL
41414000 (not mapped)
PGD: 6e80000 => 7fc8067
PUD: 7fc8008 => 0
VMA START END FLAGS FILE
ffff88801ec382b8 41414000 41415000 8100073
crash> rd -p 7fc8008
[Detaching after fork from child process 5088]
7fc8008: 000000000e576067 g`W.....
crash>
Thread 1 "crash" received signal SIGINT, Interrupt.
0x00007ffff629d141 in pselect () from /lib64/libc.so.6
=> 0x00007ffff629d141 <pselect+193>: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
(gdb) en 2
(gdb) c
Continuing.
vtop -c 5084 0x41414000
[Detaching after fork from child process 5089]
VIRTUAL PHYSICAL
Thread 1 "crash" hit Breakpoint 2, x86_64_pud_offset (pgd_pte=<optimized out>, vaddr=1094795264, verbose=0, IS_XEN=0) at x86_64.c:1970
1970 FILL_PUD(pud_paddr, PHYSADDR, PAGESIZE());
=> 0x00005555557f5da5 <x86_64_pud_offset+85>: 48 8b 96 40 01 00 00 mov 0x140(%rsi),%rdx
0x00005555557f5dac <x86_64_pud_offset+92>: 48 39 9e 20 01 00 00 cmp %rbx,0x120(%rsi)
0x00005555557f5db3 <x86_64_pud_offset+99>: 74 32 je 0x5555557f5de7 <x86_64_pud_offset+151>
0x00005555557f5db5 <x86_64_pud_offset+101>: 8b 4e 18 mov 0x18(%rsi),%ecx
0x00005555557f5db8 <x86_64_pud_offset+104>: 41 b9 01 00 00 00 mov $0x1,%r9d
0x00005555557f5dbe <x86_64_pud_offset+110>: be 04 00 00 00 mov $0x4,%esi
0x00005555557f5dc3 <x86_64_pud_offset+115>: 48 89 df mov %rbx,%rdi
0x00005555557f5dc6 <x86_64_pud_offset+118>: 4c 8d 05 6c a8 58 00 lea 0x58a86c(%rip),%r8 # 0x555555d80639
0x00005555557f5dcd <x86_64_pud_offset+125>: e8 8e e2 f5 ff callq 0x555555754060 <readmem>
0x00005555557f5dd2 <x86_64_pud_offset+130>: 48 8b 35 87 8e a9 00 mov 0xa98e87(%rip),%rsi # 0x55555628ec60 <machdep>
0x00005555557f5dd9 <x86_64_pud_offset+137>: 48 89 9e 20 01 00 00 mov %rbx,0x120(%rsi)
0x00005555557f5de0 <x86_64_pud_offset+144>: 48 8b 96 40 01 00 00 mov 0x140(%rsi),%rdx
(gdb) set machdep->last_pud_read=1 <- This forces the PUD to be re-read
(gdb) dis
(gdb) c
Continuing.
41414000 f67e000
PGD: 6e80000 => 7fc8067
PUD: 7fc8008 => e576067
PMD: e576050 => aa1c067
PTE: aa1c0a0 => 800000000f67e867
PAGE: f67e000
PTE PHYSICAL FLAGS
800000000f67e867 f67e000 (PRESENT|RW|USER|ACCESSED|DIRTY|NX)
VMA START END FLAGS FILE
ffff88801ec382b8 41414000 41415000 8100073
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea00003d9f80 f67e000 ffff8880142d40c1 41414 1 fffffc0040028 uptodate,lru,swapbacked
crash>
Best regards,
--
Anderson Nascimento
Allele Security Intelligence
https://www.allelesecurity.com
1 week
[PATCH 1/2] Fix "kmem -i" option to display swap usage on Linux 6.18 and later
by HAGIO KAZUHITO(萩尾 一仁)
From: Kazuhito Hagio <k-hagio-ab(a)nec.com>
Kernel commit 8578e0c00dcf ("mm, swap: use the swap table to track the
swap count"), which is contained in Linux 6.18 and later kernels,
removed the swapper_spaces symbol.
As a result, "kmem -i" skips swap usage output because the existing
check only looks for swapper_space/swapper_spaces.
Also check for the swap_info symbol so dump_swap_info() is called on
newer kernels as well.
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
memory.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/memory.c b/memory.c
index 38c6a139e984..15946c58eaf2 100644
--- a/memory.c
+++ b/memory.c
@@ -8871,7 +8871,8 @@ dump_kmeminfo(struct meminfo *mi)
* get swap data from dump_swap_info().
*/
fprintf(fp, "\n");
- if (symbol_exists("swapper_space") || symbol_exists("swapper_spaces")) {
+ if (symbol_exists("swap_info") ||
+ symbol_exists("swapper_space") || symbol_exists("swapper_spaces")) {
if (dump_swap_info(RETURN_ON_ERROR, &totalswap_pages,
&totalused_pages)) {
fprintf(fp, "%13s %7ld %11s ----\n",
--
2.31.1
1 week, 3 days
[PATCH] x86_64: Fix "bt" command to use correct ORC register values on Linux 7.1 and later
by HAGIO KAZUHITO(萩尾 一仁)
From: Kazuhito Hagio <k-hagio-ab(a)nec.com>
Kernel commit 1735858caa4b ("objtool/x86: Reorder ORC register numbering")
changed the ORC register numbering. Without this patch, crash can
misinterpret ORC entries, and the "bt" command may generate broken
backtraces on Linux 7.1 and later kernels, like this:
crash> bt 1
PID: 1 TASK: ffff8ab0009cd100 CPU: 2 COMMAND: "systemd"
#0 [ffffd2218003b9c8] __schedule at ffffffffaa9d862b
#1 [ffffd2218003ba20] schedule at ffffffffaa9d8993
#2 [ffffd2218003ba30] schedule_hrtimeout_range_clock at ffffffffaa9df77b
#3 [ffffd2218003bab0] ep_poll at ffffffffaa1231e4
#4 [ffffd2218003bb50] do_epoll_wait at ffffffffaa123272
#5 [ffffd2218003bb88] __x64_sys_epoll_wait at ffffffffaa123b1f
#6 [ffffd2218003bbd8] do_syscall_64 at ffffffffaa9cca6c
#7 [ffffd2218003bc58] __memcg_slab_free_hook at ffffffffaa079da3
#8 [ffffd2218003bcf0] __memcg_slab_free_hook at ffffffffaa079da3
#9 [ffffd2218003bd50] __x64_sys_gettid at ffffffffa9ce1656
#10 [ffffd2218003bd58] do_syscall_64 at ffffffffaa9ccaa4
#11 [ffffd2218003bdc0] update_cfs_rq_load_avg at ffffffffa9d1bf59
#12 [ffffd2218003be00] __update_blocked_fair at ffffffffa9d214b8
#13 [ffffd2218003be70] sched_clock at ffffffffa9c460dc
#14 [ffffd2218003be78] sched_clock_cpu at ffffffffa9d4aeab
#15 [ffffd2218003be98] irqtime_account_irq at ffffffffa9d3af0d
#16 [ffffd2218003bec0] handle_softirqs at ffffffffa9cce5ac
#17 [ffffd2218003bf40] entry_SYSCALL_64_after_hwframe at ffffffffa9a0012b
Fix this by making ORC_REG_SP and ORC_REG_PREV_SP depend on the kernel
version, as no other way to detect the new numbering was found.
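The version-dependent selection can be sketched as a small standalone function. The names KVER, orc_regs and orc_regs_for are hypothetical; the register numbers are the ones this patch assigns, and the version packing assumes each component fits in 8 bits.

```c
#include <assert.h>

/* Pack a kernel version so versions compare as plain integers
 * (hypothetical macro; each component is assumed to be below 256). */
#define KVER(maj, min, patch) \
	(((unsigned)(maj) << 16) + ((unsigned)(min) << 8) + (unsigned)(patch))

struct orc_regs {
	int reg_sp;
	int reg_prev_sp;
};

/* Choose the ORC register numbers for a given packed kernel version. */
struct orc_regs orc_regs_for(unsigned kver)
{
	struct orc_regs r;

	assert(kver != 0);              /* the version must be known */
	if (kver >= KVER(7, 1, 0)) {    /* reordered numbering */
		r.reg_sp = 3;
		r.reg_prev_sp = 8;
	} else {                        /* original numbering */
		r.reg_sp = 5;
		r.reg_prev_sp = 1;
	}
	return r;
}
```

The drawback of keying on the version is that a kernel which backports the reordering to an older version number would be misdetected.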
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
Hi,
I could not find a way other than checking the kernel version. Does anyone
have a better idea?
defs.h | 6 ++++--
x86_64.c | 11 +++++++++++
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/defs.h b/defs.h
index 6373ee1831af..e4bd77a518aa 100644
--- a/defs.h
+++ b/defs.h
@@ -6649,6 +6649,8 @@ struct ORC_data {
orc_entry orc_entry_data;
int has_signal;
int has_end;
+ int reg_sp;
+ int reg_prev_sp;
};
#define ORC_TYPE_CALL ((machdep->flags & ORC_6_4) ? 2 : 0)
@@ -6659,11 +6661,11 @@ struct ORC_data {
#define UNWIND_HINT_TYPE_RESTORE 4
#define ORC_REG_UNDEFINED 0
-#define ORC_REG_PREV_SP 1
+#define ORC_REG_PREV_SP (machdep->machspec->orc.reg_prev_sp)
#define ORC_REG_DX 2
#define ORC_REG_DI 3
#define ORC_REG_BP 4
-#define ORC_REG_SP 5
+#define ORC_REG_SP (machdep->machspec->orc.reg_sp)
#define ORC_REG_R10 6
#define ORC_REG_R13 7
#define ORC_REG_BP_INDIRECT 8
diff --git a/x86_64.c b/x86_64.c
index b2cddbf8ba3d..ec3c0e87fa06 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -999,6 +999,8 @@ x86_64_dump_machdep_table(ulong arg)
fprintf(fp, " module_ORC: %s\n", ms->orc.module_ORC ? "TRUE" : "FALSE");
fprintf(fp, " has_signal: %s\n", ms->orc.has_signal ? "TRUE" : "FALSE");
fprintf(fp, " has_end: %s\n", ms->orc.has_end ? "TRUE" : "FALSE");
+ fprintf(fp, " reg_sp: %d\n", ms->orc.reg_sp);
+ fprintf(fp, " reg_prev_sp: %d\n", ms->orc.reg_prev_sp);
fprintf(fp, " lookup_num_blocks: %d\n", ms->orc.lookup_num_blocks);
fprintf(fp, " __start_orc_unwind_ip: %lx\n", ms->orc.__start_orc_unwind_ip);
fprintf(fp, " __stop_orc_unwind_ip: %lx\n", ms->orc.__stop_orc_unwind_ip);
@@ -6720,6 +6722,15 @@ x86_64_ORC_init(void)
if (orc->has_signal && !orc->has_end)
machdep->flags |= ORC_6_4;
+ /* See kernel commit 1735858caa4b */
+ if (THIS_KERNEL_VERSION >= LINUX(7,1,0)) {
+ ORC_REG_SP = 3;
+ ORC_REG_PREV_SP = 8;
+ } else {
+ ORC_REG_SP = 5;
+ ORC_REG_PREV_SP = 1;
+ }
+
machdep->flags |= ORC;
}
--
2.31.1
1 week, 3 days
[PATCH] x86_64: Fix "bt" command for noreturn functions
by HAGIO KAZUHITO(萩尾 一仁)
From: Kazuhito Hagio <k-hagio-ab(a)nec.com>
On x86_64, the "bt" command resolves saved return addresses with
value_search(textaddr). However, a return address is the instruction
pointer after the call, not the call site itself.
This becomes a problem when the caller ends with a call to a noreturn
function. In that case, the saved return address can match the start
address of the following symbol, so "bt" loses track of the call chain.
This can also lead to very long session initialization.
The same issue also affects symbol+offset formatting, line number
lookup, and ORC-based frame size resolution.
Fix it by resolving normal backtrace return addresses with textaddr-1,
while keeping exact textaddr handling for real RIP values saved in
exception frames. Add value_to_symstr_trace() so the displayed
symbol+offset still reflects the original return address value.
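A toy symbol table shows why the lookup address matters. This is a sketch with hypothetical names (symtab, lookup, lookup_return_addr) and made-up addresses; it models a nearest-symbol search, not the real value_search().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym {
	unsigned long value;
	const char *name;
};

/* Hypothetical symbol table, sorted by address. */
static const struct sym symtab[] = {
	{ 0x1000, "caller" },       /* ends with a call to a noreturn function */
	{ 0x1040, "next_func" },    /* starts right after that call */
};

/* Nearest symbol at or below addr; *offset receives addr - symbol start. */
const struct sym *lookup(unsigned long addr, unsigned long *offset)
{
	const struct sym *best = NULL;
	size_t i;

	assert(offset != NULL);
	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (symtab[i].value <= addr)
			best = &symtab[i];
	if (best)
		*offset = addr - best->value;
	return best;
}

/* Resolve a saved return address: search at ret-1, then add 1 back
 * so the reported offset still matches the original value. */
const struct sym *lookup_return_addr(unsigned long ret, unsigned long *offset)
{
	const struct sym *sp = lookup(ret - 1, offset);

	if (sp)
		(*offset)++;
	return sp;
}
```

With a saved return address of 0x1040, lookup() reports next_func+0x0, while lookup_return_addr() reports caller+0x40, which is the frame that actually made the call.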
Suggested-by: Kosuke Tatsukawa <tatsu-ab1(a)nec.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
defs.h | 1 +
symbols.c | 24 +++++++++++++++++++++---
x86_64.c | 31 ++++++++++++++++++++++++++-----
3 files changed, 48 insertions(+), 8 deletions(-)
diff --git a/defs.h b/defs.h
index 89044b18cdbe..79969df2a8c2 100644
--- a/defs.h
+++ b/defs.h
@@ -5798,6 +5798,7 @@ struct syment *prev_symbol(char *, struct syment *);
void get_symbol_data(char *, long, void *);
int try_get_symbol_data(char *, long, void *);
char *value_to_symstr(ulong, char *, ulong);
+char *value_to_symstr_trace(ulong, char *, ulong);
char *value_symbol(ulong);
ulong symbol_value(char *);
ulong symbol_value_module(char *, char *);
diff --git a/symbols.c b/symbols.c
index 3c62f54d4a93..372d0b230b18 100644
--- a/symbols.c
+++ b/symbols.c
@@ -104,6 +104,7 @@ static void free_structure(struct struct_elem *);
static unsigned char is_right_brace(const char *);
static struct struct_elem *find_node(struct struct_elem *, char *);
static void dump_node(struct struct_elem *, char *, unsigned char, unsigned char);
+static char *_value_to_symstr(ulong value, char *buf, ulong radix, int trace);
static int module_mem_type(ulong, struct load_module *);
static ulong module_mem_end(ulong, struct load_module *);
@@ -5973,14 +5974,25 @@ generic_machdep_value_to_symbol(ulong value, ulong *offset)
return NULL;
}
+char *
+value_to_symstr(ulong value, char *buf, ulong radix)
+{
+ return _value_to_symstr(value, buf, radix, 0);
+}
+
+char *
+value_to_symstr_trace(ulong value, char *buf, ulong radix)
+{
+ return _value_to_symstr(value, buf, radix, 1);
+}
/*
* For a given value, format a string containing the nearest symbol name
* plus the offset if appropriate. Display the offset in the specified
* radix (10 or 16) -- if it's 0, set it to the current pc->output_radix.
*/
-char *
-value_to_symstr(ulong value, char *buf, ulong radix)
+static char *
+_value_to_symstr(ulong value, char *buf, ulong radix, int trace)
{
struct syment *sp;
ulong offset;
@@ -5996,7 +6008,13 @@ value_to_symstr(ulong value, char *buf, ulong radix)
if ((radix != 10) && (radix != 16))
radix = 16;
- if ((sp = value_search(value, &offset))) {
+ if (trace) {
+ sp = value_search(value-1, &offset);
+ offset++;
+ } else
+ sp = value_search(value, &offset);
+
+ if (sp) {
if (offset)
sprintf(buf, radix == 16 ? "%s+0x%lx" : "%s+%ld",
sp->name, offset);
diff --git a/x86_64.c b/x86_64.c
index b2cddbf8ba3d..ff283ed68191 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -3229,14 +3229,23 @@ x86_64_print_stack_entry(struct bt_info *bt, FILE *ofp, int level,
if (!(bt->flags & BT_SAVE_EFRAME_IP))
bt->eframe_ip = 0;
offset = 0;
- sp = value_search(text, &offset);
+ if (bt->flags & BT_SAVE_EFRAME_IP)
+ sp = value_search(text, &offset);
+ else {
+ sp = value_search(text-1, &offset);
+ offset++;
+ }
if (!sp)
return BACKTRACE_ENTRY_IGNORED;
name = sp->name;
if (offset && (bt->flags & BT_SYMBOL_OFFSET))
- name_plus_offset = value_to_symstr(text, buf2, bt->radix);
+ if (bt->flags & BT_SAVE_EFRAME_IP)
+ name_plus_offset = value_to_symstr(text, buf2, bt->radix);
+ else
+ /* text-1 is used in the function */
+ name_plus_offset = value_to_symstr_trace(text, buf2, bt->radix);
else
name_plus_offset = NULL;
@@ -3337,7 +3346,10 @@ x86_64_print_stack_entry(struct bt_info *bt, FILE *ofp, int level,
fprintf(ofp, "\n");
if (bt->flags & BT_LINE_NUMBERS) {
- get_line_number(text, buf1, FALSE);
+ if (bt->flags & BT_SAVE_EFRAME_IP)
+ get_line_number(text, buf1, FALSE);
+ else
+ get_line_number(text-1, buf1, FALSE);
if (strlen(buf1))
fprintf(ofp, " %s\n", buf1);
}
@@ -3864,8 +3876,10 @@ in_exception_stack:
}
level++;
+ bt->flags |= BT_SAVE_EFRAME_IP;
if ((framesize = x86_64_get_framesize(bt, bt->instptr, rsp, NULL)) >= 0)
rsp += framesize;
+ bt->flags &= ~BT_SAVE_EFRAME_IP;
}
}
@@ -8811,7 +8825,13 @@ x86_64_get_framesize(struct bt_info *bt, ulong textaddr, ulong rsp, char *stack_
return 0;
}
- if (!(sp = value_search(textaddr, &offset))) {
+ if (bt->flags & BT_SAVE_EFRAME_IP)
+ sp = value_search(textaddr, &offset);
+ else {
+ sp = value_search(textaddr-1, &offset);
+ offset++;
+ }
+ if (!sp) {
if (!(bt->flags & BT_FRAMESIZE_DEBUG))
bt->flags |= BT_FRAMESIZE_DISABLE;
return 0;
@@ -8887,7 +8907,8 @@ x86_64_get_framesize(struct bt_info *bt, ulong textaddr, ulong rsp, char *stack_
if ((sp->value >= kt->init_begin) && (sp->value < kt->init_end))
return 0;
- if ((machdep->flags & ORC) && (korc = orc_find(textaddr))) {
+ if ((machdep->flags & ORC) &&
+ (korc = orc_find(bt->flags & BT_SAVE_EFRAME_IP ? textaddr : textaddr-1))) {
if (CRASHDEBUG(1)) {
struct ORC_data *orc = &machdep->machspec->orc;
fprintf(fp,
--
2.31.1
1 week, 3 days