OK, so since this is simply a fix to prevent a SIGSEGV, my alternative
suggestion to have arm64_is_kernel_exception_frame() return FALSE if the
"regs" address assignment is invalid should suffice.
Thanks,
Dave
----- Original Message -----
Hi Dave,
1. Is this a kdump-generated dumpfile?
It's a kdump-generated dumpfile for arm64.
2. Have you looked into why you get the "bt: WARNING: cannot determine
starting stack frame for task ffffffcd74122000" message?
Because the kernel didn't enable the crash_notes symbol to save the active
task's registers.
3. You didn't show the results of your patch -- if you apply it, does the
backtrace get displayed correctly?
With my patch applied, it shows a bad stack frame for sp address
0xffffff800c42ba00:
crash> bt -S ffffff800c42ba00 108
PID: 108 TASK: ffffffcd74122000 CPU: 5 COMMAND: "rtmm_reclaim"
bt: WARNING: cannot determine starting stack frame for task ffffffcd74122000
#0 [ffffff9c29e4fa10] (null) at fffffffffffffffc
4. Since the "bt -S" option is almost never used, would it be possible to
restrict your patch to fix/verify things in the section where it handles the
bt->hp->sp setting?
I added the following change to print the values where it handles the
bt->hp->sp setting:
--- a/arm64.c
+++ b/arm64.c
@@ -2542,7 +2542,8 @@ arm64_back_trace_cmd(struct bt_info *bt)
* x+8: contains stackframe.pc -- text return address
* x+16: is the stackframe.sp address
*/
-
+ fprintf(stderr, "bt:flags=%llx, bptr=%lx, eip=%lx, esp=%lx, stkptr=%lx, instptr=%lx, frameptr=%lx\n",
+ bt->flags, bt->bptr, bt->hp->eip, bt->hp->esp, bt->stkptr, bt->instptr, bt->frameptr);
if (bt->flags & BT_KDUMP_ADJUST) {
if (arm64_on_irq_stack(bt->tc->processor, bt->bptr)) {
arm64_set_irq_stack(bt);
@@ -2572,6 +2573,7 @@ arm64_back_trace_cmd(struct bt_info *bt)
stackframe.fp = bt->frameptr;
}
+ fprintf(stderr, "stackframe:sp=%lx, pc=%lx, fp=%lx\n", stackframe.sp, stackframe.pc, stackframe.fp);
if (bt->flags & BT_TEXT_SYMBOLS) {
arm64_print_text_symbols(bt, &stackframe, ofp);
if (BT_REFERENCE_FOUND(bt)) {
The result is shown below:
crash> bt -S ffffff800c42ba00 108
bt:flags=4000000000000, bptr=0, eip=0, esp=ffffff800c42ba00,
stkptr=ffffff800c42ba00, instptr=0, frameptr=0
stackframe:sp=ffffff800c42ba08, pc=0, fp=ffffff9c29e4fa10
It seems that invalid stackframe.sp and stackframe.pc values are calculated
by GET_STACK_ULONG(bt->hp->esp). I think this must result from an invalid
bt->stackbuf address.
(gdb) p /x *(struct bt_info *) 0x7fffffffd640
$4 = {task = 0xffffffcd74122000, flags = 0x0, instptr = 0x0,
  stkptr = 0xffffff800c42ba00, bptr = 0x0, stackbase = 0xffffff800c428000,
  stacktop = 0xffffff800c42c000, stackbuf = 0x555555f23ae0,
  tc = 0x5555596e1778, hp = 0x7fffffffd5f0, textlist = 0x0, ref = 0x0,
  frameptr = 0x0, call_target = 0x0, machdep = 0x0, debug = 0x0,
  eframe_ip = 0x0, radix = 0x0, cpumask = 0x0}
So this is the reason why stackframe.pc and stackframe.fp have the values
shown above.
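To illustrate the kind of lookup involved, here is a minimal sketch of reading
one word from a local copy of the task's stack. It assumes the read works by
indexing the copy with the offset of the kernel address from stackbase; the
names kaddr_t, struct stack_copy and read_stack_word() are hypothetical and
are not taken from crash's source. Unlike a bare macro, the sketch rejects
addresses outside the range stackbase..stacktop instead of indexing past the
buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t kaddr_t;            /* hypothetical: a kernel virtual address */

/* Hypothetical stand-in for the stack-related fields of struct bt_info. */
struct stack_copy {
    kaddr_t stackbase;               /* lowest kernel address of the stack */
    kaddr_t stacktop;                /* one past the highest stack address */
    const unsigned char *stackbuf;   /* local copy of the stack contents */
};

/*
 * Read the 64-bit word stored at kernel address addr from the local copy.
 * Returns 0 on success, or -1 if the word does not lie fully inside the stack.
 */
static int read_stack_word(const struct stack_copy *sc, kaddr_t addr,
                           uint64_t *out)
{
    assert(sc != NULL && out != NULL);

    if (addr < sc->stackbase || addr >= sc->stacktop)
        return -1;
    if (sc->stacktop - addr < sizeof(*out))
        return -1;

    /* The offset into the copy is the distance from the stack base. */
    memcpy(out, sc->stackbuf + (addr - sc->stackbase), sizeof(*out));
    return 0;
}
```

With the values from the gdb output above, esp=ffffff800c42ba00 maps to offset
0x3a00 in a 0x4000-byte stack and can be read, whereas the fp value
ffffff9c29e4fa10 lies outside the stack and would be rejected.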
Best regards,
Qiwu
-----Original Message-----
From: Dave Anderson <anderson(a)redhat.com>
Sent: Monday, November 4, 2019 11:39 PM
To: Discussion list for crash utility usage, maintenance and development
<crash-utility(a)redhat.com>
Cc: 陈启武 <chenqiwu(a)xiaomi.com>
Subject: [External Mail]Re: [Crash-utility] [PATCH] Fix a potential segfault
for the ARM64 "bt -S <stack-address>" command
----- Original Message -----
> > The stackframe.fp (0xffffff9c29e4f8e0) is larger than the stacktop
> > address, which leads to a segmentation violation generated by accessing
> > regs->sp:
> > (gdb) p /x 18446743644915693792//stkptr
> > $5 = 0xffffff9c29e4f8e0
> > (gdb) p /x 0xffffff9c29e4f8e0-0xffffff800c428000//STACK_OFFSET_TYPE(stkptr)
> > $6 = 0x1c1da278e0
> > (gdb) p /x regs
> > $7 = 0x55717394b3c0
> > (gdb) p *(struct arm64_pt_regs *) 0x55717394b3c0
> > Cannot access memory at address 0x55717394b3c0
> >
> > To fix this, I think a condition
> > "arm64_in_exception_text(stackframe.pc) && INSTACK(stackframe.fp, bt)"
> > must be added to avoid an invalid exception frame before transitioning
> > to the process stack.
Or alternatively, would it be better to have
arm64_is_kernel_exception_frame() verify that the "regs" assignment is
legitimate, and if not, just return FALSE?
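
As a rough sketch of what such a verification could look like (not the actual
crash code): check that the candidate address lies inside the task's stack and
that a whole register frame fits between it and the stack top, and report
failure otherwise. The names kaddr_t, struct pt_regs_sketch, struct
stack_bounds and regs_address_is_valid() are hypothetical, and the register
layout is only an assumed approximation used to get a frame size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t kaddr_t;        /* hypothetical: a kernel virtual address */

/* Assumed approximation of an arm64 register frame, used only for its size. */
struct pt_regs_sketch {
    uint64_t regs[31];
    uint64_t sp;
    uint64_t pc;
    uint64_t pstate;
};

/* Hypothetical stand-in for the stackbase/stacktop fields of struct bt_info. */
struct stack_bounds {
    kaddr_t stackbase;           /* lowest kernel address of the stack */
    kaddr_t stacktop;            /* one past the highest stack address */
};

/*
 * Return 1 if a complete register frame starting at stkptr lies inside the
 * stack, and 0 otherwise, so the caller can bail out before dereferencing.
 */
static int regs_address_is_valid(const struct stack_bounds *sb, kaddr_t stkptr)
{
    assert(sb != NULL);

    if (stkptr < sb->stackbase || stkptr >= sb->stacktop)
        return 0;
    if (sb->stacktop - stkptr < sizeof(struct pt_regs_sketch))
        return 0;
    return 1;
}
```

With stackbase 0xffffff800c428000 and stacktop 0xffffff800c42c000, the address
0xffffff9c29e4f8e0 from the gdb session above fails the first test, which is
the case where the function would return FALSE instead of touching "regs".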
Dave