The same issue also exists in x86_64 arch. However I cannot dive into
To illustrate the x86_64 issue:
crash> dis __crash_kexec
...
0xffffffff93c1c7d6 <__crash_kexec+86>: call 0xffffffff93c1c6f0
<crash_setup_regs>
0xffffffff93c1c7db <__crash_kexec+91>: call 0xffffffff93c1b120
<crash_save_vmcoreinfo>
0xffffffff93c1c7e0 <__crash_kexec+96>: mov %rsp,%rdi
0xffffffff93c1c7e3 <__crash_kexec+99>: call 0xffffffff93a72580
<machine_crash_shutdown>
0xffffffff93c1c7e8 <__crash_kexec+104>: mov 0x2c317e9(%rip),%rdi
# 0xffffffff9684dfd8 <kexec_crash_image>
0xffffffff93c1c7ef <__crash_kexec+111>: call 0xffffffff93a80e70
<machine_kexec>
0xffffffff93c1c7f4 <__crash_kexec+116>: movl $0x0,0x2c317ee(%rip)
# 0xffffffff9684dfec <__kexec_lock>
...
crash> gdb bt
#0 0xffffffff93c1c764 in crash_setup_regs
(newregs=0xffffc33000ad7ad8, oldregs=0x0)
...
crash> struct -x pt_regs 0xffffc33000ad7ad8
sp = 0xffffc33000ad7ad0,
ip = 0xffffffff93c1c764,
crash> dis crash_setup_regs
...
0xffffffff93c1c71e <crash_setup_regs+46>: mov %rsp,0x98(%rdi)
...
0xffffffff93c1c75d <crash_setup_regs+109>: lea 0x0(%rip),%rax
# 0xffffffff93c1c764 <crash_setup_regs+116>
0xffffffff93c1c764 <crash_setup_regs+116>: mov %rax,0x80(%rdi)
0xffffffff93c1c76b <crash_setup_regs+123>: ret
...
The newregs 0xffffc33000ad7ad8, is the pt_regs which all regs are
saved within crash_setup_regs(), so ip = 0xffffffff93c1c764 is
expected, because it is the exact address after lea instruction. In
the meantime, sp is supposed to point to the address of returning
address of crash_setup_regs, aka 0xffffffff93c1c7db, right after
instruction "call crash_setup_regs". However if we inspect sp
0xffffc33000ad7ad0, we will see:
crash> rd 0xffffc33000ad7ad0
ffffc33000ad7ad0: ffffffff93c1c7f4 ........
crash> sym ffffffff93c1c7f4
ffffffff93c1c7f4 (T) __crash_kexec+116
The ffffffff93c1c7f4 is not the address after "call crash_setup_regs",
but "call machine_kexec". So it is also unexpected. The only reason
for cmd bt can still work on x86_64 arch, is that there is no stack
grow down in crash_setup_regs(). In contrast with ppc64, the stack
grows down by "<crash_setup_regs+20>: stdu r1,-32(r1)". So
there is no modification on this stack frame. But it is still a
problem that crash_setup_regs() is not inlined.
Thanks,
Tao Liu
gcc code why newer versions will make the inline function like this.
Or we can modify crash_setup_regs() to be the following:
static inline void __attribute__((always_inline)) crash_setup_regs(...)
Any thoughts?
Thanks,
Tao Liu
>
> Anyways this patch is also good, doesn't seem problematic for the
> previous working cases.
>
>
> Thanks,
>
> Aditya Gupta
>
> > This patch will search and check for the valid newsp at the beginning of
> > stack back trace. For the above example, a valid newsp would be assigned
> > as c00000000a327850.
> >
> > Before:
> > crash> bt
> > ...
> > #0 [c00000000a327660] crash_setup_regs at c0000000002b9984
> > After:
> > crash> bt
> > ...
> > #0 [c00000000a327660] crash_setup_regs at c0000000002b9984
> > #1 [c00000000a327850] panic at c000000000167de4
> > #2 [c00000000a3278f0] sysrq_handle_crash at c000000000a4d3a8
> > #3 [c00000000a327950] __handle_sysrq at c000000000a4dec4
> > #4 [c00000000a3279f0] write_sysrq_trigger at c000000000a4e984
> > ...
> > Signed-off-by: Tao Liu <ltao(a)redhat.com>
> > ---
> > ppc64.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/ppc64.c b/ppc64.c
> > index 6e5f155..8af7b8a 100644
> > --- a/ppc64.c
> > +++ b/ppc64.c
> > @@ -2148,6 +2148,19 @@ ppc64_back_trace(struct gnu_request *req, struct bt_info
*bt)
> >
> > while (INSTACK(req->sp, bt)) {
> > newsp = *(ulong *)&bt->stackbuf[req->sp -
bt->stackbase];
> > +
> > + if (!newpc) {
> > + ulong *up;
> > + int i;
> > + for (i = 0, up = (ulong *)&bt->stackbuf[req->sp
- bt->stackbase];
> > + i < (req->sp - bt->stackbase) /
sizeof(ulong); i++, up++) {
> > + if (INSTACK(*up, bt) &&
is_kernel_text(*(up + 2))) {
> > + newsp = *up;
> > + break;
> > + }
> > + }
> > + }
> > +
> > if ((req->name = closest_symbol(req->pc)) == NULL) {
> > if (CRASHDEBUG(1)) {
> > error(FATAL,
>