On Fri, Feb 3, 2023 at 9:36 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>
wrote:
On 2023/02/02 19:53, lijiang wrote:
>> Thank you for the comment, Kazu.
>>
>> On Thu, Feb 2, 2023 at 5:16 PM HAGIO KAZUHITO(萩尾 一仁) <
k-hagio-ab(a)nec.com <mailto:k-hagio-ab@nec.com>> wrote:
>>
>> Hi Lianbo,
>>
>> thanks for the patch.
>>
>> On 2023/02/02 16:01, Lianbo Jiang wrote:
>> > Currently, the "bt" command may print a bogus exception
frame
>> > and the rest frame will be truncated on x86 64 when using the
>> > "virsh send-key <kvm guest> KEY_LEFTALT KEY_SYSRQ
KEY_C" command
>> > to trigger a panic from the KVM host. For example:
>> >
>> > crash> bt
>> > PID: 0 TASK: ffff9e7a47e32f00 CPU: 3 COMMAND:
"swapper/3"
>> > #0 [ffffba7900118bb8] machine_kexec at ffffffff87e5c2c7
>> > #1 [ffffba7900118c08] __crash_kexec at ffffffff87f9500d
>> > #2 [ffffba7900118cd0] panic at ffffffff87edfff9
>> > #3 [ffffba7900118d50] sysrq_handle_crash at ffffffff883ce2c1
>> > #4 [ffffba7900118d58] __handle_sysrq.cold.15 at
ffffffff883ceb56
>> > #5 [ffffba7900118d88] sysrq_filter at ffffffff883ce9a2
>> > #6 [ffffba7900118dc0] input_to_handler at ffffffff884fb0bf
>> > #7 [ffffba7900118df8] input_pass_values at ffffffff884fc1b7
>> > #8 [ffffba7900118e20] input_handle_event at ffffffff884fe278
>> > #9 [ffffba7900118e50] input_event at ffffffff884fe74b
>> > #10 [ffffba7900118e88] atkbd_interrupt at ffffffff88504e2f
>> > #11 [ffffba7900118ee0] serio_interrupt at ffffffff884f7516
>> > #12 [ffffba7900118f10] i8042_interrupt at ffffffff884f8b04
>> > #13 [ffffba7900118f50] __handle_irq_event_percpu at
ffffffff87f51430
>> > #14 [ffffba7900118f90] handle_irq_event_percpu at
ffffffff87f51590
>> > #15 [ffffba7900118fb8] handle_irq_event at ffffffff87f51616
>> > #16 [ffffba7900118fd8] handle_edge_irq at ffffffff87f559f2
>> > #17 [ffffba7900118ff0] asm_call_on_stack at ffffffff88800fa2
>> > --- <IRQ stack> ---
>> > #18 [ffffba790008bda0] asm_call_on_stack at ffffffff88800fa2
>> > RIP: ffffffffffffffff RSP: 0000000000000124 RFLAGS:
00000003
>> > RAX: 0000000000000000 RBX: 0000000000000001 RCX:
0000000000000000
>> > RDX: ffffffff88800c1e RSI: 0000000000000000 RDI:
0000000000000000
>> > RBP: 0000000000000001 R8: 0000000000000000 R9:
0000000000000000
>> > R10: 0000000000000000 R11: ffffffff88760555 R12:
ffffba790008be08
>> > R13: ffffffff87f18002 R14: ffff9e7a47e32f00 R15:
ffff9e7bb6198e00
>> > ORIG_RAX: 0000000000000000 CS: 0003 SS: 0000
>> > bt: WARNING: possibly bogus exception frame
>> > crash>
>>
>> Could I have the bt result with the patch?
>>
>> Sure. I tested it on upstream kernel with the patch:
>>
>> crash> bt
>> PID: 0 TASK: ffff9e7a47e32f00 CPU: 3 COMMAND: "swapper/3"
>> #0 [ffffba7900118bb8] machine_kexec at ffffffff87e5c2c7
>> #1 [ffffba7900118c08] __crash_kexec at ffffffff87f9500d
>> #2 [ffffba7900118cd0] panic at ffffffff87edfff9
>> #3 [ffffba7900118d50] sysrq_handle_crash at ffffffff883ce2c1
>> #4 [ffffba7900118d58] __handle_sysrq.cold.15 at ffffffff883ceb56
>> #5 [ffffba7900118d88] sysrq_filter at ffffffff883ce9a2
>> #6 [ffffba7900118dc0] input_to_handler at ffffffff884fb0bf
>> #7 [ffffba7900118df8] input_pass_values at ffffffff884fc1b7
>> #8 [ffffba7900118e20] input_handle_event at ffffffff884fe278
>> #9 [ffffba7900118e50] input_event at ffffffff884fe74b
>> #10 [ffffba7900118e88] atkbd_interrupt at ffffffff88504e2f
>> #11 [ffffba7900118ee0] serio_interrupt at ffffffff884f7516
>> #12 [ffffba7900118f10] i8042_interrupt at ffffffff884f8b04
>> #13 [ffffba7900118f50] __handle_irq_event_percpu at ffffffff87f51430
>> #14 [ffffba7900118f90] handle_irq_event_percpu at ffffffff87f51590
>> #15 [ffffba7900118fb8] handle_irq_event at ffffffff87f51616
>> #16 [ffffba7900118fd8] handle_edge_irq at ffffffff87f559f2
>> #17 [ffffba7900118ff0] asm_call_on_stack at ffffffff88800fa2
>> --- <IRQ stack> ---
>> #18 [ffffba790008bd50] asm_call_on_stack at ffffffff88800fa2
>> [exception RIP: common_interrupt+181]
>> RIP: ffffffff88760555 RSP: 0000000000000000 RFLAGS: 00000000
>> RAX: ffff9e7bb6198e00 RBX: ffff9e7bb6198e00 RCX: ffff9e7a47e32f00
>> RDX: ffffffff87f18002 RSI: ffffba790008be08 RDI: 0000000000000001
>> RBP: 0000000d25808fab R8: ffffffff87f17d1c R9: ffffffff87e29ec5
>> R10: ffffffff87e63dcd R11: ffff9e7a47e32f00 R12: ffffffff8808ea76
>> R13: ffffffff8808e1d2 R14: 0000000000000046 R15: ffffffff87f762f1
>> ORIG_RAX: 0000000000000001 CS: 0000 SS: 0000
>> #19 [ffffba790008be00] asm_common_interrupt at ffffffff88800c1e
>> #20 [ffffba790008be88] default_idle at ffffffff8876ffc5
>> #21 [ffffba790008bed8] do_idle at ffffffff87f1982d
>> #22 [ffffba790008bf20] cpu_startup_entry at ffffffff87f19a29
>> #23 [ffffba790008bf30] start_secondary at ffffffff87e5118f
>> #24 [ffffba790008bf50] secondary_startup_64 at ffffffff87e000e6
Thanks. Hmm, I think it's strange that RIP is in commont_interrupt and
RSP is zero. I guess RIP should be in default_idle.
Thank you for pointing out this issue. The RIP is in the default_idle
according to the kernel log output.
I tried to hardcode and get the expected eframe information, but the
backtrace is incorrect:
crash> bt
PID: 0 TASK: ffff9e7a47e32f00 CPU: 3 COMMAND: "swapper/3"
#0 [ffffba7900118bb8] machine_kexec at ffffffff87e5c2c7
#1 [ffffba7900118c08] __crash_kexec at ffffffff87f9500d
#2 [ffffba7900118cd0] panic at ffffffff87edfff9
#3 [ffffba7900118d50] sysrq_handle_crash at ffffffff883ce2c1
#4 [ffffba7900118d58] __handle_sysrq.cold.15 at ffffffff883ceb56
#5 [ffffba7900118d88] sysrq_filter at ffffffff883ce9a2
#6 [ffffba7900118dc0] input_to_handler at ffffffff884fb0bf
#7 [ffffba7900118df8] input_pass_values at ffffffff884fc1b7
#8 [ffffba7900118e20] input_handle_event at ffffffff884fe278
#9 [ffffba7900118e50] input_event at ffffffff884fe74b
#10 [ffffba7900118e88] atkbd_interrupt at ffffffff88504e2f
#11 [ffffba7900118ee0] serio_interrupt at ffffffff884f7516
#12 [ffffba7900118f10] i8042_interrupt at ffffffff884f8b04
#13 [ffffba7900118f50] __handle_irq_event_percpu at ffffffff87f51430
#14 [ffffba7900118f90] handle_irq_event_percpu at ffffffff87f51590
#15 [ffffba7900118fb8] handle_irq_event at ffffffff87f51616
#16 [ffffba7900118fd8] handle_edge_irq at ffffffff87f559f2
#17 [ffffba7900118ff0] asm_call_on_stack at ffffffff88800fa2
--- <IRQ stack> ---
#18 [ffffba790008be08] asm_call_on_stack at ffffffff88800fa2
[exception RIP: default_idle+37]
RIP: ffffffff8876ffc5 RSP: ffffba790008beb8 RFLAGS: 00000246
RAX: ffffffff8876ffa0 RBX: 0000000000000003 RCX: 0000000000000001
RDX: 000000000001a6de RSI: 0000000000000087 RDI: 0000000000000003
RBP: 0000000000000003 R8: 000000146ccbc1f0 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000124 R12: ffffffffffffffff
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#19 [ffffba790008bed8] do_idle at ffffffff87f1982d
#20 [ffffba790008bf20] cpu_startup_entry at ffffffff87f19a29
#21 [ffffba790008bf30] start_secondary at ffffffff87e5118f
#22 [ffffba790008bf50] secondary_startup_64 at ffffffff87e000e6
crash>
Let me debug it more and to see what happened.
What is printed as the panic message in "log" command?
The kernel panic message:
[ 57.542538] Kernel panic - not syncing: sysrq triggered crash
[ 57.545512] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted
5.7.0+ #7
[ 57.549160] Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
[ 57.552655] Call Trace:
[ 57.553994] <IRQ>
[ 57.555143] dump_stack+0x57/0x70
[ 57.556930] panic+0xfb/0x2d7
[ 57.558516] ? printk+0x58/0x6f
[ 57.560194] sysrq_handle_crash+0x11/0x20
[ 57.562371] __handle_sysrq.cold.15+0x48/0x102
[ 57.564663] sysrq_filter+0x312/0x3c0
[ 57.566579] input_to_handler+0x4f/0xf0
[ 57.568583] input_pass_values.part.7+0x117/0x130
[ 57.571188] input_handle_event+0x138/0x5c0
[ 57.573380] input_event+0x4b/0x70
[ 57.575173] atkbd_interrupt+0x61f/0x6a0
[ 57.577228] serio_interrupt+0x46/0x90
[ 57.579192] i8042_interrupt+0x154/0x240
[ 57.581328] __handle_irq_event_percpu+0x40/0x170
[ 57.583788] handle_irq_event_percpu+0x30/0x80
[ 57.586081] handle_irq_event+0x36/0x53
[ 57.588071] handle_edge_irq+0x82/0x190
[ 57.590096] asm_call_on_stack+0x12/0x20
[ 57.592142] </IRQ>
[ 57.593264] common_interrupt+0xb5/0x140
[ 57.595311] asm_common_interrupt+0x1e/0x40
[ 57.597474] RIP: 0010:default_idle+0x25/0x150
[ 57.599735] Code: cc cc cc cc cc 66 66 66 66 90 41 55 41 54 55 65 8b 2d
af 23 8a 77 53 66 66 66 66 90 e9 07 00 00 00 0f 00 2d 9f 37 49 00 fb f4
<65> 8b 2d 94 23 8a 77 66 66 66 66 90 5b 5d 41 5c 41 5d c3 65 8b 05
[ 57.609196] RSP: 0018:ffffba790008beb8 EFLAGS: 00000246
[ 57.611875] RAX: ffffffff8876ffa0 RBX: 0000000000000003 RCX:
0000000000000001
[ 57.615506] RDX: 000000000001a6de RSI: 0000000000000087 RDI:
0000000000000003
[ 57.619142] RBP: 0000000000000003 R08: 000000146ccbc1f0 R09:
0000000000000000
[ 57.622770] R10: 0000000000000000 R11: 0000000000000124 R12:
ffffffffffffffff
[ 57.626386] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[ 57.630029] ? __cpuidle_text_start+0x8/0x8
[ 57.632209] do_idle+0x1bd/0x240
[ 57.633914] cpu_startup_entry+0x19/0x20
[ 57.635966] start_secondary+0x15f/0x1b0
[ 57.638021] secondary_startup_64+0xb6/0xc0
Thanks.
Lianbo
> crash>
>>
>> I've still not learned the idtentry and etc. enough, but at least,
>> a wrong exception frame is still shown at my end. This is a RHEL9.1
>> vmcore that reproduces the bogus exception frame error.
>>
>> crash> bt
>> PID: 0 TASK: ffffffff8881a940 CPU: 0 COMMAND:
"swapper/0"
>> #0 [ffffa5ab40003bb0] machine_kexec at ffffffff86e6973d
>> #1 [ffffa5ab40003c00] __crash_kexec at ffffffff86fbe29d
>> #2 [ffffa5ab40003cc8] panic at ffffffff8788cb6e
>> ...
>> #16 [ffffa5ab40003fc8] __common_interrupt at ffffffff86e2666e
>> #17 [ffffa5ab40003ff0] common_interrupt at ffffffff878d83ee
>> --- <IRQ stack> ---
>> #18 [ffffffff88803d80] common_interrupt at ffffffff878d83ee
>> [exception RIP: unknown or invalid address] <<<--
invalid
>> RIP: 0000000000000000 RSP: 000000007d3187b0 RFLAGS:
ffffffff87a00ce2
>> RAX: ffffffff86e71564 RBX: 0000000000000c56 RCX:
ffffffff878da876
>> RDX: 0000000000000000 RSI: ffffffff878dab3e RDI:
ffffffff878d8384
>> RBP: 0000000000000000 R8: 0000000000000000 R9:
000000134ab46fed
>> R10: ffff9143bbc2afc0 R11: ffff9143b9f2d540 R12:
0000000000000000
>> R13: ffffffff86f390ae R14: ffff9143b9f2d540 R15:
ffff9143b9f2d540
>> ORIG_RAX: 0000000000000000 CS: 0000 SS: 7d3187a8
>> #19 [ffffffff88803e98] default_idle at ffffffff878e8570
>> #20 [ffffffff88803ec0] default_idle_call at ffffffff878e86d3
>> #21 [ffffffff88803ec8] cpuidle_idle_call at ffffffff86f4ac3d
>> #22 [ffffffff88803f00] do_idle at ffffffff86f4ad1b
>> ...
>>
>> crash> dis default_idle
>> 0xffffffff878e8560 <default_idle>: nopl 0x0(%rax,%rax,1)
[FTRACE NOP]
>> 0xffffffff878e8565 <default_idle+5>: xchg %ax,%ax
>> 0xffffffff878e8567 <default_idle+7>: verw 0x51b2c2(%rip)
# 0xffffffff87e03830 <ds.0>
>> 0xffffffff878e856e <default_idle+14>: sti
>> 0xffffffff878e856f <default_idle+15>: hlt
>> 0xffffffff878e8570 <default_idle+16>: ret
>>
>> I guess "0xffffffff878e8570 <default_idle+16>" should be
the
exception
>> RIP here..
I also should have check the panic message...
[ 82.842430] common_interrupt+0x7e/0xa0
[ 82.842741] </IRQ>
[ 82.842927] asm_common_interrupt+0x22/0x40
[ 82.843166] RIP: 0010:default_idle+0x10/0x20
[ 82.843616] Code: cc 0f ae f0 0f ae 38 0f ae f0 eb b5 66 66 2e 0f 1f 84
00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 66 90 0f 00 2d c2 b2 51 00 fb f4
<c3> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65
[ 82.844714] RSP: 0018:ffffffff88803ec0 EFLAGS: 00000242
[ 82.845002] RAX: ffffffff878e8560 RBX: ffffffff8881a940 RCX:
0000000000000001
[ 82.845396] RDX: 000000000000f92b RSI: 0000000000000083 RDI:
000000000000f92c
[ 82.845853] RBP: 0000000000000000 R08: 00000014e57bc624 R09:
0000000000000001
[ 82.846263] R10: 0000000000000400 R11: 0000000000000000 R12:
0000000000000000
[ 82.846733] R13: 000000007da66a18 R14: 000000007d3187a8 R15:
000000007d3187b0
[ 82.847143] ? mwait_idle+0x80/0x80
[ 82.847349] default_idle_call+0x33/0xe0
[ 82.847640] cpuidle_idle_call+0x15d/0x1c0
In this case, RIP is "0xffffffff878e8570 <default_idle+16>" and
at ffffffff88803e98. This will be pt_regs.ip.
crash> bt -r
PID: 0 TASK: ffffffff8881a940 CPU: 0 COMMAND: "swapper/0"
...
ffffffff88803e70: 0000000000000001 000000000000f92b
ffffffff88803e80: 0000000000000083 000000000000f92c
ffffffff88803e90: ffffffffffffffff default_idle+16 <<<
ffffffff88803ea0: 0000000000000010 0000000000000242
ffffffff88803eb0: init_thread_union+16064 0000000000000018
...
crash> struct -o pt_regs.ip
struct pt_regs {
[128] unsigned long ip;
}
crash> px 0xffffffff88803e98-128
$6 = 0xffffffff88803e18
crash> struct -x pt_regs 0xffffffff88803e18
struct pt_regs {
r15 = 0x7d3187b0,
r14 = 0x7d3187a8,
r13 = 0x7da66a18,
r12 = 0x0,
bp = 0x0,
bx = 0xffffffff8881a940,
r11 = 0x0,
r10 = 0x400,
r9 = 0x1,
r8 = 0x14e57bc624,
ax = 0xffffffff878e8560,
cx = 0x1,
dx = 0xf92b,
si = 0x83,
di = 0xf92c,
orig_ax = 0xffffffffffffffff,
ip = 0xffffffff878e8570,
cs = 0x10,
flags = 0x242,
sp = 0xffffffff88803ec0,
ss = 0x18
}
This looks same as the saved regs in the panic message.
>
>>
>> The above address of RIP is zero. Maybe this is a specific RHEL9 vmcore
issue, or it
>> is related to some kernel specific config? I will dig into the details
later.
>>
>> >
>> > Kernel commit fa5e5c409213 ("x86/entry: Use idtentry for
interrupts")
>> > change causes the current issue, crash needs to adjust the value
of
>> > irq eframe.
>> >
>> > Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com <mailto:
lijiang(a)redhat.com>>
>> > ---
>> > x86_64.c | 15 +++++++++++----
>> > 1 file changed, 11 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/x86_64.c b/x86_64.c
>> > index 7a5d6f050c89..6504d16fb17e 100644
>> > --- a/x86_64.c
>> > +++ b/x86_64.c
>> > @@ -4647,10 +4647,15 @@ x86_64_exception_frame(ulong flags, ulong
kvaddr, char *local,
>> > r14 = ULONG(pt_regs_buf + ms->pto.r14);
>> > r15 = ULONG(pt_regs_buf + ms->pto.r15);
>> >
>> > - verified = x86_64_eframe_verify(bt,
>> > - kvaddr ? kvaddr : (local - bt->stackbuf) +
bt->stackbase,
>> > - cs, ss, rip, rsp, rflags);
>> > -
>> > + /*
>> > + * Do not verify when the flags is not set to EFRAME_VERIFY.
>> > + */
>> > + if (symbol_search("asm_common_interrupt") &&
!(flags &
EFRAME_VERIFY))
>> > + verified = TRUE;
>> > + else
>> > + verified = x86_64_eframe_verify(bt,
>> > + kvaddr ? kvaddr : (local - bt->stackbuf) +
bt->stackbase,
>> > + cs, ss, rip, rsp, rflags);
>> > /*
>> > * If it's print-if-verified request, don't print bogus
eframes.
>> > */
>> > @@ -6577,6 +6582,8 @@ x86_64_irq_eframe_link(ulong stkref, struct
bt_info *bt, FILE *ofp)
>> > return stkref;
>> >
>> > irq_eframe = stkref - machdep->machspec->irq_eframe_link;
>> > + if (symbol_search("asm_common_interrupt"))
>> > + irq_eframe -= 80;
>>
>> So, where does this 80 come from?
>>
>>
>> crash> dis asm_common_interrupt
>> 0xffffffff88800c00 <asm_common_interrupt>: data16 xchg %ax,%ax
>> 0xffffffff88800c03 <asm_common_interrupt+3>: call
0xffffffff88801130 <error_entry>
>> 0xffffffff88800c08 <asm_common_interrupt+8>: mov %rsp,%rdi
>> 0xffffffff88800c0b <asm_common_interrupt+11>: mov 0x78(%rsp),%rsi
>> 0xffffffff88800c10 <asm_common_interrupt+16>: movq
$0xffffffffffffffff,0x78(%rsp)
>> 0xffffffff88800c19 <asm_common_interrupt+25>: call
0xffffffff887604a0 <common_interrupt>
>> 0xffffffff88800c1e <asm_common_interrupt+30>: jmp
0xffffffff88801210 <error_return>
>> crash>
>>
>> The rsp offset is 0x78, because the irq_eframe_link is initialized to
40, so the irq eframe offset needs to
>> subtract 80( 0x78 = 120 = 40 + 80).
hmm, I will check how the irq_eframe_link is initialized.
Thanks,
Kazu