Hi Aureau,
On Tue, Apr 16, 2024 at 5:14 PM Aureau, Georges (Kernel Tools ERT)
<georges.aureau(a)hpe.com> wrote:
Hello Tao Liu,
We don't update machdep->machspec->irq_eframe_link as it is now more a
reference point for trying various link offsets (+8, -8, +16, -16) around it.
This reference point should be viewed as the most likely link offset, and if we fail, we
try around it, but without updating this reference point,
otherwise we lose our reference point and the offsets we would be trying next would have
a different meaning.
Different processors may be caught in different IRQ/SYSVEC functions with different
framesizes,
so there is no longer a unique (same for all IRQ/SYSVEC functions) link offset, so we
can't compute one eframe link for all.
We are adapting the eframe link offset on the fly based on possible IRQ/SYSVEC function
framesizes, ie.
most likely -32, then try -40, then -24, then -48, then -16.
If we would update as we find a link, let's say -16, then next time around,
most likely -16, then try -24, then -8, then -32, then 0, eg. we would lose the -40 and
-48.
So if we have a most likely eframe link as a reference point, we should not move this
point.
The comment in x86_64_irq_eframe_link() for the patch I'm suggesting is saying:
"keep ms->irq_eframe_link as the most likely value, and try a few sizes around
it."
See below:
Thanks for the explanation, it sounds reasonable. OK I have no other
comments for this patch, so ack.
Thanks,
Tao Liu
--
diff --git a/x86_64.c b/x86_64.c
index aec82b0..90c2a91 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -6623,13 +6623,14 @@ x86_64_irq_eframe_link_init(void)
/*
* Calculate and verify the IRQ exception frame location from the
- * stack reference at the top of the IRQ stack, possibly adjusting
- * the ms->irq_eframe_link value.
+ * stack reference at the top of the IRQ stack, keep ms->irq_eframe_link
+ * as the most likely value, and try a few sizes around it.
*/
static ulong
x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE *ofp)
{
ulong irq_eframe;
+ int i, try[] = { 8, -8, 16, -16 };
if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
return stkref;
@@ -6639,9 +6640,10 @@ x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE
*ofp)
if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt, ofp))
return irq_eframe;
- if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt, ofp)) {
- machdep->machspec->irq_eframe_link -= 8;
- return (irq_eframe + 8);
+ for (i=0; i<sizeof(try)/sizeof(int); i++) {
+ if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+try[i], 0, bt, ofp))
{
+ return (irq_eframe + try[i]);
+ }
}
return irq_eframe;
--
Cheers,
Georges
-----Original Message-----
From: Tao Liu <ltao(a)redhat.com>
Sent: Tuesday, April 16, 2024 3:19 AM
To: Aureau, Georges (Kernel Tools ERT) <georges.aureau(a)hpe.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>; Lianbo Jiang
<lijiang(a)redhat.com>; devel(a)lists.crash-utility.osci.io
Subject: Re: [Crash-utility] Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't
handle all IRQ exception properly
Hi Aureau,
On Mon, Apr 15, 2024 at 11:37 PM Aureau, Georges (Kernel Tools ERT)
<georges.aureau(a)hpe.com> wrote:
>
> Hello Tao Liu, and Kazu,
>
> Thanks Kazu for your ack.
>
> Indeed, Tao Liu, you are right, we could factor the case 0 (most
> likely case) into the try[] loop but this would make the code not as
> readable. I would rather keep my original patch, where we leave case 0 as an
explicit call (most likely case) outside of the try[] loop.
OK, thanks for the explanation, I agree with your thoughts. One more question which I
still not very clear:
- machdep->machspec->irq_eframe_link -= 8;
Why don't we update irq_eframe_link after the code change?
Thanks,
Tao Liu
>
> Thanks,
> Georges
>
> -----Original Message-----
> From: Tao Liu <ltao(a)redhat.com>
> Sent: Monday, April 15, 2024 2:33 PM
> To: Aureau, Georges (Kernel Tools ERT) <georges.aureau(a)hpe.com>
> Cc: Lianbo Jiang <lijiang(a)redhat.com>; HAGIO KAZUHITO(萩尾 一仁)
> <k-hagio-ab(a)nec.com>; devel(a)lists.crash-utility.osci.io
> Subject: Re: [Crash-utility] Re: [PATCH] x86_64: rhel 9.3: "crash"
> doesn't handle all IRQ exception properly
>
> Hi Aureau,
>
> On Mon, Apr 15, 2024 at 4:48 PM HAGIO KAZUHITO(萩尾 一仁)
> <k-hagio-ab(a)nec.com> wrote:
> >
> > On 2024/04/10 22:47, Aureau, Georges (Kernel Tools ERT) wrote:
> > > Hello Lianbo,
> > >
> > > In my case, the fix you suggested works properly as I've got a single
CPU with an IRQ exception in a SYSVEC handler.
> > > But it we have multiple CPUs with different IRQ exceptions, then
> > > your suggestion does not work as
> > > x86_64_irq_eframe_link() is changing irq_eframe_link, for example:
> > >
> > > x86_64_irq_eframe_link_init():
> > > set default irq_eframe_link=-24 since we have
> > > sysvec_apic_timer_interrupt
> > >
> > > Then if we do "bt -c cpu" on several cpu X, Y, X, being of their
irq_stack:
> > >
> > > x86_64_irq_eframe_link()/cpu X/rip:sysvec_apic_timer_interrupt:
> > > default irq_eframe_link=-24 is matching
-framesize(rip:sysvec_apic_timer_interrupt)
> > > so we are ok
> > >
> > > x86_64_irq_eframe_link()/cpu Y/rip:common_interrupt:
> > > -24 does not work as framesize(common_interrupt)=32
> > > we try -24-8 = -32, it works, and we set irq_eframe_link=-32
> > >
> > > x86_64_irq_eframe_link()/cpu Z/rip:sysvec_apic_timer_interrupt:
> > > -32 does not work as framesize(rip:sysvec_apic_timer_interrupt)=24
> > > we try -32-8 = -40, it does not work, and we are back to square one
!
> > > here we should try -32+8 = -24
> > >
> > > We would need to have "bt -c cpu" doing runtime per cpu
selection between -32 and -24 depending on framesize(*(pcpu_hot.hardirq_stack_ptr-8)).
> > > So we can't really see "irq_eframe_link" as being a value
valid for all cpu's / IRQ.
> > > The "irq_eframe_link" should be per cpu, depending of what each
cpu is executing (more below).
> > >
> > > Currently x86_64_irq_eframe_link() tries "stkref", then it
tries
> > > "stkref-irq_eframe_link" (with irq_eframe_link set in
irq_eframe_link_init()), and then it tries "stkref-irq_eframe_link+8":
> > >
> > > static ulong
> > > x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE
> > > *ofp) {
> > > ulong irq_eframe;
> > >
> > > if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
> > > return stkref;
> > >
> > > irq_eframe = stkref - machdep->machspec->irq_eframe_link;
> > >
> > > if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt,
ofp))
> > > return irq_eframe;
> > >
> > > if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt,
ofp)) {
> > > machdep->machspec->irq_eframe_link -= 8;
> > > return (irq_eframe + 8);
> > > }
> > >
> > > return irq_eframe;
> > > }
> > >
> > >
> > > I don't have the history as why we try
"stkref-irq_eframe_link+8" ? Any idea ?
> > >
> > > But this current design with defaulting irq_eframe_link to -32 (or -24 as
per your suggestion) does not always work.
> > >
> > > We would need to handle both irq_stack rip's from IRQ functions (eg.
common_interrupt):
> > >
> > > #define DEFINE_IDTENTRY_IRQ(func) \
> > > static void __##func(struct pt_regs *regs, u32 vector); \
> > >
\
> > > __visible noinstr void func(struct pt_regs *regs, \
> > > unsigned long error_code)
\
> > > { \
> > > irqentry_state_t state = irqentry_enter(regs);
\
> > > u32 vector = (u32)(u8)error_code;
\
> > >
\
> > > instrumentation_begin();
\
> > > kvm_set_cpu_l1tf_flush_l1d();
\
> > > run_irq_on_irqstack_cond(__##func, regs, vector);
\
> > > instrumentation_end();
\
> > > irqentry_exit(regs, state);
\
> > > } \
> > >
> > > \ static noinline void __##func(struct pt_regs *regs, u32 vector)
> > >
> > >
> > > And also to handle irq_stack rip's from SYSVEC functions (eg.
sysvec_apic_timer_interrupt):
> > >
> > > #define DEFINE_IDTENTRY_SYSVEC(func) \
> > > static void __##func(struct pt_regs *regs); \
> > >
\
> > > __visible noinstr void func(struct pt_regs *regs) \
> > > { \
> > > irqentry_state_t state = irqentry_enter(regs);
\
> > >
\
> > > instrumentation_begin();
\
> > > kvm_set_cpu_l1tf_flush_l1d();
\
> > > run_sysvec_on_irqstack_cond(__##func, regs);
\
> > > instrumentation_end();
\
> > > irqentry_exit(regs, state);
\
> > > } \
> > >
> > > \ static noinline void __##func(struct pt_regs *regs) Other changes are
required.
> > >
> > >
> > > Clearly we should be adjusting "irq_eframe_link" based on
framesize for either DEFINE_IDTENTRY_IRQ() functions or DEFINE_IDTENTRY_SYSVEC()
functions.
> > > Maybe we should have "irq_eframe_link" for
DEFINE_IDTENTRY_IRQ() functions, and another "sysvec_eframe_link" for
DEFINE_IDTENTRY_SYSVEC() functions, then try both links.
> > > Anyhow, the current x86_64_irq_eframe_link_init() code is making
assumption on a single common framesize , and actually there is a couple of assumptions:
> > > - assuming framesizes with hardcoded values
> > > - assuming a single common framesize
> > >
> > > To keep backward compatibility with the current code, which is
> > > trying +8 (not sure why), the patch I proposed was keeping the existing
+8, but it is also trying -8, and also +16 and -16, as to try having more fallbacks in
case we didn't guess the proper framesizes.
> >
> > Thank you for the fix, I also have a vmcore printing "bogus
> > exception frame" but there has been no good idea to fix. The patch
> > fixed this so
> >
> > Acked-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
> >
> > $ diff -u before.log after.log
> >
> > @@ -577,14 +577,21 @@
> > #9 [ffff954a4683cfe0] __irq_exit_rcu at ffffffff86b0948f
> > #10 [ffff954a4683cff0] sysvec_apic_timer_interrupt at ffffffff875fe8f2
> > --- <IRQ stack> ---
> > - RIP: 0000000000000010 RSP: 0000000000000018 RFLAGS: ffff954a4437fe80
> > - RAX: 000000000000001f RBX: 000000000000a3e7 RCX: 000000000000000e
> > - RDX: 000000003a2e8f1a RSI: 0000000000000000 RDI: ffffffffffffffff
> > - RBP: ffffb53a3f580000 R8: ffff8a51bf5b0a80 R9: 0000001a95a9b7b8
> > - R10: 0000000000000000 R11: 00000000000022ef R12: 0000000000000004
> > - R13: ffffffff884acd60 R14: 0000001a95a9b7b8 R15: 0000000000000004
> > - ORIG_RAX: ffffffff872fe528 CS: 0202 SS: 1a92a013c6
> > -WARNING: possibly bogus exception frame
> > +#11 [ffff954a4437fdd8] asm_sysvec_apic_timer_interrupt at ffffffff87800d86
> > + [exception RIP: cpuidle_enter_state+216]
> > + RIP: ffffffff872fe528 RSP: ffff954a4437fe80 RFLAGS: 00000202
> > + RAX: ffff8a51bf5b0a80 RBX: ffffb53a3f580000 RCX: 000000000000001f
> > + RDX: 000000000000000e RSI: 000000003a2e8f1a RDI: 0000000000000000
> > + RBP: 0000000000000004 R8: 0000001a95a9b7b8 R9: 0000000000000000
> > + R10: 00000000000022ef R11: 000000000000a3e7 R12: ffffffff884acd60
> > + R13: 0000001a95a9b7b8 R14: 0000000000000004 R15: 0000000000000000
> > + ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > +#12 [ffff954a4437feb8] cpuidle_enter at ffffffff872fe8b9
> > +#13 [ffff954a4437fed8] cpuidle_idle_call at ffffffff86b624dc
> > +#14 [ffff954a4437ff10] do_idle at ffffffff86b625fb
> > +#15 [ffff954a4437ff28] cpu_startup_entry at ffffffff86b62849
> > +#16 [ffff954a4437ff38] start_secondary at ffffffff86a62a0f
> > +#17 [ffff954a4437ff50] secondary_startup_64_no_verify at
> > +ffffffff86a0015a
> >
> > Thanks,
> > Kazu
> >
> >
> > >
> > > Regards,
> > > Georges
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Lianbo Jiang <lijiang(a)redhat.com>
> > > Sent: Wednesday, April 10, 2024 6:11 AM
> > > To: devel(a)lists.crash-utility.osci.io
> > > Cc: Aureau, Georges (Kernel Tools ERT) <georges.aureau(a)hpe.com>
> > > Subject: Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't
handle all
> > > IRQ exception properly
> > >
> > > Hi, Aureau
> > >
> > > Thank you for the fix.
> > >
> > > On 4/5/24 08:38, devel-request(a)lists.crash-utility.osci.io wrote:
> > >> Date: Thu, 4 Apr 2024 14:39:06 +0000
> > >> From: "Aureau, Georges (Kernel Tools
> > >> ERT)"<georges.aureau(a)hpe.com>
> > >> Subject: [Crash-utility] [PATCH] x86_64: rhel 9.3: "crash"
doesn't
> > >> handle all IRQ exception properly.
> > >> To:"devel@lists.crash-utility.osci.io"
> > >> <devel(a)lists.crash-utility.osci.io>
> > >> Message-ID:
<SJ0PR84MB14821A83D9631672B4F0F9449F3C2(a)SJ0PR84MB1482.NAMP
> > >> RD84.PROD.OUTLOOK.COM>
> > >> Content-Type: text/plain; charset="us-ascii"
> > >>
> > >>
> > >> Hello,
> > >>
> > >> On x86_64, with rhel 9.3, there are cases where "crash"
can't handle IRQ exception frames properly.
> > >>
> > >> For example, "crash" fails with "WARNING: possibly
bogus exception frame":
> > >>
> > >> crash> bt -c 30
> > >> PID: 2898241 TASK: ff4cb0ce0da0c680 CPU: 30 COMMAND:
"star-ccm+"
> > >> #0 [fffffe4658d88e58] crash_nmi_callback at ffffffffa00675e8
> > >> #1 [fffffe4658d88e68] nmi_handle at ffffffffa002ebab
> > >> #2 [fffffe4658d88eb0] default_do_nmi at ffffffffa0c56cd0
> > >> #3 [fffffe4658d88ed0] exc_nmi at ffffffffa0c56ee1
> > >> #4 [fffffe4658d88ef0] end_repeat_nmi at ffffffffa0e01639
> > >> [exception RIP: native_queued_spin_lock_slowpath+636]
> > >> RIP: ffffffffa0c6b80c RSP: ff5eba269937ce10 RFLAGS: 00000046
> > >> RAX: 0000000000000000 RBX: ff4cb12bcfbb2940 RCX:
00000000007c0000
> > >> RDX: 0000000000000057 RSI: 0000000001600000 RDI:
ff4cb12d67205608
> > >> RBP: ff4cb12d67205608 R8: ff90b9c682475940 R9:
0000000000000000
> > >> R10: ff4cb0d70ee4e100 R11: 0000000000000000 R12:
0000000000000000
> > >> R13: 000000000000001e R14: ff90b9c682474918 R15:
ff90b9c682477120
> > >> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > >> --- <NMI exception stack> ---
> > >> #5 [ff5eba269937ce10] native_queued_spin_lock_slowpath at
ffffffffa0c6b80c
> > >> #6 [ff5eba269937ce30] _raw_spin_lock_irqsave at ffffffffa0c6ad90
> > >> #7 [ff5eba269937ce40] free_iova at ffffffffa07df716
> > >> #8 [ff5eba269937ce68] fq_ring_free at ffffffffa07dba15
> > >> #9 [ff5eba269937ce98] fq_flush_timeout at ffffffffa07dc8fe
> > >> #10 [ff5eba269937ced0] call_timer_fn at ffffffffa01c00a4
> > >> #11 [ff5eba269937cef0] __run_timers at ffffffffa01c03ae
> > >> #12 [ff5eba269937cf88] run_timer_softirq at ffffffffa01c0476
> > >> #13 [ff5eba269937cf90] __do_softirq at ffffffffa0c6c007
> > >> #14 [ff5eba269937cfe0] __irq_exit_rcu at ffffffffa010ef61
> > >> #15 [ff5eba269937cff0] sysvec_apic_timer_interrupt at
> > >> ffffffffa0c58ca2
> > >> --- <IRQ stack> ---
> > >> RIP: 0000000000000010 RSP: 0000000000000018 RFLAGS:
ff5eba26ddc9f7e8
> > >> RAX: 0000000000000a20 RBX: ff5eba26ddc9f940 RCX:
0000000000001000
> > >> RDX: ffffffb559980000 RSI: ff4cb12d67207400 RDI:
ffffffffffffffff
> > >> RBP: 0000000000001000 R8: ff5eba26ddc9f940 R9:
ff5eba26ddc9f8af
> > >> R10: 0000000000000003 R11: 0000000000000a20 R12:
ff5eba26ddc9f8b0
> > >> R13: 000000283c07f000 R14: ff4cb0f5a29a1c00 R15:
0000000000000001
> > >> ORIG_RAX: ffffffffa07c4e60 CS: 0206 SS: 7000001cf0380001
> > >> bt: WARNING: possibly bogus exception frame
> > >>
> > >>
> > >> Running "crash" with "--machdep
irq_eframe_link=0xffffffffffffffe8" (ie. thus irq_eframe_link = -24) works properly:
> > >>
> > >> PID: 2898241 TASK: ff4cb0ce0da0c680 CPU: 30 COMMAND:
"star-ccm+"
> > >> #0 [fffffe4658d88e58] crash_nmi_callback at ffffffffa00675e8
> > >> #1 [fffffe4658d88e68] nmi_handle at ffffffffa002ebab
> > >> #2 [fffffe4658d88eb0] default_do_nmi at ffffffffa0c56cd0
> > >> #3 [fffffe4658d88ed0] exc_nmi at ffffffffa0c56ee1
> > >> #4 [fffffe4658d88ef0] end_repeat_nmi at ffffffffa0e01639
> > >> [exception RIP: native_queued_spin_lock_slowpath+636]
> > >> RIP: ffffffffa0c6b80c RSP: ff5eba269937ce10 RFLAGS: 00000046
> > >> RAX: 0000000000000000 RBX: ff4cb12bcfbb2940 RCX:
00000000007c0000
> > >> RDX: 0000000000000057 RSI: 0000000001600000 RDI:
ff4cb12d67205608
> > >> RBP: ff4cb12d67205608 R8: ff90b9c682475940 R9:
0000000000000000
> > >> R10: ff4cb0d70ee4e100 R11: 0000000000000000 R12:
0000000000000000
> > >> R13: 000000000000001e R14: ff90b9c682474918 R15:
ff90b9c682477120
> > >> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > >> --- <NMI exception stack> ---
> > >> #5 [ff5eba269937ce10] native_queued_spin_lock_slowpath at
ffffffffa0c6b80c
> > >> #6 [ff5eba269937ce30] _raw_spin_lock_irqsave at ffffffffa0c6ad90
> > >> #7 [ff5eba269937ce40] free_iova at ffffffffa07df716
> > >> #8 [ff5eba269937ce68] fq_ring_free at ffffffffa07dba15
> > >> #9 [ff5eba269937ce98] fq_flush_timeout at ffffffffa07dc8fe
> > >> #10 [ff5eba269937ced0] call_timer_fn at ffffffffa01c00a4
> > >> #11 [ff5eba269937cef0] __run_timers at ffffffffa01c03ae
> > >> #12 [ff5eba269937cf88] run_timer_softirq at ffffffffa01c0476
> > >> #13 [ff5eba269937cf90] __do_softirq at ffffffffa0c6c007
> > >> #14 [ff5eba269937cfe0] __irq_exit_rcu at ffffffffa010ef61
> > >> #15 [ff5eba269937cff0] sysvec_apic_timer_interrupt at
> > >> ffffffffa0c58ca2
> > >> --- <IRQ stack> ---
> > >> #16 [ff5eba26ddc9f738] asm_sysvec_apic_timer_interrupt at
ffffffffa0e00e06
> > >> [exception RIP: alloc_pte.constprop.0+32]
> > >> RIP: ffffffffa07c4e60 RSP: ff5eba26ddc9f7e8 RFLAGS: 00000206
> > >> RAX: ff5eba26ddc9f940 RBX: 0000000000001000 RCX:
0000000000000a20
> > >> RDX: 0000000000001000 RSI: ffffffb559980000 RDI:
ff4cb12d67207400
> > >> RBP: ff5eba26ddc9f8b0 R8: ff5eba26ddc9f8af R9:
0000000000000003
> > >> R10: 0000000000000a20 R11: ff5eba26ddc9f940 R12:
000000283c07f000
> > >> R13: ff4cb0f5a29a1c00 R14: 0000000000000001 R15:
ff4cb0f5a29a1bf8
> > >> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > >> #17 [ff5eba26ddc9f830] iommu_v1_map_pages at ffffffffa07c5648
> > >> #18 [ff5eba26ddc9f8f8] __iommu_map at ffffffffa07d7803
> > >> #19 [ff5eba26ddc9f990] iommu_map_sg at ffffffffa07d7b71
> > >> #20 [ff5eba26ddc9f9f0] iommu_dma_map_sg at ffffffffa07ddcc9
> > >> #21 [ff5eba26ddc9fa90] __dma_map_sg_attrs at ffffffffa01b5205
> > >> #22 [ff5eba26ddc9fa98] dma_map_sgtable at ffffffffa01b526d
> > >> #23 [ff5eba26ddc9faa8] ib_umem_get at ffffffffc112f308
> > >> [ib_uverbs]
> > >> #24 [ff5eba26ddc9fb00] mlx5_ib_reg_user_mr at ffffffffc11eb991
> > >> [mlx5_ib]
> > >> #25 [ff5eba26ddc9fb40] ib_uverbs_reg_mr at ffffffffc1123bc4
> > >> [ib_uverbs]
> > >> #26 [ff5eba26ddc9fbb0]
> > >> ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE
> > >> at
> > >> ffffffffc112d064 [ib_uverbs]
> > >> #27 [ff5eba26ddc9fbe0] ib_uverbs_run_method at ffffffffc112a121
> > >> [ib_uverbs]
> > >> #28 [ff5eba26ddc9fc30] ib_uverbs_cmd_verbs at ffffffffc112a332
> > >> [ib_uverbs]
> > >> #29 [ff5eba26ddc9fe68] ib_uverbs_ioctl at ffffffffc112a494
> > >> [ib_uverbs]
> > >> #30 [ff5eba26ddc9fea8] __x64_sys_ioctl at ffffffffa0438f67
> > >> #31 [ff5eba26ddc9fed8] do_syscall_64 at ffffffffa0c55189
> > >> #32 [ff5eba26ddc9ff50] entry_SYSCALL_64_after_hwframe at
ffffffffa0e000ea
> > >> RIP: 0000146ef3a3ec6b RSP: 00007ffe3b7dc5b8 RFLAGS: 00000246
> > >> RAX: ffffffffffffffda RBX: 00007ffe3b7dc6a8 RCX:
0000146ef3a3ec6b
> > >> RDX: 00007ffe3b7dc690 RSI: 00000000c0181b01 RDI:
0000000000000011
> > >> RBP: 00007ffe3b7dc670 R8: 0000146ed9549010 R9:
00007ffe3b7dc7c4
> > >> R10: 0000000000000009 R11: 0000000000000246 R12:
00007ffe3b7dc7c4
> > >> R13: 000000000000000c R14: 000000000000000c R15:
0000146ed9549150
> > >> ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
> > >>
> > >>
> > >> Some background:
> > >>
> > >> asm_common_interrupt:
> > >> callq error_entry
> > >> movq %rax,%rsp
> > >> movq %rsp,%rdi
> > >> movq 0x78(%rsp),%rsi
> > >> movq $-0x1,0x78(%rsp)
> > >> call common_interrupt # rsp pointing to regs
> > >>
> > >> common_interrupt
> > >> pushq %r12
> > >> pushq %rbp
> > >> pushq %rbx
> > >> [...]
> > >> movq hardirq_stack_ptr,%r11
> > >> movq %rsp,(%r11)
> > >> movq %r11,%rsp
> > >> [...]
> > >> call __common_interrupt # rip:common_interrupt
> > >>
> > >> So frame_size(rip:common_interrupt) = 32 (3 push + ret).
> > >>
> > >> Hence "machdep->machspec->irq_eframe_link = -32;"
(see x86_64_irq_eframe_link_init()).
> > >>
> > >> Now:
> > >>
> > >> asm_sysvec_apic_timer_interrupt:
> > >> pushq $-0x1
> > >> callq error_entry
> > >> movq %rax,%rsp
> > >> movq %rsp,%rdi
> > >> callq sysvec_apic_timer_interrupt
> > >>
> > >> sysvec_apic_timer_interrupt
> > >> pushq %r12
> > >> pushq %rbp
> > >> [...]
> > >> movq hardirq_stack_ptr,%r11
> > >> movq %rsp,(%r11)
> > >> movq %r11,%rsp
> > >> [...]
> > >> call __sysvec_apic_timer_interrupt #
> > >> rip:sysvec_apic_timer_interrupt
> > >>
> > >> Here frame_size(rip:sysvec_apic_timer_interrupt) = 24 (2 push +
> > >> ret)
> > >>
> > >> We should also notice that:
> > >>
> > >> rip = *(hardirq_stack_ptr - 8)
> > >> rsp = *(hardirq_stack_ptr)
> > >> regs = rsp - frame_size(rip)
> > >>
> > >> But x86_64_get_framesize() does not work with IRQ handlers (returns
0).
> > >> So not many options other than hardcoding the most likely value and
looking around it.
> > >> Actually x86_64_irq_eframe_link() was trying -32 (default), and then
-40, but not -24 !
> > >>
> > >> Signed-off-by: Georges Aureau<georges.aureau(a)hpe.com>
> > >> --
> > >> diff --git a/x86_64.c b/x86_64.c
> > >> index aec82b0..90c2a91 100644
> > >> --- a/x86_64.c
> > >> +++ b/x86_64.c
> > >> @@ -6623,13 +6623,14 @@ x86_64_irq_eframe_link_init(void)
> > >>
> > >> /*
> > >> * Calculate and verify the IRQ exception frame location from
> > >> the
> > >> - * stack reference at the top of the IRQ stack, possibly
> > >> adjusting
> > >> - * the ms->irq_eframe_link value.
> > >> + * stack reference at the top of the IRQ stack, keep
> > >> + ms->irq_eframe_link
> > >> + * as the most likely value, and try a few sizes around it.
> > >> */
> > >> static ulong
> > >> x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE
*ofp)
> > >> {
> > >> ulong irq_eframe;
> > >> + int i, try[] = { 8, -8, 16, -16 };
> > >>
> > >> if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt,
ofp))
> > >> return stkref;
> > >> @@ -6639,9 +6640,10 @@ x86_64_irq_eframe_link(ulong stkref, struct
bt_info *bt, FILE *ofp)
> > >> if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt,
ofp))
> > >> return irq_eframe;
> > >>
> > >> - if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt,
ofp)) {
> > >> - machdep->machspec->irq_eframe_link -= 8;
>
> Why don't we update "machdep->machspec->irq_eframe_link" in the
following added code?
>
> > >> - return (irq_eframe + 8);
> > >> + for (i=0; i<sizeof(try)/sizeof(int); i++) {
> > >> + if (x86_64_exception_frame(EFRAME_VERIFY,
irq_eframe+try[i], 0, bt, ofp)) {
> > >> + return (irq_eframe + try[i]);
> > >> + }
> > >> }
> > >>
> > >> return irq_eframe;
>
> I guess if we don't need to update
> "machdep->machspec->irq_eframe_link", maybe we can merge the code as
follows?
>
> x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE *ofp) {
> ulong irq_eframe;
> + int i, try[] = { 0, 8, -8, 16, -16 };
>
> if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
> return stkref;
>
> irq_eframe = stkref - machdep->machspec->irq_eframe_link;
>
> - if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt, ofp))
> - return irq_eframe;
> -
> - if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt, ofp)) {
> - machdep->machspec->irq_eframe_link -= 8;
> - return (irq_eframe + 8);
> + for (i=0; i<sizeof(try)/sizeof(int); i++) {
> + if (x86_64_exception_frame(EFRAME_VERIFY,
> irq_eframe+try[i], 0, bt, ofp)) {
> + return (irq_eframe + try[i]);
> + }
> }
>
> return irq_eframe;
>
> Thanks,
> Tao Liu
>
> > >
> > >
> > > Can you help to try the following changes? It could be good to check the
relevant symbol, but I'm not sure if this can work well for you and cover some corner
cases.
> > >
> > > diff --git a/x86_64.c b/x86_64.c
> > > index aec82b03b62e..251e224c013b 100644
> > > --- a/x86_64.c
> > > +++ b/x86_64.c
> > > @@ -6574,6 +6574,8 @@ x86_64_irq_eframe_link_init(void)
> > > if (symbol_exists("asm_common_interrupt")) {
> > > if (symbol_exists("asm_call_on_stack"))
> > > machdep->machspec->irq_eframe_link = -64;
> > > + else if (symbol_exists("sysvec_apic_timer_interrupt"))
> > > + machdep->machspec->irq_eframe_link = -24;
> > > else
> > > machdep->machspec->irq_eframe_link = -32;
> > > return;
> > >
> > >
> > > Thanks
> > >
> > > Lianbo
> > >
> > > --
> > > Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io To
> > > unsubscribe send an email to
> > > devel-leave(a)lists.crash-utility.osci.io
> > > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.
> > > io
> > > / Contribution Guidelines:
> > >
https://github.com/crash-utility/crash/wiki
> > --
> > Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io To
> > unsubscribe send an email to devel-leave(a)lists.crash-utility.osci.io
> > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io
> > / Contribution Guidelines:
> >
https://github.com/crash-utility/crash/wiki
>