On 1/12/24 10:40, HAGIO KAZUHITO(萩尾 一仁) wrote:
Hi Lianbo,
sorry for the delay..
No worries, Kazu.
On 2023/11/29 21:37, lijiang wrote:
> On Wed, Nov 29, 2023 at 8:45 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>
> wrote:
>
>> On 2023/11/28 11:31, Lianbo Jiang wrote:
>>> Currently, the "bt pid" command may not print enough stack trace
and the
>>> remaining frames will be truncated on x86_64. For example:
>>>
>>> Without the patch:
>>> crash> bt 493113
>>> PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND:
>> "sriov_fec_daemo"
>>> #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
>>> #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
>>> #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
>>> RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
>>> RAX: ffffffffffffffda RBX: 000000c00004e000 RCX:
>> 000000000047cdbb
>>> RDX: 000000000000000c RSI: 000000c000097798 RDI:
>> 0000000000000009
>>> RBP: 000000c0000975f8 R8: 0000000000000001 R9:
>> 000000c00098d680
>>> R10: 000000000000000c R11: 0000000000000216 R12:
>> 000000c000097688
>>> R13: 0000000000000000 R14: 000000c0006c3520 R15:
>> 00007f5e359946b7
>>> ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
>>>
>>> With the patch:
>>> crash> bt 493113
>>> PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND:
>> "sriov_fec_daemo"
>>> #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
>>> #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
>>> #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
>>> #3 [ff77abc4e81cfc68] vfio_unregister_group_dev at ffffffffc10e76ae
[vfio]
I'm still not sure how it should be fixed, but...
crash> bt -t 493113
PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND: "sriov_fec_daemo"
START: __schedule at ffffffff81b239cb
[ff77abc4e81cfb08] __schedule at ffffffff81b239cb
[ff77abc4e81cfb70] schedule at ffffffff81b23e2d
[ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
[ff77abc4e81cfbb0] process_timeout at ffffffff811a09d0
[ff77abc4e81cfbf0] __wait_for_common at ffffffff81b24abb
[ff77abc4e81cfbf8] schedule_timeout at ffffffff81b2c960
[ff77abc4e81cfc68] vfio_unregister_group_dev at ffffffffc10e76ae [vfio]
crash> set debug 2
debug: 2
crash> bt 493113
#0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
rsp: ff77abc4e81cfb08 prev_sp: ff77abc4e81cfb78 framesize: 96
--> ff77abc4e81cfb78 - 8 = [ff77abc4e81cfb70] schedule
#1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
rsp: ff77abc4e81cfb70 prev_sp: ff77abc4e81cfb90 framesize: 16
--> ff77abc4e81cfb90 - 8 = [ff77abc4e81cfb88] schedule_timeout
#2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
rsp: ff77abc4e81cfb88 prev_sp: ff77abc4e81cfbf8 framesize: 96
--> ff77abc4e81cfbf8 - 8 = [ff77abc4e81cfbf0] __wait_for_common
< ignoring __wait_for_common() -- makes indirect call and NOT schedule_timeout()>
So it looks like it resolves the correct next frame (__wait_for_common)
with ORC data, but the BT_CHECK_CALLER check overwrites it.
I also saw the above debug information.
The original if-else code block considered two cases(enable or disable
BT_CHECK_CALLER), not mentioned the ORC case here. I'm not sure if it is
reasonable to check the ORC flag.
According to the back trace, I see the similar check such as schedule()
and schedule_timeout() in the x86_64_function_called_by(), the scenario
looks closer to the actual situation.
if (is_direct_call_target(bt)) {
if (CRASHDEBUG(2))
fprintf(ofp, "< enable BT_CHECK_CALLER for %s
\n",
bt->call_target);
bt->flags |= BT_CHECK_CALLER;
} else {
if (CRASHDEBUG(2) && (bt->flags & BT_CHECK_CALLER))
fprintf(ofp, "< disable BT_CHECK_CALLER for %s
\n",
bt->call_target);
if (bt->flags & BT_CHECK_CALLER) {
if (CRASHDEBUG(2))
fprintf(ofp, "< set BT_NO_CHECK_CALLER
\n");
bt->flags |=
BT_NO_CHECK_CALLER;
}
bt->flags &= ~(ulonglong)BT_CHECK_CALLER;
}
The flag BT_NO_CHECK_CALLER will be set later, so the
is_direct_call_target() will return FALSE in the next time:
if (!bt->call_target || (bt->flags & BT_NO_CHECK_CALLER))
return FALSE;
probably the BT_CHECK_CALLER is a very old mechanism, I wonder if
it's necessary with ORC...
Unfortunately, I can not track the history, there are two changes:
[1]
/*
* Functions that jmp to schedule() or schedule_timeout().
*/
if (sp) {
if ((STREQ(sp->name, "schedule_timeout_interruptible") ||
STREQ(sp->name, "schedule_timeout_uninterruptible")))
sp = symbol_search("schedule_timeout");
if (STREQ(sp->name, "__cond_resched"))
sp = symbol_search("schedule");
}
[2]
/*
* Functions that won't be called indirectly.
* Add more to this as they are discovered.
*/
static const char *direct_call_targets[] = {
"schedule",
"schedule_timeout",
NULL
};
static int
is_direct_call_target(struct bt_info *bt)
{
int i;
if (!bt->call_target || (bt->flags & BT_NO_CHECK_CALLER))
return FALSE;
if (strstr(bt->call_target, "schedule") &&
is_task_active(bt->task))
return FALSE;
for (i = 0; direct_call_targets[i]; i++) {
if (STREQ(direct_call_targets[i], bt->call_target))
return TRUE;
}
return FALSE;
}
Could I have a few additional information?:
Sure.
- What is the kernel version? (e.g. 5.14.0-362.8.1.el9_3.x86_64)
I observed the current issue on the kernel 5.14.0-283.rt14.283.el9.x86_64.
- What happens with the following patch?
The following changes can also work in this case.
Do you prefer adding an ORC check or a similar check in
x86_64_function_called_by()? Or both?
Thanks
Lianbo
> --- a/x86_64.c
> +++ b/x86_64.c
> @@ -3342,7 +3342,7 @@ x86_64_print_stack_entry(struct bt_info *bt, FILE *ofp, int
level,
>
> bt->call_target = name;
>
> - if (is_direct_call_target(bt)) {
> + if (!(machdep->flags & ORC) && is_direct_call_target(bt)) {
> if (CRASHDEBUG(2))
> fprintf(ofp, "< enable BT_CHECK_CALLER for %s
\n",
>
bt->call_target);
>
> Thanks,
> Kazu
>
>>>> #4 [ff77abc4e81cfca8] vfio_pci_core_unregister_device at
>>> ffffffffc11bb599 [vfio_pci_core]
>>>> #5 [ff77abc4e81cfcc0] vfio_pci_remove at ffffffffc103e045
[vfio_pci]
>>>> #6 [ff77abc4e81cfcd0] pci_device_remove at ffffffff815d7513
>>>> #7 [ff77abc4e81cfcf0] device_release_driver_internal at
>>> ffffffff81708baa
>>>> #8 [ff77abc4e81cfd20] unbind_store at ffffffff81705f6f
>>>> #9 [ff77abc4e81cfd50] kernfs_fop_write_iter at ffffffff81454bf1
>>>> #10 [ff77abc4e81cfd88] new_sync_write at ffffffff813aad8c
>>>> #11 [ff77abc4e81cfe20] vfs_write at ffffffff813adb36
>>>> #12 [ff77abc4e81cfe58] ksys_write at ffffffff813adeb2
>>>> #13 [ff77abc4e81cfe90] do_syscall_64 at ffffffff81b17159
>>>> #14 [ff77abc4e81cff50] entry_SYSCALL_64_after_hwframe at
>>> ffffffff81c0009b
>>>> RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
>>>> RAX: ffffffffffffffda RBX: 000000c00004e000 RCX:
>>> 000000000047cdbb
>>>> RDX: 000000000000000c RSI: 000000c000097798 RDI:
>>> 0000000000000009
>>>> RBP: 000000c0000975f8 R8: 0000000000000001 R9:
>>> 000000c00098d680
>>>> R10: 000000000000000c R11: 0000000000000216 R12:
>>> 000000c000097688
>>>> R13: 0000000000000000 R14: 000000c0006c3520 R15:
>>> 00007f5e359946b7
>>>> ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
>>>>
>>>> Let's add a check function that jump to schedule_timeout(), just
like
>>>> the schedule_timeout_*() in x86_64_function_called_by().
>>>
>> Thank you for the comments, Kazu.
>>
>>
>>> Sorry, but could you elaborate the cause of this issue and the reason
>>> why the patch can fix it? I didn't get it with the commit log.
>>>
>>>
>> I did the debugging, and also saw that it gets the called function, but
>> it's not the target function, here, the target function is
>> schedule_timeout(). So, we need to correct it.
>>
>> After the instructions 'x/i 0xffffffffc10e76a9', the stack trace was
>> truncated, but it is the actual back trace. Let's see the following output
>> for debugging:
>>
>> crash> bt 493113
>> PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND:
"sriov_fec_daemo"
>> #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
>> #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
>> bt: x86_64_function_called_by 4469: x/i 0xffffffff81b2c9e3
>> bt: x86_64_function_called_by 4475: 0xffffffff81b2c9e3
>> <schedule_timeout+131>: call 0xffffffff81b23dd0 <schedule>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff81b23dd0 offset:0
>> name:schedule
>> #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
>> bt: x86_64_function_called_by 4469: x/i 0xffffffffc10e76a9
>> bt: x86_64_function_called_by 4475: 0xffffffffc10e76a9
>> <vfio_unregister_group_dev+121>: call 0xffffffff81b24c70
>> <wait_for_completion_interruptible_timeout>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff81b24c70 offset:0
>> name:wait_for_completion_interruptible_timeout
>> bt: x86_64_function_called_by 4469: x/i 0xffffffffc11bb594
>> bt: x86_64_function_called_by 4475: 0xffffffffc11bb594
>> <vfio_pci_core_unregister_device+20>: call 0xffffffffc10e7630
>> <vfio_unregister_group_dev>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffffc10e7630 offset:0
>> name:vfio_unregister_group_dev
>> bt: x86_64_function_called_by 4469: x/i 0xffffffffc103e040
>> bt: x86_64_function_called_by 4475: 0xffffffffc103e040
>> <vfio_pci_remove+16>: call 0xffffffffc11bb580
>> <vfio_pci_core_unregister_device>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffffc11bb580 offset:0
>> name:vfio_pci_core_unregister_device
>> bt: x86_64_function_called_by 4469: x/i 0xffffffff81708ba5
>> bt: x86_64_function_called_by 4475: 0xffffffff81708ba5
>> <device_release_driver_internal+421>: call 0xffffffff817075a0
>> <device_remove>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff817075a0 offset:0
>> name:device_remove
>> bt: x86_64_function_called_by 4469: x/i 0xffffffff81705f6a
>> bt: x86_64_function_called_by 4475: 0xffffffff81705f6a
>> <unbind_store+282>: call 0xffffffff81708c40
<device_driver_detach>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff81708c40 offset:0
>> name:device_driver_detach
>> bt: x86_64_function_called_by 4469: x/i 0xffffffff813adb31
>> bt: x86_64_function_called_by 4475: 0xffffffff813adb31 <vfs_write+593>:
>> call 0xffffffff813aac90 <new_sync_write>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff813aac90 offset:0
>> name:new_sync_write
>> bt: x86_64_function_called_by 4469: x/i 0xffffffff813adead
>> bt: x86_64_function_called_by 4475: 0xffffffff813adead <ksys_write+93>:
>> call 0xffffffff813ad8e0 <vfs_write>
>>
>> bt: x86_64_function_called_by 4488: value:ffffffff813ad8e0 offset:0
>> name:vfs_write
>> crash>
>>
>> Also, could I have the following information?
>>>
>> Sure.
>>
>>
>>> - output of "bt -t 493113" with the patch
>>>
>> crash> bt -t 493113
>> PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND:
"sriov_fec_daemo"
>> START: __schedule at ffffffff81b239cb
>> [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
>> [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
>> [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
>> [ff77abc4e81cfbb0] process_timeout at ffffffff811a09d0
>> [ff77abc4e81cfbf0] __wait_for_common at ffffffff81b24abb
>> [ff77abc4e81cfbf8] schedule_timeout at ffffffff81b2c960
>> [ff77abc4e81cfc68] vfio_unregister_group_dev at ffffffffc10e76ae [vfio]
>> [ff77abc4e81cfca8] vfio_pci_core_unregister_device at ffffffffc11bb599
>> [vfio_pci_core]
>> [ff77abc4e81cfcc0] vfio_pci_remove at ffffffffc103e045 [vfio_pci]
>> [ff77abc4e81cfcd0] pci_device_remove at ffffffff815d7513
>> [ff77abc4e81cfcf0] device_release_driver_internal at ffffffff81708baa
>> [ff77abc4e81cfd20] unbind_store at ffffffff81705f6f
>> [ff77abc4e81cfd50] kernfs_fop_write_iter at ffffffff81454bf1
>> [ff77abc4e81cfd88] new_sync_write at ffffffff813aad8c
>> [ff77abc4e81cfe20] vfs_write at ffffffff813adb36
>> [ff77abc4e81cfe58] ksys_write at ffffffff813adeb2
>> [ff77abc4e81cfe90] do_syscall_64 at ffffffff81b17159
>> [ff77abc4e81cfea0] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cfeb0] syscall_exit_to_user_mode at ffffffff81b1b50d
>> [ff77abc4e81cfec0] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cfed0] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cfed8] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cfee0] syscall_exit_to_user_mode at ffffffff81b1b50d
>> [ff77abc4e81cfef0] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cfef8] __context_tracking_enter at ffffffff81b1b6fc
>> [ff77abc4e81cff08] syscall_exit_to_user_mode at ffffffff81b1b50d
>> [ff77abc4e81cff18] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cff20] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cff28] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cff30] do_syscall_64 at ffffffff81b17169
>> [ff77abc4e81cff38] __context_tracking_enter at ffffffff81b1b6fc
>> [ff77abc4e81cff50] entry_SYSCALL_64_after_hwframe at ffffffff81c0009b
>> RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
>> RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000000000047cdbb
>> RDX: 000000000000000c RSI: 000000c000097798 RDI: 0000000000000009
>> RBP: 000000c0000975f8 R8: 0000000000000001 R9: 000000c00098d680
>> R10: 000000000000000c R11: 0000000000000216 R12: 000000c000097688
>> R13: 0000000000000000 R14: 000000c0006c3520 R15: 00007f5e359946b7
>> ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
>> crash>
>>
>> - output of "bt 493113" after "set debug 2" without the
patch.
>> crash> set debug 2
>> debug: 2
>> crash> bt 493113
>> PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND:
"sriov_fec_daemo"
>> rsp: ff77abc4e81cfad8 rbp: ff2e34ff19aeb140
>> #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
>> orc_find:
>> ip: ffffffff81b239cb idx: 45625
>> start = orc_lookup[45625]: 419274
>> stop = orc_lookup[45626] + 1: 419283
>> ip table start: ffffffff8312bbcc
>> unwind table start: ffffffff833a0ff8
>> __orc_find:
>> ip: ffffffff81b239cb num_entries: 9
>> orc_ip(ffffffff8312bbcc): fe9f7bbf -> ffffffff81b2378b / __schedule+27
->
>> ffffffff833a0ff8
>> orc_ip(ffffffff8312bbd0): fe9f7e1b -> ffffffff81b239eb / __schedule+635
>> -> ffffffff833a0ffe
>> orc_ip(ffffffff8312bbd4): fe9f7e18 -> ffffffff81b239ec / __schedule+636
>> -> ffffffff833a1004
>> orc_ip(ffffffff8312bbd8): fe9f7e15 -> ffffffff81b239ed / __schedule+637
>> -> ffffffff833a100a
>> orc_ip(ffffffff8312bbdc): fe9f7e13 -> ffffffff81b239ef / __schedule+639
>> -> ffffffff833a1010
>> orc_ip(ffffffff8312bbe0): fe9f7e11 -> ffffffff81b239f1 / __schedule+641
>> -> ffffffff833a1016
>> orc_ip(ffffffff8312bbe4): fe9f7e0f -> ffffffff81b239f3 / __schedule+643
>> -> ffffffff833a101c
>> orc_ip(ffffffff8312bbe8): fe9f7e0d -> ffffffff81b239f5 / __schedule+645
>> -> ffffffff833a1022
>> orc_ip(ffffffff8312bbec): fe9f7e0e -> ffffffff81b239fa / __schedule+650
>> -> ffffffff833a1028
>> found: ffffffff8312bbcc index: 0
>> orc_entry: ffffffff833a0ff8 sp_offset: 104 bp_offset: -48 sp_reg: 5
>> bp_reg: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb08 textaddr: ffffffff81b239cb -> spo: 104 bpo: -48 spr:
>> 5 bpr: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb08 prev_sp: ff77abc4e81cfb78 framesize: 96
>> rsp: ff77abc4e81cfb08 prev_sp: ff77abc4e81cfb78 prev_bp: ff77abc4e81cfb48
>> -> ff2e34ecbd3ca2c0
>> #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
>> < enable BT_CHECK_CALLER for schedule >
>> orc_find:
>> ip: ffffffff81b23e2d idx: 45630
>> start = orc_lookup[45630]: 419286
>> stop = orc_lookup[45631] + 1: 419298
>> ip table start: ffffffff8312bbfc
>> unwind table start: ffffffff833a1040
>> __orc_find:
>> ip: ffffffff81b23e2d num_entries: 12
>> orc_ip(ffffffff8312bbfc): fe9f81e4 -> ffffffff81b23de0 / schedule+16
->
>> ffffffff833a1040
>> orc_ip(ffffffff8312bc00): fe9f8254 -> ffffffff81b23e54 / schedule+132
->
>> ffffffff833a1046
>> orc_ip(ffffffff8312bc04): fe9f8251 -> ffffffff81b23e55 / schedule+133
->
>> ffffffff833a104c
>> orc_ip(ffffffff8312bc08): fe9f8252 -> ffffffff81b23e5a / schedule+138
->
>> ffffffff833a1052
>> orc_ip(ffffffff8312bc0c): fe9f8260 -> ffffffff81b23e6c / schedule+156
->
>> ffffffff833a1058
>> orc_ip(ffffffff8312bc10): fe9f825d -> ffffffff81b23e6d / schedule+157
->
>> ffffffff833a105e
>> orc_ip(ffffffff8312bc14): fe9f825e -> ffffffff81b23e72 / schedule+162
->
>> ffffffff833a1064
>> orc_ip(ffffffff8312bc18): fe9f827d -> ffffffff81b23e95 / schedule+197
->
>> ffffffff833a106a
>> orc_ip(ffffffff8312bc1c): fe9f8284 -> ffffffff81b23ea0 / yield+0 ->
>> ffffffff833a1070
>> orc_ip(ffffffff8312bc20): fe9f8298 -> ffffffff81b23eb8 / yield+24 ->
>> ffffffff833a1076
>> orc_ip(ffffffff8312bc24): fe9f829c -> ffffffff81b23ec0 / io_schedule+0
->
>> ffffffff833a107c
>> orc_ip(ffffffff8312bc28): fe9f829e -> ffffffff81b23ec6 / io_schedule+6
->
>> ffffffff833a1082
>> found: ffffffff8312bbfc index: 0
>> orc_entry: ffffffff833a1040 sp_offset: 24 bp_offset: -16 sp_reg: 5
>> bp_reg: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb70 textaddr: ffffffff81b23e2d -> spo: 24 bpo: -16 spr: 5
>> bpr: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb70 prev_sp: ff77abc4e81cfb90 framesize: 16
>> rsp: ff77abc4e81cfb70 prev_sp: ff77abc4e81cfb90 prev_bp: ff77abc4e81cfb80
>> -> 101047510
>> #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
>> < enable BT_CHECK_CALLER for schedule_timeout >
>> orc_find:
>> ip: ffffffff81b2c9e8 idx: 45769
>> start = orc_lookup[45769]: 420322
>> stop = orc_lookup[45770] + 1: 420333
>> ip table start: ffffffff8312cc2c
>> unwind table start: ffffffff833a2888
>> __orc_find:
>> ip: ffffffff81b2c9e8 num_entries: 11
>> orc_ip(ffffffff8312cc2c): fe9ffcc3 -> ffffffff81b2c8ef /
>> usleep_range_state+31 -> ffffffff833a2888
>> orc_ip(ffffffff8312cc30): fe9ffd18 -> ffffffff81b2c948 /
>> usleep_range_state+120 -> ffffffff833a288e
>> orc_ip(ffffffff8312cc34): fe9ffd15 -> ffffffff81b2c949 /
>> usleep_range_state+121 -> ffffffff833a2894
>> orc_ip(ffffffff8312cc38): fe9ffd12 -> ffffffff81b2c94a /
>> usleep_range_state+122 -> ffffffff833a289a
>> orc_ip(ffffffff8312cc3c): fe9ffd10 -> ffffffff81b2c94c /
>> usleep_range_state+124 -> ffffffff833a28a0
>> orc_ip(ffffffff8312cc40): fe9ffd11 -> ffffffff81b2c951 /
>> usleep_range_state+129 -> ffffffff833a28a6
>> orc_ip(ffffffff8312cc44): fe9ffd12 -> ffffffff81b2c956 /
>> usleep_range_state+134 -> ffffffff833a28ac
>> orc_ip(ffffffff8312cc48): fe9ffd18 -> ffffffff81b2c960 /
>> schedule_timeout+0 -> ffffffff833a28b2
>> orc_ip(ffffffff8312cc4c): fe9ffd1b -> ffffffff81b2c967 /
>> schedule_timeout+7 -> ffffffff833a28b8
>> orc_ip(ffffffff8312cc50): fe9ffd18 -> ffffffff81b2c968 /
>> schedule_timeout+8 -> ffffffff833a28be
>> orc_ip(ffffffff8312cc54): fe9ffd18 -> ffffffff81b2c96c /
>> schedule_timeout+12 -> ffffffff833a28c4
>> found: ffffffff8312cc54 index: 10
>> orc_entry: ffffffff833a28c4 sp_offset: 104 bp_offset: -24 sp_reg: 5
>> bp_reg: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb88 textaddr: ffffffff81b2c9e8 -> spo: 104 bpo: -24 spr:
>> 5 bpr: 1 type: 0 end: 0
>> rsp: ff77abc4e81cfb88 prev_sp: ff77abc4e81cfbf8 framesize: 96
>> rsp: ff77abc4e81cfb88 prev_sp: ff77abc4e81cfbf8 prev_bp: ff77abc4e81cfbe0
>> -> 2710
>> < ignoring __wait_for_common() -- makes indirect call and NOT
>> schedule_timeout()>
>> < ignoring vfio_unregister_group_dev() -- calls
>> wait_for_completion_interruptible_timeout() and NOT schedule_timeout()>
>> < ignoring vfio_pci_core_unregister_device() -- calls
>> vfio_unregister_group_dev() and NOT schedule_timeout()>
>> < ignoring vfio_pci_remove() -- calls vfio_pci_core_unregister_device() and
>> NOT schedule_timeout()>
>> < ignoring pci_device_remove() -- makes indirect call and NOT
>> schedule_timeout()>
>> < ignoring device_release_driver_internal() -- calls device_remove() and
>> NOT schedule_timeout()>
>> < ignoring unbind_store() -- calls device_driver_detach() and NOT
>> schedule_timeout()>
>> < ignoring kernfs_fop_write_iter() -- makes indirect call and NOT
>> schedule_timeout()>
>> < ignoring new_sync_write() -- makes indirect call and NOT
>> schedule_timeout()>
>> < ignoring vfs_write() -- calls new_sync_write() and NOT
schedule_timeout()>
>> < ignoring ksys_write() -- calls vfs_write() and NOT schedule_timeout()>
>> < ignoring do_syscall_64() -- makes indirect call and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring syscall_exit_to_user_mode() -- calls __context_tracking_enter()
>> and NOT schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring syscall_exit_to_user_mode() -- calls __context_tracking_enter()
>> and NOT schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring __context_tracking_enter() -- calls __this_cpu_preempt_check()
>> and NOT schedule_timeout()>
>> < ignoring syscall_exit_to_user_mode() -- calls __context_tracking_enter()
>> and NOT schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring do_syscall_64() -- calls syscall_exit_to_user_mode() and NOT
>> schedule_timeout()>
>> < ignoring __context_tracking_enter() -- calls __this_cpu_preempt_check()
>> and NOT schedule_timeout()>
>> < ignoring entry_SYSCALL_64_after_hwframe() -- calls do_syscall_64() and
>> NOT schedule_timeout()>
>> < exception frame at: ff77abc4e81cff58 >
>> RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
>> RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000000000047cdbb
>> RDX: 000000000000000c RSI: 000000c000097798 RDI: 0000000000000009
>> RBP: 000000c0000975f8 R8: 0000000000000001 R9: 000000c00098d680
>> R10: 000000000000000c R11: 0000000000000216 R12: 000000c000097688
>> R13: 0000000000000000 R14: 000000c0006c3520 R15: 00007f5e359946b7
>> ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
>> crash>
>>
>> Thanks.
>> Lianbo
>>
>>
>>> Thanks,
>>> Kazu
>>>
>>>> Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
>>>> ---
>>>> x86_64.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/x86_64.c b/x86_64.c
>>>> index 42ade4817ad9..16850d98dc2d 100644
>>>> --- a/x86_64.c
>>>> +++ b/x86_64.c
>>>> @@ -4487,7 +4487,8 @@ x86_64_function_called_by(ulong rip)
>>>> */
>>>> if (sp) {
>>>> if ((STREQ(sp->name,
"schedule_timeout_interruptible") ||
>>>> - STREQ(sp->name,
"schedule_timeout_uninterruptible")))
>>>> + STREQ(sp->name,
"schedule_timeout_uninterruptible") ||
>>>> + STREQ(sp->name,
>>> "wait_for_completion_interruptible_timeout")))
>>>> sp =
symbol_search("schedule_timeout");
>>>>
>>>> if (STREQ(sp->name, "__cond_resched"))
>>> >