----- Original Message -----
Hello Dave,
The attached patches add support for displaying hrtimers.
For detailed information, please refer to the patches.
--
Regards
Qiao Nuohan
Hi Qiao,
I ran this option through my set of sample vmcores, and have a couple of
comments/suggestions.
A few of the sample vmcores generate bad rb_node.last virtual addresses,
and when they do, the command is aborted. For example:
$ crash -s 2.6.32-313.el6_softlockup/vmcore 2.6.32-313.el6_softlockup/vmlinux.gz
crash> timer -r
UPTIME: 300039(1000HZ)
cpu: 0
clock: 0
.base: ffff8800282115a8
.offset: 1354043557752205725
.get_time: ktime_get_real
EXPIRES HRTIMER FUNCTION
(empty)
clock: 1
.base: ffff8800282115e8
.offset: 0
.get_time: ktime_get
EXPIRES HRTIMER FUNCTION
300040000000-300040000000 ffff8800282116a0 ffffffff810a4b70 <tick_sched_timer>
3660368239466-3660368239466 ffff880224f07c68 ffffffff81071c00 <it_real_fn>
3660700491472-3660700491472 ffff880224f07068 ffffffff81071c00 <it_real_fn>
14461470907794-14461570907794 ffff88022750fa68 ffffffff81098160 <hrtimer_wakeup>
clock: 2
.base: ffff880028211628
.offset: 0
.get_time:
timer: invalid kernel virtual address: 1bc2f type: "rb_node last"
crash>
Here are some other vmcore examples where it failed similarly:
timer: invalid kernel virtual address: 1 type: "rb_node last"
timer: invalid kernel virtual address: 1ffffffff type: "rb_node last"
timer: invalid kernel virtual address: 1 type: "rb_node last"
timer: invalid kernel virtual address: 1 type: "rb_node last"
Perhaps there could be a way to pre-verify the addresses with
accessible(), and if the address is bogus, display an error message,
but allow the command to continue on with the other cpus?
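Something along these lines, just as a rough, untested sketch -- the
display_hrtimer_clock_base() function and its arguments are made-up names
rather than anything from your patch, but accessible() and error() are the
standard calls from defs.h:

    #include "defs.h"   /* accessible(), error(), INFO, TRUE/FALSE, ulong */

    /* sketch only: verify rb_node.last before dereferencing it */
    static int
    display_hrtimer_clock_base(int cpu, int clock, ulong rb_node_last)
    {
            if (!accessible(rb_node_last)) {
                    error(INFO,
                        "cpu %d clock %d: invalid rb_node.last address: %lx\n",
                        cpu, clock, rb_node_last);
                    return FALSE;   /* skip this clock base, keep going */
            }
            /* ... read and display the hrtimers queued on this base ... */
            return TRUE;
    }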
And secondly, I ran into numerous examples of runaway commands
that loop endlessly over the same hrtimer entries. The dumpfiles were
typically ones taken with the "snap.so" extension module or with
"virsh dump" -- but not always.
For example:
$ crash -s 2.6.18-152.el5_HVM_virsh_dump/vmcore 2.6.18-152.el5_HVM_virsh_dump/vmlinux.gz
... [ cut ] ...
cpu: 1
clock: 0
.base: ffff810009580aa0
.get_time: ktime_get_real
EXPIRES HRTIMER FUNCTION
(empty)
clock: 1
.base: ffff810009580ae0
.get_time: ktime_get
EXPIRES HRTIMER FUNCTION
901601121855 ffff810034107ee8 ffffffff800a2536 <hrtimer_wakeup>
930465261855 ffff81003c22b5b0 ffffffff80093ced <it_real_fn>
922835889855 ffff81003333bee8 ffffffff800a2536 <hrtimer_wakeup>
930465261855 ffff81003c22b5b0 ffffffff80093ced <it_real_fn>
922835889855 ffff81003333bee8 ffffffff800a2536 <hrtimer_wakeup>
930465261855 ffff81003c22b5b0 ffffffff80093ced <it_real_fn>
922835889855 ffff81003333bee8 ffffffff800a2536 <hrtimer_wakeup>
930465261855 ffff81003c22b5b0 ffffffff80093ced <it_real_fn>
922835889855 ffff81003333bee8 ffffffff800a2536 <hrtimer_wakeup>
930465261855 ffff81003c22b5b0 ffffffff80093ced <it_real_fn>
922835889855 ffff81003333bee8 ffffffff800a2536 <hrtimer_wakeup>
... [ forever ] ...
$ crash -s snapshot-3.1.7-1.fc16/vmcore snapshot-3.1.7-1.fc16/vmlinux.gz
crash> timer -r
... [ cut ] ...
cpu: 6
clock: 0
.base: ffff88003e2ce180
.offset: 0
.get_time: ktime_get
EXPIRES HRTIMER FUNCTION
1689390000000-1689390000000 ffff88003e2ce280 ffffffff8109f650 <tick_sched_timer>
1692446941251-1692446941251 ffff88003e2ce400 ffffffff810d9790 <watchdog_timer_fn>
3628519841476-3628519891476 ffff880033fd5eb8 ffffffff81091b70 <hrtimer_wakeup>
1689390000000-1689390000000 ffff88003e2ce280 ffffffff8109f650 <tick_sched_timer>
1692446941251-1692446941251 ffff88003e2ce400 ffffffff810d9790 <watchdog_timer_fn>
3628519841476-3628519891476 ffff880033fd5eb8 ffffffff81091b70 <hrtimer_wakeup>
1689390000000-1689390000000 ffff88003e2ce280 ffffffff8109f650 <tick_sched_timer>
1692446941251-1692446941251 ffff88003e2ce400 ffffffff810d9790 <watchdog_timer_fn>
3628519841476-3628519891476 ffff880033fd5eb8 ffffffff81091b70 <hrtimer_wakeup>
1689390000000-1689390000000 ffff88003e2ce280 ffffffff8109f650 <tick_sched_timer>
1692446941251-1692446941251 ffff88003e2ce400 ffffffff810d9790 <watchdog_timer_fn>
3628519841476-3628519891476 ffff880033fd5eb8 ffffffff81091b70 <hrtimer_wakeup>
1689390000000-1689390000000 ffff88003e2ce280 ffffffff8109f650 <tick_sched_timer>
... [ forever ] ...
Maybe you can use hq_open()/hq_enter()/hq_close() on the hrtimer addresses
to prevent this from happening, warn the user when it does, and continue on
with the next cpu?
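Roughly like this, as a sketch only -- the dump_hrtimer_base() wrapper and
the next_hrtimer() iterator are made-up names, but the hash-queue calls and
error() are the standard ones from defs.h:

    #include "defs.h"   /* hq_open(), hq_enter(), hq_close(), error(), INFO */

    /* sketch only: bail out if a corrupt rbtree makes the walk revisit
     * an hrtimer it has already displayed */
    static void
    dump_hrtimer_base(int cpu, ulong first_hrtimer)
    {
            ulong hrtimer;

            hq_open();
            for (hrtimer = first_hrtimer; hrtimer;
                 hrtimer = next_hrtimer(hrtimer)) {   /* made-up iterator */
                    if (!hq_enter(hrtimer)) {
                            error(INFO,
                                "cpu %d: duplicate hrtimer entry: %lx\n",
                                cpu, hrtimer);
                            break;   /* warn, then move on to the next cpu */
                    }
                    /* ... display this hrtimer entry ... */
            }
            hq_close();
    }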
Thanks,
Dave