Hi Dave,
I've been staring at an issue in crash all afternoon trying to work out what happened
to the current clock value printed by timer -r. I've got output from the timer -r
command that looks like (the vmcore has 315+ days of uptime) this:
CPU: 0 HRTIMER_CPU_BASE: ffff883f7fa0d7a0
CLOCK: 0 HRTIMER_CLOCK_BASE: ffff883f7fa0d7e0 [ktime_get]
CURRENT
9017759896290448
SOFTEXPIRES EXPIRES HRTIMER FUNCTION
27464498630000000 27464498630000000 ffff883f7fa0da40 ffffffff810d0440
<tick_sched_timer>
27464498643441992 27464498643491992 ffff880243f5bd60 ffffffff8109ae70
<hrtimer_wakeup>
27464498690255938 27464498690755937 ffff8842d7c1b908 ffffffff8109ae70
<hrtimer_wakeup>
...
The time under the CURRENT heading is way less than the softexpires/expires which really
didn't look correct so I dug into the crash code and found that the calculation in
dump_hrtimer_clock_base() can overflow for a largish value of current_time (I did the
calculations involved in working out what current_time was by hand to make sure that they
would overflow and they did, but I haven't included those workings). The value above
works out to be the amount over the maximum value of an unsigned 64bit in that would have
been calculated (i.e. it wrapped).
When I changed it from:
now = current_time * 1000000000LL / machdep->hz + offset;
to
now = current_time * (1000000000LL / machdep->hz) + offset;
It works as I expected it to. I've tested with that change and the fixed output is:
CPU: 0 HRTIMER_CPU_BASE: ffff883f7fa0d7a0
CLOCK: 0 HRTIMER_CLOCK_BASE: ffff883f7fa0d7e0 [ktime_get]
CURRENT
27464503970000000
SOFTEXPIRES EXPIRES HRTIMER FUNCTION
27464498630000000 27464498630000000 ffff883f7fa0da40 ffffffff810d0440
<tick_sched_timer>
27464498643441992 27464498643491992 ffff880243f5bd60 ffffffff8109ae70
<hrtimer_wakeup>
27464498690255938 27464498690755937 ffff8842d7c1b908 ffffffff8109ae70
<hrtimer_wakeup>
There's a second one in dump_hrtimer_base():
now = current_time * 1000000000LL / machdep->hz;
that may have similar issues but it wasn't impacting on the vmcore I was looking at
but I changed it anyway.
Thanks
Shane
P.S. cut and pasted the diff output not sure if it's going to work correctly or not
because of that.
---
--- a/kernel.c 2016-11-30 13:56:29.000000000 -0500
+++ b/kernel.c 2017-02-06 02:23:35.179164828 -0500
@@ -7263,7 +7263,7 @@ dump_hrtimer_clock_base(const void *hrti
offset = 0;
if (VALID_MEMBER(hrtimer_clock_base_offset))
offset = ktime_to_ns(base + OFFSET(hrtimer_clock_base_offset));
- now = current_time * 1000000000LL / machdep->hz + offset;
+ now = current_time * (1000000000LL / machdep->hz) + offset;
dump_active_timers(base, now);
}
@@ -7285,7 +7285,7 @@ dump_hrtimer_base(const void *hrtimer_ba
/* get current time(uptime) */
get_uptime(NULL, ¤t_time);
- now = current_time * 1000000000LL / machdep->hz;
+ now = current_time * (1000000000LL / machdep->hz);
dump_active_timers(base, now);
}