mirror-linux/kernel/time
Thomas Gleixner c06343be0b clocksource: Reduce watchdog readout delay limit to prevent false positives
The "valid" readout delay between the two reads of the watchdog is larger
than the valid delta between the resulting watchdog and clocksource
intervals, which results in false positive watchdog results.

Assume TSC is the clocksource and HPET is the watchdog and both have a
uncertainty margin of 250us (default). The watchdog readout does:

  1) wdnow = read(HPET);
  2) csnow = read(TSC);
  3) wdend = read(HPET);

The valid window for the delta between #1 and #3 is calculated by the
uncertainty margins of the watchdog and the clocksource:

   m = 2 * watchdog.uncertainty_margin + cs.uncertainty margin;

which results in 750us for the TSC/HPET case.

The actual interval comparison uses a smaller margin:

   m = watchdog.uncertainty_margin + cs.uncertainty margin;

which results in 500us for the TSC/HPET case.

That means the following scenario will trigger the watchdog:

 Watchdog cycle N:

 1)       wdnow[N] = read(HPET);
 2)       csnow[N] = read(TSC);
 3)       wdend[N] = read(HPET);

Assume the delay between #1 and #2 is 100us and the delay between #1 and

 Watchdog cycle N + 1:

 4)       wdnow[N + 1] = read(HPET);
 5)       csnow[N + 1] = read(TSC);
 6)       wdend[N + 1] = read(HPET);

If the delay between #4 and #6 is within the 750us margin then any delay
between #4 and #5 which is larger than 600us will fail the interval check
and mark the TSC unstable because the intervals are calculated against the
previous value:

    wd_int = wdnow[N + 1] - wdnow[N];
    cs_int = csnow[N + 1] - csnow[N];

Putting the above delays in place this results in:

    cs_int = (wdnow[N + 1] + 610us) - (wdnow[N] + 100us);
 -> cs_int = wd_int + 510us;

which is obviously larger than the allowed 500us margin and results in
marking TSC unstable.

Fix this by using the same margin as the interval comparison. If the delay
between two watchdog reads is larger than that, then the readout was either
disturbed by interconnect congestion, NMIs or SMIs.

Fixes: 4ac1dd3245 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/lkml/20250602223251.496591-1-daniel@quora.org/
Link: https://patch.msgid.link/87bjjxc9dq.ffs@tglx
2026-01-21 11:33:11 +01:00
..
Kconfig time: Introduce auxiliary POSIX clocks 2025-06-19 14:28:22 +02:00
Makefile time: Build generic update_vsyscall() only with generic time vDSO 2025-09-04 11:23:50 +02:00
alarmtimer.c time: Fix spelling mistakes in comments 2025-09-21 10:02:02 +02:00
clockevents.c treewide: Update email address 2026-01-11 06:09:11 -10:00
clocksource-wdtest.c clocksource/wdtest: Print time values for short udelay(1) 2025-01-15 19:49:13 +01:00
clocksource.c clocksource: Reduce watchdog readout delay limit to prevent false positives 2026-01-21 11:33:11 +01:00
hrtimer.c hrtimer: Fix softirq base check in update_needs_ipi() 2026-01-13 11:04:41 +01:00
itimer.c timers/itimer: Avoid direct access to hrtimer clockbase 2025-09-09 12:27:17 +02:00
jiffies.c sysctl: Move proc_doulongvec_ms_jiffies_minmax to kernel/time/jiffies.c 2025-11-27 15:45:37 +01:00
namespace.c ns: drop custom reference count initialization for initial namespaces 2025-11-11 10:01:32 +01:00
ntp.c ntp: Use ktime_get_ntp_seconds() 2025-06-19 14:28:24 +02:00
ntp_internal.h ntp: Rename __do_adjtimex() to ntp_adjtimex() 2025-06-19 14:28:23 +02:00
posix-clock.c Networking changes for 6.15. 2025-03-26 21:48:21 -07:00
posix-cpu-timers.c hrtimer: Store time as ktime_t in restart block 2025-11-14 16:31:19 +01:00
posix-stubs.c
posix-timers.c Update to the time/timers core: 2025-12-02 09:58:33 -08:00
posix-timers.h timekeeping: Add minimal posix-timers support for auxiliary clocks 2025-06-27 20:13:12 +02:00
sched_clock.c syscore: Pass context data to callbacks 2025-11-14 10:01:52 +01:00
sleep_timeout.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
test_udelay.c time: Add MODULE_DESCRIPTION() to time test modules 2024-06-03 11:18:50 +02:00
tick-broadcast-hrtimer.c time: Switch to hrtimer_setup() 2025-02-18 10:32:33 +01:00
tick-broadcast.c treewide: Update email address 2026-01-11 06:09:11 -10:00
tick-common.c treewide: Update email address 2026-01-11 06:09:11 -10:00
tick-internal.h tick: Do not set device to detached state in tick_shutdown() 2025-09-09 13:39:00 +02:00
tick-legacy.c
tick-oneshot.c treewide: Update email address 2026-01-11 06:09:11 -10:00
tick-sched.c treewide: Update email address 2026-01-11 06:09:11 -10:00
tick-sched.h tick/sched: Fix struct tick_sched doc warnings 2024-04-01 10:36:35 +02:00
time.c time: export timespec64_add_safe() symbol 2025-09-03 16:51:08 -07:00
time_test.c time: Add MODULE_DESCRIPTION() to time test modules 2024-06-03 11:18:50 +02:00
timeconst.bc
timeconv.c
timecounter.c time/timecounter: Fix the lie that struct cyclecounter is const 2025-07-01 15:38:25 +02:00
timekeeping.c timekeeping: Adjust the leap state for the correct auxiliary timekeeper 2026-01-20 10:18:53 +01:00
timekeeping.h
timekeeping_debug.c timekeeping: Add percpu counter for tracking floor swap events 2024-10-10 10:20:46 +02:00
timekeeping_internal.h timekeeping: Provide ktime_get_ntp_seconds() 2025-06-19 14:28:23 +02:00
timer.c Tree wide cleanup of the remaining users of in_irq() which got replaced 2025-12-02 10:18:49 -08:00
timer_list.c hrtimer: Remove hrtimer_clock_base:: Get_time 2025-09-09 12:27:18 +02:00
timer_migration.c timers/migration: Exclude isolated cpus from hierarchy 2025-11-20 20:17:32 +01:00
timer_migration.h timers/migration: Rename 'online' bit to 'available' 2025-11-20 20:17:31 +01:00
vsyscall.c vdso/vsyscall: Avoid slow division loop in auxiliary clock update 2025-09-03 11:55:11 +02:00