From 63344c0941d42640d082097b8090d4e749499328 Mon Sep 17 00:00:00 2001
From: Jiri Wiesner <jwiesner@suse.de>
Date: Wed, 13 Dec 2023 10:42:42 +0100
Subject: [PATCH v2] clocksource: disable watchdog checks on TSC when TSC is
 watchdog
Patch-mainline: Never, ignored by upstream
References: bsc#1215885

Change the semantics of the tsc=watchdog option to always remove the
CLOCK_SOURCE_MUST_VERIFY bit from the TSC clocksource so as to provide an
alternative to the tsc=nowatchdog option, which is used routinely to
prevent switches to the HPET clocksource on machines with hardware issues.

Currently, the TSC does not get checked on machines with a stable TSC (4
NUMA nodes or less, CONSTANT_TSC, NONSTOP_TSC and TSC_ADJUST), and the
tsc=watchdog option allows one to have the PMTMR (acpi_pm) checked
instead. There are production machines that do not qualify for having the
clocksource watchdog disabled (usually on account of having more than 4
NUMA nodes) and that may also have a malfunctioning CPU that causes a
watchdog check failure within a few days of a reboot, resulting in the
current clocksource being switched to the HPET. Solving a hardware issue
or replacing the machine may take an inordinate amount of time, while the
production machine is still needed to do its job.

It is possible to pass tsc=nowatchdog to the kernel, which disables the
clocksource watchdog on the TSC, effectively leaving the operator of the
production machine blind to any clocksource malfunction. The semantics of
the recently introduced tsc=watchdog option, 0051293c5330 ("clocksource:
Enable TSC watchdog checking of HPET and PMTMR only when requested"), are
changed by this patch so that the TSC does not get checked when
tsc=watchdog is specified regardless of the TSC passing or not passing the
criteria for being judged to be stable. This way, the TSC acts as a
watchdog checking other clocksources (HPET, PMTMR) and a failed check
does not result in the current clocksource (TSC) getting marked unstable.
Passing tsc=watchdog has the advantage of the operator being informed
that the machine may be experiencing clocksource issues.

There is one side effect of the semantic change of tsc=watchdog - it fixes
the HPET not having its CLOCK_SOURCE_MUST_VERIFY bit set as intended by
efc8b329c7fd ("clocksource: Verify HPET and PMTMR when TSC unverified").
When the system is booting up, the HPET does not have its
CLOCK_SOURCE_MUST_VERIFY bit set on account of the order in which
clocksources are initialized in x86_late_time_init(). The HPET
initialization and clocksource registration always come before the TSC
init function unsets the CLOCK_SOURCE_MUST_VERIFY bit.

Fixes: efc8b329c7fd ("clocksource: Verify HPET and PMTMR when TSC unverified")
Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
---
 Documentation/admin-guide/kernel-parameters.txt | 9 +++++----
 arch/x86/include/asm/time.h                     | 2 +-
 arch/x86/kernel/hpet.c                          | 2 +-
 arch/x86/kernel/tsc.c                           | 7 +++----
 drivers/clocksource/acpi_pm.c                   | 2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)
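
Note for reviewers: the flag handling described in the changelog can be
modelled in user space. The sketch below is only an illustration under
stated assumptions, not kernel code: struct tsc_model, MODEL_MUST_VERIFY
and all model_*() helpers are hypothetical names, and the TSC is assumed
to start out with only the must-verify bit set. It contrasts the old
predicate, which read the TSC flags at HPET registration time, with the
new one, which depends only on the command-line state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's CLOCK_SOURCE_MUST_VERIFY bit. */
#define MODEL_MUST_VERIFY 0x02u

/* Simplified model of the state the patched code consults. */
struct tsc_model {
	bool tsc_clocksource_reliable;	/* TSC met the stability criteria */
	bool no_tsc_watchdog;		/* tsc=nowatchdog in effect */
	bool tsc_as_watchdog;		/* tsc=watchdog in effect */
};

/* Old predicate: also required the TSC's must-verify bit to be clear. */
static bool model_old_predicate(const struct tsc_model *m,
				unsigned int tsc_flags_at_registration)
{
	return !(tsc_flags_at_registration & MODEL_MUST_VERIFY) &&
	       m->tsc_as_watchdog && !m->no_tsc_watchdog;
}

/* New predicate: independent of the TSC flags, hence of init order. */
static bool model_tsc_as_watchdog(const struct tsc_model *m)
{
	return m->tsc_as_watchdog && !m->no_tsc_watchdog;
}

/* TSC flags after the patched condition in tsc_init(). */
static unsigned int model_tsc_flags(const struct tsc_model *m)
{
	unsigned int flags = MODEL_MUST_VERIFY;	/* assumed initial state */

	if (m->tsc_clocksource_reliable || m->no_tsc_watchdog ||
	    m->tsc_as_watchdog)
		flags &= ~MODEL_MUST_VERIFY;
	return flags;
}

/* Flags the HPET or PMTMR would receive at registration. */
static unsigned int model_hpet_flags(const struct tsc_model *m)
{
	return model_tsc_as_watchdog(m) ? MODEL_MUST_VERIFY : 0u;
}

/* Walks through the three command-line cases on an unreliable TSC. */
static void model_selfcheck(void)
{
	struct tsc_model wd = { .tsc_as_watchdog = true };
	struct tsc_model nowd = { .no_tsc_watchdog = true };
	struct tsc_model none = { 0 };

	/* tsc=watchdog: TSC unchecked, HPET/PMTMR checked. */
	assert(model_tsc_flags(&wd) == 0u);
	assert(model_hpet_flags(&wd) == MODEL_MUST_VERIFY);
	/* Old predicate fails while the TSC bit is still set at HPET init. */
	assert(!model_old_predicate(&wd, MODEL_MUST_VERIFY));

	/* tsc=nowatchdog: nothing is checked against the TSC. */
	assert(model_tsc_flags(&nowd) == 0u);
	assert(model_hpet_flags(&nowd) == 0u);

	/* Neither option: the TSC itself remains subject to verification. */
	assert(model_tsc_flags(&none) == MODEL_MUST_VERIFY);
	assert(model_hpet_flags(&none) == 0u);
}
```

Calling model_selfcheck() from a small main() exercises all three cases.
The third assertion in the tsc=watchdog case corresponds to the ordering
problem from the changelog: with the bit still set on the TSC when the
HPET registers, the old predicate could not return true.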

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5235d8b45cd6..efe17aa2fd5b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5924,10 +5924,11 @@
 			obtained from HW or FW using either an MSR or CPUID(0x15).
 			Warn if the difference is more than 500 ppm.
 			[x86] watchdog: Use TSC as the watchdog clocksource with
-			which to check other HW timers (HPET or PM timer), but
-			only on systems where TSC has been deemed trustworthy.
-			This will be suppressed by an earlier tsc=nowatchdog and
-			can be overridden by a later tsc=nowatchdog.  A console
+			which to check other HW timers (HPET or PM timer).
+			Disables watchdog checks on TSC, even on systems where
+			TSC has not been deemed trustworthy. This will be
+			suppressed by an earlier tsc=nowatchdog and can be
+			overridden by a later tsc=nowatchdog. A console
 			message will flag any such suppression or overriding.
 
 	tsc_early_khz=  [X86] Skip early TSC calibration and use the given
diff --git a/arch/x86/include/asm/time.h b/arch/x86/include/asm/time.h
index a53961c64a56..a5dcde179d6d 100644
--- a/arch/x86/include/asm/time.h
+++ b/arch/x86/include/asm/time.h
@@ -8,7 +8,7 @@
 extern void hpet_time_init(void);
 extern void time_init(void);
 extern bool pit_timer_init(void);
-extern bool tsc_clocksource_watchdog_disabled(void);
+extern bool tsc_clocksource_as_watchdog(void);
 
 extern struct clock_event_device *global_clock_event;
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index e723209441c1..c7a73bef2d24 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -1091,7 +1091,7 @@ int __init hpet_enable(void)
 	if (!hpet_counting())
 		goto out_nohpet;
 
-	if (tsc_clocksource_watchdog_disabled())
+	if (tsc_clocksource_as_watchdog())
 		clocksource_hpet.flags |= CLOCK_SOURCE_MUST_VERIFY;
 	clocksource_register_hz(&clocksource_hpet, (u32)hpet_freq);
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 4fb132f0ff37..ba0559aa8bfd 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1203,10 +1203,9 @@ static void __init tsc_disable_clocksource_watchdog(void)
 	clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
 }
 
-bool tsc_clocksource_watchdog_disabled(void)
+bool tsc_clocksource_as_watchdog(void)
 {
-	return !(clocksource_tsc.flags & CLOCK_SOURCE_MUST_VERIFY) &&
-	       tsc_as_watchdog && !no_tsc_watchdog;
+	return tsc_as_watchdog && !no_tsc_watchdog;
 }
 
 static void __init check_system_tsc_reliable(void)
@@ -1591,7 +1590,7 @@ void __init tsc_init(void)
 		return;
 	}
 
-	if (tsc_clocksource_reliable || no_tsc_watchdog)
+	if (tsc_clocksource_reliable || no_tsc_watchdog || tsc_as_watchdog)
 		tsc_disable_clocksource_watchdog();
 
 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 82338773602c..9b5dad94713e 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -211,7 +211,7 @@ static int __init init_acpi_pm_clocksource(void)
 		return -ENODEV;
 	}
 
-	if (tsc_clocksource_watchdog_disabled())
+	if (tsc_clocksource_as_watchdog())
 		clocksource_acpi_pm.flags |= CLOCK_SOURCE_MUST_VERIFY;
 	return clocksource_register_hz(&clocksource_acpi_pm, PMTMR_TICKS_PER_SEC);
 }
-- 
2.35.3