From: Benjamin Berg <bberg@redhat.com>
Date: Wed, 9 Oct 2019 17:54:24 +0200
Subject: x86/mce: Lower throttling MCE messages' priority to warning
Git-commit: 9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7
Patch-mainline: 5.5-rc1
References: git-fixes

On modern CPUs it is quite normal that the temperature limits are
reached and the CPU is throttled. In fact, often the thermal design is
not sufficient to cool the CPU at full load and limits can quickly be
reached when a burst in load happens. This will even happen with
technologies like RAPL limiting the long-term power consumption of
the package.

Also, these limits are "softer", as Srinivas explains:

"CPU temperature doesn't have to hit max(TjMax) to get these warnings.
OEMs ha[ve] an ability to program a threshold where a thermal interrupt
can be generated. In some systems the offset is 20C+ (Read only value).

In recent systems, there is another offset on top of it which can be
programmed by OS, once some agent can adjust power limits dynamically.
By default this is set to low by the firmware, which I guess [is] the
prime motivation of Benjamin to submit the patch."

So these messages do not usually indicate a hardware issue (e.g.
insufficient cooling). Log them as warnings to avoid confusion about
their severity.

 [ bp: Massage commit message. ]

Signed-off-by: Benjamin Berg <bberg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Christian Kellner <ckellner@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191009155424.249277-1-bberg@redhat.com
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -185,7 +185,7 @@ static void therm_throt_process(bool new
 	/* if we just entered the thermal event */
 	if (new_event) {
 		if (event == THERMAL_THROTTLING_EVENT)
-			pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
+			pr_warn("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
 				this_cpu,
 				level == CORE_LEVEL ? "Core" : "Package",
 				state->count);