From: mgorman <mgorman@suse.com>
Date: Tue, 5 Jan 2021 11:26:33 +0000
Subject: [PATCH] intel_idle: Disable ACPI _CST on Haswell
References: bsc#1177399, bsc#1180347, bsc#1180141
Patch-mainline: Never, upstream will always use ACPI information unless disabled by kernel command line

Numerous workload regressions have repeatedly been bisected to commit
6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems"), but only on
a set of Haswell machines that all have the same CPU.

netperf UDP_STREAM
                                 5.5.0             5.5.0-rc2             5.5.0-rc2                 5.6.0             5.9.0-rc8
                               vanilla     sle15-sp2-pre-cst  sle15-sp2-enable-cst               vanilla               vanilla
Hmean send-64        203.21 (   0.00%)     206.43 *   1.58%*     176.89 * -12.95%*     181.18 * -10.84%*     194.45 *  -4.31%*
Hmean send-128       401.40 (   0.00%)     414.19 *   3.19%*     355.84 * -11.35%*     364.13 *  -9.29%*     387.83 *  -3.38%*
Hmean send-256       786.69 (   0.00%)     799.70 (   1.65%)     700.65 * -10.94%*     719.82 *  -8.50%*     756.40 *  -3.85%*
Hmean send-1024     3059.57 (   0.00%)    3106.57 *   1.54%*    2659.62 * -13.07%*    2793.58 *  -8.69%*    3006.95 *  -1.72%*
Hmean send-2048     5976.66 (   0.00%)    6102.64 (   2.11%)    5249.34 * -12.17%*    5392.04 *  -9.78%*    5805.02 *  -2.87%*
Hmean send-3312     9145.09 (   0.00%)    9304.85 *   1.75%*    8197.25 * -10.36%*    8398.36 *  -8.17%*    9120.88 (  -0.26%)
Hmean send-4096    10871.63 (   0.00%)   11129.76 *   2.37%*    9667.68 * -11.07%*    9929.70 *  -8.66%*   10863.41 (  -0.08%)
Hmean send-8192    17747.35 (   0.00%)   17969.19 (   1.25%)   15652.91 * -11.80%*   16081.20 *  -9.39%*   17316.13 *  -2.43%*
Hmean send-16384   29187.16 (   0.00%)   29418.75 *   0.79%*   26296.64 *  -9.90%*   27028.18 *  -7.40%*   26941.26 *  -7.69%*
Hmean recv-64        203.21 (   0.00%)     206.43 *   1.58%*     176.89 * -12.95%*     181.18 * -10.84%*     194.45 *  -4.31%*
Hmean recv-128       401.40 (   0.00%)     414.19 *   3.19%*     355.84 * -11.35%*     364.13 *  -9.29%*     387.83 *  -3.38%*
Hmean recv-256       786.69 (   0.00%)     799.70 (   1.65%)     700.65 * -10.94%*     719.82 *  -8.50%*     756.40 *  -3.85%*
Hmean recv-1024     3059.57 (   0.00%)    3106.57 *   1.54%*    2659.62 * -13.07%*    2793.58 *  -8.69%*    3006.95 *  -1.72%*
Hmean recv-2048     5976.66 (   0.00%)    6102.64 (   2.11%)    5249.34 * -12.17%*    5392.00 *  -9.78%*    5805.02 *  -2.87%*
Hmean recv-3312     9145.09 (   0.00%)    9304.85 *   1.75%*    8197.25 * -10.36%*    8398.36 *  -8.17%*    9120.88 (  -0.26%)
Hmean recv-4096    10871.63 (   0.00%)   11129.76 *   2.37%*    9667.68 * -11.07%*    9929.70 *  -8.66%*   10863.38 (  -0.08%)
Hmean recv-8192    17747.35 (   0.00%)   17969.19 (   1.25%)   15652.91 * -11.80%*   16081.20 *  -9.39%*   17315.96 *  -2.43%*
Hmean recv-16384   29187.13 (   0.00%)   29418.72 *   0.79%*   26296.63 *  -9.90%*   27028.18 *  -7.40%*   26941.23 *  -7.69%*

The impact of that commit appears to be that the C3 state is disabled,
leaving C1E as the lowest available state. It's not clear why this is so
problematic on Haswell, as C1E should not be inherently hazardous unless
it's somehow interfering with turbo-boost.

QA also reported similar problems during the SP3 cycle, specifically on
Haswell CPUs, although there is evidence that Skylake and CascadeLake may
also be affected depending on the BIOS configuration. In general, it seems
that either the default tables on Haswell are buggy or the defaults
selected are poor. The problem is common enough that this patch disables
the use of ACPI _CST tables by default for Haswell server CPUs
(idle_cpu_hsx).

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
drivers/idle/intel_idle.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 8858f7fc18a8..540e332feacd 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1089,7 +1089,7 @@ static const struct idle_cpu idle_cpu_hsw __initconst = {
 static const struct idle_cpu idle_cpu_hsx __initconst = {
 	.state_table = hsw_cstates,
 	.disable_promotion_to_c1e = true,
-	.use_acpi = true,
+	.use_acpi = false,
 };
 
 static const struct idle_cpu idle_cpu_bdw __initconst = {