Blob Blame History Raw
From: mgorman <mgorman@suse.com>
Date: Tue, 5 Jan 2021 11:26:33 +0000
Subject: [PATCH] intel_idle: Disable ACPI _CST on Haswell

References: bsc#1177399, bsc#1180347, bsc#1180141
Patch-mainline: Never, upstream will always use ACPI information unless disabled by kernel command line

Numerous workload regressions have bisected repeatedly to the commit
6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") but only on
a set of haswell machines that all have the same CPU.

netperf UDP_STREAM
                                       5.5.0              5.5.0-rc2              5.5.0-rc2                  5.6.0              5.9.0-rc8
                                     vanilla      sle15-sp2-pre-cst   sle15-sp2-enable-cst                vanilla                vanilla
Hmean     send-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
Hmean     send-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
Hmean     send-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
Hmean     send-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
Hmean     send-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.04 *  -9.78%*     5805.02 *  -2.87%*
Hmean     send-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
Hmean     send-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.41 (  -0.08%)
Hmean     send-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17316.13 *  -2.43%*
Hmean     send-16384    29187.16 (   0.00%)    29418.75 *   0.79%*    26296.64 *  -9.90%*    27028.18 *  -7.40%*    26941.26 *  -7.69%*
Hmean     recv-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
Hmean     recv-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
Hmean     recv-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
Hmean     recv-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
Hmean     recv-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.00 *  -9.78%*     5805.02 *  -2.87%*
Hmean     recv-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
Hmean     recv-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.38 (  -0.08%)
Hmean     recv-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17315.96 *  -2.43%*
Hmean     recv-16384    29187.13 (   0.00%)    29418.72 *   0.79%*    26296.63 *  -9.90%*    27028.18 *  -7.40%*    26941.23 *  -7.69%*

The impact of the patch appears to be disabling the C3 state leaving C1E
the lowest available state. It's not clear why this is so problematic on
Haswell as C1E should not be inherently hazardous unless it's somehow
interfering with turbo-boost.

QA also reported similar problems during the SP3 cycle on Haswell CPUs
specifically although there is evidence that Skylake and CascadeLake may
also be affected depending on the BIOS configuration. In general, it seems
the default tables on Haswell are either buggy or the defaults selected
are poor. It's common enough that this patch disables using ACPI _CST
tables by default.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 drivers/idle/intel_idle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 8858f7fc18a8..540e332feacd 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1089,7 +1089,7 @@ static const struct idle_cpu idle_cpu_hsw __initconst = {
 static const struct idle_cpu idle_cpu_hsx __initconst = {
 	.state_table = hsw_cstates,
 	.disable_promotion_to_c1e = true,
-	.use_acpi = true,
+	.use_acpi = false,
 };
 
 static const struct idle_cpu idle_cpu_bdw __initconst = {