Borislav Petkov f3a7e3
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date: Thu, 19 May 2022 20:29:11 -0700
Subject: x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
Git-commit: 8cb861e9e3c9a55099ad3d08e1a3b653d29c33ca
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
Patch-mainline: Queued in tip for v5.19
References: bsc#1199650 CVE-2022-21166 CVE-2022-21127 CVE-2022-21123 CVE-2022-21125 CVE-2022-21180

Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst.

These vulnerabilities are broadly categorized as:

Device Register Partial Write (DRPW):
  Some endpoint MMIO registers incorrectly handle writes that are
  smaller than the register size. Instead of aborting the write or only
  copying the correct subset of bytes (for example, 2 bytes for a 2-byte
  write), more bytes than specified by the write transaction may be
  written to the register. On some processors, this may expose stale
  data from the fill buffers of the core that created the write
  transaction.

Shared Buffers Data Sampling (SBDS):
  After propagators may have moved data around the uncore and copied
  stale data into client core fill buffers, processors affected by MFBDS
  can leak data from the fill buffer.

Shared Buffers Data Read (SBDR):
  It is similar to Shared Buffers Data Sampling (SBDS) except that the
  data is directly read into the architectural software-visible state.

An attacker can use these vulnerabilities to extract data from CPU fill
buffers using MDS and TAA methods. Mitigate this by clearing the CPU fill
buffers using the VERW instruction before returning to a user or a
guest.

On CPUs not affected by MDS and TAA, user applications cannot sample data
from CPU fill buffers using MDS or TAA. A guest with MMIO access can
still use DRPW or SBDR to extract data architecturally. Mitigate this with
the VERW instruction to clear fill buffers before VMENTER for MMIO-capable
guests.

Add a kernel parameter mmio_stale_data={off|full|full,nosmt} to control
the mitigation.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
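The mmio_stale_data={off|full|full,nosmt} values described in the
changelog can be illustrated with a small userspace sketch of the
parsing rules. It is modelled on mmio_stale_data_parse_cmdline() in
the bugs.c hunk, but the names mmio_mode, mmio_opts and mmio_parse()
are hypothetical and exist only for this example. The kernel handler
additionally returns early on CPUs without X86_BUG_MMIO_STALE_DATA,
which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the boot-time state; not kernel code. */
enum mmio_mode { MMIO_OFF, MMIO_VERW };

struct mmio_opts {
	enum mmio_mode mode;	/* the default corresponds to "full" */
	bool nosmt;		/* set only by "full,nosmt" */
};

/* Apply one mmio_stale_data= value; unknown strings leave opts unchanged. */
static int mmio_parse(const char *str, struct mmio_opts *opts)
{
	assert(opts != NULL);

	if (!str)
		return -1;	/* the kernel handler returns -EINVAL here */

	if (!strcmp(str, "off")) {
		opts->mode = MMIO_OFF;
	} else if (!strcmp(str, "full")) {
		opts->mode = MMIO_VERW;
	} else if (!strcmp(str, "full,nosmt")) {
		opts->mode = MMIO_VERW;
		opts->nosmt = true;
	}

	return 0;
}
```

Starting from { MMIO_VERW, false } matches the documented default:
omitting the parameter behaves like mmio_stale_data=full. As in the
patch, an unrecognized value is silently ignored rather than rejected.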
 Documentation/admin-guide/kernel-parameters.txt |   36 +++++++
 arch/x86/include/asm/nospec-branch.h            |    2 
 arch/x86/kernel/cpu/bugs.c                      |  111 +++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c                          |    3 
 4 files changed, 148 insertions(+), 4 deletions(-)
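The selection logic added to bugs.c can be summarized as two pure
functions. This is a simplified sketch, not the kernel code: the
kernel flips static keys and reads the ARCH_CAPABILITIES MSR, whereas
the enum clear_scope, struct cpu_state and both function names below
are hypothetical and take plain booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the kernel uses static keys rather than an enum. */
enum clear_scope {
	CLEAR_NONE,		/* no buffer clearing selected */
	CLEAR_USER_AND_GUEST,	/* stands in for the mds_user_clear key */
	CLEAR_MMIO_GUEST,	/* stands in for the mmio_stale_data_clear key */
};

struct cpu_state {
	bool bug_mmio;		/* X86_BUG_MMIO_STALE_DATA */
	bool bug_mds;		/* X86_BUG_MDS */
	bool bug_taa;		/* X86_BUG_TAA */
	bool has_rtm;		/* X86_FEATURE_RTM */
	bool mitigations_off;	/* global mitigations=off */
	bool param_off;		/* mmio_stale_data=off */
};

/* Which buffer-clearing scope mmio_select_mitigation() would enable. */
static enum clear_scope mmio_clear_scope(const struct cpu_state *c)
{
	assert(c != NULL);

	if (!c->bug_mmio || c->mitigations_off || c->param_off)
		return CLEAR_NONE;

	/* Also affected by MDS or TAA: clear for host and VMM. */
	if (c->bug_mds || (c->bug_taa && c->has_rtm))
		return CLEAR_USER_AND_GUEST;

	/* Otherwise clear only before entering MMIO-capable guests. */
	return CLEAR_MMIO_GUEST;
}

/* Microcode check: explicit FB_CLEAR, or MD_CLEAR + L1D_FLUSH without MDS_NO. */
static bool fb_clear_enumerated(bool fb_clear, bool md_clear,
				bool flush_l1d, bool mds_no)
{
	return fb_clear || (md_clear && flush_l1d && !mds_no);
}
```

When fb_clear_enumerated() would be false, the patch reports
"Vulnerable: Clear CPU buffers attempted, no microcode" instead of
"Mitigation: Clear CPU buffers".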

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -313,6 +313,8 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_alway
 DECLARE_STATIC_KEY_FALSE(mds_user_clear);
 DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
 
+DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+
 #include <asm/segment.h>
 
 /**
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -43,6 +43,7 @@ static void __init l1tf_select_mitigatio
 static void __init mds_select_mitigation(void);
 static void __init md_clear_update_mitigation(void);
 static void __init taa_select_mitigation(void);
+static void __init mmio_select_mitigation(void);
 static void __init srbds_select_mitigation(void);
 
 /* The base value of the SPEC_CTRL MSR that always has to be preserved. */
@@ -77,6 +78,10 @@ EXPORT_SYMBOL_GPL(mds_user_clear);
 DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
 EXPORT_SYMBOL_GPL(mds_idle_clear);
 
+/* Controls CPU Fill buffer clear before KVM guest MMIO accesses */
+DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -111,11 +116,13 @@ void __init check_bugs(void)
 	l1tf_select_mitigation();
 	mds_select_mitigation();
 	taa_select_mitigation();
+	mmio_select_mitigation();
 	srbds_select_mitigation();
 
 	/*
-	 * As MDS and TAA mitigations are inter-related, update and print their
-	 * mitigation after TAA mitigation selection is done.
+	 * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update
+	 * and print their mitigation after MDS, TAA and MMIO Stale Data
+	 * mitigation selection is done.
 	 */
 	md_clear_update_mitigation();
 
@@ -375,6 +382,90 @@ static int __init tsx_async_abort_parse_
 early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
 
 #undef pr_fmt
+#define pr_fmt(fmt)	"MMIO Stale Data: " fmt
+
+enum mmio_mitigations {
+	MMIO_MITIGATION_OFF,
+	MMIO_MITIGATION_UCODE_NEEDED,
+	MMIO_MITIGATION_VERW,
+};
+
+/* Default mitigation for Processor MMIO Stale Data vulnerabilities */
+static enum mmio_mitigations mmio_mitigation __ro_after_init = MMIO_MITIGATION_VERW;
+static bool mmio_nosmt __ro_after_init = false;
+
+static const char * const mmio_strings[] = {
+	[MMIO_MITIGATION_OFF]		= "Vulnerable",
+	[MMIO_MITIGATION_UCODE_NEEDED]	= "Vulnerable: Clear CPU buffers attempted, no microcode",
+	[MMIO_MITIGATION_VERW]		= "Mitigation: Clear CPU buffers",
+};
+
+static void __init mmio_select_mitigation(void)
+{
+	u64 ia32_cap;
+
+	if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
+	    cpu_mitigations_off()) {
+		mmio_mitigation = MMIO_MITIGATION_OFF;
+		return;
+	}
+
+	if (mmio_mitigation == MMIO_MITIGATION_OFF)
+		return;
+
+	ia32_cap = x86_read_arch_cap_msr();
+
+	/*
+	 * Enable CPU buffer clear mitigation for host and VMM, if also affected
+	 * by MDS or TAA. Otherwise, enable mitigation for VMM only.
+	 */
+	if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) &&
+					      boot_cpu_has(X86_FEATURE_RTM)))
+		static_branch_enable(&mds_user_clear);
+	else
+		static_branch_enable(&mmio_stale_data_clear);
+
+	/*
+	 * Check if the system has the right microcode.
+	 *
+	 * CPU Fill buffer clear mitigation is enumerated by either an explicit
+	 * FB_CLEAR or by the presence of both MD_CLEAR and L1D_FLUSH on MDS
+	 * affected systems.
+	 */
+	if ((ia32_cap & ARCH_CAP_FB_CLEAR) ||
+	    (boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
+	     boot_cpu_has(X86_FEATURE_FLUSH_L1D) &&
+	     !(ia32_cap & ARCH_CAP_MDS_NO)))
+		mmio_mitigation = MMIO_MITIGATION_VERW;
+	else
+		mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;
+
+	if (mmio_nosmt || cpu_mitigations_auto_nosmt())
+		cpu_smt_disable(false);
+}
+
+static int __init mmio_stale_data_parse_cmdline(char *str)
+{
+	if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
+		return 0;
+
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off")) {
+		mmio_mitigation = MMIO_MITIGATION_OFF;
+	} else if (!strcmp(str, "full")) {
+		mmio_mitigation = MMIO_MITIGATION_VERW;
+	} else if (!strcmp(str, "full,nosmt")) {
+		mmio_mitigation = MMIO_MITIGATION_VERW;
+		mmio_nosmt = true;
+	}
+
+	return 0;
+}
+early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);
+
+#undef pr_fmt
 #define pr_fmt(fmt)     "" fmt
 
 static void __init md_clear_update_mitigation(void)
@@ -386,19 +477,31 @@ static void __init md_clear_update_mitig
 		goto out;
 
 	/*
-	 * mds_user_clear is now enabled. Update MDS mitigation, if
-	 * necessary.
+	 * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data
+	 * mitigation, if necessary.
 	 */
 	if (mds_mitigation == MDS_MITIGATION_OFF &&
 	    boot_cpu_has_bug(X86_BUG_MDS)) {
 		mds_mitigation = MDS_MITIGATION_FULL;
 		mds_select_mitigation();
 	}
+	if (taa_mitigation == TAA_MITIGATION_OFF &&
+	    boot_cpu_has_bug(X86_BUG_TAA)) {
+		taa_mitigation = TAA_MITIGATION_VERW;
+		taa_select_mitigation();
+	}
+	if (mmio_mitigation == MMIO_MITIGATION_OFF &&
+	    boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) {
+		mmio_mitigation = MMIO_MITIGATION_VERW;
+		mmio_select_mitigation();
+	}
 out:
 	if (boot_cpu_has_bug(X86_BUG_MDS))
 		pr_info("MDS: %s\n", mds_strings[mds_mitigation]);
 	if (boot_cpu_has_bug(X86_BUG_TAA))
 		pr_info("TAA: %s\n", taa_strings[taa_mitigation]);
+	if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
+		pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]);
 }
 
 #undef pr_fmt
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6742,6 +6742,9 @@ static void vmx_vcpu_run(struct kvm_vcpu
 		vmx_l1d_flush(vcpu);
 	else if (static_branch_unlikely(&mds_user_clear))
 		mds_clear_cpu_buffers();
+	else if (static_branch_unlikely(&mmio_stale_data_clear) &&
+		 kvm_arch_has_assigned_device(vcpu->kvm))
+		mds_clear_cpu_buffers();
 
 	if (vcpu->arch.cr2 != read_cr2())
 		write_cr2(vcpu->arch.cr2);
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2736,6 +2736,7 @@
 					       kvm.nx_huge_pages=off [X86]
 					       no_entry_flush [PPC]
 					       no_uaccess_flush [PPC]
+					       mmio_stale_data=off [X86]
 
 				Exceptions:
 					       This does not have any effect on
@@ -2757,6 +2758,7 @@
 				Equivalent to: l1tf=flush,nosmt [X86]
 					       mds=full,nosmt [X86]
 					       tsx_async_abort=full,nosmt [X86]
+					       mmio_stale_data=full,nosmt [X86]
 
 	mminit_loglevel=
 			[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -2766,6 +2768,40 @@
 			log everything. Information is printed at KERN_DEBUG
 			so loglevel=8 may also need to be specified.
 
+	mmio_stale_data=
+			[X86,INTEL] Control mitigation for the Processor
+			MMIO Stale Data vulnerabilities.
+
+			Processor MMIO Stale Data is a class of
+			vulnerabilities that may expose data after an MMIO
+			operation. Exposed data could originate or end in
+			the same CPU buffers as affected by MDS and TAA.
+			Therefore, similar to MDS and TAA, the mitigation
+			is to clear the affected CPU buffers.
+
+			This parameter controls the mitigation. The
+			options are:
+
+			full       - Enable mitigation on vulnerable CPUs
+
+			full,nosmt - Enable mitigation and disable SMT on
+				     vulnerable CPUs.
+
+			off        - Unconditionally disable mitigation
+
+			On MDS or TAA affected machines,
+			mmio_stale_data=off can be prevented by an active
+			MDS or TAA mitigation as these vulnerabilities are
+			mitigated with the same mechanism so in order to
+			disable this mitigation, you need to specify
+			mds=off and tsx_async_abort=off too.
+
+			Not specifying this option is equivalent to
+			mmio_stale_data=full.
+
+			For details see:
+			Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+
 	module.sig_enforce
 			[KNL] When CONFIG_MODULE_SIG is set, this means that
 			modules without (valid) signatures will fail to load.