Dario Faggioli 6ec5f5
From: Jim Mattson <jmattson@google.com>
Date: Wed, 19 Oct 2022 14:36:20 -0700
Subject: KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRS
Git-commit: 2e7eab81425ad6c875f2ed47c0ce01e78afc38a5
Patch-mainline: v6.2-rc1
References: bsc#1206992 CVE-2022-2196

According to Intel's document on Indirect Branch Restricted
Speculation, "Enabling IBRS does not prevent software from controlling
the predicted targets of indirect branches of unrelated software
executed later at the same predictor mode (for example, between two
different user applications, or two different virtual machines). Such
isolation can be ensured through use of the Indirect Branch Predictor
Barrier (IBPB) command." This applies to both basic and enhanced IBRS.

Since L1 and L2 VMs share hardware predictor modes (guest-user and
guest-kernel), hardware IBRS is not sufficient to virtualize
IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts,
hardware IBRS is actually sufficient in practice, even though it isn't
sufficient architecturally.)

For virtual CPUs that support IBRS, add an indirect branch prediction
barrier on emulated VM-exit, to ensure that the predicted targets of
indirect branches executed in L1 cannot be controlled by software that
was executed in L2.

Since we typically don't intercept guest writes to IA32_SPEC_CTRL,
perform the IBPB at emulated VM-exit regardless of the current
IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be
deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is
clear at emulated VM-exit.

This is CVE-2022-2196.

Fixes: 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02")
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
---
 arch/x86/kvm/vmx/nested.c | 11 +++++++++++
 arch/x86/kvm/vmx/vmx.c    |  6 ++++--
 2 files changed, 15 insertions(+), 2 deletions(-)
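Reading aid (not part of the change): the decision added to
nested_vmx_vmexit() can be modelled in user space as sketched here. This
is only an illustration under stated assumptions, not KVM code. The names
struct model_vcpu, model_ibpb() and model_nested_vmexit() are hypothetical
and exist nowhere in the kernel, and the barrier is replaced by a counter.
The sketch shows that the barrier depends only on whether IBRS is
advertised to the vCPU, never on the current IA32_SPEC_CTRL.IBRS value.

```c
#include <stdbool.h>

/*
 * User-space model only; none of these names exist in KVM.
 * It mirrors the condition that the nested.c hunk adds.
 */
struct model_vcpu {
	/* Assumed stand-in for guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL). */
	bool guest_has_spec_ctrl;
	/* L1's IA32_SPEC_CTRL.IBRS at emulated VM-exit; deliberately unused. */
	bool spec_ctrl_ibrs;
	/* Number of modelled barriers issued so far. */
	unsigned int ibpb_count;
};

/* Stand-in for indirect_branch_prediction_barrier(): only counts calls. */
static void model_ibpb(struct model_vcpu *v)
{
	v->ibpb_count++;
}

/* Models the L2 to L1 transition, i.e. an emulated VM-exit. */
static void model_nested_vmexit(struct model_vcpu *v)
{
	/* The IBRS bit is not consulted: guest writes to it are not intercepted. */
	if (v->guest_has_spec_ctrl)
		model_ibpb(v);
}
```

In this model, a vCPU with guest_has_spec_ctrl set gets one barrier per
call to model_nested_vmexit() even when spec_ctrl_ibrs is clear, and a
vCPU without it gets none.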
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 892791019968..61c83424285c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4798,6 +4798,17 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
 
 	vmx_switch_vmcs(vcpu, &vmx->vmcs01);
 
+	/*
+	 * If IBRS is advertised to the vCPU, KVM must flush the indirect
+	 * branch predictors when transitioning from L2 to L1, as L1 expects
+	 * hardware (KVM in this case) to provide separate predictor modes.
+	 * Bare metal isolates VMX root (host) from VMX non-root (guest), but
+	 * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
+	 * separate modes for L2 vs L1.
+	 */
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+		indirect_branch_prediction_barrier();
+
 	/* Update any VMCS fields that might have changed while L2 ran */
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cb40f724d8cc..3f31c46c306e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1348,8 +1348,10 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
 
 		/*
 		 * No indirect branch prediction barrier needed when switching
-		 * the active VMCS within a guest, e.g. on nested VM-Enter.
-		 * The L1 VMM can protect itself with retpolines, IBPB or IBRS.
+		 * the active VMCS within a vCPU, unless IBRS is advertised to
+		 * the vCPU.  To minimize the number of IBPBs executed, KVM
+		 * performs IBPB on nested VM-Exit (a single nested transition
+		 * may switch the active VMCS multiple times).
 		 */
 		if (!buddy || WARN_ON_ONCE(buddy->vmcs != prev))
 			indirect_branch_prediction_barrier();