Ivan T. Ivanov 80e5dc
From: Ryan Roberts <ryan.roberts@arm.com>
Date: Thu, 3 Nov 2022 15:05:06 +0000
Subject: KVM: arm64: Fix kvm init failure when mode!=vhe and VA_BITS=52.
Git-commit: 579d7ebe90a332cc5b6c02db9250fd0816a64f63
Patch-mainline: v6.2-rc1
References: git-fixes

For nvhe and protected modes, the hyp stage 1 page-tables were previously
configured to have the same number of VA bits as the kernel's idmap.
However, for kernel configs with VA_BITS=52 and where the kernel is
loaded in physical memory below 48 bits, the idmap VA bits is actually
smaller than the kernel's normal stage 1 VA bits. This can lead to
kernel addresses that can't be mapped into the hypervisor, leading to
kvm initialization failure during boot:

  kvm [1]: IPA Size Limit: 48 bits
  kvm [1]: Cannot map world-switch code
  kvm [1]: error initializing Hyp mode: -34

Fix this by ensuring that the hyp stage 1 VA size is the maximum of
what's used for the idmap and the regular kernel stage 1. At the same
time, refactor the code so that the hyp VA bits is only calculated in
one place.

Prior to 7ba8f2b2d652, the idmap was always 52 bits for a 52 VA bits
kernel and therefore the hyp stage1 was also always 52 bits.

Fixes: 7ba8f2b2d652 ("arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
[maz: commit message fixes]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221103150507.32948-2-ryan.roberts@arm.com

Acked-by: Ivan T. Ivanov <iivanov@suse.de>
---
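Illustrative note: the arithmetic described in the commit message can be
sketched in standalone C as below. This is only a sketch under stated
assumptions, not kernel code: every ex_-prefixed name is hypothetical, and
it assumes the T0SZ field sits in the low 6 bits of the TCR value and
encodes 64 minus the number of VA bits.

```c
#include <stdint.h>

/* Assumption: T0SZ is a 6-bit field at bit offset 0 of the TCR value. */
#define EX_T0SZ_OFFSET 0
#define EX_TXSZ_WIDTH  6
#define EX_T0SZ_MASK   (((UINT64_C(1) << EX_TXSZ_WIDTH) - 1) << EX_T0SZ_OFFSET)

/* Encode a VA size as a T0SZ field value: T0SZ = 64 - va_bits. */
static inline uint64_t ex_tcr_t0sz(uint32_t va_bits)
{
	return ((uint64_t)(64 - va_bits) << EX_T0SZ_OFFSET) & EX_T0SZ_MASK;
}

/* Decode the number of VA bits from a T0SZ field value. */
static inline uint32_t ex_va_bits_from_t0sz(uint64_t t0sz)
{
	return 64 - (uint32_t)((t0sz & EX_T0SZ_MASK) >> EX_T0SZ_OFFSET);
}

/*
 * Pick the hyp stage 1 VA size as the larger of the idmap VA size and
 * the regular kernel stage 1 VA size, so both ranges stay mappable.
 */
static inline uint32_t ex_hyp_va_bits(uint64_t idmap_t0sz, uint32_t kernel_bits)
{
	uint32_t idmap_bits = ex_va_bits_from_t0sz(idmap_t0sz);

	return idmap_bits > kernel_bits ? idmap_bits : kernel_bits;
}
```

With a 48-bit idmap (T0SZ field of 16) and a 52-bit kernel stage 1, the
sketch selects 52 bits, which encodes back to a T0SZ field of 12. That is
the failing configuration from the commit message.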
 arch/arm64/kvm/arm.c |   20 +++-----------------
 arch/arm64/kvm/mmu.c |   28 +++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 18 deletions(-)
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1447,7 +1447,7 @@ static int kvm_init_vector_slots(void)
 	return 0;
 }
 
-static void cpu_prepare_hyp_mode(int cpu)
+static void cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
 {
 	struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
 	unsigned long tcr;
@@ -1463,23 +1463,9 @@ static void cpu_prepare_hyp_mode(int cpu
 
 	params->mair_el2 = read_sysreg(mair_el1);
 
-	/*
-	 * The ID map may be configured to use an extended virtual address
-	 * range. This is only the case if system RAM is out of range for the
-	 * currently configured page size and VA_BITS, in which case we will
-	 * also need the extended virtual range for the HYP ID map, or we won't
-	 * be able to enable the EL2 MMU.
-	 *
-	 * However, at EL2, there is only one TTBR register, and we can't switch
-	 * between translation tables *and* update TCR_EL2.T0SZ at the same
-	 * time. Bottom line: we need to use the extended range with *both* our
-	 * translation tables.
-	 *
-	 * So use the same T0SZ value we use for the ID map.
-	 */
 	tcr = (read_sysreg(tcr_el1) & TCR_EL2_MASK) | TCR_EL2_RES1;
 	tcr &= ~TCR_T0SZ_MASK;
-	tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
+	tcr |= TCR_T0SZ(hyp_va_bits);
 	params->tcr_el2 = tcr;
 
 	params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
@@ -1937,7 +1923,7 @@ static int init_hyp_mode(void)
 		}
 
 		/* Prepare the CPU initialization parameters */
-		cpu_prepare_hyp_mode(cpu);
+		cpu_prepare_hyp_mode(cpu, hyp_va_bits);
 	}
 
 	if (is_protected_kvm_enabled()) {
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1351,6 +1351,8 @@ static struct kvm_pgtable_mm_ops kvm_hyp
 int kvm_mmu_init(u32 *hyp_va_bits)
 {
 	int err;
+	u32 idmap_bits;
+	u32 kernel_bits;
 
 	hyp_idmap_start = __pa_symbol(__hyp_idmap_text_start);
 	hyp_idmap_start = ALIGN_DOWN(hyp_idmap_start, PAGE_SIZE);
@@ -1364,7 +1366,31 @@ int kvm_mmu_init(u32 *hyp_va_bits)
 	 */
 	BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
 
-	*hyp_va_bits = 64 - ((idmap_t0sz & TCR_T0SZ_MASK) >> TCR_T0SZ_OFFSET);
+	/*
+	 * The ID map may be configured to use an extended virtual address
+	 * range. This is only the case if system RAM is out of range for the
+	 * currently configured page size and VA_BITS_MIN, in which case we will
+	 * also need the extended virtual range for the HYP ID map, or we won't
+	 * be able to enable the EL2 MMU.
+	 *
+	 * However, in some cases the ID map may be configured for fewer than
+	 * the number of VA bits used by the regular kernel stage 1. This
+	 * happens when VA_BITS=52 and the kernel image is placed in PA space
+	 * below 48 bits.
+	 *
+	 * At EL2, there is only one TTBR register, and we can't switch between
+	 * translation tables *and* update TCR_EL2.T0SZ at the same time. Bottom
+	 * line: we need to use the extended range with *both* our translation
+	 * tables.
+	 *
+	 * So use the maximum of the idmap VA bits and the regular kernel stage
+	 * 1 VA bits to assure that the hypervisor can both ID map its code page
+	 * and map any kernel memory.
+	 */
+	idmap_bits = 64 - ((idmap_t0sz & TCR_T0SZ_MASK) >> TCR_T0SZ_OFFSET);
+	kernel_bits = vabits_actual;
+	*hyp_va_bits = max(idmap_bits, kernel_bits);
+
 	kvm_debug("Using %u-bit virtual addresses at EL2\n", *hyp_va_bits);
 	kvm_debug("IDMAP page: %lx\n", hyp_idmap_start);
 	kvm_debug("HYP VA range: %lx:%lx\n",