Blob Blame History Raw
From 7c1bd80cc216e7255bfabb94222676b51ab6868e Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin@gmail.com>
Date: Fri, 18 May 2018 03:49:44 +1000
Subject: [PATCH] KVM: PPC: Book3S HV: Send kvmppc_bad_interrupt NMIs to Linux
 handlers

References: bsc#1061840
Patch-mainline: v4.18-rc1
Git-commit: 7c1bd80cc216e7255bfabb94222676b51ab6868e

It's possible to take a SRESET or MCE in these paths due to a bug
in the host code or a NMI IPI, etc. A recent bug attempting to load
a virtual address from real mode gave th complete but cryptic error,
abridged:

      Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
      LE SMP NR_CPUS=2048 NUMA PowerNV
      CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted
      NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
      REGS: c000000fff76dd80 TRAP: 0200   Not tainted
      MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
      CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
      NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
      LR [c0000000000c2430] do_tlbies+0x230/0x2f0

Sending the NMIs through the Linux handlers gives a nicer output:

      Severe Machine check interrupt [Not recovered]
        NIP [c0000000000155ac]: perf_trace_tlbie+0x2c/0x1a0
        Initiator: CPU
        Error type: Real address [Load (bad)]
          Effective address: d00017fffcc01a28
      opal: Machine check interrupt unrecoverable: MSR(RI=0)
      opal: Hardware platform error: Unrecoverable Machine Check exception
      CPU: 0 PID: 6700 Comm: qemu-system-ppc Tainted: G   M
      NIP:  c0000000000155ac LR: c0000000000c23c0 CTR: c000000000015580
      REGS: c000000fff9e9d80 TRAP: 0200   Tainted: G   M
      MSR:  9000000000201001 <SF,HV,ME,LE>  CR: 48082222  XER: 00000000
      CFAR: 000000010cbc1a30 DAR: d00017fffcc01a28 DSISR: 00000040 SOFTE: 3
      NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
      LR [c0000000000c23c0] do_tlbies+0x1c0/0x280

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Acked-by: Michal Suchanek <msuchanek@suse.de>
---
 arch/powerpc/kvm/book3s_hv_builtin.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 2b127586be30..d4a3f4da409b 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -18,6 +18,7 @@
 #include <linux/cma.h>
 #include <linux/bitops.h>
 
+#include <asm/asm-prototypes.h>
 #include <asm/cputable.h>
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
@@ -633,7 +634,19 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
 
 void kvmppc_bad_interrupt(struct pt_regs *regs)
 {
-	die("Bad interrupt in KVM entry/exit code", regs, SIGABRT);
+	/*
+	 * 100 could happen at any time, 200 can happen due to invalid real
+	 * address access for example (or any time due to a hardware problem).
+	 */
+	if (TRAP(regs) == 0x100) {
+		get_paca()->in_nmi++;
+		system_reset_exception(regs);
+		get_paca()->in_nmi--;
+	} else if (TRAP(regs) == 0x200) {
+		machine_check_exception(regs);
+	} else {
+		die("Bad interrupt in KVM entry/exit code", regs, SIGABRT);
+	}
 	panic("Bad KVM trap");
 }
 
-- 
2.13.7