Michal Suchanek 8bdce1
From 2c9ac51b850d84ee496b0a5d832ce66d411ae552 Mon Sep 17 00:00:00 2001
Michal Suchanek 8bdce1
From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Michal Suchanek 8bdce1
Date: Wed, 21 Jul 2021 01:48:29 -0400
Michal Suchanek 8bdce1
Subject: [PATCH] powerpc/perf: Fix PMU callbacks to clear pending PMI before
Michal Suchanek 8bdce1
 resetting an overflown PMC
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
References: bsc#1156395
Michal Suchanek 8bdce1
Patch-mainline: v5.17-rc1
Michal Suchanek 8bdce1
Git-commit: 2c9ac51b850d84ee496b0a5d832ce66d411ae552
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Running perf fuzzer showed below in dmesg logs:
Michal Suchanek 8bdce1
  "Can't find PMC that caused IRQ"
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
This means a PMU exception happened, but none of the PMC's (Performance
Michal Suchanek 8bdce1
Monitor Counter) were found to be overflown. There are some corner cases
Michal Suchanek 8bdce1
that clears the PMCs after PMI gets masked. In such cases, the perf
Michal Suchanek 8bdce1
interrupt handler will not find the active PMC values that had caused
Michal Suchanek 8bdce1
the overflow and thus leads to this message while replaying.
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Case 1: PMU Interrupt happens during replay of other interrupts and
Michal Suchanek 8bdce1
counter values gets cleared by PMU callbacks before replay:
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
During replay of interrupts like timer, __do_irq() and doorbell
Michal Suchanek 8bdce1
exception, we conditionally enable interrupts via may_hard_irq_enable().
Michal Suchanek 8bdce1
This could potentially create a window to generate a PMI. Since irq soft
Michal Suchanek 8bdce1
mask is set to ALL_DISABLED, the PMI will get masked here. We could get
Michal Suchanek 8bdce1
IPIs run before perf interrupt is replayed and the PMU events could
Michal Suchanek 8bdce1
be deleted or stopped. This will change the PMU SPR values and resets
Michal Suchanek 8bdce1
the counters. Snippet of ftrace log showing PMU callbacks invoked in
Michal Suchanek 8bdce1
__do_irq():
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
  <idle>-0 [051] dns. 132025441306354: __do_irq <-call_do_irq
Michal Suchanek 8bdce1
  <idle>-0 [051] dns. 132025441306430: irq_enter <-__do_irq
Michal Suchanek 8bdce1
  <idle>-0 [051] dns. 132025441306503: irq_enter_rcu <-__do_irq
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441306599: xive_get_irq <-__do_irq
Michal Suchanek 8bdce1
  <<>>
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441307770: generic_smp_call_function_single_interrupt <-smp_ipi_demux_relaxed
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441307839: flush_smp_call_function_queue <-smp_ipi_demux_relaxed
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308057: _raw_spin_lock <-event_function
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308206: power_pmu_disable <-perf_pmu_disable
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308337: power_pmu_del <-event_sched_out
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308407: power_pmu_read <-power_pmu_del
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308477: read_pmc <-power_pmu_read
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308590: isa207_disable_pmc <-power_pmu_del
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308663: write_pmc <-power_pmu_del
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308787: power_pmu_event_idx <-perf_event_update_userpage
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308859: rcu_read_unlock_strict <-perf_event_update_userpage
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441308975: power_pmu_enable <-perf_pmu_enable
Michal Suchanek 8bdce1
  <<>>
Michal Suchanek 8bdce1
  <idle>-0 [051] dnH. 132025441311108: irq_exit <-__do_irq
Michal Suchanek 8bdce1
  <idle>-0 [051] dns. 132025441311319: performance_monitor_exception <-replay_soft_interrupts
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Case 2: PMI's masked during local_* operations, example local_add(). If
Michal Suchanek 8bdce1
the local_add() operation happens within a local_irq_save(), replay of
Michal Suchanek 8bdce1
PMI will be during local_irq_restore(). Similar to case 1, this could
Michal Suchanek 8bdce1
also create a window before replay where PMU events gets deleted or
Michal Suchanek 8bdce1
stopped.
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Fix it by updating the PMU callback function power_pmu_disable() to
Michal Suchanek 8bdce1
check for pending perf interrupt. If there is an overflown PMC and
Michal Suchanek 8bdce1
pending perf interrupt indicated in paca, clear the PMI bit in paca to
Michal Suchanek 8bdce1
drop that sample. Clearing of PMI bit is done in power_pmu_disable()
Michal Suchanek 8bdce1
since disable is invoked before any event gets deleted/stopped. With
Michal Suchanek 8bdce1
this fix, if there are more than one event running in the PMU, there is
Michal Suchanek 8bdce1
a chance that we clear the PMI bit for the event which is not getting
Michal Suchanek 8bdce1
deleted/stopped. The other events may still remain active. Hence to make
Michal Suchanek 8bdce1
sure we don't drop valid sample in such cases, another check is added in
Michal Suchanek 8bdce1
power_pmu_enable. This checks if there is an overflown PMC found among
Michal Suchanek 8bdce1
the active events and if so enable back the PMI bit. Two new helper
Michal Suchanek 8bdce1
functions are introduced to clear/set the PMI, ie
Michal Suchanek 8bdce1
clear_pmi_irq_pending() and set_pmi_irq_pending(). Helper function
Michal Suchanek 8bdce1
pmi_irq_pending() is introduced to give a warning if there is pending
Michal Suchanek 8bdce1
PMI bit in paca, but no PMC is overflown.
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Also there are corner cases which result in performance monitor
Michal Suchanek 8bdce1
interrupts being triggered during power_pmu_disable(). This happens
Michal Suchanek 8bdce1
since PMXE bit is not cleared along with disabling of other MMCR0 bits
Michal Suchanek 8bdce1
in the pmu_disable. Such PMI's could leave the PMU running and could
Michal Suchanek 8bdce1
trigger PMI again which will set MMCR0 PMAO bit. This could lead to
Michal Suchanek 8bdce1
spurious interrupts in some corner cases. Example, a timer after
Michal Suchanek 8bdce1
power_pmu_del() which will re-enable interrupts and triggers a PMI again
Michal Suchanek 8bdce1
since PMAO bit is still set. But fails to find valid overflow since PMC
Michal Suchanek 8bdce1
was cleared in power_pmu_del(). Fix that by disabling PMXE along with
Michal Suchanek 8bdce1
disabling of other MMCR0 bits in power_pmu_disable().
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
We can't just replay PMI any time. Hence this approach is preferred
Michal Suchanek 8bdce1
rather than replaying PMI before resetting overflown PMC. Patch also
Michal Suchanek 8bdce1
documents core-book3s on a race condition which can trigger these PMC
Michal Suchanek 8bdce1
messages during idle path in PowerNV.
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
Fixes: f442d004806e ("powerpc/64s: Add support to mask perf interrupts and replay them")
Michal Suchanek 8bdce1
Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
Michal Suchanek 8bdce1
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Michal Suchanek 8bdce1
Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Michal Suchanek 8bdce1
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Michal Suchanek 8bdce1
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Michal Suchanek 8bdce1
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Michal Suchanek 8bdce1
[mpe: Make pmi_irq_pending() return bool, reflow/reword some comments]
Michal Suchanek 8bdce1
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michal Suchanek 8bdce1
Link: https://lore.kernel.org/r/1626846509-1350-2-git-send-email-atrajeev@linux.vnet.ibm.com
Michal Suchanek 8bdce1
Acked-by: Michal Suchanek <msuchanek@suse.de>
Michal Suchanek 8bdce1
---
Michal Suchanek 8bdce1
 arch/powerpc/include/asm/hw_irq.h | 40 +++++++++++++++++++++
Michal Suchanek 8bdce1
 arch/powerpc/perf/core-book3s.c   | 58 ++++++++++++++++++++++++++++++-
Michal Suchanek 8bdce1
 2 files changed, 97 insertions(+), 1 deletion(-)
Michal Suchanek 8bdce1
Michal Suchanek 8bdce1
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
Michal Suchanek 8bdce1
--- a/arch/powerpc/include/asm/hw_irq.h
Michal Suchanek 8bdce1
+++ b/arch/powerpc/include/asm/hw_irq.h
Michal Suchanek 8bdce1
@@ -224,6 +224,42 @@ static inline bool arch_irqs_disabled(void)
Michal Suchanek 8bdce1
 	return arch_irqs_disabled_flags(arch_local_save_flags());
Michal Suchanek 8bdce1
 }
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
+static inline void set_pmi_irq_pending(void)
Michal Suchanek 8bdce1
+{
Michal Suchanek 8bdce1
+	/*
Michal Suchanek 8bdce1
+	 * Invoked from PMU callback functions to set PMI bit in the paca.
Michal Suchanek 8bdce1
+	 * This has to be called with irq's disabled (via hard_irq_disable()).
Michal Suchanek 8bdce1
+	 */
Michal Suchanek 8bdce1
+	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
Michal Suchanek 8bdce1
+		WARN_ON_ONCE(mfmsr() & MSR_EE);
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	get_paca()->irq_happened |= PACA_IRQ_PMI;
Michal Suchanek 8bdce1
+}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+static inline void clear_pmi_irq_pending(void)
Michal Suchanek 8bdce1
+{
Michal Suchanek 8bdce1
+	/*
Michal Suchanek 8bdce1
+	 * Invoked from PMU callback functions to clear the pending PMI bit
Michal Suchanek 8bdce1
+	 * in the paca.
Michal Suchanek 8bdce1
+	 */
Michal Suchanek 8bdce1
+	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
Michal Suchanek 8bdce1
+		WARN_ON_ONCE(mfmsr() & MSR_EE);
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	get_paca()->irq_happened &= ~PACA_IRQ_PMI;
Michal Suchanek 8bdce1
+}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+static inline bool pmi_irq_pending(void)
Michal Suchanek 8bdce1
+{
Michal Suchanek 8bdce1
+	/*
Michal Suchanek 8bdce1
+	 * Invoked from PMU callback functions to check if there is a pending
Michal Suchanek 8bdce1
+	 * PMI bit in the paca.
Michal Suchanek 8bdce1
+	 */
Michal Suchanek 8bdce1
+	if (get_paca()->irq_happened & PACA_IRQ_PMI)
Michal Suchanek 8bdce1
+		return true;
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	return false;
Michal Suchanek 8bdce1
+}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 #ifdef CONFIG_PPC_BOOK3S
Michal Suchanek 8bdce1
 /*
Michal Suchanek 8bdce1
  * To support disabling and enabling of irq with PMI, set of
Michal Suchanek 8bdce1
@@ -408,6 +444,10 @@ static inline void do_hard_irq_enable(void)
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
 static inline void may_hard_irq_enable(void) { }
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
+static inline void clear_pmi_irq_pending(void) { }
Michal Suchanek 8bdce1
+static inline void set_pmi_irq_pending(void) { }
Michal Suchanek 8bdce1
+static inline bool pmi_irq_pending(void) { return false; }
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 #endif /* CONFIG_PPC64 */
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
 #define ARCH_IRQ_INIT_FLAGS	IRQ_NOREQUEST
Michal Suchanek 8bdce1
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
Michal Suchanek 8bdce1
--- a/arch/powerpc/perf/core-book3s.c
Michal Suchanek 8bdce1
+++ b/arch/powerpc/perf/core-book3s.c
Michal Suchanek 8bdce1
@@ -857,6 +857,19 @@ static void write_pmc(int idx, unsigned long val)
Michal Suchanek 8bdce1
 	}
Michal Suchanek 8bdce1
 }
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
+static int any_pmc_overflown(struct cpu_hw_events *cpuhw)
Michal Suchanek 8bdce1
+{
Michal Suchanek 8bdce1
+	int i, idx;
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	for (i = 0; i < cpuhw->n_events; i++) {
Michal Suchanek 8bdce1
+		idx = cpuhw->event[i]->hw.idx;
Michal Suchanek 8bdce1
+		if ((idx) && ((int)read_pmc(idx) < 0))
Michal Suchanek 8bdce1
+			return idx;
Michal Suchanek 8bdce1
+	}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	return 0;
Michal Suchanek 8bdce1
+}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 /* Called from sysrq_handle_showregs() */
Michal Suchanek 8bdce1
 void perf_event_print_debug(void)
Michal Suchanek 8bdce1
 {
Michal Suchanek 8bdce1
@@ -1281,11 +1294,13 @@ static void power_pmu_disable(struct pmu *pmu)
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
 		/*
Michal Suchanek 8bdce1
 		 * Set the 'freeze counters' bit, clear EBE/BHRBA/PMCC/PMAO/FC56
Michal Suchanek 8bdce1
+		 * Also clear PMXE to disable PMI's getting triggered in some
Michal Suchanek 8bdce1
+		 * corner cases during PMU disable.
Michal Suchanek 8bdce1
 		 */
Michal Suchanek 8bdce1
 		val  = mmcr0 = mfspr(SPRN_MMCR0);
Michal Suchanek 8bdce1
 		val |= MMCR0_FC;
Michal Suchanek 8bdce1
 		val &= ~(MMCR0_EBE | MMCR0_BHRBA | MMCR0_PMCC | MMCR0_PMAO |
Michal Suchanek 8bdce1
-			 MMCR0_FC56);
Michal Suchanek 8bdce1
+			 MMCR0_PMXE | MMCR0_FC56);
Michal Suchanek 8bdce1
 		/* Set mmcr0 PMCCEXT for p10 */
Michal Suchanek 8bdce1
 		if (ppmu->flags & PPMU_ARCH_31)
Michal Suchanek 8bdce1
 			val |= MMCR0_PMCCEXT;
Michal Suchanek 8bdce1
@@ -1299,6 +1314,23 @@ static void power_pmu_disable(struct pmu *pmu)
Michal Suchanek 8bdce1
 		mb();
Michal Suchanek 8bdce1
 		isync();
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
+		/*
Michal Suchanek 8bdce1
+		 * Some corner cases could clear the PMU counter overflow
Michal Suchanek 8bdce1
+		 * while a masked PMI is pending. One such case is when
Michal Suchanek 8bdce1
+		 * a PMI happens during interrupt replay and perf counter
Michal Suchanek 8bdce1
+		 * values are cleared by PMU callbacks before replay.
Michal Suchanek 8bdce1
+		 *
Michal Suchanek 8bdce1
+		 * If any PMC corresponding to the active PMU events are
Michal Suchanek 8bdce1
+		 * overflown, disable the interrupt by clearing the paca
Michal Suchanek 8bdce1
+		 * bit for PMI since we are disabling the PMU now.
Michal Suchanek 8bdce1
+		 * Otherwise provide a warning if there is PMI pending, but
Michal Suchanek 8bdce1
+		 * no counter is found overflown.
Michal Suchanek 8bdce1
+		 */
Michal Suchanek 8bdce1
+		if (any_pmc_overflown(cpuhw))
Michal Suchanek 8bdce1
+			clear_pmi_irq_pending();
Michal Suchanek 8bdce1
+		else
Michal Suchanek 8bdce1
+			WARN_ON(pmi_irq_pending());
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 		val = mmcra = cpuhw->mmcr.mmcra;
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
 		/*
Michal Suchanek 8bdce1
@@ -1390,6 +1422,15 @@ static void power_pmu_enable(struct pmu *pmu)
Michal Suchanek 8bdce1
 	 * (possibly updated for removal of events).
Michal Suchanek 8bdce1
 	 */
Michal Suchanek 8bdce1
 	if (!cpuhw->n_added) {
Michal Suchanek 8bdce1
+		/*
Michal Suchanek 8bdce1
+		 * If there is any active event with an overflown PMC
Michal Suchanek 8bdce1
+		 * value, set back PACA_IRQ_PMI which would have been
Michal Suchanek 8bdce1
+		 * cleared in power_pmu_disable().
Michal Suchanek 8bdce1
+		 */
Michal Suchanek 8bdce1
+		hard_irq_disable();
Michal Suchanek 8bdce1
+		if (any_pmc_overflown(cpuhw))
Michal Suchanek 8bdce1
+			set_pmi_irq_pending();
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 		mtspr(SPRN_MMCRA, cpuhw->mmcr.mmcra & ~MMCRA_SAMPLE_ENABLE);
Michal Suchanek 8bdce1
 		mtspr(SPRN_MMCR1, cpuhw->mmcr.mmcr1);
Michal Suchanek 8bdce1
 		if (ppmu->flags & PPMU_ARCH_31)
Michal Suchanek 8bdce1
@@ -2337,6 +2378,14 @@ static void __perf_event_interrupt(struct pt_regs *regs)
Michal Suchanek 8bdce1
 				break;
Michal Suchanek 8bdce1
 			}
Michal Suchanek 8bdce1
 		}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+		/*
Michal Suchanek 8bdce1
+		 * Clear PACA_IRQ_PMI in case it was set by
Michal Suchanek 8bdce1
+		 * set_pmi_irq_pending() when PMU was enabled
Michal Suchanek 8bdce1
+		 * after accounting for interrupts.
Michal Suchanek 8bdce1
+		 */
Michal Suchanek 8bdce1
+		clear_pmi_irq_pending();
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
 		if (!active)
Michal Suchanek 8bdce1
 			/* reset non active counters that have overflowed */
Michal Suchanek 8bdce1
 			write_pmc(i + 1, 0);
Michal Suchanek 8bdce1
@@ -2356,6 +2405,13 @@ static void __perf_event_interrupt(struct pt_regs *regs)
Michal Suchanek 8bdce1
 			}
Michal Suchanek 8bdce1
 		}
Michal Suchanek 8bdce1
 	}
Michal Suchanek 8bdce1
+
Michal Suchanek 8bdce1
+	/*
Michal Suchanek 8bdce1
+	 * During system wide profling or while specific CPU is monitored for an
Michal Suchanek 8bdce1
+	 * event, some corner cases could cause PMC to overflow in idle path. This
Michal Suchanek 8bdce1
+	 * will trigger a PMI after waking up from idle. Since counter values are _not_
Michal Suchanek 8bdce1
+	 * saved/restored in idle path, can lead to below "Can't find PMC" message.
Michal Suchanek 8bdce1
+	 */
Michal Suchanek 8bdce1
 	if (!found && !nmi && printk_ratelimit())
Michal Suchanek 8bdce1
 		printk(KERN_WARNING "Can't find PMC that caused IRQ\n");
Michal Suchanek 8bdce1
 
Michal Suchanek 8bdce1
-- 
Michal Suchanek 8bdce1
2.31.1
Michal Suchanek 8bdce1