From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Date: Tue, 18 Jul 2017 14:08:34 +0300
Subject: [PATCH] perf/core: Fix scheduling regression of pinned groups
References: bnc#1060662
Patch-mainline: v4.12.4
Git-commit: 3bda69c1c3993a2bddbae01397d12bfef6054011

commit 3bda69c1c3993a2bddbae01397d12bfef6054011 upstream.

Vince Weaver reported:

> I was tracking down some regressions in my perf_event_test testsuite.
> Some of the tests broke in the 4.11-rc1 timeframe.
>
> I've bisected one of them, this report is about
>     tests/overflow/simul_oneshot_group_overflow
> This test creates an event group containing two sampling events, set
> to overflow to a signal handler (which disables and then refreshes the
> event).
>
> On a good kernel you get the following:
>     Event perf::instructions with period 1000000
>     Event perf::instructions with period 2000000
>     fd 3 overflows: 946 (perf::instructions/1000000)
>     fd 4 overflows: 473 (perf::instructions/2000000)
>     Ending counts:
>         Count 0: 946379875
>         Count 1: 946365218
>
> With the broken kernels you get:
>     Event perf::instructions with period 1000000
>     Event perf::instructions with period 2000000
>     fd 3 overflows: 938 (perf::instructions/1000000)
>     fd 4 overflows: 318 (perf::instructions/2000000)
>     Ending counts:
>         Count 0: 946373080
>         Count 1: 653373058

The root cause of the bug is that the following commit:

  487f05e18a ("perf/core: Optimize event rescheduling on active contexts")

erroneously assumed that an event's 'pinned' setting determines whether the
event belongs to a pinned group or not, but in fact, it's the group
leader's pinned state that matters.

This was discovered by Vince in the test case described above, where two instruction
counters are grouped, the group leader is pinned, but the other event is not;
in the regressed case the counters were off by 33% (the difference between events'
periods), but should be the same within the error margin.

Fix the problem by looking at the group leader's pinning.
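
For illustration only, the difference between the two classifications can be
sketched in a small user-space model. The struct layout and the names
type_by_event(), type_by_leader() and self_check() are hypothetical stand-ins,
not kernel code; only the leader lookup mirrors the hunk in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel types (assumption: not the real layout). */
enum event_type_t {
	EVENT_FLEXIBLE = 0x1,
	EVENT_PINNED   = 0x2,
};

struct perf_event {
	bool pinned;                     /* models event->attr.pinned */
	struct perf_event *group_leader; /* a leader points to itself */
};

/* Regressed behaviour: classify by the event's own 'pinned' bit. */
static enum event_type_t type_by_event(const struct perf_event *event)
{
	return event->pinned ? EVENT_PINNED : EVENT_FLEXIBLE;
}

/* Fixed behaviour: classify by the group leader's 'pinned' bit. */
static enum event_type_t type_by_leader(const struct perf_event *event)
{
	if (event->group_leader != event)
		event = event->group_leader;

	return event->pinned ? EVENT_PINNED : EVENT_FLEXIBLE;
}

/* Pinned leader with a non-pinned sibling, as in the reported test case. */
void self_check(void)
{
	struct perf_event leader = { .pinned = true };
	struct perf_event sibling = { .pinned = false, .group_leader = &leader };

	leader.group_leader = &leader;

	/* The leader is classified the same way by both variants. */
	assert(type_by_event(&leader) == EVENT_PINNED);
	assert(type_by_leader(&leader) == EVENT_PINNED);

	/* Only the sibling differs: its own bit says flexible, its group is pinned. */
	assert(type_by_event(&sibling) == EVENT_FLEXIBLE);
	assert(type_by_leader(&sibling) == EVENT_PINNED);
}
```

In this model the sibling is the only event whose classification changes, which
matches the report: the second counter in the group is the one that falls behind.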

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 487f05e18a ("perf/core: Optimize event rescheduling on active contexts")
Link: http://lkml.kernel.org/r/87lgnmvw7h.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/events/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6c4e523dc1e2..f389166bc0e0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1456,6 +1456,13 @@ static enum event_type_t get_event_type(struct perf_event *event)
 
 	lockdep_assert_held(&ctx->lock);
 
+	/*
+	 * It's 'group type', really, because if our group leader is
+	 * pinned, so are we.
+	 */
+	if (event->group_leader != event)
+		event = event->group_leader;
+
 	event_type = event->attr.pinned ? EVENT_PINNED : EVENT_FLEXIBLE;
 	if (!ctx->task)
 		event_type |= EVENT_CPU;
-- 
2.14.2
