Blob Blame History Raw
From 30b29406d9374989f34bce0eadaa630813049808 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri, 10 Nov 2017 11:25:50 +0000
Subject: [PATCH] drm/i915: Restore the wait for idle engine after flushing interrupts
Git-commit: 30b29406d9374989f34bce0eadaa630813049808
Patch-mainline: v4.16-rc1
References: FATE#322643 bsc#1055900

So it appears that commit 5427f207852d ("drm/i915: Bump wait-times for
the final CS interrupt before parking") was a little over optimistic in
its belief that it had successfully waited for all residual activity on
the engines before parking. Numerous sightings in CI since then of

<7>[   52.542886] [IGT] core_auth: executing
<3>[   52.561013] [drm:intel_engines_park [i915]] *ERROR* vcs0 is not idle before parking
<7>[   52.561215] intel_engines_park vcs0
<7>[   52.561229] intel_engines_park 	current seqno 98, last 98, hangcheck 0 [-247449 ms], inflight 0
<7>[   52.561238] intel_engines_park 	Reset count: 0
<7>[   52.561266] intel_engines_park 	Requests:
<7>[   52.561363] intel_engines_park 	RING_START: 0x00000000 [0x00000000]
<7>[   52.561377] intel_engines_park 	RING_HEAD:  0x00000000 [0x00000000]
<7>[   52.561390] intel_engines_park 	RING_TAIL:  0x00000000 [0x00000000]
<7>[   52.561406] intel_engines_park 	RING_CTL:   0x00000000
<7>[   52.561422] intel_engines_park 	RING_MODE:  0x00000200 [idle]
<7>[   52.561442] intel_engines_park 	ACTHD:  0x00000000_00000000
<7>[   52.561459] intel_engines_park 	BBADDR: 0x00000000_00000000
<7>[   52.561474] intel_engines_park 	Execlist status: 0x00000301 00000000
<7>[   52.561489] intel_engines_park 	Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no
<7>[   52.561500] intel_engines_park 		ELSP[0] idle
<7>[   52.561510] intel_engines_park 		ELSP[1] idle
<7>[   52.561519] intel_engines_park 		HW active? 0x0
<7>[   52.561608] intel_engines_park Idle? yes
<7>[   52.561617] intel_engines_park

on Braswell, which indicates that the engine just needs that little bit
longer after flushing the tasklet to settle. So give it a few more
milliseconds before declaring an err and applying the emergency brake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110112550.28909-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Takashi Iwai <tiwai@suse.de>

---
 drivers/gpu/drm/i915/intel_engine_cs.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1631,11 +1631,12 @@ void intel_engines_park(struct drm_i915_
 		 * will be no more interrupts arriving later and the engines
 		 * are truly idle.
 		 */
-		if (!intel_engine_is_idle(engine)) {
+		if (wait_for(intel_engine_is_idle(engine), 10)) {
 			struct drm_printer p = drm_debug_printer(__func__);
 
-			DRM_ERROR("%s is not idle before parking\n",
-				  engine->name);
+			dev_err(i915->drm.dev,
+				"%s is not idle before parking\n",
+				engine->name);
 			intel_engine_dump(engine, &p);
 		}