Blob Blame History Raw
From 0e124e19ce52d20b28ee9f1d5cdb22e2106bfd29 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu, 3 Dec 2020 10:34:32 +0000
Subject: [PATCH] drm/i915/gem: Propagate error from cancelled submit due to context closure
Git-commit: 0e124e19ce52d20b28ee9f1d5cdb22e2106bfd29
Patch-mainline: v5.10
No-fix: ba38b79eaeaeed29d2383f122d5c711ebf5ed3d1
References: git-fixes

In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd056 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.

Fixes: 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
(cherry picked from commit ba38b79eaeaeed29d2383f122d5c711ebf5ed3d1)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Takashi Iwai <tiwai@suse.de>

---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2610,7 +2610,7 @@ static void retire_requests(struct intel
 			break;
 }
 
-static void eb_request_add(struct i915_execbuffer *eb)
+static int eb_request_add(struct i915_execbuffer *eb, int err)
 {
 	struct i915_request *rq = eb->request;
 	struct intel_timeline * const tl = i915_request_timeline(rq);
@@ -2631,6 +2631,7 @@ static void eb_request_add(struct i915_e
 		/* Serialise with context_close via the add_to_timeline */
 		i915_request_set_error_once(rq, -ENOENT);
 		__i915_request_skip(rq);
+		err = -ENOENT; /* override any transient errors */
 	}
 
 	__i915_request_queue(rq, &attr);
@@ -2640,6 +2641,8 @@ static void eb_request_add(struct i915_e
 		retire_requests(tl, prev);
 
 	mutex_unlock(&tl->mutex);
+
+	return err;
 }
 
 static int
@@ -2841,7 +2844,7 @@ i915_gem_do_execbuffer(struct drm_device
 err_request:
 	add_to_client(eb.request, file);
 	i915_request_get(eb.request);
-	eb_request_add(&eb);
+	err = eb_request_add(&eb, err);
 
 	if (fences)
 		signal_fence_array(&eb, fences);