Blob Blame History Raw
From: Evan Quan <evan.quan@amd.com>
Date: Thu, 15 Mar 2018 09:49:01 +0800
Subject: drm/amdgpu: no job timeout setting on compute queues
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: f0c2b16ba84a0b8b960a6d442496ce2d2e6bfa99
Patch-mainline: v4.17-rc1
References: FATE#326289 FATE#326079 FATE#326049 FATE#322398 FATE#326166

Under some heavy computing environment(e.g. dgemm test), it
takes the asic over 10+ seconds to finish the dispatched job
which will trigger the timeout.

It's quite confusing although it does not seem to bring any
real problems. As a quick workround, we choose to not enfoce
the timeout setting on compute queues.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Petr Tesarik <ptesarik@suse.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -435,7 +435,9 @@ int amdgpu_fence_driver_init_ring(struct
 	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
 		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
 				   num_hw_submission, amdgpu_job_hang_limit,
-				   msecs_to_jiffies(amdgpu_lockup_timeout), ring->name);
+				   (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) ?
+				   MAX_SCHEDULE_TIMEOUT : msecs_to_jiffies(amdgpu_lockup_timeout),
+				   ring->name);
 		if (r) {
 			DRM_ERROR("Failed to create scheduler on ring %s.\n",
 				  ring->name);