Blob Blame History Raw
From: Jay Cornwall <Jay.Cornwall@amd.com>
Date: Thu, 13 Jul 2017 20:21:53 -0500
Subject: drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
Git-commit: 3447d220155bd9f4b5435ea6e9d58b536c7e94dd
Patch-mainline: v4.13-rc2
References: FATE#326289 FATE#326079 FATE#326049 FATE#322398 FATE#326166

The number of compute queues available to the KFD was erroneously
calculated as 64. Only the first MEC can execute compute queues and
it has 32 queue slots.

This caused the oversubscription limit to be calculated incorrectly,
leading to a missing chained runlist command at the end of an
oversubscribed runlist.

v2: Remove unused num_mec field to avoid duplicate logic
v3: Separate num_mec removal into separate patches

Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Petr Tesarik <ptesarik@suse.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -122,7 +122,7 @@ void amdgpu_amdkfd_device_init(struct am
 
 		/* According to linux/bitmap.h we shouldn't use bitmap_clear if
 		 * nbits is not compile time constant */
-		last_valid_bit = adev->gfx.mec.num_mec
+		last_valid_bit = 1 /* only first MEC can have compute queues */
 				* adev->gfx.mec.num_pipe_per_mec
 				* adev->gfx.mec.num_queue_per_pipe;
 		for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)