From 9aa5b92e8e315999272e0a1ef5e8fc5172952397 Mon Sep 17 00:00:00 2001
From: Alex Sierra <alex.sierra@amd.com>
Date: Mon, 18 Nov 2019 15:33:07 -0600
Subject: amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: b4672c8a84beba6d04dc564b02e439590b5c93f7
Patch-mainline: v5.6-rc1
References: jsc#SLE-12680, jsc#SLE-12880, jsc#SLE-12882, jsc#SLE-12883, jsc#SLE-13496, jsc#SLE-15322
Only for the debugger use case.
[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.
[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Patrik Jakobsson <pjakobsson@suse.de>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 63f6e46bd642..f20b572d2438 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3197,11 +3197,20 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
AMDGPU_PTE_SYSTEM;
- if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
+ if (vm->is_compute_context) {
+ /* Intentionally setting invalid PTE flag
+ * combination to force a no-retry-fault
+ */
+ flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
+ AMDGPU_PTE_TF;
+ value = 0;
+
+ } else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
/* Redirect the access to the dummy page */
value = adev->dummy_page_addr;
flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
AMDGPU_PTE_WRITEABLE;
+
} else {
/* Let the hw retry silently on the PTE */
value = 0;
--
2.28.0