Jiri Slaby ef7db2
From: Waiman Long <longman@redhat.com>
Jiri Slaby ef7db2
Date: Wed, 25 Jan 2023 19:36:25 -0500
Jiri Slaby ef7db2
Subject: [PATCH] locking/rwsem: Prevent non-first waiter from spinning in
Jiri Slaby ef7db2
 down_write() slowpath
Jiri Slaby ef7db2
References: bsc#1012628
Jiri Slaby ef7db2
Patch-mainline: 6.2.3
Jiri Slaby ef7db2
Git-commit: b613c7f31476c44316bfac1af7cac714b7d6bef9
Jiri Slaby ef7db2
Jiri Slaby ef7db2
commit b613c7f31476c44316bfac1af7cac714b7d6bef9 upstream.
Jiri Slaby ef7db2
Jiri Slaby ef7db2
A non-first waiter can potentially spin in the for loop of
Jiri Slaby ef7db2
rwsem_down_write_slowpath() without sleeping but fail to acquire the
Jiri Slaby ef7db2
lock even if the rwsem is free if the following sequence happens:
Jiri Slaby ef7db2
Jiri Slaby ef7db2
  Non-first RT waiter    First waiter      Lock holder
Jiri Slaby ef7db2
  -------------------    ------------      -----------
Jiri Slaby ef7db2
  Acquire wait_lock
Jiri Slaby ef7db2
  rwsem_try_write_lock():
Jiri Slaby ef7db2
    Set handoff bit if RT or
Jiri Slaby ef7db2
      wait too long
Jiri Slaby ef7db2
    Set waiter->handoff_set
Jiri Slaby ef7db2
  Release wait_lock
Jiri Slaby ef7db2
                         Acquire wait_lock
Jiri Slaby ef7db2
                         Inherit waiter->handoff_set
Jiri Slaby ef7db2
                         Release wait_lock
Jiri Slaby ef7db2
					   Clear owner
Jiri Slaby ef7db2
                                           Release lock
Jiri Slaby ef7db2
  if (waiter.handoff_set) {
Jiri Slaby ef7db2
    rwsem_spin_on_owner(();
Jiri Slaby ef7db2
    if (OWNER_NULL)
Jiri Slaby ef7db2
      goto trylock_again;
Jiri Slaby ef7db2
  }
Jiri Slaby ef7db2
  trylock_again:
Jiri Slaby ef7db2
  Acquire wait_lock
Jiri Slaby ef7db2
  rwsem_try_write_lock():
Jiri Slaby ef7db2
     if (first->handoff_set && (waiter != first))
Jiri Slaby ef7db2
	return false;
Jiri Slaby ef7db2
  Release wait_lock
Jiri Slaby ef7db2
Jiri Slaby ef7db2
A non-first waiter cannot really acquire the rwsem even if it mistakenly
Jiri Slaby ef7db2
believes that it can spin on OWNER_NULL value. If that waiter happens
Jiri Slaby ef7db2
to be an RT task running on the same CPU as the first waiter, it can
Jiri Slaby ef7db2
block the first waiter from acquiring the rwsem leading to live lock.
Jiri Slaby ef7db2
Fix this problem by making sure that a non-first waiter cannot spin in
Jiri Slaby ef7db2
the slowpath loop without sleeping.
Jiri Slaby ef7db2
Jiri Slaby ef7db2
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Jiri Slaby ef7db2
Signed-off-by: Waiman Long <longman@redhat.com>
Jiri Slaby ef7db2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiri Slaby ef7db2
Tested-by: Mukesh Ojha <quic_mojha@quicinc.com>
Jiri Slaby ef7db2
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Jiri Slaby ef7db2
Cc: stable@vger.kernel.org
Jiri Slaby ef7db2
Link: https://lore.kernel.org/r/20230126003628.365092-2-longman@redhat.com
Jiri Slaby ef7db2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jiri Slaby ef7db2
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jiri Slaby ef7db2
---
Jiri Slaby ef7db2
 kernel/locking/rwsem.c | 19 +++++++++----------
Jiri Slaby ef7db2
 1 file changed, 9 insertions(+), 10 deletions(-)
Jiri Slaby ef7db2
Jiri Slaby ef7db2
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
Jiri Slaby ef7db2
index 324fc370..84d5b649 100644
Jiri Slaby ef7db2
--- a/kernel/locking/rwsem.c
Jiri Slaby ef7db2
+++ b/kernel/locking/rwsem.c
Jiri Slaby ef7db2
@@ -624,18 +624,16 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
Jiri Slaby ef7db2
 			 */
Jiri Slaby ef7db2
 			if (first->handoff_set && (waiter != first))
Jiri Slaby ef7db2
 				return false;
Jiri Slaby ef7db2
-
Jiri Slaby ef7db2
-			/*
Jiri Slaby ef7db2
-			 * First waiter can inherit a previously set handoff
Jiri Slaby ef7db2
-			 * bit and spin on rwsem if lock acquisition fails.
Jiri Slaby ef7db2
-			 */
Jiri Slaby ef7db2
-			if (waiter == first)
Jiri Slaby ef7db2
-				waiter->handoff_set = true;
Jiri Slaby ef7db2
 		}
Jiri Slaby ef7db2
 
Jiri Slaby ef7db2
 		new = count;
Jiri Slaby ef7db2
 
Jiri Slaby ef7db2
 		if (count & RWSEM_LOCK_MASK) {
Jiri Slaby ef7db2
+			/*
Jiri Slaby ef7db2
+			 * A waiter (first or not) can set the handoff bit
Jiri Slaby ef7db2
+			 * if it is an RT task or wait in the wait queue
Jiri Slaby ef7db2
+			 * for too long.
Jiri Slaby ef7db2
+			 */
Jiri Slaby ef7db2
 			if (has_handoff || (!rt_task(waiter->task) &&
Jiri Slaby ef7db2
 					    !time_after(jiffies, waiter->timeout)))
Jiri Slaby ef7db2
 				return false;
Jiri Slaby ef7db2
@@ -651,11 +649,12 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
Jiri Slaby ef7db2
 	} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &count, new));
Jiri Slaby ef7db2
 
Jiri Slaby ef7db2
 	/*
Jiri Slaby ef7db2
-	 * We have either acquired the lock with handoff bit cleared or
Jiri Slaby ef7db2
-	 * set the handoff bit.
Jiri Slaby ef7db2
+	 * We have either acquired the lock with handoff bit cleared or set
Jiri Slaby ef7db2
+	 * the handoff bit. Only the first waiter can have its handoff_set
Jiri Slaby ef7db2
+	 * set here to enable optimistic spinning in slowpath loop.
Jiri Slaby ef7db2
 	 */
Jiri Slaby ef7db2
 	if (new & RWSEM_FLAG_HANDOFF) {
Jiri Slaby ef7db2
-		waiter->handoff_set = true;
Jiri Slaby ef7db2
+		first->handoff_set = true;
Jiri Slaby ef7db2
 		lockevent_inc(rwsem_wlock_handoff);
Jiri Slaby ef7db2
 		return false;
Jiri Slaby ef7db2
 	}
Jiri Slaby ef7db2
-- 
Jiri Slaby ef7db2
2.35.3
Jiri Slaby ef7db2