Blob Blame History Raw
From: Xiubo Li <xiubli@redhat.com>
Date: Wed, 20 May 2020 03:51:19 -0400
Subject: ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock
Git-commit: ea8412b284c09742d5b11721e225b4ff011aa397
Patch-mainline: v5.8-rc1
References: bsc#1172989

send_mds_reconnect takes the s_mutex while the mdsc->mutex is already
held. That inverts the locking order documented in mds_client.h. Drop
the mdsc->mutex, acquire the s_mutex and then reacquire the mdsc->mutex
to prevent a deadlock.

URL: https://tracker.ceph.com/issues/45609
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/mds_client.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 6c283c52d401..0e0ab01694dc 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -3769,8 +3769,6 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc,
  * recovering MDS might have.
  *
  * This is a relatively heavyweight operation, but it's rare.
- *
- * called with mdsc->mutex held.
  */
 static void send_mds_reconnect(struct ceph_mds_client *mdsc,
 			       struct ceph_mds_session *session)
@@ -4024,7 +4022,9 @@ static void check_new_map(struct ceph_mds_client *mdsc,
 			    oldstate != CEPH_MDS_STATE_STARTING)
 				pr_info("mds%d recovery completed\n", s->s_mds);
 			kick_requests(mdsc, i);
+			mutex_unlock(&mdsc->mutex);
 			mutex_lock(&s->s_mutex);
+			mutex_lock(&mdsc->mutex);
 			ceph_kick_flushing_caps(mdsc, s);
 			mutex_unlock(&s->s_mutex);
 			wake_up_session_caps(s, RECONNECT);