Qu Wenruo 0d6264
From d4aef1e122d8bbdc15ce3bd0bc813d6b44a7d63a Mon Sep 17 00:00:00 2001
Qu Wenruo 0d6264
Message-Id: <d4aef1e122d8bbdc15ce3bd0bc813d6b44a7d63a.1652142354.git.wqu@suse.com>
Qu Wenruo 0d6264
From: Sidong Yang <realwakka@gmail.com>
Qu Wenruo 0d6264
Date: Mon, 28 Feb 2022 01:43:40 +0000
Qu Wenruo 0d6264
Patch-mainline: v5.17-rc1
Qu Wenruo 0d6264
References: bsc#1199295
Qu Wenruo 0d6264
Git-commit: d4aef1e122d8bbdc15ce3bd0bc813d6b44a7d63a
Qu Wenruo 0d6264
Subject: [PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove
Qu Wenruo 0d6264
 qgroup
Qu Wenruo 0d6264
Qu Wenruo 0d6264
The commit e804861bd4e6 ("btrfs: fix deadlock between quota disable and
Qu Wenruo 0d6264
qgroup rescan worker") by Kawasaki resolves deadlock between quota
Qu Wenruo 0d6264
disable and qgroup rescan worker. But also there is a deadlock case like
Qu Wenruo 0d6264
it. It's about enabling or disabling quota and creating or removing
Qu Wenruo 0d6264
qgroup. It can be reproduced in simple script below.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
for i in {1..100}
Qu Wenruo 0d6264
do
Qu Wenruo 0d6264
    btrfs quota enable /mnt &
Qu Wenruo 0d6264
    btrfs qgroup create 1/0 /mnt &
Qu Wenruo 0d6264
    btrfs qgroup destroy 1/0 /mnt &
Qu Wenruo 0d6264
    btrfs quota disable /mnt &
Qu Wenruo 0d6264
done
Qu Wenruo 0d6264
Qu Wenruo 0d6264
Here's why the deadlock happens:
Qu Wenruo 0d6264
Qu Wenruo 0d6264
1) The quota rescan task is running.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
Qu Wenruo 0d6264
   mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
Qu Wenruo 0d6264
   the quota rescan task to complete.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
Qu Wenruo 0d6264
   the qgroup_ioctl_lock mutex, because it's being held by task A. At that
Qu Wenruo 0d6264
   point task B is holding a transaction handle for the current transaction.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
4) The quota rescan task calls btrfs_commit_transaction(). This results
Qu Wenruo 0d6264
   in it waiting for all other tasks to release their handles on the
Qu Wenruo 0d6264
   transaction, but task B is blocked on the qgroup_ioctl_lock mutex
Qu Wenruo 0d6264
   while holding a handle on the transaction, and that mutex is being held
Qu Wenruo 0d6264
   by task A, which is waiting for the quota rescan task to complete,
Qu Wenruo 0d6264
   resulting in a deadlock between these 3 tasks.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
To resolve this issue, the thread disabling quota should unlock
Qu Wenruo 0d6264
qgroup_ioctl_lock before waiting rescan completion. Move
Qu Wenruo 0d6264
btrfs_qgroup_wait_for_completion() after unlock of qgroup_ioctl_lock.
Qu Wenruo 0d6264
Qu Wenruo 0d6264
Fixes: e804861bd4e6 ("btrfs: fix deadlock between quota disable and qgroup rescan worker")
Qu Wenruo 0d6264
CC: stable@vger.kernel.org # 5.4+
Qu Wenruo 0d6264
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Qu Wenruo 0d6264
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Qu Wenruo 0d6264
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Qu Wenruo 0d6264
Reviewed-by: David Sterba <dsterba@suse.com>
Qu Wenruo 0d6264
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo 0d6264
Signed-off-by: Qu Wenruo <wqu@suse.com>
Qu Wenruo 0d6264
---
Qu Wenruo 0d6264
 fs/btrfs/qgroup.c | 9 ++++++++-
Qu Wenruo 0d6264
 1 file changed, 8 insertions(+), 1 deletion(-)
Qu Wenruo 0d6264
Qu Wenruo 0d6264
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
Qu Wenruo 0d6264
index f12dc687350c..30d42ea655ce 100644
Qu Wenruo 0d6264
--- a/fs/btrfs/qgroup.c
Qu Wenruo 0d6264
+++ b/fs/btrfs/qgroup.c
Qu Wenruo 0d6264
@@ -1196,6 +1196,14 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
Qu Wenruo 0d6264
 	if (!fs_info->quota_root)
Qu Wenruo 0d6264
 		goto out;
Qu Wenruo 0d6264
 
Qu Wenruo 0d6264
+	/*
Qu Wenruo 0d6264
+	 * Unlock the qgroup_ioctl_lock mutex before waiting for the rescan worker to
Qu Wenruo 0d6264
+	 * complete. Otherwise we can deadlock because btrfs_remove_qgroup() needs
Qu Wenruo 0d6264
+	 * to lock that mutex while holding a transaction handle and the rescan
Qu Wenruo 0d6264
+	 * worker needs to commit a transaction.
Qu Wenruo 0d6264
+	 */
Qu Wenruo 0d6264
+	mutex_unlock(&fs_info->qgroup_ioctl_lock);
Qu Wenruo 0d6264
+
Qu Wenruo 0d6264
 	/*
Qu Wenruo 0d6264
 	 * Request qgroup rescan worker to complete and wait for it. This wait
Qu Wenruo 0d6264
 	 * must be done before transaction start for quota disable since it may
Qu Wenruo 0d6264
@@ -1203,7 +1211,6 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
Qu Wenruo 0d6264
 	 */
Qu Wenruo 0d6264
 	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
Qu Wenruo 0d6264
 	btrfs_qgroup_wait_for_completion(fs_info, false);
Qu Wenruo 0d6264
-	mutex_unlock(&fs_info->qgroup_ioctl_lock);
Qu Wenruo 0d6264
 
Qu Wenruo 0d6264
 	/*
Qu Wenruo 0d6264
 	 * 1 For the root item
Qu Wenruo 0d6264
-- 
Qu Wenruo 0d6264
2.36.0
Qu Wenruo 0d6264