Blob Blame History Raw
From: Sagi Grimberg <sagi@grimberg.me>
Date: Tue, 1 Feb 2022 14:54:20 +0200
Subject: nvme-tcp: fix possible use-after-free in transport error_recovery
 work
Patch-mainline: v5.17-rc3
Git-commit: ff9fc7ebf5c06de1ef72a69f9b1ab40af8b07f9e
References: bsc#1193787 bsc#1197146 bsc#1193554

While nvme_tcp_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state.

Tested-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2096,6 +2096,7 @@ static void nvme_tcp_error_recovery_work
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
 	nvme_stop_keep_alive(ctrl);
+	flush_work(&ctrl->async_event_work);
 	nvme_tcp_teardown_io_queues(ctrl, false);
 	/* unquiesce to fail fast pending requests */
 	nvme_start_queues(ctrl);