From: James Smart <jsmart2021@gmail.com>
Date: Thu, 11 Jan 2018 15:21:38 -0800
Subject: nvme-fc: correct hang in nvme_ns_remove()
Patch-mainline: v4.16-rc1
Git-commit: 0fd997d3f77296522e836f7002e8a0636c9886aa
Reviewed-by: Hannes Reinecke <hare@suse.com>
References: bsc#1075811

When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If connectivity is lost long enough that the controller is deleted,
the delete path starts tearing down queues and eventually calls
nvme_ns_remove(). It appears that pending commands may cause
blk_cleanup_queue() to never complete, so the teardown stalls.

Correct this by starting the ns queues after transitioning to the
DELETING state, allowing pending commands to be flushed with I/O
failures. The delete path is thus clear when it is reached.
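
The interaction can be illustrated with a small userspace toy model.
This is only a sketch, not kernel code: every toy_* name below is
hypothetical, and the model assumes that cleanup can finish only once
no commands remain pending on the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code; all toy_* names are hypothetical. */
enum toy_state { TOY_LIVE, TOY_DELETING };

struct toy_queue {
	bool stopped;	/* models a quiesced/stopped blk-mq queue */
	int pending;	/* commands waiting to be dispatched */
	int failed;	/* commands completed with an I/O error */
};

/* Dispatch pending commands; a stopped queue dispatches nothing. */
static void toy_run_queue(struct toy_queue *q, enum toy_state state)
{
	if (q->stopped)
		return;

	while (q->pending > 0) {
		q->pending--;
		if (state == TOY_DELETING)
			q->failed++;	/* fast fail: controller is going away */
	}
}

/* Assumption of this model: cleanup finishes only with nothing pending. */
static bool toy_cleanup_can_finish(const struct toy_queue *q)
{
	return q->pending == 0;
}

/*
 * Model the delete path after a connectivity loss left the queue stopped
 * with three commands pending. Returns true if cleanup could finish.
 */
static bool toy_delete_path(bool start_queues)
{
	struct toy_queue q = { .stopped = true, .pending = 3, .failed = 0 };
	enum toy_state state = TOY_DELETING;

	if (start_queues)
		q.stopped = false;	/* models resuming the io queues */

	toy_run_queue(&q, state);

	/* Commands can only remain pending if the queue is still stopped. */
	assert(q.pending == 0 || q.stopped);

	return toy_cleanup_can_finish(&q);
}
```

In this model, toy_delete_path(false) leaves commands pending behind
the stopped queue, while toy_delete_path(true) drains them as
failures so that cleanup can proceed.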

Signed-off-by: James Smart <james.smart@broadcom.com>
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/nvme/host/fc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8f9ddd0..aa916a4 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2938,6 +2938,9 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
 	 * waiting for io to terminate
 	 */
 	nvme_fc_delete_association(ctrl);
+
+	/* resume the io queues so that things will fast fail */
+	nvme_start_queues(nctrl);
 }
 
 static void
-- 
1.8.5.6