From: Ruozhu Li <liruozhu@huawei.com>
Date: Thu, 4 Nov 2021 15:13:32 +0800
Subject: [PATCH] nvme: fix use after free when disconnecting a reconnecting ctrl
Git-commit: 8b77fa6fdce0fc7147bab91b1011048758290ca4
Patch-mainline: v5.16-rc1
References: git-fixes

A crash happens when trying to disconnect a reconnecting ctrl:

1) The network was cut off just after the connection was established,
   so the scan work hung waiting for some I/Os to complete. Those I/Os
   were retried because we return BLK_STS_RESOURCE to the block layer
   while reconnecting.
2) After a while, I tried to disconnect this connection. This
   procedure also hung because it tried to obtain ctrl->scan_lock.
   Note that by this point the controller state had already been
   switched to NVME_CTRL_DELETING.
3) In nvme_check_ready(), we always return true when ctrl->state is
   NVME_CTRL_DELETING, so the retried I/Os were issued to the bottom
   device, which had already been freed.

To fix this, when ctrl->state is NVME_CTRL_DELETING, issue the command
to the bottom device only when the queue state is live. If it is not,
return a host path error to the block layer.
[hare: ported to SLE15 SP3]
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hannes Reinecke <hare@suse.com>
---
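Note for reviewers (not part of the commit message): the requeue
decision touched by the hunk below can be modelled in user space. This
is only a sketch under stated assumptions. Every model_* and MODEL_*
name is hypothetical and merely mirrors the kernel identifiers quoted
in this patch, and the two return values are assumed from the commit
message (BLK_STS_RESOURCE for a requeue, a host path error otherwise).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's controller states. */
enum model_ctrl_state {
	MODEL_CTRL_LIVE,
	MODEL_CTRL_CONNECTING,
	MODEL_CTRL_DELETING,
	MODEL_CTRL_DELETING_NOIO,
	MODEL_CTRL_DEAD,
};

/* Hypothetical stand-ins for the two outcomes named in the commit message. */
enum model_status {
	MODEL_STS_RESOURCE,        /* request is requeued and retried later */
	MODEL_STS_HOST_PATH_ERROR, /* request is failed instead of requeued */
};

/* The inputs the condition looks at, flattened into one struct. */
struct model_request {
	enum model_ctrl_state state;
	bool failfast_expired; /* models NVME_CTRL_FAILFAST_EXPIRED */
	bool noretry;          /* models blk_noretry_request(rq) */
	bool mpath;            /* models REQ_NVME_MPATH */
};

/* Mirrors the shape of the condition in nvmf_fail_nonready_command(). */
static enum model_status model_fail_nonready(const struct model_request *rq)
{
	if (rq->state != MODEL_CTRL_DELETING_NOIO &&
	    rq->state != MODEL_CTRL_DELETING && /* the check this patch adds */
	    rq->state != MODEL_CTRL_DEAD &&
	    !rq->failfast_expired && !rq->noretry && !rq->mpath)
		return MODEL_STS_RESOURCE;
	return MODEL_STS_HOST_PATH_ERROR;
}

/* Small usage example: a reconnecting ctrl requeues, a deleting one fails. */
static void model_selfcheck(void)
{
	struct model_request rq = { MODEL_CTRL_CONNECTING, false, false, false };

	assert(model_fail_nonready(&rq) == MODEL_STS_RESOURCE);
	rq.state = MODEL_CTRL_DELETING;
	assert(model_fail_nonready(&rq) == MODEL_STS_HOST_PATH_ERROR);
}
```

In this model, a non-ready command on a controller in the DELETING
state is failed rather than requeued, so it can no longer be retried
against a queue that has already been torn down.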
drivers/nvme/host/fabrics.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 31bfd509585c..13308c1992d2 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -551,6 +551,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
 		struct request *rq)
 {
 	if (ctrl->state != NVME_CTRL_DELETING_NOIO &&
+	    ctrl->state != NVME_CTRL_DELETING &&
 	    ctrl->state != NVME_CTRL_DEAD &&
 	    !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
 	    !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
--
2.29.2