Blob Blame History Raw
From: James Smart <jsmart2021@gmail.com>
Date: Thu, 5 May 2022 20:55:14 -0700
Subject: scsi: lpfc: Decrement outstanding gidft_inp counter if
 lpfc_err_lost_link()
Patch-mainline: v5.19-rc1
Git-commit: dc8a71bd414fb550fd17164440c409f9ecf4b2a8
References: bsc#1200045

During large NPIV port testing, it was sometimes seen that not all vports
would log back in to the target device.

There are instances when the fabric is slow to respond to a spam of GID_PT
requests and as a result the SLI PORT may abort the GID_PT request because
the fabric takes so long.  lpfc_cmpl_ct_cmd_gid_pt() would enter the
lpfc_err_lost_link() logic and attempt to lpfc_els_flush_rscn(), which is
fine, but forgets to decrement the gidft_inp counter.  This results in a
vport->gidft_inp never reaching 0 and never restarting discovery again.

Decrement vport->gidft_inp if lpfc_err_lost_link() is true for both
lpfc_cmpl_ct_cmd_gid_pt() and lpfc_cmpl_ct_cmd_gid_ft().

Increase logging info during RSCN timeout and lpfc_err_lost_link() events.

Link: https://lore.kernel.org/r/20220506035519.50908-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/scsi/lpfc/lpfc_ct.c      |   16 ++++++++++++++--
 drivers/scsi/lpfc/lpfc_hbadisc.c |    5 +++--
 2 files changed, 17 insertions(+), 4 deletions(-)

--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -960,9 +960,15 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba
 	}
 	if (lpfc_error_lost_link(ulp_status, ulp_word4)) {
 		lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
-				 "0226 NS query failed due to link event\n");
+				 "0226 NS query failed due to link event: "
+				 "ulp_status x%x ulp_word4 x%x fc_flag x%x "
+				 "port_state x%x gidft_inp x%x\n",
+				 ulp_status, ulp_word4, vport->fc_flag,
+				 vport->port_state, vport->gidft_inp);
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 		goto out;
 	}
 
@@ -1177,9 +1183,15 @@ lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba
 	}
 	if (lpfc_error_lost_link(ulp_status, ulp_word4)) {
 		lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
-				 "4166 NS query failed due to link event\n");
+				 "4166 NS query failed due to link event: "
+				 "ulp_status x%x ulp_word4 x%x fc_flag x%x "
+				 "port_state x%x gidft_inp x%x\n",
+				 ulp_status, ulp_word4, vport->fc_flag,
+				 vport->port_state, vport->gidft_inp);
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 		goto out;
 	}
 
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -6355,8 +6355,9 @@ lpfc_disc_timeout_handler(struct lpfc_vp
 			lpfc_printf_vlog(vport, KERN_ERR,
 					 LOG_TRACE_EVENT,
 					 "0231 RSCN timeout Data: x%x "
-					 "x%x\n",
-					 vport->fc_ns_retry, LPFC_MAX_NS_RETRY);
+					 "x%x x%x x%x\n",
+					 vport->fc_ns_retry, LPFC_MAX_NS_RETRY,
+					 vport->port_state, vport->gidft_inp);
 
 			/* Cleanup any outstanding ELS commands */
 			lpfc_els_flush_cmd(vport);