From: Steffen Maier <maier@linux.vnet.ibm.com>
Subject: scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records
Patch-mainline: v4.14-rc1
Git-commit: 975171b4461be296a35e83ebd748946b81cf0635
References: bnc#1066983, LTC#158494

Description:  zfcp: fix tracing regressions, part 2
Symptom:      On unsuccessful fabric nameserver responses during zfcp
              automatic port scan, zfcp wastes 63 trace records of size 256
              bytes each in the PAY trace area.

              SCSI trace records on task management functions cannot be
              correlated to HBA trace records because all fields related to
              the FSF request are empty (zero).

              In the following two rather seldom error cases during task
              management function (TMF) handling as part of the scsi_eh
              callbacks for LUN reset or target reset, zfcp fails to write
              trace records, making such situations difficult to debug:
              fc_block_scsi_eh() returning != 0, which is FAST_IO_FAIL, and
              not having gotten an FSF request after the maximum number of
              retry attempts, so that no TMF could be issued and FAILED has
              to be returned.

              SCSI trace records do not show FCP_RSP_INFO in case of TMFs,
              and would only show the first 96 bytes of FCP_SNS_INFO if a
              target sent more sense data.

              The root cause cannot always be determined for FSF requests
              that:
              failed by running into an ERP timeout or got dismissed
              because a higher-level recovery was triggered
              [trace tags "erscf_1" or "erscf_2"], or
              failed by running into an FSF request timeout [trace tag
              "fsrth_1"], or
              failed with a deferred error when using the FCP hardware
              data router.
Problem:      Since v4.9 commit aceeffbb59bb ("zfcp: trace full payload of
              all SAN records (req,resp,iels)"), zfcp only checked for
              FC_NS_FID_LAST being set in fp_flags to cap an automatic port
              scan SAN response. This flag likely does not occur in an
              unsuccessful response.

              v2.6.14 commit 8a36e4532ea1 ("[SCSI] zfcp: enhancement of
              zfcp debug features") introduced trace records for TMFs but
              hard-coded NULL for a possibly existing TMF FSF request. The
              scsi_cmnd scribble is also zero or unrelated for the TMF
              request, so a suitable FSF request could not be looked up
              from there either.

              The following two upstream commits introduced new early
              returns without adding a trace record:
              v2.6.35 commit a1dbfddd02d2 ("[SCSI] zfcp: Pass return code
              from fc_block_scsi_eh to scsi eh"),
              v2.6.30 commit 63caf367e1c9 ("[SCSI] zfcp: Improve
              reliability of SCSI eh handlers in zfcp").

              Zfcp only traced capped sense data.

              Zfcp only used protocol status and FSF status to decide on an
              unsuccessful response. However, these are only some of the
              multiple possible sources determining a failed FSF request.
Solution:     Cap unsuccessful responses to at least the actual basic CT_IU
              response plus whatever fits the SAN trace record built-in
              "payload" buffer just in case there's trailing information of
              which we would at least see the existence and its beginning.
              In order not to erroneously cap successful responses, we need
              to swap calling the trace function and setting the CT / ELS
              status to success (0).
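
              As a minimal, self-contained userspace sketch of the cap
              decision implemented by the zfcp_dbf.c hunk below (the
              constant values and the small main() driver are illustrative
              assumptions, not the driver code itself):

              #include <stdio.h>

              #define FC_CT_HDR_LEN    16      /* basic CT_IU header */
              #define SAN_MAX_PAYLOAD  (FC_CT_HDR_LEN + 32)  /* assumed built-in buffer size */
              #define FC_FS_ACC        0x8002  /* CT accept response code */

              /* models the new check: an unsuccessful status or any response
               * other than a CT accept gets capped early */
              static unsigned int cap_len(int ct_status, unsigned int ct_cmd,
                                          unsigned int full_len)
              {
                      if (ct_status || ct_cmd != FC_FS_ACC)
                              return FC_CT_HDR_LEN > SAN_MAX_PAYLOAD ?
                                      FC_CT_HDR_LEN : SAN_MAX_PAYLOAD;
                      return full_len;  /* successful GPN_FT responses are capped
                                         * later by the existing per-entry logic */
              }

              int main(void)
              {
                      /* a rejected 16 KiB response is capped to the CT header
                       * plus the built-in payload buffer; an accepted one keeps
                       * its full length for the existing capping to handle */
                      printf("rejected: %u bytes\n", cap_len(0, 0x8001, 16384));
                      printf("accepted: %u bytes\n", cap_len(0, FC_FS_ACC, 16384));
                      return 0;
              }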

              Pass fsf_req to SCSI trace on TMF.
              There are some true cases where we really do not have an FSF
              request:
              "rsl_fai" from zfcp_dbf_scsi_fail_send() called for
              early returns / completions in zfcp_scsi_queuecommand();
              "abrt_or", "abrt_bl", "abrt_ru", "abrt_ar" from
              zfcp_scsi_eh_abort_handler() where we did not get as far;
              "lr_nres", "tr_nres" from zfcp_task_mgmt_function() where
              we're successful and do not need to do anything because
              adapter stopped. For these cases it's correct to pass NULL
              for fsf_req to _zfcp_dbf_scsi().

              Add missing trace record statements.

              Instead of just capped sense data, trace the full FCP_RSP IU
              which includes any sense data if available.

              Check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED or
              ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr"
              at the default trace level.
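
              As a rough sketch of that check (it is not part of this
              patch's hunks; the enclosing response handler and the exact
              use of the zfcp_dbf_hba_fsf_resp() trace helper are
              assumptions):

              	/* sketch: trace dismissed/erroneous FSF requests at the
              	 * default trace level (3); helper name and call site are
              	 * assumptions, see the lead-in above */
              	if (req->status & (ZFCP_STATUS_FSFREQ_DISMISSED |
              			   ZFCP_STATUS_FSFREQ_ERROR))
              		zfcp_dbf_hba_fsf_resp("fs_rerr", 3, req);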
Reproduction: We typically see such case for an initiator WWPN, which is
              not in any zone.

              Trigger two timeouts in a row for the same single SCSI command
              (READ or WRITE).
              To manually create a similar situation: Stop multipathd so we
              don't get additional path checker TURs. Enable RSCN suppression
              on the SAN switch port beyond the first link, i.e. towards the
              storage target. Disable that switch port. Send one SCSI command
              in the background (because it will block for a while), e.g. via
              "dd if=/dev/mapper/... of=/dev/null count=1 &". After
              <SCSI command timeout> seconds, the command runs into the timeout
              for the first time, gets aborted, and then a retry is submitted.
              The retry is also lost because the switch port is still disabled.
              After 1.5 * <SCSI command timeout> seconds, enable that switch
              port again. After 2 * <SCSI command timeout> seconds, the command
              runs into the timeout for the second time and triggers scsi_eh.
              As first step, scsi_eh sends a LUN reset which should get a
              successful response from the storage target. This response
              causes a SCSI trace record for the TMF.

              Similar to the above paragraph to trigger a LUN reset, but
              triggering the specific case is hardly possible.

              Similar to two paragraphs above for TMF trace records.
              During LUN attachment (automatic or manual) or after
              re-plugging fibres, we typically get some expected sense data.

              No known reproduction for ERP timeout, FSF request timeout,
              or erroneous FSF request using FCP hardware data router.

Upstream-Description:

              scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records

              v4.9 commit aceeffbb59bb ("zfcp: trace full payload of all SAN records
              (req,resp,iels)") fixed trace data loss of 2.6.38 commit 2c55b750a884
              ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
              necessary for problem determination, e.g. to see the
              currently active zone set during automatic port scan.

              While it already saves space by not dumping any empty residual entries
              of the large successful GPN_FT response (4 pages), there are seldom cases
              where the GPN_FT response is unsuccessful and likely does not have
              FC_NS_FID_LAST set in fp_flags so we did not cap the trace record.
              We typically see such case for an initiator WWPN, which is not in any zone.

              Cap unsuccessful responses to at least the actual basic CT_IU response
              plus whatever fits the SAN trace record built-in "payload" buffer
              just in case there's trailing information
              of which we would at least see the existence and its beginning.

              In order not to erroneously cap successful responses, we need to swap
              calling the trace function and setting the CT / ELS status to success (0).

              Example trace record pair formatted with zfcpdbf:

              Timestamp      : ...
              Area           : SAN
              Subarea        : 00
              Level          : 1
              Exception      : -
              CPU ID         : ..
              Caller         : 0x...
              Record ID      : 1
              Tag            : fssct_1
              Request ID     : 0x<request_id>
              Destination ID : 0x00fffffc
              SAN req short  : 01000000 fc020000 01720ffc 00000000
                               00000008
              SAN req length : 20
              |
              Timestamp      : ...
              Area           : SAN
              Subarea        : 00
              Level          : 1
              Exception      : -
              CPU ID         : ..
              Caller         : 0x...
              Record ID      : 2
              Tag            : fsscth2
              Request ID     : 0x<request_id>
              Destination ID : 0x00fffffc
              SAN resp short : 01000000 fc020000 80010000 00090700
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
              SAN resp length: 16384
              San resp info  : 01000000 fc020000 80010000 00090700
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]
                               00000000 00000000 00000000 00000000 [trailing info]

              The fix saves all but one of the previously associated 64 PAYload trace
              record chunks of size 256 bytes each.

              Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
              Fixes: aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)")
              Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
              Cc: <stable@vger.kernel.org> #2.6.38+
              Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
              Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
              Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Acked-by: Hannes Reinecke <hare@suse.com>
---
 drivers/s390/scsi/zfcp_dbf.c |   10 +++++++++-
 drivers/s390/scsi/zfcp_fsf.c |    4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -3,7 +3,7 @@
  *
  * Debug traces for zfcp.
  *
- * Copyright IBM Corp. 2002, 2016
+ * Copyright IBM Corp. 2002, 2017
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -447,6 +447,7 @@ static u16 zfcp_dbf_san_res_cap_len_if_g
 	struct fc_ct_hdr *reqh = sg_virt(ct_els->req);
 	struct fc_ns_gid_ft *reqn = (struct fc_ns_gid_ft *)(reqh + 1);
 	struct scatterlist *resp_entry = ct_els->resp;
+	struct fc_ct_hdr *resph;
 	struct fc_gpn_ft_resp *acc;
 	int max_entries, x, last = 0;
 
@@ -473,6 +474,13 @@ static u16 zfcp_dbf_san_res_cap_len_if_g
 		return len; /* not GPN_FT response so do not cap */
 
 	acc = sg_virt(resp_entry);
+
+	/* cap all but accept CT responses to at least the CT header */
+	resph = (struct fc_ct_hdr *)acc;
+	if ((ct_els->status) ||
+	    (resph->ct_cmd != cpu_to_be16(FC_FS_ACC)))
+		return max(FC_CT_HDR_LEN, ZFCP_DBF_SAN_MAX_PAYLOAD);
+
 	max_entries = (reqh->ct_mr_size * 4 / sizeof(struct fc_gpn_ft_resp))
 		+ 1 /* zfcp_fc_scan_ports: bytes correct, entries off-by-one
 		     * to account for header as 1st pseudo "entry" */;
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -928,8 +928,8 @@ static void zfcp_fsf_send_ct_handler(str
 
 	switch (header->fsf_status) {
         case FSF_GOOD:
-		zfcp_dbf_san_res("fsscth2", req);
 		ct->status = 0;
+		zfcp_dbf_san_res("fsscth2", req);
 		break;
         case FSF_SERVICE_CLASS_NOT_SUPPORTED:
 		zfcp_fsf_class_not_supp(req);
@@ -1109,8 +1109,8 @@ static void zfcp_fsf_send_els_handler(st
 
 	switch (header->fsf_status) {
 	case FSF_GOOD:
-		zfcp_dbf_san_res("fsselh1", req);
 		send_els->status = 0;
+		zfcp_dbf_san_res("fsselh1", req);
 		break;
 	case FSF_SERVICE_CLASS_NOT_SUPPORTED:
 		zfcp_fsf_class_not_supp(req);