Blob Blame History Raw
From: Xiaofei Tan <tanxiaofei@huawei.com>
Date: Sat, 19 Jan 2019 14:23:29 +0800
Subject: RDMA/hns: Add the process of AEQ overflow for hip08
Patch-mainline: v5.1-rc1
Git-commit: 2b9acb9a97fe9b4101ca020643760c4a090b4cb4
References: bsc#1104427 FATE#326416 bsc#1126206

AEQ overflow will be reported by hardware when too many asynchronous
events occurred but not be handled in time.  Normally, AEQ overflow error
is not easy to occur. Once happened, we have to do physical function reset
to recover.  PF reset is implemented in two steps. Firstly, set reset
level with ae_dev->ops->set_default_reset_request.  Secondly, run reset
with ae_dev->ops->reset_event.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4702,11 +4702,22 @@ static irqreturn_t hns_roce_v2_msix_inte
 	int_en = roce_read(hr_dev, ROCEE_VF_ABN_INT_EN_REG);
 
 	if (roce_get_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S)) {
+		struct pci_dev *pdev = hr_dev->pci_dev;
+		struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+		const struct hnae3_ae_ops *ops = ae_dev->ops;
+
 		dev_err(dev, "AEQ overflow!\n");
 
 		roce_set_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
 
+		/* Set reset level for reset_event() */
+		if (ops->set_default_reset_request)
+			ops->set_default_reset_request(ae_dev,
+						       HNAE3_FUNC_RESET);
+		if (ops->reset_event)
+			ops->reset_event(pdev, NULL);
+
 		roce_set_bit(int_en, HNS_ROCE_V2_VF_ABN_INT_EN_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_EN_REG, int_en);