From 00cbce5cbf88459cd1aa1d60d0f1df15477df127 Mon Sep 17 00:00:00 2001
From: Patrick Kelsey <pat.kelsey@cornelisnetworks.com>
Date: Fri, 7 Apr 2023 12:52:44 -0400
Subject: [PATCH 1/1] IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user
 SDMA requests
Git-commit: 00cbce5cbf88459cd1aa1d60d0f1df15477df127
Patch-mainline: v6.4-rc1
References: git-fixes

hfi1 user SDMA request processing has two bugs that can cause data
corruption for user SDMA requests that have multiple payload iovecs
where an iovec other than the tail iovec does not run up to the page
boundary for the buffer pointed to by that iovec.

Here are the specific bugs:
1. user_sdma_txadd() does not use struct user_sdma_iovec->iov.iov_len.
   Rather, user_sdma_txadd() will add up to PAGE_SIZE bytes from the
   iovec to the packet, even if some of those bytes are past
   iovec->iov.iov_len and are thus not intended to be in the packet.
2. user_sdma_txadd() and user_sdma_send_pkts() fail to advance to the
   next iovec in user_sdma_request->iovs when the current iovec
   is not PAGE_SIZE and does not contain enough data to complete the
   packet. The transmitted packet will contain the wrong data from the
   iovec pages.
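
The two bugs above can be illustrated with a small userspace model. This
is only a sketch and not the driver code: struct model_iovec,
model_chunk_len(), model_fill_packet() and MODEL_PAGE_SIZE are
hypothetical names, and a 4 KiB page size is assumed. It shows the two
properties a fix needs: the bytes taken from an iovec are also limited by
what is left of its length, and the walk moves on to the next iovec once
the current one is used up.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical, simplified stand-in for struct user_sdma_iovec. */
struct model_iovec {
	uintptr_t base; /* start address of the user buffer */
	size_t len;     /* iov_len: bytes that belong to the request */
	size_t offset;  /* bytes of this iovec already consumed */
};

/*
 * Bytes that may be taken from the current page of iov. The result is
 * limited by what the packet still needs, what is left in the page, and
 * what is left in the iovec (the limit that bug 1 describes as missing).
 */
static size_t model_chunk_len(const struct model_iovec *iov,
			      size_t pkt_remaining)
{
	size_t in_page = MODEL_PAGE_SIZE -
			 ((iov->base + iov->offset) % MODEL_PAGE_SIZE);
	size_t in_iov = iov->len - iov->offset;
	size_t n = pkt_remaining;

	if (n > in_page)
		n = in_page;
	if (n > in_iov)
		n = in_iov;
	return n;
}

/*
 * Gather pkt_len bytes starting at iovs[*idx], moving to the next iovec
 * whenever the current one is exhausted (the step that bug 2 describes
 * as skipped). Returns the number of bytes gathered.
 */
static size_t model_fill_packet(struct model_iovec *iovs, size_t niovs,
				size_t *idx, size_t pkt_len)
{
	size_t gathered = 0;

	while (gathered < pkt_len && *idx < niovs) {
		struct model_iovec *iov = &iovs[*idx];
		size_t n = model_chunk_len(iov, pkt_len - gathered);

		iov->offset += n;
		gathered += n;
		if (iov->offset == iov->len)
			(*idx)++;
	}
	return gathered;
}
```

In this model a 100-byte iovec followed by a larger one yields a packet
with 100 bytes from the first buffer and the remainder from the second,
rather than up to a full page from the first.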

This has not been an issue with SDMA packets from hfi1 Verbs or PSM2
because they only produce iovecs that end short of PAGE_SIZE as the tail
iovec of an SDMA request.

Fixing these bugs exposes other bugs with the SDMA pin cache
(struct mmu_rb_handler) that get in the way of supporting user SDMA
requests with multiple payload iovecs whose buffers do not end at
PAGE_SIZE. So this commit fixes those issues as well.

Here are the mmu_rb_handler bugs that non-PAGE_SIZE-end multi-iovec
payload user SDMA requests can hit:
1. Overlapping memory ranges in mmu_rb_handler will result in duplicate
   pinnings.
2. When extending an existing mmu_rb_handler entry (struct mmu_rb_node),
   the mmu_rb code (1) removes the existing entry under a lock, (2)
   releases that lock and pins the new pages, then (3) reacquires the
   lock to insert the extended mmu_rb_node.

   If someone else comes in and inserts an overlapping entry between (2)
   and (3), insert in (3) will fail.

   The failure path code in this case unpins _all_ pages in either the
   original mmu_rb_node or the new mmu_rb_node that was inserted between
   (2) and (3).
3. In hfi1_mmu_rb_remove_unless_exact(), mmu_rb_node->refcount is
   incremented outside of mmu_rb_handler->lock. As a result, mmu_rb_node
   could be evicted by another thread that gets mmu_rb_handler->lock and
   checks mmu_rb_node->refcount before mmu_rb_node->refcount is
   incremented.
4. Related to #2 above, the SDMA request submission failure path does
   not check mmu_rb_node->refcount before freeing the mmu_rb_node object.

   If there are other SDMA requests in progress whose iovecs have
   pointers to the now-freed mmu_rb_node(s), those pointers to the
   now-freed mmu_rb nodes will be dereferenced when those SDMA requests
   complete.
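
The refcount race in item 3 can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the driver
code: struct model_node, struct model_cache and the model_*() helpers are
hypothetical names, and a C11 atomic_flag stands in for the handler
spinlock. The point it shows is that the lookup takes its reference
while the lock is still held, so an eviction pass that checks the
refcount under the same lock cannot see a node that is in use as idle.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for mmu_rb_node / mmu_rb_handler. */
struct model_node {
	unsigned long addr, len;
	int refcount; /* protected by model_cache.lock */
	bool cached;  /* false once evicted */
};

struct model_cache {
	atomic_flag lock; /* stands in for the handler spinlock */
	struct model_node *nodes;
	size_t nnodes;
};

static void model_lock(struct model_cache *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void model_unlock(struct model_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Return the first cached node overlapping [addr, addr + len) and take
 * a reference on it before the lock is dropped.
 */
static struct model_node *model_get(struct model_cache *c,
				    unsigned long addr, unsigned long len)
{
	struct model_node *found = NULL;

	model_lock(c);
	for (size_t i = 0; i < c->nnodes; i++) {
		struct model_node *n = &c->nodes[i];

		if (n->cached && n->addr < addr + len &&
		    addr < n->addr + n->len) {
			n->refcount++; /* still under the lock */
			found = n;
			break;
		}
	}
	model_unlock(c);
	return found;
}

/* Drop a reference taken by model_get(). */
static void model_put(struct model_cache *c, struct model_node *n)
{
	model_lock(c);
	n->refcount--;
	model_unlock(c);
}

/* Evict every cached node with no references; returns the count. */
static size_t model_evict(struct model_cache *c)
{
	size_t evicted = 0;

	model_lock(c);
	for (size_t i = 0; i < c->nnodes; i++) {
		if (c->nodes[i].cached && c->nodes[i].refcount == 0) {
			c->nodes[i].cached = false;
			evicted++;
		}
	}
	model_unlock(c);
	return evicted;
}
```

In this model a node returned by model_get() survives model_evict()
until the matching model_put(), which is the ordering item 3 says the
original code did not guarantee.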

Fixes: 7be85676f1d1 ("IB/hfi1: Don't remove RB entry when not needed.")
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Brendan Cunningham <bcunningham@cornelisnetworks.com>
Signed-off-by: Patrick Kelsey <pat.kelsey@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/168088636445.3027109.10054635277810177889.stgit@252.162.96.66.static.eigbox.net
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Nicolas Morey <nmorey@suse.com>
---
 drivers/infiniband/hw/hfi1/ipoib_tx.c   |   1 +
 drivers/infiniband/hw/hfi1/mmu_rb.c     |  66 +--
 drivers/infiniband/hw/hfi1/mmu_rb.h     |   8 +-
 drivers/infiniband/hw/hfi1/sdma.c       |  21 +-
 drivers/infiniband/hw/hfi1/sdma.h       |  16 +-
 drivers/infiniband/hw/hfi1/sdma_txreq.h |   1 +
 drivers/infiniband/hw/hfi1/trace_mmu.h  |   4 -
 drivers/infiniband/hw/hfi1/user_sdma.c  | 600 +++++++++++++++---------
 drivers/infiniband/hw/hfi1/user_sdma.h  |   5 -
 drivers/infiniband/hw/hfi1/verbs.c      |   4 +-
 drivers/infiniband/hw/hfi1/vnic_sdma.c  |   1 +
 11 files changed, 423 insertions(+), 304 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index 349eb4139136..8973a081d641 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -215,6 +215,7 @@ static int hfi1_ipoib_build_ulp_payload(struct ipoib_txreq *tx,
 		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
 		ret = sdma_txadd_page(dd,
+				      NULL,
 				      txreq,
 				      skb_frag_page(frag),
 				      frag->bv_offset,
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index af46ff203342..71b9ac018887 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -126,7 +126,7 @@ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, mnode->addr, mnode->len);
 	if (node) {
-		ret = -EINVAL;
+		ret = -EEXIST;
 		goto unlock;
 	}
 	__mmu_int_rb_insert(mnode, &handler->root);
@@ -143,6 +143,19 @@ unlock:
 	return ret;
 }
 
+/* Caller must hold handler lock */
+struct mmu_rb_node *hfi1_mmu_rb_get_first(struct mmu_rb_handler *handler,
+					  unsigned long addr, unsigned long len)
+{
+	struct mmu_rb_node *node;
+
+	trace_hfi1_mmu_rb_search(addr, len);
+	node = __mmu_int_rb_iter_first(&handler->root, addr, (addr + len) - 1);
+	if (node)
+		list_move_tail(&node->list, &handler->lru_list);
+	return node;
+}
+
 /* Caller must hold handler lock */
 static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 					   unsigned long addr,
@@ -167,34 +180,6 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 	return node;
 }
 
-bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
-				     unsigned long addr, unsigned long len,
-				     struct mmu_rb_node **rb_node)
-{
-	struct mmu_rb_node *node;
-	unsigned long flags;
-	bool ret = false;
-
-	if (current->mm != handler->mn.mm)
-		return ret;
-
-	spin_lock_irqsave(&handler->lock, flags);
-	node = __mmu_rb_search(handler, addr, len);
-	if (node) {
-		if (node->addr == addr && node->len == len) {
-			list_move_tail(&node->list, &handler->lru_list);
-			goto unlock;
-		}
-		__mmu_int_rb_remove(node, &handler->root);
-		list_del(&node->list); /* remove from LRU list */
-		ret = true;
-	}
-unlock:
-	spin_unlock_irqrestore(&handler->lock, flags);
-	*rb_node = node;
-	return ret;
-}
-
 void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 {
 	struct mmu_rb_node *rbnode, *ptr;
@@ -225,29 +210,6 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	}
 }
 
-/*
- * It is up to the caller to ensure that this function does not race with the
- * mmu invalidate notifier which may be calling the users remove callback on
- * 'node'.
- */
-void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
-			struct mmu_rb_node *node)
-{
-	unsigned long flags;
-
-	if (current->mm != handler->mn.mm)
-		return;
-
-	/* Validity of handler and node pointers has been checked by caller. */
-	trace_hfi1_mmu_rb_remove(node->addr, node->len);
-	spin_lock_irqsave(&handler->lock, flags);
-	__mmu_int_rb_remove(node, &handler->root);
-	list_del(&node->list); /* remove from LRU list */
-	spin_unlock_irqrestore(&handler->lock, flags);
-
-	handler->ops->remove(handler->ops_arg, node);
-}
-
 static int mmu_notifier_range_start(struct mmu_notifier *mn,
 		const struct mmu_notifier_range *range)
 {
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 7417be2b9dc8..ed75acdb7b83 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -52,10 +52,8 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler);
 int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 		       struct mmu_rb_node *mnode);
 void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg);
-void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
-			struct mmu_rb_node *mnode);
-bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
-				     unsigned long addr, unsigned long len,
-				     struct mmu_rb_node **rb_node);
+struct mmu_rb_node *hfi1_mmu_rb_get_first(struct mmu_rb_handler *handler,
+					  unsigned long addr,
+					  unsigned long len);
 
 #endif /* _HFI1_MMU_RB_H */
diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 8ed20392e9f0..bb2552dd29c1 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -1593,22 +1593,7 @@ static inline void sdma_unmap_desc(
 	struct hfi1_devdata *dd,
 	struct sdma_desc *descp)
 {
-	switch (sdma_mapping_type(descp)) {
-	case SDMA_MAP_SINGLE:
-		dma_unmap_single(
-			&dd->pcidev->dev,
-			sdma_mapping_addr(descp),
-			sdma_mapping_len(descp),
-			DMA_TO_DEVICE);
-		break;
-	case SDMA_MAP_PAGE:
-		dma_unmap_page(
-			&dd->pcidev->dev,
-			sdma_mapping_addr(descp),
-			sdma_mapping_len(descp),
-			DMA_TO_DEVICE);
-		break;
-	}
+	system_descriptor_complete(dd, descp);
 }
 
 /*
@@ -3128,7 +3113,7 @@ int ext_coal_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx,
 
 		/* Add descriptor for coalesce buffer */
 		tx->desc_limit = MAX_DESC;
-		return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, tx,
+		return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, NULL, tx,
 					 addr, tx->tlen);
 	}
 
@@ -3167,10 +3152,12 @@ int _pad_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx)
 			return rval;
 		}
 	}
+
 	/* finish the one just added */
 	make_tx_sdma_desc(
 		tx,
 		SDMA_MAP_NONE,
+		NULL,
 		dd->sdma_pad_phys,
 		sizeof(u32) - (tx->packet_len & (sizeof(u32) - 1)));
 	tx->num_desc++;
diff --git a/drivers/infiniband/hw/hfi1/sdma.h b/drivers/infiniband/hw/hfi1/sdma.h
index b023fc461bd5..95aaec14c6c2 100644
--- a/drivers/infiniband/hw/hfi1/sdma.h
+++ b/drivers/infiniband/hw/hfi1/sdma.h
@@ -594,6 +594,7 @@ static inline dma_addr_t sdma_mapping_addr(struct sdma_desc *d)
 static inline void make_tx_sdma_desc(
 	struct sdma_txreq *tx,
 	int type,
+	void *pinning_ctx,
 	dma_addr_t addr,
 	size_t len)
 {
@@ -612,6 +613,7 @@ static inline void make_tx_sdma_desc(
 				<< SDMA_DESC0_PHY_ADDR_SHIFT) |
 			(((u64)len & SDMA_DESC0_BYTE_COUNT_MASK)
 				<< SDMA_DESC0_BYTE_COUNT_SHIFT);
+	desc->pinning_ctx = pinning_ctx;
 }
 
 /* helper to extend txreq */
@@ -643,6 +645,7 @@ static inline void _sdma_close_tx(struct hfi1_devdata *dd,
 static inline int _sdma_txadd_daddr(
 	struct hfi1_devdata *dd,
 	int type,
+	void *pinning_ctx,
 	struct sdma_txreq *tx,
 	dma_addr_t addr,
 	u16 len)
@@ -652,6 +655,7 @@ static inline int _sdma_txadd_daddr(
 	make_tx_sdma_desc(
 		tx,
 		type,
+		pinning_ctx,
 		addr, len);
 	WARN_ON(len > tx->tlen);
 	tx->num_desc++;
@@ -672,6 +676,7 @@ static inline int _sdma_txadd_daddr(
 /**
  * sdma_txadd_page() - add a page to the sdma_txreq
  * @dd: the device to use for mapping
+ * @pinning_ctx: context to be released at descriptor retirement
  * @tx: tx request to which the page is added
  * @page: page to map
  * @offset: offset within the page
@@ -687,6 +692,7 @@ static inline int _sdma_txadd_daddr(
  */
 static inline int sdma_txadd_page(
 	struct hfi1_devdata *dd,
+	void *pinning_ctx,
 	struct sdma_txreq *tx,
 	struct page *page,
 	unsigned long offset,
@@ -714,8 +720,7 @@ static inline int sdma_txadd_page(
 		return -ENOSPC;
 	}
 
-	return _sdma_txadd_daddr(
-			dd, SDMA_MAP_PAGE, tx, addr, len);
+	return _sdma_txadd_daddr(dd, SDMA_MAP_PAGE, pinning_ctx, tx, addr, len);
 }
 
 /**
@@ -749,7 +754,8 @@ static inline int sdma_txadd_daddr(
 			return rval;
 	}
 
-	return _sdma_txadd_daddr(dd, SDMA_MAP_NONE, tx, addr, len);
+	return _sdma_txadd_daddr(dd, SDMA_MAP_NONE, NULL, tx,
+				 addr, len);
 }
 
 /**
@@ -795,8 +801,7 @@ static inline int sdma_txadd_kvaddr(
 		return -ENOSPC;
 	}
 
-	return _sdma_txadd_daddr(
-			dd, SDMA_MAP_SINGLE, tx, addr, len);
+	return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, NULL, tx, addr, len);
 }
 
 struct iowait_work;
@@ -1030,4 +1035,5 @@ extern uint mod_num_sdma;
 
 void sdma_update_lmc(struct hfi1_devdata *dd, u64 mask, u32 lid);
 
+void system_descriptor_complete(struct hfi1_devdata *dd, struct sdma_desc *descp);
 #endif
diff --git a/drivers/infiniband/hw/hfi1/sdma_txreq.h b/drivers/infiniband/hw/hfi1/sdma_txreq.h
index e262fb5c5ec6..fad946cb5e0d 100644
--- a/drivers/infiniband/hw/hfi1/sdma_txreq.h
+++ b/drivers/infiniband/hw/hfi1/sdma_txreq.h
@@ -19,6 +19,7 @@
 struct sdma_desc {
 	/* private:  don't use directly */
 	u64 qw[2];
+	void *pinning_ctx;
 };
 
 /**
diff --git a/drivers/infiniband/hw/hfi1/trace_mmu.h b/drivers/infiniband/hw/hfi1/trace_mmu.h
index 187e9244fe5e..57900ebb7702 100644
--- a/drivers/infiniband/hw/hfi1/trace_mmu.h
+++ b/drivers/infiniband/hw/hfi1/trace_mmu.h
@@ -37,10 +37,6 @@ DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_rb_search,
 	     TP_PROTO(unsigned long addr, unsigned long len),
 	     TP_ARGS(addr, len));
 
-DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_rb_remove,
-	     TP_PROTO(unsigned long addr, unsigned long len),
-	     TP_ARGS(addr, len));
-
 DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_mem_invalidate,
 	     TP_PROTO(unsigned long addr, unsigned long len),
 	     TP_ARGS(addr, len));
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index a71c5a36ceba..ae58b48afe07 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -24,7 +24,6 @@
 
 #include "hfi.h"
 #include "sdma.h"
-#include "mmu_rb.h"
 #include "user_sdma.h"
 #include "verbs.h"  /* for the headers */
 #include "common.h" /* for struct hfi1_tid_info */
@@ -39,11 +38,7 @@ static unsigned initial_pkt_count = 8;
 static int user_sdma_send_pkts(struct user_sdma_request *req, u16 maxpkts);
 static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status);
 static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq);
-static void user_sdma_free_request(struct user_sdma_request *req, bool unpin);
-static int pin_vector_pages(struct user_sdma_request *req,
-			    struct user_sdma_iovec *iovec);
-static void unpin_vector_pages(struct mm_struct *mm, struct page **pages,
-			       unsigned start, unsigned npages);
+static void user_sdma_free_request(struct user_sdma_request *req);
 static int check_header_template(struct user_sdma_request *req,
 				 struct hfi1_pkt_header *hdr, u32 lrhlen,
 				 u32 datalen);
@@ -81,6 +76,11 @@ static struct mmu_rb_ops sdma_rb_ops = {
 	.invalidate = sdma_rb_invalidate
 };
 
+static int add_system_pages_to_sdma_packet(struct user_sdma_request *req,
+					   struct user_sdma_txreq *tx,
+					   struct user_sdma_iovec *iovec,
+					   u32 *pkt_remaining);
+
 static int defer_packet_queue(
 	struct sdma_engine *sde,
 	struct iowait_work *wait,
@@ -410,6 +410,7 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 		ret = -EINVAL;
 		goto free_req;
 	}
+
 	/* Copy the header from the user buffer */
 	ret = copy_from_user(&req->hdr, iovec[idx].iov_base + sizeof(info),
 			     sizeof(req->hdr));
@@ -484,9 +485,8 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 		memcpy(&req->iovs[i].iov,
 		       iovec + idx++,
 		       sizeof(req->iovs[i].iov));
-		ret = pin_vector_pages(req, &req->iovs[i]);
-		if (ret) {
-			req->data_iovs = i;
+		if (req->iovs[i].iov.iov_len == 0) {
+			ret = -EINVAL;
 			goto free_req;
 		}
 		req->data_len += req->iovs[i].iov.iov_len;
@@ -584,7 +584,7 @@ free_req:
 		if (req->seqsubmitted)
 			wait_event(pq->busy.wait_dma,
 				   (req->seqcomp == req->seqsubmitted - 1));
-		user_sdma_free_request(req, true);
+		user_sdma_free_request(req);
 		pq_update(pq);
 		set_comp_state(pq, cq, info.comp_idx, ERROR, ret);
 	}
@@ -696,48 +696,6 @@ static int user_sdma_txadd_ahg(struct user_sdma_request *req,
 	return ret;
 }
 
-static int user_sdma_txadd(struct user_sdma_request *req,
-			   struct user_sdma_txreq *tx,
-			   struct user_sdma_iovec *iovec, u32 datalen,
-			   u32 *queued_ptr, u32 *data_sent_ptr,
-			   u64 *iov_offset_ptr)
-{
-	int ret;
-	unsigned int pageidx, len;
-	unsigned long base, offset;
-	u64 iov_offset = *iov_offset_ptr;
-	u32 queued = *queued_ptr, data_sent = *data_sent_ptr;
-	struct hfi1_user_sdma_pkt_q *pq = req->pq;
-
-	base = (unsigned long)iovec->iov.iov_base;
-	offset = offset_in_page(base + iovec->offset + iov_offset);
-	pageidx = (((iovec->offset + iov_offset + base) - (base & PAGE_MASK)) >>
-		   PAGE_SHIFT);
-	len = offset + req->info.fragsize > PAGE_SIZE ?
-		PAGE_SIZE - offset : req->info.fragsize;
-	len = min((datalen - queued), len);
-	ret = sdma_txadd_page(pq->dd, &tx->txreq, iovec->pages[pageidx],
-			      offset, len);
-	if (ret) {
-		SDMA_DBG(req, "SDMA txreq add page failed %d\n", ret);
-		return ret;
-	}
-	iov_offset += len;
-	queued += len;
-	data_sent += len;
-	if (unlikely(queued < datalen && pageidx == iovec->npages &&
-		     req->iov_idx < req->data_iovs - 1)) {
-		iovec->offset += iov_offset;
-		iovec = &req->iovs[++req->iov_idx];
-		iov_offset = 0;
-	}
-
-	*queued_ptr = queued;
-	*data_sent_ptr = data_sent;
-	*iov_offset_ptr = iov_offset;
-	return ret;
-}
-
 static int user_sdma_send_pkts(struct user_sdma_request *req, u16 maxpkts)
 {
 	int ret = 0;
@@ -769,8 +727,7 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, u16 maxpkts)
 		maxpkts = req->info.npkts - req->seqnum;
 
 	while (npkts < maxpkts) {
-		u32 datalen = 0, queued = 0, data_sent = 0;
-		u64 iov_offset = 0;
+		u32 datalen = 0;
 
 		/*
 		 * Check whether any of the completions have come back
@@ -863,27 +820,17 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, u16 maxpkts)
 				goto free_txreq;
 		}
 
-		/*
-		 * If the request contains any data vectors, add up to
-		 * fragsize bytes to the descriptor.
-		 */
-		while (queued < datalen &&
-		       (req->sent + data_sent) < req->data_len) {
-			ret = user_sdma_txadd(req, tx, iovec, datalen,
-					      &queued, &data_sent, &iov_offset);
-			if (ret)
-				goto free_txreq;
-		}
-		/*
-		 * The txreq was submitted successfully so we can update
-		 * the counters.
-		 */
 		req->koffset += datalen;
 		if (req_opcode(req->info.ctrl) == EXPECTED)
 			req->tidoffset += datalen;
-		req->sent += data_sent;
-		if (req->data_len)
-			iovec->offset += iov_offset;
+		req->sent += datalen;
+		while (datalen) {
+			ret = add_system_pages_to_sdma_packet(req, tx, iovec,
+							      &datalen);
+			if (ret)
+				goto free_txreq;
+			iovec = &req->iovs[req->iov_idx];
+		}
 		list_add_tail(&tx->txreq.list, &req->txps);
 		/*
 		 * It is important to increment this here as it is used to
@@ -920,133 +867,14 @@ free_tx:
 static u32 sdma_cache_evict(struct hfi1_user_sdma_pkt_q *pq, u32 npages)
 {
 	struct evict_data evict_data;
+	struct mmu_rb_handler *handler = pq->handler;
 
 	evict_data.cleared = 0;
 	evict_data.target = npages;
-	hfi1_mmu_rb_evict(pq->handler, &evict_data);
+	hfi1_mmu_rb_evict(handler, &evict_data);
 	return evict_data.cleared;
 }
 
-static int pin_sdma_pages(struct user_sdma_request *req,
-			  struct user_sdma_iovec *iovec,
-			  struct sdma_mmu_node *node,
-			  int npages)
-{
-	int pinned, cleared;
-	struct page **pages;
-	struct hfi1_user_sdma_pkt_q *pq = req->pq;
-
-	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
-	if (!pages)
-		return -ENOMEM;
-	memcpy(pages, node->pages, node->npages * sizeof(*pages));
-
-	npages -= node->npages;
-retry:
-	if (!hfi1_can_pin_pages(pq->dd, current->mm,
-				atomic_read(&pq->n_locked), npages)) {
-		cleared = sdma_cache_evict(pq, npages);
-		if (cleared >= npages)
-			goto retry;
-	}
-	pinned = hfi1_acquire_user_pages(current->mm,
-					 ((unsigned long)iovec->iov.iov_base +
-					 (node->npages * PAGE_SIZE)), npages, 0,
-					 pages + node->npages);
-	if (pinned < 0) {
-		kfree(pages);
-		return pinned;
-	}
-	if (pinned != npages) {
-		unpin_vector_pages(current->mm, pages, node->npages, pinned);
-		return -EFAULT;
-	}
-	kfree(node->pages);
-	node->rb.len = iovec->iov.iov_len;
-	node->pages = pages;
-	atomic_add(pinned, &pq->n_locked);
-	return pinned;
-}
-
-static void unpin_sdma_pages(struct sdma_mmu_node *node)
-{
-	if (node->npages) {
-		unpin_vector_pages(mm_from_sdma_node(node), node->pages, 0,
-				   node->npages);
-		atomic_sub(node->npages, &node->pq->n_locked);
-	}
-}
-
-static int pin_vector_pages(struct user_sdma_request *req,
-			    struct user_sdma_iovec *iovec)
-{
-	int ret = 0, pinned, npages;
-	struct hfi1_user_sdma_pkt_q *pq = req->pq;
-	struct sdma_mmu_node *node = NULL;
-	struct mmu_rb_node *rb_node;
-	struct iovec *iov;
-	bool extracted;
-
-	extracted =
-		hfi1_mmu_rb_remove_unless_exact(pq->handler,
-						(unsigned long)
-						iovec->iov.iov_base,
-						iovec->iov.iov_len, &rb_node);
-	if (rb_node) {
-		node = container_of(rb_node, struct sdma_mmu_node, rb);
-		if (!extracted) {
-			atomic_inc(&node->refcount);
-			iovec->pages = node->pages;
Nicolas Morey b03454
-			iovec->npages = node->npages;
Nicolas Morey b03454
-			iovec->node = node;
Nicolas Morey b03454
-			return 0;
Nicolas Morey b03454
-		}
Nicolas Morey b03454
-	}
Nicolas Morey b03454
-
Nicolas Morey b03454
-	if (!node) {
Nicolas Morey b03454
-		node = kzalloc(sizeof(*node), GFP_KERNEL);
Nicolas Morey b03454
-		if (!node)
Nicolas Morey b03454
-			return -ENOMEM;
Nicolas Morey b03454
-
Nicolas Morey b03454
-		node->rb.addr = (unsigned long)iovec->iov.iov_base;
Nicolas Morey b03454
-		node->pq = pq;
Nicolas Morey b03454
-		atomic_set(&node->refcount, 0);
Nicolas Morey b03454
-	}
Nicolas Morey b03454
-
Nicolas Morey b03454
-	iov = &iovec->iov;
Nicolas Morey b03454
-	npages = num_user_pages((unsigned long)iov->iov_base, iov->iov_len);
Nicolas Morey b03454
-	if (node->npages < npages) {
Nicolas Morey b03454
-		pinned = pin_sdma_pages(req, iovec, node, npages);
Nicolas Morey b03454
-		if (pinned < 0) {
Nicolas Morey b03454
-			ret = pinned;
Nicolas Morey b03454
-			goto bail;
Nicolas Morey b03454
-		}
Nicolas Morey b03454
-		node->npages += pinned;
Nicolas Morey b03454
-		npages = node->npages;
Nicolas Morey b03454
-	}
Nicolas Morey b03454
-	iovec->pages = node->pages;
Nicolas Morey b03454
-	iovec->npages = npages;
Nicolas Morey b03454
-	iovec->node = node;
Nicolas Morey b03454
-
Nicolas Morey b03454
-	ret = hfi1_mmu_rb_insert(req->pq->handler, &node->rb);
Nicolas Morey b03454
-	if (ret) {
Nicolas Morey b03454
-		iovec->node = NULL;
Nicolas Morey b03454
-		goto bail;
Nicolas Morey b03454
-	}
Nicolas Morey b03454
-	return 0;
Nicolas Morey b03454
-bail:
Nicolas Morey b03454
-	unpin_sdma_pages(node);
Nicolas Morey b03454
-	kfree(node);
Nicolas Morey b03454
-	return ret;
Nicolas Morey b03454
-}
Nicolas Morey b03454
-
Nicolas Morey b03454
-static void unpin_vector_pages(struct mm_struct *mm, struct page **pages,
Nicolas Morey b03454
-			       unsigned start, unsigned npages)
Nicolas Morey b03454
-{
Nicolas Morey b03454
-	hfi1_release_user_pages(mm, pages + start, npages, false);
Nicolas Morey b03454
-	kfree(pages);
Nicolas Morey b03454
-}
Nicolas Morey b03454
-
Nicolas Morey b03454
 static int check_header_template(struct user_sdma_request *req,
Nicolas Morey b03454
 				 struct hfi1_pkt_header *hdr, u32 lrhlen,
Nicolas Morey b03454
 				 u32 datalen)
Nicolas Morey b03454
@@ -1388,7 +1216,7 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
Nicolas Morey b03454
 	if (req->seqcomp != req->info.npkts - 1)
Nicolas Morey b03454
 		return;
Nicolas Morey b03454
 
Nicolas Morey b03454
-	user_sdma_free_request(req, false);
Nicolas Morey b03454
+	user_sdma_free_request(req);
Nicolas Morey b03454
 	set_comp_state(pq, cq, req->info.comp_idx, state, status);
Nicolas Morey b03454
 	pq_update(pq);
Nicolas Morey b03454
 }
Nicolas Morey b03454
@@ -1399,10 +1227,8 @@ static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq)
Nicolas Morey b03454
 		wake_up(&pq->wait);
Nicolas Morey b03454
 }
Nicolas Morey b03454
 
Nicolas Morey b03454
-static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
Nicolas Morey b03454
+static void user_sdma_free_request(struct user_sdma_request *req)
Nicolas Morey b03454
 {
Nicolas Morey b03454
-	int i;
Nicolas Morey b03454
-
Nicolas Morey b03454
 	if (!list_empty(&req->txps)) {
Nicolas Morey b03454
 		struct sdma_txreq *t, *p;
Nicolas Morey b03454
 
Nicolas Morey b03454
@@ -1415,21 +1241,6 @@ static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
Nicolas Morey b03454
 		}
Nicolas Morey b03454
 	}
Nicolas Morey b03454
 
Nicolas Morey b03454
-	for (i = 0; i < req->data_iovs; i++) {
Nicolas Morey b03454
-		struct sdma_mmu_node *node = req->iovs[i].node;
Nicolas Morey b03454
-
Nicolas Morey b03454
-		if (!node)
Nicolas Morey b03454
-			continue;
Nicolas Morey b03454
-
Nicolas Morey b03454
-		req->iovs[i].node = NULL;
Nicolas Morey b03454
-
Nicolas Morey b03454
-		if (unpin)
Nicolas Morey b03454
-			hfi1_mmu_rb_remove(req->pq->handler,
Nicolas Morey b03454
-					   &node->rb);
Nicolas Morey b03454
-		else
Nicolas Morey b03454
-			atomic_dec(&node->refcount);
Nicolas Morey b03454
-	}
Nicolas Morey b03454
-
Nicolas Morey b03454
 	kfree(req->tids);
Nicolas Morey b03454
 	clear_bit(req->info.comp_idx, req->pq->req_in_use);
Nicolas Morey b03454
 }
Nicolas Morey b03454
@@ -1447,6 +1258,368 @@ static inline void set_comp_state(struct hfi1_user_sdma_pkt_q *pq,
Nicolas Morey b03454
 					idx, state, ret);
Nicolas Morey b03454
 }
Nicolas Morey b03454
 
Nicolas Morey b03454
+static void unpin_vector_pages(struct mm_struct *mm, struct page **pages,
Nicolas Morey b03454
+			       unsigned int start, unsigned int npages)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	hfi1_release_user_pages(mm, pages + start, npages, false);
Nicolas Morey b03454
+	kfree(pages);
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static void free_system_node(struct sdma_mmu_node *node)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	if (node->npages) {
Nicolas Morey b03454
+		unpin_vector_pages(mm_from_sdma_node(node), node->pages, 0,
Nicolas Morey b03454
+				   node->npages);
Nicolas Morey b03454
+		atomic_sub(node->npages, &node->pq->n_locked);
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	kfree(node);
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static inline void acquire_node(struct sdma_mmu_node *node)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	atomic_inc(&node->refcount);
Nicolas Morey b03454
+	WARN_ON(atomic_read(&node->refcount) < 0);
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static inline void release_node(struct mmu_rb_handler *handler,
Nicolas Morey b03454
+				struct sdma_mmu_node *node)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	atomic_dec(&node->refcount);
Nicolas Morey b03454
+	WARN_ON(atomic_read(&node->refcount) < 0);
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static struct sdma_mmu_node *find_system_node(struct mmu_rb_handler *handler,
Nicolas Morey b03454
+					      unsigned long start,
Nicolas Morey b03454
+					      unsigned long end)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct mmu_rb_node *rb_node;
Nicolas Morey b03454
+	struct sdma_mmu_node *node;
Nicolas Morey b03454
+	unsigned long flags;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	spin_lock_irqsave(&handler->lock, flags);
Nicolas Morey b03454
+	rb_node = hfi1_mmu_rb_get_first(handler, start, (end - start));
Nicolas Morey b03454
+	if (!rb_node) {
Nicolas Morey b03454
+		spin_unlock_irqrestore(&handler->lock, flags);
Nicolas Morey b03454
+		return NULL;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	node = container_of(rb_node, struct sdma_mmu_node, rb);
Nicolas Morey b03454
+	acquire_node(node);
Nicolas Morey b03454
+	spin_unlock_irqrestore(&handler->lock, flags);
Nicolas Morey b03454
+
Nicolas Morey b03454
+	return node;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int pin_system_pages(struct user_sdma_request *req,
Nicolas Morey b03454
+			    uintptr_t start_address, size_t length,
Nicolas Morey b03454
+			    struct sdma_mmu_node *node, int npages)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
Nicolas Morey b03454
+	int pinned, cleared;
Nicolas Morey b03454
+	struct page **pages;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
Nicolas Morey b03454
+	if (!pages)
Nicolas Morey b03454
+		return -ENOMEM;
Nicolas Morey b03454
+
Nicolas Morey b03454
+retry:
Nicolas Morey b03454
+	if (!hfi1_can_pin_pages(pq->dd, current->mm, atomic_read(&pq->n_locked),
Nicolas Morey b03454
+				npages)) {
Nicolas Morey b03454
+		SDMA_DBG(req, "Evicting: nlocked %u npages %u",
Nicolas Morey b03454
+			 atomic_read(&pq->n_locked), npages);
Nicolas Morey b03454
+		cleared = sdma_cache_evict(pq, npages);
Nicolas Morey b03454
+		if (cleared >= npages)
Nicolas Morey b03454
+			goto retry;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+
Nicolas Morey b03454
+	SDMA_DBG(req, "Acquire user pages start_address %lx node->npages %u npages %u",
Nicolas Morey b03454
+		 start_address, node->npages, npages);
Nicolas Morey b03454
+	pinned = hfi1_acquire_user_pages(current->mm, start_address, npages, 0,
Nicolas Morey b03454
+					 pages);
Nicolas Morey b03454
+
Nicolas Morey b03454
+	if (pinned < 0) {
Nicolas Morey b03454
+		kfree(pages);
Nicolas Morey b03454
+		SDMA_DBG(req, "pinned %d", pinned);
Nicolas Morey b03454
+		return pinned;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	if (pinned != npages) {
Nicolas Morey b03454
+		unpin_vector_pages(current->mm, pages, node->npages, pinned);
Nicolas Morey b03454
+		SDMA_DBG(req, "npages %u pinned %d", npages, pinned);
Nicolas Morey b03454
+		return -EFAULT;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	node->rb.addr = start_address;
Nicolas Morey b03454
+	node->rb.len = length;
Nicolas Morey b03454
+	node->pages = pages;
Nicolas Morey b03454
+	node->npages = npages;
Nicolas Morey b03454
+	atomic_add(pinned, &pq->n_locked);
Nicolas Morey b03454
+	SDMA_DBG(req, "done. pinned %d", pinned);
Nicolas Morey b03454
+	return 0;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int add_system_pinning(struct user_sdma_request *req,
Nicolas Morey b03454
+			      struct sdma_mmu_node **node_p,
Nicolas Morey b03454
+			      unsigned long start, unsigned long len)
Nicolas Morey b03454
+
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
Nicolas Morey b03454
+	struct sdma_mmu_node *node;
Nicolas Morey b03454
+	int ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
Nicolas Morey b03454
+	if (!node)
Nicolas Morey b03454
+		return -ENOMEM;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	node->pq = pq;
Nicolas Morey b03454
+	ret = pin_system_pages(req, start, len, node, PFN_DOWN(len));
Nicolas Morey b03454
+	if (ret == 0) {
Nicolas Morey b03454
+		ret = hfi1_mmu_rb_insert(pq->handler, &node->rb);
Nicolas Morey b03454
+		if (ret)
Nicolas Morey b03454
+			free_system_node(node);
Nicolas Morey b03454
+		else
Nicolas Morey b03454
+			*node_p = node;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		return ret;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+
Nicolas Morey b03454
+	kfree(node);
Nicolas Morey b03454
+	return ret;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int get_system_cache_entry(struct user_sdma_request *req,
Nicolas Morey b03454
+				  struct sdma_mmu_node **node_p,
Nicolas Morey b03454
+				  size_t req_start, size_t req_len)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
Nicolas Morey b03454
+	u64 start = ALIGN_DOWN(req_start, PAGE_SIZE);
Nicolas Morey b03454
+	u64 end = PFN_ALIGN(req_start + req_len);
Nicolas Morey b03454
+	struct mmu_rb_handler *handler = pq->handler;
Nicolas Morey b03454
+	int ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	if ((end - start) == 0) {
Nicolas Morey b03454
+		SDMA_DBG(req,
Nicolas Morey b03454
+			 "Request for empty cache entry req_start %lx req_len %lx start %llx end %llx",
Nicolas Morey b03454
+			 req_start, req_len, start, end);
Nicolas Morey b03454
+		return -EINVAL;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+
Nicolas Morey b03454
+	SDMA_DBG(req, "req_start %lx req_len %lu", req_start, req_len);
Nicolas Morey b03454
+
Nicolas Morey b03454
+	while (1) {
Nicolas Morey b03454
+		struct sdma_mmu_node *node =
Nicolas Morey b03454
+			find_system_node(handler, start, end);
Nicolas Morey b03454
+		u64 prepend_len = 0;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		SDMA_DBG(req, "node %p start %llx end %llu", node, start, end);
Nicolas Morey b03454
+		if (!node) {
Nicolas Morey b03454
+			ret = add_system_pinning(req, node_p, start,
Nicolas Morey b03454
+						 end - start);
Nicolas Morey b03454
+			if (ret == -EEXIST) {
Nicolas Morey b03454
+				/*
Nicolas Morey b03454
+				 * Another execution context has inserted a
Nicolas Morey b03454
+				 * conflicting entry first.
Nicolas Morey b03454
+				 */
Nicolas Morey b03454
+				continue;
Nicolas Morey b03454
+			}
Nicolas Morey b03454
+			return ret;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		if (node->rb.addr <= start) {
Nicolas Morey b03454
+			/*
Nicolas Morey b03454
+			 * This entry covers at least part of the region. If it doesn't extend
Nicolas Morey b03454
+			 * to the end, then this will be called again for the next segment.
Nicolas Morey b03454
+			 */
Nicolas Morey b03454
+			*node_p = node;
Nicolas Morey b03454
+			return 0;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		SDMA_DBG(req, "prepend: node->rb.addr %lx, node->refcount %d",
Nicolas Morey b03454
+			 node->rb.addr, atomic_read(&node->refcount));
Nicolas Morey b03454
+		prepend_len = node->rb.addr - start;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		/*
Nicolas Morey b03454
+		 * This node will not be returned; a new node will be
Nicolas Morey b03454
+		 * returned instead, so release the reference.
Nicolas Morey b03454
+		 */
Nicolas Morey b03454
+		release_node(handler, node);
Nicolas Morey b03454
+
Nicolas Morey b03454
+		/* Prepend a node to cover the beginning of the allocation */
Nicolas Morey b03454
+		ret = add_system_pinning(req, node_p, start, prepend_len);
Nicolas Morey b03454
+		if (ret == -EEXIST) {
Nicolas Morey b03454
+			/* Another execution context has inserted a conflicting entry first. */
Nicolas Morey b03454
+			continue;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+		return ret;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int add_mapping_to_sdma_packet(struct user_sdma_request *req,
Nicolas Morey b03454
+				      struct user_sdma_txreq *tx,
Nicolas Morey b03454
+				      struct sdma_mmu_node *cache_entry,
Nicolas Morey b03454
+				      size_t start,
Nicolas Morey b03454
+				      size_t from_this_cache_entry)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
Nicolas Morey b03454
+	unsigned int page_offset;
Nicolas Morey b03454
+	unsigned int from_this_page;
Nicolas Morey b03454
+	size_t page_index;
Nicolas Morey b03454
+	void *ctx;
Nicolas Morey b03454
+	int ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	/*
Nicolas Morey b03454
+	 * Because the cache may be more fragmented than the memory that is being accessed,
Nicolas Morey b03454
+	 * it's not strictly necessary to have a descriptor per cache entry.
Nicolas Morey b03454
+	 */
Nicolas Morey b03454
+
Nicolas Morey b03454
+	while (from_this_cache_entry) {
Nicolas Morey b03454
+		page_index = PFN_DOWN(start - cache_entry->rb.addr);
Nicolas Morey b03454
+
Nicolas Morey b03454
+		if (page_index >= cache_entry->npages) {
Nicolas Morey b03454
+			SDMA_DBG(req,
Nicolas Morey b03454
+				 "Request for page_index %zu >= cache_entry->npages %u",
Nicolas Morey b03454
+				 page_index, cache_entry->npages);
Nicolas Morey b03454
+			return -EINVAL;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		page_offset = start - ALIGN_DOWN(start, PAGE_SIZE);
Nicolas Morey b03454
+		from_this_page = PAGE_SIZE - page_offset;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		if (from_this_page < from_this_cache_entry) {
Nicolas Morey b03454
+			ctx = NULL;
Nicolas Morey b03454
+		} else {
Nicolas Morey b03454
+			/*
Nicolas Morey b03454
+			 * In the case they are equal the next line has no practical effect,
Nicolas Morey b03454
+			 * but it's better to do a register to register copy than a conditional
Nicolas Morey b03454
+			 * branch.
Nicolas Morey b03454
+			 */
Nicolas Morey b03454
+			from_this_page = from_this_cache_entry;
Nicolas Morey b03454
+			ctx = cache_entry;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		ret = sdma_txadd_page(pq->dd, ctx, &tx->txreq,
Nicolas Morey b03454
+				      cache_entry->pages[page_index],
Nicolas Morey b03454
+				      page_offset, from_this_page);
Nicolas Morey b03454
+		if (ret) {
Nicolas Morey b03454
+			/*
Nicolas Morey b03454
+			 * When there's a failure, the entire request is freed by
Nicolas Morey b03454
+			 * user_sdma_send_pkts().
Nicolas Morey b03454
+			 */
Nicolas Morey b03454
+			SDMA_DBG(req,
Nicolas Morey b03454
+				 "sdma_txadd_page failed %d page_index %zu page_offset %u from_this_page %u",
Nicolas Morey b03454
+				 ret, page_index, page_offset, from_this_page);
Nicolas Morey b03454
+			return ret;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+		start += from_this_page;
Nicolas Morey b03454
+		from_this_cache_entry -= from_this_page;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	return 0;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int add_system_iovec_to_sdma_packet(struct user_sdma_request *req,
Nicolas Morey b03454
+					   struct user_sdma_txreq *tx,
Nicolas Morey b03454
+					   struct user_sdma_iovec *iovec,
Nicolas Morey b03454
+					   size_t from_this_iovec)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	struct mmu_rb_handler *handler = req->pq->handler;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	while (from_this_iovec > 0) {
Nicolas Morey b03454
+		struct sdma_mmu_node *cache_entry;
Nicolas Morey b03454
+		size_t from_this_cache_entry;
Nicolas Morey b03454
+		size_t start;
Nicolas Morey b03454
+		int ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		start = (uintptr_t)iovec->iov.iov_base + iovec->offset;
Nicolas Morey b03454
+		ret = get_system_cache_entry(req, &cache_entry, start,
Nicolas Morey b03454
+					     from_this_iovec);
Nicolas Morey b03454
+		if (ret) {
Nicolas Morey b03454
+			SDMA_DBG(req, "pin system segment failed %d", ret);
Nicolas Morey b03454
+			return ret;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		from_this_cache_entry = cache_entry->rb.len - (start - cache_entry->rb.addr);
Nicolas Morey b03454
+		if (from_this_cache_entry > from_this_iovec)
Nicolas Morey b03454
+			from_this_cache_entry = from_this_iovec;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		ret = add_mapping_to_sdma_packet(req, tx, cache_entry, start,
Nicolas Morey b03454
+						 from_this_cache_entry);
Nicolas Morey b03454
+		if (ret) {
Nicolas Morey b03454
+			/*
Nicolas Morey b03454
+			 * We're guaranteed that there will be no descriptor
Nicolas Morey b03454
+			 * completion callback that releases this node
Nicolas Morey b03454
+			 * because only the last descriptor referencing it
Nicolas Morey b03454
+			 * has a context attached, and a failure means the
Nicolas Morey b03454
+			 * last descriptor was never added.
Nicolas Morey b03454
+			 */
Nicolas Morey b03454
+			release_node(handler, cache_entry);
Nicolas Morey b03454
+			SDMA_DBG(req, "add system segment failed %d", ret);
Nicolas Morey b03454
+			return ret;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		iovec->offset += from_this_cache_entry;
Nicolas Morey b03454
+		from_this_iovec -= from_this_cache_entry;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+
Nicolas Morey b03454
+	return 0;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+static int add_system_pages_to_sdma_packet(struct user_sdma_request *req,
Nicolas Morey b03454
+					   struct user_sdma_txreq *tx,
Nicolas Morey b03454
+					   struct user_sdma_iovec *iovec,
Nicolas Morey b03454
+					   u32 *pkt_data_remaining)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	size_t remaining_to_add = *pkt_data_remaining;
Nicolas Morey b03454
+	/*
Nicolas Morey b03454
+	 * Walk through iovec entries, ensure the associated pages
Nicolas Morey b03454
+	 * are pinned and mapped, add data to the packet until no more
Nicolas Morey b03454
+	 * data remains to be added.
Nicolas Morey b03454
+	 */
Nicolas Morey b03454
+	while (remaining_to_add > 0) {
Nicolas Morey b03454
+		struct user_sdma_iovec *cur_iovec;
Nicolas Morey b03454
+		size_t from_this_iovec;
Nicolas Morey b03454
+		int ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		cur_iovec = iovec;
Nicolas Morey b03454
+		from_this_iovec = iovec->iov.iov_len - iovec->offset;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		if (from_this_iovec > remaining_to_add) {
Nicolas Morey b03454
+			from_this_iovec = remaining_to_add;
Nicolas Morey b03454
+		} else {
Nicolas Morey b03454
+			/* The current iovec entry will be consumed by this pass. */
Nicolas Morey b03454
+			req->iov_idx++;
Nicolas Morey b03454
+			iovec++;
Nicolas Morey b03454
+		}
Nicolas Morey b03454
+
Nicolas Morey b03454
+		ret = add_system_iovec_to_sdma_packet(req, tx, cur_iovec,
Nicolas Morey b03454
+						      from_this_iovec);
Nicolas Morey b03454
+		if (ret)
Nicolas Morey b03454
+			return ret;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		remaining_to_add -= from_this_iovec;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+	*pkt_data_remaining = remaining_to_add;
Nicolas Morey b03454
+
Nicolas Morey b03454
+	return 0;
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
+void system_descriptor_complete(struct hfi1_devdata *dd,
Nicolas Morey b03454
+				struct sdma_desc *descp)
Nicolas Morey b03454
+{
Nicolas Morey b03454
+	switch (sdma_mapping_type(descp)) {
Nicolas Morey b03454
+	case SDMA_MAP_SINGLE:
Nicolas Morey b03454
+		dma_unmap_single(&dd->pcidev->dev, sdma_mapping_addr(descp),
Nicolas Morey b03454
+				 sdma_mapping_len(descp), DMA_TO_DEVICE);
Nicolas Morey b03454
+		break;
Nicolas Morey b03454
+	case SDMA_MAP_PAGE:
Nicolas Morey b03454
+		dma_unmap_page(&dd->pcidev->dev, sdma_mapping_addr(descp),
Nicolas Morey b03454
+			       sdma_mapping_len(descp), DMA_TO_DEVICE);
Nicolas Morey b03454
+		break;
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+
Nicolas Morey b03454
+	if (descp->pinning_ctx) {
Nicolas Morey b03454
+		struct sdma_mmu_node *node = descp->pinning_ctx;
Nicolas Morey b03454
+
Nicolas Morey b03454
+		release_node(node->rb.handler, node);
Nicolas Morey b03454
+	}
Nicolas Morey b03454
+}
Nicolas Morey b03454
+
Nicolas Morey b03454
 static bool sdma_rb_filter(struct mmu_rb_node *node, unsigned long addr,
Nicolas Morey b03454
 			   unsigned long len)
Nicolas Morey b03454
 {
Nicolas Morey b03454
@@ -1493,8 +1666,7 @@ static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode)
Nicolas Morey b03454
 	struct sdma_mmu_node *node =
Nicolas Morey b03454
 		container_of(mnode, struct sdma_mmu_node, rb);
Nicolas Morey b03454
 
Nicolas Morey b03454
-	unpin_sdma_pages(node);
Nicolas Morey b03454
-	kfree(node);
Nicolas Morey b03454
+	free_system_node(node);
Nicolas Morey b03454
 }
Nicolas Morey b03454
 
Nicolas Morey b03454
 static int sdma_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
Nicolas Morey b03454
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
Nicolas Morey b03454
index ea56eb57e656..a241836371dc 100644
Nicolas Morey b03454
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
Nicolas Morey b03454
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
Nicolas Morey b03454
@@ -112,16 +112,11 @@ struct sdma_mmu_node {
Nicolas Morey b03454
 struct user_sdma_iovec {
Nicolas Morey b03454
 	struct list_head list;
Nicolas Morey b03454
 	struct iovec iov;
Nicolas Morey b03454
-	/* number of pages in this vector */
Nicolas Morey b03454
-	unsigned int npages;
Nicolas Morey b03454
-	/* array of pinned pages for this vector */
Nicolas Morey b03454
-	struct page **pages;
Nicolas Morey b03454
 	/*
Nicolas Morey b03454
 	 * offset into the virtual address space of the vector at
Nicolas Morey b03454
 	 * which we last left off.
Nicolas Morey b03454
 	 */
Nicolas Morey b03454
 	u64 offset;
Nicolas Morey b03454
-	struct sdma_mmu_node *node;
Nicolas Morey b03454
 };
Nicolas Morey b03454
 
Nicolas Morey b03454
 /* evict operation argument */
Nicolas Morey b03454
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
Nicolas Morey b03454
index 7f6d7fc7951d..fbdcfecb1768 100644
Nicolas Morey b03454
--- a/drivers/infiniband/hw/hfi1/verbs.c
Nicolas Morey b03454
+++ b/drivers/infiniband/hw/hfi1/verbs.c
Nicolas Morey b03454
@@ -778,8 +778,8 @@ static int build_verbs_tx_desc(
Nicolas Morey b03454
 
Nicolas Morey b03454
 	/* add icrc, lt byte, and padding to flit */
Nicolas Morey b03454
 	if (extra_bytes)
Nicolas Morey b03454
-		ret = sdma_txadd_daddr(sde->dd, &tx->txreq,
Nicolas Morey b03454
-				       sde->dd->sdma_pad_phys, extra_bytes);
Nicolas Morey b03454
+		ret = sdma_txadd_daddr(sde->dd, &tx->txreq, sde->dd->sdma_pad_phys,
Nicolas Morey b03454
+				       extra_bytes);
Nicolas Morey b03454
 
Nicolas Morey b03454
 bail_txadd:
Nicolas Morey b03454
 	return ret;
Nicolas Morey b03454
diff --git a/drivers/infiniband/hw/hfi1/vnic_sdma.c b/drivers/infiniband/hw/hfi1/vnic_sdma.c
Nicolas Morey b03454
index c3f0f8d877c3..727eedfba332 100644
Nicolas Morey b03454
--- a/drivers/infiniband/hw/hfi1/vnic_sdma.c
Nicolas Morey b03454
+++ b/drivers/infiniband/hw/hfi1/vnic_sdma.c
Nicolas Morey b03454
@@ -64,6 +64,7 @@ static noinline int build_vnic_ulp_payload(struct sdma_engine *sde,
Nicolas Morey b03454
 
Nicolas Morey b03454
 		/* combine physically continuous fragments later? */
Nicolas Morey b03454
 		ret = sdma_txadd_page(sde->dd,
Nicolas Morey b03454
+				      NULL,
Nicolas Morey b03454
 				      &tx->txreq,
Nicolas Morey b03454
 				      skb_frag_page(frag),
Nicolas Morey b03454
 				      skb_frag_off(frag),
Nicolas Morey b03454
-- 
Nicolas Morey b03454
2.39.1.1.gbe015eda0162
Nicolas Morey b03454