From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Tue, 9 Feb 2021 14:38:09 +0100
Subject: bpf: Remove MTU check in __bpf_skb_max_len
Patch-mainline: v5.12-rc1
Git-commit: 6306c1189e77a513bf02720450bb43bd4ba5d8ae
References: bsc#1155518

Multiple BPF-helpers that can manipulate/increase the size of the SKB use
__bpf_skb_max_len() as the max-length. This function limits the size against
the current net_device MTU (skb->dev->mtu).

When a BPF-prog grows the packet size, it should not be limited to the
MTU. The MTU is a transmit limitation, and software receiving this packet
should be allowed to increase the size. Furthermore, the current MTU check in
__bpf_skb_max_len uses the MTU from the ingress/current net_device, which in
case of redirects is the wrong net_device.

This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit
is elsewhere in the system. Jesper's testing[1] showed it was not possible
to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting
factor is the define KMALLOC_MAX_CACHE_SIZE, which is 8192 for the
SLUB-allocator (CONFIG_SLUB) in case PAGE_SIZE is 4096. This define is
in effect because the allocation is called from softirq context; see the
code in __gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing
showed that frames above 16KiB can cause NICs to reset (but not crash). Keep
this sanity limit at this level, as the memory layer can differ based on
kernel config.

[1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul
Acked-by: Gary Lin <glin@suse.com>
---
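The effect of the limit change described in the message can be sketched in
user-space C. This is a minimal sketch under stated assumptions, not kernel
code: struct sketch_dev and every sketch_* name are hypothetical stand-ins
for the net_device fields and helpers touched by the patch,
SKETCH_SKB_MAX_ALLOC uses the round 16KiB figure quoted in the message
rather than the exact kernel macro expansion, and the growth check ignores
the GSO exception and the other conditions in the real helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: round 16KiB figure from the commit message (PAGE_SIZE 4096). */
#define SKETCH_SKB_MAX_ALLOC (16u * 1024u)

/* Hypothetical stand-in for the two net_device fields the old check read. */
struct sketch_dev {
	uint32_t mtu;
	uint16_t hard_header_len;
};

/* Old behaviour: bounded by the MTU of the device attached to the SKB. */
uint32_t sketch_old_max_len(const struct sketch_dev *dev)
{
	return dev ? dev->mtu + dev->hard_header_len : SKETCH_SKB_MAX_ALLOC;
}

/* New behaviour: a fixed sanity limit, independent of any device. */
uint32_t sketch_new_max_len(void)
{
	return SKETCH_SKB_MAX_ALLOC;
}

/* Simplified growth check: may a packet of cur_len bytes grow by grow bytes? */
int sketch_grow_allowed(uint32_t cur_len, uint32_t grow, uint32_t max_len)
{
	return (uint64_t)cur_len + grow <= max_len;
}

/* Example: a 1500-byte packet on an Ethernet-like device growing by 50 bytes. */
void sketch_demo(void)
{
	struct sketch_dev eth = { .mtu = 1500, .hard_header_len = 14 };

	/* The old limit is mtu + hard_header_len, so the growth is refused. */
	assert(!sketch_grow_allowed(1500, 50, sketch_old_max_len(&eth)));
	/* The new limit only refuses growth past the sanity bound. */
	assert(sketch_grow_allowed(1500, 50, sketch_new_max_len()));
}
```

In this sketch the old limit depends on whichever device is passed in, which
mirrors the redirect problem noted in the message: the device consulted need
not be the one the packet is finally sent on.
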
 net/core/filter.c |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3144,18 +3144,14 @@ static int bpf_skb_net_shrink(struct sk_
 	return 0;
 }
 
-static u32 __bpf_skb_max_len(const struct sk_buff *skb)
-{
-	return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len :
-			  SKB_MAX_ALLOC;
-}
+#define BPF_SKB_MAX_LEN SKB_MAX_ALLOC
 
 BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	   u32, mode, u64, flags)
 {
 	u32 len_cur, len_diff_abs = abs(len_diff);
 	u32 len_min = bpf_skb_net_base_len(skb);
-	u32 len_max = __bpf_skb_max_len(skb);
+	u32 len_max = BPF_SKB_MAX_LEN;
 	__be16 proto = skb->protocol;
 	bool shrink = len_diff < 0;
 	u32 off;
@@ -3235,7 +3231,7 @@ static int bpf_skb_trim_rcsum(struct sk_
 static inline int __bpf_skb_change_tail(struct sk_buff *skb, u32 new_len,
 					u64 flags)
 {
-	u32 max_len = __bpf_skb_max_len(skb);
+	u32 max_len = BPF_SKB_MAX_LEN;
 	u32 min_len = __bpf_skb_min_len(skb);
 	int ret;
 
@@ -3311,7 +3307,7 @@ static const struct bpf_func_proto sk_sk
 static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room,
 					u64 flags)
 {
-	u32 max_len = __bpf_skb_max_len(skb);
+	u32 max_len = BPF_SKB_MAX_LEN;
 	u32 new_len = skb->len + head_room;
 	int ret;