Blob Blame History Raw
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Fri, 14 Sep 2018 23:00:55 +0200
Subject: tls: fix currently broken MSG_PEEK behavior
Patch-mainline: v4.19-rc5
Git-commit: 50c6b58a814d86a93c0f6964570f839632854044
References: bsc#1109837

In kTLS MSG_PEEK behavior is currently failing, strace example:

  [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
  [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
  [pid  2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
  [pid  2430] listen(4, 10)               = 0
  [pid  2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
  [pid  2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
  [pid  2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
  [pid  2430] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
  [pid  2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
  [pid  2430] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
  [pid  2430] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
  [pid  2430] close(4)                    = 0
  [pid  2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
  [pid  2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
  [pid  2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64

As can be seen from strace, there are two TLS records sent,
i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
peeking 'test_read_peektest_read_peektest'. This is clearly
wrong, and what happens is that given peek cannot call into
tls_sw_advance_skb() to unpause strparser and proceed with
the next skb, we end up looping over the current one, copying
the 'test_read_peek' over and over into the user provided
buffer.

Here, we can only peek into the currently held skb (current,
full TLS record) as otherwise we would end up having to hold
all the original skb(s) (depending on the peek depth) in a
separate queue when unpausing strparser to process next
records, minimally intrusive is to return only up to the
current record's size (which likely was what c46234ebb4d1
("tls: RX path for ktls") originally intended as well). Thus,
after patch we properly peek the first record:

  [pid  2046] wait4(2075,  <unfinished ...>
  [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
  [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
  [pid  2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
  [pid  2075] listen(4, 10)               = 0
  [pid  2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
  [pid  2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
  [pid  2075] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
  [pid  2075] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
  [pid  2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
  [pid  2075] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
  [pid  2075] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
  [pid  2075] close(4)                    = 0
  [pid  2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
  [pid  2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
  [pid  2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14

Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
---
 net/tls/tls_sw.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -928,7 +928,15 @@ int tls_sw_recvmsg(struct sock *sk,
 				if (control != TLS_RECORD_TYPE_DATA)
 					goto recv_end;
 			}
+		} else {
+			/* MSG_PEEK right now cannot look beyond current skb
+			 * from strparser, meaning we cannot advance skb here
+			 * and thus unpause strparser since we'd loose original
+			 * one.
+			 */
+			break;
 		}
+
 		/* If we have a new message from strparser, continue now. */
 		if (copied >= target && !ctx->recv_pkt)
 			break;