Blob Blame History Raw
From: Eric Dumazet <edumazet@google.com>
Date: Fri, 1 Dec 2017 10:06:56 -0800
Subject: tcp/dccp: block bh before arming time_wait timer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Patch-mainline: v4.15-rc3
Git-commit: cfac7f836a715b91f08c851df915d401a4d52783
References: bsc#1076830

Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
that might be caused by the following bug.

timewait timer is pinned to the cpu, because we want to transition
timwewait refcount from 0 to 4 in one go, once everything has been
initialized.

At the time commit ed2e92394589 ("tcp/dccp: fix timewait races in timer
handling") was merged, TCP was always running from BH habdler.

After commit 5413d1babe8f ("net: do not block BH while processing
socket backlog") we definitely can run tcp_time_wait() from process
context.

We need to block BH in the critical section so that the pinned timer
has still its purpose.

This bug is more likely to happen under stress and when very small RTO
are used in datacenter flows.

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Michal Kubecek <mkubecek@suse.cz>

---
 net/dccp/minisocks.c     | 6 ++++++
 net/ipv4/tcp_minisocks.c | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index abd07a443219..178bb9833311 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -57,10 +57,16 @@ void dccp_time_wait(struct sock *sk, int state, int timeo)
 		if (state == DCCP_TIME_WAIT)
 			timeo = DCCP_TIMEWAIT_LEN;
 
+		/* tw_timer is pinned, so we need to make sure BH are disabled
+		 * in following section, otherwise timer handler could run before
+		 * we complete the initialization.
+		 */
+		local_bh_disable();
 		inet_twsk_schedule(tw, timeo);
 		/* Linkage updates. */
 		__inet_twsk_hashdance(tw, sk, &dccp_hashinfo);
 		inet_twsk_put(tw);
+		local_bh_enable();
 	} else {
 		/* Sorry, if we're out of memory, just CLOSE this
 		 * socket up.  We've got bigger problems than
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 42de43a5c0d0..6e908d92e441 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -312,10 +312,16 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		if (state == TCP_TIME_WAIT)
 			timeo = TCP_TIMEWAIT_LEN;
 
+		/* tw_timer is pinned, so we need to make sure BH are disabled
+		 * in following section, otherwise timer handler could run before
+		 * we complete the initialization.
+		 */
+		local_bh_disable();
 		inet_twsk_schedule(tw, timeo);
 		/* Linkage updates. */
 		__inet_twsk_hashdance(tw, sk, &tcp_hashinfo);
 		inet_twsk_put(tw);
+		local_bh_enable();
 	} else {
 		/* Sorry, if we're out of memory, just CLOSE this
 		 * socket up.  We've got bigger problems than
-- 
2.16.2