Michal Koutný 1036c8
From: Shakeel Butt <shakeelb@google.com>
Michal Koutný 1036c8
Date: Mon, 9 Mar 2020 22:16:05 -0700
Michal Koutný 1036c8
Subject: cgroup: memcg: net: do not associate sock with unrelated cgroup
Michal Koutný 1036c8
Git-commit: e876ecc67db80dfdb8e237f71e5b43bb88ae549c
Michal Koutný 1036c8
Patch-mainline: v5.6-rc6
Michal Koutný 1036c8
References: bsc#1167290
Michal Koutný 1036c8
Michal Koutný 1036c8
We are testing network memory accounting in our setup and noticed
Michal Koutný 1036c8
inconsistent network memory usage and often unrelated cgroups network
Michal Koutný 1036c8
usage correlates with testing workload. On further inspection, it
Michal Koutný 1036c8
seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
Michal Koutný 1036c8
irq context specially for cgroup v1.
Michal Koutný 1036c8
Michal Koutný 1036c8
mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context
Michal Koutný 1036c8
and kind of assumes that this can only happen from sk_clone_lock()
Michal Koutný 1036c8
and the source sock object has already associated cgroup. However in
Michal Koutný 1036c8
cgroup v1, where network memory accounting is opt-in, the source sock
Michal Koutný 1036c8
can be unassociated with any cgroup and the new cloned sock can get
Michal Koutný 1036c8
associated with unrelated interrupted cgroup.
Michal Koutný 1036c8
Michal Koutný 1036c8
Cgroup v2 can also suffer if the source sock object was created by
Michal Koutný 1036c8
process in the root cgroup or if sk_alloc() is called in irq context.
Michal Koutný 1036c8
The fix is to just do nothing in interrupt.
Michal Koutný 1036c8
Michal Koutný 1036c8
WARNING: Please note that about half of the TCP sockets are allocated
Michal Koutný 1036c8
from the IRQ context, so, memory used by such sockets will not be
Michal Koutný 1036c8
accouted by the memcg.
Michal Koutný 1036c8
Michal Koutný 1036c8
The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Michal Koutný 1036c8
Michal Koutný 1036c8
CPU: 70 PID: 12720 Comm: ssh Tainted:  5.6.0-smp-DEV #1
Michal Koutný 1036c8
Hardware name: ...
Michal Koutný 1036c8
Call Trace:
Michal Koutný 1036c8
 <IRQ>
Michal Koutný 1036c8
 dump_stack+0x57/0x75
Michal Koutný 1036c8
 mem_cgroup_sk_alloc+0xe9/0xf0
Michal Koutný 1036c8
 sk_clone_lock+0x2a7/0x420
Michal Koutný 1036c8
 inet_csk_clone_lock+0x1b/0x110
Michal Koutný 1036c8
 tcp_create_openreq_child+0x23/0x3b0
Michal Koutný 1036c8
 tcp_v6_syn_recv_sock+0x88/0x730
Michal Koutný 1036c8
 tcp_check_req+0x429/0x560
Michal Koutný 1036c8
 tcp_v6_rcv+0x72d/0xa40
Michal Koutný 1036c8
 ip6_protocol_deliver_rcu+0xc9/0x400
Michal Koutný 1036c8
 ip6_input+0x44/0xd0
Michal Koutný 1036c8
 ? ip6_protocol_deliver_rcu+0x400/0x400
Michal Koutný 1036c8
 ip6_rcv_finish+0x71/0x80
Michal Koutný 1036c8
 ipv6_rcv+0x5b/0xe0
Michal Koutný 1036c8
 ? ip6_sublist_rcv+0x2e0/0x2e0
Michal Koutný 1036c8
 process_backlog+0x108/0x1e0
Michal Koutný 1036c8
 net_rx_action+0x26b/0x460
Michal Koutný 1036c8
 __do_softirq+0x104/0x2a6
Michal Koutný 1036c8
 do_softirq_own_stack+0x2a/0x40
Michal Koutný 1036c8
 </IRQ>
Michal Koutný 1036c8
 do_softirq.part.19+0x40/0x50
Michal Koutný 1036c8
 __local_bh_enable_ip+0x51/0x60
Michal Koutný 1036c8
 ip6_finish_output2+0x23d/0x520
Michal Koutný 1036c8
 ? ip6table_mangle_hook+0x55/0x160
Michal Koutný 1036c8
 __ip6_finish_output+0xa1/0x100
Michal Koutný 1036c8
 ip6_finish_output+0x30/0xd0
Michal Koutný 1036c8
 ip6_output+0x73/0x120
Michal Koutný 1036c8
 ? __ip6_finish_output+0x100/0x100
Michal Koutný 1036c8
 ip6_xmit+0x2e3/0x600
Michal Koutný 1036c8
 ? ipv6_anycast_cleanup+0x50/0x50
Michal Koutný 1036c8
 ? inet6_csk_route_socket+0x136/0x1e0
Michal Koutný 1036c8
 ? skb_free_head+0x1e/0x30
Michal Koutný 1036c8
 inet6_csk_xmit+0x95/0xf0
Michal Koutný 1036c8
 __tcp_transmit_skb+0x5b4/0xb20
Michal Koutný 1036c8
 __tcp_send_ack.part.60+0xa3/0x110
Michal Koutný 1036c8
 tcp_send_ack+0x1d/0x20
Michal Koutný 1036c8
 tcp_rcv_state_process+0xe64/0xe80
Michal Koutný 1036c8
 ? tcp_v6_connect+0x5d1/0x5f0
Michal Koutný 1036c8
 tcp_v6_do_rcv+0x1b1/0x3f0
Michal Koutný 1036c8
 ? tcp_v6_do_rcv+0x1b1/0x3f0
Michal Koutný 1036c8
 __release_sock+0x7f/0xd0
Michal Koutný 1036c8
 release_sock+0x30/0xa0
Michal Koutný 1036c8
 __inet_stream_connect+0x1c3/0x3b0
Michal Koutný 1036c8
 ? prepare_to_wait+0xb0/0xb0
Michal Koutný 1036c8
 inet_stream_connect+0x3b/0x60
Michal Koutný 1036c8
 __sys_connect+0x101/0x120
Michal Koutný 1036c8
 ? __sys_getsockopt+0x11b/0x140
Michal Koutný 1036c8
 __x64_sys_connect+0x1a/0x20
Michal Koutný 1036c8
 do_syscall_64+0x51/0x200
Michal Koutný 1036c8
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Michal Koutný 1036c8
Michal Koutný 1036c8
The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Michal Koutný 1036c8
Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking")
Michal Koutný 1036c8
Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Michal Koutný 1036c8
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Michal Koutný 1036c8
Reviewed-by: Roman Gushchin <guro@fb.com>
Michal Koutný 1036c8
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Koutný 1036c8
Acked-by: Michal Koutný <mkoutny@suse.com>
Michal Koutný 1036c8
---
Michal Koutný 1036c8
 kernel/cgroup/cgroup.c | 4 ++++
Michal Koutný 1036c8
 mm/memcontrol.c        | 4 ++++
Michal Koutný 1036c8
 2 files changed, 8 insertions(+)
Michal Koutný 1036c8
Michal Koutný 1036c8
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
Michal Koutný 1036c8
index 75f687301bbf..6b2fc56b2201 100644
Michal Koutný 1036c8
--- a/kernel/cgroup/cgroup.c
Michal Koutný 1036c8
+++ b/kernel/cgroup/cgroup.c
Michal Koutný 1036c8
@@ -6258,6 +6258,10 @@ void cgroup_sk_alloc(struct sock_cgroup_data *skcd)
Michal Koutný 1036c8
 		return;
Michal Koutný 1036c8
 	}
Michal Koutný 1036c8
 
Michal Koutný 1036c8
+	/* Don't associate the sock with unrelated interrupted task's cgroup. */
Michal Koutný 1036c8
+	if (in_interrupt())
Michal Koutný 1036c8
+		return;
Michal Koutný 1036c8
+
Michal Koutný 1036c8
 	rcu_read_lock();
Michal Koutný 1036c8
 
Michal Koutný 1036c8
 	while (true) {
Michal Koutný 1036c8
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
Michal Koutný 1036c8
index d09776cd6e10..1cf41ee22258 100644
Michal Koutný 1036c8
--- a/mm/memcontrol.c
Michal Koutný 1036c8
+++ b/mm/memcontrol.c
Michal Koutný 1036c8
@@ -6696,6 +6696,10 @@ void mem_cgroup_sk_alloc(struct sock *sk)
Michal Koutný 1036c8
 		return;
Michal Koutný 1036c8
 	}
Michal Koutný 1036c8
 
Michal Koutný 1036c8
+	/* Do not associate the sock with unrelated interrupted task's memcg. */
Michal Koutný 1036c8
+	if (in_interrupt())
Michal Koutný 1036c8
+		return;
Michal Koutný 1036c8
+
Michal Koutný 1036c8
 	rcu_read_lock();
Michal Koutný 1036c8
 	memcg = mem_cgroup_from_task(current);
Michal Koutný 1036c8
 	if (memcg == root_mem_cgroup)
Michal Koutný 1036c8