|
Vlastimil Babka |
a2a4df |
From: Laurent Dufour <ldufour@linux.ibm.com>
Date: Fri, 13 Nov 2020 22:51:53 -0800
Subject: mm/slub: fix panic in slab_alloc_node()
Git-commit: 22e4663e916321b72972c69ca0c6b962f529bd78
Patch-mainline: v5.10-rc4
References: bsc#1208023

While doing a memory hot-unplug operation on a PowerPC VM running 1024 CPUs
with 11TB of RAM, I hit the following panic:

  BUG: Kernel NULL pointer dereference on read at 0x00000007
  Faulting instruction address: 0xc000000000456048
  Oops: Kernel access of bad area, sig: 11 [#2]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp
  CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
  NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
  REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000
  CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
  GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
  GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
  GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
  GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
  GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
  GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
  GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
  GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
  NIP [c000000000456048] __kmalloc_node+0x108/0x790
  LR [c000000000455fd4] __kmalloc_node+0x94/0x790
  Call Trace:
    kvmalloc_node+0x58/0x110
    mem_cgroup_css_online+0x10c/0x270
    online_css+0x48/0xd0
    cgroup_apply_control_enable+0x2c4/0x470
    cgroup_mkdir+0x408/0x5f0
    kernfs_iop_mkdir+0x90/0x100
    vfs_mkdir+0x138/0x250
    do_mkdirat+0x154/0x1c0
    system_call_exception+0xf8/0x200
    system_call_common+0xf0/0x27c
  Instruction dump:
  e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
  2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78

This points to the following code:

  mm/slub.c:2851
          if (unlikely(!object || !node_match(page, node))) {
  c000000000456038:  00 00 bc 2f  cmpdi  cr7,r28,0
  c00000000045603c:  18 00 9e 41  beq    cr7,c000000000456054 <__kmalloc_node+0x114>
  node_match():
  mm/slub.c:2491
          if (node != NUMA_NO_NODE && page_to_nid(page) != node)
  c000000000456040:  30 02 92 41  beq    cr4,c000000000456270 <__kmalloc_node+0x330>
  page_to_nid():
  include/linux/mm.h:1294
  c000000000456044:  10 00 27 e9  ld     r9,16(r7)
  c000000000456048:  07 00 29 89  lbz    r9,7(r9)   <<<< r9 = NULL
  node_match():
  mm/slub.c:2491
  c00000000045604c:  00 48 99 7f  cmpw   cr7,r25,r9
  c000000000456050:  20 02 9e 41  beq    cr7,c000000000456270 <__kmalloc_node+0x330>

The panic occurred in slab_alloc_node() when checking for the page's node:

	object = c->freelist;
	page = c->page;
	if (unlikely(!object || !node_match(page, node))) {
		object = __slab_alloc(s, gfpflags, node, addr, c);
		stat(s, ALLOC_SLOWPATH);

The issue is that object is not NULL while page is NULL. This is odd, but
it can happen if the cache flush occurs after object is loaded but before
page is loaded. The page pointer must therefore be checked too.

The cache flush is done through an inter-processor interrupt when a
piece of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated and offline_pages() calls slub's
MEM_GOING_OFFLINE callback, slab_mem_going_offline_callback(), which
calls flush_cpu_slab(). If that interrupt is taken between the read of
c->freelist and the read of c->page, object can be non-NULL while page
is NULL. That situation is expected, and the later call to
this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
the whole operation.
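The detect-and-redo behaviour described above can be sketched in plain userspace C. This is only a toy model under stated assumptions, not the kernel code: struct cpu_slab, flush(), commit(), refill(), alloc() and the inject enum are hypothetical stand-ins for struct kmem_cache_cpu, flush_cpu_slab(), this_cpu_cmpxchg_double() and the slow path, and the "interrupt" is simulated by calling flush() at a chosen point on the same thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct object { struct object *next; };
struct page { int nid; };

/* Simplified, hypothetical model of the per-CPU slab state. */
struct cpu_slab {
    struct object *freelist;
    struct page *page;
    unsigned long tid;          /* bumped on every change */
};

/* Where the simulated interrupt fires. */
enum inject { INJECT_NONE, INJECT_BETWEEN_READS, INJECT_BEFORE_COMMIT };

/* Stand-in for the flush done from the IPI: drop the per-CPU slab. */
static void flush(struct cpu_slab *c)
{
    c->freelist = NULL;
    c->page = NULL;
    c->tid++;
}

/* Stand-in for the compare-and-exchange: fails if freelist or tid moved. */
static bool commit(struct cpu_slab *c, struct object *old, unsigned long tid)
{
    assert(old != NULL);
    if (c->freelist != old || c->tid != tid)
        return false;
    c->freelist = old->next;
    c->tid = tid + 1;
    return true;
}

/* Toy slow path: install a slab again (the real slow path does far more). */
static void refill(struct cpu_slab *c, struct page *pg, struct object *objs)
{
    c->freelist = objs;
    c->page = pg;
    c->tid++;
}

static struct object *alloc(struct cpu_slab *c, struct page *pg,
                            struct object *objs, enum inject where, int *redos)
{
    for (;;) {
        unsigned long tid = c->tid;
        struct object *object = c->freelist;

        if (where == INJECT_BETWEEN_READS) {
            flush(c);           /* interrupt between the two reads */
            where = INJECT_NONE;
        }
        struct page *page = c->page;

        if (!object || !page) { /* empty or torn snapshot: slow path */
            refill(c, pg, objs);
            (*redos)++;
            continue;
        }
        if (where == INJECT_BEFORE_COMMIT) {
            flush(c);           /* interrupt after both reads */
            where = INJECT_NONE;
        }
        if (commit(c, object, tid))
            return object;
        (*redos)++;             /* snapshot went stale: redo */
    }
}
```

In this model a flush injected before the commit is caught by the tid and freelist comparison, while a flush injected between the two reads leaves a non-NULL object with a NULL page, which only the explicit page check catches before anything dereferences it.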

In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") the check on the page pointer was removed on the
assumption that page is always valid when node_match() is called. That
assumption does not hold in this particular case, so check for page
before calling node_match() here.
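A minimal userspace sketch of why the caller-side check is enough, assuming a simplified struct page that carries a plain nid field (the real page_to_nid() derives the node differently). needs_slowpath() is a hypothetical helper that only mirrors the condition in slab_alloc_node(); because || short-circuits, node_match() is never reached with a NULL page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Simplified stand-in: the node id is stored directly (assumption). */
struct page { int nid; };

/* Dereferences page, so a NULL page here is the crash from the oops. */
static int page_to_nid(const struct page *page)
{
    assert(page != NULL);
    return page->nid;
}

/* Same shape as node_match() without a NULL check on page. */
static bool node_match(const struct page *page, int node)
{
    if (node != NUMA_NO_NODE && page_to_nid(page) != node)
        return false;
    return true;
}

/* Hypothetical helper mirroring the fixed condition. */
static bool needs_slowpath(const void *object, const struct page *page,
                           int node)
{
    /* !page is evaluated before node_match(), so page is never
     * dereferenced when it is NULL. */
    return !object || !page || !node_match(page, node);
}
```

With a non-NULL object, a NULL page and a specific node, needs_slowpath() returns true without touching page, which corresponds to falling back to __slab_alloc() in the patched code.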

Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2802,7 +2802,7 @@ redo:

 	object = c->freelist;
 	page = c->page;
-	if (unlikely(!object || !node_match(page, node))) {
+	if (unlikely(!object || !page || !node_match(page, node))) {
 		object = __slab_alloc(s, gfpflags, node, addr, c);
 		stat(s, ALLOC_SLOWPATH);
 	} else {