diff --git a/patches.suse/mm-slub-fix-panic-in-slab_alloc_node.patch b/patches.suse/mm-slub-fix-panic-in-slab_alloc_node.patch
new file mode 100644
index 0000000..69ceb23
--- /dev/null
+++ b/patches.suse/mm-slub-fix-panic-in-slab_alloc_node.patch
@@ -0,0 +1,122 @@
+From: Laurent Dufour
+Date: Fri, 13 Nov 2020 22:51:53 -0800
+Subject: mm/slub: fix panic in slab_alloc_node()
+Git-commit: 22e4663e916321b72972c69ca0c6b962f529bd78
+Patch-mainline: v5.10-rc4
+References: bsc#1208023
+
+While doing a memory hot-unplug operation on a PowerPC VM running 1024 CPUs
+with 11TB of ram, I hit the following panic:
+
+    BUG: Kernel NULL pointer dereference on read at 0x00000007
+    Faulting instruction address: 0xc000000000456048
+    Oops: Kernel access of bad area, sig: 11 [#2]
+    LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries
+    Modules linked in: rpadlpar_io rpaphp
+    CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
+    NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
+    REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
+    MSR: 8000000000009033 CR: 24004228 XER: 00000000
+    CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
+    GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
+    GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
+    GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
+    GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
+    GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
+    GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
+    GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
+    GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
+    NIP [c000000000456048] __kmalloc_node+0x108/0x790
+    LR [c000000000455fd4] __kmalloc_node+0x94/0x790
+    Call Trace:
+      kvmalloc_node+0x58/0x110
+      mem_cgroup_css_online+0x10c/0x270
+      online_css+0x48/0xd0
+      cgroup_apply_control_enable+0x2c4/0x470
+      cgroup_mkdir+0x408/0x5f0
+      kernfs_iop_mkdir+0x90/0x100
+      vfs_mkdir+0x138/0x250
+      do_mkdirat+0x154/0x1c0
+      system_call_exception+0xf8/0x200
+      system_call_common+0xf0/0x27c
+    Instruction dump:
+    e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
+    2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
+
+This points to the following code:
+
+    mm/slub.c:2851
+            if (unlikely(!object || !node_match(page, node))) {
+    c000000000456038:       00 00 bc 2f     cmpdi   cr7,r28,0
+    c00000000045603c:       18 00 9e 41     beq     cr7,c000000000456054 <__kmalloc_node+0x114>
+    node_match():
+    mm/slub.c:2491
+            if (node != NUMA_NO_NODE && page_to_nid(page) != node)
+    c000000000456040:       30 02 92 41     beq     cr4,c000000000456270 <__kmalloc_node+0x330>
+    page_to_nid():
+    include/linux/mm.h:1294
+    c000000000456044:       10 00 27 e9     ld      r9,16(r7)
+    c000000000456048:       07 00 29 89     lbz     r9,7(r9)        <<<< r9 = NULL
+    node_match():
+    mm/slub.c:2491
+    c00000000045604c:       00 48 99 7f     cmpw    cr7,r25,r9
+    c000000000456050:       20 02 9e 41     beq     cr7,c000000000456270 <__kmalloc_node+0x330>
+
+The panic occurred in slab_alloc_node() when checking for the page's node:
+
+	object = c->freelist;
+	page = c->page;
+	if (unlikely(!object || !node_match(page, node))) {
+		object = __slab_alloc(s, gfpflags, node, addr, c);
+		stat(s, ALLOC_SLOWPATH);
+
+The issue is that object is not NULL while page is NULL which is odd but
+may happen if the cache flush happened after loading object but before
+loading page.  Thus checking for the page pointer is required too.
+
+The cache flush is done through an inter processor interrupt when a
+piece of memory is off-lined.  That interrupt is triggered when a memory
+hot-unplug operation is initiated and offline_pages() is calling the
+slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback()
+which is calling flush_cpu_slab().  If that interrupt is caught between
+the reading of c->freelist and the reading of c->page, this could lead
+to such a situation.  That situation is expected and the later call to
+this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
+the whole operation.
+
+In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
+node_match()") the check on the page pointer was removed assuming that
+page is always valid when it is called.  It happens that this is not
+true in that particular case, so check for page before calling
+node_match() here.
+
+Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
+Signed-off-by: Laurent Dufour
+Signed-off-by: Andrew Morton
+Acked-by: Vlastimil Babka
+Acked-by: Christoph Lameter
+Cc: Wei Yang
+Cc: Pekka Enberg
+Cc: David Rientjes
+Cc: Joonsoo Kim
+Cc: Nathan Lynch
+Cc: Scott Cheloha
+Cc: Michal Hocko
+Cc:
+Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
+Signed-off-by: Linus Torvalds
+---
+ mm/slub.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -2802,7 +2802,7 @@ redo:
+ 
+ 	object = c->freelist;
+ 	page = c->page;
+-	if (unlikely(!object || !node_match(page, node))) {
++	if (unlikely(!object || !page || !node_match(page, node))) {
+ 		object = __slab_alloc(s, gfpflags, node, addr, c);
+ 		stat(s, ALLOC_SLOWPATH);
+ 	} else {
diff --git a/series.conf b/series.conf
index fa83288..26ba0e5 100644
--- a/series.conf
+++ b/series.conf
@@ -17576,6 +17576,7 @@
 	patches.suse/selinux-Fix-error-return-code-in-sel_ib_pkey_sid_slo.patch
 	patches.suse/hwmon-pwm-fan-Fix-RPM-calculation.patch
 	patches.suse/clk-define-to_clk_regmap-as-inline-function.patch
+	patches.suse/mm-slub-fix-panic-in-slab_alloc_node.patch
 	patches.suse/Revert-kernel-reboot.c-convert-simple_strtoul-to-kst.patch
 	patches.suse/reboot-fix-overflow-parsing-reboot-cpu-number.patch
 	patches.suse/kernel-watchdog-fix-watchdog_allowed_mask-not-used-w.patch
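
Reviewer note: the race described in the commit message can be modelled in user space. The sketch below is not kernel code. `fake_page`, `fake_cpu_slab`, `fake_flush()` and `needs_slowpath()` are hypothetical stand-ins invented for illustration, and the flush is triggered by a plain function call rather than an inter-processor interrupt. It only shows why `!page` has to be tested before `node_match()` dereferences the pointer.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-ins for struct page and struct kmem_cache_cpu. */
struct fake_page {
	int nid;
};

struct fake_cpu_slab {
	void *freelist;
	struct fake_page *page;
};

/* Same shape as node_match() after commit 6159d0f5c03e: no NULL check on page. */
bool node_match(const struct fake_page *page, int node)
{
	if (node != NUMA_NO_NODE && page->nid != node)
		return false;
	return true;
}

/* Models the effect of flush_cpu_slab(): both per-CPU fields are cleared. */
void fake_flush(struct fake_cpu_slab *c)
{
	c->freelist = NULL;
	c->page = NULL;
}

/*
 * Returns true when the allocation has to fall back to the slow path.
 * flush_between simulates the flush landing between the two loads.
 */
bool needs_slowpath(struct fake_cpu_slab *c, int node, bool flush_between)
{
	void *object = c->freelist;	/* first load: still non-NULL */

	if (flush_between)
		fake_flush(c);		/* the flush arrives here */

	struct fake_page *page = c->page;	/* second load: may now be NULL */

	/* Patched condition: !page short-circuits before node_match(). */
	return !object || !page || !node_match(page, node);
}
```

With a populated `fake_cpu_slab`, `needs_slowpath(&c, 0, false)` stays on the fast path, while `needs_slowpath(&c, 0, true)` returns true instead of dereferencing a NULL `page`. Dropping the `!page` term from the condition would reproduce the NULL dereference in the second case.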