|
Petr Mladek |
4a6e4c |
From 07edfece8bcb0580a1828d939e6f8d91a8603eb2 Mon Sep 17 00:00:00 2001
|
|
Petr Mladek |
4a6e4c |
From: Frederic Weisbecker <frederic@kernel.org>
|
|
Petr Mladek |
4a6e4c |
Date: Wed, 1 Dec 2021 16:19:44 +0100
|
|
Petr Mladek |
4a6e4c |
Subject: [PATCH] workqueue: Fix unbind_workers() VS wq_worker_running() race
|
|
Petr Mladek |
4a6e4c |
Git-commit: 07edfece8bcb0580a1828d939e6f8d91a8603eb2
|
|
Petr Mladek |
4a6e4c |
Patch-mainline: v5.17-rc1
|
|
Petr Mladek |
4a6e4c |
References: bsc#1195062
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
At CPU-hotplug time, unbind_worker() may preempt a worker while it is
|
|
Petr Mladek |
4a6e4c |
waking up. In that case the following scenario can happen:
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
unbind_workers() wq_worker_running()
|
|
Petr Mladek |
4a6e4c |
-------------- -------------------
|
|
Petr Mladek |
4a6e4c |
if (!(worker->flags & WORKER_NOT_RUNNING))
|
|
Petr Mladek |
4a6e4c |
//PREEMPTED by unbind_workers
|
|
Petr Mladek |
4a6e4c |
worker->flags |= WORKER_UNBOUND;
|
|
Petr Mladek |
4a6e4c |
[...]
|
|
Petr Mladek |
4a6e4c |
atomic_set(&pool->nr_running, 0);
|
|
Petr Mladek |
4a6e4c |
//resume to worker
|
|
Petr Mladek |
4a6e4c |
atomic_inc(&worker->pool->nr_running);
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
After unbind_worker() resets pool->nr_running, the value is expected to
|
|
Petr Mladek |
4a6e4c |
remain 0 until the pool ever gets rebound in case cpu_up() is called on
|
|
Petr Mladek |
4a6e4c |
the target CPU in the future. But here the race leaves pool->nr_running
|
|
Petr Mladek |
4a6e4c |
with a value of 1, triggering the following warning when the worker goes
|
|
Petr Mladek |
4a6e4c |
idle:
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
WARNING: CPU: 3 PID: 34 at kernel/workqueue.c:1823 worker_enter_idle+0x95/0xc0
|
|
Petr Mladek |
4a6e4c |
Modules linked in:
|
|
Petr Mladek |
4a6e4c |
CPU: 3 PID: 34 Comm: kworker/3:0 Not tainted 5.16.0-rc1+ #34
|
|
Petr Mladek |
4a6e4c |
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
|
|
Petr Mladek |
4a6e4c |
Workqueue: 0x0 (rcu_par_gp)
|
|
Petr Mladek |
4a6e4c |
RIP: 0010:worker_enter_idle+0x95/0xc0
|
|
Petr Mladek |
4a6e4c |
Code: 04 85 f8 ff ff ff 39 c1 7f 09 48 8b 43 50 48 85 c0 74 1b 83 e2 04 75 99 8b 43 34 39 43 30 75 91 8b 83 00 03 00 00 85 c0 74 87 <0f> 0b 5b c3 48 8b 35 70 f1 37 01 48 8d 7b 48 48 81 c6 e0 93 0
|
|
Petr Mladek |
4a6e4c |
RSP: 0000:ffff9b7680277ed0 EFLAGS: 00010086
|
|
Petr Mladek |
4a6e4c |
RAX: 00000000ffffffff RBX: ffff93465eae9c00 RCX: 0000000000000000
|
|
Petr Mladek |
4a6e4c |
RDX: 0000000000000000 RSI: ffff9346418a0000 RDI: ffff934641057140
|
|
Petr Mladek |
4a6e4c |
RBP: ffff934641057170 R08: 0000000000000001 R09: ffff9346418a0080
|
|
Petr Mladek |
4a6e4c |
R10: ffff9b768027fdf0 R11: 0000000000002400 R12: ffff93465eae9c20
|
|
Petr Mladek |
4a6e4c |
R13: ffff93465eae9c20 R14: ffff93465eae9c70 R15: ffff934641057140
|
|
Petr Mladek |
4a6e4c |
FS: 0000000000000000(0000) GS:ffff93465eac0000(0000) knlGS:0000000000000000
|
|
Petr Mladek |
4a6e4c |
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
|
|
Petr Mladek |
4a6e4c |
CR2: 0000000000000000 CR3: 000000001cc0c000 CR4: 00000000000006e0
|
|
Petr Mladek |
4a6e4c |
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
|
|
Petr Mladek |
4a6e4c |
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
|
|
Petr Mladek |
4a6e4c |
Call Trace:
|
|
Petr Mladek |
4a6e4c |
<TASK>
|
|
Petr Mladek |
4a6e4c |
worker_thread+0x89/0x3d0
|
|
Petr Mladek |
4a6e4c |
? process_one_work+0x400/0x400
|
|
Petr Mladek |
4a6e4c |
kthread+0x162/0x190
|
|
Petr Mladek |
4a6e4c |
? set_kthread_struct+0x40/0x40
|
|
Petr Mladek |
4a6e4c |
ret_from_fork+0x22/0x30
|
|
Petr Mladek |
4a6e4c |
</TASK>
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
Also due to this incorrect "nr_running == 1", further queued work may
|
|
Petr Mladek |
4a6e4c |
end up not being served, because no worker is awaken at work insert time.
|
|
Petr Mladek |
4a6e4c |
This raises rcutorture writer stalls for example.
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
Fix this with disabling preemption in the right place in
|
|
Petr Mladek |
4a6e4c |
wq_worker_running().
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
It's worth noting that if the worker migrates and runs concurrently with
|
|
Petr Mladek |
4a6e4c |
unbind_workers(), it is guaranteed to see the WORKER_UNBOUND flag update
|
|
Petr Mladek |
4a6e4c |
due to set_cpus_allowed_ptr() acquiring/releasing rq->lock.
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
Fixes: 6d25be5782e4 ("sched/core, workqueues: Distangle worker accounting from rq lock")
|
|
Petr Mladek |
4a6e4c |
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
|
|
Petr Mladek |
4a6e4c |
Tested-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Petr Mladek |
4a6e4c |
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Petr Mladek |
4a6e4c |
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
Petr Mladek |
4a6e4c |
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
Petr Mladek |
4a6e4c |
Cc: Ingo Molnar <mingo@redhat.com>
|
|
Petr Mladek |
4a6e4c |
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Petr Mladek |
4a6e4c |
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
|
|
Petr Mladek |
4a6e4c |
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Petr Mladek |
4a6e4c |
Acked-by: Petr Mladek <pmladek@suse.com>
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
---
|
|
Petr Mladek |
4a6e4c |
kernel/workqueue.c | 9 +++++++++
|
|
Petr Mladek |
4a6e4c |
1 file changed, 9 insertions(+)
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
|
|
Petr Mladek |
4a6e4c |
index 332361cf215f..5094573e8b45 100644
|
|
Petr Mladek |
4a6e4c |
--- a/kernel/workqueue.c
|
|
Petr Mladek |
4a6e4c |
+++ b/kernel/workqueue.c
|
|
Petr Mladek |
4a6e4c |
@@ -868,8 +868,17 @@ void wq_worker_running(struct task_struct *task)
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
if (!worker->sleeping)
|
|
Petr Mladek |
4a6e4c |
return;
|
|
Petr Mladek |
4a6e4c |
+
|
|
Petr Mladek |
4a6e4c |
+ /*
|
|
Petr Mladek |
4a6e4c |
+ * If preempted by unbind_workers() between the WORKER_NOT_RUNNING check
|
|
Petr Mladek |
4a6e4c |
+ * and the nr_running increment below, we may ruin the nr_running reset
|
|
Petr Mladek |
4a6e4c |
+ * and leave with an unexpected pool->nr_running == 1 on the newly unbound
|
|
Petr Mladek |
4a6e4c |
+ * pool. Protect against such race.
|
|
Petr Mladek |
4a6e4c |
+ */
|
|
Petr Mladek |
4a6e4c |
+ preempt_disable();
|
|
Petr Mladek |
4a6e4c |
if (!(worker->flags & WORKER_NOT_RUNNING))
|
|
Petr Mladek |
4a6e4c |
atomic_inc(&worker->pool->nr_running);
|
|
Petr Mladek |
4a6e4c |
+ preempt_enable();
|
|
Petr Mladek |
4a6e4c |
worker->sleeping = 0;
|
|
Petr Mladek |
4a6e4c |
}
|
|
Petr Mladek |
4a6e4c |
|
|
Petr Mladek |
4a6e4c |
--
|
|
Petr Mladek |
4a6e4c |
2.26.2
|
|
Petr Mladek |
4a6e4c |
|