Linux kernel source tree
 
 
 
 
 
 
Go to file
Peter Zijlstra 009836b4fa sched/core: Fix migrate_swap() vs. hotplug
On Mon, Jun 02, 2025 at 03:22:13PM +0800, Kuyo Chang wrote:

> So, the potential race scenario is:
>
> 	CPU0							CPU1
> 	// doing migrate_swap(cpu0/cpu1)
> 	stop_two_cpus()
> 							  ...
> 							 // doing _cpu_down()
> 							      sched_cpu_deactivate()
> 								set_cpu_active(cpu, false);
> 								balance_push_set(cpu, true);
> 	cpu_stop_queue_two_works
> 	    __cpu_stop_queue_work(stopper1,...);
> 	    __cpu_stop_queue_work(stopper2,..);
> 	stop_cpus_in_progress -> true
> 		preempt_enable();
> 								...
> 							1st balance_push
> 							stop_one_cpu_nowait
> 							cpu_stop_queue_work
> 							__cpu_stop_queue_work
> 							list_add_tail  -> 1st add push_work
> 							wake_up_q(&wakeq);  -> "wakeq is empty.
> 										This implies that the stopper is at wakeq@migrate_swap."
> 	preempt_disable
> 	wake_up_q(&wakeq);
> 	        wake_up_process // wakeup migrate/0
> 		    try_to_wake_up
> 		        ttwu_queue
> 		            ttwu_queue_cond ->meet below case
> 		                if (cpu == smp_processor_id())
> 			         return false;
> 			ttwu_do_activate
> 			//migrate/0 wakeup done
> 		wake_up_process // wakeup migrate/1
> 	           try_to_wake_up
> 		    ttwu_queue
> 			ttwu_queue_cond
> 		        ttwu_queue_wakelist
> 			__ttwu_queue_wakelist
> 			__smp_call_single_queue
> 	preempt_enable();
>
> 							2nd balance_push
> 							stop_one_cpu_nowait
> 							cpu_stop_queue_work
> 							__cpu_stop_queue_work
> 							list_add_tail  -> 2nd add push_work, so the double list add is detected
> 							...
> 							...
> 							cpu1 get ipi, do sched_ttwu_pending, wakeup migrate/1
>

So this balance_push() is part of schedule(), and schedule() is supposed
to switch to stopper task, but because of this race condition, stopper
task is stuck in WAKING state and not actually visible to be picked.

Therefore CPU1 can do another schedule() and end up doing another
balance_push() even though the last one hasn't been done yet.

This is a confluence of fail, where both wake_q and ttwu_wakelist can
cause crucial wakeups to be delayed, resulting in the malfunction of
balance_push.

Since there is only a single stopper thread to be woken, the wake_q
doesn't really add anything here, and can be removed in favour of
direct wakeups of the stopper thread.

Then add a clause to ttwu_queue_cond() to ensure the stopper threads
are never queued / delayed.

Of all 3 moving parts, the last addition was the balance_push()
machinery, so pick that as the point the bug was introduced.

Fixes: 2558aacff8 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
Reported-by: Kuyo Chang <kuyo.chang@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Kuyo Chang <kuyo.chang@mediatek.com>
Link: https://lkml.kernel.org/r/20250605100009.GO39944@noisy.programming.kicks-ass.net
2025-07-01 15:02:03 +02:00
Documentation TTY/Serial driver fixes for 6.16-rc4 2025-06-29 09:21:27 -07:00
LICENSES LICENSES: add CC0-1.0 license text 2025-05-21 14:54:17 +02:00
arch - Make sure DR6 and DR7 are initialized to their architectural values and not 2025-06-29 08:28:24 -07:00
block block-6.16-20250626 2025-06-27 09:02:33 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto crypto: wp512 - Use API partial block handling 2025-06-23 16:56:56 +08:00
drivers Staging driver fix for 6.16-rc4 2025-06-29 09:25:55 -07:00
fs six smb3 client fixes 2025-06-27 20:38:05 -07:00
include - Make sure the new futex phash is not copied during fork in order to 2025-06-29 08:09:13 -07:00
init init: fix build warnings about export.h 2025-06-11 22:42:36 -07:00
io_uring io_uring-6.16-20250626 2025-06-27 08:55:57 -07:00
ipc - The 3 patch series "hung_task: extend blocking task stacktrace dump to 2025-05-31 19:12:53 -07:00
kernel sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
lib 16 hotfixes. 6 are cc:stable and the remainder address post-6.15 issues 2025-06-27 20:34:10 -07:00
mm mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write 2025-06-25 15:55:03 -07:00
net Including fixes from bluetooth and wireless. 2025-06-26 09:13:27 -07:00
rust Driver core fixes for 6.16-rc3 2025-06-18 14:31:16 -07:00
samples - The 3 patch series "hung_task: extend blocking task stacktrace dump to 2025-05-31 19:12:53 -07:00
scripts scripts/gdb: fix dentry_name() lookup 2025-06-25 15:55:03 -07:00
security selinux: change security_compute_sid to return the ssid or tsid on match 2025-06-19 16:13:16 -04:00
sound ALSA: hda/realtek: Fix built-in mic on ASUS VivoBook X507UAR 2025-06-26 08:02:44 +02:00
tools LoongArch fixes for v6.16-rc4 2025-06-28 11:35:11 -07:00
usr usr/include: openrisc: don't HDRTEST bpf_perf_event.h 2025-05-12 15:03:17 +09:00
virt Merge branch 'kvm-lockdep-common' into HEAD 2025-05-28 06:29:17 -04:00
.clang-format Linux 6.15-rc5 2025-05-06 16:39:25 +10:00
.clippy.toml rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration 2025-05-07 00:11:47 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore MAINTAINERS: Retire Ralf Baechle 2024-11-12 15:48:59 +01:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore .gitignore: ignore Python compiled bytecode 2025-04-24 10:12:46 -06:00
.mailmap 16 hotfixes. 6 are cc:stable and the remainder address post-6.15 issues 2025-06-27 20:34:10 -07:00
.pylintrc docs: add a .pylintrc file with sys path for docs scripts 2025-04-09 12:10:33 -06:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING
CREDITS CREDITS: Add entry for Shannon Nelson 2025-06-21 07:34:28 -07:00
Kbuild drm: ensure drm headers are self-contained and pass kernel-doc 2025-02-12 10:44:43 +02:00
Kconfig io_uring: Rename KConfig to Kconfig 2025-02-19 14:53:27 -07:00
MAINTAINERS i2c-for-6.16-rc4 2025-06-28 15:23:17 -07:00
Makefile Linux 6.16-rc4 2025-06-29 13:09:04 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.