mirror-linux/kernel
zhongjinji 59d4d36158 mm/oom_kill: thaw the entire OOM victim process
Patch series "Improvements to Victim Process Thawing and OOM Reaper
Traversal Order", v10.

This patch series focuses on optimizing victim process thawing and
refining the traversal order of the OOM reaper.  Since __thaw_task() is
used to thaw a single thread of the victim, thawing only one thread cannot
guarantee the exit of the OOM victim when it is frozen.  Patch 1 thaw the
entire process of the OOM victim to ensure that OOM victims are able to
terminate themselves.  Even if the oom_reaper is delayed, patch 2 is still
beneficial for reaping processes with a large address space footprint, and
it also greatly improves process_mrelease.


This patch (of 10):

OOM killer is a mechanism that selects and kills processes when the system
runs out of memory to reclaim resources and keep the system stable.  But
the oom victim cannot terminate on its own when it is frozen, even if the
OOM victim task is thawed through __thaw_task().  This is because
__thaw_task() can only thaw a single OOM victim thread, and cannot thaw
the entire OOM victim process.

In addition, freezing_slow_path() determines whether a task is an OOM
victim by checking the task's TIF_MEMDIE flag.  When a task is identified
as an OOM victim, the freezer bypasses both PM freezing and cgroup
freezing states to thaw it.

Historically, TIF_MEMDIE was a "this is the oom victim & it has access to
memory reserves" flag in the past.  It has that thread vs.  process
problems and tsk_is_oom_victim was introduced later to get rid of them and
other issues as well as the guarantee that we can identify the oom
victim's mm reliably for other oom_reaper.

Therefore, thaw_process() is introduced to unfreeze all threads within the
OOM victim process, ensuring that every thread is properly thawed.  The
freezer now uses tsk_is_oom_victim() to determine OOM victim status,
allowing all victim threads to be unfrozen as necessary.

With this change, the entire OOM victim process will be thawed when an OOM
event occurs, ensuring that the victim can terminate on its own.

Link: https://lkml.kernel.org/r/20250915162946.5515-1-zhongjinji@honor.com
Link: https://lkml.kernel.org/r/20250915162946.5515-2-zhongjinji@honor.com
Signed-off-by: zhongjinji <zhongjinji@honor.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-21 14:22:35 -07:00
..
bpf bpf: Fix memory leak of bpf_scc_info objects 2025-08-02 09:04:57 -07:00
cgroup cgroup: avoid null de-ref in css_rstat_exit() 2025-08-09 08:46:32 -10:00
configs configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON 2025-07-21 21:41:57 -07:00
debug TTY/Serial driver updates for 6.15-rc1 2025-04-02 18:17:33 -07:00
dma dma-remap: drop nth_page() in dma_common_contiguous_remap() 2025-09-21 14:22:06 -07:00
entry ARM: 2025-07-30 17:14:01 -07:00
events mm: convert uprobes to mm_flags_*() accessors 2025-09-13 16:54:57 -07:00
futex futex: Use user_write_access_begin/_end() in futex_put_value() 2025-08-11 17:53:21 +02:00
gcov kbuild: require gcc-8 and binutils-2.30 2025-04-30 21:53:35 +02:00
irq genirq/test: Resolve irq lock inversion warnings 2025-08-06 10:29:48 +02:00
kcsan kcsan: test: Initialize dummy variable 2025-07-23 08:51:32 +02:00
livepatch sched,livepatch: Untangle cond_resched() and live-patching 2025-05-14 13:16:24 +02:00
locking - Make sure sanity checks down in the mutex lock path happen on the correct 2025-08-17 05:57:47 -07:00
module Significant patch series in this pull request: 2025-08-05 16:02:07 +03:00
power drm for 6.17-rc1 2025-07-30 19:26:49 -07:00
printk printk changes for 6.17 2025-08-04 10:54:36 -07:00
rcu mm: replace (20 - PAGE_SHIFT) with common macros for pages<->MB conversion 2025-09-13 16:54:42 -07:00
sched mm: replace (20 - PAGE_SHIFT) with common macros for pages<->MB conversion 2025-09-13 16:54:42 -07:00
time bitmap-for-6.17 2025-07-31 16:52:32 -07:00
trace tracing fixes for v6.17-rc2: 2025-08-23 10:11:34 -04:00
unwind unwind: Finish up unwind when a task exits 2025-07-31 10:20:11 -04:00
.gitignore kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-06-24 20:30:37 +09:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec kho: mm: don't allow deferred struct page with KHO 2025-08-19 16:35:53 -07:00
Kconfig.locks
Kconfig.preempt
Makefile Kbuild updates for v6.17 2025-08-06 07:32:52 +03:00
acct.c acct: block access to kernel internal filesystems 2025-02-12 12:24:16 +01:00
async.c
audit.c audit: record AUDIT_ANOM_* events regardless of presence of rules 2025-04-11 14:14:41 -04:00
audit.h audit,module: restore audit logging in load failure case 2025-06-16 17:00:06 -04:00
audit_fsnotify.c
audit_tree.c replace collect_mounts()/drop_collected_mounts() with a safer variant 2025-06-23 14:01:49 -04:00
audit_watch.c fs: add kern_path_locked_negative() 2025-04-15 11:32:34 +02:00
auditfilter.c audit: fix suffixed '/' filename matching 2024-12-05 19:22:38 -05:00
auditsc.c audit,module: restore audit logging in load failure case 2025-06-16 17:00:06 -04:00
backtracetest.c
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c cfi: Move BPF CFI types and helpers to generic code 2025-07-31 18:23:53 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu.c cpu: Remove obsolete comment from takedown_cpu() 2025-08-06 22:48:12 +02:00
cpu_pm.c
crash_core.c kdump: wait for DMA to finish when using CMA 2025-07-19 19:08:23 -07:00
crash_dump_dm_crypt.c crash_dump: retrieve dm crypt keys in kdump kernel 2025-05-21 10:48:21 -07:00
crash_reserve.c kdump: implement reserve_crashkernel_cma 2025-07-19 19:08:23 -07:00
cred.c cred: remove old {override,revert}_creds() helpers 2024-12-02 11:25:09 +01:00
delayacct.c delayacct: remove redundant code and adjust indentation 2025-05-27 19:40:33 -07:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c task_stack.h: clean-up stack_not_used() implementation 2025-09-21 14:22:00 -07:00
exit.h
extable.c
fail_function.c
fork.c fork: check charging success before zeroing stack 2025-09-21 14:22:00 -07:00
freezer.c mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
gen_kheaders.sh kheaders: make it possible to override TAR 2025-08-06 10:23:36 +09:00
groups.c
hung_task.c hung_task: extend hung task blocker tracking to rwsems 2025-07-19 19:08:26 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: Use kthread_run_on_cpu() 2025-01-02 22:12:12 +01:00
kallsyms_selftest.h
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
kcov.c kcov: fix typo in comment of kcov_fault_in_area 2025-07-09 22:57:52 -07:00
kexec.c kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kexec_core.c Significant patch series in this pull request: 2025-08-03 16:23:09 -07:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c Significant patch series in this pull request: 2025-08-03 16:23:09 -07:00
kexec_handover.c kho: make sure kho_scratch argument is fully consumed 2025-09-13 16:55:18 -07:00
kexec_internal.h kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kheaders.c kheaders: Simplify attribute through __BIN_ATTR_SIMPLE_RO() 2024-12-24 09:46:49 +01:00
kprobes.c kprobes: Add missing kerneldoc for __get_insn_slot 2025-07-15 18:45:34 +09:00
kstack_erase.c stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth 2025-07-21 21:40:39 -07:00
ksyms_common.c
ksysfs.c kernel/ksysfs.c: simplify bin_attribute definition 2025-01-07 16:59:15 +01:00
kthread.c ipvs: Fix estimator kthreads preferred affinity 2025-08-13 08:34:33 +02:00
latencytop.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
module_signature.c
notifier.c
nsproxy.c kernel/nsproxy: remove unnecessary guards 2025-05-09 13:13:54 +02:00
padata.c padata: use cpumask_nth() 2025-06-13 17:26:17 +08:00
panic.c Significant patch series in this pull request: 2025-08-03 16:23:09 -07:00
params.c params: Replace deprecated strcpy() with strscpy() and memcpy() 2025-08-16 21:47:25 +02:00
pid.c Summary 2025-07-29 21:43:08 -07:00
pid_namespace.c pid: Do not set pid_max in new pid namespaces 2025-03-06 10:18:36 +01:00
pid_sysctl.h treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
profile.c
ptrace.c ptrace: introduce PTRACE_SET_SYSCALL_INFO request 2025-05-11 17:48:15 -07:00
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relayfs: support a counter tracking if data is too big to write 2025-07-09 22:57:52 -07:00
resource.c resource: improve child resource handling in release_mem_region_adjustable() 2025-09-21 14:22:34 -07:00
resource_kunit.c
rseq.c rseq: Fix segfault on registration when rseq_cs is non-zero 2025-03-06 22:26:49 +01:00
scftorture.c
scs.c
seccomp.c seccomp: avoid the lock trip seccomp_filter_release in common case 2025-02-24 11:17:10 -08:00
signal.c signal: Fix memory leak for PIDFD_SELF* sentinels 2025-08-19 13:51:28 +02:00
smp.c smp: Fix spelling in on_each_cpu_cond_mask()'s doc-comment 2025-08-02 14:24:50 +02:00
smpboot.c sched/smp: Use the SMP version of idle_thread_set_boot_cpu() 2025-06-13 08:47:20 +02:00
smpboot.h
softirq.c lockdep: Fix wait context check on softirq for PREEMPT_RT 2025-03-25 10:46:44 +01:00
stacktrace.c
static_call.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
stop_machine.c sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
sys.c prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGE 2025-09-13 16:55:05 -07:00
sys_ni.c
sysctl-test.c sysctl: move u8 register test to lib/test_sysctl.c 2025-04-14 14:13:41 +02:00
sysctl.c sysctl: rename kern_table -> sysctl_subsys_table 2025-07-23 11:56:02 +02:00
task_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
taskstats.c
torture.c torture: Add get_torture_init_jiffies() for test-start time 2025-02-05 07:14:24 -08:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c
ucount.c ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() 2025-08-02 12:01:38 -07:00
uid16.c
uid16.h
umh.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
up.c
user-return-notifier.c
user.c
user_namespace.c uidgid: add map_id_range_up() 2025-02-12 12:12:27 +01:00
utsname.c
utsname_sysctl.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
vhost_task.c vhost: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) 2025-08-01 09:11:08 -04:00
vmcore_info.c crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo 2025-05-11 17:54:04 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog.c kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count 2025-05-21 10:48:22 -07:00
watchdog_buddy.c watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu() 2025-07-31 11:28:03 -04:00
watchdog_perf.c watchdog/perf: Provide function for adjusting the event period 2025-07-04 13:17:30 +01:00
workqueue.c workqueue: Changes for v6.17 2025-07-31 15:40:22 -07:00
workqueue_internal.h