mirror-linux/mm
Qi Zheng 99ebc509ee mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end()
Currently, get_non_dying_memcg_start() and get_non_dying_memcg_end() both
evaluate cgroup_subsys_on_dfl(memory_cgrp_subsys) independently to
determine whether to acquire or release the RCU read lock.

However, the result of cgroup_subsys_on_dfl() can change dynamically at
runtime due to cgroup hierarchy rebinding (e.g., when the memory
controller is moved between cgroup v1 and v2 hierarchies).  This can cause
the following warning:

 =====================================
 WARNING: bad unlock balance detected!
 7.0.0-next-20260420+ #83 Tainted: G        W
 -------------------------------------
 memcg-repro/270 is trying to release lock (rcu_read_lock) at:
 [<ffffffff815f57f7>] rcu_read_unlock+0x17/0x60
 but there are no more locks to release!

 other info that might help us debug this:
 1 lock held by memcg-repro/270:
  #0: ffff888102fa2088 (vm_lock){++++}-{0:0}, at: do_user_addr_fault+0x285/0x880

 stack backtrace:
 CPU: 0 UID: 0 PID: 270 Comm: memcg-repro Tainted: G        W           7.0.0-next-20260420+ #
 Tainted: [W]=WARN
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
 Call Trace:
  <TASK>
  ? rcu_read_unlock+0x17/0x60
  dump_stack_lvl+0x77/0xb0
  print_unlock_imbalance_bug+0xe0/0xf0
  ? rcu_read_unlock+0x17/0x60
  lock_release+0x21d/0x2a0
  rcu_read_unlock+0x1c/0x60
  do_pte_missing+0x233/0xb40
  __handle_mm_fault+0x80e/0xcd0
  handle_mm_fault+0x146/0x310
  do_user_addr_fault+0x303/0x880
  exc_page_fault+0x9b/0x270
  asm_exc_page_fault+0x26/0x30
 RIP: 0033:0x5590e4eb41ea
 Code: 61 cc 66 0f 6f e0 66 0f 61 c2 66 0f db cd 66 0f 69 e2 66 0f 6f d0 66 0f 69 d4 66 0f 61 0
 RSP: 002b:00007ffcad25f030 EFLAGS: 00010202
 RAX: 00005590e4eb8010 RBX: 00007ffcad260f7d RCX: 00007f73c474d44d
 RDX: 00005590e4eb80a0 RSI: 00005590e4eb503c RDI: 000000000000000f
 RBP: 00005590e4eb70a0 R08: 0000000000000000 R09: 00007f73c483a680
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007ffcad25f180 R14: 00005590e4eb6dd8 R15: 00007f73c4869020
  </TASK>
 ------------[ cut here ]------------

Fix this by explicitly tracking the RCU lock state, ensuring that
rcu_read_unlock() in get_non_dying_memcg_end() is strictly paired with the
lock acquisition, regardless of any runtime rebinding events.
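
The pairing problem described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the actual patch: every `mock_*`, `unpaired_*` and `tracked_*` name is hypothetical, `mock_predicate` stands in for the result of cgroup_subsys_on_dfl(memory_cgrp_subsys), and the polarity (lock taken when the predicate is true) is assumed. The commit message does not say how the lock state is tracked; returning a flag from the start helper is one possible way.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for illustration only; none of this is kernel API. */
static bool mock_predicate;   /* plays the role of cgroup_subsys_on_dfl() */
static int mock_depth;        /* nesting depth of the mock read lock */
static int mock_bad_unlocks;  /* unlocks attempted while depth was zero */

static void mock_read_lock(void)
{
	mock_depth++;
}

static void mock_read_unlock(void)
{
	if (mock_depth == 0)
		mock_bad_unlocks++;  /* the "bad unlock balance" case */
	else
		mock_depth--;
}

static void mock_reset(bool predicate)
{
	mock_predicate = predicate;
	mock_depth = 0;
	mock_bad_unlocks = 0;
}

/* Unpaired pattern: start and end each re-read the predicate. */
static void unpaired_start(void)
{
	if (mock_predicate)
		mock_read_lock();
}

static void unpaired_end(void)
{
	if (mock_predicate)
		mock_read_unlock();
}

/* Tracked pattern: start records its decision, end obeys it. */
static bool tracked_start(void)
{
	if (!mock_predicate)
		return false;
	mock_read_lock();
	return true;
}

static void tracked_end(bool locked)
{
	if (!locked)
		return;
	assert(mock_depth > 0);  /* holds by construction */
	mock_read_unlock();
}
```

If the predicate flips from false to true between `unpaired_start()` and `unpaired_end()`, the mock counts one unbalanced unlock, which mirrors the warning above. A flip in the other direction leaves the mock lock held. With `tracked_start()` and `tracked_end(locked)`, the unlock depends only on what the start helper actually did, so neither case occurs.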

Link: https://lore.kernel.org/20260429073105.44472-1-qi.zheng@linux.dev
Fixes: 8285917d6f ("mm: memcontrol: prepare for reparenting non-hierarchical stats")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-30 06:13:20 -07:00
damon mm/damon/stat: detect and use fresh enabled value 2026-04-27 05:54:27 -07:00
kasan kasan: fix bug type classification for SW_TAGS mode 2026-04-05 13:53:18 -07:00
kfence memblock: updates for 7.0-rc1 2026-04-18 11:29:14 -07:00
kmsan Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
tests sparc/mm: export symbols for lazy_mmu_mode KUnit tests 2026-01-31 14:22:40 -08:00
Kconfig Mostly cleanups and small things, notably: 2026-04-20 16:36:46 -07:00
Kconfig.debug mm: kmemleak: add CONFIG_DEBUG_KMEMLEAK_VERBOSE build option 2026-04-18 00:10:48 -07:00
Makefile mm.git review status for linus..mm-nonmm-stable 2026-02-12 12:13:01 -08:00
backing-dev.c mm: blk-cgroup: fix use-after-free in cgwb_release_workfn() 2026-04-18 23:24:27 -07:00
balloon.c mm: rename CONFIG_BALLOON_COMPACTION to CONFIG_BALLOON_MIGRATION 2026-01-31 14:22:36 -08:00
bootmem_info.c mm/bootmem_info: avoid using sparse_decode_mem_map() 2026-04-05 13:53:32 -07:00
bpf_memcontrol.c bpf: Revert "bpf: drop KF_ACQUIRE flag on BPF kfunc bpf_get_root_mem_cgroup()" 2026-01-21 09:38:16 -08:00
cma.c dma-mapping updates for Linux 7.0: 2026-04-17 11:12:42 -07:00
cma.h
cma_debug.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
cma_sysfs.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
compaction.c mm: memcontrol: prepare for reparenting LRU pages for lruvec lock 2026-04-18 00:10:46 -07:00
debug.c mm: constify __dump_folio() arguments 2025-11-20 13:43:57 -08:00
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: replace WRITE_ONCE() with pxd_clear() 2026-04-05 13:53:11 -07:00
dmapool.c
dmapool_test.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
early_ioremap.c mm/early_ioremap: clean up the use of WARN() for debugging 2026-01-26 20:02:26 -08:00
execmem.c mm/execmem: make the populate and alloc atomic 2026-04-05 13:53:34 -07:00
fadvise.c mm/fadvise: validate offset in generic_fadvise 2026-04-05 13:52:53 -07:00
fail_page_alloc.c
failslab.c
filemap.c 7 hotfixes. 6 are cc:stable and all are for MM. Please see the 2026-04-19 14:45:37 -07:00
folio-compat.c mm: add SPDX id lines to some mm source files 2026-02-06 15:47:16 -08:00
gup.c folio_batch: rename pagevec.h to folio_batch.h 2026-04-05 13:53:07 -07:00
gup_test.c mm: add SPDX id lines to some mm source files 2026-02-06 15:47:16 -08:00
gup_test.h
highmem.c mm/highmem: fix __kmap_to_page() build error 2026-01-31 14:22:38 -08:00
hmm.c mm/hmm: Indicate that HMM requires DMA coherency 2026-03-20 12:05:56 +01:00
huge_memory.c mm: thp: prevent memory cgroup release in folio_split_queue_lock{_irqsave}() 2026-04-18 00:10:46 -07:00
hugetlb.c mm.git review status for linus..mm-stable 2026-04-19 08:01:17 -07:00
hugetlb_cgroup.c Convert 'alloc_flex' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
hugetlb_cma.c mm/hugetlb_cma: round up per_node before logging it 2026-04-27 05:54:24 -07:00
hugetlb_cma.h mm: hugetlb: allocate frozen pages for gigantic allocation 2026-01-26 20:02:28 -08:00
hugetlb_internal.h mm/hugetlb: extract sysctl into hugetlb_sysctl.c 2025-11-20 13:43:57 -08:00
hugetlb_sysctl.c mm, hugetlb: implement movable_gigantic_pages sysctl 2026-01-20 19:24:50 -08:00
hugetlb_sysfs.c mm/hugetlb: extract sysfs into hugetlb_sysfs.c 2025-11-20 13:43:57 -08:00
hugetlb_vmemmap.c mm/hugetlb: remove hugetlb_optimize_vmemmap_key static key 2026-04-05 13:53:09 -07:00
hugetlb_vmemmap.h
hwpoison-inject.c mm/hwpoison: decouple hwpoison_filter from mm/memory-failure.c 2025-09-21 14:22:21 -07:00
init-mm.c mm: rename cpu_bitmap field to flexible_array 2026-01-19 12:30:00 -08:00
internal.h 7 hotfixes. 6 are cc:stable and all are for MM. Please see the 2026-04-19 14:45:37 -07:00
interval_tree.c mm/memory: simplify calculation in unmap_mapping_range_tree() 2026-04-05 13:53:13 -07:00
ioremap.c
khugepaged.c mm/khugepaged: fix issue with tracking lock 2026-04-05 13:53:47 -07:00
kmemleak.c mm: kmemleak: add CONFIG_DEBUG_KMEMLEAK_VERBOSE build option 2026-04-18 00:10:48 -07:00
ksm.c mm: convert do_brk_flags() to use vma_flags_t 2026-04-05 13:53:40 -07:00
list_lru.c ttm/pool: port to list_lru. (v2) 2026-04-08 06:52:47 +10:00
maccess.c
madvise.c mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t 2026-04-05 13:53:41 -07:00
mapping_dirty_helpers.c mm/dirty: replace READ_ONCE() with pudp_get() 2025-11-16 17:27:58 -08:00
memblock.c mm.git review status for linus..mm-stable 2026-04-19 08:01:17 -07:00
memcontrol-v1.c mm: memcontrol: eliminate the problem of dying memory cgroup for LRU folios 2026-04-18 00:10:47 -07:00
memcontrol-v1.h mm: memcontrol: prepare for reparenting non-hierarchical stats 2026-04-18 00:10:47 -07:00
memcontrol.c mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end() 2026-04-30 06:13:20 -07:00
memfd.c memfd: export memfd_{add,get}_seals() 2026-04-05 13:53:00 -07:00
memfd_luo.c mm/memfd_luo: remove folio from page cache when accounting fails 2026-04-18 00:10:53 -07:00
memory-failure.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
memory-tiers.c mm: introduce CONFIG_NUMA_MIGRATION and simplify CONFIG_MIGRATION 2026-04-05 13:53:33 -07:00
memory.c mm: on remap assert that input range within the proposed VMA 2026-04-05 13:53:45 -07:00
memory_hotplug.c mm.git review status for linus..mm-stable 2026-04-15 12:59:16 -07:00
mempolicy.c 7 hotfixes. 6 are cc:stable and all are for MM. Please see the 2026-04-19 14:45:37 -07:00
mempool.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
memremap.c mm/zone_device: do not touch device folio after calling ->folio_free() 2026-04-18 23:24:27 -07:00
memtest.c mm/memtest: add underflow detection for size calculation 2026-01-09 11:53:51 +02:00
migrate.c mm: migrate: prevent memory cgroup release in folio_migrate_mapping() 2026-04-18 00:10:45 -07:00
migrate_device.c mm/migrate_device: remove dead migration entry check in migrate_vma_collect_huge_pmd() 2026-04-18 00:10:56 -07:00
mincore.c mm: replace remaining pte_to_swp_entry() with softleaf_from_pte() 2025-11-24 15:08:52 -08:00
mlock.c mm: rename unlock_page_lruvec_irq and its variants 2026-04-18 00:10:44 -07:00
mm_init.c memblock: updates for 7.0-rc1 2026-04-18 11:29:14 -07:00
mm_slot.h
mmap.c mm: convert do_brk_flags() to use vma_flags_t 2026-04-05 13:53:40 -07:00
mmap_lock.c mm/vma: improve and document __is_vma_write_locked() 2026-01-31 14:22:51 -08:00
mmu_gather.c mm/mmu_gather: replace IPI with synchronize_rcu() when batch allocation fails 2026-04-05 13:53:05 -07:00
mmu_notifier.c mm.git review status for linus..mm-stable 2026-04-15 12:59:16 -07:00
mmzone.c mm: introduce memdesc_flags_t 2025-09-13 16:55:07 -07:00
mprotect.c mm/mprotect: special-case small folios when applying permissions 2026-04-18 00:10:55 -07:00
mremap.c mm: convert do_brk_flags() to use vma_flags_t 2026-04-05 13:53:40 -07:00
mseal.c mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t 2026-04-05 13:53:41 -07:00
msync.c
nommu.c mm: abstract reading sysctl_max_map_count, and READ_ONCE() 2026-04-05 13:53:28 -07:00
numa.c
numa_emulation.c
numa_memblks.c memblock: numa_memblks: fix detection of NUMA node for CXL windows 2026-02-21 09:58:22 -08:00
oom_kill.c mm/oom_kill.c: simpilfy rcu call with guard(rcu) 2026-04-05 13:53:17 -07:00
page-writeback.c mm: start background writeback based on per-wb threshold for strictlimit BDIs 2026-04-27 05:54:24 -07:00
page_alloc.c mm.git review status for linus..mm-stable 2026-04-19 08:01:17 -07:00
page_counter.c
page_ext.c mm/page_ext: Add page_ext_get_from_phys() 2026-01-21 12:51:48 +01:00
page_frag_cache.c
page_idle.c mm/page_idle.c: remove redundant mmu notifier in aging code 2026-04-05 13:53:02 -07:00
page_io.c mm/page_io: use sio->len for PSWPIN accounting in sio_read_complete() 2026-04-18 00:10:55 -07:00
page_isolation.c mm: page_isolation: introduce page_is_unmovable() 2026-01-31 14:22:42 -08:00
page_owner.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
page_poison.c
page_reporting.c mm/page_reporting: change page_reporting_order to PAGE_REPORTING_ORDER_UNSPECIFIED 2026-04-05 13:53:17 -07:00
page_reporting.h
page_table_check.c mm/page_table_check: Pass mm_struct to pxx_user_accessible_page() 2026-03-13 00:07:47 +01:00
page_vma_mapped.c mm: centralize+fix comments about compound_mapcount() in new sync_with_folio_pmd_zap() 2026-04-05 13:53:03 -07:00
pagewalk.c mm/pagewalk: fix race between concurrent split and refault 2026-04-05 13:53:37 -07:00
percpu-internal.h
percpu-km.c mm/mm/percpu-km: drop nth_page() usage within single allocation 2025-09-21 14:22:04 -07:00
percpu-stats.c
percpu-vm.c kmsan: remove hard-coded GFP_KERNEL flags 2025-11-16 17:27:54 -08:00
percpu.c mm: memcontrol: return root object cgroup for root memory cgroup 2026-04-18 00:10:44 -07:00
pgalloc-track.h
pgtable-generic.c mm: change to return bool for pmdp_clear_flush_young() 2026-04-05 13:53:35 -07:00
process_vm_access.c
ptdump.c mm/ptdump: replace READ_ONCE() with standard page table accessors 2025-11-16 17:27:52 -08:00
readahead.c mm.git review status for linus..mm-stable 2026-02-12 11:32:37 -08:00
rmap.c mm/mglru: fix cgroup OOM during MGLRU state switching 2026-04-05 13:53:33 -07:00
rodata_test.c
secretmem.c mm: rename VMA flag helpers to be more readable 2026-04-05 13:53:18 -07:00
shmem.c mm.git review status for linus..mm-stable 2026-04-19 08:01:17 -07:00
shmem_quota.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
show_mem.c mm: add gpu active/reclaim per-node stat counters (v2) 2026-04-08 06:52:47 +10:00
shrinker.c mm: memcontrol: remove dead code of checking parent memory cgroup 2026-04-18 00:10:44 -07:00
shrinker_debug.c memcg: rename mem_cgroup_ino() to mem_cgroup_id() 2026-01-26 20:02:25 -08:00
shuffle.c
shuffle.h
slab.h mm.git review status for linus..mm-stable 2026-04-15 12:59:16 -07:00
slab_common.c Merge branch 'slab/for-7.0/sheaves' into slab/for-next 2026-02-10 09:10:00 +01:00
slub.c slub: fix data loss and overflow in krealloc() 2026-04-17 11:07:48 +02:00
sparse-vmemmap.c mm: mark early-init static variables with __meminitdata 2026-04-05 13:53:34 -07:00
sparse.c mm/sparse: fix preinited section_mem_map clobbering on failure path 2026-04-18 00:10:52 -07:00
swap.c mm: vmscan: prepare for reparenting traditional LRU folios 2026-04-18 00:10:46 -07:00
swap.h mm, swap: no need to clear the shadow explicitly 2026-04-05 13:52:59 -07:00
swap_cgroup.c
swap_state.c mm/swap: fix swap cache memcg accounting 2026-04-05 13:53:37 -07:00
swap_table.h mm, swap: use the swap table to track the swap count 2026-04-05 13:52:59 -07:00
swapfile.c mm.git review status for linus..mm-stable 2026-04-15 12:59:16 -07:00
truncate.c 7 hotfixes. 6 are cc:stable and all are for MM. Please see the 2026-04-19 14:45:37 -07:00
usercopy.c usercopy: Remove folio references from check_heap_object() 2025-11-13 11:01:08 +01:00
userfaultfd.c mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry() 2026-04-27 05:54:27 -07:00
util.c mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap() 2026-04-27 05:54:24 -07:00
vma.c mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap() 2026-04-27 05:54:24 -07:00
vma.h mm: allow handling of stacked mmap_prepare hooks in more drivers 2026-04-05 13:53:44 -07:00
vma_exec.c mm: convert do_brk_flags() to use vma_flags_t 2026-04-05 13:53:40 -07:00
vma_init.c Summary of significant series in this pull request: 2025-10-02 18:18:33 -07:00
vma_internal.h mm: relocate the page table ceiling and floor definitions 2026-02-12 15:42:53 -08:00
vmalloc.c vmalloc: fix buffer overflow in vrealloc_node_align() 2026-04-27 05:54:23 -07:00
vmpressure.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
vmscan.c mm/vmscan: avoid false-positive -Wuninitialized warning 2026-04-18 00:10:56 -07:00
vmstat.c mm.git review status for linus..mm-stable 2026-04-19 08:01:17 -07:00
workingset.c mm: workingset: use lruvec_lru_size() to get the number of lru pages 2026-04-18 00:10:47 -07:00
zpdesc.h mm: zpdesc: minor naming and comment corrections 2025-09-21 14:21:59 -07:00
zsmalloc.c mm/zsmalloc: copy KMSAN metadata in zs_page_migrate() 2026-04-05 13:53:34 -07:00
zswap.c mm: zswap: tie per-CPU acomp_ctx lifetime to the pool 2026-04-18 00:10:50 -07:00