mirror-linux/kernel/bpf
Menglong Dong 378b770819 sched: Make migrate_{en,dis}able() inline
Currently, migrate_enable() and migrate_disable() are global (out-of-line)
functions, which makes them hotspots in some cases. Take BPF for example:
the calls to migrate_enable and migrate_disable in the BPF trampoline can
introduce significant overhead. The following is the 'perf top' output of
the FENTRY benchmark (./tools/testing/selftests/bpf/bench trig-fentry):

  54.63% bpf_prog_2dcccf652aac1793_bench_trigger_fentry [k]
                 bpf_prog_2dcccf652aac1793_bench_trigger_fentry
  10.43% [kernel] [k] migrate_enable
  10.07% bpf_trampoline_6442517037 [k] bpf_trampoline_6442517037
  8.06% [kernel] [k] __bpf_prog_exit_recur
  4.11% libc.so.6 [.] syscall
  2.15% [kernel] [k] entry_SYSCALL_64
  1.48% [kernel] [k] memchr_inv
  1.32% [kernel] [k] fput
  1.16% [kernel] [k] _copy_to_user
  0.73% [kernel] [k] bpf_prog_test_run_raw_tp

So in this commit, we make migrate_enable/migrate_disable inline to obtain
better performance. The struct rq is defined internally in
kernel/sched/sched.h, and the field "nr_pinned" is accessed in
migrate_enable/migrate_disable, which makes it hard to make them inline.

Alexei Starovoitov suggested generating the offset of "nr_pinned" in [1],
so that we can define migrate_enable/migrate_disable in
include/linux/sched.h and access "this_rq()->nr_pinned" with
"(void *)this_rq() + RQ_nr_pinned".
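
The offset-based access described above can be sketched in user space as
follows. This is only an illustration of the technique, not the kernel
code: "struct rq_demo", "RQ_DEMO_nr_pinned", demo_pin() and demo_unpin()
are hypothetical names, and the offset is computed in place rather than
read from a generated header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler-private struct rq. */
struct rq_demo {
	unsigned long nr_running;
	unsigned int nr_pinned;
};

/*
 * Assumption: in a real build this constant would come from a generated
 * header, so that callers never need the struct definition. Here it is
 * computed in place to keep the sketch self-contained.
 */
#define RQ_DEMO_nr_pinned offsetof(struct rq_demo, nr_pinned)

/* The accessor below hands out an unsigned int pointer; check the type size. */
static_assert(sizeof(((struct rq_demo *)0)->nr_pinned) == sizeof(unsigned int),
	      "nr_pinned is expected to be an unsigned int");

/* Reach the field through an untyped pointer plus a byte offset. */
static inline unsigned int *rq_demo_nr_pinned(void *rq)
{
	return (unsigned int *)((char *)rq + RQ_DEMO_nr_pinned);
}

/* Inline helpers that only need the offset, not the struct layout. */
static inline void demo_pin(void *rq)
{
	(*rq_demo_nr_pinned(rq))++;
}

static inline void demo_unpin(void *rq)
{
	(*rq_demo_nr_pinned(rq))--;
}
```

The point of the design is that the inline helpers depend on a single
integer constant, so a widely included header does not have to expose the
private structure.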

The offset of "nr_pinned" is generated in include/generated/rq-offsets.h
by kernel/sched/rq-offsets.c.
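
As a rough user-space analogy for generating such an offset header, the
sketch below formats a "#define" line from offsetof(). It does not
reproduce the kernel build machinery; "struct rq_demo",
emit_offset_define() and emit_rq_demo_offsets() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a structure whose layout is private. */
struct rq_demo {
	unsigned long nr_running;
	unsigned int nr_pinned;
};

/* nr_pinned follows an unsigned long, so its offset cannot be smaller. */
static_assert(offsetof(struct rq_demo, nr_pinned) >= sizeof(unsigned long),
	      "unexpected layout for struct rq_demo");

/*
 * Format one "#define NAME OFFSET" line into buf.
 * Returns what snprintf() returns: the length the line needs.
 */
static int emit_offset_define(char *buf, size_t len, const char *name,
			      size_t offset)
{
	return snprintf(buf, len, "#define %s %zu\n", name, offset);
}

/* Emit the one offset this sketch cares about. */
static int emit_rq_demo_offsets(char *buf, size_t len)
{
	return emit_offset_define(buf, len, "RQ_DEMO_nr_pinned",
				  offsetof(struct rq_demo, nr_pinned));
}
```

A generator like this would run at build time and its output would be
included by code that must not see the structure definition itself.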

Generally speaking, we move the definitions of migrate_enable and
migrate_disable from kernel/sched/core.c to include/linux/sched.h. The
call to __set_cpus_allowed_ptr() is left in ___migrate_enable().

The "struct rq" is not available in include/linux/sched.h, so we can't
access the "runqueues" with this_cpu_ptr(), as the compilation will fail
in this_cpu_ptr() -> raw_cpu_ptr() -> __verify_pcpu_ptr():
  typeof((ptr) + 0)

So we introduce this_rq_raw() and access the runqueues with
arch_raw_cpu_ptr/PERCPU_PTR directly.

The variable "runqueues" is not visible to kernel modules, and exporting
it is not a good idea. As Peter Zijlstra advised in [2], we also define and
export migrate_enable/migrate_disable in kernel/sched/core.c, and modules
use those out-of-line versions.

Before this patch, the performance of BPF FENTRY is:

  fentry         :  113.030 ± 0.149M/s
  fentry         :  112.501 ± 0.187M/s
  fentry         :  112.828 ± 0.267M/s
  fentry         :  115.287 ± 0.241M/s

After this patch, the performance of BPF FENTRY increases to:

  fentry         :  143.644 ± 0.670M/s
  fentry         :  149.764 ± 0.362M/s
  fentry         :  149.642 ± 0.156M/s
  fentry         :  145.263 ± 0.221M/s
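
To put the two sets of runs side by side: the four "before" results
average about 113.41M/s and the four "after" results about 147.08M/s, an
improvement of roughly 29.7%. A small sketch of that arithmetic follows;
the helper names (mean(), fentry_speedup_percent()) are hypothetical and
the numbers are copied from the results above.

```c
#include <assert.h>
#include <stddef.h>

/* Throughput in M/s, copied from the benchmark runs listed above. */
static const double fentry_before[] = { 113.030, 112.501, 112.828, 115.287 };
static const double fentry_after[]  = { 143.644, 149.764, 149.642, 145.263 };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static_assert(ARRAY_LEN(fentry_before) == ARRAY_LEN(fentry_after),
	      "both sets are expected to have the same number of runs");

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative improvement of the "after" mean over the "before" mean, in %. */
static double fentry_speedup_percent(void)
{
	double before = mean(fentry_before, ARRAY_LEN(fentry_before));
	double after = mean(fentry_after, ARRAY_LEN(fentry_after));

	return (after / before - 1.0) * 100.0;
}
```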

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/CAADnVQ+5sEDKHdsJY5ZsfGDO_1SEhhQWHrt2SMBG5SYyQ+jt7w@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/20250819123214.GH4067720@noisy.programming.kicks-ass.net/ [2]
2025-09-25 09:57:16 +02:00
preload umd: Remove usermode driver framework 2025-07-26 21:03:04 +02:00
Kconfig bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of 2024-05-14 00:36:29 -07:00
Makefile bpf: Introduce BPF standard streams 2025-07-03 19:30:06 -07:00
arena.c bpf/arena: add bpf_arena_reserve_pages kfunc 2025-07-11 10:43:54 -07:00
arraymap.c bpf: add btf_type_is_i{32,64} helpers 2025-06-25 15:15:49 -07:00
bloom_filter.c
bpf_cgrp_storage.c bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage 2025-03-18 19:05:46 -07:00
bpf_inode_storage.c bpf: Disable migration when destroying inode storage 2025-01-08 18:06:36 -08:00
bpf_iter.c bpf: Add attach_type field to bpf_link 2025-07-11 10:51:55 -07:00
bpf_local_storage.c bpf: add btf_type_is_i{32,64} helpers 2025-06-25 15:15:49 -07:00
bpf_lru_list.c bpf: Adjust free target to avoid global starvation of LRU map 2025-06-18 18:50:14 -07:00
bpf_lru_list.h bpf: Adjust free target to avoid global starvation of LRU map 2025-06-18 18:50:14 -07:00
bpf_lsm.c bpf: lsm: Add two more sleepable hooks 2025-02-13 19:35:31 -08:00
bpf_struct_ops.c bpf: Add attach_type field to bpf_link 2025-07-11 10:51:55 -07:00
bpf_task_storage.c bpf: Remove migrate_{disable|enable} from bpf_task_storage_lock helpers 2025-01-08 18:06:36 -08:00
btf.c bpf-next-6.17 2025-07-30 09:58:50 -07:00
btf_iter.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
btf_relocate.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
cgroup.c bpf-next-6.17 2025-07-30 09:58:50 -07:00
cgroup_iter.c
core.c bpf: Fix oob access in cgroup local storage 2025-07-31 11:30:05 -07:00
cpumap.c net: Create separate gro_flush_normal function 2025-07-24 18:34:55 -07:00
cpumask.c bpf: fix missing kdoc string fields in cpumask.c 2025-03-15 11:48:57 -07:00
crypto.c bpf: crypto: make state and IV dynptr nullable 2024-06-13 16:33:04 -07:00
devmap.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-12-12 14:19:05 -08:00
disasm.c bpftool: Using the right format specifiers 2025-03-17 13:50:56 -07:00
disasm.h
dispatcher.c bpf: Add kernel symbol for struct_ops trampoline 2024-11-12 17:13:46 -08:00
dmabuf_iter.c bpf: Add open coded dmabuf iterator 2025-05-27 09:51:25 -07:00
hashtab.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4 2025-04-28 08:40:45 -07:00
helpers.c vfs-6.17-rc1.bpf 2025-07-28 14:42:31 -07:00
inode.c VFS: rename lookup_one_len family to lookup_noperm and remove permission check 2025-04-08 11:24:36 +02:00
kmem_cache_iter.c bpf: Add open coded version of kmem_cache iterator 2024-11-01 11:08:32 -07:00
link_iter.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
local_storage.c bpf: add btf_type_is_i{32,64} helpers 2025-06-25 15:15:49 -07:00
log.c bpf: Introduce support for bpf_local_irq_{save,restore} 2024-12-04 08:38:29 -08:00
lpm_trie.c bpf: Convert lpm_trie.c to rqspinlock 2025-03-19 08:03:05 -07:00
map_in_map.c bpf: switch maps to CLASS(fd, ...) 2024-08-13 15:58:17 -07:00
map_in_map.h
map_iter.c
memalloc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2024-11-13 12:52:51 -08:00
mmap_unlock_work.h
mprog.c
net_namespace.c bpf: Remove attach_type in bpf_netns_link 2025-07-11 11:01:04 -07:00
offload.c net: move misc netdev_lock flavors to a separate header 2025-03-08 09:06:50 -08:00
percpu_freelist.c bpf: Convert percpu_freelist.c to rqspinlock 2025-03-19 08:03:05 -07:00
percpu_freelist.h bpf: Convert percpu_freelist.c to rqspinlock 2025-03-19 08:03:05 -07:00
prog_iter.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
queue_stack_maps.c bpf: Convert queue_stack map to rqspinlock 2025-04-10 12:51:10 -07:00
range_tree.c bpf: Disable migration before calling ops->map_free() 2025-01-08 18:06:36 -08:00
range_tree.h bpf: Introduce range_tree data structure and use it in bpf arena 2024-11-13 13:52:45 -08:00
relo_core.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
reuseport_array.c bpf: Use sockfd_put() helper 2024-08-30 08:57:47 -07:00
ringbuf.c bpf: Convert ringbuf map to rqspinlock 2025-04-11 10:28:26 -07:00
rqspinlock.c bpf: Report rqspinlock deadlocks/timeout to BPF stderr 2025-07-03 19:30:07 -07:00
rqspinlock.h rqspinlock: Protect waiters in queue from stalls 2025-03-19 08:03:05 -07:00
stackmap.c bpf: wire up sleepable bpf_get_stack() and bpf_get_task_stack() helpers 2024-09-11 09:58:31 -07:00
stream.c bpf: Fix improper int-to-ptr cast in dump_stack_cb 2025-07-07 08:30:15 -07:00
syscall.c bpf: Move bpf map owner out of common struct 2025-07-31 11:30:05 -07:00
sysfs_btf.c Driver core changes for 6.17-rc1 2025-07-29 12:15:39 -07:00
task_iter.c vfs-6.13.file 2024-11-18 10:30:29 -08:00
tcx.c bpf: Remove location field in tcx_link 2025-07-11 11:00:57 -07:00
tnum.c bpf: Add range tracking for BPF_NEG 2025-06-25 15:12:17 -07:00
token.c bpf: Add struct bpf_token_info 2025-07-16 18:38:05 -07:00
trampoline.c bpf: Add attach_type field to bpf_link 2025-07-11 10:51:55 -07:00
verifier.c sched: Make migrate_{en,dis}able() inline 2025-09-25 09:57:16 +02:00