mirror-linux/kernel/cgroup
Domenico Cerasuolo d5dca19776 sched/psi: Allow unprivileged polling of N*2s period
[ Upstream commit d82caa2735 ]

PSI offers 2 mechanisms to get information about a specific resource
pressure. One is reading from /proc/pressure/<resource>, which gives
average pressures aggregated every 2s. The other is creating a pollable
fd for a specific resource and cgroup.

The trigger creation requires CAP_SYS_RESOURCE, and gives the
possibility to pick specific time window and threshold, spawing an RT
thread to aggregate the data.

Systemd would like to provide containers the option to monitor pressure
on their own cgroup and sub-cgroups. For example, if systemd launches a
container that itself then launches services, the container should have
the ability to poll() for pressure in individual services. But neither
the container nor the services are privileged.

This patch implements a mechanism to allow unprivileged users to create
pressure triggers. The difference with privileged triggers creation is
that unprivileged ones must have a time window that's a multiple of 2s.
This is so that we can avoid unrestricted spawning of rt threads, and
use instead the same aggregation mechanism done for the averages, which
runs independently of any triggers.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/r/20230330105418.77061-5-cerasuolodomenico@gmail.com
Stable-dep-of: aff037078e ("sched/psi: use kernfs polling functions for PSI trigger polling")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27 08:50:38 +02:00
..
Makefile cgroup: Add misc cgroup controller 2021-04-04 13:34:46 -04:00
cgroup-internal.h memcg: Fix possible use-after-free in memcg_write_event_control() 2022-12-08 10:40:58 -08:00
cgroup-v1.c cgroup: fix missing cpus_read_{lock,unlock}() in cgroup_transfer_tasks() 2023-06-21 16:00:51 +02:00
cgroup.c sched/psi: Allow unprivileged polling of N*2s period 2023-07-27 08:50:38 +02:00
cpuset.c cgroup/cpuset: Add cpuset_can_fork() and cpuset_cancel_fork() methods 2023-04-20 12:35:14 +02:00
debug.c kernel: cgroup: fix misuse of %x 2019-05-06 08:47:48 -07:00
freezer.c cgroup: cleanup comments 2022-03-13 19:19:27 -10:00
legacy_freezer.c cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}() 2023-06-28 11:12:25 +02:00
misc.c misc_cgroup: remove error log to avoid log flood 2021-09-20 07:35:38 -10:00
namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
pids.c cgroup: add pids.peak interface for pids controller 2022-09-04 09:26:51 -10:00
rdma.c cgroup: fix spelling mistakes 2021-05-24 12:45:26 -04:00
rstat.c cgroup: fix display of forceidle time at root 2023-04-20 12:35:13 +02:00