Add a new clone3() flag CLONE_AUTOREAP that makes a child process auto-reap on exit without ever becoming a zombie. This is a per-process property in contrast to the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for SIGCHLD which applies to all children of a given parent. Currently the only way to automatically reap children is to set SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped property affecting all children which makes it unsuitable for libraries or applications that need selective auto-reaping of specific children while still being able to wait() on others. CLONE_AUTOREAP stores an autoreap flag in the child's signal_struct. When the child exits do_notify_parent() checks this flag and causes exit_notify() to transition the task directly to EXIT_DEAD. Since the flag lives on the child it survives reparenting: if the original parent exits and the child is reparented to a subreaper or init the child still auto-reaps when it eventually exits. CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent to monitor the child's exit via poll() and retrieve exit status via PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget pattern where the parent simply doesn't care about the child's exit status. No exit signal is delivered so exit_signal must be zero. CLONE_AUTOREAP is rejected in combination with CLONE_PARENT. If a CLONE_AUTOREAP child were to clone(CLONE_PARENT) the new grandchild would inherit exit_signal == 0 from the autoreap parent's group leader but without signal->autoreap. This grandchild would become a zombie that never sends a signal and is never autoreaped - confusing and arguably broken behavior. The flag is not inherited by the autoreap process's own children. Each child that should be autoreaped must be explicitly created with CLONE_AUTOREAP. Link: https://github.com/uapi-group/kernel-features/issues/45 Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-1-d148b984a989@kernel.org Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|---|---|---|
| .. | ||
| affinity.h | ||
| autogroup.h | ||
| clock.h | ||
| cond_resched.h | ||
| coredump.h | ||
| cpufreq.h | ||
| cputime.h | ||
| deadline.h | ||
| debug.h | ||
| ext.h | ||
| hotplug.h | ||
| idle.h | ||
| init.h | ||
| isolation.h | ||
| jobctl.h | ||
| loadavg.h | ||
| mm.h | ||
| nohz.h | ||
| numa_balancing.h | ||
| posix-timers.h | ||
| prio.h | ||
| rseq_api.h | ||
| rt.h | ||
| sd_flags.h | ||
| signal.h | ||
| smt.h | ||
| stat.h | ||
| sysctl.h | ||
| task.h | ||
| task_flags.h | ||
| task_stack.h | ||
| thread_info_api.h | ||
| topology.h | ||
| types.h | ||
| user.h | ||
| vhost_task.h | ||
| wake_q.h | ||
| xacct.h | ||