mirror-linux/Documentation
Aaron Tomlin 49085e1b70 hung_task: enable runtime reset of hung_task_detect_count
Currently, the hung_task_detect_count sysctl provides a cumulative count
of hung tasks since boot.  In long-running, high-availability
environments, this counter may lose its utility if it cannot be reset once
an incident has been resolved.  Furthermore, the previous implementation
relied upon implicit ordering, which could not strictly guarantee that
diagnostic metadata published by one CPU was visible to the panic logic on
another.

This patch introduces the capability to reset the detection count by
writing "0" to the hung_task_detect_count sysctl.  The proc_handler logic
has been updated to validate this input and atomically reset the counter.
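
As an illustration only, the validate-then-reset behaviour described above
can be modelled in userspace with C11 atomics.  This is a minimal sketch,
not the kernel's proc_handler: detect_count and detect_count_write() are
hypothetical names, and rejecting non-zero input with -EINVAL is an
assumption of the sketch.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-in for the detection counter (hypothetical name). */
static atomic_long detect_count;

/*
 * Hypothetical write handler: accept only the value 0 and reset the
 * counter. Returning -EINVAL for anything else is an assumption here.
 */
static int detect_count_write(const char *buf)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	/* Tolerate one trailing newline, as written by "echo 0". */
	if (*end == '\n')
		end++;
	if (*end != '\0' || val != 0)
		return -EINVAL;

	/* The reset itself is a single atomic store. */
	atomic_store_explicit(&detect_count, 0, memory_order_relaxed);
	return 0;
}
```

In this model, writing "0\n" clears the counter, while "1\n" or
non-numeric input leaves it untouched and returns an error.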

The synchronisation of sysctl_hung_task_detect_count relies upon a
transactional model to ensure the integrity of the detection counter
against concurrent resets from userspace.  The application of
atomic_long_read_acquire() and atomic_long_cmpxchg_release() is correct
and provides the following guarantees:

    1. Prevention of Load-Store Reordering via Acquire Semantics
       By utilising atomic_long_read_acquire() to snapshot the counter
       before initiating the task traversal, we establish a strict
       memory barrier. This prevents the compiler or hardware from
       reordering the initial load to a point later in the scan. Without
       this "acquire" barrier, a delayed load could potentially read a
       "0" value resulting from a userspace reset that occurred
       mid-scan. This would lead to the subsequent cmpxchg succeeding
       erroneously, thereby overwriting the user's reset with stale
       increment data.

    2. Atomicity of the "Commit" Phase via Release Semantics
       The atomic_long_cmpxchg_release() serves as the transaction's commit
       point. The "release" barrier ensures that all diagnostic
       recordings and task-state observations made during the scan are
       globally visible before the counter is incremented.

    3. Race Condition Resolution
       This pairing effectively detects any "out-of-band" reset of the
       counter. If
       sysctl_hung_task_detect_count is modified via the procfs
       interface during the scan, the final cmpxchg will detect the
       discrepancy between the current value and the "acquire" snapshot.
       Consequently, the update will fail, ensuring that a reset command
       from the administrator is prioritised over a scan that may have
       been invalidated by that very reset.
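
The snapshot/commit pattern in the three points above can be sketched in
userspace with C11 atomics.  This is a simplified model under stated
assumptions, not the kernel code: detect_count, scan_begin(),
scan_commit() and admin_reset() are hypothetical names, and committing
the number of hung tasks found in one step is an assumption of the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for sysctl_hung_task_detect_count (hypothetical). */
static atomic_long detect_count;

/* Snapshot the counter with acquire ordering before the scan starts. */
static long scan_begin(void)
{
	return atomic_load_explicit(&detect_count, memory_order_acquire);
}

/*
 * Commit point: publish snapshot + found with release ordering, but only
 * if the counter still holds the snapshot value. Returns false when the
 * counter was changed in the meantime (for example by a reset).
 */
static bool scan_commit(long snapshot, long found)
{
	long expected = snapshot;

	return atomic_compare_exchange_strong_explicit(&detect_count,
			&expected, snapshot + found,
			memory_order_release, memory_order_relaxed);
}

/* Models an administrator writing "0" while a scan is in flight. */
static void admin_reset(void)
{
	atomic_store_explicit(&detect_count, 0, memory_order_relaxed);
}
```

If admin_reset() runs between scan_begin() and scan_commit(), the
compare-and-exchange sees a value that differs from the snapshot and
fails, so the counter stays at 0 rather than being overwritten with a
stale sum.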

Link: https://lkml.kernel.org/r/20260303203031.4097316-3-atomlin@atomlin.com
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Joel Granados <joel.granados@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 21:19:40 -07:00
ABI 15 hotfixes. 6 are cc:stable. 14 are for MM. 2026-03-10 12:47:56 -07:00
PCI Networking changes for 7.0 2026-02-11 19:31:52 -08:00
RCU A slightly calmer cycle for docs this time around, though there is still a 2026-02-09 20:53:18 -08:00
accel
accounting delayacct: add timestamp of delay max 2026-01-31 16:16:06 -08:00
admin-guide hung_task: enable runtime reset of hung_task_detect_count 2026-03-27 21:19:40 -07:00
arch RISC-V updates for v7.0 2026-02-12 19:17:44 -08:00
block
bpf bpf-next-7.0 2026-02-10 11:26:21 -08:00
cdrom
core-api A handful of small, late-arriving documentation fixes. 2026-02-15 10:47:59 -08:00
cpu-freq
crypto
dev-tools kunit: Add documentation of --list_suites 2026-03-09 10:46:02 -06:00
devicetree regulator: Fix for v7.0 2026-03-20 09:52:45 -07:00
doc-guide
driver-api docs: driver-model: document driver_override 2026-03-17 20:30:57 +01:00
edac
fault-injection
fb
features s390: Document s390 stackprotector support 2026-02-03 12:48:27 +01:00
filesystems overlayfs updates for 7.0 2026-02-17 15:08:24 -08:00
firmware-guide docs: fix 're-use' -> 'reuse' in documentation 2026-02-02 09:54:15 -07:00
firmware_class
fpga
gpu drm for 7.0-rc1 2026-02-11 12:55:44 -08:00
hid
hwmon Revert "hwmon: add SMARC-sAM67 support" 2026-02-24 07:25:26 -08:00
i2c
iio
images
infiniband
input docs: fix 're-use' -> 'reuse' in documentation 2026-02-02 09:54:15 -07:00
isdn
kbuild Kbuild/Kconfig updates for 7.0 2026-02-11 13:40:35 -08:00
kernel-hacking
leds docs: leds: Document TI LP5812 LED driver 2026-02-04 09:23:37 +00:00
litmus-tests
livepatch
locking
maintainer
mhi
misc-devices TTY / Serial driver updates for 7.0-rc1 2026-02-17 09:30:52 -08:00
mm A handful of small, late-arriving documentation fixes. 2026-02-15 10:47:59 -08:00
netlabel
netlink net: shaper: protect from late creation of hierarchy 2026-03-19 13:47:15 +01:00
networking ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow() 2026-02-18 16:46:36 -08:00
nvdimm
nvme
pcmcia
peci
power power supply and reset changes for the 7.0 series 2026-02-12 18:24:37 -08:00
process A handful of small, late-arriving documentation fixes. 2026-02-15 10:47:59 -08:00
rust Rust changes for v6.20 / v7.0 2026-02-10 11:53:01 -08:00
scheduler sched_ext: Documentation: Update sched-ext.rst 2026-03-06 12:40:27 -10:00
scsi SCSI misc on 20260212 2026-02-12 15:43:02 -08:00
security docs: trusted-encryped: add PKWM as a new trust source 2026-01-30 09:27:27 +05:30
sound ALSA: doc: usb-audio: Add doc for QUIRK_FLAG_SKIP_IFACE_SETUP 2026-03-03 07:35:24 +01:00
sphinx
sphinx-includes
sphinx-static
spi spi: Updates for v7.0 2026-02-11 09:43:43 -08:00
staging
sunrpc/xdr Add RPC language definition of NFSv4 POSIX ACL extension 2026-01-29 09:48:33 -05:00
target
tee
timers
tools RTLA patches for v7.0 2026-02-12 14:31:02 -08:00
trace Char/Misc/IIO driver changes for 7.0-rc1 2026-02-17 09:11:04 -08:00
translations mm.git review status for linus..mm-stable 2026-02-12 11:32:37 -08:00
usb USB / Thunderbolt changes for 7.0-rc1 2026-02-17 09:36:43 -08:00
userspace-api Char/Misc/IIO driver changes for 7.0-rc1 2026-02-17 09:11:04 -08:00
virt Documentation: kvm: fix formatting of the quirks table 2026-03-11 19:16:52 +01:00
w1
watchdog linux-watchdog 6.20-rc1 tag 2026-02-16 12:21:22 -08:00
wmi platform-drivers-x86 for v7.0-1 2026-02-13 15:39:15 -08:00
.gitignore
.renames.txt
Changes
CodingStyle
Kconfig
Makefile
SubmittingPatches
atomic_bitops.txt
atomic_t.txt
conf.py
docutils.conf
index.rst
memory-barriers.txt
subsystem-apis.rst