mirror-linux/drivers/block
Ilya Dryomov 9fc75b71fd rbd: eliminate a race in lock_dwork draining on unmap
Given how rbd_lock_add_request() and rbd_img_exclusive_lock() are
written, lock_dwork may be (re)queued more often than is actually needed:
for example, when a new I/O request comes in while we are in the
middle of rbd_acquire_lock() on behalf of another I/O request.  This is
expected and, with rbd_release_lock() preemptively canceling lock_dwork,
is benign under normal operation.

A more problematic example is maybe_kick_acquire():

    if (have_requests || delayed_work_pending(&rbd_dev->lock_dwork)) {
            dout("%s rbd_dev %p kicking lock_dwork\n", __func__, rbd_dev);
            mod_delayed_work(rbd_dev->task_wq, &rbd_dev->lock_dwork, 0);
    }

It's not unrealistic for lock_dwork to get canceled right after
delayed_work_pending() returns true and for mod_delayed_work() to
requeue it right there anyway.  This is a classic TOCTOU race.

When it comes to unmapping the image, there is an implicit assumption
of no self-initiated exclusive lock activity past the point of return
from rbd_dev_image_unlock() which unlocks the lock if it happens to be
held.  This unlock is assumed to be final and lock_dwork (as well as
all other exclusive lock tasks, really) isn't expected to get queued
again.  However, lock_dwork is canceled only in cancel_tasks_sync()
(i.e. later in the unmap sequence) and, on top of that, the cancellation
can in effect be nullified by maybe_kick_acquire().  This may result
in rbd_acquire_lock() executing after rbd_dev_device_release() and
rbd_dev_image_release() have run and freed and/or reset a number of things.
One of the possible failure modes then is a violated

    rbd_assert(rbd_image_format_valid(rbd_dev->image_format));

in rbd_dev_header_info() which is called via rbd_dev_refresh() from
rbd_post_acquire_action().

Redo exclusive lock task draining to provide saner semantics and try
to meet the assumptions around rbd_dev_image_unlock().

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2026-05-20 22:09:08 +02:00
aoe Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
drbd drbd: use get_random_u64() where appropriate 2026-04-07 06:27:39 -06:00
mtip32xx block: switch ->getgeo() to struct gendisk 2025-08-13 02:59:29 -04:00
null_blk Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rnbd Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
rnull configfs changes for v7.0 2026-02-12 14:01:38 -08:00
xen-blkback Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
zram zram: reject unrecognized type= values in recompress_store() 2026-04-18 00:10:55 -07:00
Kconfig rbd: stop selecting CRC32, CRYPTO, and CRYPTO_AES 2025-12-10 11:50:54 +01:00
Makefile rnull: move driver to separate directory 2025-09-02 05:23:56 -06:00
amiflop.c block: switch ->getgeo() to struct gendisk 2025-08-13 02:59:29 -04:00
ataflop.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
brd.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
floppy.c Revert "floppy: fix reference leak on platform_device_register() failure" 2026-04-23 05:07:37 -06:00
loop.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
n64cart.c block: move the nonrot flag to queue_limits 2024-06-19 07:58:28 -06:00
nbd.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ps3disk.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ps3vram.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
rbd.c rbd: eliminate a race in lock_dwork draining on unmap 2026-05-20 22:09:08 +02:00
rbd_types.h
sunvdc.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
swim.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
swim3.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
swim_asm.S
ublk_drv.c ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation 2026-05-11 07:44:20 -06:00
virtio_blk.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
xen-blkfront.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
z2ram.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
zloop.c zloop: remove irq-safe locking 2026-04-15 13:58:37 -06:00