The basic rules are simple:
* stores to dentry->d_flags are OK under dentry->d_lock.
* stores to dentry->d_flags are OK in the dentry constructor, before
becomes potentially visible to other threads.
Unfortunately, there's a couple of exceptions to that, and that's where the
headache comes from.
Main PITA comes from d_set_d_op(); that primitive sets ->d_op
of dentry and adjusts the flags that correspond to presence of individual
methods. It's very easy to misuse; existing uses _are_ safe, but proof
of correctness is brittle.
Use in __d_alloc() is safe (we are within a constructor), but we
might as well precalculate the initial value of ->d_flags when we set
the default ->d_op for given superblock and set ->d_flags directly
instead of messing with that helper.
The reasons why other uses are safe are bloody convoluted; I'm not going
to reproduce it here. See https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/
for gory details, if you care. The critical part is using d_set_d_op() only
just prior to d_splice_alias(), which makes a combination of d_splice_alias()
with setting ->d_op, etc. a natural replacement primitive. Better yet, if
we go that way, it's easy to take setting ->d_op and modifying ->d_flags
under ->d_lock, which eliminates the headache as far as ->d_flags exclusion
rules are concerned. Other exceptions are minor and easy to deal with.
What this series does:
* d_set_d_op() is no longer available; new primitive (d_splice_alias_ops())
is provided, equivalent to combination of d_set_d_op() and d_splice_alias().
* new field of struct super_block - ->s_d_flags. Default value of ->d_flags
to be used when allocating dentries on this filesystem.
* new primitive for setting ->s_d_op: set_default_d_op(). Replaces stores
to ->s_d_op at mount time. All in-tree filesystems converted; out-of-tree
ones will get caught by compiler (->s_d_op is renamed, so stores to it will
be caught). ->s_d_flags is set by the same primitive to match the ->s_d_op.
* a lot of filesystems had ->s_d_op->d_delete equal to always_delete_dentry;
that is equivalent to setting DCACHE_DONTCACHE in ->d_flags, so such filesystems
can bloody well set that bit in ->s_d_flags and drop ->d_delete() from
dentry_operations. In quite a few cases that results in empty dentry_operations,
which means that we can get rid of those.
* kill simple_dentry_operations - not needed anymore.
* massage d_alloc_parallel() to get rid of the other exception wrt ->d_flags
stores - we can set DCACHE_PAR_LOOKUP as soon as we allocate the new dentry;
no need to delay that until we commit to using the sucker.
As the result, ->d_flags stores are all either under ->d_lock or done before
the dentry becomes visible in any shared data structures.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaIQ/tQAKCRBZ7Krx/gZQ
66AhAQDgQ+S224x5YevNXc9mDoGUBMF4OG0n0fIla9rfdL4I6wEAqpOWMNDcVPCZ
GwYOvJ9YuqNdz+MyprAI18Yza4GOmgs=
=rTYB
-----END PGP SIGNATURE-----
Merge tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dentry d_flags updates from Al Viro:
"The current exclusion rules for dentry->d_flags stores are rather
unpleasant. The basic rules are simple:
- stores to dentry->d_flags are OK under dentry->d_lock
- stores to dentry->d_flags are OK in the dentry constructor, before
becomes potentially visible to other threads
Unfortunately, there's a couple of exceptions to that, and that's
where the headache comes from.
The main PITA comes from d_set_d_op(); that primitive sets ->d_op of
dentry and adjusts the flags that correspond to presence of individual
methods. It's very easy to misuse; existing uses _are_ safe, but proof
of correctness is brittle.
Use in __d_alloc() is safe (we are within a constructor), but we might
as well precalculate the initial value of 'd_flags' when we set the
default ->d_op for given superblock and set 'd_flags' directly instead
of messing with that helper.
The reasons why other uses are safe are bloody convoluted; I'm not
going to reproduce it here. See [1] for gory details, if you care. The
critical part is using d_set_d_op() only just prior to
d_splice_alias(), which makes a combination of d_splice_alias() with
setting ->d_op, etc a natural replacement primitive.
Better yet, if we go that way, it's easy to take setting ->d_op and
modifying 'd_flags' under ->d_lock, which eliminates the headache as
far as 'd_flags' exclusion rules are concerned. Other exceptions are
minor and easy to deal with.
What this series does:
- d_set_d_op() is no longer available; instead a new primitive
(d_splice_alias_ops()) is provided, equivalent to combination of
d_set_d_op() and d_splice_alias().
- new field of struct super_block - 's_d_flags'. This sets the
default value of 'd_flags' to be used when allocating dentries on
this filesystem.
- new primitive for setting 's_d_op': set_default_d_op(). This
replaces stores to 's_d_op' at mount time.
All in-tree filesystems converted; out-of-tree ones will get caught
by the compiler ('s_d_op' is renamed, so stores to it will be
caught). 's_d_flags' is set by the same primitive to match the
's_d_op'.
- a lot of filesystems had sb->s_d_op->d_delete equal to
always_delete_dentry; that is equivalent to setting
DCACHE_DONTCACHE in 'd_flags', so such filesystems can bloody well
set that bit in 's_d_flags' and drop 'd_delete()' from
dentry_operations.
In quite a few cases that results in empty dentry_operations, which
means that we can get rid of those.
- kill simple_dentry_operations - not needed anymore
- massage d_alloc_parallel() to get rid of the other exception wrt
'd_flags' stores - we can set DCACHE_PAR_LOOKUP as soon as we
allocate the new dentry; no need to delay that until we commit to
using the sucker.
As the result, 'd_flags' stores are all either under ->d_lock or done
before the dentry becomes visible in any shared data structures"
Link: https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/ [1]
* tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
configfs: use DCACHE_DONTCACHE
debugfs: use DCACHE_DONTCACHE
efivarfs: use DCACHE_DONTCACHE instead of always_delete_dentry()
9p: don't bother with always_delete_dentry
ramfs, hugetlbfs, mqueue: set DCACHE_DONTCACHE
kill simple_dentry_operations
devpts, sunrpc, hostfs: don't bother with ->d_op
shmem: no dentry retention past the refcount reaching zero
d_alloc_parallel(): set DCACHE_PAR_LOOKUP earlier
make d_set_d_op() static
simple_lookup(): just set DCACHE_DONTCACHE
tracefs: Add d_delete to remove negative dentries
set_default_d_op(): calculate the matching value for ->d_flags
correct the set of flags forbidden at d_set_d_op() time
split d_flags calculation out of d_set_d_op()
new helper: set_default_d_op()
fuse: no need for special dentry_operations for root dentry
switch procfs from d_set_d_op() to d_splice_alias_ops()
new helper: d_splice_alias_ops()
procfs: kill ->proc_dops
...