Lock debugging:
- Implement compiler-driven static analysis locking context
checking, using the upcoming Clang 22 compiler's context
analysis features. (Marco Elver)
We removed Sparse context analysis support, because prior to
removal even a defconfig kernel produced 1,700+ context
tracking Sparse warnings, the overwhelming majority of which
are false positives. On an allmodconfig kernel the number of
false positive context tracking Sparse warnings grows to
over 5,200... On the plus side of the balance actual locking
bugs found by Sparse context analysis is also rather ... sparse:
I found only 3 such commits in the last 3 years. So the
rate of false positives and the maintenance overhead is
rather high and there appears to be no active policy in
place to achieve a zero-warnings baseline to move the
annotations & fixers to developers who introduce new code.
Clang context analysis is more complete and more aggressive
in trying to find bugs, at least in principle. Plus it has
a different model to enabling it: it's enabled subsystem by
subsystem, which results in zero warnings on all relevant
kernel builds (as far as our testing managed to cover it).
Which allowed us to enable it by default, similar to other
compiler warnings, with the expectation that there are no
warnings going forward. This enforces a zero-warnings baseline
on clang-22+ builds. (Which are still limited in distribution,
admittedly.)
Hopefully the Clang approach can lead to a more maintainable
zero-warnings status quo and policy, with more and more
subsystems and drivers enabling the feature. Context tracking
can be enabled for all kernel code via WARN_CONTEXT_ANALYSIS_ALL=y
(default disabled), but this will generate a lot of false positives.
( Having said that, Sparse support could still be added back,
if anyone is interested - the removal patch is still
relatively straightforward to revert at this stage. )
Rust integration updates: (Alice Ryhl, Fujita Tomonori, Boqun Feng)
- Add support for Atomic<i8/i16/bool> and replace most Rust native
AtomicBool usages with Atomic<bool>
- Clean up LockClassKey and improve its documentation
- Add missing Send and Sync trait implementation for SetOnce
- Make ARef Unpin as it is supposed to be
- Add __rust_helper to a few Rust helpers as a preparation for
helper LTO
- Inline various lock related functions to avoid additional
function calls.
WW mutexes:
- Extend ww_mutex tests and other test-ww_mutex updates (John Stultz)
Misc fixes and cleanups:
- rcu: Mark lockdep_assert_rcu_helper() __always_inline
(Arnd Bergmann)
- locking/local_lock: Include more missing headers (Peter Zijlstra)
- seqlock: fix scoped_seqlock_read kernel-doc (Randy Dunlap)
- rust: sync: Replace `kernel::c_str!` with C-Strings
(Tamir Duberstein)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmIXiURHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gH+A/9GX5UmU6+HuDfDrCtXm9GDve6wkwahvcW
jLDxOYjs764I2BhyjZnjKjyF5zw60hbykem7Wcf5EV2YH30nM4XRgEWVJfkr1UAI
Pra415X4DdOzZ6qYQIpO8Udt1LtR7BMSaXITVLJaLicxEoOVtq3SKxjqyhCFs7UW
MfJdqleB+RMLqq3LlzgB4l43eKk1xyeHh+oQwI0RSxuIpVZme3p4TObnCKjIWnK7
Ihd+dkgC852WBjANgNL7F/sd5UsF5QX3wjtOrLhMKvkIgTPdXln0g398pivjN/G/
Kpnw18SFeb159JfJu8eMotsYvVnQ0D5aOcTBfL4qvOHCImhpcu2s6ik9BcXqt2yT
8IiuWk9xEM3Ok+I/I4ClT5cf5GYpyigV2QsXxn+IjDX5Na8v4zlHh0r8SElP8fOt
7dpQx7iw8UghAib3AzA3suN78Oh39m8l5BNobj7LAjnqOQcVvoPo4o7/48ntuH7A
38EucFrXfxQBMfGbMwvxEmgYuX7MyVfQLaPE06MHy1BkZkffT8Um38TB0iNtZmtf
WUx01yLKWYspehlwFi319uVI4/Zp7FnTfqa5uKv1oSXVdL9vZojSXUzrgDV7FVqT
Z4xAAw/kwNHpUG7y0zNOqd6PukovG1t+CjbLvK+eHPwc5c0vEGG2oTRAfEvvP1z/
kesYDmCyJnk=
=N1gA
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2026-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Lock debugging:
- Implement compiler-driven static analysis locking context checking,
using the upcoming Clang 22 compiler's context analysis features
(Marco Elver)
We removed Sparse context analysis support, because prior to
removal even a defconfig kernel produced 1,700+ context tracking
Sparse warnings, the overwhelming majority of which are false
positives. On an allmodconfig kernel the number of false positive
context tracking Sparse warnings grows to over 5,200... On the plus
side of the balance actual locking bugs found by Sparse context
analysis is also rather ... sparse: I found only 3 such commits in
the last 3 years. So the rate of false positives and the
maintenance overhead is rather high and there appears to be no
active policy in place to achieve a zero-warnings baseline to move
the annotations & fixers to developers who introduce new code.
Clang context analysis is more complete and more aggressive in
trying to find bugs, at least in principle. Plus it has a different
model to enabling it: it's enabled subsystem by subsystem, which
results in zero warnings on all relevant kernel builds (as far as
our testing managed to cover it). Which allowed us to enable it by
default, similar to other compiler warnings, with the expectation
that there are no warnings going forward. This enforces a
zero-warnings baseline on clang-22+ builds (Which are still limited
in distribution, admittedly)
Hopefully the Clang approach can lead to a more maintainable
zero-warnings status quo and policy, with more and more subsystems
and drivers enabling the feature. Context tracking can be enabled
for all kernel code via WARN_CONTEXT_ANALYSIS_ALL=y (default
disabled), but this will generate a lot of false positives.
( Having said that, Sparse support could still be added back,
if anyone is interested - the removal patch is still
relatively straightforward to revert at this stage. )
Rust integration updates: (Alice Ryhl, Fujita Tomonori, Boqun Feng)
- Add support for Atomic<i8/i16/bool> and replace most Rust native
AtomicBool usages with Atomic<bool>
- Clean up LockClassKey and improve its documentation
- Add missing Send and Sync trait implementation for SetOnce
- Make ARef Unpin as it is supposed to be
- Add __rust_helper to a few Rust helpers as a preparation for
helper LTO
- Inline various lock related functions to avoid additional function
calls
WW mutexes:
- Extend ww_mutex tests and other test-ww_mutex updates (John
Stultz)
Misc fixes and cleanups:
- rcu: Mark lockdep_assert_rcu_helper() __always_inline (Arnd
Bergmann)
- locking/local_lock: Include more missing headers (Peter Zijlstra)
- seqlock: fix scoped_seqlock_read kernel-doc (Randy Dunlap)
- rust: sync: Replace `kernel::c_str!` with C-Strings (Tamir
Duberstein)"
* tag 'locking-core-2026-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits)
locking/rwlock: Fix write_trylock_irqsave() with CONFIG_INLINE_WRITE_TRYLOCK
rcu: Mark lockdep_assert_rcu_helper() __always_inline
compiler-context-analysis: Remove __assume_ctx_lock from initializers
tomoyo: Use scoped init guard
crypto: Use scoped init guard
kcov: Use scoped init guard
compiler-context-analysis: Introduce scoped init guards
cleanup: Make __DEFINE_LOCK_GUARD handle commas in initializers
seqlock: fix scoped_seqlock_read kernel-doc
tools: Update context analysis macros in compiler_types.h
rust: sync: Replace `kernel::c_str!` with C-Strings
rust: sync: Inline various lock related methods
rust: helpers: Move #define __rust_helper out of atomic.c
rust: wait: Add __rust_helper to helpers
rust: time: Add __rust_helper to helpers
rust: task: Add __rust_helper to helpers
rust: sync: Add __rust_helper to helpers
rust: refcount: Add __rust_helper to helpers
rust: rcu: Add __rust_helper to helpers
rust: processor: Add __rust_helper to helpers
...
Add a new helper function crypto_skcipher_tested() which evaluates
the CRYPTO_ALG_TESTED flag from the tfm base cra_flags field.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Enable context analysis for crypto subsystem.
This demonstrates a larger conversion to use Clang's context
analysis. The benefit is additional static checking of locking rules,
along with better documentation.
Note the use of the __acquire_ret macro how to define an API where a
function returns a pointer to an object (struct scomp_scratch) with a
lock held. Additionally, the analysis only resolves aliases where the
analysis unambiguously sees that a variable was not reassigned after
initialization, requiring minor code changes.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251219154418.3592607-36-elver@google.com
API:
- Rewrite memcpy_sglist from scratch.
- Add on-stack AEAD request allocation.
- Fix partial block processing in ahash.
Algorithms:
- Remove ansi_cprng.
- Remove tcrypt tests for poly1305.
- Fix EINPROGRESS processing in authenc.
- Fix double-free in zstd.
Drivers:
- Use drbg ctr helper when reseeding xilinx-trng.
- Add support for PCI device 0x115A to ccp.
- Add support of paes in caam.
- Add support for aes-xts in dthev2.
Others:
- Use likely in rhashtable lookup.
- Fix lockdep false-positive in padata by removing a helper.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmktaHwACgkQxycdCkmx
i6duthAAl4ZjsuSgt0P9ZPJXWgSH+QbNT/6fL1QzLEuzLVGn8Mt99LTQpaYU8HRh
fced8+R7UpqA/FgZTYbRKopZJVJJqhmTf2zqjbe47CroRm2Wf5UO+6ZXBsiqbMwa
6fNLilhcrq5G3DrIHepCpIQ7NM2+ucTMnPRIWP3cvzLwX0JzPtYIpYUSiVPAtkjh
9g24oPz6LR/xZfyk+wPbHOSYeqz4sSXnGJkL+Vn33AtU5KJZLum9zMP4Lleim7HP
XaNnUL/S/PYCspycrvfrnq6+YMLPw2USguttuZe0Dg0qhq/jPMyzdEkTAjcTD5LG
NZavVUbQsf6BW+YjXgaE/ybcSs6WR3ySs8aza1Ev8QqsmpbJj9xdpF9fn4RsffGR
mbhc5plJCKWzfiaparea8yY9n5vHwbOK4zoyF9P6kI5ykkoA+GmwRwTW73M9KCfa
i1R6g97O+t4Yaq9JI9GG7dkm9bxJpY+XaKouW7rqv/MX0iND1ExDYaqdcA+Xa61c
TNfdlVcGyX7Dolm2xnpvRv8EqF9NzeK4Vw1QslrdCijXfe7eJymabNKhLBlV4li0
tVfmh4vyQFgruyiR7r7AkXIKzsLZbji030UoOsQqiMW7ualBUQ0dCDbBa8J6kUcX
/vjbSmxV3LKgVgYvUBRRGIi9CJbKfs29RkS6RFtdqcq/YT4KsJU=
=DHes
-----END PGP SIGNATURE-----
Merge tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Rewrite memcpy_sglist from scratch
- Add on-stack AEAD request allocation
- Fix partial block processing in ahash
Algorithms:
- Remove ansi_cprng
- Remove tcrypt tests for poly1305
- Fix EINPROGRESS processing in authenc
- Fix double-free in zstd
Drivers:
- Use drbg ctr helper when reseeding xilinx-trng
- Add support for PCI device 0x115A to ccp
- Add support of paes in caam
- Add support for aes-xts in dthev2
Others:
- Use likely in rhashtable lookup
- Fix lockdep false-positive in padata by removing a helper"
* tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
crypto: zstd - fix double-free in per-CPU stream cleanup
crypto: ahash - Zero positive err value in ahash_update_finish
crypto: ahash - Fix crypto_ahash_import with partial block data
crypto: lib/mpi - use min() instead of min_t()
crypto: ccp - use min() instead of min_t()
hwrng: core - use min3() instead of nested min_t()
crypto: aesni - ctr_crypt() use min() instead of min_t()
crypto: drbg - Delete unused ctx from struct sdesc
crypto: testmgr - Add missing DES weak and semi-weak key tests
Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist"
crypto: scatterwalk - Fix memcpy_sglist() to always succeed
crypto: iaa - Request to add Kanchana P Sridhar to Maintainers.
crypto: tcrypt - Remove unused poly1305 support
crypto: ansi_cprng - Remove unused ansi_cprng algorithm
crypto: asymmetric_keys - fix uninitialized pointers with free attribute
KEYS: Avoid -Wflex-array-member-not-at-end warning
crypto: ccree - Correctly handle return of sg_nents_for_len
crypto: starfive - Correctly handle return of sg_nents_for_len
crypto: iaa - Fix incorrect return value in save_iaa_wq()
crypto: zstd - Remove unnecessary size_t cast
...
This reverts commit 0f8d42bf12, with the
memcpy_sglist() part dropped.
Now that memcpy_sglist() no longer uses the skcipher_walk code, the
skcipher_walk code can be moved back to where it belongs.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Replace blake2b_generic.c with a new file blake2b.c which implements the
BLAKE2b crypto_shash algorithms on top of the BLAKE2b library API.
Change the driver name suffix from "-generic" to "-lib" to reflect that
these algorithms now just use the (possibly arch-optimized) library.
This closely mirrors crypto/{md5,sha1,sha256,sha512}.c.
Remove include/crypto/internal/blake2b.h since it is no longer used.
Likewise, remove struct blake2b_state from include/crypto/blake2b.h.
Omit support for import_core and export_core, since there are no legacy
drivers that need these for these algorithms.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.
This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer. In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.
Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b. This
way, the BLAKE2s and BLAKE2b code is kept consistent. Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.
The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs. The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Export drbg_ctr_df() derivative function to new module df_sp80090.
Signed-off-by: Harsh Jain <h.jain@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Drivers:
- Add ciphertext hiding support to ccp.
- Add hashjoin, gather and UDMA data move features to hisilicon.
- Add lz4 and lz77_only to hisilicon.
- Add xilinx hwrng driver.
- Add ti driver with ecb/cbc aes support.
- Add ring buffer idle and command queue telemetry for GEN6 in qat.
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable.
- Fix CPU number wraparound in padata.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmjbpJsACgkQxycdCkmx
i6fuTRAAzv5o0MIw4Kc7EEU3zMgFSX0FdcTUPY+eiFrWZrSrvUVW+jYcH9ppO8J7
offAYSZYatcyyU9+u8X22CQNKLdXnKQQ0YymWO35TOpvVxveUM1bqEEV1ZK0xaXD
hlJTLoFIsPaVVhi8CW+ZNhDJBwJHNCv7Yi9TUB6sC7rilWWbJ5LzbEVw3Rtg81Lx
0hcuGX2LrpsHOVVWYxGdJ534Kt2lrkt+8/gWOFg3ap3RVQ39tohEjS2Adm2p8eiX
zIdru/aYd89EcYoxuFyylX2d/OLmMAQpFsADy/Fys26eeOWtqggH62V1LAiSyEqw
vLRBCVKpLhlbNNfnUs0f5nqjjYEUrNk9SA4rgoxITwKoucbWBQMS4zWJTEDKz29n
iBBqHsukGpwVOE6RY8BzR/QNJKhZCSsJpGkagS1v6VPa5P1QomuKftGXKB7JKXKz
xoyk+DhJyA8rkb/E5J9Ni7+Tb08Y4zvJ1dpCQHZMlln3DKkK+kk3gkpoxXMZwBV2
LbEMGTI+sfnAfqkGCJYAZR9gDJ5LQDR9jy/Ds5jvPuVvvjyY5LY/bjETqGPF2QVs
Rz2Sg0RHl7PVZOP6QgbQzkV7SkJrZfyu5iYd0ZfUqZr7BaHLOHJG/E/HlUW3/mXu
OjD+Q5gPhiOdc/qn+32+QERTDCFQdbByv0h7khGQA5vHE3XCu8E=
=knnk
-----END PGP SIGNATURE-----
Merge tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"Drivers:
- Add ciphertext hiding support to ccp
- Add hashjoin, gather and UDMA data move features to hisilicon
- Add lz4 and lz77_only to hisilicon
- Add xilinx hwrng driver
- Add ti driver with ecb/cbc aes support
- Add ring buffer idle and command queue telemetry for GEN6 in qat
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable
- Fix CPU number wraparound in padata"
* tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (78 commits)
dt-bindings: rng: hisi-rng: convert to DT schema
crypto: doc - Add explicit title heading to API docs
hwrng: ks-sa - fix division by zero in ks_sa_rng_init
KEYS: X.509: Fix Basic Constraints CA flag parsing
crypto: anubis - simplify return statement in anubis_mod_init
crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
crypto: hisilicon/qm - clear all VF configurations in the hardware
crypto: hisilicon - enable error reporting again
crypto: hisilicon/qm - mask axi error before memory init
crypto: hisilicon/qm - invalidate queues in use
crypto: qat - Return pointer directly in adf_ctl_alloc_resources
crypto: aspeed - Fix dma_unmap_sg() direction
rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
crypto: comp - Use same definition of context alloc and free ops
crypto: omap - convert from tasklet to BH workqueue
crypto: qat - Replace kzalloc() + copy_from_user() with memdup_user()
crypto: caam - double the entropy delay interval for retry
padata: WQ_PERCPU added to alloc_workqueue users
padata: replace use of system_unbound_wq with system_dfl_wq
crypto: cryptd - WQ_PERCPU added to alloc_workqueue users
...
In commit 42d9f6c774 ("crypto: acomp - Move scomp stream allocation
code into acomp"), the crypto_acomp_streams struct was made to rely on
having the alloc_ctx and free_ctx operations defined in the same order
as the scomp_alg struct. But in that same commit, the alloc_ctx and
free_ctx members of scomp_alg may be randomized by structure layout
randomization, since they are contained in a pure ops structure
(containing only function pointers). If the pointers within scomp_alg
are randomized, but those in crypto_acomp_streams are not, then
the order may no longer match. This fixes the problem by removing the
union from scomp_alg so that both crypto_acomp_streams and scomp_alg
will share the same definition of alloc_ctx and free_ctx, ensuring
they will always have the same layout.
Signed-off-by: Dan Moulding <dan@danm.net>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 42d9f6c774 ("crypto: acomp - Move scomp stream allocation code into acomp")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As was done with the other algorithms, reorganize the BLAKE2s code so
that the generic implementation and the arch-specific "glue" code is
consolidated into a single translation unit, so that the compiler will
inline the functions and automatically decide whether to include the
generic code in the resulting binary or not.
Similarly, also consolidate the build rules into
lib/crypto/{Makefile,Kconfig}. This removes the last uses of
lib/crypto/{arm,x86}/{Makefile,Kconfig}, so remove those too.
Don't keep the !KMSAN dependency. It was needed only for other
algorithms such as ChaCha that initialize memory from assembly code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the Poly1305 code into a single module, similar to various
other algorithms (SHA-1, SHA-256, SHA-512, etc.):
- Each arch now provides a header file lib/crypto/$(SRCARCH)/poly1305.h,
replacing lib/crypto/$(SRCARCH)/poly1305*.c. The header defines
poly1305_block_init(), poly1305_blocks(), poly1305_emit(), and
optionally poly1305_mod_init_arch(). It is included by
lib/crypto/poly1305.c, and thus the code gets built into the single
libpoly1305 module, with improved inlining in some cases.
- Whether arch-optimized Poly1305 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default. (The PPC64 one
remains unconditionally disabled due to 'depends on BROKEN'.)
- Any additional arch-specific translation units for the optimized
Poly1305 code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
A special consideration is needed because the Adiantum code uses the
poly1305_core_*() functions directly. For now, just carry forward that
approach. This means retaining the CRYPTO_LIB_POLY1305_GENERIC kconfig
symbol, and keeping the poly1305_core_*() functions in separate
translation units. So it's not quite as streamlined I've done with the
other hash functions, but we still get a single libpoly1305 module.
Note: to see the diff from the arm, arm64, and x86 .c files to the new
.h files, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
API:
- Allow hash drivers without fallbacks (e.g., hardware key).
Algorithms:
- Add hmac hardware key support (phmac) on s390.
- Re-enable sha384 in FIPS mode.
- Disable sha1 in FIPS mode.
- Convert zstd to acomp.
Drivers:
- Lower priority of qat skcipher and aead.
- Convert aspeed to partial block API.
- Add iMX8QXP support in caam.
- Add rate limiting support for GEN6 devices in qat.
- Enable telemetry for GEN6 devices in qat.
- Implement full backlog mode for hisilicon/sec2.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmiHQXwACgkQxycdCkmx
i6f49A//dQtMg/nvlqForj3BTYKPtjpfZhGxOhda1Y01ts4nFLwM39HtNXGCa6no
e5L/taHdGd4loZoFa0H7Jz8Qn+I8F3YJLE1gDmN1zogzM6hG7KwFpJLy+PrusS3H
IwjUehPKNTK2XWmJCdxpsulmwBD+Y//DG3wpwGlkr+MMvlzoMkesvBSCwmXKh/rh
dn8efrHqL+3LBM6F4nM5zTwcKpLvp7V9arwAE6Zat95WN1X2puEk9L8vYG96hU9/
YmG79E6WIb4UBILJlYdfba+3tK0bZaU3iDHMLQVlAPgM8JvzF9THyFRlLRa586/P
rHo2xrgD1vPlMFXKhNI9p+D65zF/5Z0EKTfn1Z99z1kVzz3L71GOYlAvcAw1S9/j
dRAcfrs/7xEW1SI9j+jVYqZn5g/ClGF8MwEL2VYHzyxN3VPY7ELys4rk6Il29NQp
EVH8VfZS3XmdF1oiH51/ZDT4mfvQjn3v33ssdNpAFsZX2XIBj0d48JtTN/ynDfUB
SPS2pTa5FBJCOpRR/Pbct+eloyrVP4Lcy8/gwlKAEY0ZffBBPmd2wCujQf/SKcUH
e4b6hXAWe0gns/4VSnaker3YdG6o4uPWotZKvIiyKlkKGmJXHuSRK32odRO66+Bg
tlaUYOmRghmxgU9Sc6h9M6vkm5rBLMw4ccykmhGSvvudm9rLh6A=
=E8nj
-----END PGP SIGNATURE-----
Merge tag 'v6.17-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Allow hash drivers without fallbacks (e.g., hardware key)
Algorithms:
- Add hmac hardware key support (phmac) on s390
- Re-enable sha384 in FIPS mode
- Disable sha1 in FIPS mode
- Convert zstd to acomp
Drivers:
- Lower priority of qat skcipher and aead
- Convert aspeed to partial block API
- Add iMX8QXP support in caam
- Add rate limiting support for GEN6 devices in qat
- Enable telemetry for GEN6 devices in qat
- Implement full backlog mode for hisilicon/sec2"
* tag 'v6.17-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
crypto: keembay - Use min() to simplify ocs_create_linked_list_from_sg()
crypto: hisilicon/hpre - fix dma unmap sequence
crypto: qat - make adf_dev_autoreset() static
crypto: ccp - reduce stack usage in ccp_run_aes_gcm_cmd
crypto: qat - refactor ring-related debug functions
crypto: qat - fix seq_file position update in adf_ring_next()
crypto: qat - fix DMA direction for compression on GEN2 devices
crypto: jitter - replace ARRAY_SIZE definition with header include
crypto: engine - remove {prepare,unprepare}_crypt_hardware callbacks
crypto: engine - remove request batching support
crypto: qat - flush misc workqueue during device shutdown
crypto: qat - enable rate limiting feature for GEN6 devices
crypto: qat - add compression slice count for rate limiting
crypto: qat - add get_svc_slice_cnt() in device data structure
crypto: qat - add adf_rl_get_num_svc_aes() in rate limiting
crypto: qat - relocate service related functions
crypto: qat - consolidate service enums
crypto: qat - add decompression service for rate limiting
crypto: qat - validate service in rate limiting sysfs api
crypto: hisilicon/sec2 - implement full backlog mode for sec
...
The {prepare,unprepare}_crypt_hardware callbacks were added back in 2016
by commit 735d37b542 ("crypto: engine - Introduce the block request
crypto engine framework"), but they were never implemented by any driver.
Remove them as they are unused.
Since the 'engine->idling' and 'was_busy' flags are no longer needed,
remove them as well.
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove request batching support from crypto_engine, as there are no
drivers using this feature and it doesn't really work that well.
Instead of doing batching based on backlog, a more optimal approach
would be for the user to handle the batching (similar to how IPsec
can hook into GSO to get 64K of data each time or how block encryption
can use unit sizes much greater than 4K).
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To avoid a crash when control flow integrity is enabled, make the
workspace ("stream") free function use a consistent type, and call it
through a function pointer that has that same type.
Fixes: 42d9f6c774 ("crypto: acomp - Move scomp stream allocation code into acomp")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
replacing lib/crypto/$(SRCARCH)/sha256.c. The header defines
sha256_blocks() and optionally sha256_mod_init_arch(). It is included
by lib/crypto/sha256.c, and thus the code gets built into the single
libsha256 module, with proper inlining and dead code elimination.
- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
lib/crypto/sha256.c. It's now a static function marked with
__maybe_unused, so the compiler automatically eliminates it in any
cases where it's not used.
- Whether arch-optimized SHA-256 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default.
- Any additional arch-specific translation units for the optimized
SHA-256 code (such as assembly files) are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The previous commit made the SHA-256 compression function state be
strongly typed, but it wasn't propagated all the way down to the
implementations of it. Do that now.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of having both sha256_blocks_arch() and sha256_blocks_simd(),
instead have just sha256_blocks_arch() which uses the most efficient
implementation that is available in the calling context.
This is simpler, as it reduces the API surface. It's also safer, since
sha256_blocks_arch() just works in all contexts, including contexts
where the FPU/SIMD/vector registers cannot be used. This doesn't mean
that SHA-256 computations *should* be done in such contexts, but rather
we should just do the right thing instead of corrupting a random task's
registers. Eliminating this footgun and simplifying the code is well
worth the very small performance cost of doing the check.
Note: in the case of arm and arm64, what used to be sha256_blocks_arch()
is renamed back to its original name of sha256_block_data_order().
sha256_blocks_arch() is now used for the higher-level dispatch function.
This renaming also required an update to lib/crypto/arm64/sha512.h,
since sha2-armv8.pl is shared by both SHA-256 and SHA-512.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix a regression where the purgatory code sometimes fails to build.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaF7e+xQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOKwB8AP0eDd9f+Zm/vM9V/4ekdcOWh/m5Lk/g
LmNziU123T7ZGwEA/qUqiM6/eRU1F375XW6EhLtxbNico/4KOf7A0kkxlAc=
=xjmX
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix a regression where the purgatory code sometimes fails to build"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: sha256: Mark sha256_choose_blocks as __always_inline
Add a little inline helper function
crypto_ahash_tested()
to the internal/hash.h header file to retrieve the tested
status (that is the CRYPTO_ALG_TESTED bit in the cra_flags).
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the hash walk functions
crypto_hash_walk_done()
crypto_hash_walk_first()
crypto_hash_walk_last()
public again.
These functions had been removed from the header file
include/crypto/internal/hash.h with commit 7fa4817340
("crypto: ahash - make hash walk functions private to ahash.c")
as there was no crypto algorithm code using them.
With the upcoming crypto implementation for s390 phmac
these functions will be exploited and thus need to be
public within the kernel again.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that drivers that have not been converted to the ahash API
do not use the ahash_request_set_virt fallback path as they cannot
use the software fallback.
Reported-by: Eric Biggers <ebiggers@kernel.org>
Fixes: 9d7a0ab1c7 ("crypto: ahash - Handle partial blocks in API")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the compiler chooses to not inline sha256_choose_blocks() in
the purgatory code, it fails to link against the missing CPU
specific version:
x86_64-linux-ld: arch/x86/purgatory/purgatory.ro: in function `sha256_choose_blocks.part.0':
sha256.c:(.text+0x6a6): undefined reference to `irq_fpu_usable'
sha256.c:(.text+0x6c7): undefined reference to `sha256_blocks_arch'
sha256.c:(.text+0x6cc): undefined reference to `sha256_blocks_simd'
Mark this function as __always_inline to prevent this, same as sha256_finup().
Fixes: 5b90a779bc ("crypto: lib/sha256 - Add helpers for block-based shash")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620191952.1867578-1-arnd@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit 698de82278 ("crypto: testmgr - make it easier to enable the
full set of tests") removed support for building kernels that run only
the "fast" set of crypto self-tests by default. This assumed that
nearly everyone actually wanted the full set of tests, *if* they had
already chosen to enable the tests at all.
Unfortunately, it turns out that both Debian and Fedora intentionally
have the crypto self-tests enabled in their production kernels. And for
production kernels we do need to keep the testing time down, which
implies just running the "fast" tests, not the full set of tests.
For Fedora, a reason for enabling the tests in production is that they
are being (mis)used to meet the FIPS 140-3 pre-operational testing
requirement.
However, the other reason for enabling the tests in production, which
applies to both distros, is that they provide some value in protecting
users from buggy drivers. Unfortunately, the crypto/ subsystem has many
buggy and untested drivers for off-CPU hardware accelerators on rare
platforms. These broken drivers get shipped to users, and there have
been multiple examples of the tests preventing these buggy drivers from
being used. So effectively, the tests are being relied on in production
kernels. I think this is kind of crazy (untested drivers should just
not be enabled at all), but that seems to be how things work currently.
Thus, reintroduce a kconfig option that controls the level of testing.
Call it CRYPTO_SELFTESTS_FULL instead of the original name
CRYPTO_MANAGER_EXTRA_TESTS, which was slightly misleading.
Moreover, given the "production kernel" use case, make CRYPTO_SELFTESTS
depend on EXPERT instead of DEBUG_KERNEL.
I also haven't reinstated all the #ifdefs in crypto/testmgr.c. Instead,
just rely on the compiler to optimize out unused code.
Fixes: 40b9969796 ("crypto: testmgr - replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS")
Fixes: 698de82278 ("crypto: testmgr - make it easier to enable the full set of tests")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add ahash support to hmac so that drivers that can't do hmac in
hardware do not have to implement duplicate copies of hmac.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add export_core and import_core hooks. These are intended to be
used by algorithms which are wrappers around block-only algorithms,
but are not themselves block-only, e.g., hmac.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The core export and import functions are targeted at implementors
so move them into internal/hash.h.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the full set of crypto self-tests requires
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y. This is problematic in two ways.
First, developers regularly overlook this option. Second, the
description of the tests as "extra" sometimes gives the impression that
it is not required that all algorithms pass these tests.
Given that the main use case for the crypto self-tests is for
developers, make enabling CONFIG_CRYPTO_SELFTESTS=y just enable the full
set of crypto self-tests by default.
The slow tests can still be disabled by adding the command-line
parameter cryptomgr.noextratests=1, soon to be renamed to
cryptomgr.noslowtests=1. The only known use case for doing this is for
people trying to use the crypto self-tests to satisfy the FIPS 140-3
pre-operational self-testing requirements when the kernel is being
validated as a FIPS 140-3 cryptographic module.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For copying data between two scatterlists, just use memcpy_sglist()
instead of the so-called "null skcipher". This is much simpler.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the BLOCK_HASH_UPDATE_BLOCKS helper instead of duplicating
partial block handling.
Also remove the unused lib/sha256 force-generic interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an internal sha256_finup helper and move the finalisation code
from __sha256_final into it.
Also add sha256_choose_blocks and CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD
so that the Crypto API can use the SIMD block function unconditionally.
The Crypto API must not be used in hard IRQs and there is no reason
to have a fallback path for hardirqs.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As chaining has been removed, all that remains of REQ_CHAIN is
just virtual address support. Rename it before the reintroduction
of batching creates confusion.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As has been done for various other algorithms, rework the design of the
SHA-256 library to support arch-optimized implementations, and make
crypto/sha256.c expose both generic and arch-optimized shash algorithms
that wrap the library functions.
This allows users of the SHA-256 library functions to take advantage of
the arch-optimized code, and this makes it much simpler to integrate
SHA-256 for each architecture.
Note that sha256_base.h is not used in the new design. It will be
removed once all the architecture-specific code has been updated.
Move the generic block function into its own module to avoid a circular
dependency from libsha256.ko => sha256-$ARCH.ko => libsha256.ko.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add export and import functions to maintain existing export format.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a block-only interface for poly1305. Implement the generic
code first.
Also use the generic partial block helper.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Extract the common partial block handling into a helper macro
that can be reused by other library code.
Also delete the unused sha256_base_do_finalize function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the generic part of skcipher walk into scatterwalk, and use
it to implement memcpy_sglist.
This makes memcpy_sglist do the right thing when two distinct SG
lists contain identical subsets (e.g., the AD part of AEAD).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper to initialise crypto stack requests and use it for
ahash and acomp. Make sure that the flags field is initialised
fully in the helper to silence false-positive warnings from the
compiler.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504250751.mdy28Ibr-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provide an option to handle the partial blocks in the shash API.
Almost every hash algorithm has a block size and are only able
to hash partial blocks on finalisation.
Rather than duplicating the partial block handling many times,
add this functionality to the shash API.
It is optional (e.g., hmac would never need this by relying on
the partial block handling of the underlying hash), and to enable
it set the bit CRYPTO_AHASH_ALG_BLOCK_ONLY.
The export format is always that of the underlying hash export,
plus the partial block buffer, followed by a single-byte for the
partial block length.
Set the bit CRYPTO_AHASH_ALG_FINAL_NONZERO to withhold an extra
byte in the partial block. This will come in handy when this
is extended to ahash where hardware often can't deal with a
zero-length final.
It will also be used for algorithms requiring an extra block for
finalisation (e.g., cmac).
As an optimisation, set the bit CRYPTO_AHASH_ALG_FINUP_MAX if
the algorithm wishes to get as much data as possible instead of
just the last partial block.
The descriptor will be zeroed after finalisation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Realign struct crypto_engine to reduce its size by 8 bytes. Total size
is now 192 bytes, allowing it to fit within 3 cachelines instead of 4.
pahole output before:
/* size: 200, cachelines: 4, members: 17 */
/* sum members: 183, holes: 3, sum holes: 17 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 8 bytes */
and after:
/* size: 192, cachelines: 3, members: 17 */
/* sum members: 183, holes: 2, sum holes: 9 */
/* paddings: 1, sum paddings: 4 */
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an atomic flag to the acomp walk and use that in deflate.
Due to the use of a per-cpu context, it is impossible to sleep
during the walk in deflate.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202504151654.4c3b6393-lkp@intel.com
Fixes: 08cabc7d3c ("crypto: deflate - Convert to acomp")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Allow any ahash to be used with a stack request, with optional
dynamic allocation when async is needed. The intended usage is:
HASH_REQUEST_ON_STACK(req, tfm);
...
err = crypto_ahash_digest(req);
/* The request cannot complete synchronously. */
if (err == -EAGAIN) {
/* This will not fail. */
req = HASH_REQUEST_CLONE(req, gfp);
/* Redo operation. */
err = crypto_ahash_digest(req);
}
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>