Commit Graph

12210 Commits (d78ddeb8938a366aabfabf60255c1a94de8d8ea1)

Author SHA1 Message Date
Avraham Stern 86c6b6e4d1 wifi: nl80211/cfg80211: add new FTM capabilities
Add new capabilities to the PMSR FTM capabilities list. The new
capabilities include 6 GHz support, supported number of spatial streams
and supported number of LTF repetitions.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Tested-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111190221.bf43785c18f6.Ic98cf9790ddee84bf88e5720b93c46c23af3c96c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27 13:40:25 +01:00
Jinjiang Tu 3d702678f5 mm/mempolicy: fix mpol_rebind_nodemask() for MPOL_F_NUMA_BALANCING
commit bda420b985 ("numa balancing: migrate on fault among multiple
bound nodes") adds new flag MPOL_F_NUMA_BALANCING to enable NUMA balancing
for MPOL_BIND memory policy.

When the cpuset of tasks changes, the mempolicy of the task is rebound by
mpol_rebind_nodemask().  When MPOL_F_STATIC_NODES and
MPOL_F_RELATIVE_NODES are both not set, the behaviour of rebinding should
be same whenever MPOL_F_NUMA_BALANCING is set or not.  So, when an
application calls set_mempolicy() with MPOL_F_NUMA_BALANCING set but both
MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES cleared,
mempolicy.w.cpuset_mems_allowed should be set to
cpuset_current_mems_allowed nodemask.  However, in current implementation,
mpol_store_user_nodemask() wrongly returns true, causing
mempolicy->w.user_nodemask to be incorrectly set to the user-specified
nodemask.  Later, when the cpuset of the application changes,
mpol_rebind_nodemask() ends up rebinding based on the user-specified
nodemask rather than the cpuset_mems_allowed nodemask as intended.

I can reproduce with the following steps in qemu with 4 NUMA nodes:
1. echo '+cpuset' > /sys/fs/cgroup/cgroup.subtree_control
2. mkdir /sys/fs/cgroup/test
3. ./reproducer &
4. cat /proc/$pid/numa_maps, the task is bound to NUMA 1
5. echo $pid > /sys/fs/cgroup/test/cgroup.procs
6. cat /proc/$pid/numa_maps, the task is bound to NUMA 0 now.

The reproducer code:

int main()
{
        struct bitmask *bmp;
        int ret;

        bmp = numa_parse_nodestring("1");
        ret = set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
                bmp->maskp, bmp->size + 1);
        if (ret < 0) {
                perror("Failed to call set_mempolicy");
                exit(-1);
        }

        while (1);
        return 0;
}

If I call set_mempolicy() without MPOL_F_NUMA_BALANCING in the reproducer
code.  After step 5, the task is still bound to NUMA 1.

To fix this, only set mempolicy->w.user_nodemask to the user-specified
nodemask if MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES is present.

Link: https://lkml.kernel.org/r/20260120011018.1256654-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20251223110523.1161421-1-tujinjiang@huawei.com
Fixes: bda420b985 ("numa balancing: migrate on fault among multiple bound nodes")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Mel Gorman <mgorman <mgorman@suse.de>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26 20:02:32 -08:00
Thomas Weißschuh 1965bbb8f3 ipc/shm: uapi: remove dependency on libc
Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.  shm.h does not even use
any symbols from the libc header as the usage of getpagesize() was removed
a decade ago in commit 060028bac9 ("ipc/shm.c: increase the defaults for
SHMALL, SHMMAX")

Drop the unnecessary inclusion.

Link: https://lkml.kernel.org/r/20251222-uapi-shm-v1-1-270bb7f75d97@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26 19:07:10 -08:00
Chuck Lever 0ac903d1bf NFS: NFSERR_INVAL is not defined by NFSv2
A documenting comment in include/uapi/linux/nfs.h claims incorrectly
that NFSv2 defines NFSERR_INVAL. There is no such definition in either
RFC 1094 or https://pubs.opengroup.org/onlinepubs/9629799/chap7.htm

NFS3ERR_INVAL is introduced in RFC 1813.

NFSD returns NFSERR_INVAL for PROC_GETACL, which has no
specification (yet).

However, nfsd_map_status() maps nfserr_symlink and nfserr_wrong_type
to nfserr_inval, which does not align with RFC 1094. This logic was
introduced only recently by commit 438f81e0e9 ("nfsd: move error
choice for incorrect object types to version-specific code."). Given
that we have no INVAL or SERVERFAULT status in NFSv2, probably the
only choice is NFSERR_IO.

Fixes: 438f81e0e9 ("nfsd: move error choice for incorrect object types to version-specific code.")
Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-01-26 10:10:58 -05:00
Greg Kroah-Hartman dbd91d4f55 Merge 6.19-rc7 into char-misc-next
We need the char/misc/iio fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-26 12:04:04 +01:00
Linus Torvalds 0a6dce0a5c Char/Misc/IIO driver fixes for 6.19-rc7
Here are some small char/misc/iio and some other minor driver subsystem
 fixes for 6.19-rc7.  Nothing huge here, just some fixes for reported
 issues including:
   - lots of little iio driver fixes
   - comedi driver fixes
   - mux driver fix
   - w1 driver fixes
   - uio driver fix
   - slimbus driver fixes
   - hwtracing bugfix
   - other tiny bugfixes
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaXY0dA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylkhgCgvwYNIuJosuQRKK0sFTOR1Utig3sAn2g6E2H9
 AOZZ43qoosl++HsuXDLP
 =XzfC
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc/iio driver fixes from Greg KH:
 "Here are some small char/misc/iio and some other minor driver
  subsystem fixes for 6.19-rc7. Nothing huge here, just some fixes for
  reported issues including:

   - lots of little iio driver fixes

   - comedi driver fixes

   - mux driver fix

   - w1 driver fixes

   - uio driver fix

   - slimbus driver fixes

   - hwtracing bugfix

   - other tiny bugfixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (36 commits)
  comedi: dmm32at: serialize use of paged registers
  mei: trace: treat reg parameter as string
  uio: pci_sva: correct '-ENODEV' check logic
  uacce: ensure safe queue release with state management
  uacce: implement mremap in uacce_vm_ops to return -EPERM
  uacce: fix isolate sysfs check condition
  uacce: fix cdev handling in the cleanup path
  slimbus: core: clean up of_slim_get_device()
  slimbus: core: fix of_slim_get_device() kernel doc
  slimbus: core: amend slim_get_device() kernel doc
  slimbus: core: fix device reference leak on report present
  slimbus: core: fix runtime PM imbalance on report present
  slimbus: core: fix OF node leak on registration failure
  intel_th: rename error label
  intel_th: fix device leak on output open()
  comedi: Fix getting range information for subdevices 16 to 255
  mux: mmio: Fix IS_ERR() vs NULL check in probe()
  interconnect: debugfs: initialize src_node and dst_node to empty strings
  iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source
  iio: accel: iis328dq: fix gain values
  ...
2026-01-25 09:57:31 -08:00
Menglong Dong 2d419c4465 bpf: add fsession support
The fsession is something that similar to kprobe session. It allow to
attach a single BPF program to both the entry and the exit of the target
functions.

Introduce the struct bpf_fsession_link, which allows to add the link to
both the fentry and fexit progs_hlist of the trampoline.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260124062008.8657-2-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-24 18:49:35 -08:00
Pavel Begunkov 795663b4d1 io_uring/zcrx: implement large rx buffer support
There are network cards that support receive buffers larger than 4K, and
that can be vastly beneficial for performance, and benchmarks for this
patch showed up to 30% CPU util improvement for 32K vs 4K buffers.

Allows zcrx users to specify the size in struct
io_uring_zcrx_ifq_reg::rx_buf_len. If set to zero, zcrx will use a
default value. zcrx will check and fail if the memory backing the area
can't be split into physically contiguous chunks of the required size.
It's more restrictive as it only needs dma addresses to be contig, but
that's beyond this series.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: kill duplicate netdev_queues.h include]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-24 08:33:03 -07:00
Linus Torvalds 00d20db21e block-6.19-20260122
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmly5rUQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpv30EACNx0n1MlRvrIHvqIH+gq2NmpweLhJphars
 3e4lyDD5bBUuQ1V2iRc5k7MiKPEtQJpF83m1WVIvycNjjx2qWLBsZyCc9PCNoZIn
 nBe1/qCuKOs8kmrNkTXLKjr1uFVp+0Nm5jWSjeIgNI1zd7IDe6fpkjyH6JrnKsw8
 DD7aCYE4jHGJH8q9Ks3rOu3M2syiFwkqmOD12Xgqiz29fP4qJJDvC/y4yOMDWbZY
 G7ZKFHZK5QS8tmjRUhgJacVpJlN3NCL6lZ16f0fRTJoIlWzOAaBafeJTAjiS3S01
 TS8sadfTLrVZocFYSKZjixVgUocHqE5QeKBrDe09N0A5XsWz/YJkcNisNw0oRcvb
 BKCrO1TCHiPzliPCobPcaFZII4z2HeiIfVGSTaJAzSM4ZCt/EH9BFTKrgohGRZi4
 v+bPNdJaCLaFOipgsUeA63NZpt9xN+PsfmzFGC1PvF50gQMM+JeFpqjNh37XjCIt
 6QtK9rzLzC9GvdFXV3o1CRbIBAli3UEgjyLeThhsCu2HPnab/yrrf9AwF5udZQlJ
 wK7S8kB3zl25nyM3jwKzzaVLqi+aUWr1Jn+D+DA2BwwPg3m8h7yIrysJryF5Leon
 5VAYHCOg+MhPuvhNFjqkgahjreJqV/3Qou+sbGcLur4PXdKXeW8SY5wrDlplumFb
 +BZSUZ06Kw==
 =RK+t
 -----END PGP SIGNATURE-----

Merge tag 'block-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - A set of selftest fixes for ublk

 - Fix for a pid mismatch in ublk, comparing PIDs in different
   namespaces if run inside a namespace

 - Fix for a regression added in this release with polling, where the
   nvme tcp connect code would spin forever

 - Zoned device error path fix

 - Tweak the blkzoned uapi additions from this kernel release, making
   them more easily discoverable

 - Fix for a regression in bcache with bio endio handling added in this
   release

* tag 'block-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  bcache: use bio cloning for detached device requests
  blk-mq: use BLK_POLL_ONESHOT for synchronous poll completion
  selftests/ublk: fix garbage output in foreground mode
  selftests/ublk: fix error handling for starting device
  selftests/ublk: fix IO thread idle check
  block: make the new blkzoned UAPI constants discoverable
  ublk: fix ublksrv pid handling for pid namespaces
  block: Fix an error path in disk_update_zone_resources()
2026-01-23 12:53:56 -08:00
Paolo Abeni ba1b8c97b9 geneve: add netlink support for GRO hint
Allow configuring and dumping the new device option, and cache its value
into the geneve socket itself.
The new option is not tie to it any code yet.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/2295d4e4d1e919a3189425141bbc71c7850a2de0.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 11:31:14 -08:00
Michael Roth fa9893fadb KVM: Introduce KVM_EXIT_SNP_REQ_CERTS for SNP certificate-fetching
For SEV-SNP, the host can optionally provide a certificate table to the
guest when it issues an attestation request to firmware (see GHCB 2.0
specification regarding "SNP Extended Guest Requests"). This certificate
table can then be used to verify the endorsement key used by firmware to
sign the attestation report.

While it is possible for guests to obtain the certificates through other
means, handling it via the host provides more flexibility in being able
to keep the certificate data in sync with the endorsement key throughout
host-side operations that might resulting in the endorsement key
changing.

In the case of KVM, userspace will be responsible for fetching the
certificate table and keeping it in sync with any modifications to the
endorsement key by other userspace management tools. Define a new
KVM_EXIT_SNP_REQ_CERTS event where userspace is provided with the GPA of
the buffer the guest has provided as part of the attestation request so
that userspace can write the certificate data into it while relying on
filesystem-based locking to keep the certificates up-to-date relative to
the endorsement keys installed/utilized by firmware at the time the
certificates are fetched.

[Melody: Update the documentation scheme about how file locking is
         expected to happen.]

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Melody Wang <huibo.wang@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Link: https://patch.msgid.link/20260109231732.1160759-2-michael.roth@amd.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23 09:14:15 -08:00
Marc Zyngier 31c70b971a Merge branch arm64/for-next/cpufeature into kvmarm-master/next
Merge arm64/for-next/cpufeature in to resolve conflicts resulting from
the removal of CONFIG_PAN.

* arm64/for-next/cpufeature:
  arm64: Add support for FEAT_{LS64, LS64_V}
  KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
  arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
  KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
  KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
  KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
  arm64: Unconditionally enable PAN support
  arm64: Unconditionally enable LSE support
  arm64: Add support for TSV110 Spectre-BHB mitigation

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23 09:52:03 +00:00
Jakub Kicinski 9abf22075d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc7).

Conflicts:

drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
  b35a6fd37a ("hinic3: Add adaptive IRQ coalescing with DIM")
  fb2bb2a1eb ("hinic3: Fix netif_queue_set_napi queue_index input parameter error")
https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com

drivers/net/wireless/ath/ath12k/mac.c
drivers/net/wireless/ath/ath12k/wifi7/hw.c
  3170757210 ("wifi: ath12k: Fix wrong P2P device link id issue")
  c26f294fef ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module")
https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au

Adjacent changes:

drivers/net/wireless/ath/ath12k/mac.c
  8b8d6ee53d ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel")
  914c890d3b ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-22 20:14:36 -08:00
Ming Lei e2723e6ce6 ublk: add new feature UBLK_F_BATCH_IO
Add new feature UBLK_F_BATCH_IO which replaces the following two
per-io commands:

	- UBLK_U_IO_FETCH_REQ

	- UBLK_U_IO_COMMIT_AND_FETCH_REQ

with three per-queue batch io uring_cmd:

	- UBLK_U_IO_PREP_IO_CMDS

	- UBLK_U_IO_COMMIT_IO_CMDS

	- UBLK_U_IO_FETCH_IO_CMDS

Then ublk can deliver batch io commands to ublk server in single
multishort uring_cmd, also allows to prepare & commit multiple
commands in batch style via single uring_cmd, communication cost is
reduced a lot.

This feature also doesn't limit task context any more for all supported
commands, so any allowed uring_cmd can be issued in any task context.
ublk server implementation becomes much easier.

Meantime load balance becomes much easier to support with this feature.
The command `UBLK_U_IO_FETCH_IO_CMDS` can be issued from multiple task
contexts, so each task can adjust this command's buffer length or number
of inflight commands for controlling how much load is handled by current
task.

Later, priority parameter will be added to command `UBLK_U_IO_FETCH_IO_CMDS`
for improving load balance support.

UBLK_U_IO_NEED_GET_DATA isn't supported in batch io yet, but it may be
enabled in future via its batch pair.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 20:05:40 -07:00
Ming Lei a4d8837553 ublk: add UBLK_U_IO_FETCH_IO_CMDS for batch I/O processing
Add UBLK_U_IO_FETCH_IO_CMDS command to enable efficient batch processing
of I/O requests. This multishot uring_cmd allows the ublk server to fetch
multiple I/O commands in a single operation, significantly reducing
submission overhead compared to individual FETCH_REQ* commands.

Key Design Features:

1. Multishot Operation: One UBLK_U_IO_FETCH_IO_CMDS can fetch many I/O
   commands, with the batch size limited by the provided buffer length.

2. Dynamic Load Balancing: Multiple fetch commands can be submitted
   simultaneously, but only one is active at any time. This enables
   efficient load distribution across multiple server task contexts.

3. Implicit State Management: The implementation uses three key variables
   to track state:
   - evts_fifo: Queue of request tags awaiting processing
   - fcmd_head: List of available fetch commands
   - active_fcmd: Currently active fetch command (NULL = none active)

   States are derived implicitly:
   - IDLE: No fetch commands available
   - READY: Fetch commands available, none active
   - ACTIVE: One fetch command processing events

4. Lockless Reader Optimization: The active fetch command can read from
   evts_fifo without locking (single reader guarantee), while writers
   (ublk_queue_rq/ublk_queue_rqs) use evts_lock protection. The memory
   barrier pairing plays key role for the single lockless reader
   optimization.

Implementation Details:

- ublk_queue_rq() and ublk_queue_rqs() save request tags to evts_fifo
- __ublk_acquire_fcmd() selects an available fetch command when
  events arrive and no command is currently active
- ublk_batch_dispatch() moves tags from evts_fifo to the fetch command's
  buffer and posts completion via io_uring_mshot_cmd_post_cqe()
- State transitions are coordinated via evts_lock to maintain consistency

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 20:05:40 -07:00
Ming Lei 1e500e106d ublk: handle UBLK_U_IO_COMMIT_IO_CMDS
Handle UBLK_U_IO_COMMIT_IO_CMDS by walking the uring_cmd fixed buffer:

- read each element into one temp buffer in batch style

- parse and apply each element for committing io result

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 20:05:40 -07:00
Ming Lei b256795b36 ublk: handle UBLK_U_IO_PREP_IO_CMDS
This commit implements the handling of the UBLK_U_IO_PREP_IO_CMDS command,
which allows userspace to prepare a batch of I/O requests.

The core of this change is the `ublk_walk_cmd_buf` function, which iterates
over the elements in the uring_cmd fixed buffer. For each element, it parses
the I/O details, finds the corresponding `ublk_io` structure, and prepares it
for future dispatch.

Add per-io lock for protecting concurrent delivery and committing.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 20:05:40 -07:00
Ming Lei e86f89ab24 ublk: add new batch command UBLK_U_IO_PREP_IO_CMDS & UBLK_U_IO_COMMIT_IO_CMDS
Add new command UBLK_U_IO_PREP_IO_CMDS, which is the batch version of
UBLK_IO_FETCH_REQ.

Add new command UBLK_U_IO_COMMIT_IO_CMDS, which is for committing io command
result only, still the batch version.

The new command header type is `struct ublk_batch_io`.

This patch doesn't actually implement these commands yet, just validates the
SQE fields.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 20:05:40 -07:00
Pavel Begunkov 5247c034a6 io_uring: introduce non-circular SQ
Outside of SQPOLL, normally SQ entries are consumed by the time the
submission syscall returns. For those cases we don't need a circular
buffer and the head/tail tracking, instead the kernel can assume that
entries always start from the beginning of the SQ at index 0. This patch
introduces a setup flag doing exactly that. It's a simpler and helps
to keeps SQEs hot in cache.

The feature is optional and enabled by setting IORING_SETUP_SQ_REWIND.
The flag is rejected if passed together with SQPOLL as it'd require
waiting for SQ before each submission. It also requires
IORING_SETUP_NO_SQARRAY, which can be supported but it's unlikely there
will be users, so leave more space for future optimisations.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-22 15:47:23 -07:00
Terry Bowman 7c29ba0221 PCI: Introduce pcie_is_cxl()
CXL is a protocol that runs on top of PCIe electricals. Its error model
also runs on top of the PCIe AER error model by standardizing "internal"
errors as "CXL" errors. Linux has historically ignored internal errors.

CXL protocol error handling is then a task of enhancing the PCIe AER
core to understand that PCIe ports (upstream and downstream) and
endpoints may throw internal errors that represent standard CXL protocol
errors.

The proposed method to make that determination is to teach 'struct
pci_dev' to cache when its link has trained the CXL.mem and/or CXL.cache
protocols and then treat all internal errors as CXL errors. A design
goal is to not burden the PCIe AER core with CXL knowledge beyond just
enough to forward error notifications to the CXL RAS core. The forwarded
notification looks up a 'struct cxl_port' or 'struct cxl_dport'
companion device to the PCI device.

Introduce set_pcie_cxl() with logic checking for CXL.mem or CXL.cache
status in the CXL Flex Bus DVSEC status register. The CXL Flex Bus DVSEC
presence is used because it is required for all the CXL PCIe devices.[1]

[1] CXL 3.1 Spec, 8.1.1 PCIe Designated Vendor-Specific Extended
    Capability (DVSEC) ID Assignment, Table 8-2

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-4-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:55:27 -07:00
Terry Bowman 6612bd9ff0 PCI: Update CXL DVSEC definitions
CXL DVSEC definitions were recently moved into uapi/pci_regs.h, but the
newly added macros do not follow the file's existing naming conventions.
The current format uses CXL_DVSEC_XYZ, while the new CXL entries must
instead use the PCI_DVSEC_CXL_XYZ prefix to match the conventions already
established in pci_regs.h.

The new CXL DVSEC macros also introduce _MASK and _OFFSET suffixes, which
are not used anywhere else in the file. These suffixes lengthen the
identifiers and reduce readability. Remove _MASK and _OFFSET from the
recently added definitions.

Additionally, remove PCI_DVSEC_HEADER1_LENGTH, as it duplicates the existing
PCI_DVSEC_HEADER1_LEN() macro.

Update all existing references to use the new macro names.

Finally, update the inline documentation to reference the latest revision
of the CXL specification.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-3-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:52:23 -07:00
Terry Bowman 0f7afd80d8 PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h
The CXL DVSECs are currently defined in cxl/core/cxlpci.h. These are not
accessible to other subsystems. Move these to uapi/linux/pci_regs.h.

The CXL DVSEC definitions will be renamed and reformatted to fit better
with existing defines.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-2-terry.bowman@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:50:16 -07:00
Linus Torvalds 0a80e38d0f Including fixes from CAN and wireless.
Pretty big PR, but hard to make up any cohesive story that would
 explain it, a random collection of fixes. The two reverts of bad
 patches from this release here feel like stuff that'd normally
 show up by rc5 or rc6. Perhaps obvious thing to say, given the MW
 timing.
 
 That said, no active investigations / regressions. Let's see what
 the next week brings.
 
 Current release - fix to a fix:
 
  - can: alloc_candev_mqs(): add missing default CAN capabilities
 
 Current release - regressions:
 
  - usbnet: fix crash due to missing BQL accounting after resume
 
  - Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not ...
 
 Previous releases - regressions:
 
  - Revert "nfc/nci: Add the inconsistency check between the input ...
 
 Previous releases - always broken:
 
  - number of driver fixes for incorrect use of seqlocks on stats
 
  - rxrpc: fix recvmsg() unconditional requeue, don't corrupt rcv queue
    when MSG_PEEK was set
 
  - ipvlan: make the addrs_lock be per port avoid races in the port
    hash table
 
  - sched: enforce that teql can only be used as root qdisc
 
  - virtio: coalesce only linear skb
 
  - wifi: ath12k: fix dead lock while flushing management frames
 
  - eth: igc: reduce TSN TX packet buffer from 7KB to 5KB per queue
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmlyVNQACgkQMUZtbf5S
 IrvoMQ/+N+KikpNLtHFpkqKAsTnNC6gR9yG+NYvUck1Oo+CQQpc/wJ78pS2cyPpM
 siMa7lLzRuCKPYni3RVpG9nqG/jv9HKvFXpgpSb71Evs6syO85kI8Dy4z8G4mClk
 A4QFH6/8V4uZpH/ewy9DTIjxdWPEJA6CuXXnmolTEBhrNkgskv0Sz3jFVTxDjFzN
 xTMk6H2PhgANUsYFree9PQ+gR+QAHKHPvjobWCsepYRHlT4YVi8PrmqoYBvycLwa
 Q//xy6PUOSXZM4sa6YSUMZk2Nou/84BvqouQ05ljaLSZDl4Ecfxl5MDkh/kzZHjR
 wPbMhiZUrJngwwyXQWcSnir1M17IPKjC/FAEDFUFVWDT/wYYHODt9npCRpdglsa5
 SGoy0yDgUN5Lq9Q5dPL0PMLHVLkuDdlzmyva4uso/dQYovXgTi/5Kx2bPc6LLqMJ
 CP5QxjN5OsOHKnGdI/UO28OU5Yn2KRCmJiW9nW3XdxR316lkmo6BWHAxWG9XhbJE
 aCBXu3EoqNSDVuWaZFVQ8g2uTN3uOhnuJqBllhahtENbdqjsat0cxcjYTuUDvuMm
 B1W0My5qq7O6TRjlFl6JfC3k2gqbZ+HlLg3LfmC74bmddOL52up+9HKyeRoWiIaA
 9BQ3DKpjD3VFEbsbBVPfoZ4Ch4jrB+G+ck8f7g/l9BCNM+Ji1Ds=
 =Ehzr
 -----END PGP SIGNATURE-----

Merge tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from CAN and wireless.

  Pretty big, but hard to make up any cohesive story that would explain
  it, a random collection of fixes. The two reverts of bad patches from
  this release here feel like stuff that'd normally show up by rc5 or
  rc6. Perhaps obvious thing to say, given the holiday timing.

  That said, no active investigations / regressions. Let's see what the
  next week brings.

  Current release - fix to a fix:

   - can: alloc_candev_mqs(): add missing default CAN capabilities

  Current release - regressions:

   - usbnet: fix crash due to missing BQL accounting after resume

   - Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not ...

  Previous releases - regressions:

   - Revert "nfc/nci: Add the inconsistency check between the input ...

  Previous releases - always broken:

   - number of driver fixes for incorrect use of seqlocks on stats

   - rxrpc: fix recvmsg() unconditional requeue, don't corrupt rcv queue
     when MSG_PEEK was set

   - ipvlan: make the addrs_lock be per port avoid races in the port
     hash table

   - sched: enforce that teql can only be used as root qdisc

   - virtio: coalesce only linear skb

   - wifi: ath12k: fix dead lock while flushing management frames

   - eth: igc: reduce TSN TX packet buffer from 7KB to 5KB per queue"

* tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
  Octeontx2-af: Add proper checks for fwdata
  dpll: Prevent duplicate registrations
  net/sched: act_ife: avoid possible NULL deref
  hinic3: Fix netif_queue_set_napi queue_index input parameter error
  vsock/test: add stream TX credit bounds test
  vsock/virtio: cap TX credit to local buffer size
  vsock/test: fix seqpacket message bounds test
  vsock/virtio: fix potential underflow in virtio_transport_get_credit()
  net: fec: account for VLAN header in frame length calculations
  net: openvswitch: fix data race in ovs_vport_get_upcall_stats
  octeontx2-af: Fix error handling
  net: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X
  rxrpc: Fix data-race warning and potential load/store tearing
  net: dsa: fix off-by-one in maximum bridge ID determination
  net: bcmasp: Fix network filter wake for asp-3.0
  bonding: provide a net pointer to __skb_flow_dissect()
  selftests: net: amt: wait longer for connection before sending packets
  be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
  Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warning"
  netrom: fix double-free in nr_route_frame()
  ...
2026-01-22 09:32:11 -08:00
Jakub Kicinski 9146fe2829 Another set of updates:
- various small fixes for ath10k/ath12k/mwifiex/rsi
  - cfg80211 fix for HE bitrate overflow
  - mac80211 fixes
    - S1G beacon handling in scan
    - skb tailroom handling for HW encryption
    - CSA fix for multi-link
    - handling of disabled links during association
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmlyArkACgkQ10qiO8sP
 aACZxA/+N/q+DAHhVgqETwqOh80WAFTSuhDZsUXc6PNtFkOHvuZaeHePU+fn8hco
 +CkUVEnWYoNgUiaVg1697PJubp0psN7H+3+cq8kn8C/oNB2YnEV2kPCk4x7R8LCj
 TP6mad/Fb+I6Ct7XaCFymCS49eP4Bju7UBgYgLTiKJYvabA+Jim7LavZr8j0Tvra
 h9TNeA0I8+dgGppAWLTssrnsxp65xfSdq71mtRCFUrpEUHjzCl589PEv6BYcIRwv
 N50pm6Am5KJ1TZn5sVSYVfKiiG7UtL/pbXbsM5Cj/54yIFIgmE3bGI5MGAXRlWNG
 o/d/bo0rJg1xppipyZDEN5OJS6S0ijyC5TNKFFRX6IU2eZ8jJs7CWrQw6L8hWCbY
 G+lnVSh4yPAzLEk80S/zBHQccAPXnONtm+cFyPsPab79oAboxQVauuDdH1t5cxbQ
 1HRn0RopyfEoPLmxsCcCSVcdF+hDwRZxAUO735Opnz/amDJNjBKgXTKezXSubfPH
 5hvoAs/VZh7xSyJmEViDO5gavW1SX4nKlUixLYLUrXyq4i+eZkEIvqlCkIWuN9EW
 y/leooWqFzoXY3K8QL8nKQyHWg10WJL1l56tVz+Y+YAh8TvqgWWTB/O/u3g+/ZqO
 eDkaeKbbexUy6iyN0/TMb2RNnTERO0xOepMmIu3sw0HXhptkl1Q=
 =njuT
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2026-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Another set of updates:
 - various small fixes for ath10k/ath12k/mwifiex/rsi
 - cfg80211 fix for HE bitrate overflow
 - mac80211 fixes
   - S1G beacon handling in scan
   - skb tailroom handling for HW encryption
   - CSA fix for multi-link
   - handling of disabled links during association

* tag 'wireless-2026-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: cfg80211: ignore link disabled flag from userspace
  wifi: mac80211: apply advertised TTLM from association response
  wifi: mac80211: parse all TTLM entries
  wifi: mac80211: don't increment crypto_tx_tailroom_needed_cnt twice
  wifi: mac80211: don't perform DA check on S1G beacon
  wifi: ath12k: Fix wrong P2P device link id issue
  wifi: ath12k: fix dead lock while flushing management frames
  wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel
  wifi: ath12k: cancel scan only on active scan vdev
  wifi: mwifiex: Fix a loop in mwifiex_update_ampdu_rxwinsize()
  wifi: mac80211: correctly check if CSA is active
  wifi: cfg80211: Fix bitrate calculation overflow for HE rates
  wifi: rsi: Fix memory corruption due to not set vif driver data size
  wifi: ath12k: don't force radio frequency check in freq_to_idx()
  wifi: ath12k: fix dma_free_coherent() pointer
  wifi: ath10k: fix dma_free_coherent() pointer
====================

Link: https://patch.msgid.link/20260122110248.15450-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-22 07:54:31 -08:00
Marc Zyngier f174a9ffcd KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.

However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.

Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.

A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-01-22 13:24:49 +00:00
Peter Zijlstra d6200245c7 rseq: Allow registering RSEQ with slice extension
Since glibc cares about the number of syscalls required to initialize a new
thread, allow initializing rseq with slice extension on. This avoids having to
do another prctl().

Requested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260121143207.814193010@infradead.org
2026-01-22 11:11:19 +01:00
Thomas Gleixner 28621ec2d4 rseq: Add prctl() to enable time slice extensions
Implement a prctl() so that tasks can enable the time slice extension
mechanism. This fails, when time slice extensions are disabled at compile
time or on the kernel command line and when no rseq pointer is registered
in the kernel.

That allows to implement a single trivial check in the exit to user mode
hotpath, to decide whether the whole mechanism needs to be invoked.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.858717691@linutronix.de
2026-01-22 11:11:17 +01:00
Thomas Gleixner d7a5da7a0f rseq: Add fields and constants for time slice extension
Aside of a Kconfig knob add the following items:

   - Two flag bits for the rseq user space ABI, which allow user space to
     query the availability and enablement without a syscall.

   - A new member to the user space ABI struct rseq, which is going to be
     used to communicate request and grant between kernel and user space.

   - A rseq state struct to hold the kernel state of this

   - Documentation of the new mechanism

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.669472597@linutronix.de
2026-01-22 11:11:16 +01:00
Christoph Hellwig f5f2bad67a block: make the new blkzoned UAPI constants discoverable
The Linux 6.19 merge window added the new BLKREPORTZONESV2 ioctl, and
with it the new BLK_ZONE_REP_CACHED and BLK_ZONE_COND_ACTIVE constants.

The two constants are defined as part of enums, which makes it very
painful for userspace to discover if they are present in the installed
system headers.

Use the #define to the same name trick to make them trivially
discoverable using CPP directives.

Fixes: 0bf0e2e466 ("block: track zone conditions")
Fixes: b30ffcdc0c ("block: introduce BLKREPORTZONESV2 ioctl")
Reported-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-21 07:47:44 -07:00
Detlev Casanova fa05705107 media: v4l2-ctrls: Add hevc_ext_sps_[ls]t_rps controls
The vdpu381 decoder found on newer Rockchip SoC need the information
from the long term and short term ref pic sets from the SPS.

So far, it wasn't included in the v4l2 API, so add it with new dynamic
sized controls.

Each element of the hevc_ext_sps_lt_rps array contains the long term ref
pic set at that index.
Each element of the hevc_ext_sps_st_rps contains the short term ref pic
set at that index, as the raw data.
It is the role of the drivers to calculate the reference sets values.

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-21 14:43:09 +01:00
Johannes Weiner 64dd89ae01 mm/block/fs: remove laptop_mode
Laptop mode was introduced to save battery, by delaying and consolidating
writes and thereby maximize the time rotating hard drives wouldn't have to
spin.

Luckily, rotating hard drives, with their high spin-up times and power
draw, are a thing of the past for battery-powered devices.  Reclaim has
also since changed to not write single filesystem pages anymore, and
regular filesystem writeback is lumpy by design.

The juice doesn't appear worth the squeeze anymore.  The footprint of the
feature is small, but nevertheless it's a complicating factor in mm,
block, filesystems.  Developers don't think about it, and it likely hasn't
been tested with new reclaim and writeback changes in years.

Let's sunset it.  Keep the sysctl with a deprecation warning around for a
few more cycles, but remove all functionality behind it.

[akpm@linux-foundation.org: fix Documentation/admin-guide/laptops/index.rst]
Link: https://lkml.kernel.org/r/20251216185201.GH905277@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-20 19:24:47 -08:00
Jakub Kicinski 8766d61a1d Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'"
This reverts commit 77b9c4a438, reversing
changes made to 4515ec4ad58a37e70a9e1256c0b993958c9b7497:

 931420a2fc ("selftests/net: Add netkit container tests")
 ab771c938d ("selftests/net: Make NetDrvContEnv support queue leasing")
 6be87fbb27 ("selftests/net: Add env for container based tests")
 61d99ce3df ("selftests/net: Add bpf skb forwarding program")
 920da36341 ("netkit: Add xsk support for af_xdp applications")
 eef51113f8 ("netkit: Add netkit notifier to check for unregistering devices")
 b5ef109d22 ("netkit: Implement rtnl_link_ops->alloc and ndo_queue_create")
 b5c3fa4a0b ("netkit: Add single device mode for netkit")
 0073d2fd67 ("xsk: Proxy pool management for leased queues")
 1ecea95dd3 ("xsk: Extend xsk_rcv_check validation")
 804bf334d0 ("net: Proxy netdev_queue_get_dma_dev for leased queues")
 0caa9a8dde ("net: Proxy net_mp_{open,close}_rxq for leased queues")
 ff8889ff91 ("net, ethtool: Disallow leased real rxqs to be resized")
 9e2103f361 ("net: Add lease info to queue-get response")
 31127dedde ("net: Implement netdev_nl_queue_create_doit")
 a5546e18f7 ("net: Add queue-create operation")

The series will conflict with io_uring work, and the code needs more
polish.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:06:01 -08:00
Daniel Borkmann b5c3fa4a0b netkit: Add single device mode for netkit
Add a single device mode for netkit instead of netkit pairs. The primary
target for the paired devices is to connect network namespaces, of course,
and support has been implemented in projects like Cilium [0]. For the rxq
leasing the plan is to support two main scenarios related to single device
mode:

* For the use-case of io_uring zero-copy, the control plane can either
  set up a netkit pair where the peer device can perform rxq leasing which
  is then tied to the lifetime of the peer device, or the control plane
  can use a regular netkit pair to connect the hostns to a Pod/container
  and dynamically add/remove rxq leasing through a single device without
  having to interrupt the device pair. In the case of io_uring, the memory
  pool is used as skb non-linear pages, and thus the skb will go its way
  through the regular stack into netkit. Things like the netkit policy when
  no BPF is attached or skb scrubbing etc apply as-is in case the paired
  devices are used, or if the backend memory is tied to the single device
  and traffic goes through a paired device.

* For the use-case of AF_XDP, the control plane needs to use netkit in the
  single device mode. The single device mode currently enforces only a
  pass policy when no BPF is attached, and does not yet support BPF link
  attachments for AF_XDP. skbs sent to that device get dropped at the
  moment. Given AF_XDP operates at a lower layer of the stack tying this
  to the netkit pair did not make sense. In future, the plan is to allow
  BPF at the XDP layer which can: i) process traffic coming from the AF_XDP
  application (e.g. QEMU with AF_XDP backend) to filter egress traffic or
  to push selected egress traffic up to the single netkit device to the
  local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the
  single netkit into the AF_XDP application (e.g. DHCP replies). Also,
  the control-plane can dynamically manage rxq leasing for the single
  netkit device without having to interrupt (e.g. down/up cycle) the main
  netkit pair for the Pod which has traffic going in and out.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Jordan Rife <jordan@jrife.io>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0]
Link: https://patch.msgid.link/20260115082603.219152-10-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-20 11:58:50 +01:00
Daniel Borkmann a5546e18f7 net: Add queue-create operation
Add a ynl netdev family operation called queue-create that creates a
new queue on a netdevice:

      name: queue-create
      attribute-set: queue
      flags: [admin-perm]
      do:
        request:
          attributes:
            - ifindex
            - type
            - lease
        reply: &queue-create-op
          attributes:
            - id

This is a generic operation such that it can be extended for various
use cases in future. Right now it is mandatory to specify ifindex,
the queue type which is enforced to rx and a lease. The newly created
queue id is returned to the caller.

A queue from a virtual device can have a lease which refers to another
queue from a physical device. This is useful for memory providers
and AF_XDP operations which take an ifindex and queue id to allow
applications to bind against virtual devices in containers. The lease
couples both queues together and allows to proxy the operations from
a virtual device in a container to the physical device.

In future, the nested lease attribute can be lifted and made optional
for other use-cases such as dynamic queue creation for physical
netdevs. The lack of lease and the specification of the physical
device as an ifindex will imply that we need a real queue to be
allocated. Similarly, the queue type enforcement to rx can then be
lifted as well to support tx.

An early implementation had only driver-specific integration [0], but
in order for other virtual devices to reuse, it makes sense to have
this as a generic API in core net.

For leasing queues, the virtual netdev must have real_num_rx_queue
less than num_rx_queues at the time of calling queue-create. The
queue-type must be rx as only rx queues are supported for leasing
for now. We also enforce that the queue-create ifindex must point
to a virtual device, and that the nested lease attribute's ifindex
must point to a physical device. The nested lease attribute set
contains a netns-id attribute which is currently only intended for
dumping as part of the queue-get operation. Also, it is modeled as
an s32 type similarly as done elsewhere in the stack.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0]
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260115082603.219152-2-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-20 11:58:49 +01:00
Benjamin Berg 50b359896f wifi: cfg80211: ignore link disabled flag from userspace
When the AP has an advertised TID to Link Mapping (TTLM) it shall
include the element in the association response. As such, when this
element is present it needs to be used for the currently dormant links.
See Draft P802.11REVmf_D1.0 section 35.3.7.2.3 ("Negotiation of TTLM")
for the details. The flag is also not usable in case userspace wants to
specify a negotiated TTLM during association.

Note that for the link reconfiguration case, mac80211 did not use the
information. Draft P802.11REVmf_D1.0 states in section 35.3.6.4 ("Link
reconfiguration to the setup links) that we "shall operate with all the
TIDs mapped to the newly added links ..."

All this means that the flag is not needed. The implementation should
parse the information from the association response.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260118093904.754e057896a5.Ifd06f5ef839a93bfd54d0593dc932870f95f3242@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-20 10:02:01 +01:00
Mika Westerberg a9927022c4 net: ethtool: Add support for 80Gbps speed
USB4 v2 link used in peer-to-peer networking is symmetric 80Gbps so in
order to support reading this link speed, add support for it to ethtool.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260115115646.328898-3-mika.westerberg@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:09:58 -08:00
Miri Korenblit 8a42938a28 wifi: nl80211: ignore cluster id after NAN started
After NAN was started, cluster id updates from the user space should not
happen, since the device already started a cluster with the
previousely provided id.

Since NL80211_CMD_CHANGE_NAN_CONFIG requires to set the full NAN
configuration, we can't require that NL80211_NAN_CONF_CLUSTER_ID won't
be included in this command, and keeping the last confgiured value just
to be able to compare it against the new one seems a bit overkill.

Therefore, just ignore cluster id in this command and clarify the
documentation.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107142229.fb55e5853269.I10d18c8f69d98b28916596d6da4207c15ea4abb5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:49 +01:00
Linus Torvalds 90a855e75a Landlock fix for v6.19-rc6
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCaWledRAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSBowBAID4pRQB8EKRNaoqadUWnoSE/wl929Y6KY7i
 FBf8aODOAP9lOtU/CjL5jHKjt00zKW5gX3o0LIuFzSeuzCFTVx2MBA==
 =Aef0
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fixes from Mickaël Salaün:
 "This fixes TCP handling, tests, documentation, non-audit elided code,
  and minor cosmetic changes"

* tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Clarify documentation for the IOCTL access right
  selftests/landlock: Properly close a file descriptor
  landlock: Improve the comment for domain_is_scoped
  selftests/landlock: Use scoped_base_variants.h for ptrace_test
  selftests/landlock: Fix missing semicolon
  selftests/landlock: Fix typo in fs_test
  landlock: Optimize stack usage when !CONFIG_AUDIT
  landlock: Fix spelling
  landlock: Clean up hook_ptrace_access_check()
  landlock: Improve erratum documentation
  landlock: Remove useless include
  landlock: Fix wrong type usage
  selftests/landlock: NULL-terminate unix pathname addresses
  selftests/landlock: Remove invalid unix socket bind()
  selftests/landlock: Add missing connect(minimal AF_UNSPEC) test
  selftests/landlock: Fix TCP bind(AF_UNSPEC) test case
  landlock: Fix TCP handling of short AF_UNSPEC addresses
  landlock: Fix formatting
2026-01-18 15:15:47 -08:00
Linus Torvalds f8907398a6 Ext4 bug fixes for 6.19:
* Fix an inconsistency in structure size on 32-bit platforms caused by padding
      differences for the new EXT4_IOC_[GS]ET_TUNE_SB_PARAM ioctls
    * Fix a buffer leak on the error path when dropping the refcount an xattr
      value stored in an inode
    * Fix missing locking on the error path for the file defragmentation ioctl
      leading to a BUG
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmltDvoACgkQ8vlZVpUN
 gaOe8Af/cKUtk6niuANObDZ4BKIxiTVq1dujTVfftfDQoTsi+G3R4GrZS3UeZ8FB
 9K7UdnnmMkIVgv4e6mYjyjfvNY9oUF2PI9y19E+bk+sxhEEtIJfw0DCgbTfwtIc2
 6NKUS66DDlM1vy8qTrStIM4qShmFjfP+xqiOHWQKD0Xfd9vdxbPV/otVr+Qjl/Xj
 W2kT2erpvPnD9WLB+iaRv4G8zH3RSXEBPGCx43bP6E6JqwFhMw9qzIg1tcy4bVqG
 WPROiAuctzgnE9yEVvi5gF8gG3e1naB72rXi2aP6Ush4QMeNKLOWnDHhyCDr/V+5
 0fme7qlZYyIJZ7gys2sBAW4PkTaHEA==
 =SXcS
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:

 - Fix an inconsistency in structure size on 32-bit platforms caused by
   padding differences for the new EXT4_IOC_[GS]ET_TUNE_SB_PARAM ioctls

 - Fix a buffer leak on the error path when dropping the refcount an
   xattr value stored in an inode

 - Fix missing locking on the error path for the file defragmentation
   ioctl leading to a BUG

* tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref
  ext4: add missing down_write_data_sem in mext_move_extent().
  ext4: fix ext4_tune_sb_params padding
2026-01-18 14:01:20 -08:00
Arnd Bergmann cd16edba1c ext4: fix ext4_tune_sb_params padding
The padding at the end of struct ext4_tune_sb_params is architecture
specific and in particular is different between x86-32 and x86-64,
since the __u64 member only enforces struct alignment on the latter.

This shows up as a new warning when test-building the headers with
-Wpadded:

include/linux/ext4.h:144:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded]

All members inside the structure are naturally aligned, so the only
difference here is the amount of padding at the end. Make the padding
explicit, to have a consistent sizeof(struct ext4_tune_sb_params) of
232 on all architectures and avoid adding compat ioctl handling for
EXT4_IOC_GET_TUNE_SB_PARAM/EXT4_IOC_SET_TUNE_SB_PARAM.

This is an ABI break on x86-32 but hopefully this can go into 6.18.y early
enough as a fixup so no actual users will be affected.  Alternatively, the
kernel could handle the ioctl commands for both sizes (232 and 228 bytes)
on all architectures.

Fixes: 04a91570ac ("ext4: implemet new ioctls to set and get superblock parameters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20251204101914.1037148-1-arnd@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-01-18 11:22:53 -05:00
Suravee Suthikulpanit e05698c10d iommufd: Introduce data struct for AMD nested domain allocation
Introduce IOMMU_HWPT_DATA_AMD_GUEST data type for IOMMU guest page table,
which is used for stage-1 in nested translation. The data structure
contains information necessary for setting up the AMD HW-vIOMMU support.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-18 10:56:12 +01:00
Suravee Suthikulpanit 7d8b06ecc4 iommu/amd: Add support for hw_info for iommu capability query
AMD IOMMU Extended Feature (EFR) and Extended Feature 2 (EFR2) registers
specify features supported by each IOMMU hardware instance.
The IOMMU driver checks each feature-specific bits before enabling
each feature at run time.

For IOMMUFD, the hypervisor passes the raw value of amd_iommu_efr and
amd_iommu_efr2 to VMM via iommufd IOMMU_DEVICE_GET_HW_INFO ioctl.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-18 10:56:09 +01:00
Thomas Weißschuh 0b3877bec7 netfilter: uapi: Use UAPI definition of INT_MAX and INT_MIN
Using <limits.h> to gain access to INT_MAX and INT_MIN introduces a
dependency on a libc, which UAPI headers should not do.

Use the equivalent UAPI constants.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260113-uapi-limits-v2-3-93c20f4b2c1a@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:24:05 -08:00
Thomas Weißschuh a8a11e5237 ethtool: uapi: Use UAPI definition of INT_MAX
Using <limits.h> to gain access to INT_MAX introduces a dependency on a
libc, which UAPI headers should not do.

Use the equivalent UAPI constant.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260113-uapi-limits-v2-2-93c20f4b2c1a@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:24:05 -08:00
Thomas Weißschuh ca9d74eb5f uapi: add INT_MAX and INT_MIN constants
Some UAPI headers use INT_MAX and INT_MIN. Currently they include
<limits.h> for their definitions, which introduces a problematic
dependency on libc.

Add custom, namespaced definitions of INT_MAX and INT_MIN using the
same values as the regular kernel code.
These definitions are not added to uapi/linux/limits.h, as that header
will conflict with libc definitions on some platforms.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260113-uapi-limits-v2-1-93c20f4b2c1a@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:24:05 -08:00
Bill Wendling 150a04d817 compiler_types.h: Attributes: Add __counted_by_ptr macro
Introduce __counted_by_ptr(), which works like __counted_by(), but for
pointer struct members.

struct foo {
	int a, b, c;
	char *buffer __counted_by_ptr(bytes);
	short nr_bars;
	struct bar *bars __counted_by_ptr(nr_bars);
	size_t bytes;
};

Because "counted_by" can only be applied to pointer members in very
recent compiler versions, its application ends up needing to be distinct
from flexibe array "counted_by" annotations, hence a separate macro.

This is a reworking of Kees' previous patch [1].

Link: https://lore.kernel.org/all/20251020220118.1226740-1-kees@kernel.org/ [1]
Co-developed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Bill Wendling <morbo@google.com>
Link: https://patch.msgid.link/20260116005838.2419118-1-morbo@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
2026-01-17 11:00:28 -08:00
Thomas Weißschuh c25d01e1c4
virt: vbox: uapi: Mark inner unions in packed structs as packed
The unpacked unions within a packed struct generates alignment warnings
on clang for 32-bit ARM:

./usr/include/linux/vbox_vmmdev_types.h:239:4: error: field u within 'struct vmmdev_hgcm_function_parameter32'
  is less aligned than 'union (unnamed union at ./usr/include/linux/vbox_vmmdev_types.h:223:2)'
  and is usually due to 'struct vmmdev_hgcm_function_parameter32' being packed,
  which can lead to unaligned accesses [-Werror,-Wunaligned-access]
     239 |         } u;
         |           ^

./usr/include/linux/vbox_vmmdev_types.h:254:6: error: field u within
  'struct vmmdev_hgcm_function_parameter64::(anonymous union)::(unnamed at ./usr/include/linux/vbox_vmmdev_types.h:249:3)'
  is less aligned than 'union (unnamed union at ./usr/include/linux/vbox_vmmdev_types.h:251:4)' and is usually due to
  'struct vmmdev_hgcm_function_parameter64::(anonymous union)::(unnamed at ./usr/include/linux/vbox_vmmdev_types.h:249:3)'
  being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]

With the recent changes to compile-test the UAPI headers in more cases,
these warning in combination with CONFIG_WERROR breaks the build.

Fix the warnings.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512140314.DzDxpIVn-lkp@intel.com/
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/linux-kbuild/20260110-uapi-test-disable-headers-arm-clang-unaligned-access-v1-1-b7b0fa541daa@kernel.org/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-kbuild/29b2e736-d462-45b7-a0a9-85f8d8a3de56@app.fastmail.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260115-kbuild-alignment-vbox-v1-2-076aed1623ff@linutronix.de
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-01-16 15:00:54 -07:00
Thomas Weißschuh 1e5271393d
hyper-v: Mark inner union in hv_kvp_exchg_msg_value as packed
The unpacked union within a packed struct generates alignment warnings
on clang for 32-bit ARM:

./usr/include/linux/hyperv.h:361:2: error: field  within 'struct hv_kvp_exchg_msg_value'
  is less aligned than 'union hv_kvp_exchg_msg_value::(anonymous at ./usr/include/linux/hyperv.h:361:2)'
  and is usually due to 'struct hv_kvp_exchg_msg_value' being packed,
  which can lead to unaligned accesses [-Werror,-Wunaligned-access]
     361 |         union {
         |         ^

With the recent changes to compile-test the UAPI headers in more cases,
this warning in combination with CONFIG_WERROR breaks the build.

Fix the warning.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512140314.DzDxpIVn-lkp@intel.com/
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/linux-kbuild/20260110-uapi-test-disable-headers-arm-clang-unaligned-access-v1-1-b7b0fa541daa@kernel.org/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-kbuild/29b2e736-d462-45b7-a0a9-85f8d8a3de56@app.fastmail.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Tested-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260115-kbuild-alignment-vbox-v1-1-076aed1623ff@linutronix.de
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-01-16 15:00:54 -07:00
Linus Torvalds b62ce2547f Power management fixes for 6.19-rc6
- Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)
 
  - Fix stale description of the cost field in struct em_perf_state to
    reflect the current code (Yaxiong Tian)
 
  - Fix and revamp the energy model YNL specification added recently
    along with the energy model netlink interface (Changwoo Min)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmlqWS8SHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1BhAH/iSA/9YP9/xIW+rIZj2irsncGVc/hJ+V
 8PcM2tar966AEQBlmDPc7ug2MXT0Mb/5WRtQo/IvZ1tdAALB+bC70FHjdu3y7SAX
 IzFGyHKJ9OVJHGxYq0TCBbRXgYubUZiqvAnaBTBZ/ZIFlt3B4KEyatfFfxJMb1Pe
 H9zQIyUVBN1tjPfgNVbc22XGfkwfmmla72le6GmHseKZRqpwutIwzxm1PoMNVeLb
 fQv625LsWrDD9NU6j5uGq3WEq3xPltubtHN545FTFi4aGwBYJaG1FuIi+kPiQuCR
 VZmxTWVEiVwjttiZf/xWbOkPfwQqdgeXlTk5j+iBlF1jkrrW75IuLYQ=
 =eJ2p
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix an error path memory leak in the energy model management
  code, fix a kerneldoc comment in it, and fix and revamp the energy
  model YNL specification added recently along with the new energy model
  management netlink interface (that received feedback after being
  added):

   - Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)

   - Fix stale description of the cost field in struct em_perf_state to
     reflect the current code (Yaxiong Tian)

   - Fix and revamp the energy model YNL specification added recently
     along with the energy model netlink interface (Changwoo Min)"

* tag 'pm-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: EM: Add dump to get-perf-domains in the EM YNL spec
  PM: EM: Change cpus' type from string to u64 array in the EM YNL spec
  PM: EM: Rename em.yaml to dev-energymodel.yaml
  PM: EM: Fix yamllint warnings in the EM YNL spec
  PM: EM: Fix memory leak in em_create_pd() error path
  PM: EM: Fix incorrect description of the cost field in struct em_perf_state
2026-01-16 12:08:19 -08:00
Christian Brauner 9b8a0ba682
mount: add OPEN_TREE_NAMESPACE
When creating containers the setup usually involves using CLONE_NEWNS
via clone3() or unshare(). This copies the caller's complete mount
namespace. The runtime will also assemble a new rootfs and then use
pivot_root() to switch the old mount tree with the new rootfs. Afterward
it will recursively umount the old mount tree thereby getting rid of all
mounts.

On a basic system here where the mount table isn't particularly large
this still copies about 30 mounts. Copying all of these mounts only to
get rid of them later is pretty wasteful.

This is exacerbated if intermediary mount namespaces are used that only
exist for a very short amount of time and are immediately destroyed
again causing a ton of mounts to be copied and destroyed needlessly.

With a large mount table and a system where thousands or ten-thousands
of containers are spawned in parallel this quickly becomes a bottleneck
increasing contention on the semaphore.

Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to
OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of
returning a file descriptor referring to that mount tree
OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor
to a new mount namespace. In that new mount namespace the copied mount
tree has been mounted on top of a copy of the real rootfs.

The caller can setns() into that mount namespace and perform any
additionally required setup such as move_mount() detached mounts in
there.

This allows OPEN_TREE_NAMESPACE to function as a combined
unshare(CLONE_NEWNS) and pivot_root().

A caller may for example choose to create an extremely minimal rootfs:

fd_mntns = open_tree(-EBADF, "/var/lib/containers/wootwoot", OPEN_TREE_NAMESPACE);

This will create a mount namespace where "wootwoot" has become the
rootfs mounted on top of the real rootfs. The caller can now setns()
into this new mount namespace and assemble additional mounts.

This also works with user namespaces:

unshare(CLONE_NEWUSER);
fd_mntns = open_tree(-EBADF, "/var/lib/containers/wootwoot", OPEN_TREE_NAMESPACE);

which creates a new mount namespace owned by the earlier created user
namespace with "wootwoot" as the rootfs mounted on top of the real
rootfs.

Link: https://patch.msgid.link/20251229-work-empty-namespace-v1-1-bfb24c7b061f@kernel.org
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Suggested-by: Christian Brauner <brauner@kernel.org>
Suggested-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-16 19:21:40 +01:00
Ian Abbott 10d28cffb3 comedi: Fix getting range information for subdevices 16 to 255
The `COMEDI_RANGEINFO` ioctl does not work properly for subdevice
indices above 15.  Currently, the only in-tree COMEDI drivers that
support more than 16 subdevices are the "8255" driver and the
"comedi_bond" driver.  Making the ioctl work for subdevice indices up to
255 is achievable.  It needs minor changes to the handling of the
`COMEDI_RANGEINFO` and `COMEDI_CHANINFO` ioctls that should be mostly
harmless to user-space, apart from making them less broken.  Details
follow...

The `COMEDI_RANGEINFO` ioctl command gets the list of supported ranges
(usually with units of volts or milliamps) for a COMEDI subdevice or
channel.  (Only some subdevices have per-channel range tables, indicated
by the `SDF_RANGETYPE` flag in the subdevice information.)  It uses a
`range_type` value and a user-space pointer, both supplied by
user-space, but the `range_type` value should match what was obtained
using the `COMEDI_CHANINFO` ioctl (if the subdevice has per-channel
range tables)  or `COMEDI_SUBDINFO` ioctl (if the subdevice uses a
single range table for all channels).  Bits 15 to 0 of the `range_type`
value contain the length of the range table, which is the only part that
user-space should care about (so it can use a suitably sized buffer to
fetch the range table).  Bits 23 to 16 store the channel index, which is
assumed to be no more than 255 if the subdevice has per-channel range
tables, and is set to 0 if the subdevice has a single range table.  For
`range_type` values produced by the `COMEDI_SUBDINFO` ioctl, bits 31 to
24 contain the subdevice index, which is assumed to be no more than 255.
But for `range_type` values produced by the `COMEDI_CHANINFO` ioctl,
bits 27 to 24 contain the subdevice index, which is assumed to be no
more than 15, and bits 31 to 28 contain the COMEDI device's minor device
number for some unknown reason lost in the mists of time.  The
`COMEDI_RANGEINFO` ioctl extract the length from bits 15 to 0 of the
user-supplied `range_type` value, extracts the channel index from bits
23 to 16 (only used if the subdevice has per-channel range tables),
extracts the subdevice index from bits 27 to 24, and ignores bits 31 to
28.  So for subdevice indices 16 to 255, the `COMEDI_SUBDINFO` or
`COMEDI_CHANINFO` ioctl will report a `range_type` value that doesn't
work with the `COMEDI_RANGEINFO` ioctl.  It will either get the range
table for the subdevice index modulo 16, or will fail with `-EINVAL`.

To fix this, always use bits 31 to 24 of the `range_type` value to hold
the subdevice index (assumed to be no more than 255).  This affects the
`COMEDI_CHANINFO` and `COMEDI_RANGEINFO` ioctls.  There should not be
anything in user-space that depends on the old, broken usage, although
it may now see different values in bits 31 to 28 of the `range_type`
values reported by the `COMEDI_CHANINFO` ioctl for subdevices that have
per-channel subdevices.  User-space should not be trying to decode bits
31 to 16 of the `range_type` values anyway.

Fixes: ed9eccbe89 ("Staging: add comedi core")
Cc: stable@vger.kernel.org #5.17+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://patch.msgid.link/20251203162438.176841-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-16 16:42:15 +01:00
Rafael J. Wysocki d51e68b700 Merge branch 'pm-em'
Merge fixes related to the energy model management for 6.19-rc6:

 - Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)

 - Fix stale description of the cost field in struct em_perf_state to
   reflect the current code (Yaxiong Tian)

 - Fix and revamp the energy model YNL specification added recently
   along with the energy model netlink interface (Changwoo Min)

* pm-em:
  PM: EM: Add dump to get-perf-domains in the EM YNL spec
  PM: EM: Change cpus' type from string to u64 array in the EM YNL spec
  PM: EM: Rename em.yaml to dev-energymodel.yaml
  PM: EM: Fix yamllint warnings in the EM YNL spec
  PM: EM: Fix memory leak in em_create_pd() error path
  PM: EM: Fix incorrect description of the cost field in struct em_perf_state
2026-01-16 16:16:24 +01:00
Richard Leitner 5be4154f62 media: v4l: ctrls: add a control for enabling strobe output
Add a control V4L2_CID_FLASH_STROBE_OE to en- or disable the
strobe output of v4l2 devices (most likely sensors).

Signed-off-by: Richard Leitner <richard.leitner@linux.dev>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-16 14:08:53 +01:00
Richard Leitner d89ccbf3dd media: v4l: ctrls: add a control for flash/strobe duration
Add a V4L2_CID_FLASH_DURATION control to set the duration of a
flash/strobe pulse. This controls the length of the flash/strobe pulse
output by device (typically a camera sensor) and connected to the flash
controller. This is different to the V4L2_CID_FLASH_TIMEOUT control,
which is implemented by the flash controller and defines a limit after
which the flash is "forcefully" turned off again.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Richard Leitner <richard.leitner@linux.dev>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-16 14:08:53 +01:00
Gal Pressman 567873005d ethtool: Clarify len/n_stats fields in/out semantics
Document that the 'len' field in ethtool_gstrings and 'n_stats' field in
ethtool_stats optionally serve dual purposes: on entry they specify the
number of items requested, and on return they indicate the number
actually returned (which is not necessarily the same).

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/20260115060544.481550-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:06:46 -08:00
Jakub Kicinski c27022497d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc6).

No conflicts, or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 18:02:48 -08:00
Dapeng Mi d2bdcde962 perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
With the introduction of the OMR feature, the PEBS memory auxiliary info
field for load and store latency events has been restructured for DMR.

The memory auxiliary info field's bit[8] indicates whether a L2 cache
miss occurred for a memory load or store instruction. If bit[8] is 0,
it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access. A significant enhancement is OMR encoding
provides up to 8 fine-grained memory regions besides the cache region.

A significant enhancement for OMR encoding is the ability to provide
up to 8 fine-grained memory regions in addition to the cache region,
offering more detailed insights into memory access regions.

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the ISE documentation.

This patch ensures that the PEBS memory auxiliary info field is correctly
interpreted and utilized in DMR.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
2026-01-15 10:04:26 +01:00
Dave Airlie 83dc0ba275 amd-drm-next-6.20-2026-01-09:
amdgpu:
 - GPUVM updates
 - Initial support for larger GPU address spaces
 - Initial SMUIO 15.x support
 - Documentation updates
 - Initial PSP 15.x support
 - Initial IH 7.1 support
 - Initial IH 6.1.1 support
 - SMU 13.0.12 updates
 - RAS updates
 - Initial MMHUB 3.4 support
 - Initial MMHUB 4.2 support
 - Initial GC 12.1 support
 - Initial GC 11.5.4 support
 - HDMI fixes
 - Panel replay improvements
 - DML updates
 - DC FP fixes
 - Initial SDMA 6.1.4 support
 - Initial SDMA 7.1 support
 - Userq updates
 - DC HPD refactor
 - SwSMU cleanups and refactoring
 - TTM memory ops parallelization
 - DCN 3.5 fixes
 - DP audio fixes
 - Clang fixes
 - Misc spelling fixes and cleanups
 - Initial SDMA 7.11.4 support
 - Convert legacy DRM logging helpers to new drm logging helpers
 - Initial JPEG 5.3 support
 - Add support for changing UMA size via the driver
 - DC analog fixes
 - GC 9 gfx queue reset support
 - Initial SMU 15.x support
 
 amdkfd:
 - Reserved SDMA rework
 - Refactor SPM
 - Initial GC 12.1 support
 - Initial GC 11.5.4 support
 - Initial SDMA 7.1 support
 - Initial SDMA 6.1.4 support
 - Increase the kfd process hash table
 - Per context support
 - Topology fixes
 
 radeon:
 - Convert legacy DRM logging helpers to new drm logging helpers
 - Use devm for i2c adapters
 - Variable sized array fix
 - Misc cleanups
 
 UAPI:
 - KFD context support.  Proposed userspace:
   https://github.com/ROCm/rocm-systems/pull/1705
   https://github.com/ROCm/rocm-systems/pull/1701
 - Add userq metadata queries for more queue types.  Proposed userspace:
   https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaWEiMQAKCRC93/aFa7yZ
 2LXpAP4spIs5nStw1Q7SKhFAU1YSbNBIzrPJVjeQcudAebrgcAD/V9dEdeTPjXSj
 VmtJ9w0WO0vi+VGKPTWsukMB0kyVnwE=
 =HgYC
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.20-2026-01-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.20-2026-01-09:

amdgpu:
- GPUVM updates
- Initial support for larger GPU address spaces
- Initial SMUIO 15.x support
- Documentation updates
- Initial PSP 15.x support
- Initial IH 7.1 support
- Initial IH 6.1.1 support
- SMU 13.0.12 updates
- RAS updates
- Initial MMHUB 3.4 support
- Initial MMHUB 4.2 support
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- HDMI fixes
- Panel replay improvements
- DML updates
- DC FP fixes
- Initial SDMA 6.1.4 support
- Initial SDMA 7.1 support
- Userq updates
- DC HPD refactor
- SwSMU cleanups and refactoring
- TTM memory ops parallelization
- DCN 3.5 fixes
- DP audio fixes
- Clang fixes
- Misc spelling fixes and cleanups
- Initial SDMA 7.11.4 support
- Convert legacy DRM logging helpers to new drm logging helpers
- Initial JPEG 5.3 support
- Add support for changing UMA size via the driver
- DC analog fixes
- GC 9 gfx queue reset support
- Initial SMU 15.x support

amdkfd:
- Reserved SDMA rework
- Refactor SPM
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- Initial SDMA 7.1 support
- Initial SDMA 6.1.4 support
- Increase the kfd process hash table
- Per context support
- Topology fixes

radeon:
- Convert legacy DRM logging helpers to new drm logging helpers
- Use devm for i2c adapters
- Variable sized array fix
- Misc cleanups

UAPI:
- KFD context support.  Proposed userspace:
  https://github.com/ROCm/rocm-systems/pull/1705
  https://github.com/ROCm/rocm-systems/pull/1701
- Add userq metadata queries for more queue types.  Proposed userspace:
  https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260109154713.3242957-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-01-15 14:49:33 +10:00
Alexei Starovoitov e3d0dbb3b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5
Cross-merge BPF and other fixes after downstream PR.

No conflicts.

Adjacent:
Auto-merging MAINTAINERS
Auto-merging Makefile
Auto-merging kernel/bpf/verifier.c
Auto-merging kernel/sched/ext.c
Auto-merging mm/memcontrol.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-14 15:22:01 -08:00
Thorsten Blum dacbfc1678 crypto: af_alg - Annotate struct af_alg_iv with __counted_by
Add the __counted_by() compiler attribute to the flexible array member
'iv' to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260105122402.2685-2-thorsten.blum@linux.dev
Signed-off-by: Kees Cook <kees@kernel.org>
2026-01-14 14:43:18 -08:00
Linus Torvalds d19954ee63 [GIT PULL for v6.19-rc6] media fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmlnqP0ACgkQCF8+vY7k
 4RVz9w/+P5aS+tLXMKIu/1GzCHUkQ41kUtx8irZ2kTBHpKTfvNyh4zp+hkOkIrgn
 NAfNoOogAFgAgPoFT4xPnF+wRdER+oqmlbtu4zvVAPtG4By6lHs6QjrT8k4LKYG2
 L/ItH6LCGKt9WiYVBpdp7rHUAPiUyVJnzYUQRu7uyGclJFNIDs2EhCl4/KAwol1a
 yILpU6DiJz75TwtohdNeJ1E4/sXLW7f35aUgF0OqxkOnROGGk09aUAsNkxeh9EVR
 CeXd6cnKWFlbGgcdoZ1tjHlxSFdDHho1wUxhHmK5eyPgLwYmAGHfzt7cFlyxPLYE
 I9fsGHe4yKDXy6A1va5EVNyyI05OpsYaFsu48NeSdpkGe5gYXRbSLkcXuSsMmNCs
 x/MXIMOdTaEEO6jI8YH5/VRUUvj0FKdlpbo391aZfVJZc9sSr23Q9kN/hjllJd5G
 zEy4U31yzN9y8s3GQVxRPnKlIC/dul/cpNNAFJfUW7Bzjp/APuv73NKyIUmELGOV
 Vsu67cDmfDg/aG6ov5+mY2DnXKahoLmHaLcgDF4CYRny8qM+/0l40425hX7z9wgt
 pXxGjEXp5l6rFWqMIL+bUkBJTJpXcFadX0sqwWPiUllvNrBlxyT81x6LKyn8TH6C
 8vCPPewojbholFXe6FBZzVXclPwLnQBlX6nzIOkNnBM0xbD6xg8=
 =XjkH
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - ov02c10: some fixes related to preserving bayer pattern and
   horizontal control

 - ipu-bridge: Add quirks for some Dell XPS laptops with inverted
   sensors

 - mali-c55: Fix version identifier logic

 - rzg2l-cru: csi-2: fix RZ/V2H input sizes on some variants

* tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: ov02c10: Remove unnecessary hflip and vflip pointers
  media: ipu-bridge: Add DMI quirk for Dell XPS laptops with upside down sensors
  media: ov02c10: Fix the horizontal flip control
  media: ov02c10: Adjust x-win/y-win when changing flipping to preserve bayer-pattern
  media: ov02c10: Fix bayer-pattern change after default vflip change
  media: rzg2l-cru: csi-2: Support RZ/V2H input sizes
  media: uapi: mali-c55-config: Remove version identifier
  media: mali-c55: Remove duplicated version check
  media: Documentation: mali-c55: Use v4l2-isp version identifier
2026-01-14 08:18:01 -08:00
Sai Pratyusha Magam 6ee3a22c61 wifi: nl80211: Add support for EPP peer indication
Introduce a new netlink attribute NL80211_ATTR_EPP_PEER
to be used with NL80211_CMD_NEW_STA and
NL80211_CMD_ADD_LINK_STA for the userspace to indicate
that a non-AP STA is an Enhanced Privacy Protection (EPP)
peer.

Co-developed-by: Rohan Dutta <quic_drohan@quicinc.com>
Signed-off-by: Rohan Dutta <quic_drohan@quicinc.com>
Signed-off-by: Sai Pratyusha Magam <sai.magam@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-5-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:16 +01:00
Ainy Kumari 9d17a040c1 wifi: cfg80211: add feature flag for (re)association frame encryption
Introduce an extended feature flag that allows drivers to signal
support for encryption of (Re)Association Request and Response frames
in both non-AP STA and AP mode, as specified in specification
"IEEE P802.11bi/D3.0, 12.16.6".

Signed-off-by: Ainy Kumari <ainy.kumari@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-3-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:15 +01:00
Ainy Kumari f29c852149 wifi: cfg80211: add support for EPPKE Authentication Protocol
Add an extended feature flag NL80211_EXT_FEATURE_EPPKE to allow a
driver to indicate support for the Enhanced Privacy Protection Key
Exchange (EPPKE) authentication protocol in non-AP STA mode, as
defined in "IEEE P802.11bi/D3.0, 12.16.9".

In case of SME in userspace, the Authentication frame body is prepared
in userspace while the driver finalizes the Authentication frame once
it receives the required fields and elements. The driver indicates
support for EPPKE using the extended feature flag so that userspace
can initiate EPPKE authentication.

When the feature flag is set, process EPPKE Authentication frames from
userspace in non-AP STA mode. If the flag is not set, reject EPPKE
Authentication frames.

Define a new authentication type NL80211_AUTHTYPE_EPPKE for EPPKE.

Signed-off-by: Ainy Kumari <ainy.kumari@oss.qualcomm.com>
Co-developed-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-2-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:15 +01:00
Jonas Köppeler 1bddd758ba net/sched: sch_cake: share shaper state across sub-instances of cake_mq
This commit adds shared shaper state across the cake instances beneath a
cake_mq qdisc. It works by periodically tracking the number of active
instances, and scaling the configured rate by the number of active
queues.

The scan is lockless and simply reads the qlen and the last_active state
variable of each of the instances configured beneath the parent cake_mq
instance. Locking is not required since the values are only updated by
the owning instance, and eventual consistency is sufficient for the
purpose of estimating the number of active queues.

The interval for scanning the number of active queues is set to 200 us.
We found this to be a good tradeoff between overhead and response time.
For a detailed analysis of this aspect see the Netdevconf talk:

https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-5-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-13 11:54:29 +01:00
Deepa Guthyappa Madivalara 406fc2e9ca media: uapi: videodev2: Add support for AV1 stateful decoder
Introduce a new pixel format, V4L2_PIX_FMT_AV1, to the
Video4Linux2(V4L2) API. This format is intended for AV1
bitstreams in stateful decoding/encoding workflows.
The fourcc code 'AV10' is used to distinguish
this format from the existing V4L2_PIX_FMT_AV1_FRAME,
which is used for stateless AV1 decoder implementation.

Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Deepa Guthyappa Madivalara <deepa.madivalara@oss.qualcomm.com>
Tested-by: Val Packett <val@packett.cool>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-14 08:25:55 +01:00
Jakub Kicinski 669aa3e3fa First set of changes for the current -next cycle, of note:
- ath12k gets an overhaul to support multi-wiphy device
    wiphy and pave the way for future device support in the
    same driver (rather than splitting to ath13k)
  - mac80211 gets some better iteration macros
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmllQ8YACgkQ10qiO8sP
 aAA5Eg/+Nk1NNSP0+69YdXZECE+cGPoXVUsLZSEalgd7SZ8jKkCfVVshGAhimFp0
 qMcRgkjGsQkVrOlKcCdX9GkQTQnI2HIvxfqXaE0pB39KelB3lKnBycD47FuC6OVE
 mjNZYpUhNs9wuKs8uP+BRO9L/41C/5uE8hyTR3cf2bqkhx+FAasdYZaVFenpaOJ2
 lmH5GkeAjxEYfyYk/7I70ixtIZ4oDISj99W97rfqSiTQx7VEOD8NdWZirUpLbpt1
 UDR+rCapJQ1wl9p8riSE09hzJALKKVI9YHIDWvfTI81pO+Xt1eyf1wWu0ewz2t6P
 5tLk4LChFZeUqV6oJqTyRYWVlSQ8d9wVFNjPF0dJCiqmh48oz4BCFCWE5dJHgh8Q
 LN8LErrjTBBbgldwbQm7HRHb7llt0MRCmW2qJKKSU4aIBbEC/1sFu0w6smNIPCst
 GJ+fCBYugFAHZ0cft468FW/TSs49zlpGZShRcy22/Ll2iuGp1+TA5nmMAcfF4kdf
 GoPIO2c4gdnHUAp046czquU4KnbtzI0AWLdti7jAHSdVNv1+u3uTgudKVh+PGqIj
 BjOfJvUYCDqK9vkLyZXA0iMfKOdKjSOM+IWT0elYjr4MUy/E9fKK28AmXsze4IzB
 iNMHowJFlQUEi+/26NxO1+8kZpf1x05rxDGcszvECoDiow2TWJg=
 =F1n0
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
First set of changes for the current -next cycle, of note:

 - ath12k gets an overhaul to support multi-wiphy device
   wiphy and pave the way for future device support in
   the same driver (rather than splitting to ath13k)

 - mac80211 gets some better iteration macros

* tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (120 commits)
  wifi: mac80211: remove width argument from ieee80211_parse_bitrates
  wifi: mac80211_hwsim: remove NAN by default
  wifi: mac80211: improve station iteration ergonomics
  wifi: mac80211: improve interface iteration ergonomics
  wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel
  wifi: mac80211: unexport ieee80211_get_bssid()
  wl1251: Replace strncpy with strscpy in wl1251_acx_fw_version
  wifi: iwlegacy: 3945-rs: remove redundant pointer check in il3945_rs_tx_status() and il3945_rs_get_rate()
  wifi: mac80211: don't send an unused argument to ieee80211_check_combinations
  wifi: libertas: fix WARNING in usb_tx_block
  wifi: mwifiex: Allocate dev name earlier for interface workqueue name
  wifi: wlcore: sdio: Use pm_ptr instead of #ifdef CONFIG_PM
  wifi: cfg80211: Fix use_for flag update on BSS refresh
  wifi: brcmfmac: rename function that frees vif
  wifi: brcmfmac: fix/add kernel-doc comments
  wifi: mac80211: Update csa_finalize to use link_id
  wifi: cfg80211: add cfg80211_stop_link() for per-link teardown
  wifi: ath12k: Skip DP peer creation for scan vdev
  wifi: ath12k: move firmware stats request outside of atomic context
  wifi: ath12k: add the missing RCU lock in ath12k_dp_tx_free_txbuf()
  ...
====================

Link: https://patch.msgid.link/20260112185836.378736-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12 17:02:02 -08:00
Yoav Cohen 93ada1b3da ublk: add UBLK_CMD_TRY_STOP_DEV command
Add a best-effort stop command, UBLK_CMD_TRY_STOP_DEV, which only stops a
ublk device when it has no active openers.

Unlike UBLK_CMD_STOP_DEV, this command does not disrupt existing users.
New opens are blocked only after disk_openers has reached zero; if the
device is busy, the command returns -EBUSY and leaves it running.

The ub->block_open flag is used only to close a race with an in-progress
open and does not otherwise change open behavior.

Advertise support via the UBLK_F_SAFE_STOP_DEV feature flag.

Signed-off-by: Yoav Cohen <yoav@nvidia.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12 15:07:31 -07:00
Lachlan Hodges e1cbdf78f6 wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel
When sending a channel ensure we include the IEEE80211_CHAN_S1G_NO_PRIMARY
flag.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20260109081439.3168-1-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:47:35 +01:00
Askar Safin e6ce36ccc8
init: remove /proc/sys/kernel/real-root-dev
It is not used anymore.

Signed-off-by: Askar Safin <safinaskar@gmail.com>
Link: https://patch.msgid.link/20251119222407.3333257-4-safinaskar@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12 17:22:27 +01:00
Stanley Zhang be82a89066 ublk: implement integrity user copy
Add a function ublk_copy_user_integrity() to copy integrity information
between a request and a user iov_iter. This mirrors the existing
ublk_copy_user_pages() but operates on request integrity data instead of
regular data. Check UBLKSRV_IO_INTEGRITY_FLAG in iocb->ki_pos in
ublk_user_copy() to choose between copying data or integrity data.

[csander: change offset units from data bytes to integrity data bytes,
 fix CONFIG_BLK_DEV_INTEGRITY=n build, rebase on user copy refactor]

Signed-off-by: Stanley Zhang <stazhang@purestorage.com>
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12 09:16:38 -07:00
Caleb Sander Mateos f82f0a16a8 ublk: set UBLK_IO_F_INTEGRITY in ublksrv_io_desc
Indicate to the ublk server when an incoming request has integrity data
by setting UBLK_IO_F_INTEGRITY in the ublksrv_io_desc's op_flags field.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12 09:15:05 -07:00
Stanley Zhang 98bf225685 ublk: support UBLK_PARAM_TYPE_INTEGRITY in device creation
Add a feature flag UBLK_F_INTEGRITY for a ublk server to request
integrity/metadata support when creating a ublk device. The ublk server
can also check for the feature flag on the created device or the result
of UBLK_U_CMD_GET_FEATURES to tell if the ublk driver supports it.
UBLK_F_INTEGRITY requires UBLK_F_USER_COPY, as user copy is the only
data copy mode initially supported for integrity data.
Add UBLK_PARAM_TYPE_INTEGRITY and struct ublk_param_integrity to struct
ublk_params to specify the integrity params of a ublk device.
UBLK_PARAM_TYPE_INTEGRITY requires UBLK_F_INTEGRITY and a nonzero
metadata_size. The LBMD_PI_CAP_* and LBMD_PI_CSUM_* values from the
linux/fs.h UAPI header are used for the flags and csum_type fields.
If the UBLK_PARAM_TYPE_INTEGRITY flag is set, validate the integrity
parameters and apply them to the blk_integrity limits.
The struct ublk_param_integrity validations are based on the checks in
blk_validate_integrity_limits(). Any invalid parameters should be
rejected before being applied to struct blk_integrity.

[csander: drop redundant pi_tuple_size field, use block metadata UAPI
 constants, add param validation]

Signed-off-by: Stanley Zhang <stazhang@purestorage.com>
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12 09:15:05 -07:00
Günther Noack 6abbb8703a
landlock: Clarify documentation for the IOCTL access right
Move the description of the LANDLOCK_ACCESS_FS_IOCTL_DEV access right
together with the file access rights.

This group of access rights applies to files (in this case device
files), and they can be added to file or directory inodes using
landlock_add_rule(2).  The check for that works the same for all file
access rights, including LANDLOCK_ACCESS_FS_IOCTL_DEV.

Invoking ioctl(2) on directory FDs can not currently be restricted
with Landlock.  Having it grouped separately in the documentation is a
remnant from earlier revisions of the LANDLOCK_ACCESS_FS_IOCTL_DEV
patch set.

Link: https://lore.kernel.org/all/20260108.Thaex5ruach2@digikod.net/
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260111175203.6545-2-gnoack3000@gmail.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-01-12 17:07:21 +01:00
Christian Brauner 576ee5dfd4 fs: add immutable rootfs
Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.

Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add a immutable
rootfs called "nullfs" that can be selected via the "nullfs_rootfs"
kernel command line option.

The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:

  chdir(rootfs);
  pivot_root(".", ".");
  umount2(".", MNT_DETACH);

and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)

Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.

In the future this will also allow us to create completely empty mount
namespaces without risking to leak anything.

systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.

This goes back to various discussion in previous years and a LPC 2024
presentation about this very topic.

Link: https://patch.msgid.link/20260112-work-immutable-rootfs-v2-3-88dd1c34a204@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12 16:52:09 +01:00
Greg Kroah-Hartman e92d336eaf Merge 6.19-rc5 into char-misc-next
We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-12 08:51:13 +01:00
Thomas Gleixner 2e4b28c48f treewide: Update email address
In a vain attempt to consolidate the email zoo switch everything to the
kernel.org account.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-01-11 06:09:11 -10:00
Michael Chan bc87b14594 bnxt_en: Implement ethtool_ops -> get_link_ext_state()
Map the link_down_reason from the FW to the ethtool link_ext_state
when it is available.  Also log it to the link down dmesg when it is
available.  Add 2 new link_ext_state enums to the UAPI:

ETHTOOL_LINK_EXT_STATE_OTP_SPEED_VIOLATION
ETHTOOL_LINK_EXT_STATE_BMC_REQUEST_DOWN

to cover OTP (one-time-programmable) speed restrictions and
BMC (Baseboard management controller) forcing the link down.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:51 -08:00
Nicolin Chen c279e83953 iommu: Introduce pci_dev_reset_iommu_prepare/done()
PCIe permits a device to ignore ATS invalidation TLPs while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out. E.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.

The OS should do something to mitigate this as we do not want production
systems to be reporting critical ATS failures, especially in a hypervisor
environment. Broadly, OS could arrange to ignore the timeouts, block page
table mutations to prevent invalidations, or disable and block ATS.

The PCIe r6.0, sec 10.3.1 IMPLEMENTATION NOTE recommends SW to disable and
block ATS before initiating a Function Level Reset. It also mentions that
other reset methods could have the same vulnerability as well.

Provide a callback from the PCI subsystem that will enclose the reset and
have the iommu core temporarily change all the attached RID/PASID domains
group->blocking_domain so that the IOMMU hardware would fence any incoming
ATS queries. And IOMMU drivers should also synchronously stop issuing new
ATS invalidations and wait for all ATS invalidations to complete. This can
avoid any ATS invaliation timeouts.

However, if there is a domain attachment/replacement happening during an
ongoing reset, ATS routines may be re-activated between the two function
calls. So, introduce a new resetting_domain in the iommu_group structure
to reject any concurrent attach_dev/set_dev_pasid call during a reset for
a concern of compatibility failure. Since this changes the behavior of an
attach operation, update the uAPI accordingly.

Note that there are two corner cases:
 1. Devices in the same iommu_group
    Since an attachment is always per iommu_group, this means that any
    sibling devices in the iommu_group cannot change domain, to prevent
    race conditions.
 2. An SR-IOV PF that is being reset while its VF is not
    In such case, the VF itself is already broken. So, there is no point
    in preventing PF from going through the iommu reset.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-10 10:26:44 +01:00
Changwoo Min 380ff27af2 PM: EM: Add dump to get-perf-domains in the EM YNL spec
Add dump to get-perf-domains, so that a user can fetch either information
about a specific performance domain with do or information about all
performance domains with dump. Share the reply format of do and dump using
perf-domain-attrs, so remove perf-domains. The YNL spec, autogenerated
files, and the do implementation are updated, and the dump implementation
is added.

Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-5-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-09 21:44:46 +01:00
Changwoo Min caa07a815d PM: EM: Rename em.yaml to dev-energymodel.yaml
The EM YNL specification used many acronyms, including ‘em’, ‘pd’,
‘ps’, etc. While the acronyms are short and convenient, they could be
confusing. So, let’s spell them out to be more specific. The following
changes were made in the spec. Note that the protocol name cannot exceed
GENL_NAMSIZ (16).

  em           -> dev-energymodel
  pds          -> perf-domains
  pd           -> perf-domain
  pd-id        -> perf-domain-id
  pd-table     -> perf-table
  ps           -> perf-state
  get-pds      -> get-perf-domains
  get-pd-table -> get-perf-table
  pd-created   -> perf-domain-created
  pd-updated   -> perf-domain-updated
  pd-deleted   -> perf-domain-deleted

In addition. doc strings were added to the spec. based on the comments in
energy_model.h. Two flag attributes (perf-state-flags and
perf-domain-flags) were added for easily interpreting the bit flags.

Finally, the autogenerated files and em_netlink.c were updated accordingly
to reflect the name changes.

Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-3-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-09 21:44:46 +01:00
Linus Torvalds 2bfe3e0da6 vfs-6.19-rc5.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaWDTUAAKCRCRxhvAZXjc
 okDeAQDsEpgFy8hpD08HBs4TpUXv7MSRqwZS7emlsfUqEjeprwEAgu95YuJ+z6hV
 RFk/Lior/+YlB5FN5VcKzyQGMuRDUwc=
 =snsQ
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Remove incorrect __user annotation from struct xattr_args::value

 - Documentation fix: Add missing kernel-doc description for the @isnew
   parameter in ilookup5_nowait() to silence Sphinx warnings

 - Documentation fix: Fix kernel-doc comment for __start_dirop() - the
   function name in the comment was wrong and the @state parameter was
   undocumented

 - Replace dynamic folio_batch allocation with stack allocation in
   iomap_zero_range(). The dynamic allocation was problematic for
   ext4-on-iomap work (didn't handle allocation failure properly) and
   triggered lockdep complaints. Uses a flag instead to control batch
   usage

 - Re-add #ifdef guards around PIDFD_GET_<ns-type>_NAMESPACE ioctls.
   When a namespace type is disabled, ns->ops is NULL, causes crashes
   during inode eviction when closing the fd. The ifdefs were removed in
   a recent simplification but are still needed

 - Fixe a race where a folio could be unlocked before the trailing zeros
   (for EOF within the page) were written

 - Split out a dedicated lease_dispose_list() helper since lease code
   paths always know they're disposing of leases. Removes unnecessary
   runtime flag checks and prepares for upcoming lease_manager
   enhancements

 - Fix userland delegation requests succeeding despite conflicting
   opens. Previously, FL_LAYOUT and FL_DELEG leases bypassed conflict
   checks (a hack for nfsd). Adds new ->lm_open_conflict() lease_manager
   operation so userland delegations get proper conflict checking while
   nfsd can continue its own conflict handling

 - Fix LOOKUP_CACHED path lookups incorrectly falling through to the
   slow path. After legitimize_links() calls were conditionally elided,
   the routine would always fail with LOOKUP_CACHED regardless of
   whether there were any links. Now the flag is checked at the two
   callsites before calling legitimize_links()

 - Fix bug in media fd allocation in media_request_alloc()

 - Fix mismatched API calls in ecryptfs_mknod(): was calling
   end_removing() instead of end_creating() after
   ecryptfs_start_creating_dentry()

 - Fix dentry reference count leak in ecryptfs_mkdir(): a dget() of the
   lower parent dir was added but never dput()'d, causing BUG during
   lower filesystem unmount due to the still-in-use dentry

* tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  pidfs: protect PIDFD_GET_* ioctls() via ifdef
  ecryptfs: Release lower parent dentry after creating dir
  ecryptfs: Fix improper mknod pairing of start_creating()/end_removing()
  get rid of bogus __user in struct xattr_args::value
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
  netfs: Fix early read unlock of page with EOF in middle
  filelock: allow lease_managers to dictate what qualifies as a conflict
  filelock: add lease_dispose_list() helper
  iomap: replace folio_batch allocation with stack allocation
  media: mc: fix potential use-after-free in media_request_alloc()
2026-01-09 05:57:57 -10:00
Sean Christopherson da142f3d37 KVM: Remove subtle "struct kvm_stats_desc" pseudo-overlay
Remove KVM's internal pseudo-overlay of kvm_stats_desc, which subtly
aliases the flexible name[] in the uAPI definition with a fixed-size array
of the same name.  The unusual embedded structure results in compiler
warnings due to -Wflex-array-member-not-at-end, and also necessitates an
extra level of dereferencing in KVM.  To avoid the "overlay", define the
uAPI structure to have a fixed-size name when building for the kernel.

Opportunistically clean up the indentation for the stats macros, and
replace spaces with tabs.

No functional change intended.

Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Closes: https://lore.kernel.org/all/aPfNKRpLfhmhYqfP@kspp
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
[..]
Acked-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/20251205232655.445294-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 10:40:48 -08:00
Leon Hwang 2b421662c7 bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags
Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags and check them for
following APIs:

* 'map_lookup_elem()'
* 'map_update_elem()'
* 'generic_map_lookup_batch()'
* 'generic_map_update_batch()'

And, get the correct value size for these APIs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-06 20:48:32 -08:00
Linus Torvalds f0b9d8eb98 nfsd-6.19 fixes:
A set of NFSD fixes that arrived after the 6.19 merge window.
 
 Issues that need expedient stable backports:
 - Remove an invalid NFS status code
 - Fix an fstests failure when using pNFS
 - Fix a UAF in v4_end_grace()
 - Fix the administrative interface used to revoke NFSv4 state
 - Fix a memory leak reported by syzbot
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmlb2WUACgkQM2qzM29m
 f5cqaA/+MbO1kop63/TiNE0tRc34yTBnApg1XVza4vSmcpSgpB8ZKGZ5xOjnRpwg
 yBw9+/puEJhyogPE6JKEGnLiFr+s3ApInFHaxnXnrGZz1RR1qkqfioKudIcpC0s1
 /pKx7y/fktltgo/5Dl0gp2QH3Oytg375ge+dcSQbopSTQYPbsAw7AmoHDPBQd8Nr
 Q/pIu1q/tAM8R2zyijU3eAiUMyYRCrxNVYnlsdYmj7Dn0ypybOyKufkpVCEaS3kO
 a7SV/QSVKdNbZOf8annwAhW+VN4urFmA9nnnr/yirrLJ0i2h18E0txrPFBszhftf
 xpOvaDR7okfEvzqwrHvVfRsqB4nYq9f0TSvvpPsS8vCtq34pWKZPa6iiSxeVL/jb
 EmFtiesUWClZzTIQSpUdbuU80cST6WEoNJJKDPZwF1XbA2navsDqgxKiYxsczjt6
 M5SStHcafK5LrXPruqOhfco/uKTmHNJJlvBWxUGCMQEDvdXdEJ4MIlg8VxxvoWPR
 FQDwU+iSdPOwlG7L3Tl9/PGSNe0MxJSgvzK6JNoKL3LvDx80FtMErWxPJdqdIL0+
 RpBsW7zaCyX9lwD866Frs4K2H1w2XFeQjOMI0Pz1SG9dZ8NoKJ+lzcwVY7GgHUvq
 NUNJLzL6MVCHytwTfqrSY7PGvUCrDqR102FQusQyplT4edcUv0M=
 =okQh
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:
 "A set of NFSD fixes for stable that arrived after the merge window:

   - Remove an invalid NFS status code

   - Fix an fstests failure when using pNFS

   - Fix a UAF in v4_end_grace()

   - Fix the administrative interface used to revoke NFSv4 state

   - Fix a memory leak reported by syzbot"

* tag 'nfsd-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: net ref data still needs to be freed even if net hasn't startup
  nfsd: check that server is running in unlock_filesystem
  nfsd: use correct loop termination in nfsd4_revoke_states()
  nfsd: provide locking for v4_end_grace
  NFSD: Fix permission check for read access to executable-only files
  NFSD: Remove NFSERR_EAGAIN
2026-01-06 09:12:52 -08:00
Jacopo Mondi 22cd0db47f media: uapi: mali-c55-config: Remove version identifier
The Mali C55 driver uses the v4l2-isp framework, which defines its own
versioning number which does not need to be defined again in each
platform-specific header.

Remove the definition of mali_c55_param_buffer_version enumeration from
the Mali C55 uAPI header.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-06 10:14:13 +01:00
David Yat Sin c51bb53d5c drm/amdkfd: Add metadata ring buffer for compute
Add support for separate ring-buffer for metadata packets when using
compute queues. Userspace application allocate the metadata ring-buffer
and the queue ring-buffer with a single allocation. The metadata
ring-buffer starts after the queue ring-buffer.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Chuck Lever c6c209ceb8 NFSD: Remove NFSERR_EAGAIN
I haven't found an NFSERR_EAGAIN in RFCs 1094, 1813, 7530, or 8881.
None of these RFCs have an NFS status code that match the numeric
value "11".

Based on the meaning of the EAGAIN errno, I presume the use of this
status in NFSD means NFS4ERR_DELAY. So replace the one usage of
nfserr_eagain, and remove it from NFSD's NFS status conversion
tables.

As far as I can tell, NFSERR_EAGAIN has existed since the pre-git
era, but was not actually used by any code until commit f4e44b3933
("NFSD: delay unmount source's export after inter-server copy
completed."), at which time it become possible for NFSD to return
a status code of 11 (which is not valid NFS protocol).

Fixes: f4e44b3933 ("NFSD: delay unmount source's export after inter-server copy completed.")
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-01-02 13:43:41 -05:00
Carlos Llamas 40fc797ba1 binder: fix trivial typo in uapi header
As reported by codespell:

  include/uapi/linux/android/binder.h:281: interupted ==> interrupted

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20251215181724.3811977-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-29 11:58:52 +01:00
Thomas Weißschuh 3c4629b68d virtio: uapi: avoid usage of libc types
Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.

On Linux 'unsigned long' works as a replacement for 'uintptr_t' and does
not depend on libc.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251222-uapi-virtio-v1-1-29390f87bcad@linutronix.de>
2025-12-26 15:00:00 -05:00
Al Viro 3dd57ddec9
get rid of bogus __user in struct xattr_args::value
The first member of struct xattr_args is declared as
	__aligned_u64 __user value;
which makes no sense whatsoever; __user is a qualifier and what that
declaration says is "all struct xattr_args instances have .value
_stored_ in user address space, no matter where the rest of the
structure happens to be".

	Something like "int __user *p" stands for "value of p is a pointer
to an instance of int that happens to live in user address space"; it
says nothing about location of p itself, just as const char *p declares a
pointer to unmodifiable char rather than an unmodifiable pointer to char.

	With xattr_args the intent clearly had been "the 64bit value
represents a _pointer_ to object in user address space", but __user has
nothing to do with that.  All it gets us is a couple of bogus warnings
in fs/xattr.c where (userland) instance of xattr_args is copied to local
variable of that type (in kernel address space), followed by access
to its members.  Since we've told sparse that args.value must somehow be
located in userland memory, we get warned that looking at that 64bit
unsigned integer (in a variable already on kernel stack) is not allowed.

	Note that sparse has no way to express "this integer shall never
be cast into a pointer to be dereferenced directly" and I don't see any
way to assign a sane semantics to that.  In any case, __user is not it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://patch.msgid.link/20251216081939.GQ1712166@ZenIV
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-24 13:52:50 +01:00
Shuai Xue 9e541b3cee PCI: trace: Add generic RAS tracepoint for hotplug event
Hotplug events are critical indicators for analyzing hardware health, and
surprise link downs can significantly impact system performance and
reliability.

Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
for hotplug event to help health checks. Add enum pci_hotplug_event in
include/uapi/linux/pci.h so applications like rasdaemon can register
tracepoint event handlers for it.

The following output is generated when a device is hotplugged:

  $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
  $ cat /sys/kernel/debug/tracing/trace_pipe
     irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT

     irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for trace event
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20251210132907.58799-2-xueshuai@linux.alibaba.com
2025-12-23 16:05:56 -06:00
Thomas Weißschuh 98b9f207af dmaengine: idxd: uapi: use UAPI types
Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.

Use the fixed-width integer types provided by the UAPI headers instead.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20251222-uapi-idxd-v1-1-baa183adb20d@linutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-12-23 12:29:14 +05:30
Ryusuke Konishi 6fd8a09f48 nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc
Eliminate the following kernel-doc warnings in nilfs2_api.h:

Warning: include/uapi/linux/nilfs2_api.h:65 cannot understand function
 prototype: 'struct nilfs_suinfo'
Warning: include/uapi/linux/nilfs2_api.h:101 cannot understand function
 prototype: 'struct nilfs_suinfo_update'

This ensures that the documentation for nilfs_suinfo and
nilfs_suinfo_update is correctly parsed and generated by adding the
missing 'struct' keyword to their kernel-doc comments.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-12-22 15:45:29 -08:00
Randy Dunlap cb8fe62f87 nilfs2: convert nilfs_super_block to kernel-doc
Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting
all of the struct member comments to kernel-doc comments.

Fix one misnamed struct member in nilfs_direct_node.

Object files before and after are the same size and content.

Examples of warnings:
Warning: include/uapi/linux/nilfs2_ondisk.h:202 struct member 's_rev_level'
 not described in 'nilfs_super_block'
Warning: include/uapi/linux/nilfs2_ondisk.h:202 struct member
 's_minor_rev_level' not described in 'nilfs_super_block'
Warning: include/uapi/linux/nilfs2_ondisk.h:202 struct member 's_magic'
 not described in 'nilfs_super_block'
Warning: include/uapi/linux/nilfs2_ondisk.h:202 struct member 's_bytes'
 not described in 'nilfs_super_block'
Warning: include/uapi/linux/nilfs2_ondisk.h:202 struct member 's_flags'
 not described in 'nilfs_super_block'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-12-22 15:45:29 -08:00
Linus Torvalds 10a0e846d8 Input updates for v6.19-rc1
- a quirk for i8042 to better handle another TUXEDO model
 
 - a quirk to atkbd to handle incorcet behavior of HONOR FMB-P internal
   keyboard
 
 - a definition for a new ABS_SND_PROFILE event
 
 - fixes to alps and lkkbd drivers to reliably shut down pending work on
   removal
 
 - a fix to apple_z2 driver tightening input report parsing
 
 - a fix for "off-by-one" error when validating config in ti_am335x_tsc
   driver
 
 - addition of CRKD Guitars device IDs to xpad driver.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCaUeyQQAKCRBAj56VGEWX
 nDlZAQCLkGoaHMX6i6Wi2iQIKexRxlERp386vaa2nA4/La3L2wEA1Ce01Ve72v+k
 JyVplQCihOzAxDZ8+c2dGtLSwqR8fQY=
 =ycHD
 -----END PGP SIGNATURE-----

Merge tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - a quirk for i8042 to better handle another TUXEDO model

 - a quirk to atkbd to handle incorcet behavior of HONOR FMB-P internal
   keyboard

 - a definition for a new ABS_SND_PROFILE event

 - fixes to alps and lkkbd drivers to reliably shut down pending work on
   removal

 - a fix to apple_z2 driver tightening input report parsing

 - a fix for "off-by-one" error when validating config in ti_am335x_tsc
   driver

 - addition of CRKD Guitars device IDs to xpad driver.

* tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ti_am335x_tsc - fix off-by-one error in wire_order validation
  Input: xpad - add support for CRKD Guitars
  Input: add ABS_SND_PROFILE
  Input: apple_z2 - fix reading incorrect reports after exiting sleep
  Input: alps - fix use-after-free bugs caused by dev3_register_work
  Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table
  Input: atkbd - skip deactivate for HONOR FMB-P's internal keyboard
  Input: lkkbd - disable pending work before freeing device
2025-12-21 15:21:10 -08:00
Linus Torvalds d8ba32c5a4 block-6.19-20251218
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmlEvbAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphTLD/9Gx4wnirRvzo1Vz77Odhqqbipww/gjDV0G
 HASzF4KczfusTei4NABfuSVA4/LS28csZOYq0ZeDmFVg+YPQcmr4DNw/eJzzi1Xh
 TDmYuf8y7jAJjwWmX0fgVF2uDg3/kp3tWKfBxSFqpqHrZUkOLr8IY3y8/uJYKakm
 r8K9HsS7hyPhx4xdfRQixUil5TgQxBtCgwTX+JRv/K9AMnEVHkBkJ+m4GSmjJ3n8
 tr5lL8fQmIQcsSWqTvMH/m2xeLG7gr9LE33wfu2sCLvNJ4CAYFGT9aMf2F0nGbzK
 Uu5XOyq1rJ8vKWJuDVfFsjOca4fyyvaKX4bqycv89Z7MVzQOXsjbs+4sHFycsfYC
 P+OBmxLb2+rowNsawyzObT9wnd37xptl+tF5nXL0LL6W4PWHmUgnDuM6Vg5zpNv+
 n0nO6y7I0iL9o4hgqN8kUOr5e7IjvJccJbratOA/jVXmAw1I/Gg6HDo+W7rEqsVP
 0hrlE+RuJSRypbCO2pcvviahaDaCWo3gQmp9TZCBmgqgoiI0WStPAgYrG424+69Y
 Hoq5HBmtN4GV3EnU0Ybuawzl2OKtQwwr9DrRpvrxdN9qm4E44CD6eTmE5fEnrTnX
 Rcs5VosBAavPTbl8LFDtj2h6qPak+5VHHVZwimLNdOyCLTNf7D/kiAjmOHRBbmdt
 AopIuvCipA==
 =lk7H
 -----END PGP SIGNATURE-----

Merge tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - ublk selftests for missing coverage

 - two fixes for the block integrity code

 - fix for the newly added newly added PR read keys ioctl, limiting the
   memory that can be allocated

 - work around for a deadlock that can occur with ublk, where partition
   scanning ends up recursing back into file closure, which needs the
   same mutex grabbed. Not the prettiest thing in the world, but an
   acceptable work-around until we can eliminate the reliance on
   disk->open_mutex for this

 - fix for a race between enabling writeback throttling and new IO
   submissions

 - move a bit of bio flag handling code. No changes, but needed for a
   patchset for a future kernel

 - fix for an init time id leak failure in rnbd

 - loop/zloop state check fix

* tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block: validate interval_exp integrity limit
  block: validate pi_offset integrity limit
  block: rnbd-clt: Fix leaked ID in init_dev()
  ublk: fix deadlock when reading partition table
  block: add allocation size check in blkdev_pr_read_keys()
  Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
  zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
  loop: use READ_ONCE() to read lo->lo_state without locking
  block: fix race between wbt_enable_default and IO submission
  selftests: ublk: add user copy test cases
  selftests: ublk: add support for user copy to kublk
  selftests: ublk: forbid multiple data copy modes
  selftests: ublk: don't share backing files between ublk servers
  selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04
  selftests: ublk: fix fio arguments in run_io_and_recover()
  selftests: ublk: remove unused ios map in seq_io.bt
  selftests: ublk: correct last_rw map type in seq_io.bt
  selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
  block: move around bio flagging helpers
2025-12-20 09:48:56 -08:00
Gergo Koteles 733a892422 Input: add ABS_SND_PROFILE
ABS_SND_PROFILE used to describe the state of a multi-value sound profile
switch. This will be used for the alert-slider on OnePlus phones or other
phones.

Profile values added as SND_PROFLE_(SILENT|VIBRATE|RING) identifiers
to input-event-codes.h so they can be used from DTS.

Signed-off-by: Gergo Koteles <soyer@irl.hu>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Tested-by: Guido Günther <agx@sigxcpu.org> # oneplus,fajita & oneplus,enchilada
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Link: https://patch.msgid.link/20251113-op6-tri-state-v8-1-54073f3874bc@ixit.cz
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-12-18 21:34:42 -08:00
Linus Torvalds 7b8e9264f5 Including fixes from netfilter and CAN.
Current release - regressions:
 
   - netfilter: nf_conncount: fix leaked ct in error paths
 
   - sched: act_mirred: fix loop detection
 
   - sctp: fix potential deadlock in sctp_clone_sock()
 
   - can: fix build dependency
 
   - eth: mlx5e: do not update BQL of old txqs during channel reconfiguration
 
 Previous releases - regressions:
 
   - sched: ets: always remove class from active list before deleting it
 
   - inet: frags: flush pending skbs in fqdir_pre_exit()
 
   - netfilter:  nf_nat: remove bogus direction check
 
   - mptcp:
     - schedule rtx timer only after pushing data
     - avoid deadlock on fallback while reinjecting
 
   - can: gs_usb: fix error handling
 
   - eth: mlx5e:
     - avoid unregistering PSP twice
     - fix double unregister of HCA_PORTS component
 
   - eth: bnxt_en: fix XDP_TX path
 
   - eth: mlxsw: fix use-after-free when updating multicast route stats
 
 Previous releases - always broken:
 
   - ethtool: avoid overflowing userspace buffer on stats query
 
   - openvswitch: fix middle attribute validation in push_nsh() action
 
   - eth: mlx5: fw_tracer, validate format string parameters
 
   - eth: mlxsw: spectrum_router: fix neighbour use-after-free
 
   - eth: ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2()
 
 Misc:
 
   - Jozsef Kadlecsik retires from maintaining netfilter
 
   - tools: ynl: fix build on systems with old kernel headers
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmlEPS0SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkHLEP/3TnVI9qLivOGsZ48bn5UUcIaIlX0wiw
 i5gwpUhbtW3zcLSphO0Nh/CmProme6dFhaMOOkk48bKAdIxRpOMbCC20bfYDLyd0
 ZJTUQqheKpuI23vOnhfs2TTOkcqz6cM7txUq681taQ8Nfo8Dbf0fgOO2HT5ljRD+
 JXlxrdvicZDtve68sSdnsAj5M15EPQGKTPMyPkymBmCKnCMQK9SQKaeTEtaW6NIO
 yM0KqDSNoulDc/6LYMhx8DTtyE7yTiTxwe2NixjdyYljXsk0KiRDirCzxnZuSgh8
 oh+cFLdFDq4mwdYEKjgC5c3ifQyLEpZvwzlY5MKoobsVnT5SbigeSK53l5rEkO3V
 sM84lITHfqTJvlid0AF/ixEc6iWwV7nGRBHh2FXNbfoIKt45eF77jPi9YFsq6Z95
 vlCzYIbY0f2L1y3mPvZbzGQbh2Z12b5kyK8QA1j7SK+zNzxgXbf4+ZtYxHg7O3Ne
 gecmIpTKXMWodaZyfsRQPjR/F6UIlMqsgl9Ci9bfUw+XwL8x+7bJxQAQz+yVjLla
 ng14BItiYKBavcPZBjYlhGKqD1fzGhVZqQecrCkF0VTbMusRd9RcwytU1NG4QGDx
 V5aL28ht85KtMednEWOBkrg+PeXnNyZHzLAf2Xtx3UkaGgiDC8G4IxUv3orlOLD1
 sFPfZnSiGCof
 =6pla
 -----END PGP SIGNATURE-----

Merge tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter and CAN.

  Current release - regressions:

   - netfilter: nf_conncount: fix leaked ct in error paths

   - sched: act_mirred: fix loop detection

   - sctp: fix potential deadlock in sctp_clone_sock()

   - can: fix build dependency

   - eth: mlx5e: do not update BQL of old txqs during channel
     reconfiguration

  Previous releases - regressions:

   - sched: ets: always remove class from active list before deleting it

   - inet: frags: flush pending skbs in fqdir_pre_exit()

   - netfilter: nf_nat: remove bogus direction check

   - mptcp:
      - schedule rtx timer only after pushing data
      - avoid deadlock on fallback while reinjecting

   - can: gs_usb: fix error handling

   - eth:
      - mlx5e:
         - avoid unregistering PSP twice
         - fix double unregister of HCA_PORTS component
      - bnxt_en: fix XDP_TX path
      - mlxsw: fix use-after-free when updating multicast route stats

  Previous releases - always broken:

   - ethtool: avoid overflowing userspace buffer on stats query

   - openvswitch: fix middle attribute validation in push_nsh() action

   - eth:
      - mlx5: fw_tracer, validate format string parameters
      - mlxsw: spectrum_router: fix neighbour use-after-free
      - ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2()

  Misc:

   - Jozsef Kadlecsik retires from maintaining netfilter

   - tools: ynl: fix build on systems with old kernel headers"

* tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  net: hns3: add VLAN id validation before using
  net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx
  net: hns3: using the num_tqps in the vf driver to apply for resources
  net: enetc: do not transmit redirected XDP frames when the link is down
  selftests/tc-testing: Test case exercising potential mirred redirect deadlock
  net/sched: act_mirred: fix loop detection
  sctp: Clear inet_opt in sctp_v6_copy_ip_options().
  sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock().
  net/handshake: duplicate handshake cancellations leak socket
  net/mlx5e: Don't include PSP in the hard MTU calculations
  net/mlx5e: Do not update BQL of old txqs during channel reconfiguration
  net/mlx5e: Trigger neighbor resolution for unresolved destinations
  net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
  net/mlx5: Serialize firmware reset with devlink
  net/mlx5: fw_tracer, Handle escaped percent properly
  net/mlx5: fw_tracer, Validate format string parameters
  net/mlx5: Drain firmware reset in shutdown callback
  net/mlx5: fw reset, clear reset requested on drain_fw_reset
  net: dsa: mxl-gsw1xx: manually clear RANEG bit
  net: dsa: mxl-gsw1xx: fix .shutdown driver operation
  ...
2025-12-19 07:55:35 +12:00
Deepanshu Kartikey a58383fa45 block: add allocation size check in blkdev_pr_read_keys()
blkdev_pr_read_keys() takes num_keys from userspace and uses it to
calculate the allocation size for keys_info via struct_size(). While
there is a check for SIZE_MAX (integer overflow), there is no upper
bound validation on the allocation size itself.

A malicious or buggy userspace can pass a large num_keys value that
doesn't trigger overflow but still results in an excessive allocation
attempt, causing a warning in the page allocator when the order exceeds
MAX_PAGE_ORDER.

Fix this by introducing PR_KEYS_MAX to limit the number of keys to
a sane value. This makes the SIZE_MAX check redundant, so remove it.
Also switch to kvzalloc/kvfree to handle larger allocations gracefully.

Fixes: 22a1ffea5f ("block: add IOC_PR_READ_KEYS ioctl")
Tested-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com
Reported-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=660d079d90f8a1baf54d
Link: https://lore.kernel.org/all/20251212013510.3576091-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-17 07:35:22 -07:00
Alexei Starovoitov ec439c3801 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1
Cross-merge BPF and other fixes after downstream PR.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-16 21:29:38 -08:00
Jonathan Kim e83f63da2a drm/amdkfd: allow debug subscription to lds violations on gfx 1250
GFX 1250 allows the debugger to subcribe to LDS out-of-range read/write
memory violations.
Bump IOCTL minor version and flag KFD capabilities for enablement
hint.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16 13:23:17 -05:00
Bhavik Sachdev 0e5032237e
statmount: accept fd as a parameter
Extend `struct mnt_id_req` to take in a fd and introduce STATMOUNT_BY_FD
flag. When a valid fd is provided and STATMOUNT_BY_FD is set, statmount
will return mountinfo about the mount the fd is on.

This even works for "unmounted" mounts (mounts that have been umounted
using umount2(mnt, MNT_DETACH)), if you have access to a file descriptor
on that mount. These "umounted" mounts will have no mountpoint and no
valid mount namespace. Hence, we unset the STATMOUNT_MNT_POINT and
STATMOUNT_MNT_NS_ID in statmount.mask for "unmounted" mounts.

In case of STATMOUNT_BY_FD, given that we already have access to an fd
on the mount, accessing mount information without a capability check
seems fine because of the following reasons:

- All fs related information is available via fstatfs() without any
  capability check.
- Mount information is also available via /proc/pid/mountinfo (without
  any capability check).
- Given that we have access to a fd on the mount which tells us that we
  had access to the mount at some point (or someone that had access gave
  us the fd). So, we should be able to access mount info.

Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
Link: https://patch.msgid.link/20251129091455.757724-3-b.sachdev1904@gmail.com
Acked-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15 14:13:14 +01:00
Linus Torvalds c9b47175e9 i2c-for-6.19-rc1
- general cleanups in bcm2835, designware, pcf8584, and stm32
 - amd-mp2: fix device refcount
 - designware: avoid interrupt storms caused by bad firmware
 - spacemit: fix device detection failures
 - new devices: Intel Diamond Rapids, Rockchip RK3506, Qualcomm Kaanapali
   and MSM8953
 - minor fixes to i801, core documentation, elektor Kconfig dependencies
 
 at24 updates:
 - add new compatible for Belling BL24S64
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmk4MacACgkQFA3kzBSg
 KbajAw/8CMDDB6ENq8aqnRqWCWEcjxcklIJi/6aDsnpUJr7jzf98hmrhlFzYQLqu
 iR6YMo9tt3eGTtv+CDAAABLIqQM5ivx155YSfw8tmfB7OZHpe8hTp72+aNJtJqup
 Rp8vu0vIbdcITE6ohxxJHi+DuBghq3CuZIxridx+ApV3JiuEQlqivvkZS2kz7hS6
 wYaUQWh8fc33HUY05jR1oJPoOyf2LkjNqcwSY3LF70GgiQ4TpMFOwWmKIsoW4u+/
 +i15bI8Vp5Q+DQ/Eu5GgDsSs188BGIY/Ydie50DB229ByetpCUrkjGN3MM8YMAP6
 TJHQtl7a6/zfgU5NCEWKKS1Pf1Pd5cVR8TvZcJ7tFiKCSU5het8QAIYtPqJ3ziJs
 WjRBAnkV6c5NVVzWuGx3wx/5NBQk1S5KLMGmjKqfHHRDdMGLnQYtXtf+rhOw0e+Y
 UjF6xG6JBtKYgbBe7vtJIFGcHXBS2yDePgRCKsmB81BpSC2Eb45Z+Dk7IcjUnTSL
 EqyEFs7lQArh+TMMEuxHfccp/L4um5hvC0WOY6nf5hOrV4H+wAbfpLU8mMtlUhr2
 yrxUFskzOCOjsxixwAtYX90wNhYRMUut0hJvW2ezD75JgFxlHJXUUtOjoyiO2weW
 jws5lXJHRiI77SHBoSLlr2j5W/K1rjvu1VTgis2Jv68ceTTU6pc=
 =+PBF
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:

 - general cleanups in bcm2835, designware, pcf8584, and stm32

 - amd-mp2: fix device refcount

 - designware: avoid interrupt storms caused by bad firmware

 - spacemit: fix device detection failures

 - new devices: Intel Diamond Rapids, Rockchip RK3506, Qualcomm
   Kaanapali and MSM8953

 - minor fixes to i801, core documentation, elektor Kconfig dependencies

 - at24 updates: add new compatible for Belling BL24S64

* tag 'i2c-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (21 commits)
  i2c: qcom-cci: Add msm8953 compatible
  i2c: spacemit: fix detect issue
  i2c: amd-mp2: fix reference leak in MP2 PCI device
  i2c: i2c.h: fix a bad kernel-doc line
  i2c: i2c-elektor: Allow building on SMP kernels
  dt-bindings: i2c: qcom-cci: Document Kaanapali compatible
  dt-bindings: i2c: qcom-cci: Document msm8953 compatible
  dt-bindings: eeprom: at24: Add compatible for Belling BL24S64
  i2c: i801: Fix the Intel Diamond Rapids features
  i2c: pcf8584: Change pcf_doAdress() to pcf_send_address()
  i2c: pcf8584: Make pcf_doAddress() function void
  i2c: pcf8584: Move 'ret' variable inside for loop, goto out if ret < 0.
  i2c: designware: Disable SMBus interrupts to prevent storms from mis-configured firmware
  dt-bindings: i2c: i2c-rk3x: Add compatible string for RK3506
  i2c: i801: Add support for Intel Diamond Rapids
  i2c: stm32: Omit two variable reassignments in stm32_i2c_dma_request()
  i2c: designware: Omit a variable reassignment in dw_i2c_plat_probe()
  i2c: pcf8584: Fix do not use assignment inside if conditional
  i2c: pcf8584: Remove debug macros from i2c-algo-pcf.c
  i2c: busses: bcm2835: convert from round_rate() to determine_rate()
  ...
2025-12-10 07:48:05 +09:00
Matthieu Baerts (NGI0) 0ace3297a7 mptcp: pm: ignore unknown endpoint flags
Before this patch, the kernel was saving any flags set by the userspace,
even unknown ones. This doesn't cause critical issues because the kernel
is only looking at specific ones. But on the other hand, endpoints dumps
could tell the userspace some recent flags seem to be supported on older
kernel versions.

Instead, ignore all unknown flags when parsing them. By doing that, the
userspace can continue to set unsupported flags, but it has a way to
verify what is supported by the kernel.

Note that it sounds better to continue accepting unsupported flags not
to change the behaviour, but also that eases things on the userspace
side by adding "optional" endpoint types only supported by newer kernel
versions without having to deal with the different kernel versions.

A note for the backports: there will be conflicts in mptcp.h on older
versions not having the mentioned flags, the new line should still be
added last, and the '5' needs to be adapted to have the same value as
the last entry.

Fixes: 01cacb00b3 ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-1-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:02 -08:00
Jakub Kicinski e56cadaa27 ynl: add regen hint to new headers
Recent commit 68e83f3472 ("tools: ynl-gen: add regeneration comment")
added a hint how to regenerate the code to the headers. Update
the new headers from this release cycle to also include it.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251207004740.1657799-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:52:43 -08:00
Linus Torvalds 4482ebb297 block-6.19-20251208
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmk3KZsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkNYD/91yqAJeehx2Heq3dWj9L8hDuETQelj/g9j
 gtZCiriAPy+bb1/BmWjK+BmvjtBt+g3a4Cwi6tVj4F1zoE46IPeLhO+2iJTEBiBq
 AhRtEf/MFXFK3qUnTpEnS8w3CtsXejOTB81VQ+6BysSu+B708m/1AQHv2HocZ37R
 jivrzfCsEdBr+ISwYw/EG5KcDBVTFo/JdXIhs7k4Z8bBfa3P5ye4EhKjORtgbFNU
 5nXb78SZoWNCZF143YV++9MpZc3M2jzkzrk1CTLsUHhOxWg4T/6wTXfPGZc/W4m8
 UBhs03u/gMJnKHhlZd4kpZWDito1TQZTdY2f5sBsysRQqeT7bwDK/1xiQ1nllZiP
 oYbeD6t65yMAlELwNFXo7y/DNcS2VLBMvChIX6p1gweEzyf23YneoHYyN5agEQlN
 9C4EdcYzZRt0DwtHlIRtKvDk2LZzkJAcLau3D6ahU/DPLOawyWZKmvGiU+sSyJjF
 bEIO5c/+MLqkAgLAGaFgA4twFF1aYH9ssmJerDxprarkf1jtlOBLvUQ391Gtb5Hd
 B1yugmIgEwLbCFzhk9FlCtv2nQcWRCElnaeqv+Lv+xCBVPGCLm2qIHoTqmvHZPCd
 GbN/h0XLdgUboYPCFWVAX72/4K/cv+fQQcb+a7tiq6vMKcgJ/2I1szFGpFqz7azB
 hyiK0v3x2g==
 =r1xa
 -----END PGP SIGNATURE-----

Merge tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block updates from Jens Axboe:
 "Followup set of fixes and updates for block for the 6.19 merge window.

  NVMe had some late minute debates which lead to dropping some patches
  from that tree, which is why the initial PR didn't have NVMe included.
  It's here now. This pull request contains:

   - NVMe pull request via Keith:
       - Subsystem usage cleanups (Max)
       - Endpoint device fixes (Shin'ichiro)
       - Debug statements (Gerd)
       - FC fabrics cleanups and fixes (Daniel)
       - Consistent alloc API usages (Israel)
       - Code comment updates (Chu)
       - Authentication retry fix (Justin)

   - Fix a memory leak in the discard ioctl code, if the task is being
     interrupted by a signal at just the wrong time

   - Zoned write plugging fixes

   - Add ioctls for for persistent reservations

   - Enable per-cpu bio caching by default

   - Various little fixes and tweaks"

* tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (27 commits)
  nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
  nvme-auth: use kvfree() for memory allocated with kvcalloc()
  nvmet-tcp: use kvcalloc for commands array
  nvmet-rdma: use kvcalloc for commands and responses arrays
  nvme: fix typo error in nvme target
  nvmet-fc: use pr_* print macros instead of dev_*
  nvmet-fcloop: remove unused lsdir member.
  nvmet-fcloop: check all request and response have been processed
  nvme-fc: check all request and response have been processed
  block: fix memory leak in __blkdev_issue_zero_pages
  block: fix comment for op_is_zone_mgmt() to include RESET_ALL
  block: Clear BLK_ZONE_WPLUG_PLUGGED when aborting plugged BIOs
  blk-mq: Abort suspend when wakeup events are pending
  blk-mq: add blk_rq_nr_bvec() helper
  block: add IOC_PR_READ_RESERVATION ioctl
  block: add IOC_PR_READ_KEYS ioctl
  nvme: reject invalid pr_read_keys() num_keys values
  scsi: sd: reject invalid pr_read_keys() num_keys values
  block: enable per-cpu bio cache by default
  block: use bio_alloc_bioset for passthru IO by default
  ...
2025-12-09 08:53:24 +09:00
Linus Torvalds feb06d2690 hyperv-next for v6.19
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmk2b0ITHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXkefCACpUWTK0U0i47hXT+s4aA0T3sq6V3/T
 +su9WnT3GPQ3BuRCRk51w6u9ADYt1EXtu8gRwq/wZiES9PJtz+9DmNuLT8nkkHXH
 exbaRIBAiwLGg6QFC2VpbQzeHLp7qeko0MsLWyMiVPkw+lw9QPqcLKVEWuzPZfOn
 UCkPB+XpzZg9Ft4vKRjXLyUMpwKzkqJw/aiXMfwonuaelcrzLw0hkzO3/I+eKRHv
 JKxaHCwLgrPZyGCJpWtwiLxgu0DKLeDDhj0WSqDz/kUNhjo/GEshLA25UQJUdzI0
 O+tFN9my7SZSYtq7fGoyfo16mAsLaXh0oYuwP8UnR4CDm4UF4JB4QTsM
 =laZR
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - Enhancements to Linux as the root partition for Microsoft Hypervisor:
     - Support a new mode called L1VH, which allows Linux to drive the
       hypervisor running the Azure Host directly
     - Support for MSHV crash dump collection
     - Allow Linux's memory management subsystem to better manage guest
       memory regions
     - Fix issues that prevented a clean shutdown of the whole system on
       bare metal and nested configurations
     - ARM64 support for the MSHV driver
     - Various other bug fixes and cleanups

 - Add support for Confidential VMBus for Linux guest on Hyper-V

 - Secure AVIC support for Linux guests on Hyper-V

 - Add the mshv_vtl driver to allow Linux to run as the secure kernel in
   a higher virtual trust level for Hyper-V

* tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (58 commits)
  mshv: Cleanly shutdown root partition with MSHV
  mshv: Use reboot notifier to configure sleep state
  mshv: Add definitions for MSHV sleep state configuration
  mshv: Add support for movable memory regions
  mshv: Add refcount and locking to mem regions
  mshv: Fix huge page handling in memory region traversal
  mshv: Move region management to mshv_regions.c
  mshv: Centralize guest memory region destruction
  mshv: Refactor and rename memory region handling functions
  mshv: adjust interrupt control structure for ARM64
  Drivers: hv: use kmalloc_array() instead of kmalloc()
  mshv: Add ioctl for self targeted passthrough hvcalls
  Drivers: hv: Introduce mshv_vtl driver
  Drivers: hv: Export some symbols for mshv_vtl
  static_call: allow using STATIC_CALL_TRAMP_STR() from assembly
  mshv: Extend create partition ioctl to support cpu features
  mshv: Allow mappings that overlap in uaddr
  mshv: Fix create memory region overlap check
  mshv: add WQ_PERCPU to alloc_workqueue users
  Drivers: hv: Use kmalloc_array() instead of kmalloc()
  ...
2025-12-09 06:10:17 +09:00
Mario Limonciello 7a5fb05b5b amdkfd: Bump ABI to indicate presence of Trap handler support for expert scheduling
commit 0f0c8a6983db ("drm/amdkfd: Trap handler support for expert
scheduling mode") introduced support for a trap handler when expert
scheduling mode. However userspace needs to know whether or not a trap
handler support is present.

Bump the KFD IOCTL API so that userspace can key off this to decide.

Suggested-by: Stella Laurenzo <stella.laurenzo@amd.com>
Fixes: 4238888794 ("drm/amdkfd: Trap handler support for expert scheduling mode")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 14:23:21 -05:00
Zhu Lingshan cc6b66d661 amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
This commit implemetns a new ioctl AMDKFD_IOC_CREATE_PROCESS
that creates a new secondary kfd_progress on the FD.

To keep backward compatibility, userspace programs need to invoke
this ioctl explicitly on a FD to create a secondary
kfd_process which replacing its primary kfd_process.

This commit bumps ioctl minor version.

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 14:19:34 -05:00
Linus Torvalds 37bb2e7217 Staging driver updates for 6.19-rc1
Here is the big set of staging driver updates for 6.19-rc1.  Only thing
 "major" in here is that two subsystems, gpib and vc04 have moved out of
 the staging tree into the "real" portion of the kernel, which is great
 to see.  Other than that, the rest of the changes are just tiny coding
 style cleanups, nothing earth-shattering.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaTTPBw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylolwCgkT8lVXb5FVjqF9qploVy4HprpigAn2kKHhAc
 wc9RtBK9a+AEsC6dRvSh
 =E/yw
 -----END PGP SIGNATURE-----

Merge tag 'staging-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver updates from Greg KH:
 "Here is the big set of staging driver updates for 6.19-rc1.

  Only thing "major" in here is that two subsystems, gpib and vc04 have
  moved out of the staging tree into the "real" portion of the kernel,
  which is great to see. Other than that, the rest of the changes are
  just tiny coding style cleanups, nothing earth-shattering.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'staging-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (53 commits)
  staging: rtl8723bs: fix out-of-bounds read in OnBeacon ESR IE parsing
  staging: rtl8723bs: fix stack buffer overflow in OnAssocReq IE parsing
  staging: rtl8723bs: fix out-of-bounds read in rtw_get_ie() parser
  staging: gpib: Clean-up commented-out code
  staging: rtl8723bs: remove custom FIELD_OFFSET macro
  staging: rtl8723bs: replace FIELD_OFFSET usage with offsetof in rtw_mlme_ext.c
  staging: rtl8723bs: remove dead commented code from odm.c
  staging: rtl8723bs: use standard offsetof in cfg80211 operations
  staging: rtl8723bs: remove unused registry and BSSID offset macros
  staging: rtl8723bs: core: delete commented-out code
  staging: rtl8723bs: core: fix block comment style issues
  staging: greybus: uart: check return values during probe
  staging: fbtft: core: fix potential memory leak in fbtft_probe_common()
  staging: gpib: Destage gpib
  staging: gpib: Fix SPDX license for gpib headers
  staging: gpib: Update TODO file
  staging: gpib: Change // comments in uapi header file
  platform/raspberrypi: Destage VCHIQ MMAL driver
  platform/raspberrypi: Destage VCHIQ interface
  staging: vc04_services: Cleanup VCHIQ TODO entries
  ...
2025-12-06 18:52:00 -08:00
Linus Torvalds f5e9d31e79 USB/Thunderbolt changes for 6.19-rc1
Here is the big set of USB and Thunderbolt driver updates for 6.19-rc1.
 Nothing major here, just lots of tiny updates for most of the common USB
 drivers.  Included in here are:
   - more xhci driver updates and fixes
   - Thunderbolt driver cleanups
   - usb serial driver updates
   - typec driver updates
   - USB tracepoint additions
   - dwc3 driver updates, including support for Apple hardware
   - lots of other smaller driver updates and cleanups
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaTTLNQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn29wCcD23xCSWuRIKHfSWlefCwptHKT0AAoM0NJdeO
 C4/4pui5ifTu1g5Vud1r
 =Sm/3
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt updates from Greg KH:
 "Here is the big set of USB and Thunderbolt driver updates for
  6.19-rc1. Nothing major here, just lots of tiny updates for most of
  the common USB drivers. Included in here are:

   - more xhci driver updates and fixes

   - Thunderbolt driver cleanups

   - usb serial driver updates

   - typec driver updates

   - USB tracepoint additions

   - dwc3 driver updates, including support for Apple hardware

   - lots of other smaller driver updates and cleanups

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (161 commits)
  usb: gadget: tegra-xudc: Always reinitialize data toggle when clear halt
  USB: serial: option: move Telit 0x10c7 composition in the right place
  USB: serial: option: add Telit Cinterion FE910C04 new compositions
  usb: typec: ucsi: fix use-after-free caused by uec->work
  usb: typec: ucsi: fix probe failure in gaokun_ucsi_probe()
  usb: dwc3: core: Remove redundant comment in core init
  usb: phy: Initialize struct usb_phy list_head
  USB: serial: option: add Foxconn T99W760
  usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.
  usb: typec: hd3ss3220: Enable VBUS based on ID pin state
  dt-bindings: usb: ti,hd3ss3220: Add support for VBUS based on ID state
  usb: typec: anx7411: add WQ_PERCPU to alloc_workqueue users
  USB: add WQ_PERCPU to alloc_workqueue users
  dt-bindings: usb: dwc3-xilinx: Describe the reset constraint for the versal platform
  drivers/usb/storage: use min() instead of min_t()
  usb: raw-gadget: cap raw_io transfer length to KMALLOC_MAX_SIZE
  usb: ohci-da8xx: remove unused platform data
  usb: gadget: functionfs: use dma_buf_unmap_attachment_unlocked() helper
  usb: uas: reduce time under spinlock
  usb: dwc3: eic7700: Add EIC7700 USB driver
  ...
2025-12-06 18:42:12 -08:00
Linus Torvalds 83bd89291f Char/Misc/IIO driver updates for 6.19-rc1
Here is the big set of char/misc/iio driver updates for 6.19-rc1.  Lots
 of stuff in here including:
   - lots of IIO driver updates, cleanups, and additions.
   - large interconnect driver changes as they get converted over to a
     dynamic system of ids
   - coresight driver updates
   - mwave driver updates
   - binder driver updates and changes
   - comedi driver fixes now that the fuzzers are being set loose on them
   - nvmem driver updates
   - new uio driver addition
   - lots of other small char/misc driver updates, full details in the
     shortlog
 
 All of these have been in linux-next for a while now, with no reported
 issues other than a merge conflict with your tree that should be trivial
 to handle (take both sides).
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaTTNDQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykVIACeN0AiTosAtp4CAGe4fAwM7EvbnkQAoNJE5NAx
 Ef31/j1Tq2pCTWt6SVbs
 =AY/e
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc/IIO driver updates from Greg KH:
 "Here is the big set of char/misc/iio driver updates for 6.19-rc1. Lots
  of stuff in here including:

   - lots of IIO driver updates, cleanups, and additions

   - large interconnect driver changes as they get converted over to a
     dynamic system of ids

   - coresight driver updates

   - mwave driver updates

   - binder driver updates and changes

   - comedi driver fixes now that the fuzzers are being set loose on
     them

   - nvmem driver updates

   - new uio driver addition

   - lots of other small char/misc driver updates, full details in the
     shortlog

  All of these have been in linux-next for a while now"

* tag 'char-misc-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (304 commits)
  char: applicom: fix NULL pointer dereference in ac_ioctl
  hangcheck-timer: fix coding style spacing
  hangcheck-timer: Replace %Ld with %lld
  hangcheck-timer: replace printk(KERN_CRIT) with pr_crit
  uio: Add SVA support for PCI devices via uio_pci_generic_sva.c
  dt-bindings: slimbus: fix warning from example
  intel_th: Fix error handling in intel_th_output_open
  misc: rp1: Fix an error handling path in rp1_probe()
  char: xillybus: add WQ_UNBOUND to alloc_workqueue users
  misc: bh1770glc: use pm_runtime_resume_and_get() in power_state_store
  misc: cb710: Fix a NULL vs IS_ERR() check in probe()
  mux: mmio: Add suspend and resume support
  virt: acrn: split acrn_mmio_dev_res out of acrn_mmiodev
  greybus: gb-beagleplay: Fix timeout handling in bootloader functions
  greybus: add WQ_PERCPU to alloc_workqueue users
  char/mwave: drop typedefs
  char/mwave: drop printk wrapper
  char/mwave: remove printk tracing
  char/mwave: remove unneeded fops
  char/mwave: remove MWAVE_FUTZ_WITH_OTHER_DEVICES ifdeffery
  ...
2025-12-06 18:34:24 -08:00
Linus Torvalds 509d3f4584 Significant patch series in this pull request:
- The 6 patch series "panic: sys_info: Refactor and fix a potential
   issue" from Andy Shevchenko fixes a build issue and does some cleanup in
   ib/sys_info.c.
 
 - The 9 patch series "Implement mul_u64_u64_div_u64_roundup()" from
   David Laight enhances the 64-bit math code on behalf of a PWM driver and
   beefs up the test module for these library functions.
 
 - The 2 patch series "scripts/gdb/symbols: make BPF debug info available
   to GDB" from Ilya Leoshkevich makes BPF symbol names, sizes, and line
   numbers available to the GDB debugger.
 
 - The 4 patch series "Enable hung_task and lockup cases to dump system
   info on demand" from Feng Tang adds a sysctl which can be used to cause
   additional info dumping when the hung-task and lockup detectors fire.
 
 - The 6 patch series "lib/base64: add generic encoder/decoder, migrate
   users" from Kuan-Wei Chiu adds a general base64 encoder/decoder to lib/
   and migrates several users away from their private implementations.
 
 - The 2 patch series "rbree: inline rb_first() and rb_last()" from Eric
   Dumazet makes TCP a little faster.
 
 - The 9 patch series "liveupdate: Rework KHO for in-kernel users" from
   Pasha Tatashin reworks the KEXEC Handover interfaces in preparation for
   Live Update Orchestrator (LUO), and possibly for other future clients.
 
 - The 13 patch series "kho: simplify state machine and enable dynamic
   updates" from Pasha Tatashin increases the flexibility of KEXEC
   Handover.  Also preparation for LUO.
 
 - The 18 patch series "Live Update Orchestrator" from Pasha Tatashin is
   a major new feature targeted at cloud environments.  Quoting the [0/N]:
 
     This series introduces the Live Update Orchestrator, a kernel subsystem
     designed to facilitate live kernel updates using a kexec-based reboot.
     This capability is critical for cloud environments, allowing hypervisors
     to be updated with minimal downtime for running virtual machines.  LUO
     achieves this by preserving the state of selected resources, such as
     memory, devices and their dependencies, across the kernel transition.
 
     As a key feature, this series includes support for preserving memfd file
     descriptors, which allows critical in-memory data, such as guest RAM or
     any other large memory region, to be maintained in RAM across the kexec
     reboot.
 
   Mike Rappaport merits a mention here, for his extensive review and
   testing work.
 
 - The 3 patch series "kexec: reorganize kexec and kdump sysfs" from
   Sourabh Jain moves the kexec and kdump sysfs entries from /sys/kernel/
   to /sys/kernel/kexec/ and adds back-compatibility symlinks which can
   hopefully be removed one day.
 
 - The 2 patch series "kho: fixes for vmalloc restoration" from Mike
   Rapoport fixes a BUG which was being hit during KHO restoration of
   vmalloc() regions.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTSAkQAKCRDdBJ7gKXxA
 jrkiAP9QKfsRv46XZaM5raScjY1ayjP+gqb2rgt6BQ/gZvb2+wD/cPAYOR6BiX52
 n0pVpQmG5P/KyOmpLztn96ejL4heKwQ=
 =JY96
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
   fixes a build issue and does some cleanup in ib/sys_info.c

 - "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
   enhances the 64-bit math code on behalf of a PWM driver and beefs up
   the test module for these library functions

 - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
   makes BPF symbol names, sizes, and line numbers available to the GDB
   debugger

 - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
   adds a sysctl which can be used to cause additional info dumping when
   the hung-task and lockup detectors fire

 - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
   adds a general base64 encoder/decoder to lib/ and migrates several
   users away from their private implementations

 - "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
   makes TCP a little faster

 - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
   reworks the KEXEC Handover interfaces in preparation for Live Update
   Orchestrator (LUO), and possibly for other future clients

 - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
   increases the flexibility of KEXEC Handover. Also preparation for LUO

 - "Live Update Orchestrator" (Pasha Tatashin)
   is a major new feature targeted at cloud environments. Quoting the
   cover letter:

      This series introduces the Live Update Orchestrator, a kernel
      subsystem designed to facilitate live kernel updates using a
      kexec-based reboot. This capability is critical for cloud
      environments, allowing hypervisors to be updated with minimal
      downtime for running virtual machines. LUO achieves this by
      preserving the state of selected resources, such as memory,
      devices and their dependencies, across the kernel transition.

      As a key feature, this series includes support for preserving
      memfd file descriptors, which allows critical in-memory data, such
      as guest RAM or any other large memory region, to be maintained in
      RAM across the kexec reboot.

   Mike Rappaport merits a mention here, for his extensive review and
   testing work.

 - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
   moves the kexec and kdump sysfs entries from /sys/kernel/ to
   /sys/kernel/kexec/ and adds back-compatibility symlinks which can
   hopefully be removed one day

 - "kho: fixes for vmalloc restoration" (Mike Rapoport)
   fixes a BUG which was being hit during KHO restoration of vmalloc()
   regions

* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
  calibrate: update header inclusion
  Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
  vmcoreinfo: track and log recoverable hardware errors
  kho: fix restoring of contiguous ranges of order-0 pages
  kho: kho_restore_vmalloc: fix initialization of pages array
  MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
  init: replace simple_strtoul with kstrtoul to improve lpj_setup
  KHO: fix boot failure due to kmemleak access to non-PRESENT pages
  Documentation/ABI: new kexec and kdump sysfs interface
  Documentation/ABI: mark old kexec sysfs deprecated
  kexec: move sysfs entries to /sys/kernel/kexec
  test_kho: always print restore status
  kho: free chunks using free_page() instead of kfree()
  selftests/liveupdate: add kexec test for multiple and empty sessions
  selftests/liveupdate: add simple kexec-based selftest for LUO
  selftests/liveupdate: add userspace API selftests
  docs: add documentation for memfd preservation via LUO
  mm: memfd_luo: allow preserving memfd
  liveupdate: luo_file: add private argument to store runtime state
  mm: shmem: export some functions to internal.h
  ...
2025-12-06 14:01:20 -08:00
Linus Torvalds 249872f53d tsm for 6.19
- Introduce the PCI/TSM core for the coordination of device
   authentication, link encryption and establishment (IDE), and later
   management of the device security operational states (TDISP). Notify
   the new TSM core layer of PCI device arrival and departure.
 
 - Add a low level TSM driver for the link encryption establishment
   capabilities of the AMD SEV-TIO architecture.
 
 - Add a library of helpers TSM drivers to use for IDE establishment and
   the DOE transport.
 
 - Add skeleton support for 'bind' and 'guest_request' operations in
   support of TDISP.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCaTOdAwAKCRDfioYZHlFs
 Z/fWAQDS5mwS/8rn0UdH/SijTm/oKVxdiyIQbTstrjk8AySITgEA5ki9w2iKa0WG
 x1ACZKlo9gS9emyx4wuJpCBIMtR50Qc=
 =B4oG
 -----END PGP SIGNATURE-----

Merge tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm

Pull PCIe Link Encryption and Device Authentication from Dan Williams:
 "New PCI infrastructure and one architecture implementation for PCIe
  link encryption establishment via platform firmware services.

  This work is the result of multiple vendors coming to consensus on
  some core infrastructure (thanks Alexey, Yilun, and Aneesh!), and
  three vendor implementations, although only one is included in this
  pull. The PCI core changes have an ack from Bjorn, the crypto/ccp/
  changes have an ack from Tom, and the iommu/amd/ changes have an ack
  from Joerg.

  PCIe link encryption is made possible by the soup of acronyms
  mentioned in the shortlog below. Link Integrity and Data Encryption
  (IDE) is a protocol for installing keys in the transmitter and
  receiver at each end of a link. That protocol is transported over Data
  Object Exchange (DOE) mailboxes using PCI configuration requests.

  The aspect that makes this a "platform firmware service" is that the
  key provisioning and protocol is coordinated through a Trusted
  Execution Envrionment (TEE) Security Manager (TSM). That is either
  firmware running in a coprocessor (AMD SEV-TIO), or quasi-hypervisor
  software (Intel TDX Connect / ARM CCA) running in a protected CPU
  mode.

  Now, the only reason to ask a TSM to run this protocol and install the
  keys rather than have a Linux driver do the same is so that later, a
  confidential VM can ask the TSM directly "can you certify this
  device?".

  That precludes host Linux from provisioning its own keys, because host
  Linux is outside the trust domain for the VM. It also turns out that
  all architectures, save for one, do not publish a mechanism for an OS
  to establish keys in the root port. So "TSM-established link
  encryption" is the only cross-architecture path for this capability
  for the foreseeable future.

  This unblocks the other arch implementations to follow in v6.20/v7.0,
  once they clear some other dependencies, and it unblocks the next
  phase of work to implement the end-to-end flow of confidential device
  assignment. The PCIe specification calls this end-to-end flow Trusted
  Execution Environment (TEE) Device Interface Security Protocol
  (TDISP).

  In the meantime, Linux gets a link encryption facility which has
  practical benefits along the same lines as memory encryption. It
  authenticates devices via certificates and may protect against
  interposer attacks trying to capture clear-text PCIe traffic.

  Summary:

   - Introduce the PCI/TSM core for the coordination of device
     authentication, link encryption and establishment (IDE), and later
     management of the device security operational states (TDISP).
     Notify the new TSM core layer of PCI device arrival and departure

   - Add a low level TSM driver for the link encryption establishment
     capabilities of the AMD SEV-TIO architecture

   - Add a library of helpers TSM drivers to use for IDE establishment
     and the DOE transport

   - Add skeleton support for 'bind' and 'guest_request' operations in
     support of TDISP"

* tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: (23 commits)
  crypto/ccp: Fix CONFIG_PCI=n build
  virt: Fix Kconfig warning when selecting TSM without VIRT_DRIVERS
  crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)
  iommu/amd: Report SEV-TIO support
  psp-sev: Assign numbers to all status codes and add new
  ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public
  PCI/TSM: Add 'dsm' and 'bound' attributes for dependent functions
  PCI/TSM: Add pci_tsm_guest_req() for managing TDIs
  PCI/TSM: Add pci_tsm_bind() helper for instantiating TDIs
  PCI/IDE: Initialize an ID for all IDE streams
  PCI/IDE: Add Address Association Register setup for downstream MMIO
  resource: Introduce resource_assigned() for discerning active resources
  PCI/TSM: Drop stub for pci_tsm_doe_transfer()
  drivers/virt: Drop VIRT_DRIVERS build dependency
  PCI/TSM: Report active IDE streams
  PCI/IDE: Report available IDE streams
  PCI/IDE: Add IDE establishment helpers
  PCI: Establish document for PCI host bridge sysfs attributes
  PCI: Add PCIe Device 3 Extended Capability enumeration
  PCI/TSM: Establish Secure Sessions and Link Encryption
  ...
2025-12-06 10:15:41 -08:00
Linus Torvalds a7405aa92f dma-mapping updates for Linux 6.19:
- next part of DMA mapping API refactoring to physical addresses as the primary
 interface instead of page+offset parameters; this time dma_map_ops callbacks
 are converted to physical addresses, what in turn results also in some
 simplification of architecture specific code (Leon Romanovsky and Jason
 Gunthorpe)
 - clarify that dma_map_benchmark is not a kernel self-test, but standalone
 tool (Qinxin Xia)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSrngzkoBtlA8uaaJ+Jp1EFxbsSRAUCaTLpeQAKCRCJp1EFxbsS
 RKlAAQCo/gVslheoJ+h4Hk5oUjLfVQaiamJOlzxw12EVHVAs3AEA7GILIAL1GbGn
 EpkgwjIuz/J/4aGhNbYO6J+C1qMAHww=
 =aM6P
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping updates from Marek Szyprowski:

 - More DMA mapping API refactoring to physical addresses as the primary
   interface instead of page+offset parameters.

   This time dma_map_ops callbacks are converted to physical addresses,
   what in turn results also in some simplification of architecture
   specific code (Leon Romanovsky and Jason Gunthorpe)

 - Clarify that dma_map_benchmark is not a kernel self-test, but
   standalone tool (Qinxin Xia)

* tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-mapping: remove unused map_page callback
  xen: swiotlb: Convert mapping routine to rely on physical address
  x86: Use physical address for DMA mapping
  sparc: Use physical address DMA mapping
  powerpc: Convert to physical address DMA mapping
  parisc: Convert DMA map_page to map_phys interface
  MIPS/jazzdma: Provide physical address directly
  alpha: Convert mapping routine to rely on physical address
  dma-mapping: remove unused mapping resource callbacks
  xen: swiotlb: Switch to physical address mapping callbacks
  ARM: dma-mapping: Switch to physical address mapping callbacks
  ARM: dma-mapping: Reduce struct page exposure in arch_sync_dma*()
  dma-mapping: convert dummy ops to physical address mapping
  dma-mapping: prepare dma_map_ops to conversion to physical address
  tools/dma: move dma_map_benchmark from selftests to tools/dma
2025-12-06 09:25:05 -08:00
Linus Torvalds f19b84186d [GIT PULL for v6.19] media updates
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmkykUMACgkQCF8+vY7k
 4RWyJxAAgYSDiC5NVO5P+G5kadsdq2YS/eulN9A/MapwvdPxNcTO58A0B3lcjoN5
 0ECrNxrXTqzDNjjgAqmE50YWp6DNDEZqzA6KggJ3Xx6bpXm0w+fq27G7JxQX9jog
 5/u0mE59YZIJKoj3lFfHVkETr9IblrkWrb+SFZ663dLkbC2/eyboItv90EWOs4RF
 ykC1aSKCIClifCwvk/NF1iybkU/TaKkrSMEwgGfZHPFBNsUhu7Di3wjaQikxEECv
 qw113lfjhzaPXIy97LC62lCezIs3j2QF5wxq1KL7HKvzBJxHb+OvKnNtfCx/ST8k
 PTLxZiT3pgoK/ITz35i3TqORSVqPzhNPUy0XJocyuq0MtFsoM11uDEPXvZF42Ylx
 8tw+9+v7x3A4tM/v48rT7gkMmMAyahqv48Y1mZS7ZiyfnrC3HsysKJxNJoy8muof
 xT4mt8EdfPUl6BYa8sdeCcyXJyPE+cfc3W9wfJuRSCn/IoONO9gq7maZO+4pvJlB
 T4gixRplhEcgt3Opi0RHNMEsO/QzvjW9jtB9Q9gZrkl49FFuqLUcEqvXU3rbEncb
 C2YJ+UEwqy8XiFTSCZg5PecPYmVu72EJZnP6PVnvDbmpROjlCFnkvENKV4sGY/t/
 WFcTpTRLryCmlzBbUubnLW2SApHgwxwKlqXXYB+92cpxAh1Q3Us=
 =VbRk
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media kernel-doc fix from Mauro Carvalho Chehab:
 "A fix to shut up a kernel-doc warning on c3-isp driver"

* tag 'media/v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: uapi: c3-isp: Fix documentation warning
2025-12-05 20:17:50 -08:00
Linus Torvalds 51d90a15fe ARM:
- Support for userspace handling of synchronous external aborts (SEAs),
   allowing the VMM to potentially handle the abort in a non-fatal
   manner.
 
 - Large rework of the VGIC's list register handling with the goal of
   supporting more active/pending IRQs than available list registers in
   hardware. In addition, the VGIC now supports EOImode==1 style
   deactivations for IRQs which may occur on a separate vCPU than the
   one that acked the IRQ.
 
 - Support for FEAT_XNX (user / privileged execute permissions) and
   FEAT_HAF (hardware update to the Access Flag) in the software page
   table walkers and shadow MMU.
 
 - Allow page table destruction to reschedule, fixing long need_resched
   latencies observed when destroying a large VM.
 
 - Minor fixes to KVM and selftests
 
 Loongarch:
 
 - Get VM PMU capability from HW GCFG register.
 
 - Add AVEC basic support.
 
 - Use 64-bit register definition for EIOINTC.
 
 - Add KVM timer test cases for tools/selftests.
 
 RISC/V:
 
 - SBI message passing (MPXY) support for KVM guest
 
 - Give a new, more specific error subcode for the case when in-kernel
   AIA virtualization fails to allocate IMSIC VS-file
 
 - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
   in small chunks
 
 - Fix guest page fault within HLV* instructions
 
 - Flush VS-stage TLB after VCPU migration for Andes cores
 
 s390:
 
 - Always allocate ESCA (Extended System Control Area), instead of
   starting with the basic SCA and converting to ESCA with the
   addition of the 65th vCPU.  The price is increased number of
   exits (and worse performance) on z10 and earlier processor;
   ESCA was introduced by z114/z196 in 2010.
 
 - VIRT_XFER_TO_GUEST_WORK support
 
 - Operation exception forwarding support
 
 - Cleanups
 
 x86:
 
 - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE
   caching is disabled, as there can't be any relevant SPTEs to zap.
 
 - Relocate a misplaced export.
 
 - Fix an async #PF bug where KVM would clear the completion queue when the
   guest transitioned in and out of paging mode, e.g. when handling an SMI and
   then returning to paged mode via RSM.
 
 - Leave KVM's user-return notifier registered even when disabling
   virtualization, as long as kvm.ko is loaded.  On reboot/shutdown, keeping
   the notifier registered is ok; the kernel does not use the MSRs and the
   callback will run cleanly and restore host MSRs if the CPU manages to
   return to userspace before the system goes down.
 
 - Use the checked version of {get,put}_user().
 
 - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC
   timers can result in a hard lockup in the host.
 
 - Revert the periodic kvmclock sync logic now that KVM doesn't use a
   clocksource that's subject to NTP corrections.
 
 - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter
   behind CONFIG_CPU_MITIGATIONS.
 
 - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path;
   the only reason they were handled in the fast path was to paper of a bug
   in the core #MC code, and that has long since been fixed.
 
 - Add emulator support for AVX MOV instructions, to play nice with emulated
   devices whose guest drivers like to access PCI BARs with large multi-byte
   instructions.
 
 x86 (AMD):
 
 - Fix a few missing "VMCB dirty" bugs.
 
 - Fix the worst of KVM's lack of EFER.LMSLE emulation.
 
 - Add AVIC support for addressing 4k vCPUs in x2AVIC mode.
 
 - Fix incorrect handling of selective CR0 writes when checking intercepts
   during emulation of L2 instructions.
 
 - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on
   VMRUN and #VMEXIT.
 
 - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft
   interrupt if the guest patched the underlying code after the VM-Exit, e.g.
   when Linux patches code with a temporary INT3.
 
 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to
   userspace, and extend KVM "support" to all policy bits that don't require
   any actual support from KVM.
 
 x86 (Intel):
 
 - Use the root role from kvm_mmu_page to construct EPTPs instead of the
   current vCPU state, partly as worthwhile cleanup, but mostly to pave the
   way for tracking per-root TLB flushes, and elide EPT flushes on pCPU
   migration if the root is clean from a previous flush.
 
 - Add a few missing nested consistency checks.
 
 - Rip out support for doing "early" consistency checks via hardware as the
   functionality hasn't been used in years and is no longer useful in general;
   replace it with an off-by-default module param to WARN if hardware fails
   a check that KVM does not perform.
 
 - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32]
   on VM-Enter.
 
 - Misc cleanups.
 
 - Overhaul the TDX code to address systemic races where KVM (acting on behalf
   of userspace) could inadvertantly trigger lock contention in the TDX-Module;
   KVM was either working around these in weird, ugly ways, or was simply
   oblivious to them (though even Yan's devilish selftests could only break
   individual VMs, not the host kernel)
 
 - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU,
   if creating said vCPU failed partway through.
 
 - Fix a few sparse warnings (bad annotation, 0 != NULL).
 
 - Use struct_size() to simplify copying TDX capabilities to userspace.
 
 - Fix a bug where TDX would effectively corrupt user-return MSR values if the
   TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected.
 
 Selftests:
 
 - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM.
 
 - Forcefully override ARCH from x86_64 to x86 to play nice with specifying
   ARCH=x86_64 on the command line.
 
 - Extend a bunch of nested VMX to validate nested SVM as well.
 
 - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to
   verify KVM can save/restore nested VMX state when L1 is using 5-level
   paging, but L2 is not.
 
 - Clean up the guest paging code in anticipation of sharing the core logic for
   nested EPT and nested NPT.
 
 guest_memfd:
 
 - Add NUMA mempolicy support for guest_memfd, and clean up a variety of
   rough edges in guest_memfd along the way.
 
 - Define a CLASS to automatically handle get+put when grabbing a guest_memfd
   from a memslot to make it harder to leak references.
 
 - Enhance KVM selftests to make it easer to develop and debug selftests like
   those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs
   often result in hard-to-debug SIGBUS errors.
 
 - Misc cleanups.
 
 Generic:
 
 - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for
   irqfd cleanup.
 
 - Fix a goof in the dirty ring documentation.
 
 - Fix choice of target for directed yield across different calls to
   kvm_vcpu_on_spin(); the function was always starting from the first
   vCPU instead of continuing the round-robin search.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmkvMa8UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMlFwf+Ow7zOYUuELSQ+Jn+hOYXiCNrdBDx
 ZamvMU8kLPr7XX0Zog6HgcMm//qyA6k5nSfqCjfsQZrIhRA/gWJ61jz1OX/Jxq18
 pJ9Vz6epnEPYiOtBwz+v8OS8MqDqVNzj2i6W1/cLPQE50c1Hhw64HWS5CSxDQiHW
 A7PVfl5YU12lW1vG3uE0sNESDt4Eh/spNM17iddXdF4ZUOGublserjDGjbc17E7H
 8BX3DkC2plqkJKwtjg0ae62hREkITZZc7RqsnftUkEhn0N0H9+rb6NKUyzIVh9NZ
 bCtCjtrKN9zfZ0Mujnms3ugBOVqNIputu/DtPnnFKXtXWSrHrgGSNv5ewA==
 =PEcw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Support for userspace handling of synchronous external aborts
     (SEAs), allowing the VMM to potentially handle the abort in a
     non-fatal manner

   - Large rework of the VGIC's list register handling with the goal of
     supporting more active/pending IRQs than available list registers
     in hardware. In addition, the VGIC now supports EOImode==1 style
     deactivations for IRQs which may occur on a separate vCPU than the
     one that acked the IRQ

   - Support for FEAT_XNX (user / privileged execute permissions) and
     FEAT_HAF (hardware update to the Access Flag) in the software page
     table walkers and shadow MMU

   - Allow page table destruction to reschedule, fixing long
     need_resched latencies observed when destroying a large VM

   - Minor fixes to KVM and selftests

  Loongarch:

   - Get VM PMU capability from HW GCFG register

   - Add AVEC basic support

   - Use 64-bit register definition for EIOINTC

   - Add KVM timer test cases for tools/selftests

  RISC/V:

   - SBI message passing (MPXY) support for KVM guest

   - Give a new, more specific error subcode for the case when in-kernel
     AIA virtualization fails to allocate IMSIC VS-file

   - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
     in small chunks

   - Fix guest page fault within HLV* instructions

   - Flush VS-stage TLB after VCPU migration for Andes cores

  s390:

   - Always allocate ESCA (Extended System Control Area), instead of
     starting with the basic SCA and converting to ESCA with the
     addition of the 65th vCPU. The price is increased number of exits
     (and worse performance) on z10 and earlier processor; ESCA was
     introduced by z114/z196 in 2010

   - VIRT_XFER_TO_GUEST_WORK support

   - Operation exception forwarding support

   - Cleanups

  x86:

   - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
     SPTE caching is disabled, as there can't be any relevant SPTEs to
     zap

   - Relocate a misplaced export

   - Fix an async #PF bug where KVM would clear the completion queue
     when the guest transitioned in and out of paging mode, e.g. when
     handling an SMI and then returning to paged mode via RSM

   - Leave KVM's user-return notifier registered even when disabling
     virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
     keeping the notifier registered is ok; the kernel does not use the
     MSRs and the callback will run cleanly and restore host MSRs if the
     CPU manages to return to userspace before the system goes down

   - Use the checked version of {get,put}_user()

   - Fix a long-lurking bug where KVM's lack of catch-up logic for
     periodic APIC timers can result in a hard lockup in the host

   - Revert the periodic kvmclock sync logic now that KVM doesn't use a
     clocksource that's subject to NTP corrections

   - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
     latter behind CONFIG_CPU_MITIGATIONS

   - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
     path; the only reason they were handled in the fast path was to
     paper of a bug in the core #MC code, and that has long since been
     fixed

   - Add emulator support for AVX MOV instructions, to play nice with
     emulated devices whose guest drivers like to access PCI BARs with
     large multi-byte instructions

  x86 (AMD):

   - Fix a few missing "VMCB dirty" bugs

   - Fix the worst of KVM's lack of EFER.LMSLE emulation

   - Add AVIC support for addressing 4k vCPUs in x2AVIC mode

   - Fix incorrect handling of selective CR0 writes when checking
     intercepts during emulation of L2 instructions

   - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32]
     on VMRUN and #VMEXIT

   - Fix a bug where KVM corrupt the guest code stream when re-injecting
     a soft interrupt if the guest patched the underlying code after the
     VM-Exit, e.g. when Linux patches code with a temporary INT3

   - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
     to userspace, and extend KVM "support" to all policy bits that
     don't require any actual support from KVM

  x86 (Intel):

   - Use the root role from kvm_mmu_page to construct EPTPs instead of
     the current vCPU state, partly as worthwhile cleanup, but mostly to
     pave the way for tracking per-root TLB flushes, and elide EPT
     flushes on pCPU migration if the root is clean from a previous
     flush

   - Add a few missing nested consistency checks

   - Rip out support for doing "early" consistency checks via hardware
     as the functionality hasn't been used in years and is no longer
     useful in general; replace it with an off-by-default module param
     to WARN if hardware fails a check that KVM does not perform

   - Fix a currently-benign bug where KVM would drop the guest's
     SPEC_CTRL[63:32] on VM-Enter

   - Misc cleanups

   - Overhaul the TDX code to address systemic races where KVM (acting
     on behalf of userspace) could inadvertantly trigger lock contention
     in the TDX-Module; KVM was either working around these in weird,
     ugly ways, or was simply oblivious to them (though even Yan's
     devilish selftests could only break individual VMs, not the host
     kernel)

   - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
     TDX vCPU, if creating said vCPU failed partway through

   - Fix a few sparse warnings (bad annotation, 0 != NULL)

   - Use struct_size() to simplify copying TDX capabilities to userspace

   - Fix a bug where TDX would effectively corrupt user-return MSR
     values if the TDX Module rejects VP.ENTER and thus doesn't clobber
     host MSRs as expected

  Selftests:

   - Fix a math goof in mmu_stress_test when running on a single-CPU
     system/VM

   - Forcefully override ARCH from x86_64 to x86 to play nice with
     specifying ARCH=x86_64 on the command line

   - Extend a bunch of nested VMX to validate nested SVM as well

   - Add support for LA57 in the core VM_MODE_xxx macro, and add a test
     to verify KVM can save/restore nested VMX state when L1 is using
     5-level paging, but L2 is not

   - Clean up the guest paging code in anticipation of sharing the core
     logic for nested EPT and nested NPT

  guest_memfd:

   - Add NUMA mempolicy support for guest_memfd, and clean up a variety
     of rough edges in guest_memfd along the way

   - Define a CLASS to automatically handle get+put when grabbing a
     guest_memfd from a memslot to make it harder to leak references

   - Enhance KVM selftests to make it easer to develop and debug
     selftests like those added for guest_memfd NUMA support, e.g. where
     test and/or KVM bugs often result in hard-to-debug SIGBUS errors

   - Misc cleanups

  Generic:

   - Use the recently-added WQ_PERCPU when creating the per-CPU
     workqueue for irqfd cleanup

   - Fix a goof in the dirty ring documentation

   - Fix choice of target for directed yield across different calls to
     kvm_vcpu_on_spin(); the function was always starting from the first
     vCPU instead of continuing the round-robin search"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
  ...
2025-12-05 17:01:20 -08:00
Amery Hung b5709f6d26 bpf: Support associating BPF program with struct_ops
Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
a BPF program with a struct_ops map. This command takes a file
descriptor of a struct_ops map and a BPF program and set
prog->aux->st_ops_assoc to the kdata of the struct_ops map.

The command does not accept a struct_ops program nor a non-struct_ops
map. Programs of a struct_ops map is automatically associated with the
map during map update. If a program is shared between two struct_ops
maps, prog->aux->st_ops_assoc will be poisoned to indicate that the
associated struct_ops is ambiguous. The pointer, once poisoned, cannot
be reset since we have lost track of associated struct_ops. For other
program types, the associated struct_ops map, once set, cannot be
changed later. This restriction may be lifted in the future if there is
a use case.

A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
the associated struct_ops pointer. The returned pointer, if not NULL, is
guaranteed to be valid and point to a fully updated struct_ops struct.
For struct_ops program reused in multiple struct_ops map, the return
will be NULL.

prog->aux->st_ops_assoc is protected by bumping the refcount for
non-struct_ops programs and RCU for struct_ops programs. Since it would
be inefficient to track programs associated with a struct_ops map, every
non-struct_ops program will bump the refcount of the map to make sure
st_ops_assoc stays valid. For a struct_ops program, it is protected by
RCU as map_free will wait for an RCU grace period before disassociating
the program with the map. The helper must be called in BPF program
context or RCU read-side critical section.

struct_ops implementers should note that the struct_ops returned may not
be initialized nor attached yet. The struct_ops implementer will be
responsible for tracking and checking the state of the associated
struct_ops map if the use case expects an initialized or attached
struct_ops.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20251203233748.668365-3-ameryhung@gmail.com
2025-12-05 16:17:57 -08:00
Linus Torvalds 4b9d25b4d3 vfs-6.19-rc1.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaTLXbgAKCRCRxhvAZXjc
 og6oAP9QnawWV/g0vFt8O+jODb/JE54A6Vdb3Ai5Pt+yEB6iVgD/Reu8OhB694tO
 6j8j+lfPV35Z432VirfeKU7Vd4GUSwg=
 =HMln
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix a type conversion bug in the ipc subsystem

 - Fix per-dentry timeout warning in autofs

 - Drop the fd conversion from sockets

 - Move assert from iput_not_last() to iput()

 - Fix reversed check in filesystems_freeze_callback()

 - Use proper uapi types for new struct delegation definitions

* tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  vfs: use UAPI types for new struct delegation definition
  mqueue: correct the type of ro to int
  Revert "net/socket: convert sock_map_fd() to FD_ADD()"
  autofs: fix per-dentry timeout warning
  fs: assert on I_FREEING not being set in iput() and iput_not_last()
  fs: PM: Fix reverse check in filesystems_freeze_callback()
2025-12-05 15:52:30 -08:00
Naman Jain 7bfe3b8ea6 Drivers: hv: Introduce mshv_vtl driver
Provide an interface for Virtual Machine Monitor like OpenVMM and its
use as OpenHCL paravisor to control VTL0 (Virtual trust Level).
Expose devices and support IOCTLs for features like VTL creation,
VTL0 memory management, context switch, making hypercalls,
mapping VTL0 address space to VTL2 userspace, getting new VMBus
messages and channel events in VTL2 etc.

Co-developed-by: Roman Kisel <romank@linux.microsoft.com>
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05 23:16:26 +00:00
Thomas Weißschuh fe93446b5e
vfs: use UAPI types for new struct delegation definition
Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.

Use the fixed-width integer types provided by the UAPI headers instead.

Fixes: 1602bad16d ("vfs: expose delegation support to userland")
Fixes: 4be9e04ebf ("vfs: add needed headers for new struct delegation definition")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20251203-uapi-fcntl-v1-1-490c67bf3425@linutronix.de
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-05 13:57:39 +01:00
Linus Torvalds bc69ed9752 virtio,vhost: fixes, cleanups
Just a bunch of fixes and cleanups, mostly very simple. Several
 features are merged through net-next this time around.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmkvRoEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpKK4H/2VhNPqr3OL3RZJsHv6iJ6PQJU0JYRfqEH6B
 x1nOPEsTt34ypLfSqP15DWyOeshTENknnnYuZ8Gzyiim9R/l+2OyMLW0bsCqdfV8
 L/FyDJPMW3WmfesMlUUMKRwFEzFolj+SrlH+Us9UUd1wKbC1ctHJ+4Ov5hw5z5/e
 wEVeIwf/WcAU0CyqolTF72+0BSIU7ml0wrzxBp05hkeDnugVmhAskQpb5p0SNI/d
 u4Zfvuz9s1Qq9TGpwWaibWTBQ3F6Hbyw3Pr+ojwVSoCufnVDQ/FuCZfe607lv7Dh
 YZuHc4PtFRfLgmwNuHSnPv7S7fp00kHAeEGbPpKyh09UjDtqC+A=
 =lXGm
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "Just a bunch of fixes and cleanups, mostly very simple. Several
  features were merged through net-next this time around"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_pci: drop kernel.h
  vhost: switch to arrays of feature bits
  vhost/test: add test specific macro for features
  virtio: clean up features qword/dword terms
  vduse: add WQ_PERCPU to alloc_workqueue users
  virtio_balloon: add WQ_PERCPU to alloc_workqueue users
  vdpa/pds: use %pe for ERR_PTR() in event handler registration
  vhost: Fix kthread worker cgroup failure handling
  virtio: vdpa: Fix reference count leak in octep_sriov_enable()
  vdpa/mlx5: Fix incorrect error code reporting in query_virtqueues
  virtio: fix map ops comment
  virtio: fix virtqueue_set_affinity() docs
  virtio: standardize Returns documentation style
  virtio: fix grammar in virtio_map_ops docs
  virtio: fix grammar in virtio_queue_info docs
  virtio: fix whitespace in virtio_config_ops
  virtio: fix typo in virtio_device_ready() comment
  virtio: fix kernel-doc for mapping/free_coherent functions
  virtio_vdpa: fix misleading return in void function
2025-12-04 18:59:21 -08:00
Linus Torvalds 056daec292 iommufd 6.19 pull request
- Expand IOMMU_IOAS_MAP_FILE to accept a DMABUF exported from VFIO. This
   is the first step to broader DMABUF support in iommufd, right now it
   only works with VFIO. This closes the last functional gap with classic
   VFIO type 1 to safely support PCI peer to peer DMA by mapping the VFIO
   device's MMIO into the IOMMU.
 
 - Relax SMMUv3 restrictions on nesting domains to better support qemu's
   sequence to have an identity mapping before the vSID is established.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaS8DrgAKCRCFwuHvBreF
 YY/kAP0Q1s7LkVb83uQb8kIW3xKzEnFNTlhrSSGV5UBuYLbaDgD+J+y+4VrSkJem
 85LMipmzaoZdHqtxMhQWrlYbZMr9TAM=
 =BacK
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This is a pretty consequential cycle for iommufd, though this pull is
  not too big. It is based on a shared branch with VFIO that introduces
  VFIO_DEVICE_FEATURE_DMA_BUF a DMABUF exporter for VFIO device's MMIO
  PCI BARs. This was a large multiple series journey over the last year
  and a half.

  Based on that work IOMMUFD gains support for VFIO DMABUF's in its
  existing IOMMU_IOAS_MAP_FILE, which closes the last major gap to
  support PCI peer to peer transfers within VMs.

  In Joerg's iommu tree we have the "generic page table" work which aims
  to consolidate all the duplicated page table code in every iommu
  driver into a single algorithm. This will be used by iommufd to
  implement unique page table operations to start adding new features
  and improve performance.

  In here:

   - Expand IOMMU_IOAS_MAP_FILE to accept a DMABUF exported from VFIO.
     This is the first step to broader DMABUF support in iommufd, right
     now it only works with VFIO. This closes the last functional gap
     with classic VFIO type 1 to safely support PCI peer to peer DMA by
     mapping the VFIO device's MMIO into the IOMMU.

   - Relax SMMUv3 restrictions on nesting domains to better support
     qemu's sequence to have an identity mapping before the vSID is
     established"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommu/arm-smmu-v3-iommufd: Allow attaching nested domain for GBPA cases
  iommufd/selftest: Add some tests for the dmabuf flow
  iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
  iommufd: Have iopt_map_file_pages convert the fd to a file
  iommufd: Have pfn_reader process DMABUF iopt_pages
  iommufd: Allow MMIO pages in a batch
  iommufd: Allow a DMABUF to be revoked
  iommufd: Do not map/unmap revoked DMABUFs
  iommufd: Add DMABUF to iopt_pages
  vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
2025-12-04 18:50:11 -08:00
Linus Torvalds a3ebb59eee VFIO updates for v6.19-rc1
- Move libvfio selftest artifacts in preparation of more tightly
    coupled integration with KVM selftests. (David Matlack)
 
  - Fix comment typo in mtty driver. (Chu Guangqing)
 
  - Support for new hardware revision in the hisi_acc vfio-pci variant
    driver where the migration registers can now be accessed via the PF.
    When enabled for this support, the full BAR can be exposed to the
    user. (Longfang Liu)
 
  - Fix vfio cdev support for VF token passing, using the correct size
    for the kernel structure, thereby actually allowing userspace to
    provide a non-zero UUID token.  Also set the match token callback for
    the hisi_acc, fixing VF token support for this this vfio-pci variant
    driver. (Raghavendra Rao Ananta)
 
  - Introduce internal callbacks on vfio devices to simplify and
    consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO
    data, removing various ioctl intercepts with a more structured
    solution. (Jason Gunthorpe)
 
  - Introduce dma-buf support for vfio-pci devices, allowing MMIO regions
    to be exposed through dma-buf objects with lifecycle managed through
    move operations.  This enables low-level interactions such as a
    vfio-pci based SPDK drivers interacting directly with dma-buf capable
    RDMA devices to enable peer-to-peer operations.  IOMMUFD is also now
    able to build upon this support to fill a long standing feature gap
    versus the legacy vfio type1 IOMMU backend with an implementation of
    P2P support for VM use cases that better manages the lifecycle of the
    P2P mapping. (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)
 
  - Convert eventfd triggering for error and request signals to use RCU
    mechanisms in order to avoid a 3-way lockdep reported deadlock issue.
    (Alex Williamson)
 
  - Fix a 32-bit overflow introduced via dma-buf support manifesting with
    large DMA buffers. (Alex Mastro)
 
  - Convert nvgrace-gpu vfio-pci variant driver to insert mappings on
    fault rather than at mmap time.  This conversion serves both to make
    use of huge PFNMAPs but also to both avoid corrected RAS events
    during reset by now being subject to vfio-pci-core's use of
    unmap_mapping_range(), and to enable a device readiness test after
    reset. (Ankit Agrawal)
 
  - Refactoring of vfio selftests to support multi-device tests and split
    code to provide better separation between IOMMU and device objects.
    This work also enables a new test suite addition to measure parallel
    device initialization latency. (David Matlack)
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmkvV3IRHGFsZXhAc2hh
 emJvdC5vcmcACgkQI5ubbjuwiyIpIQ/9GwpjLH5Vdv0v2d9mkHmZIWFpG/tr3zJa
 +spQqOjO0etASc67PtIJArT9pWib+s6O8OaG7iFrdNR65HCSsXSZbIGbMThPODfy
 DdDj1ipAqMVwcaCZT8un2N8Sktu9YpFQMvc5IoXWWYhw88vili7bBx+OTrEFV2T0
 6qQijSBdhw1TXVFHG6BGSmqmisyMepIebA6GmPWdfYu6BfoWBYMdcMjDwd1J61Q5
 DDwFRzn/Dz2Tvb1jbXiiRMRuFIuegFQii+wtd30S/cRPFZhZLWzc+drimC6oOFiQ
 qL19vQQsBPnLtGvch40HsET/AbY5w0pLCkYX5qacxP3sq27+N+KuotzCvbnVMN+H
 e2BqOCujyoce8z1Br6BzV71Lr2yzPDcc5pXTuEuuBT+J/ptOY8hfEikOj85s5Wzj
 aKsTrdDRGMrn/o11NkGSzYwFcMs9MxCX9mo98U6OkWDr0+cmPLf4LGZgpJudWg4E
 POUlzPpnzJrTlX5d+OqCdKJG0a1hTlTa2udzRa5hCDANHaZWLoAssfgSEKfV9xt1
 PzOMf0UIJmPJmFcw/OpMO72/5xp8O4WslJS0ulSm6vrAJDtutLApHZ7bJ44KniNd
 4vte+gOjyZY8ibTDKRULhXVlCDxkEnZjRBbApgI9HJD61IElOzjqohRuRx77J09B
 7c8OSLI8d1U=
 =tpee
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Move libvfio selftest artifacts in preparation of more tightly
   coupled integration with KVM selftests (David Matlack)

 - Fix comment typo in mtty driver (Chu Guangqing)

 - Support for new hardware revision in the hisi_acc vfio-pci variant
   driver where the migration registers can now be accessed via the PF.
   When enabled for this support, the full BAR can be exposed to the
   user (Longfang Liu)

 - Fix vfio cdev support for VF token passing, using the correct size
   for the kernel structure, thereby actually allowing userspace to
   provide a non-zero UUID token. Also set the match token callback for
   the hisi_acc, fixing VF token support for this this vfio-pci variant
   driver (Raghavendra Rao Ananta)

 - Introduce internal callbacks on vfio devices to simplify and
   consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO
   data, removing various ioctl intercepts with a more structured
   solution (Jason Gunthorpe)

 - Introduce dma-buf support for vfio-pci devices, allowing MMIO regions
   to be exposed through dma-buf objects with lifecycle managed through
   move operations. This enables low-level interactions such as a
   vfio-pci based SPDK drivers interacting directly with dma-buf capable
   RDMA devices to enable peer-to-peer operations. IOMMUFD is also now
   able to build upon this support to fill a long standing feature gap
   versus the legacy vfio type1 IOMMU backend with an implementation of
   P2P support for VM use cases that better manages the lifecycle of the
   P2P mapping (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)

 - Convert eventfd triggering for error and request signals to use RCU
   mechanisms in order to avoid a 3-way lockdep reported deadlock issue
   (Alex Williamson)

 - Fix a 32-bit overflow introduced via dma-buf support manifesting with
   large DMA buffers (Alex Mastro)

 - Convert nvgrace-gpu vfio-pci variant driver to insert mappings on
   fault rather than at mmap time. This conversion serves both to make
   use of huge PFNMAPs but also to both avoid corrected RAS events
   during reset by now being subject to vfio-pci-core's use of
   unmap_mapping_range(), and to enable a device readiness test after
   reset (Ankit Agrawal)

 - Refactoring of vfio selftests to support multi-device tests and split
   code to provide better separation between IOMMU and device objects.
   This work also enables a new test suite addition to measure parallel
   device initialization latency (David Matlack)

* tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio: (65 commits)
  vfio: selftests: Add vfio_pci_device_init_perf_test
  vfio: selftests: Eliminate INVALID_IOVA
  vfio: selftests: Split libvfio.h into separate header files
  vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c
  vfio: selftests: Rename vfio_util.h to libvfio.h
  vfio: selftests: Stop passing device for IOMMU operations
  vfio: selftests: Move IOVA allocator into iova_allocator.c
  vfio: selftests: Move IOMMU library code into iommu.c
  vfio: selftests: Rename struct vfio_dma_region to dma_region
  vfio: selftests: Upgrade driver logging to dev_err()
  vfio: selftests: Prefix logs with device BDF where relevant
  vfio: selftests: Eliminate overly chatty logging
  vfio: selftests: Support multiple devices in the same container/iommufd
  vfio: selftests: Introduce struct iommu
  vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode
  vfio: selftests: Allow passing multiple BDFs on the command line
  vfio: selftests: Split run.sh into separate scripts
  vfio: selftests: Move run.sh into scripts directory
  vfio/nvgrace-gpu: wait for the GPU mem to be ready
  vfio/nvgrace-gpu: Inform devmem unmapped after reset
  ...
2025-12-04 18:42:48 -08:00
Linus Torvalds d7aa60d966 [GIT PULL for v6.19] media updates
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmktWxIACgkQCF8+vY7k
 4RXGuQ/9EhcMfiHQ8fu1/AIPDZl5WGMNkHaBi84Y6Q9CroxdZwNsX8NQtk0caVyQ
 nDyshL3APBTeMOiAxxxkIe/LMNBJMpah5Dprg3Lk/G6+4soEDHhD5+PRJFwGhgYQ
 qa6Z8TpABxvGPNBVhyLgPnPznFsiRc9HzXYxQc+7mQENzioUWaRW93510Jq7vutl
 wmkBI5TQVGOU3wlc4tDTpkQcMy2CMZygRwUXcDXfw3sC+OxWUSB6EsDKoQ6Y30s7
 VqRibO48eQK1k9L39xGs14yEZeIyj0sBRXZ2cCvdQWpZUoDy4UERLepqpw2iF4CU
 Cr15VEp7cdB8RF0d/Y3AQBS08KXsyiR4x1TfGqiCllfP0DjX+AXA0aW8m1ofAAhq
 CJ8wR13oOVsDWeYBO39nYp4SG5y9B9BslXXFS6HCN52txlTL10+W1kpSJnzGzMo9
 oBXAD0TNyko+MVB9OBMkdQB8h8UDA5Y4jDR1OLwWGCGhO6TTYcFbERdy1gvLsz76
 ARsC4uF/EU5rS92UvcE6FPp6E0RRXCuOUFa9vqBO8XiL1Z+vft5IUIeqlj5v4dKf
 67WwXmmOcm5e59l9m6lf4qkYnILgQQr5rFd20e1F2QpFl1QrncEVJJdDU2+akD8c
 h/n6FhrkFrGXFzeqfmd/ZMvF8uU3u5j/fyt5wNEkc28fVz7WPxw=
 =y7ej
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - New drivers:
    - Mali-C55 ISP
    - Rockchip VICAP (RKCIF)
    - RKVDEC HEVC Decoder
    - Renesas RZV2H IVC
    - Sony IMX111 CMOS sensor driver

 - Removed STi C8SECTPFE Driver

 - Added a V4L2 ISP generic framework

 - Usual set of cleanup, fixes and driver improvements

* tag 'media/v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (249 commits)
  media: rockchip: rkcif: add support for rk3568 vicap mipi capture
  media: rockchip: rkcif: add support for rk3568 vicap dvp capture
  media: rockchip: rkcif: add support for px30 vip dvp capture
  media: rockchip: rkcif: add abstraction for dma blocks
  media: rockchip: rkcif: add abstraction for interface and crop blocks
  media: rockchip: add driver for the rockchip camera interface
  media: dt-bindings: add rockchip rk3568 vicap
  media: dt-bindings: add rockchip px30 vip
  media: dt-bindings: video-interfaces: add defines for sampling modes
  Documentation: admin-guide: media: add rockchip camera interface
  media: mali-c55: Mark pm handlers as __maybe_unused
  media: mali-c55: Assert ISP blocks size correctness
  media: v4l2-isp: Rename block_info to block_type_info
  MAINTAINERS: Add entry for rzv2h-ivc driver
  media: platform: Add Renesas Input Video Control block driver
  dt-bindings: media: Add bindings for the RZ/V2H(P) IVC block
  Documentation: media: mali-c55: Document the mali-c55 parameter setting
  media: platform: Add mali-c55 parameters video node
  media: uapi: Add parameters structs to mali-c55-config.h
  media: mali-c55: Add image formats for Mali-C55 parameters buffer
  ...
2025-12-04 08:15:19 -08:00
Stefan Hajnoczi 3e2cb9ee76 block: add IOC_PR_READ_RESERVATION ioctl
Add a Persistent Reservations ioctl to read the current reservation.
This calls the pr_ops->read_reservation() function that was previously
added in commit c787f1baa5 ("block: Add PR callouts for read keys and
reservation") but was only used by the in-kernel SCSI target so far.

The IOC_PR_READ_RESERVATION ioctl is necessary so that userspace
applications that rely on Persistent Reservations ioctls have a way of
inspecting the current state. Cluster managers and validation tests need
this functionality.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-04 07:19:26 -07:00
Stefan Hajnoczi 22a1ffea5f block: add IOC_PR_READ_KEYS ioctl
Add a Persistent Reservations ioctl to read the list of currently
registered reservation keys. This calls the pr_ops->read_keys() function
that was previously added in commit c787f1baa5 ("block: Add PR
callouts for read keys and reservation") but was only used by the
in-kernel SCSI target so far.

The IOC_PR_READ_KEYS ioctl is necessary so that userspace applications
that rely on Persistent Reservations ioctls have a way of inspecting the
current state. Cluster managers and validation tests need this
functionality.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-04 07:19:26 -07:00
Linus Torvalds 7696286034 for-6.19-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmko/DUACgkQxWXV+ddt
 WDtyCw//UaFOTX/k72HgA1n2MWfegWbWyD+OGNbGosoZljKOrAe/mRnjXTyF9lyW
 8GDzGvJzF4Tkl5lyuGyiequlrO2F3veTpwHo94xnBTOYCeiTpMqTN/e/5SkasBpN
 4YlWq7OGYR4hwghRvZpaW7nsmVCKDLIlZVkH77x9Bmvx0NLO24EJlEZusQT4zYew
 ntC/i9x3DW0ZxYyfRhFIFvk9JUUdgXfxJ6dNexz0zi3dKUSUIR9hI0J9Nwl++1cF
 SgjAzbtO064htWoCvsKykgA6YGbJCZjw8XO8D2eJonkN24VbqSMaY44TPXmCMLVs
 ZXw871jV2E/urfWhRNdxv/kJdCFudPk0qXG5ZtfHO4UUwS/nZ+qAig+LHawgAOCJ
 9CgWy4zrfiYCqULRuqF1wzWu/z22++zIlZC552VAZd1RQ+JjqJY/aje4xhY5nUF4
 n1uVBReZaI9sH3jJOsMWpwLMptbhpH9RZp3QPgqZlUHo6GtPJJmNKfw8KgMAhZ7L
 wf7iy6v9yo+7VZ2ACwu2qJ+lZRxbZ0yvCnFatN3O5G1O0kkIrZFUM3MwdKtufZ0u
 LHWkGfoaq7zR6E6DhIaxIhiTTXMlOfLTikNKgBUO3NEdrRZwrDhr7K07S25jFxSx
 ZCNV6OdSCeziShPqT0ntcwecnJ41/kOcm13732NHF+QgzMK5LrI=
 =rO4x
 -----END PGP SIGNATURE-----

Merge tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "Features:

   - shutdown ioctl support (needs CONFIG_BTRFS_EXPERIMENTAL for now):
      - set filesystem state as being shut down (also named going down
        in other filesystems), where all active operations return EIO
        and this cannot be changed until unmount
      - pending operations are attempted to be finished but error
        messages may still show up depending on where exactly the
        shutdown happened

   - scrub (and device replace) vs suspend/hibernate:
      - a running scrub will prevent suspend, which can be annoying as
        suspend is an immediate request and scrub is not critical
      - filesystem freezing before suspend was not sufficient as the
        problem was in process freezing
      - behaviour change: on suspend scrub and device replace are
        cancelled, where scrub can record the last state and continue
        from there; the device replace has to be restarted from the
        beginning

   - zone stats exported in sysfs, from the perspective of the
     filesystem this includes active, reclaimable, relocation etc zones

  Performance:

   - improvements when processing space reservation tickets by
     optimizing locking and shrinking critical sections, cumulative
     improvements in lockstat numbers show +15%

  Notable fixes:

   - use vmalloc fallback when allocating bios as high order allocations
     can happen with wide checksums (like sha256)

   - scrub will always track the last position of progress so it's not
     starting from zero after an error

  Core:

   - under experimental config, checksum calculations are offloaded to
     process context, simplifies locking and allows to remove
     compression write worker kthread(s):
      - speed improvement in direct IO throughput with buffered IO
        fallback is +15% when not offloaded but this is more related to
        internal crypto subsystem improvements
      - this will be probably default in the future removing the sysfs
        tunable

   - (experimental) block size > page size updates:
      - support more operations when not using large folios (encoded
        read/write and send)
      - raid56

   - more preparations for fscrypt support

  Other:

   - more conversions to auto-cleaned variables

   - parameter cleanups and removals

   - extended warning fixes

   - improved printing of structured values like keys

   - lots of other cleanups and refactoring"

* tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
  btrfs: remove unnecessary inode key in btrfs_log_all_parents()
  btrfs: remove redundant zero/NULL initializations in btrfs_alloc_root()
  btrfs: remaining BTRFS_PATH_AUTO_FREE conversions
  btrfs: send: do not allocate memory for xattr data when checking it exists
  btrfs: send: add unlikely to all unexpected overflow checks
  btrfs: reduce arguments to btrfs_del_inode_ref_in_log()
  btrfs: remove root argument from btrfs_del_dir_entries_in_log()
  btrfs: use test_and_set_bit() in btrfs_delayed_delete_inode_ref()
  btrfs: don't search back for dir inode item in INO_LOOKUP_USER
  btrfs: don't rewrite ret from inode_permission
  btrfs: add orig_logical to btrfs_bio for encryption
  btrfs: disable verity on encrypted inodes
  btrfs: disable various operations on encrypted inodes
  btrfs: remove redundant level reset in btrfs_del_items()
  btrfs: simplify leaf traversal after path release in btrfs_next_old_leaf()
  btrfs: optimize balance_level() path reference handling
  btrfs: factor out root promotion logic into promote_child_to_root()
  btrfs: raid56: remove the "_step" infix
  btrfs: raid56: enable bs > ps support
  btrfs: raid56: prepare finish_parity_scrub() to support bs > ps cases
  ...
2025-12-03 20:03:46 -08:00
Linus Torvalds cc25df3e2e for-6.19/block-20251201
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmktsoMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpuiUD/92eivL+HmOh10o8trvxajB0yuyqfSjHHrL
 g+xUbF4s9bgAg/v+Upx7sTY8jdrTcMjKov+G9T6uPvBMqVmeVdZckA1PSAKQaIX1
 Zb7nS2LnO7F6JKbwpwVrrIaqVbcz8MfGIIMbN4yNNEOMCwdIVMp4fo7trPBknJNx
 WddNSGUFlIF3NqSI8AflSS/pYnGm+McfBHXBpJAKipI3iquKKubHv+FX9kLp7Tn4
 x27ZoCWOHglIBTJXU0mmXCVsLF8b5BA8DQcGtT62azb8+l0cRTkaHY0DFAv5BvhG
 TqcjrKdmR0cGSNt+nEmFrujE3atBRl0G0kiHA80YgA1MTtYzdPaUVOUtM9k/rEem
 gpiGMDpBypdxyJAyijPSaVJdfcg0psOlYbhIR4N2wbj/dq8268h+cWzXlF1spgVt
 /7ygoaCmfMNbTy9rKThTjH+es787AVXUAXXaPHhIFsnCKUj8xQl4pT7XltmgYeWx
 1/XD1NEJeLHHog5upAVlGX3H5tbvP1nIICxbZa9mDOJX1rwxxI7/s/RucPjbNXuY
 AiaKPTfxtB9+Ihd2HrJ/76RVMkckcOBc4GIKoFfwuKDbcdLXQ5FcZCmVRoI1V9SV
 KsH7JBgihLwR9XWKE1vp9+CBNe1Qlu3K4IjG/E7CNLeuDntIBu73ihqGP/DqV6Bq
 RX1Dc0OyAQ==
 =m22w
 -----END PGP SIGNATURE-----

Merge tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block updates from Jens Axboe:

 - Fix head insertion for mq-deadline, a regression from when priority
   support was added

 - Series simplifying and improving the ublk user copy code

 - Various ublk related cleanups

 - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the
   request is punted to a thread for handling

 - Merge and then later revert loop dio nowait support, as it ended up
   causing excessive stack usage for when the inline issue code needs to
   dip back into the full file system code

 - Improve auto integrity code, making it less deadlock prone

 - Speedup polled IO handling, but manually managing the hctx lookups

 - Fixes for blk-throttle for SSD devices

 - Small series with fixes for the S390 dasd driver

 - Add support for caching zones, avoiding unnecessary report zone
   queries

 - MD pull requests via Yu:
      - fix null-ptr-dereference regression for dm-raid0
      - fix IO hang for raid5 when array is broken with IO inflight
      - remove legacy 1s delay to speed up system shutdown
      - change maintainer's email address
      - data can be lost if array is created with different lbs devices,
        fix this problem and record lbs of the array in metadata
      - fix rcu protection for md_thread
      - fix mddev kobject lifetime regression
      - enable atomic writes for md-linear
      - some cleanups

 - bcache updates via Coly
      - remove useless discard and cache device code
      - improve usage of per-cpu workqueues

 - Reorganize the IO scheduler switching code, fixing some lockdep
   reports as well

 - Improve the block layer P2P DMA support

 - Add support to the block tracing code for zoned devices

 - Segment calculation improves, and memory alignment flexibility
   improvements

 - Set of prep and cleanups patches for ublk batching support. The
   actual batching hasn't been added yet, but helps shrink down the
   workload of getting that patchset ready for 6.20

 - Fix for how the ps3 block driver handles segments offsets

 - Improve how block plugging handles batch tag allocations

 - nbd fixes for use-after-free of the configuration on device clear/put

 - Set of improvements and fixes for zloop

 - Add Damien as maintainer of the block zoned device code handling

 - Various other fixes and cleanups

* tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
  block/rnbd: correct all kernel-doc complaints
  blk-mq: use queue_hctx in blk_mq_map_queue_type
  md: remove legacy 1s delay in md_notify_reboot
  md/raid5: fix IO hang when array is broken with IO inflight
  md: warn about updating super block failure
  md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid
  sbitmap: fix all kernel-doc warnings
  ublk: add helper of __ublk_fetch()
  ublk: pass const pointer to ublk_queue_is_zoned()
  ublk: refactor auto buffer register in ublk_dispatch_req()
  ublk: add `union ublk_io_buf` with improved naming
  ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg()
  kfifo: add kfifo_alloc_node() helper for NUMA awareness
  blk-mq: fix potential uaf for 'queue_hw_ctx'
  blk-mq: use array manage hctx map instead of xarray
  ublk: prevent invalid access with DEBUG
  s390/dasd: Use scnprintf() instead of sprintf()
  s390/dasd: Move device name formatting into separate function
  s390/dasd: Remove unnecessary debugfs_create() return checks
  s390/dasd: Fix gendisk parent after copy pair swap
  ...
2025-12-03 19:26:18 -08:00
Linus Torvalds 0abcfd8983 for-6.19/io_uring-20251201
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmktsm0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpiLvD/0dptgeJyLHKchOtRHzi/UvtM/EuNFKJrvI
 LBWCyIMjygxsVfPR41Lave9SE3UpcavF8Mg/EddasTci8VlMcDF8zPxWLb289Lz2
 tkp/wOVuyYmDhNXKmKNW59NOPTd0NosEJFTZI4VhMudwx+UtAHELJGfBWW5hRyQB
 Md+UwZ2+J9HbYd19mToaDFxz7jpIPLEE4BYUGtljveRUdpnxhyFGGUS2+CQXZt/5
 lnRvJmmEv4nSGH9ZRksix1xnV6KvJM0UwYQhrWvXhgwyiKu47zG7ONpd39KqoaRw
 Fw+6zZd0t7nyyuZkk15cKNnBLnjilnsCzmdcPq0Cuvkmbf6y1hlhEQQTGWXTKfJx
 zCZxEZcnCC4wL0CBQjZjS38AEMfH2p76M/36+NTWtlYCibY7qUtd9ndpUr49BYGo
 o4qfT0HMpI1PHuUvpZwpMcf4OX5qvtLmavT9vt78uqmtM+Aryzzuy3bI3S2SGjNe
 if/cNHnZc8Z06hUqdEit5NW+lYzj642AoF/j7qH9ADDH+VXRWaCdK/iI8tPaEpDV
 Rw6j442eVugS5tDPoTjdO8jsJ9+OCNNV1t/Jxy+Or+zrGdq7lfg4mnzEia1/izy5
 8MnSubRy6LEd+I5PnK/9y9mPIwFMIFgULi+mUjucAhJjRj5beiG74eR6+jBAdyp1
 GhFvN6fwdw==
 =4g/f
 -----END PGP SIGNATURE-----

Merge tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring updates from Jens Axboe:

 - Unify how task_work cancelations are detected, placing it in the
   task_work running state rather than needing to check the task state

 - Series cleaning up and moving the cancelation code to where it
   belongs, in cancel.c

 - Cleanup of waitid and futex argument handling

 - Add support for mixed sized SQEs. 6.18 added support for mixed sized
   CQEs, improving flexibility and efficiency of workloads that need big
   CQEs. This adds similar support for SQEs, where the occasional need
   for a 128b SQE doesn't necessitate having all SQEs be 128b in size

 - Introduce zcrx and SQ/CQ layout queries. The former returns what zcrx
   features are available. And both return the ring size information to
   help with allocation size calculation for user provided rings like
   IORING_SETUP_NO_MMAP and IORING_MEM_REGION_TYPE_USER

 - Zcrx updates for 6.19. It includes a bunch of small patches,
   IORING_REGISTER_ZCRX_CTRL and RQ flushing and David's work on sharing
   zcrx b/w multiple io_uring instances

 - Series cleaning up ring initializations, notable deduplicating ring
   size and offset calculations. It also moves most of the checking
   before doing any allocations, making the code simpler

 - Add support for getsockname and getpeername, which is mostly a
   trivial hookup after a bit of refactoring on the networking side

 - Various fixes and cleanups

* tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring: Introduce getsockname io_uring cmd
  socket: Split out a getsockname helper for io_uring
  socket: Unify getsockname and getpeername implementation
  io_uring/query: drop unused io_handle_query_entry() ctx arg
  io_uring/kbuf: remove obsolete buf_nr_pages and update comments
  io_uring/register: use correct location for io_rings_layout
  io_uring/zcrx: share an ifq between rings
  io_uring/zcrx: add io_fill_zcrx_offsets()
  io_uring/zcrx: export zcrx via a file
  io_uring/zcrx: move io_zcrx_scrub() and dependencies up
  io_uring/zcrx: count zcrx users
  io_uring/zcrx: add sync refill queue flushing
  io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL
  io_uring/zcrx: elide passing msg flags
  io_uring/zcrx: use folio_nr_pages() instead of shift operation
  io_uring/zcrx: convert to use netmem_desc
  io_uring/query: introduce rings info query
  io_uring/query: introduce zcrx query
  io_uring: move cq/sq user offset init around
  io_uring: pre-calculate scq layout
  ...
2025-12-03 18:58:57 -08:00
Linus Torvalds 8f7aa3d3c7 Networking changes for 6.19.
Core & protocols
 ----------------
 
  - Replace busylock at the Tx queuing layer with a lockless list. Resulting
    in a 300% (4x) improvement on heavy TX workloads, sending twice the
    number of packets per second, for half the cpu cycles.
 
  - Allow constantly busy flows to migrate to a more suitable CPU/NIC
    queue. Normally we perform queue re-selection when flow comes out
    of idle, but under extreme circumstances the flows may be constantly
    busy. Add sysctl to allow periodic rehashing even if it'd risk packet
    reordering.
 
  - Optimize the NAPI skb cache, make it larger, use it in more paths.
 
  - Attempt returning Tx skbs to the originating CPU (like we already did
    for Rx skbs).
 
  - Various data structure layout and prefetch optimizations from Eric.
 
  - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly
    quite expensive on recent AMD machines.
 
  - Extend threaded NAPI polling to allow the kthread busy poll for packets.
 
  - Make MPTCP use Rx backlog processing. This lowers the lock pressure,
    improving the Rx performance.
 
  - Support memcg accounting of MPTCP socket memory.
 
  - Allow admin to opt sockets out of global protocol memory accounting
    (using a sysctl or BPF-based policy). The global limits are a poor fit
    for modern container workloads, where limits are imposed using cgroups.
 
  - Improve heuristics for when to kick off AF_UNIX garbage collection.
 
  - Allow users to control TCP SACK compression, and default to 33% of RTT.
 
  - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily
    aggressive rcvbuf growth and overshot when the connection RTT is low.
 
  - Preserve skb metadata space across skb_push / skb_pull operations.
 
  - Support for IPIP encapsulation in the nftables flowtable offload.
 
  - Support appending IP interface information to ICMP messages (RFC 5837).
 
  - Support setting max record size in TLS (RFC 8449).
 
  - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL.
 
  - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock.
 
  - Let users configure the number of write buffers in SMC.
 
  - Add new struct sockaddr_unsized for sockaddr of unknown length,
    from Kees.
 
  - Some conversions away from the crypto_ahash API, from Eric Biggers.
 
  - Some preparations for slimming down struct page.
 
  - YAML Netlink protocol spec for WireGuard.
 
  - Add a tool on top of YAML Netlink specs/lib for reporting commonly
    computed derived statistics and summarized system state.
 
 Driver API
 ----------
 
  - Add CAN XL support to the CAN Netlink interface.
 
  - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics,
    as defined by the OPEN Alliance's "Advanced diagnostic features
    for 100BASE-T1 automotive Ethernet PHYs" specification.
 
  - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x).
 
  - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec
    and performs RSS.
 
  - Add info to devlink params whether the current setting is the default
    or a user override. Allow resetting back to default.
 
  - Add standard device stats for PSP crypto offload.
 
  - Leverage DSA frame broadcast to implement simple HSR frame duplication
    for a lot of switches without dedicated HSR offload.
 
  - Add uAPI defines for 1.6Tbps link modes.
 
 Device drivers
 --------------
 
  - Add Motorcomm YT921x gigabit Ethernet switch support.
 
  - Add MUCSE driver for N500/N210 1GbE NIC series.
 
  - Convert drivers to support dedicated ops for timestamping control,
    and away from the direct IOCTL handling. While at it support GET
    operations for PHY timestamping.
 
  - Add (and convert most drivers to) a dedicated ethtool callback
    for reading the Rx ring count.
 
  - Significant refactoring efforts in the STMMAC driver, which supports
    Synopsys turn-key MAC IP integrated into a ton of SoCs.
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - support PPS in/out on all pins
    - Intel (100G, ice, idpf):
      - ice: implement standard ethtool and timestamping stats
      - i40e: support setting the max number of MAC addresses per VF
      - iavf: support RSS of GTP tunnels for 5G and LTE deployments
    - nVidia/Mellanox (mlx5):
      - reduce downtime on interface reconfiguration
      - disable being an XDP redirect target by default (same as other
        drivers) to avoid wasting resources if feature is unused
    - Meta (fbnic):
      - add support for Linux-managed PCS on 25G, 50G, and 100G links
    - Wangxun:
      - support Rx descriptor merge, and Tx head writeback
      - support Rx coalescing offload
      - support 25G SPF and 40G QSFP modules
 
  - Ethernet virtual:
    - Google (gve):
      - allow ethtool to configure rx_buf_len
      - implement XDP HW RX Timestamping support for DQ descriptor format
    - Microsoft vNIC (mana):
      - support HW link state events
      - handle hardware recovery events when probing the device
 
  - Ethernet NICs consumer, and embedded:
    - usbnet: add support for Byte Queue Limits (BQL)
    - AMD (amd-xgbe):
      - add device selftests
    - NXP (enetc):
      - add i.MX94 support
    - Broadcom integrated MACs (bcmgenet, bcmasp):
      - bcmasp: add support for PHY-based Wake-on-LAN
    - Broadcom switches (b53):
      - support port isolation
      - support BCM5389/97/98 and BCM63XX ARL formats
    - Lantiq/MaxLinear switches:
      - support bridge FDB entries on the CPU port
      - use regmap for register access
      - allow user to enable/disable learning
      - support Energy Efficient Ethernet
      - support configuring RMII clock delays
      - add tagging driver for MaxLinear GSW1xx switches
    - Synopsys (stmmac):
      - support using the HW clock in free running mode
      - add Eswin EIC7700 support
      - add Rockchip RK3506 support
      - add Altera Agilex5 support
    - Cadence (macb):
      - cleanup and consolidate descriptor and DMA address handling
      - add EyeQ5 support
    - TI:
      - icssg-prueth: support AF_XDP
    - Airoha access points:
      - add missing Ethernet stats and link state callback
      - add AN7583 support
      - support out-of-order Tx completion processing
    - Power over Ethernet:
      - pd692x0: preserve PSE configuration across reboots
      - add support for TPS23881B devices
 
  - Ethernet PHYs:
    - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support
    - Support 50G SerDes and 100G interfaces in Linux-managed PHYs
    - micrel:
      - support for non PTP SKUs of lan8814
      - enable in-band auto-negotiation on lan8814
    - realtek:
      - cable testing support on RTL8224
      - interrupt support on RTL8221B
    - motorcomm: support for PHY LEDs on YT853
    - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag
    - mscc: support for PHY LED control
 
  - CAN drivers:
    - m_can: add support for optional reset and system wake up
    - remove can_change_mtu() obsoleted by core handling
    - mcp251xfd: support GPIO controller functionality
 
  - Bluetooth:
    - add initial support for PASTa
 
  - WiFi:
    - split ieee80211.h file, it's way too big
    - improvements in VHT radiotap reporting, S1G, Channel Switch
      Announcement handling, rate tracking in mesh networks
    - improve multi-radio monitor mode support, and add a cfg80211 debugfs
      interface for it
    - HT action frame handling on 6 GHz
    - initial chanctx work towards NAN
    - MU-MIMO sniffer improvements
 
  - WiFi drivers:
    - RealTek (rtw89):
      - support USB devices RTL8852AU and RTL8852CU
      - initial work for RTL8922DE
      - improved injection support
    - Intel:
      - iwlwifi: new sniffer API support
    - MediaTek (mt76):
      - WED support for >32-bit DMA
      - airoha NPU support
      - regdomain improvements
      - continued WiFi7/MLO work
    - Qualcomm/Atheros:
      - ath10k: factory test support
      - ath11k: TX power insertion support
      - ath12k: BSS color change support
      - ath12k: statistics improvements
    - brcmfmac: Acer A1 840 tablet quirk
    - rtl8xxxu: 40 MHz connection fixes/support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkveRQACgkQMUZtbf5S
 IrvY7A/+Nb0o4BxLHjPkAl1m3t3q2d0Y29B7SNkwnwEtxAV8EkNeZ3GWrdtDnTQY
 MYhmc7LEzvz8/lihapr7UJkcokzSASUV54hbez5jDBKC8EEoyUk8FdWDPerwlcRI
 zmCFNAVFyh9GX8i7wcrzKbDTHT5+GZLbSlGl9U5mhLsDdRlJgH7d8PJ7vWcmtLFY
 XN0paDyaeHfCl8wReWNAYx4C/I0ODOvlscpO0tnAKhB0ngJbQCKY2t6tn3rOYdif
 ZSQ5KwVRnJtQ4fYOFMOy9+FSCjVXtyrxF8KLxD+mqom2ZhmO00UpOMl09tqhq3uT
 WnvwoHUVBt6F+iITHwg5kMgIDPUq1kpUvL4S4UbVSuUm9ZKD+4KRU2ZHRBYMx+MU
 bsqmtY8/IULClUoRz+tZhltA8eb0NEqNZE2JPOFDiJHn1YiCCkFwxibhir893oM3
 sB7x65D7LQI2ty2BBGVGYnwYDPtyaxOA/s3WTwPvLEi3+Y/TGNIIrS9lBLA4U+Yr
 Gi93WQGVjttMmVyaHgXBUGmi3L52hvolm0AZ8zSRGrnIEpecjhly2KfYuaOzuxXC
 IHEQ6AFLdRh6JzafXGb/mQwGCHNmhwsY8A49i94fakWQamaL/L6A+1dyPu4LXMqi
 NwqCmlVb/LKGlfNG+V4wT27srJ+yBA2Vk3tpR1sZQQytFh0LKHI=
 =UoDR
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Replace busylock at the Tx queuing layer with a lockless list.

     Resulting in a 300% (4x) improvement on heavy TX workloads, sending
     twice the number of packets per second, for half the cpu cycles.

   - Allow constantly busy flows to migrate to a more suitable CPU/NIC
     queue.

     Normally we perform queue re-selection when flow comes out of idle,
     but under extreme circumstances the flows may be constantly busy.

     Add sysctl to allow periodic rehashing even if it'd risk packet
     reordering.

   - Optimize the NAPI skb cache, make it larger, use it in more paths.

   - Attempt returning Tx skbs to the originating CPU (like we already
     did for Rx skbs).

   - Various data structure layout and prefetch optimizations from Eric.

   - Remove ktime_get() from the recvmsg() fast path, ktime_get() is
     sadly quite expensive on recent AMD machines.

   - Extend threaded NAPI polling to allow the kthread busy poll for
     packets.

   - Make MPTCP use Rx backlog processing. This lowers the lock
     pressure, improving the Rx performance.

   - Support memcg accounting of MPTCP socket memory.

   - Allow admin to opt sockets out of global protocol memory accounting
     (using a sysctl or BPF-based policy). The global limits are a poor
     fit for modern container workloads, where limits are imposed using
     cgroups.

   - Improve heuristics for when to kick off AF_UNIX garbage collection.

   - Allow users to control TCP SACK compression, and default to 33% of
     RTT.

   - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid
     unnecessarily aggressive rcvbuf growth and overshot when the
     connection RTT is low.

   - Preserve skb metadata space across skb_push / skb_pull operations.

   - Support for IPIP encapsulation in the nftables flowtable offload.

   - Support appending IP interface information to ICMP messages (RFC
     5837).

   - Support setting max record size in TLS (RFC 8449).

   - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL.

   - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock.

   - Let users configure the number of write buffers in SMC.

   - Add new struct sockaddr_unsized for sockaddr of unknown length,
     from Kees.

   - Some conversions away from the crypto_ahash API, from Eric Biggers.

   - Some preparations for slimming down struct page.

   - YAML Netlink protocol spec for WireGuard.

   - Add a tool on top of YAML Netlink specs/lib for reporting commonly
     computed derived statistics and summarized system state.

  Driver API:

   - Add CAN XL support to the CAN Netlink interface.

   - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as
     defined by the OPEN Alliance's "Advanced diagnostic features for
     100BASE-T1 automotive Ethernet PHYs" specification.

   - Add DPLL phase-adjust-gran pin attribute (and implement it in
     zl3073x).

   - Refactor xfrm_input lock to reduce contention when NIC offloads
     IPsec and performs RSS.

   - Add info to devlink params whether the current setting is the
     default or a user override. Allow resetting back to default.

   - Add standard device stats for PSP crypto offload.

   - Leverage DSA frame broadcast to implement simple HSR frame
     duplication for a lot of switches without dedicated HSR offload.

   - Add uAPI defines for 1.6Tbps link modes.

  Device drivers:

   - Add Motorcomm YT921x gigabit Ethernet switch support.

   - Add MUCSE driver for N500/N210 1GbE NIC series.

   - Convert drivers to support dedicated ops for timestamping control,
     and away from the direct IOCTL handling. While at it support GET
     operations for PHY timestamping.

   - Add (and convert most drivers to) a dedicated ethtool callback for
     reading the Rx ring count.

   - Significant refactoring efforts in the STMMAC driver, which
     supports Synopsys turn-key MAC IP integrated into a ton of SoCs.

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support PPS in/out on all pins
      - Intel (100G, ice, idpf):
         - ice: implement standard ethtool and timestamping stats
         - i40e: support setting the max number of MAC addresses per VF
         - iavf: support RSS of GTP tunnels for 5G and LTE deployments
      - nVidia/Mellanox (mlx5):
         - reduce downtime on interface reconfiguration
         - disable being an XDP redirect target by default (same as
           other drivers) to avoid wasting resources if feature is
           unused
      - Meta (fbnic):
         - add support for Linux-managed PCS on 25G, 50G, and 100G links
      - Wangxun:
         - support Rx descriptor merge, and Tx head writeback
         - support Rx coalescing offload
         - support 25G SPF and 40G QSFP modules

   - Ethernet virtual:
      - Google (gve):
         - allow ethtool to configure rx_buf_len
         - implement XDP HW RX Timestamping support for DQ descriptor
           format
      - Microsoft vNIC (mana):
         - support HW link state events
         - handle hardware recovery events when probing the device

   - Ethernet NICs consumer, and embedded:
      - usbnet: add support for Byte Queue Limits (BQL)
      - AMD (amd-xgbe):
         - add device selftests
      - NXP (enetc):
         - add i.MX94 support
      - Broadcom integrated MACs (bcmgenet, bcmasp):
         - bcmasp: add support for PHY-based Wake-on-LAN
      - Broadcom switches (b53):
         - support port isolation
         - support BCM5389/97/98 and BCM63XX ARL formats
      - Lantiq/MaxLinear switches:
         - support bridge FDB entries on the CPU port
         - use regmap for register access
         - allow user to enable/disable learning
         - support Energy Efficient Ethernet
         - support configuring RMII clock delays
         - add tagging driver for MaxLinear GSW1xx switches
      - Synopsys (stmmac):
         - support using the HW clock in free running mode
         - add Eswin EIC7700 support
         - add Rockchip RK3506 support
         - add Altera Agilex5 support
      - Cadence (macb):
         - cleanup and consolidate descriptor and DMA address handling
         - add EyeQ5 support
      - TI:
         - icssg-prueth: support AF_XDP
      - Airoha access points:
         - add missing Ethernet stats and link state callback
         - add AN7583 support
         - support out-of-order Tx completion processing
      - Power over Ethernet:
         - pd692x0: preserve PSE configuration across reboots
         - add support for TPS23881B devices

   - Ethernet PHYs:
      - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support
      - Support 50G SerDes and 100G interfaces in Linux-managed PHYs
      - micrel:
         - support for non PTP SKUs of lan8814
         - enable in-band auto-negotiation on lan8814
      - realtek:
         - cable testing support on RTL8224
         - interrupt support on RTL8221B
      - motorcomm: support for PHY LEDs on YT853
      - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag
      - mscc: support for PHY LED control

   - CAN drivers:
      - m_can: add support for optional reset and system wake up
      - remove can_change_mtu() obsoleted by core handling
      - mcp251xfd: support GPIO controller functionality

   - Bluetooth:
      - add initial support for PASTa

   - WiFi:
      - split ieee80211.h file, it's way too big
      - improvements in VHT radiotap reporting, S1G, Channel Switch
        Announcement handling, rate tracking in mesh networks
      - improve multi-radio monitor mode support, and add a cfg80211
        debugfs interface for it
      - HT action frame handling on 6 GHz
      - initial chanctx work towards NAN
      - MU-MIMO sniffer improvements

   - WiFi drivers:
      - RealTek (rtw89):
         - support USB devices RTL8852AU and RTL8852CU
         - initial work for RTL8922DE
         - improved injection support
      - Intel:
         - iwlwifi: new sniffer API support
      - MediaTek (mt76):
         - WED support for >32-bit DMA
         - airoha NPU support
         - regdomain improvements
         - continued WiFi7/MLO work
      - Qualcomm/Atheros:
         - ath10k: factory test support
         - ath11k: TX power insertion support
         - ath12k: BSS color change support
         - ath12k: statistics improvements
      - brcmfmac: Acer A1 840 tablet quirk
      - rtl8xxxu: 40 MHz connection fixes/support"

* tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits)
  net: page_pool: sanitise allocation order
  net: page pool: xa init with destroy on pp init
  net/mlx5e: Support XDP target xmit with dummy program
  net/mlx5e: Update XDP features in switch channels
  selftests/tc-testing: Test CAKE scheduler when enqueue drops packets
  net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  wireguard: netlink: generate netlink code
  wireguard: uapi: generate header with ynl-gen
  wireguard: uapi: move flag enums
  wireguard: uapi: move enum wg_cmd
  wireguard: netlink: add YNL specification
  selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py
  selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py
  selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py
  selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py
  selftests: drv-net: introduce Iperf3Runner for measurement use cases
  selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS
  net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive()
  Documentation: net: dsa: mention simple HSR offload helpers
  Documentation: net: dsa: mention availability of RedBox
  ...
2025-12-03 17:24:33 -08:00
Linus Torvalds 015e7b0b0e bpf-next-6.19
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmktzC4ACgkQ6rmadz2v
 bTpA1w/+PZ45N3y6O+NQVIpBlpnHG7DEMK7Lw19On0xVLwH+XPHz6J5PEfzjyJR1
 SCbsV30qkJ1YCtgRHHf+ZCuWPWm58hY8dXYwSDyjNavdQyVGOdf17aBu9pvH45NW
 K20OhwQHpCHWIfDlijjPkDdiHnYf5S7Xy6ctt/3ztF0pMDHIaghGxJymG4wULcDT
 iLKnT37kwO8b2ihmw/HbcZPQYMWfHRye7X009K+wCv0dnhJ6q/Ny1m+Pg4kF92e6
 ON/RY26ep2dq7LpaNWa1rI1yOgFlI7uUlVojqrAuAb+xrg+64wUDBxeijvE37EN1
 s/+PuEKAR6xwz1dbY2cWAI0D633saz24UdV6kCBW9HrjHKVRQ7ZSsBF9ENkS4DTK
 nowx4wOe1ZHc/6YgTktZp9LEn/0YrmQtFxjqEAJiYUgD18FrBrSjmhHpBiL+HghP
 sTqy41qDQGoKtg3bRu42Co9wmNeeLsnxT8NQExCmTYQ4ufpdA/VMQux9cBVX3GBq
 EchJb465+AcvvCJUiKbnHLxDsHCQz1YYytz3RqyFLgGDFZnHOE0FjwPJmM8I5kkK
 gvDB3ZYdO3Halm8BZfZZBnKv5uK7myuAWwqRLgMRanuZcRgmIV1oUP5EP88HdH75
 fB20vZSVcfzB17SLyhiM20ivEWodJa9VCLEw9WDOmDoml+33Pks=
 =kaCJ
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to
   test_progs runner (Alexis Lothoré)

 - Convert selftests/bpf/test_xsk to test_progs runner (Bastien
   Curutchet)

 - Replace bpf memory allocator with kmalloc_nolock() in
   bpf_local_storage (Amery Hung), and in bpf streams and range tree
   (Puranjay Mohan)

 - Introduce support for indirect jumps in BPF verifier and x86 JIT
   (Anton Protopopov) and arm64 JIT (Puranjay Mohan)

 - Remove runqslower bpf tool (Hoyeon Lee)

 - Fix corner cases in the verifier to close several syzbot reports
   (Eduard Zingerman, KaFai Wan)

 - Several improvements in deadlock detection in rqspinlock (Kumar
   Kartikeya Dwivedi)

 - Implement "jmp" mode for BPF trampoline and corresponding
   DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance
   from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong)

 - Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul
   Chaignon)

 - Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel
   Borkmann)

 - Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko)

 - Optimize bpf_map_update_elem() for map-in-map types (Ritesh
   Oedayrajsingh Varma)

 - Introduce overwrite mode for BPF ring buffer (Xu Kuohai)

* tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits)
  bpf: optimize bpf_map_update_elem() for map-in-map types
  bpf: make kprobe_multi_link_prog_run always_inline
  selftests/bpf: do not hardcode target rate in test_tc_edt BPF program
  selftests/bpf: remove test_tc_edt.sh
  selftests/bpf: integrate test_tc_edt into test_progs
  selftests/bpf: rename test_tc_edt.bpf.c section to expose program type
  selftests/bpf: Add success stats to rqspinlock stress test
  rqspinlock: Precede non-head waiter queueing with AA check
  rqspinlock: Disable spinning for trylock fallback
  rqspinlock: Use trylock fallback when per-CPU rqnode is busy
  rqspinlock: Perform AA checks immediately
  rqspinlock: Enclose lock/unlock within lock entry acquisitions
  bpf: Remove runqslower tool
  selftests/bpf: Remove usage of lsm/file_alloc_security in selftest
  bpf: Disable file_alloc_security hook
  bpf: check for insn arrays in check_ptr_alignment
  bpf: force BPF_F_RDONLY_PROG on insn array creation
  bpf: Fix exclusive map memory leak
  selftests/bpf: Make CS length configurable for rqspinlock stress test
  selftests/bpf: Add lock wait time stats to rqspinlock stress test
  ...
2025-12-03 16:54:54 -08:00
Jacopo Mondi f7231cff1f media: uapi: c3-isp: Fix documentation warning
Building htmldocs generates a warning:

WARNING: include/uapi/linux/media/amlogic/c3-isp-config.h:199
error: Cannot parse struct or union!

Which correctly highlights that the c3_isp_params_block_header symbol
is wrongly documented as a struct while it's a plain #define instead.

Fix this by removing the 'struct' identifier from the documentation of
the c3_isp_params_block_header symbol.

[ribalda: Add Closes:]

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20251127131425.4b5b6644@canb.auug.org.au/
Fixes: 4566208285 ("media: uapi: Convert Amlogic C3 to V4L2 extensible params")
Cc: stable@vger.kernel.org
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-12-03 10:07:37 +01:00
Linus Torvalds d348c22394 Power management updates for 6.19-rc1
- Introduce and document a QoS limit on CPU exit latency during wakeup
    from suspend-to-idle (Ulf Hansson)
 
  - Add support for building libcpupower statically (Zuo An)
 
  - Add support for sending netlink notifications to user space on energy
    model updates (Changwoo Mini, Peng Fan)
 
  - Minor improvements to the Rust OPP interface (Tamir Duberstein)
 
  - Fixes to scope-based pointers in the OPP library (Viresh Kumar)
 
  - Use residency threshold in polling state override decisions in the
    menu cpuidle governor (Aboorva Devarajan)
 
  - Add sanity check for exit latency and target residency in the cpufreq
    core (Rafael Wysocki)
 
  - Use this_cpu_ptr() where possible in the teo governor (Christian
    Loehle)
 
  - Rework the handling of tick wakeups in the teo cpuidle governor to
    increase the likelihood of stopping the scheduler tick in the cases
    when tick wakeups can be counted as non-timer ones (Rafael Wysocki)
 
  - Fix a reverse condition in the teo cpuidle governor and drop a
    misguided target residency check from it (Rafael Wysocki)
 
  - Clean up multiple minor defects in the teo cpuidle governor (Rafael
    Wysocki)
 
  - Update header inclusion to make it follow the Include What You Use
    principle (Andy Shevchenko)
 
  - Enable MSR-based RAPL PMU support in the intel_rapl power capping
    driver and arrange for using it on the Panther Lake and Wildcat Lake
    processors (Kuppuswamy Sathyanarayanan)
 
  - Add support for Nova Lake and Wildcat Lake processors to the
    intel_rapl power capping driver (Kaushlendra Kumar, Srinivas
    Pandruvada)
 
  - Add OPP and bandwidth support for Tegra186 (Aaron Kling)
 
  - Optimizations for parameter array handling in the amd-pstate cpufreq
    driver (Mario Limonciello)
 
  - Fix for mode changes with offline CPUs in the amd-pstate cpufreq
    driver (Gautham Shenoy)
 
  - Preserve freq_table_sorted across suspend/hibernate in the cpufreq
    core (Zihuan Zhang)
 
  - Adjust energy model rules for Intel hybrid platforms in the
    intel_pstate cpufreq driver and improve printing of debug messages
    in it (Rafael Wysocki)
 
  - Replace deprecated strcpy() in cpufreq_unregister_governor()
    (Thorsten Blum)
 
  - Fix duplicate hyperlink target errors in the intel_pstate cpufreq
    driver documentation and use :ref: directive for internal linking in
    it (Swaraj Gaikwad, Bagas Sanjaya)
 
  - Add Diamond Rapids OOB mode support to the intel_pstate cpufreq
    driver (Kuppuswamy Sathyanarayanan)
 
  - Use mutex guard for driver locking in the intel_pstate driver and
    eliminate some code duplication from it (Rafael Wysocki)
 
  - Replace udelay() with usleep_range() in ACPI cpufreq (Kaushlendra
    Kumar)
 
  - Minor improvements to various cpufreq drivers (Christian Marangi, Hal
    Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu)
 
  - Replace snprintf() with scnprintf() in show_trace_dev_match()
    (Kaushlendra Kumar)
 
  - Fix memory allocation error handling in pm_vt_switch_required()
    (Malaya Kumar Rout)
 
  - Introduce CALL_PM_OP() macro and use it to simplify code in
    generic PM operations (Kaushlendra Kumar)
 
  - Add module param to backtrace all CPUs in the device power management
    watchdog (Sergey Senozhatsky)
 
  - Rework message printing in swsusp_save() (Rafael Wysocki)
 
  - Make it possible to change the number of hibernation compression
    threads (Xueqin Luo)
 
  - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)
 
  - Add document on debugging shutdown hangs to PM documentation and
    correct a mistaken configuration option in it (Mario Limonciello)
 
  - Shut down wakeup source timer before removing the wakeup source from
    the list (Kaushlendra Kumar, Rafael Wysocki)
 
  - Introduce new PMSG_POWEROFF event for system shutdown handling with
    the help of PM device callbacks (Mario Limonciello)
 
  - Make pm_test delay interruptible by wakeup events (Riwen Lu)
 
  - Clean up kernel-doc comment style usage in the core hibernation
    code and remove unuseful comments from it (Sunday Adelodun, Rafael
    Wysocki)
 
  - Add support for handling wakeup events and aborting the suspend
    process while it is syncing file systems (Samuel Wu, Rafael Wysocki)
 
  - Add WQ_UNBOUND to pm_wq workqueue (Marco Crivellari)
 
  - Add runtime PM wrapper macros for ACQUIRE()/ACQUIRE_ERR() and use
    them in the PCI core and the ACPI TAD driver (Rafael Wysocki)
 
  - Improve runtime PM in the ACPI TAD driver (Rafael Wysocki)
 
  - Update pm_runtime_allow/forbid() documentation (Rafael Wysocki)
 
  - Fix typos in runtime.c comments (Malaya Kumar Rout)
 
  - Move governor.h from devfreq under include/linux/ and rename to
    devfreq-governor.h to allow devfreq governor definitions in out
    of drivers/devfreq/ (Dmitry Baryshkov)
 
  - Use min() to improve readability in tegra30-devfreq.c (Thorsten
    Blum)
 
  - Fix potential use-after-free issue of OPP handling in
    hisi_uncore_freq.c (Pengjie Zhang)
 
  - Fix typo in DFSO_DOWNDIFFERENTIAL macro name in
    governor_simpleondemand.c in devfreq (Riwen Lu)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmkp0BYSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1Pc8H/2G5d0aD/ym1a8MDTpKqn7t3/rVMHa76
 YGfxXMBr1oY++r5GTJTKBxZBHmF89VH71kdyvsMidTAtHjR+iZAS1ajd2Q5VYjOF
 QNMld1qgPEzAZU8WSetDrBqMr89zls05Uubo4aCoNy6rFmgRaLHh3AmIKSS9aJuo
 C1eH8dRONME5I/rafkOUpFs1+/Agq1vePwPZmwVnZX9A3qI+UOhMRdU9A37kYkx9
 YwfQvR2fKTIPjZ6B9f/wGXPOvdrT37d4+dWT3EABOHMkxlpAPDMvmVzZsUaXSQMr
 0d9NGEjPGo33qciKJJpHqNOdDOhi90606WBBf7aaMF+GMhDX3PznOK4=
 =rzXO
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "There are quite a few interesting things here, including new hardware
  support, new features, some bug fixes and documentation updates. In
  addition, there are a usual bunch of minor fixes and cleanups all
  over.

  In the new hardware support category, there are intel_pstate and
  intel_rapl driver updates to support new processors, Panther Lake,
  Wildcat Lake, Noval Lake, and Diamond Rapids in the OOB mode, OPP and
  bandwidth allocation support in the tegra186 cpufreq driver, and
  JH7110S SOC support in dt-platdev cpufreq.

  The new features are the PM QoS CPU latency limit for suspend-to-idle,
  the netlink support for the energy model management, support for
  terminating system suspend via a wakeup event during the sync of file
  systems, configurable number of hibernation compression threads, the
  runtime PM auto-cleanup macros, and the "poweroff" PM event that is
  expected to be used during system shutdown.

  Bugs are mostly fixed in cpuidle governors, but there are also fixes
  elsewhere, like in the amd-pstate cpufreq driver.

  Documentation updates include, but are not limited to, a new doc on
  debugging shutdown hangs, cross-referencing fixes and cleanups in the
  intel_pstate documentation, and updates of comments in the core
  hibernation code.

  Specifics:

   - Introduce and document a QoS limit on CPU exit latency during
     wakeup from suspend-to-idle (Ulf Hansson)

   - Add support for building libcpupower statically (Zuo An)

   - Add support for sending netlink notifications to user space on
     energy model updates (Changwoo Mini, Peng Fan)

   - Minor improvements to the Rust OPP interface (Tamir Duberstein)

   - Fixes to scope-based pointers in the OPP library (Viresh Kumar)

   - Use residency threshold in polling state override decisions in the
     menu cpuidle governor (Aboorva Devarajan)

   - Add sanity check for exit latency and target residency in the
     cpufreq core (Rafael Wysocki)

   - Use this_cpu_ptr() where possible in the teo governor (Christian
     Loehle)

   - Rework the handling of tick wakeups in the teo cpuidle governor to
     increase the likelihood of stopping the scheduler tick in the cases
     when tick wakeups can be counted as non-timer ones (Rafael Wysocki)

   - Fix a reverse condition in the teo cpuidle governor and drop a
     misguided target residency check from it (Rafael Wysocki)

   - Clean up multiple minor defects in the teo cpuidle governor (Rafael
     Wysocki)

   - Update header inclusion to make it follow the Include What You Use
     principle (Andy Shevchenko)

   - Enable MSR-based RAPL PMU support in the intel_rapl power capping
     driver and arrange for using it on the Panther Lake and Wildcat
     Lake processors (Kuppuswamy Sathyanarayanan)

   - Add support for Nova Lake and Wildcat Lake processors to the
     intel_rapl power capping driver (Kaushlendra Kumar, Srinivas
     Pandruvada)

   - Add OPP and bandwidth support for Tegra186 (Aaron Kling)

   - Optimizations for parameter array handling in the amd-pstate
     cpufreq driver (Mario Limonciello)

   - Fix for mode changes with offline CPUs in the amd-pstate cpufreq
     driver (Gautham Shenoy)

   - Preserve freq_table_sorted across suspend/hibernate in the cpufreq
     core (Zihuan Zhang)

   - Adjust energy model rules for Intel hybrid platforms in the
     intel_pstate cpufreq driver and improve printing of debug messages
     in it (Rafael Wysocki)

   - Replace deprecated strcpy() in cpufreq_unregister_governor()
     (Thorsten Blum)

   - Fix duplicate hyperlink target errors in the intel_pstate cpufreq
     driver documentation and use :ref: directive for internal linking
     in it (Swaraj Gaikwad, Bagas Sanjaya)

   - Add Diamond Rapids OOB mode support to the intel_pstate cpufreq
     driver (Kuppuswamy Sathyanarayanan)

   - Use mutex guard for driver locking in the intel_pstate driver and
     eliminate some code duplication from it (Rafael Wysocki)

   - Replace udelay() with usleep_range() in ACPI cpufreq (Kaushlendra
     Kumar)

   - Minor improvements to various cpufreq drivers (Christian Marangi,
     Hal Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu)

   - Replace snprintf() with scnprintf() in show_trace_dev_match()
     (Kaushlendra Kumar)

   - Fix memory allocation error handling in pm_vt_switch_required()
     (Malaya Kumar Rout)

   - Introduce CALL_PM_OP() macro and use it to simplify code in generic
     PM operations (Kaushlendra Kumar)

   - Add module param to backtrace all CPUs in the device power
     management watchdog (Sergey Senozhatsky)

   - Rework message printing in swsusp_save() (Rafael Wysocki)

   - Make it possible to change the number of hibernation compression
     threads (Xueqin Luo)

   - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)

   - Add document on debugging shutdown hangs to PM documentation and
     correct a mistaken configuration option in it (Mario Limonciello)

   - Shut down wakeup source timer before removing the wakeup source
     from the list (Kaushlendra Kumar, Rafael Wysocki)

   - Introduce new PMSG_POWEROFF event for system shutdown handling with
     the help of PM device callbacks (Mario Limonciello)

   - Make pm_test delay interruptible by wakeup events (Riwen Lu)

   - Clean up kernel-doc comment style usage in the core hibernation
     code and remove unuseful comments from it (Sunday Adelodun, Rafael
     Wysocki)

   - Add support for handling wakeup events and aborting the suspend
     process while it is syncing file systems (Samuel Wu, Rafael
     Wysocki)

   - Add WQ_UNBOUND to pm_wq workqueue (Marco Crivellari)

   - Add runtime PM wrapper macros for ACQUIRE()/ACQUIRE_ERR() and use
     them in the PCI core and the ACPI TAD driver (Rafael Wysocki)

   - Improve runtime PM in the ACPI TAD driver (Rafael Wysocki)

   - Update pm_runtime_allow/forbid() documentation (Rafael Wysocki)

   - Fix typos in runtime.c comments (Malaya Kumar Rout)

   - Move governor.h from devfreq under include/linux/ and rename to
     devfreq-governor.h to allow devfreq governor definitions in out of
     drivers/devfreq/ (Dmitry Baryshkov)

   - Use min() to improve readability in tegra30-devfreq.c (Thorsten
     Blum)

   - Fix potential use-after-free issue of OPP handling in
     hisi_uncore_freq.c (Pengjie Zhang)

   - Fix typo in DFSO_DOWNDIFFERENTIAL macro name in
     governor_simpleondemand.c in devfreq (Riwen Lu)"

* tag 'pm-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (96 commits)
  PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
  cpuidle: Warn instead of bailing out if target residency check fails
  cpuidle: Update header inclusion
  Documentation: power/cpuidle: Document the CPU system wakeup latency QoS
  cpuidle: Respect the CPU system wakeup QoS limit for cpuidle
  sched: idle: Respect the CPU system wakeup QoS limit for s2idle
  pmdomain: Respect the CPU system wakeup QoS limit for cpuidle
  pmdomain: Respect the CPU system wakeup QoS limit for s2idle
  PM: QoS: Introduce a CPU system wakeup QoS limit
  cpuidle: governors: teo: Add missing space to the description
  PM: hibernate: Extra cleanup of comments in swap handling code
  PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
  PM / devfreq: hisi: Fix potential UAF in OPP handling
  PM / devfreq: Move governor.h to a public header location
  powercap: intel_rapl: Enable MSR-based RAPL PMU support
  powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers
  cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list
  PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
  PM: sleep: Add support for wakeup during filesystem sync
  cpufreq: ACPI: Replace udelay() with usleep_range()
  ...
2025-12-02 17:31:22 -08:00
Linus Torvalds 44fc84337b arm64 updates for 6.19:
Core features:
 
  - Basic Arm MPAM (Memory system resource Partitioning And Monitoring)
    driver under drivers/resctrl/ which makes use of the fs/rectrl/ API
 
 Perf and PMU:
 
  - Avoid cycle counter on multi-threaded CPUs
 
  - Extend CSPMU device probing and add additional filtering support for
    NVIDIA implementations
 
  - Add support for the PMUs on the NoC S3 interconnect
 
  - Add additional compatible strings for new Cortex and C1 CPUs
 
  - Add support for data source filtering to the SPE driver
 
  - Add support for i.MX8QM and "DB" PMU in the imx PMU driver
 
 Memory managemennt:
 
  - Avoid broadcast TLBI if page reused in write fault
 
  - Elide TLB invalidation if the old PTE was not valid
 
  - Drop redundant cpu_set_*_tcr_t0sz() macros
 
  - Propagate pgtable_alloc() errors outside of __create_pgd_mapping()
 
  - Propagate return value from __change_memory_common()
 
 ACPI and EFI:
 
  - Call EFI runtime services without disabling preemption
 
  - Remove unused ACPI function
 
 Miscellaneous:
 
  - ptrace support to disable streaming on SME-only systems
 
  - Improve sysreg generation to include a 'Prefix' descriptor
 
  - Replace __ASSEMBLY__ with __ASSEMBLER__
 
  - Align register dumps in the kselftest zt-test
 
  - Remove some no longer used macros/functions
 
  - Various spelling corrections
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmkvMjkACgkQa9axLQDI
 XvGaGg//dtT/ZAqrWa6Yniv1LOlh837C07YdxAYTTuJ+I87DnrxIqjwbW+ye+bF+
 61RTkioeCUm3PH+ncO9gPVNi4ASZ1db3/Rc8Fb6rr1TYOI1sMIeBsbbVdRJgsbX6
 zu9197jOBHscTAeDceB6jZBDyW8iSLINPZ7LN6lGxXsZM/Vn5zfE0heKEEio6Fsx
 +AzO2vos0XcwBR9vFGXtiCDx57T+/cXUtrWfA0Cjz4nvHSgD8+ghS+Jwv+kHMt1L
 zrarqbeQfj+Iixm9PVHiazv+8THo9QdNl1yGLxDmJ4LEVPewjW5jBs8+5e8e3/Gj
 p5JEvmSyWvKTTbFoM5vhxC72A7yuT1QwAk2iCyFIxMbQ25PndHboKVp/569DzOkT
 +6CjI88sVSP6D7bVlN6pFlzc/Fa07YagnDMnMCSfk4LBjUfE3jYb+usaFydyv/rl
 jwZbJrnSF/H+uQlyoJFgOEXSoQdDsll3dv6yEsUCwbd8RqXbAe3svbguOUHSdvIj
 sCViezGZQ7Rkn6D21AfF9j6e7ceaSDaf5DWMxPI3dAxFKG8TJbCBsToR59NnoSj+
 bNEozbZ1mCxmwH8i43wZ6P0RkClvJnoXcvRA+TJj02fSZACO39d3XDNswfXWL41r
 KiWGUJZyn2lPKtiAWVX6pSBtDJ+5rFhuoFgADLX6trkxDe9/EMQ=
 =4Sb6
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "These are the arm64 updates for 6.19.

  The biggest part is the Arm MPAM driver under drivers/resctrl/.
  There's a patch touching mm/ to handle spurious faults for huge pmd
  (similar to the pte version). The corresponding arm64 part allows us
  to avoid the TLB maintenance if a (huge) page is reused after a write
  fault. There's EFI refactoring to allow runtime services with
  preemption enabled and the rest is the usual perf/PMU updates and
  several cleanups/typos.

  Summary:

  Core features:

   - Basic Arm MPAM (Memory system resource Partitioning And Monitoring)
     driver under drivers/resctrl/ which makes use of the fs/rectrl/ API

  Perf and PMU:

   - Avoid cycle counter on multi-threaded CPUs

   - Extend CSPMU device probing and add additional filtering support
     for NVIDIA implementations

   - Add support for the PMUs on the NoC S3 interconnect

   - Add additional compatible strings for new Cortex and C1 CPUs

   - Add support for data source filtering to the SPE driver

   - Add support for i.MX8QM and "DB" PMU in the imx PMU driver

  Memory managemennt:

   - Avoid broadcast TLBI if page reused in write fault

   - Elide TLB invalidation if the old PTE was not valid

   - Drop redundant cpu_set_*_tcr_t0sz() macros

   - Propagate pgtable_alloc() errors outside of __create_pgd_mapping()

   - Propagate return value from __change_memory_common()

  ACPI and EFI:

   - Call EFI runtime services without disabling preemption

   - Remove unused ACPI function

  Miscellaneous:

   - ptrace support to disable streaming on SME-only systems

   - Improve sysreg generation to include a 'Prefix' descriptor

   - Replace __ASSEMBLY__ with __ASSEMBLER__

   - Align register dumps in the kselftest zt-test

   - Remove some no longer used macros/functions

   - Various spelling corrections"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
  arm64/mm: Document why linear map split failure upon vm_reset_perms is not problematic
  arm64/pageattr: Propagate return value from __change_memory_common
  arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS
  KVM: arm64: selftests: Consider all 7 possible levels of cache
  KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user
  arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros
  Documentation/arm64: Fix the typo of register names
  ACPI: GTDT: Get rid of acpi_arch_timer_mem_init()
  perf: arm_spe: Add support for filtering on data source
  perf: Add perf_event_attr::config4
  perf/imx_ddr: Add support for PMU in DB (system interconnects)
  perf/imx_ddr: Get and enable optional clks
  perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe()
  dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL
  arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT
  arm64: mm: use untagged address to calculate page index
  MAINTAINERS: new entry for MPAM Driver
  arm_mpam: Add kunit tests for props_mismatch()
  arm_mpam: Add kunit test for bitmap reset
  arm_mpam: Add helper to reset saved mbwu state
  ...
2025-12-02 17:03:55 -08:00
Alexey Kardashevskiy c3859de858 psp-sev: Assign numbers to all status codes and add new
Make the definitions explicit. Add some more new codes.

The following patches will be using SPDM_REQUEST and
EXPAND_BUFFER_LENGTH_REQUEST, others are useful for the PSP FW
diagnostics.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Link: https://patch.msgid.link/20251202024449.542361-3-aik@amd.com
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-12-02 12:06:38 -08:00
Paolo Bonzini e0c26d47de - SCA rework
- VIRT_XFER_TO_GUEST_WORK support
 - Operation exception forwarding support
 - Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmktiX8ACgkQ41TmuOI4
 ufhozBAAuyPxu1cZqfAiuEpftR0fUFZeyqRLHqfFPNQUGW/kPZRz2uNd38qulboV
 gmbu5jcwf8SdbF+p8f7RLvkEyTEnzuXELrfSwcwyv9IUiK++p9gRNkuppHbNnTI7
 yK21hJz+jZmRzUrSxnLylTC3++RZczhVeHqHzwosnHcNerK6FLcIjjsl7YinJToI
 T3jiTmprXl5NzFu7O5N/3J2KAIqNr+3DfnOf2lnLzHeupc52Z6TtvdizypAAV7Yk
 qWQ/81HI8GtIPFWss1kNwrJXQBjgBObz3XBOtq0bw1Ycs+BijsQh424vFoetV1/n
 bdmEh38lfY3sbbSE3RomnEATRdzremiYb63v5E4Bg7/bpLPhXw+jMF2Hp8jNqOiZ
 jI7KpGPOA4+C1EzS+Uge81fksW+ylNEYk/dZgGQgOFtF8Vf+Ana0NloDAqMHUeXq
 gVI2Sd9nMR80WslVzs5DMj/XK86J2TsFxtKYPa1cHV9PkHegO+eJm2nWCRHbfddz
 iEymokTm9xmfykjFfKDwZ4EcB5vdV7cuNE8aedsp9NXgICrgDbPn8ualG6aZUB0c
 ScvfRuoiZT7e4D8UZ79uCOCPQqwGCffOfIOee3ocf/95ZVY+9xv7FTTh200DjBU2
 Jv1NoTe9ZOO4+dYWRsht0fzC7zBVDO3CEb6OcNRB9wgNidDQaeM=
 =PtzZ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-6.19-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

- SCA rework
- VIRT_XFER_TO_GUEST_WORK support
- Operation exception forwarding support
- Cleanups
2025-12-02 18:58:47 +01:00
Paolo Bonzini f58e70cc31 KVM/arm64 updates for 6.19
- Support for userspace handling of synchronous external aborts (SEAs),
    allowing the VMM to potentially handle the abort in a non-fatal
    manner.
 
  - Large rework of the VGIC's list register handling with the goal of
    supporting more active/pending IRQs than available list registers in
    hardware. In addition, the VGIC now supports EOImode==1 style
    deactivations for IRQs which may occur on a separate vCPU than the
    one that acked the IRQ.
 
  - Support for FEAT_XNX (user / privileged execute permissions) and
    FEAT_HAF (hardware update to the Access Flag) in the software page
    table walkers and shadow MMU.
 
  - Allow page table destruction to reschedule, fixing long need_resched
    latencies observed when destroying a large VM.
 
  - Minor fixes to KVM and selftests
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYKADAWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCaS3m5RIcb3VwdG9uQGtl
 cm5lbC5vcmcACgkQor51iCR83Rb4NAD8C1fGoiCErb6htQMHf1I7ua0ThdIx7OnY
 Mk1EysNWu94BAI/VKEYgz+UC5uapHh+gnsoOdVTMJZedI/OPrnKa3QIA
 =/Vl1
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.19

 - Support for userspace handling of synchronous external aborts (SEAs),
   allowing the VMM to potentially handle the abort in a non-fatal
   manner.

 - Large rework of the VGIC's list register handling with the goal of
   supporting more active/pending IRQs than available list registers in
   hardware. In addition, the VGIC now supports EOImode==1 style
   deactivations for IRQs which may occur on a separate vCPU than the
   one that acked the IRQ.

 - Support for FEAT_XNX (user / privileged execute permissions) and
   FEAT_HAF (hardware update to the Access Flag) in the software page
   table walkers and shadow MMU.

 - Allow page table destruction to reschedule, fixing long need_resched
   latencies observed when destroying a large VM.

 - Minor fixes to KVM and selftests
2025-12-02 18:36:26 +01:00
Linus Torvalds 2b09f480f0 A large overhaul of the restartable sequences and CID management:
The recent enablement of RSEQ in glibc resulted in regressions which are
   caused by the related overhead. It turned out that the decision to invoke
   the exit to user work was not really a decision. More or less each
   context switch caused that. There is a long list of small issues which
   sums up nicely and results in a 3-4% regression in I/O benchmarks.
 
   The other detail which caused issues due to extra work in context switch
   and task migration is the CID (memory context ID) management. It also
   requires to use a task work to consolidate the CID space, which is
   executed in the context of an arbitrary task and results in sporadic
   uncontrolled exit latencies.
 
   The rewrite addresses this by:
 
   - Removing deprecated and long unsupported functionality
 
   - Moving the related data into dedicated data structures which are
     optimized for fast path processing.
 
   - Caching values so actual decisions can be made
 
   - Replacing the current implementation with a optimized inlined variant.
 
   - Separating fast and slow path for architectures which use the generic
     entry code, so that only fault and error handling goes into the
     TIF_NOTIFY_RESUME handler.
 
   - Rewriting the CID management so that it becomes mostly invisible in the
     context switch path. That moves the work of switching modes into the
     fork/exit path, which is a reasonable tradeoff. That work is only
     required when a process creates more threads than the cpuset it is
     allowed to run on or when enough threads exit after that. An artificial
     thread pool benchmarks which triggers this did not degrade, it actually
     improved significantly.
 
     The main effect in migration heavy scenarios is that runqueue lock held
     time and therefore contention goes down significantly.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksaRYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoencEADA5he8PAFPmSRRPo6+2G5mHzWe8kIU
 5ZViQStWFNAA0qqy8VXryWiJ6qqrO6la9o7K4YOXASUtlkVjquRp1DF7PabqGwuy
 zshbRCXNlT51J8uqanN8VrGVjlf+bMdHDbGoI1SLkUTxG8b+kDD5PXUQE1ARelPP
 Slbg9u+EMrxj6D5MDTPbuW6TqryJEkPtiNScyOz43emp9ww9+WVxenOcRqU4D+Th
 mjWmrGIzkroSf4XReMoD/wg9TPTpUjXnNCwl2viY9JvBpkMfYtU4tJAGK3aNFOWy
 zsAN0O9CaFGrUEFne7qUmtwhNLdtnjx5HN5pe7yZd1EhdTuQKq4jPiiQnwwm8w72
 c0o6m45FNPmPoSyfaZWCkLjbTEUXonT9JF61iN35JVxim8gBDDJjHFKnLxDmLrH3
 X0eESE48ReY2EneDV6Y8RJRo6oG14Fccvc39aTf/2Rw3trpmtt2agvConQzupQIg
 DzANw4jhUUzFRrHrMHACNsqKFXh9ratue/S9DM3xxTpGO/bKdeK7jGIgzNf8O34M
 J0O6Hvk5jMdcWlIJTx21GoGzoSkkXnR49g/71aCcp+MwdY4x9zFz5SWi8LWQRmkx
 xRo6tY27Bma8/SEwMJjPpAUXDTpq6v+j3cPisybL1yGsyt9lh+p8LX7VUtwcoEqe
 6ZelC5Kgw/+/kg==
 =n5KT
 -----END PGP SIGNATURE-----

Merge tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull rseq updates from Thomas Gleixner:
 "A large overhaul of the restartable sequences and CID management:

  The recent enablement of RSEQ in glibc resulted in regressions which
  are caused by the related overhead. It turned out that the decision to
  invoke the exit to user work was not really a decision. More or less
  each context switch caused that. There is a long list of small issues
  which sums up nicely and results in a 3-4% regression in I/O
  benchmarks.

  The other detail which caused issues due to extra work in context
  switch and task migration is the CID (memory context ID) management.
  It also requires to use a task work to consolidate the CID space,
  which is executed in the context of an arbitrary task and results in
  sporadic uncontrolled exit latencies.

  The rewrite addresses this by:

   - Removing deprecated and long unsupported functionality

   - Moving the related data into dedicated data structures which are
     optimized for fast path processing.

   - Caching values so actual decisions can be made

   - Replacing the current implementation with a optimized inlined
     variant.

   - Separating fast and slow path for architectures which use the
     generic entry code, so that only fault and error handling goes into
     the TIF_NOTIFY_RESUME handler.

   - Rewriting the CID management so that it becomes mostly invisible in
     the context switch path. That moves the work of switching modes
     into the fork/exit path, which is a reasonable tradeoff. That work
     is only required when a process creates more threads than the
     cpuset it is allowed to run on or when enough threads exit after
     that. An artificial thread pool benchmarks which triggers this did
     not degrade, it actually improved significantly.

     The main effect in migration heavy scenarios is that runqueue lock
     held time and therefore contention goes down significantly"

* tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/mmcid: Switch over to the new mechanism
  sched/mmcid: Implement deferred mode change
  irqwork: Move data struct to a types header
  sched/mmcid: Provide CID ownership mode fixup functions
  sched/mmcid: Provide new scheduler CID mechanism
  sched/mmcid: Introduce per task/CPU ownership infrastructure
  sched/mmcid: Serialize sched_mm_cid_fork()/exit() with a mutex
  sched/mmcid: Provide precomputed maximal value
  sched/mmcid: Move initialization out of line
  signal: Move MMCID exit out of sighand lock
  sched/mmcid: Convert mm CID mask to a bitmap
  cpumask: Cache num_possible_cpus()
  sched/mmcid: Use cpumask_weighted_or()
  cpumask: Introduce cpumask_weighted_or()
  sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
  sched/mmcid: Move scheduler code out of global header
  sched: Fixup whitespace damage
  sched/mmcid: Cacheline align MM CID storage
  sched/mmcid: Use proper data structures
  sched/mmcid: Revert the complex CID management
  ...
2025-12-02 08:48:53 -08:00
Linus Torvalds 6c26fbe8c9 Performance events changes for v6.19:
Callchain support:
 
  - Add support for deferred user-space stack unwinding for
    perf, enabled on x86. (Peter Zijlstra, Steven Rostedt)
 
  - unwind_user/x86: Enable frame pointer unwinding on x86
    (Josh Poimboeuf)
 
 x86 PMU support and infrastructure:
 
  - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)
 
  - x86/insn,uprobes,alternative: Unify insn_is_nop()
    (Peter Zijlstra)
 
 Intel PMU driver:
 
  - Large series to prepare for and implement architectural PEBS
    support for Intel platforms such as Clearwater Forest (CWF)
    and Panther Lake (PTL). (Dapeng Mi, Kan Liang)
 
  - Check dynamic constraints (Kan Liang)
 
  - Optimize PEBS extended config (Peter Zijlstra)
 
  - cstates: Remove PC3 support from LunarLake (Zhang Rui)
 
  - cstates: Add Pantherlake support (Zhang Rui)
 
  - cstates: Clearwater Forest support (Zide Chen)
 
 AMD PMU driver:
 
  - x86/amd: Check event before enable to avoid GPF (George Kennedy)
 
 Fixes and cleanups:
 
  - task_work: Fix NMI race condition (Peter Zijlstra)
 
  - perf/x86: Fix NULL event access and potential PEBS record loss
    (Dapeng Mi)
 
  - Misc other fixes and cleanups.
    (Dapeng Mi, Ingo Molnar, Peter Zijlstra)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktcU0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gNKw//ThLmbkoGJ0/yLOdEcW8rA/7HB43Oz6j9
 k0Vs7zwDBMRFP4zQg2XeF5SH7CWS9p/nI3eMhorgmH77oJCvXJxVtD5991zmlZhf
 eafOar5ZMVaoMz+tK8WWiENZyuN0bt0mumZmz9svXR3KV1S/q18XZ8bCas0itwnq
 D0T3Gqi/Z39gJIy7bHNgLoFY2zvI9b2EJNDKlzHk3NJ7UamA4GuMHN0cM2dIzKGK
 2L+wXOe2BH9YYzYrz/cdKq7sBMjOvFsCQ/5jh23A2Yu6JI4nJbw0WmexZRK1OWCp
 GAdMjBuqIShibLRxK746WRO9iut49uTsah4iSG80hXzhpwf7VaegOarost1nLaqm
 zweIOr3iwJRf273r6IqRuaporVHpQYMj2w2H63z36sQtGtkKHNyxZ50b6bqpwwjU
 LikLEJ9Bmh3mlvlXsOx2wX6dTb1fUk+cy2ezCDKUHqOLjqy4dM8V+jYhuRO4yxXz
 mj9aHZKgyuREt8yo/3nLqAzF5Okj9cXp7H6F1hCKWuCoAhNXkrvYcvbg8h6aRxOX
 2vGhMYjpElkl/DG6OWCSwuqCt9nVEC/dazW9fKQjh4S0CFOVopaMGSkGcS/xUPub
 92J4XMDEJX4RJ6dfspeQr97+1fETXEIWNv4WbKnDjqJlAucU1gnOTprVnAYUjcWw
 74320FjGN1E=
 =/8GE
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance events updates from Ingo Molnar:
 "Callchain support:

   - Add support for deferred user-space stack unwinding for perf,
     enabled on x86. (Peter Zijlstra, Steven Rostedt)

   - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh
     Poimboeuf)

  x86 PMU support and infrastructure:

   - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)

   - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra)

  Intel PMU driver:

   - Large series to prepare for and implement architectural PEBS
     support for Intel platforms such as Clearwater Forest (CWF) and
     Panther Lake (PTL). (Dapeng Mi, Kan Liang)

   - Check dynamic constraints (Kan Liang)

   - Optimize PEBS extended config (Peter Zijlstra)

   - cstates:
      - Remove PC3 support from LunarLake (Zhang Rui)
      - Add Pantherlake support (Zhang Rui)
      - Clearwater Forest support (Zide Chen)

  AMD PMU driver:

   - x86/amd: Check event before enable to avoid GPF (George Kennedy)

  Fixes and cleanups:

   - task_work: Fix NMI race condition (Peter Zijlstra)

   - perf/x86: Fix NULL event access and potential PEBS record loss
     (Dapeng Mi)

   - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter
     Zijlstra)"

* tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  perf/x86/intel: Optimize PEBS extended config
  perf/x86/intel: Check PEBS dyn_constraints
  perf/x86/intel: Add a check for dynamic constraints
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86: Remove redundant is_x86_event() prototype
  entry,unwind/deferred: Fix unwind_reset_info() placement
  unwind_user/x86: Fix arch=um build
  perf: Support deferred user unwind
  unwind_user/x86: Teach FP unwind about start of function
  ...
2025-12-01 20:42:01 -08:00
Asbjørn Sloth Tønnesen 88cedad45b wireguard: uapi: generate header with ynl-gen
Use ynl-gen to generate the UAPI header for WireGuard.

The cosmetic changes in this patch confirms that the spec is aligned
with the implementation. By using the generated version, it ensures
that they stay in sync.

Changes in the generated header:
* Trivial header guard rename.
* Trivial white space changes.
* Trivial comment changes.
* Precompute bitflags in ynl-gen (see [1]).
* Drop __*_F_ALL constants (see [1]).

[1] https://lore.kernel.org/r/20251014123201.6ecfd146@kernel.org/

No behavioural changes intended.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-12-02 04:12:49 +01:00
Asbjørn Sloth Tønnesen 8d974872ab wireguard: uapi: move flag enums
Move the wg*_flag enums, so they are defined above the attribute set
enums, where ynl-gen would place them.

This is an incremental step towards adopting an UAPI header generated
by ynl-gen. This is split out to keep the patches readable.

This is a trivial patch with no behavioural changes intended.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-12-02 04:12:49 +01:00
Asbjørn Sloth Tønnesen b5c5a82bf5 wireguard: uapi: move enum wg_cmd
This patch moves enum wg_cmd to the end of the file, where ynl-gen
would generate it.

This is an incremental step towards adopting an UAPI header generated
by ynl-gen. This is split out to keep the patches readable.

This is a trivial patch with no behavioural changes intended.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-12-02 04:12:49 +01:00
Asbjørn Sloth Tønnesen 6b0f4ca079 wireguard: netlink: add YNL specification
This patch adds a near[1] complete YNL specification for WireGuard,
documenting the protocol in a machine-readable format, rather than
comments in wireguard.h, and eases usage from C and non-C programming
languages alike.

The generated C library will be featured in a later patch, so in
this patch I will use the in-kernel python client for examples.

This makes the documentation in the UAPI header redundant, it is
therefore removed. The in-line documentation in the spec is based
on the existing comment in wireguard.h, and once released it will
be available in the kernel documentation at:
  https://docs.kernel.org/netlink/specs/wireguard.html
  (until then run: make htmldocs)

Generate wireguard.rst from this spec:
$ make -C tools/net/ynl/generated/ wireguard.rst

Query wireguard interface through pyynl:
$ sudo ./tools/net/ynl/pyynl/cli.py --family wireguard \
                                    --dump get-device \
                                    --json '{"ifindex":3}'
[{'fwmark': 0,
  'ifindex': 3,
  'ifname': 'wg-test',
  'listen-port': 54318,
  'peers': [{0: {'allowedips': [{0: {'cidr-mask': 0,
                                     'family': 2,
                                     'ipaddr': '0.0.0.0'}},
                                {0: {'cidr-mask': 0,
                                     'family': 10,
                                     'ipaddr': '::'}}],
                 'endpoint': b'[...]',
                 'last-handshake-time': {'nsec': 42, 'sec': 42},
                 'persistent-keepalive-interval': 42,
                 'preshared-key': '[...]',
                 'protocol-version': 1,
                 'public-key': '[...]',
                 'rx-bytes': 42,
                 'tx-bytes': 42}}],
  'private-key': '[...]',
  'public-key': '[...]'}]

Add another allowed IP prefix:
$ sudo ./tools/net/ynl/pyynl/cli.py --family wireguard \
  --do set-device --json '{"ifindex":3,"peers":[
    {"public-key":"6a df b1 83 a4 ..","allowedips":[
      {"cidr-mask":0,"family":10,"ipaddr":"::"}]}]}'

[1] As can be seen above, the "endpoint" is only dumped as binary data,
    as it can't be fully described in YNL. It's either a struct
    sockaddr_in or struct sockaddr_in6 depending on the attribute length.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-12-02 04:12:19 +01:00
Linus Torvalds db74a7d02a vfs-6.19-rc1.directory.delegations
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 ooiEAPwNZfkqiSs6G1B2EmjFpMrA2BDqskaOsnN2sywra0sNewD9EQxJwlYXUn+z
 nNUIAvmegJGg2OiU2UaNGwxMR3lR3w8=
 =YELr
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull directory delegations update from Christian Brauner:
 "This contains the work for recall-only directory delegations for
  knfsd.

  Add support for simple, recallable-only directory delegations. This
  was decided at the fall NFS Bakeathon where the NFS client and server
  maintainers discussed how to merge directory delegation support.

  The approach starts with recallable-only delegations for several reasons:

   1. RFC8881 has gaps that are being addressed in RFC8881bis. In
      particular, it requires directory position information for
      CB_NOTIFY callbacks, which is difficult to implement properly
      under Linux. The spec is being extended to allow that information
      to be omitted.

   2. Client-side support for CB_NOTIFY still lags. The client side
      involves heuristics about when to request a delegation.

   3. Early indication shows simple, recallable-only delegations can
      help performance. Anna Schumaker mentioned seeing a multi-minute
      speedup in xfstests runs with them enabled.

  With these changes, userspace can also request a read lease on a
  directory that will be recalled on conflicting accesses. This may be
  useful for applications like Samba. Users can disable leases
  altogether via the fs.leases-enable sysctl if needed.

  VFS changes:

   - Dedicated Type for Delegations

     Introduce struct delegated_inode to track inodes that may have
     delegations that need to be broken. This replaces the previous
     approach of passing raw inode pointers through the delegation
     breaking code paths, providing better type safety and clearer
     semantics for the delegation machinery.

   - Break parent directory delegations in open(..., O_CREAT) codepath

   - Allow mkdir to wait for delegation break on parent

   - Allow rmdir to wait for delegation break on parent

   - Add try_break_deleg calls for parents to vfs_link(), vfs_rename(),
     and vfs_unlink()

   - Make vfs_create(), vfs_mknod(), and vfs_symlink() break delegations
     on parent directory

   - Clean up argument list for vfs_create()

   - Expose delegation support to userland

  Filelock changes:

   - Make lease_alloc() take a flags argument

   - Rework the __break_lease API to use flags

   - Add struct delegated_inode

   - Push the S_ISREG check down to ->setlease handlers

   - Lift the ban on directory leases in generic_setlease

  NFSD changes:

   - Allow filecache to hold S_IFDIR files

   - Allow DELEGRETURN on directories

   - Wire up GET_DIR_DELEGATION handling

  Fixes:

   - Fix kernel-doc warnings in __fcntl_getlease

   - Add needed headers for new struct delegation definition"

* tag 'vfs-6.19-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  vfs: add needed headers for new struct delegation definition
  filelock: __fcntl_getlease: fix kernel-doc warnings
  vfs: expose delegation support to userland
  nfsd: wire up GET_DIR_DELEGATION handling
  nfsd: allow DELEGRETURN on directories
  nfsd: allow filecache to hold S_IFDIR files
  filelock: lift the ban on directory leases in generic_setlease
  vfs: make vfs_symlink break delegations on parent dir
  vfs: make vfs_mknod break delegations on parent directory
  vfs: make vfs_create break delegations on parent directory
  vfs: clean up argument list for vfs_create()
  vfs: break parent dir delegations in open(..., O_CREAT) codepath
  vfs: allow rmdir to wait for delegation break on parent
  vfs: allow mkdir to wait for delegation break on parent
  vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  filelock: push the S_ISREG check down to ->setlease handlers
  filelock: add struct delegated_inode
  filelock: rework the __break_lease API to use flags
  filelock: make lease_alloc() take a flags argument
2025-12-01 15:34:41 -08:00
Linus Torvalds 212c4053a1 vfs-6.19-rc1.coredump
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 oji0AQC5jl35xh04fJKB343InVAxtRFp8mSkJJ9Bx6x7xA7a+QEAiBMxYilUgYIW
 bZMcI5LU+gNO/1y076QkVt84jTUQLww=
 =WIBZ
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfd and coredump updates from Christian Brauner:
 "Features:

   - Expose coredump signal via pidfd

     Expose the signal that caused the coredump through the pidfd
     interface. The recent changes to rework coredump handling to rely
     on unix sockets are in the process of being used in systemd. The
     previous systemd coredump container interface requires the coredump
     file descriptor and basic information including the signal number
     to be sent to the container. This means the signal number needs to
     be available before sending the coredump to the container.

   - Add supported_mask field to pidfd

     Add a new supported_mask field to struct pidfd_info that indicates
     which information fields are supported by the running kernel. This
     allows userspace to detect feature availability without relying on
     error codes or kernel version checks.

  Cleanups:

   - Drop struct pidfs_exit_info and prepare to drop exit_info pointer,
     simplifying the internal publication mechanism for exit and
     coredump information retrievable via the pidfd ioctl

   - Use guard() for task_lock in pidfs

   - Reduce wait_pidfd lock scope

   - Add missing PIDFD_INFO_SIZE_VER1 constant

   - Add missing BUILD_BUG_ON() assert on struct pidfd_info

  Fixes:

   - Fix PIDFD_INFO_COREDUMP handling

  Selftests:

   - Split out coredump socket tests and common helpers into separate
     files for better organization

   - Fix userspace coredump client detection issues

   - Handle edge-triggered epoll correctly

   - Ignore ENOSPC errors in tests

   - Add debug logging to coredump socket tests, socket protocol tests,
     and test helpers

   - Add tests for PIDFD_INFO_COREDUMP_SIGNAL

   - Add tests for supported_mask field

   - Update pidfd header for selftests"

* tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  pidfs: reduce wait_pidfd lock scope
  selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test
  selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test
  selftests/coredump: ignore ENOSPC errors
  selftests/coredump: add debug logging to coredump socket protocol tests
  selftests/coredump: add debug logging to coredump socket tests
  selftests/coredump: add debug logging to test helpers
  selftests/coredump: handle edge-triggered epoll correctly
  selftests/coredump: fix userspace coredump client detection
  selftests/coredump: fix userspace client detection
  selftests/coredump: split out coredump socket tests
  selftests/coredump: split out common helpers
  selftests/pidfd: add second supported_mask test
  selftests/pidfd: add first supported_mask test
  selftests/pidfd: update pidfd header
  pidfs: expose coredump signal
  pidfs: drop struct pidfs_exit_info
  pidfs: prepare to drop exit_info pointer
  pidfd: add a new supported_mask field
  pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info
  ...
2025-12-01 10:17:39 -08:00
Linus Torvalds 415d34b92c namespace-6.19-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 ooKwAP4kR5kMjHlthf8jHmmCjVU3nQFO9hUZsIQL9gFJLOIQMAD+LLoTaq1WJufl
 oSgZpREXZVmI1TK61eR6EZMB1YikGAo=
 =TExi
 -----END PGP SIGNATURE-----

Merge tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull namespace updates from Christian Brauner:
 "This contains substantial namespace infrastructure changes including a new
  system call, active reference counting, and extensive header cleanups.
  The branch depends on the shared kbuild branch for -fms-extensions support.

  Features:

   - listns() system call

     Add a new listns() system call that allows userspace to iterate
     through namespaces in the system. This provides a programmatic
     interface to discover and inspect namespaces, addressing
     longstanding limitations:

     Currently, there is no direct way for userspace to enumerate
     namespaces. Applications must resort to scanning /proc/*/ns/ across
     all processes, which is:
      - Inefficient - requires iterating over all processes
      - Incomplete - misses namespaces not attached to any running
        process but kept alive by file descriptors, bind mounts, or
        parent references
      - Permission-heavy - requires access to /proc for many processes
      - No ordering or ownership information
      - No filtering per namespace type

     The listns() system call solves these problems:

       ssize_t listns(const struct ns_id_req *req, u64 *ns_ids,
                      size_t nr_ns_ids, unsigned int flags);

       struct ns_id_req {
             __u32 size;
             __u32 spare;
             __u64 ns_id;
             struct /* listns */ {
                     __u32 ns_type;
                     __u32 spare2;
                     __u64 user_ns_id;
             };
       };

     Features include:
      - Pagination support for large namespace sets
      - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.)
      - Filtering by owning user namespace
      - Permission checks respecting namespace isolation

   - Active Reference Counting

     Introduce an active reference count that tracks namespace
     visibility to userspace. A namespace is visible in the following
     cases:
      - The namespace is in use by a task
      - The namespace is persisted through a VFS object (namespace file
        descriptor or bind-mount)
      - The namespace is a hierarchical type and is the parent of child
        namespaces

     The active reference count does not regulate lifetime (that's still
     done by the normal reference count) - it only regulates visibility
     to namespace file handles and listns().

     This prevents resurrection of namespaces that are pinned only for
     internal kernel reasons (e.g., user namespaces held by
     file->f_cred, lazy TLB references on idle CPUs, etc.) which should
     not be accessible via (1)-(3).

   - Unified Namespace Tree

     Introduce a unified tree structure for all namespaces with:
      - Fixed IDs assigned to initial namespaces
      - Lookup based solely on inode number
      - Maintained list of owned namespaces per user namespace
      - Simplified rbtree comparison helpers

   Cleanups

    - Header Reorganization:
      - Move namespace types into separate header (ns_common_types.h)
      - Decouple nstree from ns_common header
      - Move nstree types into separate header
      - Switch to new ns_tree_{node,root} structures with helper functions
      - Use guards for ns_tree_lock

   - Initial Namespace Reference Count Optimization
      - Make all reference counts on initial namespaces a nop to avoid
        pointless cacheline ping-pong for namespaces that can never go
        away
      - Drop custom reference count initialization for initial namespaces
      - Add NS_COMMON_INIT() macro and use it for all namespaces
      - pid: rely on common reference count behavior

   - Miscellaneous Cleanups
      - Rename exit_task_namespaces() to exit_nsproxy_namespaces()
      - Rename is_initial_namespace() and make argument const
      - Use boolean to indicate anonymous mount namespace
      - Simplify owner list iteration in nstree
      - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly
      - nsfs: use inode_just_drop()
      - pidfs: raise DCACHE_DONTCACHE explicitly
      - pidfs: simplify PIDFD_GET__NAMESPACE ioctls
      - libfs: allow to specify s_d_flags
      - cgroup: add cgroup namespace to tree after owner is set
      - nsproxy: fix free_nsproxy() and simplify create_new_namespaces()

  Fixes:

   - setns(pidfd, ...) race condition

     Fix a subtle race when using pidfds with setns(). When the target
     task exits after prepare_nsset() but before commit_nsset(), the
     namespace's active reference count might have been dropped. If
     setns() then installs the namespaces, it would bump the active
     reference count from zero without taking the required reference on
     the owner namespace, leading to underflow when later decremented.

     The fix resurrects the ownership chain if necessary - if the caller
     succeeded in grabbing passive references, the setns() should
     succeed even if the target task exits or gets reaped.

   - Return EFAULT on put_user() error instead of success

   - Make sure references are dropped outside of RCU lock (some
     namespaces like mount namespace sleep when putting the last
     reference)

   - Don't skip active reference count initialization for network
     namespace

   - Add asserts for active refcount underflow

   - Add asserts for initial namespace reference counts (both passive
     and active)

   - ipc: enable is_ns_init_id() assertions

   - Fix kernel-doc comments for internal nstree functions

   - Selftests
      - 15 active reference count tests
      - 9 listns() functionality tests
      - 7 listns() permission tests
      - 12 inactive namespace resurrection tests
      - 3 threaded active reference count tests
      - commit_creds() active reference tests
      - Pagination and stress tests
      - EFAULT handling test
      - nsid tests fixes"

* tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits)
  pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls
  nstree: fix kernel-doc comments for internal functions
  nsproxy: fix free_nsproxy() and simplify create_new_namespaces()
  selftests/namespaces: fix nsid tests
  ns: drop custom reference count initialization for initial namespaces
  pid: rely on common reference count behavior
  ns: add asserts for initial namespace active reference counts
  ns: add asserts for initial namespace reference counts
  ns: make all reference counts on initial namespace a nop
  ipc: enable is_ns_init_id() assertions
  fs: use boolean to indicate anonymous mount namespace
  ns: rename is_initial_namespace()
  ns: make is_initial_namespace() argument const
  nstree: use guards for ns_tree_lock
  nstree: simplify owner list iteration
  nstree: switch to new structures
  nstree: add helper to operate on struct ns_tree_{node,root}
  nstree: move nstree types into separate header
  nstree: decouple from ns_common header
  ns: move namespace types into separate header
  ...
2025-12-01 09:47:41 -08:00
Michael S. Tsirkin 205dd7a5d6 virtio_pci: drop kernel.h
virtio UAPI headers really have no business pulling in kernel.h
Replace it with const.h which seems to be what's needed
for __KERNEL_DIV_ROUND_UP.

Fixes: 7c1ae151e8 ("virtio_pci: Introduce device parts access commands")
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Message-ID: <7a73b6c6af67e13b86633cd7bf11ad56b5d9809b.1763535341.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-30 18:02:43 -05:00
Randy Dunlap 414690746d i2c: i2c.h: fix a bad kernel-doc line
Change an empty line into a blank kernel-doc line to prevent
a kernel-doc warning:

Warning: ../include/uapi/linux/i2c.h:38 bad line:

Fixes: bfb3939c51 ("i2c: refactor documentation of struct i2c_msg")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-11-29 21:39:58 +09:00