Add a new VM_BIND flag, DRM_XE_VM_BIND_FLAG_DECOMPRESS, that lets userspace
express intent for the driver to perform on-device in-place decompression
for the GPU mapping created by a MAP bind operation.
This flag is used by subsequent driver changes to trigger scheduling of
GPU work that resolves compressed VRAM pages into an uncompressed PAT
VM mapping.
Behavior and semantics:
- Valid only for DRM_XE_VM_BIND_OP_MAP. IOCTLs using this flag on other ops
are rejected (-EINVAL).
- The bind's pat_index must select the device "no-compression" PAT entry;
otherwise the ioctl is rejected (-EINVAL).
- Only meaningful for VRAM-backed BOs on devices that support Flat CCS and
the required hardware generation (driver will return -EOPNOTSUPP if not).
- On success the driver schedules a migrate/resolve and installs the
returned dma_fence into the BO's kernel reservation
(DMA_RESV_USAGE_KERNEL).
Compute PR: https://github.com/intel/compute-runtime/pull/898
v3: Rebase on latest drm-tip and add compute pr info
v2: Add kernel doc (Matt)
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mrozek, Michal <michal.mrozek@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260304123758.3050386-6-nitin.r.gote@intel.com
The AMD PMF driver provides realtime column utilization (npu_busy)
metrics for the NPU. Extend the DRM_IOCTL_AMDXDNA_GET_INFO sensor
query to expose these metrics to userspace.
Add AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION to the sensor type enum
and update aie2_get_sensors() to return both the total power and up
to 8 column utilization sensors if the user buffer permits.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
[lizhi: support legacy tool which uses small buffer. checkpatch cleanup]
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260311171842.473453-1-lizhi.hou@amd.com
Similar to i915's commit cebc13de7e
("drm/i915: Whitelist COMMON_SLICE_CHICKEN3 for UMD access"), except
that instead of putting the register on the allowlist for UMD to
program, the KMD is doing the programming at context initialization
based on a queue creation flag.
This is a recommended tuning setting for both gen12 and Xe_HP
platforms.
If a render queue is created with
DRM_XE_EXEC_QUEUE_SET_STATE_CACHE_PERF_FIX, COMMON_SLICE_CHICKEN3 will
be programmed at initialization to enable the render color cache to
key with BTP+BTI (binding table pool + binding table entry) instead of
just BTI (binding table entry). This enables the UMD to avoid emitting
render-target-cache-flush + stall-at-pixel-scoreboard every time a
binding table entry pointing to a render target is changed.
v2: Use xe_lrc_write_ring()
v3: Update xe_query.c to report availability
v4: Rename defines to add DISABLE_
v5: update commit message
v6: rebase
Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982
Bspec: 73993, 73994, 72161, 31870, 68331
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patch.msgid.link/20260306075504.1288676-1-lionel.g.landwerlin@intel.com
amdgpu:
- FAMS2 updates
- Refactor DC I2C
- Rework ttm handling to allow for multiple engines
- UserQ updates
- Ring reset improvements
- DC DCE 6.x cleanups
- DC support for NUTMEG and TRAVIS DP bridges
- Enable DC by default on CIK APUs
- Add DCN 4.2 support
- IPS fixes
- Overlay fixes for DCN4
- SDMA Limit updates
- Misc fixes
- RAS updates
- Register access callback rework
- GC 12.1 updates
amdkfd:
- Misc cleanups
UAPI:
- UserQ fence IOCTL parameter size fixes. The change is backwards compatible on LE, but not BE.
UserQs are still not considered stable and are disabled by default.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaaikJwAKCRC93/aFa7yZ
2CklAQDWTLGzAuei2qob6wV4dcbxi+8eIB7uy4wzppMYen7WkgEA9oChRFo8eOAo
SAlAx59UnuGYDEdQTlQUfP/1BSLZ6AM=
=Cvp8
-----END PGP SIGNATURE-----
Merge tag 'amd-drm-next-7.1-2026-03-04' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-7.1-2026-03-04:
amdgpu:
- FAMS2 updates
- Refactor DC I2C
- Rework ttm handling to allow for multiple engines
- UserQ updates
- Ring reset improvements
- DC DCE 6.x cleanups
- DC support for NUTMEG and TRAVIS DP bridges
- Enable DC by default on CIK APUs
- Add DCN 4.2 support
- IPS fixes
- Overlay fixes for DCN4
- SDMA Limit updates
- Misc fixes
- RAS updates
- Register access callback rework
- GC 12.1 updates
amdkfd:
- Misc cleanups
UAPI:
- UserQ fence IOCTL parameter size fixes. The change is backwards compatible on LE, but not BE.
UserQs are still not considered stable and are disabled by default.
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260304213233.1938311-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmqPTAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpueIEACHF0uws+uZSEsy9LUyC9ha8+5YN9szIJ3K
QUGxa9pmCnQG5K50KpxyEYP6buaJDy1smJGgD2obkeGncC4w6xKK2kQTQV1U+C1C
YA+7B/3HLhz5AWS6GIbRy6VZ599I4evlF8W79dX8BTnF8Y1ddkSuUnKx//q0AoQZ
hr3foglcFlchy8JuQ2/MpxzfOouvNMdMmeUN4O+t8iXDrmePFYIOgrLcT+ObgC5D
SXWx2cc3hMJ35hcSzedMWEBFcXnkX9nfh8Hd/+uPRcKsIwS8kCo6z01GoT/BCPRA
jdrxAfoYSL16HPfq6GU52n6iCaRd+5NK+tt/ECCzTxGL32Hadrr+nxw4O7g3Q96u
07zeaqHSoTGUchtlqrGjALQLP2yxdACEjxMh3rfdStRv3x3bbbVVDdioVEzPukCr
EBA+AbqaaG3LIYXwcY+15zx5NrAfeBAP1RjLgoV0s2ch4ghEqvnZGY4NLBDkcQ2R
97tM9+OdecBrsnlQr5GBoDbwpqc2pDEqSjkYDwoXqvqXs0DrMRq2MQw1Hjjh7Z7G
FZx1KNTiLB/YQ0sSyMcUKnH+qBA0FxwN/C6dDnRjj4dH5YsoeG/GhsS3B00a+0yE
S3MKrsf+uN21OYLVPSTEN6qS+02ZvK6E/Aw7/fk2IV60JMeM5KvCccmxa53dKls8
iyEJ7nVLOg==
=xyKA
-----END PGP SIGNATURE-----
Merge tag 'io_uring-7.0-20260305' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Fix a typo in the mock_file help text
- Fix a comment regarding IORING_SETUP_TASKRUN_FLAG in the
io_uring.h UAPI header
- Use READ_ONCE() for reading refill queue entries
- Reject SEND_VECTORIZED for fixed buffer sends, as it isn't
implemented. Currently this flag is silently ignored
This is in preparation for making these work, but first we
need a fixup so that older kernels will correctly reject them
- Ensure "0" means default for the rx page size
* tag 'io_uring-7.0-20260305' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/zcrx: use READ_ONCE with user shared RQEs
io_uring/mock: Fix typo in help text
io_uring/net: reject SEND_VECTORIZED when unsupported
io_uring: correct comment for IORING_SETUP_TASKRUN_FLAG
io_uring/zcrx: don't set rx_page_size when not requested
for amdxdna, a DSI clock rate fix for rz-du, a uapi fix for syncobj, a
possible build failure fix for dma-buf, a doc warning fix for sched, a
build failure fix for ttm tests, and a crash fix when suspended for
nouveau.
-----BEGIN PGP SIGNATURE-----
iJUEABMJAB0WIQTkHFbLp4ejekA/qfgnX84Zoj2+dgUCaak6GgAKCRAnX84Zoj2+
dkHpAX91/gbgY5FDu7va/7Ybo3oH/YvZOIQsbOz0sfJsjnszyKT3Wh4MGM8QphlI
93YHoi8Bf2M++H1mQgFrm97kjISmjgZYufM+6Cy92oqMO/SKCxiHTCBRnTBxas1B
CXek10L1Pg==
=jxDJ
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2026-03-05' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A return type fix for ttm, a display fix for solomon, several misc fixes
for amdxdna, a DSI clock rate fix for rz-du, a uapi fix for syncobj, a
possible build failure fix for dma-buf, a doc warning fix for sched, a
build failure fix for ttm tests, and a crash fix when suspended for
nouveau.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260305-ludicrous-quirky-raven-7cdafd@houat
Introduces the DRM RAS infrastructure over generic netlink.
The new interface allows drivers to expose RAS nodes and their
associated error counters to userspace in a structured and extensible
way. Each drm_ras node can register its own set of error counters, which
are then discoverable and queryable through netlink operations. This
lays the groundwork for reporting and managing hardware error states
in a unified manner across different DRM drivers.
Currently it only supports error-counter nodes. But it can be
extended later.
The registration is also not tied to any drm node, so it can be
used by accel devices as well.
It uses the new and mandatory YAML description format stored in
Documentation/netlink/specs/. This forces a single generic netlink
family namespace for the entire drm: "drm-ras".
But multiple-endpoints are supported within the single family.
Any modification to this API needs to be applied to
Documentation/netlink/specs/drm_ras.yaml before regenerating the
code:
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml --mode uapi --header \
-o include/uapi/drm/drm_ras.h
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml --mode kernel \
--header -o drivers/gpu/drm/drm_ras_nl.h
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml \
--mode kernel --source -o drivers/gpu/drm/drm_ras_nl.c
Cc: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Co-developed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/20260304074412.464435-8-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
include/uapi/linux/dma-buf.h uses several macros from ioctl.h to define
its ioctl commands. However, it does not include ioctl.h itself. So,
if userspace source code tries to include the dma-buf.h file without
including ioctl.h, it can result in build failures.
Therefore, include ioctl.h in the dma-buf UAPI header.
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20260303002309.1401849-1-isaacmanjarres@google.com
update the type for num_syncobj_handles from __u32 to _u16 with
required padding.
This breaks the UAPI for big-endian platforms but this is deliberate
and harmless since userqueues is still a beta feature. It is enabled
via module parameter and need the right fw support to work.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update the type for num_syncobj_handles from __u64 to _u16 with
required padding.
This breaks the UAPI for big-endian platforms but this is deliberate
and harmless since userqueues is still a beta feature. It is enabled
via module parameter and need the right fw support to work.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UAPI Changes:
connector:
- Add panel_type property
fourcc:
- Add ARM interleaved 64k modifier
nouveau:
- Query Z-Cull info with DRM_IOCTL_NOUVEAU_GET_ZCULL_INFO
Cross-subsystem Changes:
coreboot:
- Clean up coreboot framebuffer support
dma-buf:
- Provide revoke mechanism for shared buffers
- Rename move_notify callback to invalidate_mappings and update users.
- Always enable move_notify
- Support dma_fence_was_initialized() test
- Protect dma_fence_ops by RCU and improve locking
- Fix sparse warnings
Core Changes:
atomic:
- Allocate drm_private_state via callback and convert drivers
atomic-helper:
- Use system_percpu_wq
buddy:
- Make buddy allocator available to all DRM drivers
- Document flags and structures
colorop:
- Add destroy helper and convert drivers
fbdev-emulation:
- Clean up
gem:
- Fix drm_gem_objects_lookup() error cleanup
Driver Changes:
amdgpu:
- Set panel_type to OELD for eDP
atmel-hlcdc:
- Support sana5d65 LCD controller
bridge:
- anx7625: Support USB-C plus DT bindings
- connector: Fix EDID detection
- dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve others
- fsl-ldb: Fix visual artifacts plus related DT property 'enable-termination-resistor'
- imx8qxp-pixel-link: Improve bridge reference handling
- lt9611: Support Port-B-only input plus DT bindings
- tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up
- Support TH1520 HDMI plus DT bindings
- Clean up
imagination:
- Clean up
komeda:
- Fix integer overflow in AFBC checks
mcde:
- Improve bridge handling
nouveau:
- Provide Z-cull info to user space
- gsp: Support GA100
- Shutdown on PCI device shutdown
- Clean up
panel:
- panel-jdi-lt070me05000: Use mipi-dsi multi functions
- panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN N116BCL-EAK (C2); Support FriendlyELEC plus DT changes
- Fix Kconfig dependencies
panthor:
- Add tracepoints for power and IRQs
rcar-du:
- dsi: fix VCLK calculation
rockchip:
- vop2: Use drm_ logging functions
- Support DisplayPort on RK3576
sysfb:
- corebootdrm: Support system framebuffer on coreboot firmware; detect orientation
- Clean up pixel-format lookup
sun4i:
- Clean up
tilcdc:
- Use DT bindings scheme
- Use managed DRM interfaces
- Support DRM_BRIDGE_ATTACH_NO_CONNECTOR
- Clean up a lot of obsolete code
v3d:
- Clean up
vc4:
- Use system_percpu_wq
- Clean up
verisilicon:
- Support DC8200 plus DT bindings
virtgpu:
- Support PRIME imports with enabled 3D
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmmgV6IbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJEGgNwR1TC3oj3VsH/0xa8RvMN/B882kLA9N2
Dz9MR3sD3gCKiO3kzLTWncMnvI3qpSZCD/mR86+h7sEh5qiFFSiziZCCEF+BFkpB
UHK4mawDhMpOQiW+HhTR9V110WVrqtW8jVVg0db9XBE8y7xtj/lV3hv+/1N7WCkg
9fuxPkXby5cnW6aA8lPYzIdILFknwNVrzYCiPFmc4bXM5uynD0ft7KGR0dgViBIh
e6IQ+2POz0N0oVOdKGu10wvBQXKOznHpDPRK8844QrLb8CD/W8/mlifeqESDEsM1
TaPFdupXNdt0jdE2Jb+R+kGWfhM9T+CrtgaQ9LBNTDD8/RO2NrQIxDSYvnKw0DzQ
jlE=
=5H3Q
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2026-02-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v7.1:
UAPI Changes:
connector:
- Add panel_type property
fourcc:
- Add ARM interleaved 64k modifier
nouveau:
- Query Z-Cull info with DRM_IOCTL_NOUVEAU_GET_ZCULL_INFO
Cross-subsystem Changes:
coreboot:
- Clean up coreboot framebuffer support
dma-buf:
- Provide revoke mechanism for shared buffers
- Rename move_notify callback to invalidate_mappings and update users.
- Always enable move_notify
- Support dma_fence_was_initialized() test
- Protect dma_fence_ops by RCU and improve locking
- Fix sparse warnings
Core Changes:
atomic:
- Allocate drm_private_state via callback and convert drivers
atomic-helper:
- Use system_percpu_wq
buddy:
- Make buddy allocator available to all DRM drivers
- Document flags and structures
colorop:
- Add destroy helper and convert drivers
fbdev-emulation:
- Clean up
gem:
- Fix drm_gem_objects_lookup() error cleanup
Driver Changes:
amdgpu:
- Set panel_type to OELD for eDP
atmel-hlcdc:
- Support sana5d65 LCD controller
bridge:
- anx7625: Support USB-C plus DT bindings
- connector: Fix EDID detection
- dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve others
- fsl-ldb: Fix visual artifacts plus related DT property 'enable-termination-resistor'
- imx8qxp-pixel-link: Improve bridge reference handling
- lt9611: Support Port-B-only input plus DT bindings
- tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up
- Support TH1520 HDMI plus DT bindings
- Clean up
imagination:
- Clean up
komeda:
- Fix integer overflow in AFBC checks
mcde:
- Improve bridge handling
nouveau:
- Provide Z-cull info to user space
- gsp: Support GA100
- Shutdown on PCI device shutdown
- Clean up
panel:
- panel-jdi-lt070me05000: Use mipi-dsi multi functions
- panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN N116BCL-EAK (C2); Support FriendlyELEC plus DT changes
- Fix Kconfig dependencies
panthor:
- Add tracepoints for power and IRQs
rcar-du:
- dsi: fix VCLK calculation
rockchip:
- vop2: Use drm_ logging functions
- Support DisplayPort on RK3576
sysfb:
- corebootdrm: Support system framebuffer on coreboot firmware; detect orientation
- Clean up pixel-format lookup
sun4i:
- Clean up
tilcdc:
- Use DT bindings scheme
- Use managed DRM interfaces
- Support DRM_BRIDGE_ATTACH_NO_CONNECTOR
- Clean up a lot of obsolete code
v3d:
- Clean up
vc4:
- Use system_percpu_wq
- Clean up
verisilicon:
- Support DC8200 plus DT bindings
virtgpu:
- Support PRIME imports with enabled 3D
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260226143615.GA47200@linux.fritz.box
- Fix zero_vruntime tracking when there's a single task running
- Fix slice protection logic
- Fix the ->vprot logic for reniced tasks
- Fix lag clamping in mixed slice workloads
- Fix objtool uaccess warning (and bug) in the
!CONFIG_RSEQ_SLICE_EXTENSION case caused by unexpected
un-inlining, which triggers with older compilers
- Fix a comment in the rseq registration rseq_size bound check code
- Fix a legacy RSEQ ABI quirk that handled 32-byte area sizes
differently, which special size we now reached naturally and
want to avoid. The visible ugliness of the new reserved field
will be avoided the next time the RSEQ area is extended.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmkAl0RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hdsg/+IdQpmtEXsugE1FqEuuptm0ld6hcFI9WC
mlEXhid5Gq3a3KMBv0CLd73o+k8Ju/BDEdbfLMzY8A9h8OxfnuUL1T6Jt4q7dF1h
76ja1R+i+GNFcXWmSG8z6FUns4bRBJeNWFs3dzCFE9N2qOCCj1xBr/9BqgKvNVfZ
cbcaiMvmi3z/vPUmT8hdMdEcA0Zo2gVcKDmny4Tca9sigyLZD8FqtW1FhqL1HX8H
Cx8fZ2lkD2z6gKtOAbC3QuWVmP88tvZldaMsGHTAQIa14PP5h2xhyuLxBF1Zjnwy
aWl4iYr6ILu3LRi54CmQOiESdEf3Srdbl8JxDcvU9vh8ecqXvDGPUB2xCszlPvOx
R+scskNgNyd1WtUF2VYFLTNkj0B7Xe6eTYfIu2d5r8GrRt0YjRzsK/JQallAkV6V
KORDm4/Xyl5Ss6tNtfZP7lpHD2qykscRGxgr0HjjJCyjA1ZNtGc1A+JKZ8D8q9Nq
rxEbaa65KfAtYJ4i5j9goFPQwNeHXm/emToVzEfyKwZHs3ns0LwffDGSFFOYSm/p
FVVmi9iSoxRvRFHBflvBIwFaCnIyBLTJZlB/Bp8MVaFnv+6OzdE/nfcKNaYqcVaT
mzCpY2DFTx5KISmJR7DAWsPntoRV6WPcxVApWicTaT5G3C2TLvvTAEq8g2WIYDFB
j6oNyEkX/Xw=
=Nxqx
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix zero_vruntime tracking when there's a single task running
- Fix slice protection logic
- Fix the ->vprot logic for reniced tasks
- Fix lag clamping in mixed slice workloads
- Fix objtool uaccess warning (and bug) in the
!CONFIG_RSEQ_SLICE_EXTENSION case caused by unexpected un-inlining,
which triggers with older compilers
- Fix a comment in the rseq registration rseq_size bound check code
- Fix a legacy RSEQ ABI quirk that handled 32-byte area sizes
differently, which special size we now reached naturally and want to
avoid. The visible ugliness of the new reserved field will be avoided
the next time the RSEQ area is extended.
* tag 'sched-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: slice ext: Ensure rseq feature size differs from original rseq size
rseq: Clarify rseq registration rseq_size bound check comment
sched/core: Fix wakeup_preempt's next_class tracking
rseq: Mark rseq_arm_slice_extension_timer() __always_inline
sched/fair: Fix lag clamp
sched/eevdf: Update se->vprot in reweight_entity()
sched/fair: Only set slice protection at pick time
sched/fair: Fix zero_vruntime tracking
Sync with a recent liburing fix, which corrects the comment explaining
when the IORING_SETUP_TASKRUN_FLAG setup flag is valid to use. May be
use with COOP_TASKRUN or DEFER_TASKRUN, not useful without either of
this task_work mechanisms being used.
Link: https://github.com/axboe/liburing/pull/1543
Signed-off-by: Jens Axboe <axboe@kernel.dk>
fb82437fdd ("PCI: Change capability register offsets to hex") incorrectly
converted the PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 value from decimal 52 to hex
0x32:
-#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 52 /* v2 endpoints with link end here */
+#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 0x32 /* end of v2 EPs w/ link */
This broke PCI capabilities in a VMM because subsequent ones weren't
DWORD-aligned.
Change PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 to the correct value of 0x34.
fb82437fdd was from Baruch Siach <baruch@tkos.co.il>, but this was not
Baruch's fault; it's a mistake I made when applying the patch.
Fixes: fb82437fdd ("PCI: Change capability register offsets to hex")
Reported-by: David Woodhouse <dwmw2@infradead.org>
Closes: https://lore.kernel.org/all/3ae392a0158e9d9ab09a1d42150429dd8ca42791.camel@infradead.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Some compute applications may try to allocate device memory to probe
how much device memory is actually available, assuming that the
application will be the only one running on the particular GPU.
That strategy fails in fault mode since it allows VM overcommit.
While this could be resolved in user-space it's further complicated
by cgroups potentially restricting the amount of memory available
to the application.
Introduce a vm create flag, DRM_XE_VM_CREATE_NO_VM_OVERCOMMIT, that
allows fault mode to mimic the behaviour of !fault mode WRT this. It
blocks evicting same vm bos during VM_BIND processing. However,
it does *not* block evicting same-vm bos during pagefault
processing, preferring eviction rather than VM banning in
OOM situations.
Cc: John Falkowski <john.falkowski@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260204153320.17989-1-thomas.hellstrom@linux.intel.com
Add kernel-side support for using the zcull hardware in nvidia gpus.
zcull aims to improve memory bandwidth by using an early approximate
depth test, similar to hierarchical Z on an AMD card.
Add a new ioctl that exposes zcull information that has been read
from the hardware. Userspace uses each of these parameters either
in a heuristic for determining zcull region parameters or in the
calculation of a buffer size.
It appears the hardware hasn't changed its structure for these
values since FERMI_C (circa 2011), so the assumption is that it
won't change on us too quickly, and is therefore reasonable to
include in UAPI.
This bypasses the nvif layer and instead accesses nvkm_gr directly,
which mirrors existing usage of nvkm_gr_units(). There is no nvif
object for nvkm_gr yet, and adding one is not trivial.
Signed-off-by: Mel Henning <mhenning@darkrefraction.com>
Link: https://patch.msgid.link/20260219-zcull3-v3-2-dbe6a716f104@darkrefraction.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Before rseq became extensible, its original size was 32 bytes even
though the active rseq area was only 20 bytes. This had the following
impact in terms of userspace ecosystem evolution:
* The GNU libc between 2.35 and 2.39 expose a __rseq_size symbol set
to 32, even though the size of the active rseq area is really 20.
* The GNU libc 2.40 changes this __rseq_size to 20, thus making it
express the active rseq area.
* Starting from glibc 2.41, __rseq_size corresponds to the
AT_RSEQ_FEATURE_SIZE from getauxval(3).
This means that users of __rseq_size can always expect it to
correspond to the active rseq area, except for the value 32, for
which the active rseq area is 20 bytes.
Exposing a 32 bytes feature size would make life needlessly painful
for userspace. Therefore, add a reserved field at the end of the
rseq area to bump the feature size to 33 bytes. This reserved field
is expected to be replaced with whatever field will come next,
expecting that this field will be larger than 1 byte.
The effect of this change is to increase the size from 32 to 64 bytes
before we actually have fields using that memory.
Clarify the allocation size and alignment requirements in the struct
rseq uapi comment.
Change the value returned by getauxval(AT_RSEQ_ALIGN) to return the
value of the active rseq area size rounded up to next power of 2, which
guarantees that the rseq structure will always be aligned on the nearest
power of two large enough to contain it, even as it grows. Change the
alignment check in the rseq registration accordingly.
This will minimize the amount of ABI corner-cases we need to document
and require userspace to play games with. The rule stays simple when
__rseq_size != 32:
#define rseq_field_available(field) (__rseq_size >= offsetofend(struct rseq_abi, field))
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260220200642.1317826-3-mathieu.desnoyers@efficios.com
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmmWuQwTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXnnHB/41Jji+y8FHe2SqpQhUOqHb6NDEr3GX
YpAybhz2IsBHVhbCQn789UiIcSr0UDR7wnVLAmXe+5eY/jRwNggIO3tFqLYn92pK
KSTNafgNbLxh3iKBxRsUy0b3JutjD2LytkpFj2KVbBsZfmRxCZmKIV/4V18rV+fA
uemvoqLwU7emEWkhZ24suHMHPVpv6xKs9O6gOrQ4+zXR0g//eMLDqb17uj8h+8sM
ZsPsMYeuOihXlvGeBRjbnWYjA1ODWGDvwR9VT+VU4+HWht/KSr15EGeXZdV2eZUt
e/8swbqOS94a2ZjOgStzVkcPqAF88t9zZ+gvYElTDzLlHjqbrZdpeDDt
=A7tT
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:
- Debugfs support for MSHV statistics (Nuno Das Neves)
- Support for the integrated scheduler (Stanislav Kinsburskii)
- Various fixes for MSHV memory management and hypervisor status
handling (Stanislav Kinsburskii)
- Expose more capabilities and flags for MSHV partition management
(Anatol Belski, Muminul Islam, Magnus Kulke)
- Miscellaneous fixes to improve code quality and stability (Carlos
López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh
Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros
Bizjak)
- PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka)
* tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (34 commits)
mshv: Handle insufficient root memory hypervisor statuses
mshv: Handle insufficient contiguous memory hypervisor status
mshv: Introduce hv_deposit_memory helper functions
mshv: Introduce hv_result_needs_memory() helper function
mshv: Add SMT_ENABLED_GUEST partition creation flag
mshv: Add nested virtualization creation flag
Drivers: hv: vmbus: Simplify allocation of vmbus_evt
mshv: expose the scrub partition hypercall
mshv: Add support for integrated scheduler
mshv: Use try_cmpxchg() instead of cmpxchg()
x86/hyperv: Fix error pointer dereference
x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV
Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
mshv: fix SRCU protection in irqfd resampler ack handler
mshv: make field names descriptive in a header struct
x86/hyperv: Update comment in hyperv_cleanup()
mshv: clear eventfd counter on irqfd shutdown
x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
...
Current release - new code bugs:
- net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT
- eth: mlx5e: XSK, Fix unintended ICOSQ change
- phy_port: correctly recompute the port's linkmodes
- vsock: prevent child netns mode switch from local to global
- couple of kconfig fixes for new symbols
Previous releases - regressions:
- nfc: nci: fix false-positive parameter validation for packet data
- net: do not delay zero-copy skbs in skb_attempt_defer_free()
Previous releases - always broken:
- mctp: ensure our nlmsg responses to user space are zero-initialised
- ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()
- fixes for ICMP rate limiting
Misc:
- intel: fix PCI device ID conflict between i40e and ipw2200
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmmXUh8ACgkQMUZtbf5S
IrufYA//ZVj+4gvegqKwKZYXNBndVW00GGTYqaILbaenK1olUVUelVB91eV2Klc/
dXCeKG/MgEPuT89IjkPzVr2Yg4x6uhjcQL1rsahORn+GuQfSI/P8y7ysDOPnHVeM
Rtsg1m8z3EizJcHPeAJe7nEqFzfvZ2m+FCEGe++z8BYaUZUVApytgpIWOHO/aB+p
t13bCNzd05XxPphMl610T00Fncj2jCVDHILMgTB5rmFmkeJuQwNrRGXQSoQame46
+g+yCZjT0eVTrBaH1EUssWfrOT3VJj3BEee6gSp7k9mxMkbW18i8shBgmxS+EHjk
u19wwBzSrHK+JY1UExim+1E/rZisQVmEE1Gs0ALedxAu9zC/Julzfa2/+BFsc0j7
QTXd4jukG3aTPIX8v3TV2Igu0j+bAT4WdpzvnsXXBMVKy7wFYMd1+aSOLyFH2W9L
qRbg50oUATcsz77bZt6YUTJEgua4HXNYGtn15FMZOR7HJVR2L44Q5TK5mQxGp5iM
GabeKMzg6bsjE98STM3nbWks3pIb9ptIk++i0913eSqKgn84bDPtp3Gabfgle2SJ
8gjKS61K8rDt5x8StXVod7oGQ4asL8RJyOtE/avgbWUu9BNH8/oKqsE6TQrpXauv
1ndiyim/mPe4fBCxkVAi2+uq5/ph9z8XyleESz9VYwyL3Rl4nsg=
=qSCj
-----END PGP SIGNATURE-----
Merge tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.
Current release - new code bugs:
- net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT
- eth: mlx5e: XSK, Fix unintended ICOSQ change
- phy_port: correctly recompute the port's linkmodes
- vsock: prevent child netns mode switch from local to global
- couple of kconfig fixes for new symbols
Previous releases - regressions:
- nfc: nci: fix false-positive parameter validation for packet data
- net: do not delay zero-copy skbs in skb_attempt_defer_free()
Previous releases - always broken:
- mctp: ensure our nlmsg responses to user space are zero-initialised
- ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()
- fixes for ICMP rate limiting
Misc:
- intel: fix PCI device ID conflict between i40e and ipw2200"
* tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: nfc: nci: Fix parameter validation for packet data
net/mlx5e: Use unsigned for mlx5e_get_max_num_channels
net/mlx5e: Fix deadlocks between devlink and netdev instance locks
net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event
net/mlx5: Fix misidentification of write combining CQE during poll loop
net/mlx5e: Fix misidentification of ASO CQE during poll loop
net/mlx5: Fix multiport device check over light SFs
bonding: alb: fix UAF in rlb_arp_recv during bond up/down
bnge: fix reserving resources from FW
eth: fbnic: Advertise supported XDP features.
rds: tcp: fix uninit-value in __inet_bind
net/rds: Fix NULL pointer dereference in rds_tcp_accept_one
octeontx2-af: Fix default entries mcam entry action
net/mlx5e: XSK, Fix unintended ICOSQ change
ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero
ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero
ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()
inet: move icmp_global_{credit,stamp} to a separate cache line
icmp: prevent possible overflow in icmp_global_allow()
selftests/net: packetdrill: add ipv4-mapped-ipv6 tests
...
Add support for HV_PARTITION_CREATION_FLAG_SMT_ENABLED_GUEST
to allow userspace VMMs to enable SMT for guest partitions.
Expose this via new MSHV_PT_BIT_SMT_ENABLED_GUEST flag in the UAPI.
Without this flag, the hypervisor schedules guest VPs incorrectly,
causing SMT unusable.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Introduce HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE to
indicate support for nested virtualization during partition creation.
This enables clearer configuration and capability checks for nested
virtualization scenarios.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Core:
- Add Frank Li as susbstem reviewer to help with reviews
New Support:
- Mediatek support for Dimensity 6300 and 9200 controller
- Qualcomm Kaanapali and Glymur GPI DMA engine support
- Synopsis DW AXI Agilex5 support
- Renesas RZ/V2N SoC support
- Atmel microchip lan9691-dma support
- Tegra ADMA tegra264 support
Updates:
- sg_nents_for_dma() helper use in subsystem
- pm_runtime_mark_last_busy() redundant call update for subsystem
- Residue support for xilinx AXIDMA driver
- Intel Max SGL Size Support and capabilities for DSA3.0
- AXI dma larger than 32bits address support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmmUkGgACgkQfBQHDyUj
g0d6uRAAnVfM6GxVt4PuRd1t+i6qeNhqZrq+8001YtFgOJp0hxPX7k9PP1F42kjp
1zrICvvdqH8gw8+8AaT2JpIZp4vENmjGdnkCo+HU6FgGPCUmkkpehPk58Y2K3r0a
LbHNjjj7V7SDGs9SzT2It+d7KfKnv1adushBhuO7xv524hwSuetw1q8CnLPoxaPx
KNWToovCp5tlHCucWQAdmd4bPsUv1mFMvlJxK4a26WKeL7lU6BeDS06rLTNq5PNZ
51sYdSvyBOSCUcFGToeebJFsKCQukryZTXTtsKMsmLvmHaTMahu2TwNzQ+PRSBSr
kZ9GpS51tet67txGzGzJRFGDY9quKFrajQ60Om6dr9aYm2xW7gEZFa0NKTlz9q7w
kbwsPgd87sYI8MDWpinAuvwUS2OXnihjqdYVp0QouJd8eRiH1pWagtjubGRVYekC
eEZjyxpz8ZD+LT2G3I0uy2FnqCkcEjSfchBCtuPcxSSSkHRXVf4tgUAI833YIdek
gtd7h+/jepcVOcVAeaMVvVYnNIhVkHQkQC1/HmZCqNoyoY/oK8JcUF3UskzP7BPW
gvEJhtFD0RBInu5UM0rS31zF+8Q9EMbDXKY2PiCCyxtsjAe5yWyeNNsUx9DN8ixv
XyclsF7javUOZSoxzH3mCLy+x51p84Mq2KGQGL9H7PK/hWDMmoo=
=nrcD
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"Core:
- Add Frank Li as susbstem reviewer to help with reviews
New Support:
- Mediatek support for Dimensity 6300 and 9200 controller
- Qualcomm Kaanapali and Glymur GPI DMA engine
- Synopsis DW AXI Agilex5
- Renesas RZ/V2N SoC
- Atmel microchip lan9691-dma
- Tegra ADMA tegra264
Updates:
- sg_nents_for_dma() helper use in subsystem
- pm_runtime_mark_last_busy() redundant call update for subsystem
- Residue support for xilinx AXIDMA driver
- Intel Max SGL Size Support and capabilities for DSA3.0
- AXI dma larger than 32bits address support"
* tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (64 commits)
dmaengine: add Frank Li as reviewer
dt-bindings: dma: qcom,gpi: Update max interrupts lines to 16
dmaengine: fsl-edma: don't explicitly disable clocks in .remove()
dmaengine: xilinx: xdma: use sg_nents_for_dma() helper
dmaengine: sh: use sg_nents_for_dma() helper
dmaengine: sa11x0: use sg_nents_for_dma() helper
dmaengine: qcom: bam_dma: use sg_nents_for_dma() helper
dmaengine: qcom: adm: use sg_nents_for_dma() helper
dmaengine: pxa-dma: use sg_nents_for_dma() helper
dmaengine: lgm: use sg_nents_for_dma() helper
dmaengine: k3dma: use sg_nents_for_dma() helper
dmaengine: dw-axi-dmac: use sg_nents_for_dma() helper
dmaengine: bcm2835-dma: use sg_nents_for_dma() helper
dmaengine: axi-dmac: use sg_nents_for_dma() helper
dmaengine: altera-msgdma: use sg_nents_for_dma() helper
scatterlist: introduce sg_nents_for_dma() helper
dmaengine: idxd: Add Max SGL Size Support for DSA3.0
dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
dmaengine: sh: rz-dmac: Make channel irq local
dmaengine: pl08x: Fix comment stating the difference between PL080 and PL081
...
Here is the big set of char/misc/iio and other smaller driver subsystem
changes for 7.0-rc1. Lots of little things in here, including:
- Loads of iio driver changes and updates and additions
- gpib driver updates
- interconnect driver updates
- i3c driver updates
- hwtracing (coresight and intel) driver updates
- deletion of the obsolete mwave driver
- binder driver updates (rust and c versions)
- mhi driver updates (causing a merge conflict, see below)
- mei driver updates
- fsi driver updates
- eeprom driver updates
- lots of other small char and misc driver updates and cleanups
All of these have been in linux-next for a while, with no reported
issues except for a merge conflict with your tree due to the mhi driver
changes in the drivers/net/wireless/ath/ath12k/mhi.c file. To fix that
up, just delete the "auto_queue" structure fields being set, see this
message for the full change needed:
https://lore.kernel.org/r/aXD6X23btw8s-RZP@sirena.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaZRxOg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykIrACgs9S+A/GG9X0Kvc+ND/J1XYZpj3QAoKl0yXGj
SV1SR/giEBc7iKV6Dn6O
=jbok
-----END PGP SIGNATURE-----
Merge tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO driver updates from Greg KH:
"Here is the big set of char/misc/iio and other smaller driver
subsystem changes for 7.0-rc1. Lots of little things in here,
including:
- Loads of iio driver changes and updates and additions
- gpib driver updates
- interconnect driver updates
- i3c driver updates
- hwtracing (coresight and intel) driver updates
- deletion of the obsolete mwave driver
- binder driver updates (rust and c versions)
- mhi driver updates (causing a merge conflict, see below)
- mei driver updates
- fsi driver updates
- eeprom driver updates
- lots of other small char and misc driver updates and cleanups
All of these have been in linux-next for a while, with no reported
issues"
* tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (297 commits)
mux: mmio: fix regmap leak on probe failure
rust_binder: return p from rust_binder_transaction_target_node()
drivers: android: binder: Update ARef imports from sync::aref
rust_binder: fix needless borrow in context.rs
iio: magn: mmc5633: Fix Kconfig for combination of I3C as module and driver builtin
iio: sca3000: Fix a resource leak in sca3000_probe()
iio: proximity: rfd77402: Add interrupt handling support
iio: proximity: rfd77402: Document device private data structure
iio: proximity: rfd77402: Use devm-managed mutex initialization
iio: proximity: rfd77402: Use kernel helper for result polling
iio: proximity: rfd77402: Align polling timeout with datasheet
iio: cros_ec: Allow enabling/disabling calibration mode
iio: frequency: ad9523: correct kernel-doc bad line warning
iio: buffer: buffer_impl.h: fix kernel-doc warnings
iio: gyro: itg3200: Fix unchecked return value in read_raw
MAINTAINERS: add entry for ADE9000 driver
iio: accel: sca3000: remove unused last_timestamp field
iio: accel: adxl372: remove unused int2_bitmask field
iio: adc: ad7766: Use iio_trigger_generic_data_rdy_poll()
iio: magnetometer: Remove IRQF_ONESHOT
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmTrLsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplYaEACgWcIcGa9/nWq1x02uN7Zi9vHWpDJqgEhq
JCLpMLdn3ZG6Ksn8RAfI4dKAKZKS7MuXDrpoXgchQ8LQjpssN6kTj2TlKdZR8Je3
NNWfkPnLUp/t3MN/V0vZiX5NQaJVCNblbcnauDzlN+6WkWku5p1wkwYwy3I7NPJ4
P7HHqFJAOwhyBpk/Nr3sQEDnKIn/vOiedyOuO+3HB6rlmnSmjY1cQ+FUSaOI+rNQ
D3i9TMEojHYhMDt76ql2YdKcksBu6HaZQ6JNpIiN9iqNB+96e+X2bcLPyfwkuHwC
N7G1IMfyTsuV7JWktcZP+AT8WK4Qf45fuUN/1EkKEL9MWF2TUMob8toQ0GXRCb22
NqSC1JyeVJ/sSnKzb2Z4wY4+BgRMo83ME3l6hi6QckWXfFyTAQe70JyUnu4w11qn
62astpZXVRSfvbH3vT76BWTa+5HUZExQgLRgor19BTeVY4ihh+muaoMH6An6jf6i
ZnqUSsn7nFB20MEudVqhgiKTvqVic2Atsl6JD4wjwWs5nEP9wzmmCSEGd3Nkrrji
HPWN4zu+1qczDZxmCJAj3w29cRO/vZCNpFARlSCMcXNOQsZaFWVaaQlzt26ZMhTi
AyMav25X8fNCERvGP++uo7cKzDGCuhhIR6y5GlXZ6yTHsGTcSgooW/NNz6Ik2jUW
Bwa5GBK36A==
=TgoD
-----END PGP SIGNATURE-----
Merge tag 'io_uring-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull more io_uring updates from Jens Axboe:
"This is a mix of cleanups and fixes. No major fixes in here, just a
bunch of little fixes. Some of them marked for stable as it fixes
behavioral issues
- Fix an issue with SOCKET_URING_OP_SETSOCKOPT for netlink sockets,
due to a too restrictive check on it having an ioctl handler
- Remove a redundant SQPOLL check in ring creation
- Kill dead accounting for zero-copy send, which doesn't use ->buf
or ->len post the initial setup
- Fix missing clamp of the allocation hint, which could cause
allocations to fall outside of the range the application asked
for. Still within the allowed limits.
- Fix for IORING_OP_PIPE's handling of direct descriptors
- Tweak to the API for the newly added BPF filters, making them
more future proof in terms of how applications deal with them
- A few fixes for zcrx, fixing a few error handling conditions
- Fix for zcrx request flag checking
- Add support for querying the zcrx page size
- Improve the NO_SQARRAY static branch inc/dec, avoiding busy
conditions causing too much traffic
- Various little cleanups"
* tag 'io_uring-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/bpf_filter: pass in expected filter payload size
io_uring/bpf_filter: move filter size and populate helper into struct
io_uring/cancel: de-unionize file and user_data in struct io_cancel_data
io_uring/rsrc: improve regbuf iov validation
io_uring: remove unneeded io_send_zc accounting
io_uring/cmd_net: fix too strict requirement on ioctl
io_uring: delay sqarray static branch disablement
io_uring/query: add query.h copyright notice
io_uring/query: return support for custom rx page size
io_uring/zcrx: check unsupported flags on import
io_uring/zcrx: fix post open error handling
io_uring/zcrx: fix sgtable leak on mapping failures
io_uring: use the right type for creds iteration
io_uring/openclose: fix io_pipe_fixed() slot tracking for specific slots
io_uring/filetable: clamp alloc_hint to the configured alloc range
io_uring/rsrc: replace reg buffer bit field with flags
io_uring/zcrx: improve types for size calculation
io_uring/tctx: avoid modifying loop variable in io_ring_add_registered_file
io_uring: simplify IORING_SETUP_DEFER_TASKRUN && !SQPOLL check
Musl defines its own struct ethhdr and thus defines __UAPI_DEF_ETHHDR to
zero. To avoid struct redefinition errors, user space is therefore
supposed to include netinet/if_ether.h before (or instead of)
linux/if_ether.h. To relieve them from this burden, include the libc
header here if not building for kernel space.
Reported-by: Alyssa Ross <hi@alyssa.is>
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
It's quite possible that opcodes that have payloads attached to them,
like IORING_OP_OPENAT/OPENAT2 or IORING_OP_SOCKET, that these paylods
can change over time. For example, on the openat/openat2 side, the
struct open_how argument is extensible, and could be extended in the
future to allow further arguments to be passed in.
Allow registration of a cBPF filter to give the size of the filter as
seen by userspace. If that filter is for an opcode that takes extra
payload data, allow it if the application payload expectation is the
same size than the kernels. If that is the case, the kernel supports
filtering on the payload that the application expects. If the size
differs, the behavior depends on the IO_URING_BPF_FILTER_SZ_STRICT flag:
1) If IO_URING_BPF_FILTER_SZ_STRICT is set and the size expectation
differs, fail the attempt to load the filter.
2) If IO_URING_BPF_FILTER_SZ_STRICT isn't set, allow the filter if
the userspace pdu size is smaller than what the kernel offers.
3) Regardless if IO_URING_BPF_FILTER_SZ_STRICT, fail loading the filter
if the userspace pdu size is bigger than what the kernel supports.
An attempt to load a filter due to sizing will error with -EMSGSIZE.
For that error, the registration struct will have filter->pdu_size
populated with the pdu size that the kernel uses.
Reported-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Please consider pulling these changes from the signed vfs-7.0-rc1.misc.2 tag.
Thanks!
Christian
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaZMOCwAKCRCRxhvAZXjc
oswrAP9r1zjzMimjX2J0hBoMnYjNzQfLLew8+IRygImQ+yaqWgD9Fiw/cQ9eE1Hm
TMLqck/ky588ywSDaBzfztrXAY3ISgg=
=4yr2
-----END PGP SIGNATURE-----
Merge tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull more misc vfs updates from Christian Brauner:
"Features:
- Optimize close_range() from O(range size) to O(active FDs) by using
find_next_bit() on the open_fds bitmap instead of linearly scanning
the entire requested range. This is a significant improvement for
large-range close operations on sparse file descriptor tables.
- Add FS_XFLAG_VERITY file attribute for fs-verity files, retrievable
via FS_IOC_FSGETXATTR and file_getattr(). The flag is read-only.
Add tracepoints for fs-verity enable and verify operations,
replacing the previously removed debug printk's.
- Prevent nfsd from exporting special kernel filesystems like pidfs
and nsfs. These filesystems have custom ->open() and ->permission()
export methods that are designed for open_by_handle_at(2) only and
are incompatible with nfsd. Update the exportfs documentation
accordingly.
Fixes:
- Fix KMSAN uninit-value in ovl_fill_real() where strcmp() was used
on a non-null-terminated decrypted directory entry name from
fscrypt. This triggered on encrypted lower layers when the
decrypted name buffer contained uninitialized tail data.
The fix also adds VFS-level name_is_dot(), name_is_dotdot(), and
name_is_dot_dotdot() helpers, replacing various open-coded "." and
".." checks across the tree.
- Fix read-only fsflags not being reset together with xflags in
vfs_fileattr_set(). Currently harmless since no read-only xflags
overlap with flags, but this would cause inconsistencies for any
future shared read-only flag
- Return -EREMOTE instead of -ESRCH from PIDFD_GET_INFO when the
target process is in a different pid namespace. This lets userspace
distinguish "process exited" from "process in another namespace",
matching glibc's pidfd_getpid() behavior
Cleanups:
- Use C-string literals in the Rust seq_file bindings, replacing the
kernel::c_str!() macro (available since Rust 1.77)
- Fix typo in d_walk_ret enum comment, add porting notes for the
readlink_copy() calling convention change"
* tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: add porting notes about readlink_copy()
pidfs: return -EREMOTE when PIDFD_GET_INFO is called on another ns
nfsd: do not allow exporting of special kernel filesystems
exportfs: clarify the documentation of open()/permission() expotrfs ops
fsverity: add tracepoints
fs: add FS_XFLAG_VERITY for fs-verity files
rust: seq_file: replace `kernel::c_str!` with C-Strings
fs: dcache: fix typo in enum d_walk_ret comment
ovl: use name_is_dot* helpers in readdir code
fs: add helpers name_is_dot{,dot,_dotdot}
ovl: Fix uninit-value in ovl_fill_real
fs: reset read-only fsflags together with xflags
fs/file: optimize close_range() complexity from O(N) to O(Sparse)
Add an ability to query if the zcrx rx page size setting is available.
Note, even when the API is supported by io_uring, the registration can
still get rejected for various reasons, e.g. when the NIC or the driver
doesn't support it, when the particular specified size is unsupported,
when the memory area doesn't satisfy all requirements, etc.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The short comments had the correct order, but the long comments
had the planes reversed.
Fixes: 2271e0a20e ("drm: drm_fourcc: add 10/12/16bit software decoder YCbCr formats")
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Robert Mader <robert.mader@collabora.com>
Link: https://patch.msgid.link/20260208224718.57199-1-contact@emersion.fr
- in order support in virtio core
- multiple address space support in vduse
- fixes, cleanups all over the place, notably
- dma alignment fixes for non cache coherent systems
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmmO9rYPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpBzYH/2wUPo3T8/CKGFjF7QSPzgL/UI2NhnP8iSm4
btg1zVnrWmJK6vVIwnf5UsG8dFKsMcp/BEGCewTmIddNM2wEeSul0kKDXtIzrK/U
jdA9bJrUKLMeU7IFKne1Fip/yE+5nkWJttWXXyVRJtOJrYxZlkWfqSns3qYcPvsG
g7HXvF6tmici5uoKdRCLqHtQCWsvpnvTD5A7qoZAlEUjlQCDKKmuukpN9oK5UYLl
9uUOgPQAJaxIwx1C4uP7L+AwbLUcN/+MtrvQRNz+sFpP3sN9oXeDJKBpNQp109NB
JGk1sUsINL+54Cmdd5RwZ9T1vBJyRDrdWRDy1yHj95LildaPfh0=
=pnob
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- in-order support in virtio core
- multiple address space support in vduse
- fixes, cleanups all over the place, notably dma alignment fixes for
non-cache-coherent systems
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (59 commits)
vduse: avoid adding implicit padding
vhost: fix caching attributes of MMIO regions by setting them explicitly
vdpa/mlx5: update MAC address handling in mlx5_vdpa_set_attr()
vdpa/mlx5: reuse common function for MAC address updates
vdpa/mlx5: update mlx_features with driver state check
crypto: virtio: Replace package id with numa node id
crypto: virtio: Remove duplicated virtqueue_kick in virtio_crypto_skcipher_crypt_req
crypto: virtio: Add spinlock protection with virtqueue notification
Documentation: Add documentation for VDUSE Address Space IDs
vduse: bump version number
vduse: add vq group asid support
vduse: merge tree search logic of IOTLB_GET_FD and IOTLB_GET_INFO ioctls
vduse: take out allocations from vduse_dev_alloc_coherent
vduse: remove unused vaddr parameter of vduse_domain_free_coherent
vduse: refactor vdpa_dev_add for goto err handling
vhost: forbid change vq groups ASID if DRIVER_OK is set
vdpa: document set_group_asid thread safety
vduse: return internal vq group struct as map token
vduse: add vq group support
vduse: add v1 API definition
...
- Add more CPUCFG mask bits.
- Improve feature detection.
- Add lazy load support for FPU and binary translation (LBT) register state.
- Fix return value for memory reads from and writes to in-kernel devices.
- Add support for detecting preemption from within a guest.
- Add KVM steal time test case to tools/selftests.
ARM:
- Add support for FEAT_IDST, allowing ID registers that are not
implemented to be reported as a normal trap rather than as an UNDEF
exception.
- Add sanitisation of the VTCR_EL2 register, fixing a number of
UXN/PXN/XN bugs in the process.
- Full handling of RESx bits, instead of only RES0, and resulting in
SCTLR_EL2 being added to the list of sanitised registers.
- More pKVM fixes for features that are not supposed to be exposed to
guests.
- Make sure that MTE being disabled on the pKVM host doesn't give it
the ability to attack the hypervisor.
- Allow pKVM's host stage-2 mappings to use the Force Write Back
version of the memory attributes by using the "pass-through'
encoding.
- Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the
guest.
- Preliminary work for guest GICv5 support.
- A bunch of debugfs fixes, removing pointless custom iterators stored
in guest data structures.
- A small set of FPSIMD cleanups.
- Selftest fixes addressing the incorrect alignment of page
allocation.
- Other assorted low-impact fixes and spelling fixes.
RISC-V:
- Fixes for issues discoverd by KVM API fuzzing in
kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(),
and kvm_riscv_vcpu_aia_imsic_update()
- Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM
- Transparent huge page support for hypervisor page tables
- Adjust the number of available guest irq files based on MMIO
register sizes found in the device tree or the ACPI tables
- Add RISC-V specific paging modes to KVM selftests
- Detect paging mode at runtime for selftests
s390:
- Performance improvement for vSIE (aka nested virtualization)
- Completely new memory management. s390 was a special snowflake that enlisted
help from the architecture's page table management to build hypervisor
page tables, in particular enabling sharing the last level of page
tables. This however was a lot of code (~3K lines) in order to support
KVM, and also blocked several features. The biggest advantages is
that the page size of userspace is completely independent of the
page size used by the guest: userspace can mix normal pages, THPs and
hugetlbfs as it sees fit, and in fact transparent hugepages were not
possible before. It's also now possible to have nested guests and
guests with huge pages running on the same host.
- Maintainership change for s390 vfio-pci
- Small quality of life improvement for protected guests
x86:
- Add support for giving the guest full ownership of PMU hardware (contexted
switched around the fastpath run loop) and allowing direct access to data
MSRs and PMCs (restricted by the vPMU model). KVM still intercepts
access to control registers, e.g. to enforce event filtering and to
prevent the guest from profiling sensitive host state. This is more
accurate, since it has no risk of contention and thus dropped events, and
also has significantly less overhead.
For more information, see the commit message for merge commit bf2c3138ae
("Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD").
- Disallow changing the virtual CPU model if L2 is active, for all the same
reasons KVM disallows change the model after the first KVM_RUN.
- Fix a bug where KVM would incorrectly reject host accesses to PV MSRs
when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those
were advertised as supported to userspace,
- Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM
would attempt to read CR3 configuring an async #PF entry.
- Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86
only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports
that are intended for external usage, and those are allowed explicitly.
- When checking nested events after a vCPU is unblocked, ignore -EBUSY instead
of WARNing. Userspace can sometimes put the vCPU into what should be an
impossible state, and spurious exit to userspace on -EBUSY does not really
do anything to solve the issue.
- Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU
is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller
stuffing architecturally impossible states into KVM.
- Add support for new Intel instructions that don't require anything beyond
enumerating feature flags to userspace.
- Grab SRCU when reading PDPTRs in KVM_GET_SREGS2.
- Add WARNs to guard against modifying KVM's CPU caps outside of the intended
setup flow, as nested VMX in particular is sensitive to unexpected changes
in KVM's golden configuration.
- Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts
when the suppression feature is enabled by the guest (currently limited to
split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor
Suppress EOI Broadcasts isn't an option as some userspaces have come to rely
on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective
of whether or not userspace I/O APIC supports Directed EOIs).
- Clean up KVM's handling of marking mapped vCPU pages dirty.
- Drop a pile of *ancient* sanity checks hidden behind in KVM's unused
ASSERT() macro, most of which could be trivially triggered by the guest
and/or user, and all of which were useless.
- Fold "struct dest_map" into its sole user, "struct rtc_status", to make it
more obvious what the weird parameter is used for, and to allow fropping
these RTC shenanigans if CONFIG_KVM_IOAPIC=n.
- Bury all of ioapic.h, i8254.h and related ioctls (including
KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y.
- Add a regression test for recent APICv update fixes.
- Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv()
to consolidate the updates, and to co-locate SVI updates with the updates
for KVM's own cache of ISR information.
- Drop a dead function declaration.
- Minor cleanups.
x86 (Intel):
- Rework KVM's handling of VMCS updates while L2 is active to temporarily
switch to vmcs01 instead of deferring the update until the next nested
VM-Exit. The deferred updates approach directly contributed to several
bugs, was proving to be a maintenance burden due to the difficulty in
auditing the correctness of deferred updates, and was polluting
"struct nested_vmx" with a growing pile of booleans.
- Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults,
and instead always reflect them into the guest. Since KVM doesn't shadow
EPCM entries, EPCM violations cannot be due to KVM interference and
can't be resolved by KVM.
- Fix a bug where KVM would register its posted interrupt wakeup handler even
if loading kvm-intel.ko ultimately failed.
- Disallow access to vmcb12 fields that aren't fully supported, mostly to
avoid weirdness and complexity for FRED and other features, where KVM wants
enable VMCS shadowing for fields that conditionally exist.
- Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or
refuses to online a CPU) due to a VMCS config mismatch.
x86 (AMD):
- Drop a user-triggerable WARN on nested_svm_load_cr3() failure.
- Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS
relies on an upcoming, publicly announced change in the APM to reduce the
set of conditions where hardware (i.e. KVM) *must* flush the RAP.
- Ignore nSVM intercepts for instructions that are not supported according to
L1's virtual CPU model.
- Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath
for EPT Misconfig.
- Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM,
and allow userspace to restore nested state with GIF=0.
- Treat exit_code as an unsigned 64-bit value through all of KVM.
- Add support for fetching SNP certificates from userspace.
- Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD
or VMSAVE on behalf of L2.
- Misc fixes and cleanups.
x86 selftests:
- Add a regression test for TPR<=>CR8 synchronization and IRQ masking.
- Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support,
and extend x86's infrastructure to support EPT and NPT (for L2 guests).
- Extend several nested VMX tests to also cover nested SVM.
- Add a selftest for nested VMLOAD/VMSAVE.
- Rework the nested dirty log test, originally added as a regression test for
PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage
and to hopefully make the test easier to understand and maintain.
guest_memfd:
- Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage
handling. SEV/SNP was the only user of the tracking and it can do it via
the RMP.
- Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE
and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to
avoid non-trivial complexity for something that no known VMM seems to be
doing and to avoid an API special case for in-place conversion, which
simply can't support unaligned sources.
- When populating guest_memfd memory, GUP the source page in common code and
pass the refcounted page to the vendor callback, instead of letting vendor
code do the heavy lifting. Doing so avoids a looming deadlock bug with
in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap
invalidate lock.
Generic:
- Fix a bug where KVM would ignore the vCPU's selected address space when
creating a vCPU-specific mapping of guest memory. Actually this bug
could not be hit even on x86, the only architecture with multiple
address spaces, but it's a bug nevertheless.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmmNqwwUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPaZAf/cJx5B67lnST272esz0j29MIuT/Ti
jnf6PI9b7XubKYOtNvlu5ZW4Jsa5dqRG0qeO/JmcXDlwBf5/UkWOyvqIXyiuTl0l
KcSUlKPtTgKZSoZpJpTppuuDE8FSYqEdcCmjNvoYzcJoPjmaeJbK6aqO0AkBbb6e
L5InrLV7nV9iua6rFvA0s/G8/Eq2DG8M9hTRHe6NcI/z4hvslOudvpUXtC8Jygoo
cV8vFavUwc+atrmvhAOLvSitnrjfNa4zcG6XMOlwXPfIdvi3zqTlQTgUpwGKiAGQ
RIDUVZ/9bcWgJqbPRsdEWwaYRkNQWc5nmrAHRpEEaYV/NeBBNf4v6qfKSw==
=SkJ1
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"Loongarch:
- Add more CPUCFG mask bits
- Improve feature detection
- Add lazy load support for FPU and binary translation (LBT) register
state
- Fix return value for memory reads from and writes to in-kernel
devices
- Add support for detecting preemption from within a guest
- Add KVM steal time test case to tools/selftests
ARM:
- Add support for FEAT_IDST, allowing ID registers that are not
implemented to be reported as a normal trap rather than as an UNDEF
exception
- Add sanitisation of the VTCR_EL2 register, fixing a number of
UXN/PXN/XN bugs in the process
- Full handling of RESx bits, instead of only RES0, and resulting in
SCTLR_EL2 being added to the list of sanitised registers
- More pKVM fixes for features that are not supposed to be exposed to
guests
- Make sure that MTE being disabled on the pKVM host doesn't give it
the ability to attack the hypervisor
- Allow pKVM's host stage-2 mappings to use the Force Write Back
version of the memory attributes by using the "pass-through'
encoding
- Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the
guest
- Preliminary work for guest GICv5 support
- A bunch of debugfs fixes, removing pointless custom iterators
stored in guest data structures
- A small set of FPSIMD cleanups
- Selftest fixes addressing the incorrect alignment of page
allocation
- Other assorted low-impact fixes and spelling fixes
RISC-V:
- Fixes for issues discoverd by KVM API fuzzing in
kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and
kvm_riscv_vcpu_aia_imsic_update()
- Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM
- Transparent huge page support for hypervisor page tables
- Adjust the number of available guest irq files based on MMIO
register sizes found in the device tree or the ACPI tables
- Add RISC-V specific paging modes to KVM selftests
- Detect paging mode at runtime for selftests
s390:
- Performance improvement for vSIE (aka nested virtualization)
- Completely new memory management. s390 was a special snowflake that
enlisted help from the architecture's page table management to
build hypervisor page tables, in particular enabling sharing the
last level of page tables. This however was a lot of code (~3K
lines) in order to support KVM, and also blocked several features.
The biggest advantages is that the page size of userspace is
completely independent of the page size used by the guest:
userspace can mix normal pages, THPs and hugetlbfs as it sees fit,
and in fact transparent hugepages were not possible before. It's
also now possible to have nested guests and guests with huge pages
running on the same host
- Maintainership change for s390 vfio-pci
- Small quality of life improvement for protected guests
x86:
- Add support for giving the guest full ownership of PMU hardware
(contexted switched around the fastpath run loop) and allowing
direct access to data MSRs and PMCs (restricted by the vPMU model).
KVM still intercepts access to control registers, e.g. to enforce
event filtering and to prevent the guest from profiling sensitive
host state. This is more accurate, since it has no risk of
contention and thus dropped events, and also has significantly less
overhead.
For more information, see the commit message for merge commit
bf2c3138ae ("Merge tag 'kvm-x86-pmu-6.20' ...")
- Disallow changing the virtual CPU model if L2 is active, for all
the same reasons KVM disallows change the model after the first
KVM_RUN
- Fix a bug where KVM would incorrectly reject host accesses to PV
MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled,
even if those were advertised as supported to userspace,
- Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs,
where KVM would attempt to read CR3 configuring an async #PF entry
- Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM
(for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL.
Only a few exports that are intended for external usage, and those
are allowed explicitly
- When checking nested events after a vCPU is unblocked, ignore
-EBUSY instead of WARNing. Userspace can sometimes put the vCPU
into what should be an impossible state, and spurious exit to
userspace on -EBUSY does not really do anything to solve the issue
- Also throw in the towel and drop the WARN on INIT/SIPI being
blocked when vCPU is in Wait-For-SIPI, which also resulted in
playing whack-a-mole with syzkaller stuffing architecturally
impossible states into KVM
- Add support for new Intel instructions that don't require anything
beyond enumerating feature flags to userspace
- Grab SRCU when reading PDPTRs in KVM_GET_SREGS2
- Add WARNs to guard against modifying KVM's CPU caps outside of the
intended setup flow, as nested VMX in particular is sensitive to
unexpected changes in KVM's golden configuration
- Add a quirk to allow userspace to opt-in to actually suppress EOI
broadcasts when the suppression feature is enabled by the guest
(currently limited to split IRQCHIP, i.e. userspace I/O APIC).
Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an
option as some userspaces have come to rely on KVM's buggy behavior
(KVM advertises Supress EOI Broadcast irrespective of whether or
not userspace I/O APIC supports Directed EOIs)
- Clean up KVM's handling of marking mapped vCPU pages dirty
- Drop a pile of *ancient* sanity checks hidden behind in KVM's
unused ASSERT() macro, most of which could be trivially triggered
by the guest and/or user, and all of which were useless
- Fold "struct dest_map" into its sole user, "struct rtc_status", to
make it more obvious what the weird parameter is used for, and to
allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n
- Bury all of ioapic.h, i8254.h and related ioctls (including
KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y
- Add a regression test for recent APICv update fixes
- Handle "hardware APIC ISR", a.k.a. SVI, updates in
kvm_apic_update_apicv() to consolidate the updates, and to
co-locate SVI updates with the updates for KVM's own cache of ISR
information
- Drop a dead function declaration
- Minor cleanups
x86 (Intel):
- Rework KVM's handling of VMCS updates while L2 is active to
temporarily switch to vmcs01 instead of deferring the update until
the next nested VM-Exit.
The deferred updates approach directly contributed to several bugs,
was proving to be a maintenance burden due to the difficulty in
auditing the correctness of deferred updates, and was polluting
"struct nested_vmx" with a growing pile of booleans
- Fix an SGX bug where KVM would incorrectly try to handle EPCM page
faults, and instead always reflect them into the guest. Since KVM
doesn't shadow EPCM entries, EPCM violations cannot be due to KVM
interference and can't be resolved by KVM
- Fix a bug where KVM would register its posted interrupt wakeup
handler even if loading kvm-intel.ko ultimately failed
- Disallow access to vmcb12 fields that aren't fully supported,
mostly to avoid weirdness and complexity for FRED and other
features, where KVM wants enable VMCS shadowing for fields that
conditionally exist
- Print out the "bad" offsets and values if kvm-intel.ko refuses to
load (or refuses to online a CPU) due to a VMCS config mismatch
x86 (AMD):
- Drop a user-triggerable WARN on nested_svm_load_cr3() failure
- Add support for virtualizing ERAPS. Note, correct virtualization of
ERAPS relies on an upcoming, publicly announced change in the APM
to reduce the set of conditions where hardware (i.e. KVM) *must*
flush the RAP
- Ignore nSVM intercepts for instructions that are not supported
according to L1's virtual CPU model
- Add support for expedited writes to the fast MMIO bus, a la VMX's
fastpath for EPT Misconfig
- Don't set GIF when clearing EFER.SVME, as GIF exists independently
of SVM, and allow userspace to restore nested state with GIF=0
- Treat exit_code as an unsigned 64-bit value through all of KVM
- Add support for fetching SNP certificates from userspace
- Fix a bug where KVM would use vmcb02 instead of vmcb01 when
emulating VMLOAD or VMSAVE on behalf of L2
- Misc fixes and cleanups
x86 selftests:
- Add a regression test for TPR<=>CR8 synchronization and IRQ masking
- Overhaul selftest's MMU infrastructure to genericize stage-2 MMU
support, and extend x86's infrastructure to support EPT and NPT
(for L2 guests)
- Extend several nested VMX tests to also cover nested SVM
- Add a selftest for nested VMLOAD/VMSAVE
- Rework the nested dirty log test, originally added as a regression
test for PML where KVM logged L2 GPAs instead of L1 GPAs, to
improve test coverage and to hopefully make the test easier to
understand and maintain
guest_memfd:
- Remove kvm_gmem_populate()'s preparation tracking and half-baked
hugepage handling. SEV/SNP was the only user of the tracking and it
can do it via the RMP
- Retroactively document and enforce (for SNP) that
KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the
source page to be 4KiB aligned, to avoid non-trivial complexity for
something that no known VMM seems to be doing and to avoid an API
special case for in-place conversion, which simply can't support
unaligned sources
- When populating guest_memfd memory, GUP the source page in common
code and pass the refcounted page to the vendor callback, instead
of letting vendor code do the heavy lifting. Doing so avoids a
looming deadlock bug with in-place due an AB-BA conflict betwee
mmap_lock and guest_memfd's filemap invalidate lock
Generic:
- Fix a bug where KVM would ignore the vCPU's selected address space
when creating a vCPU-specific mapping of guest memory. Actually
this bug could not be hit even on x86, the only architecture with
multiple address spaces, but it's a bug nevertheless"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (267 commits)
KVM: s390: Increase permitted SE header size to 1 MiB
MAINTAINERS: Replace backup for s390 vfio-pci
KVM: s390: vsie: Fix race in acquire_gmap_shadow()
KVM: s390: vsie: Fix race in walk_guest_tables()
KVM: s390: Use guest address to mark guest page dirty
irqchip/riscv-imsic: Adjust the number of available guest irq files
RISC-V: KVM: Transparent huge page support
RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test
RISC-V: KVM: Allow Zalasr extensions for Guest/VM
KVM: riscv: selftests: Add riscv vm satp modes
KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test
riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM
RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized
RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr()
RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr()
RISC-V: KVM: Remove unnecessary 'ret' assignment
KVM: s390: Add explicit padding to struct kvm_s390_keyop
KVM: LoongArch: selftests: Add steal time test case
LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side
LoongArch: KVM: Add paravirt preempt feature in hypervisor side
...
- Add support for control flow integrity for userspace processes.
This is based on the standard RISC-V ISA extensions Zicfiss and
Zicfilp
- Improve ptrace behavior regarding vector registers, and add some selftests
- Optimize our strlen() assembly
- Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI
volume mounting
- Clean up some code slightly, including defining copy_user_page() as
copy_page() rather than memcpy(), aligning us with other
architectures; and using max3() to slightly simplify an expression
in riscv_iommu_init_check()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmmOYpYACgkQx4+xDQu9
KkvzOQ/9Fq8ZxWgYofhTPtw9/vps3avheOHlEoRrBWYfn1VkTRPAcbUULL4PGXwg
dnVFEl3AcrpOFikIthbukklLeLoOnUshZJBU25zY5h0My1jb63V1//gEwJR6I0dg
+V+GJmfzc4+YVaHK6UFdn7j3GgKUbTC7xXRMuGEriAzKPnm3AXAjh94wMNx6depv
Li3IXRoZT/HvqIAyfeAoM9STwOzJtE3Sc6fXABkzsIbNTjjdgIqoRSsQsKY10178
z6ox/sVStnLmVaMbOd/ZVN0J70JRDsvK0TC0/13K1ESUbnVia9a3bPIxLRmSapKC
wXnwAuSeevtFshGGyd5LZO0QQGxzG1H63Gky2GRoh8bTQbd2tQcfQzANdnPkBAQS
j2aOiSsiUQeNZqfZAfEBwRd27GXRYlKb/MpgCZKUH+ZO9VG6QaD3VGvg17/Caghy
nVdbBQ81ZV9tkz9EMN0vt2VJHmEqARh88w619laHjg+ioPTG4/UIDPzskt1I+Fgm
Y6NQLeFyfaO3RKKDYWGPcY7fmWQI9V8MECHOvyVI4xJcgqAbqnfsgytjuiFbrfRo
fTvpuB7kvltBZ180QSB79xj0sWGFTWR02MeWy3uOaLZz2eIm2ZTZbMUSgNYR0ldG
L3y7CEkTkoVF1ijYgAfuMgptk3Yf0dpa66D9HUo947wWkNrW5ds=
=4fTk
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
- Add support for control flow integrity for userspace processes.
This is based on the standard RISC-V ISA extensions Zicfiss and
Zicfilp
- Improve ptrace behavior regarding vector registers, and add some
selftests
- Optimize our strlen() assembly
- Enable the ISO-8859-1 code page as built-in, similar to ARM64, for
EFI volume mounting
- Clean up some code slightly, including defining copy_user_page() as
copy_page() rather than memcpy(), aligning us with other
architectures; and using max3() to slightly simplify an expression
in riscv_iommu_init_check()
* tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
riscv: lib: optimize strlen loop efficiency
selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function
selftests: riscv: verify ptrace accepts valid vector csr values
selftests: riscv: verify ptrace rejects invalid vector csr inputs
selftests: riscv: verify syscalls discard vector context
selftests: riscv: verify initial vector state with ptrace
selftests: riscv: test ptrace vector interface
riscv: ptrace: validate input vector csr registers
riscv: csr: define vtype register elements
riscv: vector: init vector context with proper vlenb
riscv: ptrace: return ENODATA for inactive vector extension
kselftest/riscv: add kselftest for user mode CFI
riscv: add documentation for shadow stack
riscv: add documentation for landing pad / indirect branch tracking
riscv: create a Kconfig fragment for shadow stack and landing pad support
arch/riscv: add dual vdso creation logic and select vdso based on hw
arch/riscv: compile vdso with landing pad and shadow stack note
riscv: enable kernel access to shadow stack memory via the FWFT SBI call
riscv: add kernel command line option to opt out of user CFI
riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe
...
Usual smallish cycle:
- Various code improvements in irdma, rtrs, qedr, ocrdma, irdma, rxe
- Small driver improvements and minor bug fixes to hns, mlx5, rxe, mana,
mlx5, irdma
- Robusness improvements in completion processing for EFA
- New query_port_speed() verb to move past limited IBA defined speed steps
- Support for SG_GAPS in rts and many other small improvements
- Rare list corruption fix in iwcm
- Better support different page sizes in rxe
- Device memory support for mana
- Direct bio vec to kernel MR for use by NFS-RDMA
- QP rate limiting for bnxt_re
- Remote triggerable NULL pointer crash in siw
- DMA-buf exporter support for RDMA mmaps like doorbells
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaY44vgAKCRCFwuHvBreF
YfiZAP91cMZfogN7r1FMD75xDZu55dI3Jvy8OaixyRxlWLGPcQEAjritdL0o7fZp
YrD1OXNS/1XG//rPBVw7xj+54Aa8hAU=
=AVcu
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Usual smallish cycle. The NFS biovec work to push it down into RDMA
instead of indirecting through a scatterlist is pretty nice to see,
been talked about for a long time now.
- Various code improvements in irdma, rtrs, qedr, ocrdma, irdma, rxe
- Small driver improvements and minor bug fixes to hns, mlx5, rxe,
mana, mlx5, irdma
- Robusness improvements in completion processing for EFA
- New query_port_speed() verb to move past limited IBA defined speed
steps
- Support for SG_GAPS in rts and many other small improvements
- Rare list corruption fix in iwcm
- Better support different page sizes in rxe
- Device memory support for mana
- Direct bio vec to kernel MR for use by NFS-RDMA
- QP rate limiting for bnxt_re
- Remote triggerable NULL pointer crash in siw
- DMA-buf exporter support for RDMA mmaps like doorbells"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits)
RDMA/mlx5: Implement DMABUF export ops
RDMA/uverbs: Add DMABUF object type and operations
RDMA/uverbs: Support external FD uobjects
RDMA/siw: Fix potential NULL pointer dereference in header processing
RDMA/umad: Reject negative data_len in ib_umad_write
IB/core: Extend rate limit support for RC QPs
RDMA/mlx5: Support rate limit only for Raw Packet QP
RDMA/bnxt_re: Report QP rate limit in debugfs
RDMA/bnxt_re: Report packet pacing capabilities when querying device
RDMA/bnxt_re: Add support for QP rate limiting
MAINTAINERS: Drop RDMA files from Hyper-V section
RDMA/uverbs: Add __GFP_NOWARN to ib_uverbs_unmarshall_recv() kmalloc
svcrdma: use bvec-based RDMA read/write API
RDMA/core: add rdma_rw_max_sge() helper for SQ sizing
RDMA/core: add MR support for bvec-based RDMA operations
RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations
RDMA/core: add bio_vec based RDMA read/write API
RDMA/irdma: Use kvzalloc for paged memory DMA address array
RDMA/rxe: Fix race condition in QP timer handlers
RDMA/mana_ib: Add device‑memory support
...
- A set of commits that introduces cxl_memdev_attach and pave way for
soft reserved handling, type2 accelerator enabling, and LSA 2.0
enabling. All these series require the endpoint driver to settle
before continuing the memdev driver probe.
dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
cxl/mem: Drop @host argument to devm_cxl_add_memdev()
cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
cxl/port: Arrange for always synchronous endpoint attach
cxl/mem: Arrange for always-synchronous memdev attach
cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
- A set to address CXL port error protocol handling and reporting. The
large patch series was split into 3 parts. Part 1 and 2 are included
here with part 3 coming later. Part 1 consists of a series of code
refactoring to PCI AER sub-system that addresses CXL and also CXL
RAS code to prepare for port error handling. Part 2 refactors the
CXL code to move management of component registers to cxl_port
objects to allow all CXL AER errors to be handled through the
cxl_port hierarchy.
Part 2:
cxl/port: Move endpoint component register management to cxl_port
cxl/port: Map Port RAS registers
cxl/port: Move dport RAS setup to dport add time
cxl/port: Move dport probe operations to a driver event
cxl/port: Move decoder setup before dport creation
cxl/port: Cleanup dport removal with a devres group
cxl/port: Reduce number of @dport variables in cxl_port_add_dport()
cxl/port: Cleanup handling of the nr_dports 0 -> 1 transition
Part 1:
cxl: Update RAS handler interfaces to also support CXL Ports
cxl/mem: Clarify @host for devm_cxl_add_nvdimm()
PCI/AER: Update struct aer_err_info with kernel-doc formatting
PCI/AER: Report CXL or PCIe bus type in AER trace logging
PCI/AER: Use guard() in cxl_rch_handle_error_iter()
PCI/AER: Move CXL RCH error handling to aer_cxl_rch.c
PCI/AER: Update is_internal_error() to be non-static is_aer_internal_error()
PCI/AER: Export pci_aer_unmask_internal_errors()
cxl/pci: Move CXL driver's RCH error handling into core/ras_rch.c
PCI/AER: Replace PCIEAER_CXL symbol with CXL_RAS
cxl/pci: Remove CXL VH handling in CONFIG_PCIEAER_CXL conditional blocks from core/pci.c
PCI: Replace cxl_error_is_native() with pcie_aer_is_native()
cxl/pci: Remove unnecessary CXL RCH handling helper functions
cxl/pci: Remove unnecessary CXL Endpoint handling helper functions
PCI: Introduce pcie_is_cxl()
PCI: Update CXL DVSEC definitions
PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h
- A set of patches to provide AMD Zen5 platform address translation for
CXL using ACPI PRMT. Set includes a conventions document to explain
why this is needed and how it's implemented.
cxl: Disable HPA/SPA translation handlers for Normalized Addressing
cxl/region: Factor out code into cxl_region_setup_poison()
cxl/atl: Lock decoders that need address translation
cxl: Enable AMD Zen5 address translation using ACPI PRMT
cxl/acpi: Prepare use of EFI runtime services
cxl: Introduce callback for HPA address ranges translation
cxl/region: Use region data to get the root decoder
cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
cxl/region: Separate region parameter setup and region construction
cxl: Simplify cxl_root_ops allocation and handling
cxl/region: Store HPA range in struct cxl_region
cxl/region: Store root decoder in struct cxl_region
cxl/region: Rename misleading variable name @hpa to @hpa_range
Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
cxl, doc: Moving conventions in separate files
cxl, doc: Remove isonum.txt inclusion
- A set of misc CXL patches of fixes, cleanups, and updates. Including
CXL address translation for unaligned MOD3 regions.
cxl: Fix premature commit_end increment on decoder commit failure
cxl/region: Use do_div() for 64-bit modulo operation
cxl/region: Translate HPA to DPA and memdev in unaligned regions
cxl/region: Translate DPA->HPA in unaligned MOD3 regions
cxl/core: Fix cxl_dport debugfs EINJ entries
cxl/acpi: Remove cxl_acpi_set_cache_size()
cxl/hdm: Fix newline character in dev_err() messages
cxl/pci: Remove outdated FIXME comment and BUILD_BUG_ON
Documentation/driver-api/cxl: device hotplug section
Documentation/driver-api/cxl: BIOS/EFI expectation update
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmmOFXcACgkQYGjFFmlT
OEojaxAApQJFLyX1MkPbhtm6j6GRzzEAEWTBX2XsmliZf1JhfahsNMWI69kO33rm
LddF+nyZNEl/foyHgUaxVzlQwqWuihyp7Qk2djXnMzLsuCAsWhPbB9j0RgJUN8h5
N4U76AmOdmhLlXH4CCqoW2jNy0OjxNdgp1FtTHv7VO7RxgRE9MFJRkLulKxB03wy
t6lRZXPofEFcHen40DlYRtW26vy1BYUO0dng2f16DxWrb1ztdACH/zVqCJJtdoFc
FAT5EaQCeRYZ9Yz4dONw3DcUjYlG6NcRN9FWNiptBn1Pb7pUX55Le8lfD3qZg0an
m3lWRs1T/lGz7pWmz4GPUKDwGFCEqLqd4oSz5v+dFR3JJxjJpRzKa19y5TfqK/LF
diqNZsDD9gCXE1HXzNr1YcbllpU2cPRPf58gWG9bLmG5xUUmScib8LoTMfgcCJW5
SlC6kf7BFLkJfDTcFaILc/UANeZaLGhrV0vyJntfGyT5EqKOcfjQEvrZvofA8mef
bdxt0IRDW4D+7kkcuR33OipTVUFG3ban8yYq4zXD64dmeHF76gwdJm3nyXsqdtpc
IYIIhz0W6pbTKjJ2fy1rZcTac1ZaALstyaF4bYWIjyF3NylPM8tDi48DFr+DGgeX
xkFs2B9p5vY5Cq73gCmSWsi3PBPTjWzeRp7YZrV6VoBd9uqewUs=
=blFQ
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
- Introduce cxl_memdev_attach and pave way for soft reserved handling,
type2 accelerator enabling, and LSA 2.0 enabling. All these series
require the endpoint driver to settle before continuing the memdev
driver probe.
- Address CXL port error protocol handling and reporting.
The large patch series was split into three parts. The first two
parts are included here with the final part coming later.
The first part consists of a series of code refactoring to PCI AER
sub-system that addresses CXL and also CXL RAS code to prepare for
port error handling.
The second part refactors the CXL code to move management of
component registers to cxl_port objects to allow all CXL AER errors
to be handled through the cxl_port hierarchy.
- Provide AMD Zen5 platform address translation for CXL using ACPI
PRMT. This includes a conventions document to explain why this is
needed and how it's implemented.
- Misc CXL patches of fixes, cleanups, and updates. Including CXL
address translation for unaligned MOD3 regions.
[ TLA service: CXL is "Compute Express Link" ]
* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
cxl: Disable HPA/SPA translation handlers for Normalized Addressing
cxl/region: Factor out code into cxl_region_setup_poison()
cxl/atl: Lock decoders that need address translation
cxl: Enable AMD Zen5 address translation using ACPI PRMT
cxl/acpi: Prepare use of EFI runtime services
cxl: Introduce callback for HPA address ranges translation
cxl/region: Use region data to get the root decoder
cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
cxl/region: Separate region parameter setup and region construction
cxl: Simplify cxl_root_ops allocation and handling
cxl/region: Store HPA range in struct cxl_region
cxl/region: Store root decoder in struct cxl_region
cxl/region: Rename misleading variable name @hpa to @hpa_range
Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
cxl, doc: Moving conventions in separate files
cxl, doc: Remove isonum.txt inclusion
cxl/port: Unify endpoint and switch port lookup
cxl/port: Move endpoint component register management to cxl_port
cxl/port: Map Port RAS registers
cxl/port: Move dport RAS setup to dport add time
...
Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted
cleanups and fixes. The biggest core change is the massive code
motion in the sd driver to remove forward declarations and the most
significant change is to enumify the queuecommand return.
Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
-----BEGIN PGP SIGNATURE-----
iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaY4ljBsUgAAAAAAEAA5t
YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy
c2hpcC5jb20ACgkQ50LJTO6YrIWFlwEAr9nc1ntxH4UNPgFCVjKyAOa5IE+p5o5C
2lwQIufcihEBAORvI9KO6AoEK6v9TmMKZXoyVsDRFe79fQE5NwCjfAA3
=My8f
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted
cleanups and fixes.
The biggest core change is the massive code motion in the sd driver to
remove forward declarations and the most significant change is to
enumify the queuecommand return"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (78 commits)
scsi: csiostor: Fix dereference of null pointer rn
scsi: buslogic: Reduce stack usage
scsi: ufs: host: mediatek: Require CONFIG_PM
scsi: ufs: mediatek: Fix page faults in ufs_mtk_clk_scale() trace event
scsi: smartpqi: Fix memory leak in pqi_report_phys_luns()
scsi: mpi3mr: Make driver probing asynchronous
scsi: ufs: core: Flush exception handling work when RPM level is zero
scsi: efct: Use IRQF_ONESHOT and default primary handler
scsi: ufs: core: Use a host-wide tagset in SDB mode
scsi: qla2xxx: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qla2xxx: Add WQ_PERCPU to alloc_workqueue() users
scsi: qla4xxx: Add WQ_PERCPU to alloc_workqueue() users
scsi: mpi3mr: Driver version update to 8.17.0.3.50
scsi: mpi3mr: Fixed the W=1 compilation warning
scsi: mpi3mr: Record and report controller firmware faults
scsi: mpi3mr: Update MPI Headers to revision 39
scsi: mpi3mr: Use negotiated link rate from DevicePage0
scsi: mpi3mr: Avoid redundant diag-fault resets
scsi: mpi3mr: Rename log data save helper to reflect threaded/BH context
scsi: mpi3mr: Add module parameter to control threaded IRQ polling
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGJ4AQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpnL0D/9rEC0FGoGM3473v2SUPEkcpSIIab3KMZl5
YW/GcPeE8i/jMTfa/QJjTpp24ke1m+AuMYmEUm+KDHNXNQTjJQVnSFRPxER5dLJl
goarFuk5JVDdzovL8QZuYGQHY3o1G54QLuBpVypBBqZdOQFTk8UwOdLFk4JrHMM2
PkUBUf9lZq9KiH6jdwn3v4qpZsZq93IubZ2dSncDcuZ3l2FbiZG88C3pp7wd+w1Z
VM+xTLkqS3OhXiLVfbmRVM//3PgQZU4bO6k+vjWhlztC+5+ELcuiPyN6nd++6lLw
LHH55T/xzko+TVc7kW7NTZ79MjBrjFKSNq4M/LV5SXSEpUAlqMoEakXVupVrBv+Z
gheHHHas3t8FWKwIkjEH6iV1TyGOxZzxaiQ8MurzQ5v3FC2PdfH+Uqisf56LTNKY
yAI0Ka8JWBv0sImgGB5J5ZzMP+xPxp4oyLEqKGLkUSi2OGtOeSHXwEiNbbJuTJBQ
Q+xGdTXukG+zlMnTIcq0fGFaT0hXYs8a6ZiF7vyaYdW9R6/WdfDlS/YN+gM+JgrD
EH2k29NE4kZi6cPFayfmgnfh10leiyMYt020b0GR4VpCHtthz7k+oasLB400qGrs
X6wad+Y1YfdEpD1SW447ZrM8JmcOVrLCj/yIodyV30Kw0v3QSjKLcp2LoxwvPpRi
Crs4Sb74Lg==
=d42R
-----END PGP SIGNATURE-----
Merge tag 'for-7.0/io_uring-zcrx-large-buffers-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring large rx buffer support from Jens Axboe:
"Now that the networking updates are upstream, here's the support for
large buffers for zcrx.
Using larger (bigger than 4K) rx buffers can increase the effiency of
zcrx. For example, it's been shown that using 32K buffers can decrease
CPU usage by ~30% compared to 4K buffers"
* tag 'for-7.0/io_uring-zcrx-large-buffers-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/zcrx: implement large rx buffer support
Set the family for GC 11.5.4
Fixes: 47ae1f938d ("drm/amdgpu: add support for GC IP version 11.5.4")
Cc: Tim Huang <tim.huang@amd.com>
Cc: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Total patches: 107
Reviews/patch: 1.07
Reviewed rate: 67%
- The 2 patch series "ocfs2: give ocfs2 the ability to reclaim
suballocator free bg" from Heming Zhao saves disk space by teaching
ocfs2 to reclaim suballocator block group space.
- The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one
bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in
various places.
- The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than
PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if
VMCOREINFO_BYTES ever exceeds the page size.
- The 7 patch series "kallsyms: Prevent invalid access when showing
module buildid" from Petr Mladek cleans up kallsyms code related to
module buildid and fixes an invalid access crash when printing
backtraces.
- The 3 patch series "Address page fault in
ima_restore_measurement_list()" from Harshit Mogalapalli fixes a
kexec-related crash that can occur when booting the second-stage kernel
on x86.
- The 6 patch series "kho: ABI headers and Documentation updates" from
Mike Rapoport updates the kexec handover ABI documentation.
- The 4 patch series "Align atomic storage" from Finn Thain adds the
__aligned attribute to atomic_t and atomic64_t definitions to get
natural alignment of both types on csky, m68k, microblaze, nios2,
openrisc and sh.
- The 2 patch series "kho: clean up page initialization logic" from
Pratyush Yadav simplifies the page initialization logic in
kho_restore_page().
- The 6 patch series "Unload linux/kernel.h" from Yury Norov moves
several things out of kernel.h and into more appropriate places.
- The 7 patch series "don't abuse task_struct.group_leader" from Oleg
Nesterov removes the usage of ->group_leader when it is "obviously
unnecessary".
- The 5 patch series "list private v2 & luo flb" from Pasha Tatashin
adds some infrastructure improvements to the live update orchestrator.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA
jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6
0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww=
=mmsH
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
disk space by teaching ocfs2 to reclaim suballocator block group
space (Heming Zhao)
- "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
ARRAY_END() macro and uses it in various places (Alejandro Colomar)
- "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
page size (Pnina Feder)
- "kallsyms: Prevent invalid access when showing module buildid" cleans
up kallsyms code related to module buildid and fixes an invalid
access crash when printing backtraces (Petr Mladek)
- "Address page fault in ima_restore_measurement_list()" fixes a
kexec-related crash that can occur when booting the second-stage
kernel on x86 (Harshit Mogalapalli)
- "kho: ABI headers and Documentation updates" updates the kexec
handover ABI documentation (Mike Rapoport)
- "Align atomic storage" adds the __aligned attribute to atomic_t and
atomic64_t definitions to get natural alignment of both types on
csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)
- "kho: clean up page initialization logic" simplifies the page
initialization logic in kho_restore_page() (Pratyush Yadav)
- "Unload linux/kernel.h" moves several things out of kernel.h and into
more appropriate places (Yury Norov)
- "don't abuse task_struct.group_leader" removes the usage of
->group_leader when it is "obviously unnecessary" (Oleg Nesterov)
- "list private v2 & luo flb" adds some infrastructure improvements to
the live update orchestrator (Pasha Tatashin)
* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
procfs: fix missing RCU protection when reading real_parent in do_task_stat()
watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
kho: fix doc for kho_restore_pages()
tests/liveupdate: add in-kernel liveupdate test
liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
liveupdate: luo_file: Use private list
list: add kunit test for private list primitives
list: add primitives for private list manipulations
delayacct: fix uapi timespec64 definition
panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
netclassid: use thread_group_leader(p) in update_classid_task()
RDMA/umem: don't abuse current->group_leader
drm/pan*: don't abuse current->group_leader
drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
drm/amdgpu: don't abuse current->group_leader
android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
android/binder: don't abuse current->group_leader
kho: skip memoryless NUMA nodes when reserving scratch areas
...