Following the previous set of fixes, this addresses another significant
number of small issues found in firmware drivers (tee, optee, qcomtee,
qcom ice, exynos acpm) drivers through various tools. This is about
error handling, resource leaks, concurrency and a use-after-free bug.
The fixes for the Qualcomm ICE driver also introduce interface changes
in the UFS and MMC drivers using it.
Outside of firmware drivers, there are a few fixes across the tree:
- Minor driver code mistakes in the Atmel EBI memory controller,
the i.MX soc ID driver and socfpga boot logic
- A defconfig change to avoid a boot time regression on multiple
qualcomm boards
- Device tree fixes for qualcomm, at91 and gemini, addressing
mostly minor configuration mistakes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmofFFQACgkQmmx57+YA
GNkoZxAAjEoE6xgVJjQBQug02+puOll3DSjYjYahg434nUoIpfY4bU4247eUtvxi
QU5kSRQbMaYBpBM5pqgqdG0P0KiP8UsrDWgGr1CbHhCtO2H2cJ6ICUFAXmgpZ+n3
P7R4hROu+EoRb6urgRG6koL6LYId1nRSKOvWsPEz8cXcVE/nwdaYgU6GB9aS2B0v
zAcLGMtACNU9iiDZNW+pt97CkMr38pjEkcmWxfQBSqcjck0JsujHCuCWLwPALKAo
V1aSKPgg1YUMs3+2LeXyhv5rFrBmXfRJ1v7unLKXAvJ9k+DZb63D5AIFT6xjD7Qh
nF/IgPmiFPaYKVskTWS7UHWVLZY7mBstb4gWei1fNE8deCXA35ntuNSg1YkIabvF
s/g5g2/EuiCTobZLO+xAGHJvfB/iVx2k2w6CzYpxXtOOf8CNzskWkRnerK3RF+TM
LIN1JhrZJzAnHL+Z8jv3z2+vo1sUOMuQax723xYoh/7LUUr1yp0hBJExkjJJ8s3q
5GOob3WnjH9n15OAsJvNXlKIWstIi/BPSXyATmca6tDsEnuv3KE9Sok4ZT/Gsjxk
2L/aIlnnu6qX4BlQ7Hrfl2+LDp7jX1RP7MFMGxHscD36ws7ZU9asoUc5MxPrnZno
Oay2HTyiA42YvsN+R6oI331VSTd/hXLemqvP9JpENvSbpuGm2So=
=uqVQ
-----END PGP SIGNATURE-----
Merge tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"Following the previous set of fixes, this addresses another
significant number of small issues found in firmware drivers (tee,
optee, qcomtee, qcom ice, exynos acpm) drivers through various tools.
This is about error handling, resource leaks, concurrency and a
use-after-free bug.
The fixes for the Qualcomm ICE driver also introduce interface changes
in the UFS and MMC drivers using it.
Outside of firmware drivers, there are a few fixes across the tree:
- Minor driver code mistakes in the Atmel EBI memory controller, the
i.MX soc ID driver and socfpga boot logic
- A defconfig change to avoid a boot time regression on multiple
qualcomm boards
- Device tree fixes for qualcomm, at91 and gemini, addressing mostly
minor configuration mistakes"
* tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits)
firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
firmware: samsung: acpm: Fix missing LKMM barriers in sequence allocator
firmware: samsung: acpm: Fix false timeouts and Use-After-Free in polling
ARM: dts: gemini: Fix partition offsets
ARM: socfpga: Fix OF node refcount leak in SMP setup
soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found
arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
tee: qcomtee: add missing va_end in early return qcomtee_object_user_init()
tee: fix params_from_user() error path in tee_ioctl_supp_recv
tee: shm: fix shm leak in register_shm_helper()
tee: fix tee_ioctl_object_invoke_arg padding
arm64: defconfig: Enable PCI M.2 power sequencing driver
scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get()
mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get()
soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL
soc: qcom: ice: Return -ENODEV if the ICE platform device is not found
soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get()
ARM: dts: microchip: sam9x7: fix GMAC clock configuration
firmware: samsung: acpm: Fix mailbox channel leak on probe error
...
address post-7.1 issues or aren't considered suitable for backporting.
There's a 3 patch series "userfaultfd: verify VMA state across UFFDIO_COPY
retry" from Mike Rapoport which fixes a few uffd things. The rest are
singletons - please see the individual changelogs for details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCah5U6wAKCRDdBJ7gKXxA
jmzcAP9N+WQo4qNZYYjSURqJof48Q4nght5C0ZHtsVk5itNJEQEAiecouCreqDSE
VUY9mQHyEawIfORkPTUijnkjV8b+lwc=
=m8J3
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2026-06-01-20-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"13 hotfixes. All are for MM. 10 are cc:stable and the remaining 3
address post-7.1 issues or aren't considered suitable for backporting.
There's a three-patch series "userfaultfd: verify VMA state across
UFFDIO_COPY retry" from Mike Rapoport which fixes a few uffd things.
The rest are singletons - please see the individual changelogs for
details"
* tag 'mm-hotfixes-stable-2026-06-01-20-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
userfaultfd: remove redundant check in vm_uffd_ops()
userfaultfd: refuse to __mfill_atomic_pte() for unsupported VMAs
userfaultfd: verify VMA state across UFFDIO_COPY retry
mm/huge_memory: update file PMD counter before folio_put()
mm/huge_memory: update file PUD counter before folio_put()
mm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback
mm/damon/ops-common: call folio_test_lru() after folio_get()
mm/cma: fix reserved page leak on activation failure
mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison
mm/hugetlb: restore reservation on error in hugetlb folio copy paths
mm/cma_debug: fix invalid accesses for inactive CMA areas
memcg: use round-robin victim selection in refill_stock
mm/hugetlb: avoid false positive lockdep assertion
* Fix potential out-of-bound access in line-display library
* Miscellaneous refactoring and cleaning up
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAmodSKkACgkQb7wzTHR8
rCgnTBAAoHqEq6XKc6qqsShvyDhcyiyGdGasF1a1kfMVRTjRYa/KytEgD8Jl27N/
MTmN3mq+Z2INKovcs4iGoRUAFxMAs9pzRqBQHI6u98mhwbMuWklCzaRa9NyrdEAI
iHYpWZ557qGsn+aU/N9Q9Dy78FIOFlKSuvgbLQmjr1XQUwpbc7EkPO0o3QD/o7qP
BphL0FpP6CMhLsb7+ZyFjjN+ntftvZ92LGVaBBiZA+xqDHHTzcEVHDNdl4C/0ZjS
JIDTtStCgcCBHg/EUHJ68Oa2yEqOuWsuclX+YoL22Zu2qi7UW0FFhQEXiKtx/eMi
HS2C1aYGVQVLvFjuTe1lYO1PbxCL5ddhvZarz+8LfvK03T6ExPHkB/Z1VUWJoE+L
3ZrGpjFw6LOkk3MkeCul2MNclmvspPz2XKUern9+pX62NWbOxSbzRNknDH87muSF
iQMpo8qTNf+9XlyxC2EGNS6IqwAJ/qXCXUmXdFJzRoWtwLmm6CBiwZGxaPE5XwsB
iWvitQj0dkPjcgungKhl0c/eZ0wpZ8HFEh4muUIEIP/uZWiNjQywGGXyuduhlKvs
ro2uDWSBG/7EVds676abD3Po/PRYUzxD3KgrN2CG0C+8YuqWc8tO7pcWtKWgI+yt
tw143EAS30/HBCsP7A+wdNG0Rddlv4NaE5PTaz7zbQmOc5YqFho=
=VUfV
-----END PGP SIGNATURE-----
Merge tag 'auxdisplay-v7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay
Pull auxdisplay updates from Andy Shevchenko:
- Fix potential out-of-bound access in line-display library
- Miscellaneous refactoring and cleaning up
[ Andy says this could easily be delayed until 7.2, but it's _so_ tiny
that it's more work for me to schedule it for later than to just take
it now, and just doesn't seem worth delaying - Linus ]
* tag 'auxdisplay-v7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
auxdisplay: Kconfig: drop unneeded quotes in PANEL_BOOT_MESSAGE dep
auxdisplay: line-display: fix OOB read on zero-length message_store()
auxdisplay: max6959: use regmap_assign_bits() in max6959_enable()
commit 2d1f7b65f5 ("dm cache policy smq: fix missing locks in
invalidating cache blocks") added mq->lock around the destructive part of
smq_invalidate_mapping(), but left the e->allocated check outside the
critical section.
That leaves a check-then-act race. Two concurrent invalidators can both
observe e->allocated as true before either of them takes mq->lock. The
first invalidator that acquires the lock removes the entry from the
queues and hash table and then calls free_entry(), which clears
e->allocated and puts the entry back on the free list. The second
invalidator can then acquire mq->lock and continue with the stale result
of the unlocked check.
This can corrupt the SMQ queues or hash table by deleting an entry that
is no longer on those structures. It can also hit the allocation check in
free_entry() when the same entry is freed again.
Move the allocation check under mq->lock so the predicate and the
destructive operations are serialized by the same lock.
Fixes: 2d1f7b65f5 ("dm cache policy smq: fix missing locks in invalidating cache blocks")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
This update includes:
- a fix for the GMAC DT node on SAM9X7 SoC to properly describe the
available clocks
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTsZ8eserC1pmhwqDmejrg/N2X7/QUCahxHEwAKCRCejrg/N2X7
/aZcAP9W2CfnPGFuBHoT4DR3NLDcqgt0p6hni2RpV3P+ELK1NgEAqIYlIyf0RXbV
j4TU9cbq6Z7iXb2hsMjF1gebKj1V6QU=
=Z26Y
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmodlkkACgkQmmx57+YA
GNkk2BAAsLgM/0tVJsx22CF0KouOrDhyjjbKt7/i70qnCkhC9K831X9O8dFBYIma
8hZT/Rc0PWX9KueQl2IqiI+KKY2kjZ+zJGS64j21LVmdzPgVgR3/gzgf81IIzJMO
aBuL7JbOZAO8aZ5sJQc6xvPo+fdO1sZmBU0yMRLZbSiOyH3rbpDO7KVAJzOo95F5
GC3cNPGqSt91ytNSBI8E8OhVrJktr7ssgKOsIaGRxfERVTBA+utGfqJucDQJHIOC
VDGg/31KQfvkz2cgrqFKzP6xAKk2drqM5Z5b7jZpu4FQkSEz2lAeqQEE1i+9eDx4
EntZUaro1xQUOmk4J2MYLb0Kbf89jEvQ33wZogYpjLWsUt4nSAeviS2fcN4B8OQ7
L7oYHeQh2hZoqncaz8xRT8R5WHvN8udI29LHsPBVNaka/BTphkgE35Ggxww5H0Ij
XeWxyx4+DpxJk2NZ0UdlF1tUCSz5hEInLMkJ8uOsY5FFmBa8h57KHnCymu/wxn8G
CFm/lc4p/VkIrivU0Kn9YHYT1HtBpYE2q9U9clLPthxI/IGma7DjF8qy+FXXQ51m
0zdp/OcCVcVWIpOPj+iFyucQUpxY+mdAEWN5ppIf3yZVYxcxFWVOW6ADzHP1G9N2
1ryn6jJIqp6mm9W+T7hNbgUqzraTGH4gHGjUD+79dVQeL1HVZZk=
=xRm6
-----END PGP SIGNATURE-----
Merge tag 'at91-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes
Microchip AT91 fixes for v7.1
This update includes:
- a fix for the GMAC DT node on SAM9X7 SoC to properly describe the
available clocks
* tag 'at91-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
ARM: dts: microchip: sam9x7: fix GMAC clock configuration
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Lorenzo says:
static const struct vm_uffd_ops *vma_uffd_ops(struct vm_area_struct *vma)
{
if (vma_is_anonymous(vma))
return &anon_uffd_ops;
return vma->vm_ops ? vma->vm_ops->uffd_ops : NULL;
}
This is doing a redundant check _and_ making life confusing, as if
!vma->vm_ops is a condition that can be reached there, it can't, as
vma_is_anonymous() is literally a !vma->vm_ops check :)
Remove the redundant check.
Link: https://lore.kernel.org/20260527184751.4147364-4-rppt@kernel.org
Fixes: 0f48947c42 ("userfaultfd: introduce vm_uffd_ops")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: David Carlier <devnexen@gmail.com>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__mfill_atomic_pte() unconditionally dereferences ops because there is an
assumption that VMAs that can undergo mfill_* operations are vetted on
registration and must have valid vm_uffd_ops.
Add a guard against potential bugs and make sure __mfill_atomic_pte()
bails out if ops is NULL.
Link: https://lore.kernel.org/20260527184751.4147364-3-rppt@kernel.org
Fixes: ad9ac30813 ("userfaultfd: introduce vm_uffd_ops->alloc_folio()")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: David CARLIER <devnexen@gmail.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "userfaultfd: verify VMA state across UFFDIO_COPY retry", v2.
... and two more small fixes.
This patch (of 3):
mfill_copy_folio_retry() drops the VMA lock for copy_from_user() and
reacquires it afterwards. The destination VMA can be replaced during that
window.
The existing check compares vma_uffd_ops() before and after the retry, but
if a shmem VMA with MAP_SHARED is replaced with a shmem VMA with
MAP_PRIVATE (or vice versa) the replacement goes undetected.
The change from MAP_PRIVATE to MAP_SHARED will treat the folio allocated
with shmem_alloc_folio() as anonymous and this will cause BUG() when
mfill_atomic_install_pte() will try to folio_add_new_anon_rmap().
The change from MAP_SHARED to MAP_PRIVATE allows injection of folios into
the page cache of the original VMA.
There is no need to change for hugetlb because it never uses
mfill_copy_folio_retry().
Introduce helpers for more comprehensive comparison of VMA state:
- mfill_retry_state_save() to save the relevant VMA state into a struct
mfill_retry_state (original uffd_ops, relevant VMA flags, vm_file and
pgoff) before dropping the lock
- mfill_retry_state_changed() to compare the saved state with the state
of the VMA acquired after retaking the locks
- mfill_retry_state_put() to release vm_file pinning.
Use DEFINE_FREE() cleanup to wrap mfill_retry_state_put() to avoid
complicating error handling paths in mfill_copy_folio_retry().
Link: https://lore.kernel.org/20260527184751.4147364-1-rppt@kernel.org
Link: https://lore.kernel.org/20260527184751.4147364-2-rppt@kernel.org
Fixes: 292411fda2 ("mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()")
Fixes: 6ab703034f ("userfaultfd: mfill_atomic(): remove retry logic")
Co-developed-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Co-developed-by: David Carlier <devnexen@gmail.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__split_huge_pmd_locked() updates the file/shmem RSS counter after
dropping the PMD mapping's folio reference. If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().
Move the counter update before folio_put().
Link: https://lore.kernel.org/20260526101337.1984081-1-yintirui@huawei.com
Fixes: fadae29530 ("thp: use mm_file_counter to determine update which rss counter")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__split_huge_pud_locked() updates the file/shmem RSS counter after
dropping the PUD mapping's folio reference. If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().
Move the counter update before folio_put().
Link: https://lore.kernel.org/20260526101355.1984244-1-yintirui@huawei.com
Fixes: dbe5415329 ("mm/huge_memory: add vmf_insert_folio_pud()")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
vmemmap_restore_pte() rebuilds restored vmemmap pages from a tail-page
template derived from compound_head(). This is wrong when the current PTE
already maps a page whose contents are not tail-page metadata.
In the rollback path of vmemmap_remap_free(), the first restored PTE is
backed by vmemmap_head and contains head-page metadata. Reconstructing
that page from a tail-page template overwrites the head-page state and
corrupts the restored vmemmap page.
Fix this by copying the full page from the page currently mapped by the
PTE. Also pass vmemmap_tail to the rollback walk so only PTEs backed by
the shared tail page are restored, while the head PTE remains mapped to
vmemmap_head. Add VM_WARN_ON_ONCE() checks for unexpected cases.
Link: https://lore.kernel.org/20260525025213.2229628-1-songmuchun@bytedance.com
Fixes: c0b495b91a ("mm/hugetlb: refactor code around vmemmap_walk")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Kiryl Shutsemau <kas@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
damon_get_folio() speculatively calls folio_test_lru() before
folio_try_get(). The folio can get freed and reallocated to a tail page.
In the case, VM_BUG_ON_PGFLAGS() in const_folio_flags() can be triggered.
Remove the speculative call.
Also mark folio_test_lru() check right after folio_try_get() success as no
more unlikely.
The race should be rare. Also the problem can happen only if the kernel
has enabled CONFIG_DEBUG_VM_PGFLAGS. No real world report of this issue
has been made so far. This fix is based on only theoretical analysis.
That said, a bug is a bug. A similar issue was also fixed via commit
3203b3ab0f ("mm/filemap: don't call folio_test_locked() without a
reference in next_uptodate_folio()"). I don't expect this change will
make a meaningful impact to DAMON performance in the real world, though I
will be happy to be corrected from the real world reports.
The issue was discovered [1] by Sashiko.
Link: https://lore.kernel.org/20260525162256.8317-1-sj@kernel.org
Link: https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org [1]
Fixes: 3f49584b26 ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Fernand Sieber <sieberf@amazon.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Make the clearcpuid= boot parameter less prominent
and warn about its dangers & caveats (Borislav Petkov)
- Do not access the (new) PLATFORM_ID MSR when running as a guest
(Borislav Petkov)
- x86 ftrace: Relocate %rip-relative percpu refs in dynamic
trampolines, to fix crash when using such trampolines
(Alexis Lothoré)
- Fix x86-64 CFI build error (Peter Zijlstra)
- Revert FPU signal return magic number check optimization, because
it broke CRIU and gVisor in certain FPU configurations
(Andrei Vagin)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmob2swRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jCdA//aKJpLZRwT3cHs/Wmu8oh+MbWVcOJqavz
Vkdbh5EEDQJ0/HAbrt/ms3gismg08GBTSpycJIKCF+p98n0s3hjBSzVJjF1kLbT+
vCZH4M3qXiYylghu+gHCYwCPr8cWEWjf+x8nIBlBdHgDPJ88g09VHmtHRdnwFMOt
lA6lew7UfNlUqxMA6p58elIZtbJvRG7QpUHHjOidut5uBSvrij08vPo1IYFAqUw8
DvQnvXEv1jit8zAOp7gQDheqdzP058zk6kR177taIxLDJsJlqL8ozJRDW4NB72v2
HBaj8DiXzsXyovck/H/t36nv5ikotNjDvlP31RZfTZ5VIlfvyNPF1vb38kC8lVlC
gfq2H8Bw3pktldDKTFh0f1XnkmmiDTEf/zsDqeMSDXfF4t5LPw15nfRLzU9rO4nd
po6s9GhLdo5lOv6BG2dsFhYxLSZTEHFTe03Q8EzGNgYUJupJVAtRwmnGQP0zv9QJ
/nMUhZQPGN5PCeXo2YMdyB+wWaKVRjfAE2lz/V97Vb3WfjSAQ1eF+kb0keDFFwQD
3HQq8k2nxLFIXEX7LJa+qTH2Hnu0PZgoy9yRreo2sOaUHcTfzAIxyT5RzeKl+eDw
0wUjDGkXS7vCzIOMp4+H2HC3r5aeCTX9YY0XWhxs8M0UcGzO8XEKdgjJTD6KiKCY
RX9Uj2FmmIA=
=4sLh
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Make the clearcpuid= boot parameter less prominent
and warn about its dangers & caveats (Borislav Petkov)
- Do not access the (new) PLATFORM_ID MSR when running
as a guest (Borislav Petkov)
- x86 ftrace: Relocate %rip-relative percpu refs in dynamic
trampolines, to fix crash when using such trampolines
(Alexis Lothoré)
- Fix x86-64 CFI build error (Peter Zijlstra)
- Revert FPU signal return magic number check optimization,
because it broke CRIU and gVisor in certain FPU configurations
(Andrei Vagin)
* tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86/fpu: Refine and simplify the magic number check during signal return"
x86/kvm/vmx: Fix x86_64 CFI build
x86/ftrace: Relocate %rip-relative percpu refs in dynamic trampolines
x86/microcode: Do not access MSR_IA32_PLATFORM_ID when running as a guest
Documentation/arch/x86: Hide clearcpuid=
Two core changes, the only one of significance being the change to kick queues
in SDEV_CANCEL which had a small window for stuck requests. The major driver
fixes are the one to the FC transport class to widen the FPIN counter to
counter a theoretical (and privileged) fabric traffic injection attack and the
other is an iscsi fix where a malicious target could trick the kernel into an
output buffer overrun. Both the driver fixes were AI assisted.
Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
-----BEGIN PGP SIGNATURE-----
iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCahxQ/RsUgAAAAAAEAA5t
YW51MiwyLjUrMS4xMiwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy
c2hpcC5jb20ACgkQ50LJTO6YrIX+HgD+Mqf+AKbV/EhPhKAfONeBaE0Q5e78KTyK
3e47Qxjs+mQBAKhWUwodLDS/WSm7Fbdj7tn/kuPSaH+R+JQltnPR4QPG
=BRJ7
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two core changes, the only one of significance being the change to
kick queues in SDEV_CANCEL which had a small window for stuck
requests.
The major driver fixes are the one to the FC transport class to widen
the FPIN counter to counter a theoretical (and privileged) fabric
traffic injection attack and the other is an iscsi fix where a
malicious target could trick the kernel into an output buffer overrun.
Both the driver fixes were AI assisted"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: iscsi: Validate CHAP_R length before base64 decode
scsi: target: iscsi: Bound iscsi_encode_text_output() appends to rsp_buf
scsi: target: iscsi: Fix CRC overread and double-free in iscsit_handle_text_cmd()
scsi: fcoe: Reject FIP descriptors with zero fip_dlen in CVL walker
scsi: scsi_transport_fc: Widen FPIN pname walker counter to u32
scsi: scsi_debug: Add missing newline in scsi_debug_device_reset()
scsi: megaraid_sas: Fix NULL pointer dereference on firmware duplicate completion
scsi: devinfo: Add BLIST_NO_RSOC for Promise VTrak E310f
scsi: core: Run queues for all non-SDEV_DEL devices from scsi_run_host_queues
davinci: fix fallback bus frequency on missing clock-frequency
virtio: mark device ready initially
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmocCPMACgkQFA3kzBSg
KbaSxw/+Kp+Rcj0G2RTRqpf5d7V9KIY35l+7gerd8/F7+zfIVNDJrz0FQfiTn6hr
nB0EPmKtq1VnJDMxCFeCT0P4AoNBFbGCXxmCXkVnypUiLeHooBFZH5d8f+HCII4c
t72HKZ0OgitAu12ApgnNFuMoInUnVrQy64DfIjkDxQLWva08WfW2mT74Ok61OJHd
TjgononOu/Tsrk/F0fekRdxfAXx8myULOG/kLIxRNlv4ysv962ros2t0i05PtYyg
z+wz0uZ1KP06nmOwZz0Pm1dlVHNuAlWqfzczF4Drts4Qj3vIHhuhIVQM4zGdylSR
39KGiDmpOomPg9OIg70iMGw/FBDPd6UcOj20AWrTKsRENTU2xnIslQ/9Nk00RF7Q
mP1kBolEkaXF7pugdeBDDWi7iwbawT6dpKlv/hSr6hAnBJXnLO+QxXaMproEl/1f
igRoYx/Q/BCozYp3aR7ZahUKUF+xfpIkSfFZ02DUcbf2mBsSmQNXwN1VjrUkBqnQ
VSYLZsBz+kqgkF+yAr2sL7BfTYzFCKeJMPOfBK4OF3wXZ43omCCWzr2hxMTGehTS
17K5lHtzxneaAdTDLrFDUe8YBMSvKRolGcnIcnrNM/4fc7QEYBDaYns6j05vpnrP
+4rDv4Db/hRxL6FsUuVaMc5aW9tdT+oxkVv9jea+2EbI7rxdqIM=
=AcAd
-----END PGP SIGNATURE-----
Merge tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- davinci: fix fallback bus frequency on missing clock-frequency
- virtio: mark device ready initially
* tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: virtio: mark device ready before registering the adapter
i2c: davinci: fix division by zero on missing clock-frequency
- updates to Elan I2C touchpad driver to handle a new IC type and to
validate size of supplied firmware to prevent OOB access
- updates to Xpad controller driver to recognize ASUS ROG RAIKIRI II and
"Nova 2 Lite" from GameSir controllers as well as a fix to prevent a
potential OOB access when handling "Share" button
- an update to Synaptics touchpad driver to use RMI mode for touchpad in
Thinkpad E490
- updates to Atmel MXT driver adding checks to prevent potential OOB
accesses
- a fix to IMS PCU driver to free correct amount of memory when
tearing it down
- a fixup to the recent change to Atlas buttons driver
- a small cleanup in fm801-fp for PCI IDs table initialisation
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCahvBFwAKCRBAj56VGEWX
nP7XAQDH9WIYBX7XEHSPInN6GH8GngWA/94v3b6UpBSL2UFdQQEAo1+WoyOGL/Dh
0U0g5CcvDwx9irF8qJBWX2a3DRNPngo=
=hDBA
-----END PGP SIGNATURE-----
Merge tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- updates to Elan I2C touchpad driver to handle a new IC type and to
validate size of supplied firmware to prevent OOB access
- updates to Xpad controller driver to recognize ASUS ROG RAIKIRI II
and "Nova 2 Lite" from GameSir controllers as well as a fix to
prevent a potential OOB access when handling "Share" button
- an update to Synaptics touchpad driver to use RMI mode for touchpad
in Thinkpad E490
- updates to Atmel MXT driver adding checks to prevent potential OOB
accesses
- a fix to IMS PCU driver to free correct amount of memory when tearing
it down
- a fixup to the recent change to Atlas buttons driver
- a small cleanup in fm801-fp for PCI IDs table initialisation
* tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ims-pcu - fix usb_free_coherent() size in ims_pcu_buffers_free()
Input: synaptics - add LEN2058 to SMBus passlist for ThinkPad E490
Input: atlas - check ACPI_COMPANION() against NULL
Input: atmel_mxt_ts - check mem_size before calculating config memory size
Input: atmel_mxt_ts - fix boundary check in mxt_prepare_cfg_mem
Input: fm801-gp - simplify initialisation of pci_device_id array
Input: xpad - add "Nova 2 Lite" from GameSir
Input: xpad - add support for ASUS ROG RAIKIRI II
Input: elan_i2c - validate firmware size before use
Input: xpad - fix out-of-bounds access for Share button
Input: usbtouchscreen - clamp NEXIO data_len/x_len to URB buffer size
Input: elan_i2c - increase device reset wait timeout after update FW
Input: elan_i2c - add ic type 0x19
* fix order calculation for kho_unpreserve_pages() to make sure sure that
the order calculation in kho_unpreserve_pages() mathes the order
calculation in kho_preserve_pages().
* fix math in calculation of KHO_TREE_MAX_DEPTH to make it work with 16KB
pages.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmobGd4ACgkQOQOGJssO
/ZGaRQf/a0vTak489XqXhddHkulyMnif1UEEYsltxQ8bJkC0SRx+v/PcC0Uf2g7+
n/1vQZxWGUlLawjMDhubCWp2JawRZh9/rzPfb96z3nsjUckaQI3sKdEe7fK9jIVL
2y2QHa26RJj7dlEcJbUToSgVbRrP8qJbiUVjo1i3ViVFsevj1gaNBo8h8oJa694z
S1wXndBz7HYdSNuRgMc5rGUbzgVu9rl2rdTHR6ecRUfTVuQr1ZYrb7v6wi4AI3XL
KZp6TXmDuvPikJwoWsQtBRK5VmLQxsCa5ryu4M+GEBOwezNZex29Yi2TrZWZ2KZk
ViCFzYQLHY7RUrlhL7+tGN2SiLYKqg==
=LtZL
-----END PGP SIGNATURE-----
Merge tag 'liveupdate-fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux
Pull liveupdate fixes from Mike Rapoport:
"Two kexec handover regression fixes:
- fix order calculation for kho_unpreserve_pages() to make sure sure
that the order calculation in kho_unpreserve_pages() mathes the
order calculation in kho_preserve_pages().
- fix math in calculation of KHO_TREE_MAX_DEPTH to make it work with
16KB pages"
* tag 'liveupdate-fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux:
kho: fix order calculation for kho_unpreserve_pages()
kho: fix KHO_TREE_MAX_DEPTH for non-4KB page sizes
After refactoring of memblock_free_late() and free_init_pages() it became
possible to call memblock_free() after memblock init data was discarded.
Make sure memblock_free() does not touch memblock.reserved unless it is
called early enough or when ARCH_KEEP_MEMBLOCK is enabled.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmobERsACgkQOQOGJssO
/ZGiqQgArOfAxLDWWinyherNcejWY+GKsNdWNYoGOv8UEw2oTTBmjyrOqHrGcevQ
YlLjBxc2v9LzW9wCRnW0ngoq6/28ABLwyLpB+sMyHU8KaJDyYnAhfe7xt59aqE2N
JuQaSRY8irZG8g2Yks2ZWIPbDoIXJVvGI342L96OYLO63eehV9u5e7kbBebOZpH1
JlnbsaMGjhh2RgLrWWEy4EW1NZ5bYHer6fmCVIlUWtz9X67OjKD5na8bdi9ADEay
Wu2CjYwZFdScM4FQ8r2l9UHxjnU8EQKRxVOm2+hO8wRiE3efzezDCp5Laacvf9o6
KLrkVONfYUpScis3qLpNZly3IjAeVw==
=H1VN
-----END PGP SIGNATURE-----
Merge tag 'fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix regression from memblock_free_late() refactoring
After refactoring of memblock_free_late() and free_init_pages() it
became possible to call memblock_free() after memblock init data was
discarded.
Make sure memblock_free() does not touch memblock.reserved unless it
is called early enough or when ARCH_KEEP_MEMBLOCK is enabled"
* tag 'fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: don't touch memblock arrays when memblock_free() is called late
Commit eac69475b0 ("media: rc: igorplugusb: heed coherency
rules") changed the control request storage from an embedded struct to
an allocated pointer so it can obey DMA coherency rules.
However, the driver still passes &ir->request to usb_fill_control_urb().
That points the URB setup packet at the pointer field itself rather than
at the allocated struct usb_ctrlrequest.
USB core then interprets pointer bytes as the setup packet. This can
produce an invalid bRequestType and trigger the control direction warning
reported by syzbot:
usb 2-1: BOGUS control dir, pipe 80003580 doesn't match bRequestType 0
Pass ir->request itself as the setup packet.
Fixes: eac69475b0 ("media: rc: igorplugusb: heed coherency rules")
Reported-by: syzbot+11f0e4f957c7c3bf3d51@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=11f0e4f957c7c3bf3d51
Tested-by: syzbot+11f0e4f957c7c3bf3d51@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Assisted-by: Codex:GPT-5.5
Signed-off-by: Henri A <contact@henrialfonso.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Here is a set of USB fixes and new device ids for 7.1-rc6. Nothing
major in here, just lots of tiny fixes for reported issues found by
users and some older patches found by some scanning tools. Included in
here are:
- typec fixes found by fuzzers that have decided to finally look at
that device interaction path (i.e. before a driver is bound to a
device).
- typec fixes for issues found by users
- thunderbolt driver fixes for reported problems
- cdns3 driver fixes
- dwc3 driver fixes
- new device quirks added
- usb serial driver fixes for broken devices
- other small driver fixes
All of these have been in linux-next for over a week with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCahrHyA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymLRgCeN/oLFmGYFHjcJEZ8d0AjKbRS34oAn3r822bO
1mEsGeOojdWNUm4wzu1k
=pf1m
-----END PGP SIGNATURE-----
Merge tag 'usb-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and Thunderbolt fixes from Greg KH:
"Here is a set of USB fixes and new device ids for 7.1-rc6. Nothing
major in here, just lots of tiny fixes for reported issues found by
users and some older patches found by some scanning tools. Included in
here are:
- typec fixes found by fuzzers that have decided to finally look at
that device interaction path (i.e. before a driver is bound to a
device)
- typec fixes for issues found by users
- thunderbolt driver fixes for reported problems
- cdns3 driver fixes
- dwc3 driver fixes
- new device quirks added
- usb serial driver fixes for broken devices
- other small driver fixes
All of these have been in linux-next for over a week with no reported
issues"
* tag 'usb-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (54 commits)
USB: serial: cypress_m8: validate interrupt packet headers
USB: serial: safe_serial: fix memory corruption with small endpoint
USB: serial: omninet: fix memory corruption with small endpoint
USB: serial: mxuport: fix memory corruption with small endpoint
USB: serial: cypress_m8: fix memory corruption with small endpoint
USB: cdc-acm: Fix bit overlap and move quirk definitions to header
usb: dwc2: Fix use after free in debug code
usb: chipidea: core: convert ci_role_switch to local variable
usb: gadget: f_fs: serialize DMABUF cancel against request completion
usb: gadget: f_fs: copy only received bytes on short ep0 read
usb: gadget: dummy_hcd: Reject hub port requests for non-existent ports
dt-bindings: usb: Fix EIC7700 USB reset's issue
usbip: vudc: Fix use after free bug in vudc_remove due to race condition
dt-bindings: usb: ti,omap4-musb: Drop duplicate 'usb-phy' property constraints
usb: storage: Add quirks for PNY Elite Portable SSD
USB: quirks: add NO_LPM for Lenovo ThinkPad USB-C Dock Gen2 hub controllers
usb: usbtmc: reject interrupt endpoints with small wMaxPacketSize
usb: usbtmc: check URB actual_length for interrupt-IN notifications
xhci: tegra: Fix ghost USB device on dual-role port unplug
usb: gadget: uvc: hold opts->lock across XU walks in uvc_function_bind
...
Here are some small serial driver fixes for 7.1-rc6. Included in here
are:
- mips serial driver fixes to resolve some long-standing issues with
how they interacted with the console. That's the "majority" of the
changes in this merge request
- sh-sci driver regression fix
- 8250 driver regression fixes
- other small serial driver fixes for reported problems.
All of these have been in linux-next for over a week with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCahrImA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykedQCfb1CR0Z6lElo/02m3wmR+EfvGyoUAoLj8QU71
dFaLWzZQk8Hb6ajmVYK5
=tgAC
-----END PGP SIGNATURE-----
Merge tag 'tty-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small serial driver fixes for 7.1-rc6. Included in here
are:
- mips serial driver fixes to resolve some long-standing issues with
how they interacted with the console. That's the "majority" of the
changes in this merge request
- sh-sci driver regression fix
- 8250 driver regression fixes
- other small serial driver fixes for reported problems.
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: dz: Enable modular build
serial: zs: Convert to use a platform device
serial: dz: Convert to use a platform device
serial: zs: Switch to using channel reset
serial: zs: Fix bootconsole handover lockup
serial: dz: Fix bootconsole handover lockup
serial: dz: Fix bootconsole message clobbering at chip reset
serial: 8250_dw: dispatch SysRq character in dw8250_handle_irq()
serial: 8250: dispatch SysRq character in serial8250_handle_irq()
serial: core: introduce guard(uart_port_lock_check_sysrq_irqsave)
tty: serial: samsung: Remove redundant port lock acquisition in rx helpers
serial: altera_jtaguart: handle uart_add_one_port() failures
serial: qcom_geni: fix kfifo underflow when flush precedes DMA completion IRQ
serial: fsl_lpuart: fix rx buffer and DMA map leaks in start_rx_dma
tty: add missing tty_driver include to tty_port.h
serial: qcom-geni: fix UART_RX_PAR_EN bit position
serial: sh-sci: fix memory region release in error path
tty: serial: pch_uart: add check for dma_alloc_coherent()
serial: zs: Fix swapped RI/DSR modem line transition counting
Here are some small char/misc/iio driver fixes for 7.1-rc6. Included in
here are:
- lots of small IIO driver fixes for reported problems.
- Android binder bugfixes for reported issues.
- small comedi test driver fixes
- counter driver fix
- parport driver fix (people still use this?)
- rpi driver fix
- uio driver fix
All of these have been in linux-next for over a week with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCahrJeQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykKIwCfelyNaZgO2yRfSS1fGmzSv3+W8+sAoK5QHkEY
TvJIOm1Cwi8/n3vI42Hz
=EB+S
-----END PGP SIGNATURE-----
Merge tag 'char-misc-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/iio fixes from Greg KH:
"Here are some small char/misc/iio driver fixes for 7.1-rc6. Included
in here are:
- lots of small IIO driver fixes for reported problems.
- Android binder bugfixes for reported issues.
- small comedi test driver fixes
- counter driver fix
- parport driver fix (people still use this?)
- rpi driver fix
- uio driver fix
All of these have been in linux-next for over a week with no reported
problems"
* tag 'char-misc-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (41 commits)
Revert "gpib: cb7210: Fix region leak when request_irq fails"
misc: rp1: Send IACK on IRQ activate to fix kdump/kexec
gpib: cb7210: Fix region leak when request_irq fails
parport: Fix race between port and client registration
uio: uio_pci_generic_sva: fix double free of devm_kzalloc() memory
rust_binder: Avoid holding lock when dropping delivered_death
rust_binder: avoid calling pending_oneway_finished() on TF_UPDATE_TXN
comedi: comedi_test: fix check for valid scan_begin_src in waveform_ai_cmdtest()
comedi: comedi_test: Fix limiting of convert_arg in waveform_ai_cmdtest()
iio: adc: viperboard: Fix error handling in vprbrd_iio_read_raw
iio: gyro: itg3200: fix i2c read into the wrong stack location
iio: dac: ad5686: fix powerdown control on dual-channel devices
iio: dac: ad5686: acquire lock when doing powerdown control
iio: temperature: tsys01: fix broken PROM checksum validation
iio: dac: ad3530r: Fix AD3531/AD3531R powerdown mode strings
iio: buffer: hw-consumer: fix use-after-free in error path
iio: dac: ad5686: fix input raw value check
iio: dac: ad5686: fix ref bit initialization for single-channel parts
iio: ssp_sensors: cancel delayed work_refresh on remove
iio: adc: meson-saradc: fix calibration buffer leak on error
...
virtio_i2c_probe() synchronously probes child i2c drivers on the bus,
but peripherals may use the bus at probe for tasks like reading a chip
id. The vhost-user-i2c backend stalls at such probes unless DRIVER_OK
is already set before the virtqueue is first kicked.
Set DRIVER_OK explicitly before i2c_add_adapter(), as done for the
same reason in commit f5866db64f ("virtio_console: enable VQs
early") and commit 71e4b8bf04 ("virtio_rpmsg: set DRIVER_OK before
using device").
Signed-off-by: Alexis Bouzigues <BouziguesAlexis@JohnDeere.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
davinci: fix fallback bus frequency on missing clock-frequency
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQScDfrjQa34uOld1VLaeAVmJtMtbgUCahpiiQAKCRDaeAVmJtMt
bp+eAQDfsx42uAWytASKf7/lFJcPJvlBPXMa5ZwwqtwlBW+jPwEAned0rKUYyIGz
bpVpIO+H3iiaG325v9dxD56rmu9eEwE=
=Vmap
-----END PGP SIGNATURE-----
Merge tag 'i2c-host-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host-fixes for v7.1-rc6
davinci: fix fallback bus frequency on missing clock-frequency
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmoZxw8ACgkQiiy9cAdy
T1HhnwwAvivw/s84qQhbkgQllNMdb4SPl+Ph+DRMiwyrZjXr36kv8jtiPIRIlplB
Uk+jXpswQXNk6qVKriUzbM1xGTyBin4iFhDzXfLoMmtZtETAmnbHWX9cVFblOibb
o+kMYMRXo+TGvQE5d47VKMioL7W5AUFoXfrIfOvWMhnRBaPwgb/aTblUxLFtHYLw
rhDm24p5JKxHv9YsR5+XWofGP2STstMDgkKBYjqYolmrEaq1ho3qBVQtcGY/DJFT
5heZ/b+Tv8N0s9ccMOAipAW509Qjn3Tml5SvgRCTZ56nEuZHeZBYCoXLhdV1tPG9
iuCPxTKrgFkDOZNSdweZscR5OD3MlbDC103K6W/mDEZk3IIv3ZGYe4atBwiz8kMl
09xvct3UJviHuOWjVgI7TBDV+Y0Gpf7zTeOLfixhn2RrVjU2IwrKUjZBjKGkZAFI
r5YcTK1FOe3a7WwXNYkVXVvTfwqvpIclQCs+qnQqAiEjvBNWvmTtgGg2eOlxEnBo
j4AE8Ryh
=0uMS
-----END PGP SIGNATURE-----
Merge tag 'v7.1-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- security fix for FSCTL_SET_SPARSE
- fix leak in ksmbd_query_inode_status()
- fix OOB read in smb_check_perm_dacl()
* tag 'v7.1-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE
ksmbd: release ksmbd_inode ref via ksmbd_inode_put on lookup paths
ksmbd: OOB read regression in smb_check_perm_dacl() ACE-walk loops
dumb-buffer:
- prevent overflows in dumb-buffer creation
dma-buf:
- fix UAF in dma_buf_fd() tracepoint
gem:
- fix for the fix for the fix for the change handle ioctl
i915:
- Fix potential UAF in TTM object purge
- Use polling when irqs are unavailable
- Fix HDR pre-CSC LUT programming loop
- Block DC states on vblank enable when Panel Replay supported
- Use DC_OFF wake reference to block DC6 on vblank enable
xe:
- Restore IDLEDLY regiter on engine reset
amdgpu:
- GEM_OP warning fix
- GEM_OP locking fix
- Userq fixes
- DCN 2.1 refclk fix
- SI fix
- HMM fixes
amdkfd:
- svm_range_set_attr locking fix
- CRIU restore fix
- KFD debugger fix
amdxdna:
- require IOMMU on AIE2
hyperv:
- improve protocol validation
ivpu:
- test write offset in debugfs
rocket:
- fix UAF in bo creation
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmoaG2kACgkQDHTzWXnE
hr4euhAAl3u8saUVPRr3QObFBUdCi8X8ngZKd0o9L82TEqNz6iC8TFxwyVPr+0Mh
gEi2nvskopDRTb9So8LUAuevuT1r5KZRKH0w3FdR/bSCuO3hxrei0v3lM/L7D7Gn
ULQS1bP3kp4N1Bikt3UP5JSeJDhlpuwMqoapnA4Zz3WRnxVg6aFEYSc7Xb7dOWpS
2k5cJHd0IBxgF3/faw9vOM5DklwZqUAbIBxdziFA7Oepg0OxkypXpBPSioNhEW7f
hj8rlIDRXbLp33RXoL9UyJkXRIkFhjiAhmVcqTgO+nN7w8M2TKBsH/wkWly4d+aS
PveE+15zdRo0Tzw2DbTDYRCMyeBRpeik4s332ILSGse0MMt4RFz3wpB0fB/DriOz
BvHBOnUQerZfPk/dGarC8PxqoE70SZaPrMlvEkBmZpcY3OzIUEwV7Pd+r8W9PWjH
g6ECgPkvCyskJJPugrLYVqQkwjSIF5s+LmKwVfgvuqfyRenXPqrdPwSwEeZb43iC
NXGP8Au8QVsQKNdP6LzASclq3GcxYTfRK/v+wAgaSTpdvOkWhLBtyjPcO3aGWDgM
Ve8YFAPSobetEe5B9esBpliXQKj17juSxdJkut6B63RJ6fPk6r2Mza3lGmhHQs0R
FiyryFtQpzA5iETaiLe4TsaRZzWxzSHZEfV9GediQTRn8gQKEe0=
=tlSw
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2026-05-30' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Regular pull, doesn't seem too insane or AI owned, couple of UAF fixes
and another repair for an earlier fix, mostly amdgpu and i915 display
with xe/i915 accel, and misc core/driver fixes.
It might be a bit bigger than usual at this stage, but I'm not seeing
anything too scary here.
dumb-buffer:
- prevent overflows in dumb-buffer creation
dma-buf:
- fix UAF in dma_buf_fd() tracepoint
gem:
- fix for the fix for the fix for the change handle ioctl
i915:
- Fix potential UAF in TTM object purge
- Use polling when irqs are unavailable
- Fix HDR pre-CSC LUT programming loop
- Block DC states on vblank enable when Panel Replay supported
- Use DC_OFF wake reference to block DC6 on vblank enable
xe:
- Restore IDLEDLY regiter on engine reset
amdgpu:
- GEM_OP warning fix
- GEM_OP locking fix
- Userq fixes
- DCN 2.1 refclk fix
- SI fix
- HMM fixes
amdkfd:
- svm_range_set_attr locking fix
- CRIU restore fix
- KFD debugger fix
amdxdna:
- require IOMMU on AIE2
hyperv:
- improve protocol validation
ivpu:
- test write offset in debugfs
rocket:
- fix UAF in bo creation"
* tag 'drm-fixes-2026-05-30' of https://gitlab.freedesktop.org/drm/kernel: (33 commits)
drm/gem: fix race between change_handle and handle_delete
drm: prevent integer overflows in dumb buffer creation helpers
dma-buf: fix UAF in dma_buf_fd() tracepoint
drm/amdgpu: fix calling VM invalidation in amdgpu_hmm_invalidate_gfx
drm/amdgpu: fix amdgpu_hmm_range_get_pages
drm/amdgpu/userq: use array instead of list for userq_vas
drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid
drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger
drm/amdgpu/userq: remove amdgpu_userq_create/destroy_object wrapper
drm/amd/pm/si: Disregard vblank time when no displays are connected
drm/amdkfd: Check for pdd drm file first in CRIU restore path
drm/amdgpu: fix potential overflow in fs_info.debugfs_name
drm/amdgpu/userq: make sure queue is valid in the hang_detect_work
drm/amdgpu/userq: reserve root bo without interruption
drm/amdgpu/userq: add amdgpu_bo_unpin when amdgpu_ttm_alloc_gart fails
drm/amdgpu: simplify return value in amdgpu_userq_get_doorbell_index
drm/amdkfd: fix NULL pointer bug in svm_range_set_attr
drm/amd/display: Write REFCLK to 48MHz on DCN21
drm/amdgpu/userq: Fix the mutex_init cleanup for fence_drv_lock
drm/amdgpu/userq: Fix doorbell object cleanup of queue
...
One substantive fix here, fixing corruption of the maximum frequency for
spi-mem operations which caused users to remember what should have been
a temporarily modified maximum frequency as the standard going forward,
potentially causing instability when the modification raised rather than
lowered the frequency.
We also have a trivial patch which just documents the correct way to
describe the Qualcomm IPQ5210 SNAND controller in the DT, there are no
code changes.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmoZyKAACgkQJNaLcl1U
h9BkSQf+OPY/DuHr46oL4AH36df3FkihkFmaTlipyP2io7HiTgYG67wdXtVfh4H2
w8D9GT+ukrCGgWkJK+Qgf3IbX19BkL/1kyeDATBJErMd/XHQqn6u38+Lhe3EAqzP
L4S/pWHmtC/pKynPbsUUkunAYeUa3DK6ZHteZyCe/R+fTKizXOH5Lh74Dcm/pa+V
b8Aut8Igq32K5KXDkk+TzACkiGDaFs+M7QDfNGI9WN3zBrzxWtS/9ktF5/cN5gBG
dzjcLb+XJoZnJHwMa7geS7cVvKpPMbqX8fYwip+hhaPfyugP9EvdPibcAhUltGw3
TNFYV4HpFeGuf7yIoEIzapvVdvvbjA==
=iELF
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"One substantive fix here, fixing corruption of the maximum frequency
for spi-mem operations which caused users to remember what should have
been a temporarily modified maximum frequency as the standard going
forward, potentially causing instability when the modification raised
rather than lowered the frequency.
We also have a trivial patch which just documents the correct way to
describe the Qualcomm IPQ5210 SNAND controller in the DT, there are no
code changes"
* tag 'spi-fix-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-mem: avoid mutating op template in spi_mem_supports_op()
spi: dt-bindings: spi-qpic-snand: Add ipq5210 compatible
Some other fixing in an API user turned up the fact that we weren't
correctly applying cache only mode to volatile registers in
regmap_update_bits(), causing us to try to access hardware that was
powered off or otherwise not in a state to accept I/O. This fix returns
an error instead, avoiding more serious consequences.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmoZyXsACgkQJNaLcl1U
h9DJZgf/aOLHE/EBIBcq9QBczGj5Js2/diVO4YCR0WGISbYv4tRfhwkO5SMyPFZ0
Zk7c9rfzEqauQW3tpsPUhg/B5gK6HOZ/gIAZA7+CjmLCDxg2EBxclFFnl7UKWt8d
Xs5YdokBL4ZrlVBtgL3YerQ4dCSiDr6FLZYAnFWy5FLXkNbwqvxhUzc7LzAnY/Z3
pLw/LOnSc1LwXhf10gCKI8OoHdQSPu0pNr9ZYG1smD1J/K9V9Pgbdq0oLrGquwB1
F8mIFdncblhW4ChbWDUxhF26htKpv4qwdjWKdkHlNDXAYPrS70Ea2PqVP1XwEWdK
WxouWVBIpOOqGt0OCYooADmHADJ6VQ==
=9bVn
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"Some other fixing in an API user turned up the fact that we weren't
correctly applying cache only mode to volatile registers in
regmap_update_bits(), causing us to try to access hardware that was
powered off or otherwise not in a state to accept I/O. This fix
returns an error instead, avoiding more serious consequences"
* tag 'regmap-fix-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: reject volatile update_bits() in cache-only mode
this out is because the IPsec and Bluetooth PRs did not make it
yesterday. I don't want to have to send you all of this + whatever
comes next week, for rc7. The fixes under "Previous releases -
regressions" are for real user-reported regressions from v7.0.
Previous releases - regressions:
- Revert "ipv6: preserve insertion order for same-scope addresses"
- xfrm: move policy_bydst RCU sync, a fix which added a sync RCU
on netns exit got backported to stable and was causing serious
accumulation of dying netns's for real workloads
- pcs-mtk-lynxi: fix bpi-r3 serdes configuration
Previous releases - always broken:
- usual grab bag of race, locking and leak fixes for Bluetooth
- handful of page handling fixes for IPsec
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmoZ+wAACgkQMUZtbf5S
IrtXJQ/9Gwf702wvkRaeLdqrwQ/qLsvDfx5s+3ALIE0Xsm4z9g7V0XKrZ0cfiI1h
aWGX8HugXEQuy9QvlFt09tgGEd76159g2WdlsBbh1raqiJRUw4GJKXYvwCmBZxsT
o8bwfVTQ8CVmUTCKhYrpzJKroT6jR8dKHIrkRn5ZyBOBPMOhK8rnDs1OdseW5haI
b/EkQrzzvTxd7/dJETIJszMQh/nbS5XIlKpQ+f7dfzR1gtO2GOJ24VWqrimonRTo
qvMwyt+ca2axv7Af796I8mz7X9rqLjWVWzY2uSpd7Y5zITyQwHNbeNvxzr2Ivi4g
2BcIi+ZHeeRbgQ9EL+rzapTnnIPIw0APPXnp5NnnNDj0RRG3G6PzulW9SmcdsmGD
o6E7axSZPQT/KnCw1/N7uMfB9cPzgb1i0h8rbE6tCvtkDtJwECtey7Dc7RU9zLqP
e0jWDv99+MyEqGPcu2LAg2IWLfsuQiV4priy4mM1NgOTQVgS1yw7+x0GiTqiClJ0
GcOCTOdvYKlmzhLzsLo4I+AcKZq2uJi8wNXMUEP5pmuYByVeF5j+MmoFpQspzx+L
gdUh9IctAjd47oX/uNaRtocOriU+JJEApToE9WekMb0XYd5Qx1jnt3WqB9ZFuDf4
smjUirtAWYcT3d4SXR4wGzB5WEa8TITH07A7sa8noozzNmQRu1E=
=ttPc
-----END PGP SIGNATURE-----
Merge tag 'net-7.1-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
"Quick follow up, nothing super urgent here. Main reason I'm sending
this out is because the IPsec and Bluetooth PRs did not make it
yesterday. I don't want to have to send you all of this + whatever
comes next week, for rc7. The fixes under "Previous releases -
regressions" are for real user-reported regressions from v7.0.
Previous releases - regressions:
- Revert "ipv6: preserve insertion order for same-scope addresses"
- xfrm: move policy_bydst RCU sync, a fix which added a sync RCU on
netns exit got backported to stable and was causing serious
accumulation of dying netns's for real workloads
- pcs-mtk-lynxi: fix bpi-r3 serdes configuration
Previous releases - always broken:
- usual grab bag of race, locking and leak fixes for Bluetooth
- handful of page handling fixes for IPsec"
* tag 'net-7.1-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
wireguard: send: append trailer after expanding head
Revert "ipv6: preserve insertion order for same-scope addresses"
net: skbuff: fix pskb_carve leaking zcopy pages
ipv6: fix possible infinite loop in fib6_select_path()
ipv6: fix possible infinite loop in rt6_fill_node()
bpf: sockmap: fix tail fragment offset in bpf_msg_push_data
vsock/virtio: bind uarg before filling zerocopy skb
Revert "esp: fix page frag reference leak on skb_to_sgvec failure"
net: pcs: pcs-mtk-lynxi: fix bpi-r3 serdes configuration
sctp: fix race between sctp_wait_for_connect and peeloff
net: mana: Skip redundant detach on already-detached port
net: mana: Add NULL guards in teardown path to prevent panic on attach failure
Bluetooth: hci_sync: Reset device counters in hci_dev_close_sync()
Bluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close
Bluetooth: hci_core: Rework hci_dev_do_reset() to use hci_sync functions
Bluetooth: ISO: serialize iso_sock_clear_timer with socket lock
Bluetooth: ISO: fix UAF in iso_recv_frame
Bluetooth: L2CAP: Fix possible crash on l2cap_ecred_conn_rsp
Bluetooth: l2cap: clear chan->ident on ECRED reconfiguration success
Bluetooth: hci_qca: Use 100 ms SSR delay for rampatch and NVM loading
...
- Account for recently implemented -Wattribute-alias in clang by
disabling it in the same places it is disabled for GCC.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCahnrhQAKCRAdayaRccAa
loxRAP0dhAttt5Nzj1QVvVshp5gUdELM+zXD1qyXAY9Z81V3rgD8C867ax5pGAgO
kV2PwaBK5P6FJ8joUz4m3qB20FXXYQs=
=drMM
-----END PGP SIGNATURE-----
Merge tag 'clang-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/nathan/linux
Pull clang build fix from Nathan Chancellor:
"A small fix to disable -Wattribute-alias for clang in the few places
it is already disabled for GCC, now that tip of tree clang has
implemented -Wattribute-alias as GCC has"
* tag 'clang-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/nathan/linux:
Disable -Wattribute-alias for clang-23 and newer
This reverts
dc8aa31a7a ("x86/fpu: Refine and simplify the magic number check during signal return").
The aforementioned commit broke applications that construct signal frames in
userspace (such as CRIU and gVisor) if the frame's xstate size is smaller than
the kernel's fpstate->user_size.
Furthermore, this introduces a critical issue for checkpoint/restore tools
like CRIU. If a process is checkpointed while inside a signal handler, its
stack contains a signal frame formatted according to the source host's xstate
capabilities.
If that process is later restored on a destination host with larger xstate
capabilities (e.g., a newer CPU with more features enabled, resulting in
a larger fpstate->user_size), the kernel will look for FP_XSTATE_MAGIC2 at the
destination host's larger user_size offset instead of the offset encoded in
the frame's fx_sw->xstate_size.
This causes the magic2 check to fail, forcing sigreturn to silently fall back
to "FX-only" mode. Upon return from the signal handler, the process's extended
state is reset to initial values instead of being restored, leading to silent
data corruption.
The aforementioned commit cited
d877550eaf ("x86/fpu: Stop relying on userspace for info to fault in xsave buffer")
as justification to stop relying on userspace for the magic number check.
However, these two changes are fundamentally different. The last one only
changed how much memory the kernel ensures is paged-in before running XRSTOR
to prevent an infinite loop. It did not change the signal frame format or how
the layout is validated.
Reverting this change restores the use of fx_sw->xstate_size for
locating magic2 and restores the necessary sanity checks, ensuring that
the signal frame remains self-describing and portable.
[ bp: Massage commit message. ]
Fixes: dc8aa31a7a ("x86/fpu: Refine and simplify the magic number check during signal return")
Signed-off-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20260429000623.3356606-1-avagin@google.com
Fix probing of Atmel EBI memory controller driver e.g. on at91
sam9x60-curiosity board due to usage of platform_driver_probe() which is
not handling deferred probe. Lack of EBI driver caused dependant NAND
controller to fail to probe, basically failing entire board boot.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEE3dJiKD0RGyM7briowTdm5oaLg9cFAmoZPZ4QHGtyemtAa2Vy
bmVsLm9yZwAKCRDBN2bmhouD1wBTEACWlhnslXLbIma+QXkLDvVG4wi+WvDLMM+q
nQGXDwFa5bkmE9dwlkVeLiDMIhBj+q0WMgH3ctHQCPyde2l8vIvdkuWM+HyGgFTs
eNmQTIFyn09YayZTzWE/5q/fJcIBMeFyfLPDnf562qs+cWu7BxfT8M4jvvdpQ9QH
BJZbBLq4l80+X6Q1AlLP4djklY7CRosV/8Acuo2cIaiIEPVEWaGdFX5EcGSqSvFf
3uMIHw/X+BiHpsZv02NLFOF57El3COEJNrgaiSUM9NtUkPfgvrIIBZgUwhdvG81r
0+3eUKqL1jvZc2K6QLUvNR9lWuCE+5g4jVhgF+mPkXyIKu5Nl1x1JO09Q+IBTkZd
r+rJFn+mBdAFGc0gHufHNyP8QiXntmCn0BVp6paBH0RQUroC3MWaJaYGxyk1fUA4
ohCtRqFUqkDyDebqLrtZAe1R87B382Ia+477h5pRvivEViVrRFm+n3TJl2k54Gcw
eTs/HFv58jn2122Uz2vOmUMVlVQsbrzrm1LpPHoAmkcDkogkRSDaGUK/o3RLvh1l
nzIelNNL3KNpw6R6wHzEJUINS31VgVxc6WadK60py5MmmoGvS1hIXOF6/iBx8Vk6
RAyuW1LNuj6eAhwmBK6h2tQn+Gs/pidymlrassdxzPUwrft7mMuc2CwfL92QAwtN
X3eFiWuayw==
=4VXt
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoaB6MACgkQmmx57+YA
GNl5aQ/+LKUsj8Z1+ldO3y8RznLbgD4eszkgT3vjUrGj1VxdBijHL/EWqBtfacoJ
Lat6nNMmFeE8LcDqStm3SwM20N2m5PaXFPcIeDaDL/ZiM+v3YfCTvX4lkzQGJ1B7
dIs89cPyMQX2qwF8+uKRZou80RNAwRWTp/zT0Q7sIxnzsQ6Gz2tybuxEZ2abgdmT
X2F1HY3ex8TXTA+NqfJLubf9eHtbXxI1KqFtjU9+YI83OHFqSPNHLNmKCNH0uHkn
vhGXZgX6RuebtYCw0Fo73cLm3UUj24204pVuujA07OmQmPiZ5Xvp1wMGZtmNkzQU
L1f3Pav+UDopM6IIzbc2d+Y/YSvupS8ADkaYmcsxLo2KTpdRyRrEf3If2AiQGpxG
kTCLKWkXDVg8SQic0W+jHC+8wnm9Ec8Idp08s/4XPlA1vSo8MRSE+TDHBmDGgDhQ
IqvddJE+uT2ZW1yU2Wao7rfQ3KRaWuz7xPYb7ySilAn1i6LtCjysUaDISxRBCiYQ
bQoxXvvWdmqg8aDpsRyhW2mwQN+4vEeq/IH73XoiZH6jzTj9wReHbudKDG1gREFx
X/0tBBwuuJhwNbtOj+DVrJlM1WqPDjb7zw6yzEinf3LzvEYR8BGK+R0w3mWa8Eal
bJCEjHg4mx6LEkFdleFmhLyAL7fVJoxydmpb3HpQ+5r0xk01YhM=
=8TQP
-----END PGP SIGNATURE-----
Merge tag 'memory-controller-drv-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/fixes
Memory controller drivers - fixes for v7.1
Fix probing of Atmel EBI memory controller driver e.g. on at91
sam9x60-curiosity board due to usage of platform_driver_probe() which is
not handling deferred probe. Lack of EBI driver caused dependant NAND
controller to fail to probe, basically failing entire board boot.
* tag 'memory-controller-drv-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
memory: atmel-ebi: Allow deferred probing
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix several concurrency issues present in Samsung ACPM firmware drivers,
used currently only on Google GS101. Tudor with help of Sashiko
identified several missing barriers and incomplete synchronization,
leading to possible transfer data corruption or use after free. Few
other issues related to probe, including missing mailbox cleanup, were
also fixed.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEE3dJiKD0RGyM7briowTdm5oaLg9cFAmoZgsoQHGtyemtAa2Vy
bmVsLm9yZwAKCRDBN2bmhouD1wmbEACCXrSzY82MLt6XmVByWQy9U9OBLkC471qA
GhcVdC0uMs6hD5prvJCnHVP236XKUZnT6EZjpBW+8Me3Yw4z4j9Wdd5j0z1haEF5
DdldsjBjrawsXazVVlj4J7iE1fv83dZrgOASDXFhP+GHRPsfrCYWlNIPhWcG8wwp
68yYY8FT4hQ4/i5otWixHSdeaadPV/TWf+W90l/1Z6uXdt/9CzAcavHiLOkLWji+
dy87ZNTcpZoCqa12sPz2wkmMXj5ET5vUsuvoVZu3376ViU8q/6kSIRYtIQjCgM8J
cMNArDequv6g24ra+W8D234qo50fBAVpeYHFyZKDJYekkJ9HJyQ2NXarJz/kvEUw
ZkmUA/SgZUc0hM5i0J+5O6K7lTed6eMmkR72QTT/uQZOAcEkqsFpkuZhAetxguR8
vO7p99+QAS/grHKiXaLKTOYgpoxitqx0int0FZ1shWbeO/OYorrZ00x0OkK++V++
WsI7SJVNco/LM4APdSIRBphZ5ULl6xgu2SLO05J7uQFgiUWU0Q9v+t1MxIbvmL0T
HH5H1STGr12LOBwnJDjK/AYbAhFFM9pZxgPJU9S4NzKu3peQJjgLvnOeEEmLkmeQ
47bKUlrOMQIP5lpzLBYtL9NT/7fyQKfl8fbdq5O3KwthMvOiit6mGD+ExDoptIp4
4T+NBiG08A==
=hb6T
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoaByIACgkQmmx57+YA
GNmeiA/9EZ+LO2JUrUVhMNXOfUmDGrpfL/mVqcGL9ksK+EkOW+9Fae8f9hWrvBFh
exeMRTwKDcA2hag6AnBiKhctLYFIZ7+nRb/xu2ybasdIAloR10oDKlmgzH/Yj4c9
MsIgfuK+q5FnOxyhtNEZ4JszYYvAl3wfFvBR1241L52tJ+vyOLxEjk/8rhESFv1P
rTeCXlRA/dJ4hqpP9j+QflgrvrIYe1vhsBdNFXQEcjUgQnqySPY7Ynlnr1BJMyWQ
N4x+gJ9ZITr75kzMJCwqGXfktwPP6+BahBHDk8i6U6zfFwb2DJFHx0OSYf3Z1s1w
V7wxIMiGr/XZ/r0D6prlqYAjLZS8KL9sYEC/UB+WB95SwnFWm3J1K895rt4awuzY
HD6jhgPlyvIBEJRI7ZNMB4Hg0p9Z5pXse1EDseDdAnAth9brbFN7d/JbbtTMKfny
jyQJG8rEUvmSF+No35VprFW6JhyDHFlnrQvoSQdLgxsKHrMRcI1xYnQXi2aILlYI
cYCYzL2lAeteaMHVMrPDoLQGCbKnvTz+pAhiOXTX2ie6XS8JuYPhf1H+Pp8GTozp
P+Iy995FZ92aaCyPmINa8lN2hfuC3qhxNZehZ8bEShKQ+44wab1SV6q/shM+HPt2
i1OLt9Kg/CWt6hVXW3jkZvdNry1HuPP3f4jrQwquOFh6h2VlnXw=
=zqbF
-----END PGP SIGNATURE-----
Merge tag 'samsung-drivers-fixes-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/fixes
Samsung SoC driver fixes for v7.1
Fix several concurrency issues present in Samsung ACPM firmware drivers,
used currently only on Google GS101. Tudor with help of Sashiko
identified several missing barriers and incomplete synchronization,
leading to possible transfer data corruption or use after free. Few
other issues related to probe, including missing mailbox cleanup, were
also fixed.
* tag 'samsung-drivers-fixes-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
firmware: samsung: acpm: Fix missing LKMM barriers in sequence allocator
firmware: samsung: acpm: Fix false timeouts and Use-After-Free in polling
firmware: samsung: acpm: Fix mailbox channel leak on probe error
firmware: samsung: acpm: Fix cross-thread RX length corruption
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
drm_gem_change_handle_ioctl leaves the old handle live in the IDR
during the window between spin_unlock(table_lock) and the final
spin_lock(table_lock). A concurrent drm_gem_handle_delete on the old
handle succeeds in this window, decrements handle_count to 0, and frees
the GEM object while the new handle's IDR entry still references it.
NULL the old handle's IDR entry before dropping table_lock so that any
concurrent GEM_CLOSE on the old handle sees NULL and returns -EINVAL.
Restore the old entry on the prime-bookkeeping error path.
Fixes: 5e28b7b944 ("drm: Set old handle to NULL before prime swap in change_handle")
Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260526085313.26791-1-kipreyyy@gmail.com
- Restore CONFIG_PKVM_DISABLE_STAGE2_ON_PANIC to its former glory by
making sure the config symbol is correctly spelled out in the code
- Don't reset the AArch32 view of the PMU counters to zero when the
guest is writing to them
- Fix an assorted collection of memory leaks in the newly added tracing
code
- Fix the capping of ZCR_EL2 which could be used in an unsanitised way
by an L2 guest
x86:
- Include the kernel's linux/mman.h in KVM selftests to ensure MADV_COLLAPSE
is defined, as older libc versions may not provide it.
- Include execinfo.h if and only if KVM selftests are building against glibc,
and provide a test_dump_stack() for non-glibc builds.
- Silence an annoying RCU splat on (even non-KVM-related) panics. The splat
is technically legit, but in practice not an issue. To have a race, you
would need to unload the KVM modules at exactly the time a panic happens;
and speaking of incredibly rare races, taking the locks risks introducing
a deadlock if the module unload code took the lock on a CPU that has been
halted. Which seems possibly more likely than the RCU grace period issue,
so just shut it up. This code used to be in KVM but is now outside it;
but the x86 maintainers haven't picked it up, so here we are.
- Rate-limit global clock updates once again (but without delayed work), as
KVM was subtly relying on the old rate-limiting for NPT correction to guard
against "update storms" when running without a master clock on systems with
overcommitted CPUs.
- Fix a brown paper bag goof where KVM checked if ERAPS is "dirty" instead of
marking it dirty when emulating INVPCID.
- Flush the TLB when transitioning from xAVIC => x2AVIC to ensure the CPU TLB
doesn't contain AVIC-tagged entries for the APIC base GPA.
- The top 10 commits fix buffer overflow (and potential TOC/TOU) flaws in the
page state change protocol for encrypted VMs. AI models find it quite
easily given it was reported three times, but aren't as good at writing
a comprehensive fix. There's more to clean up in the area, which will
come in 7.2.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmoZ2qQUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPPFQgAhDwSk+VVnn4vuerijZh6eo3Tz4EQ
af0Ccng1uDuTuz9HzkF/ffR4z3tBMYtUhVtUiPu5xrUabzmIW7T0roNvsCwzVZor
ekZt3Y8FgwSgF+nxbBQQXBPvv+tOHpoIhfbirftWE9tRRFivfK1Z1duRGwsv7Seb
0eK+iB1huJLjXqIZQtSLEY44LSoQbDIt/StkkYFLUr10oOvTRCFiu2wPA2gZrK56
KTVrCg7rtn135wh8TVA72u+pIszylIPFTQ1HbbzzBoQ8/Opp0olFL3q0HeAwkx6D
q0EJiNMP0QD8NDC7Q8efAit4wI0pXE4Y6ScHQJTm3p+hB6KXc9o7LKbCmA==
=6jit
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"arm64:
- Restore CONFIG_PKVM_DISABLE_STAGE2_ON_PANIC to its former glory by
making sure the config symbol is correctly spelled out in the code
- Don't reset the AArch32 view of the PMU counters to zero when the
guest is writing to them
- Fix an assorted collection of memory leaks in the newly added
tracing code
- Fix the capping of ZCR_EL2 which could be used in an unsanitised
way by an L2 guest
x86:
- Include the kernel's linux/mman.h in KVM selftests to ensure
MADV_COLLAPSE is defined, as older libc versions may not provide
it.
- Include execinfo.h if and only if KVM selftests are building
against glibc, and provide a test_dump_stack() for non-glibc
builds.
- Silence an annoying RCU splat on (even non-KVM-related) panics.
The splat is technically legit, but in practice not an issue. To
have a race, you would need to unload the KVM modules at exactly
the time a panic happens; and speaking of incredibly rare races,
taking the locks risks introducing a deadlock if the module unload
code took the lock on a CPU that has been halted. Which seems
possibly more likely than the RCU grace period issue, so just shut
it up. This code used to be in KVM but is now outside it; but the
x86 maintainers haven't picked it up, so here we are.
- Rate-limit global clock updates once again (but without delayed
work), as KVM was subtly relying on the old rate-limiting for NPT
correction to guard against "update storms" when running without a
master clock on systems with overcommitted CPUs.
- Fix a brown paper bag goof where KVM checked if ERAPS is "dirty"
instead of marking it dirty when emulating INVPCID.
- Flush the TLB when transitioning from xAVIC => x2AVIC to ensure the
CPU TLB doesn't contain AVIC-tagged entries for the APIC base GPA.
- The top 10 commits fix buffer overflow (and potential TOC/TOU)
flaws in the page state change protocol for encrypted VMs. AI
models find it quite easily given it was reported three times, but
aren't as good at writing a comprehensive fix. There's more to
clean up in the area, which will come in 7.2"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer
KVM: SEV: Check PSC request indices against the actual size of the buffer
KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc()
KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0
KVM: SEV: Compute the correct max length of the in-GHCB scratch area
KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests
KVM: SEV: Ignore Port I/O requests of length '0'
KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+
KVM: SEV: Ignore MMIO requests of length '0'
KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use
KVM: arm64: Correctly cap ZCR_EL2 provided by a guest hypervisor
KVM: arm64: Fix memory leak in hyp_trace_unload()
KVM: arm64: Fix rollback in hyp_trace_buffer_share_hyp()
KVM: arm64: Fix meta-page unsharing in pKVM hyp tracing
KVM: arm64: PMU: Preserve AArch32 counter low bits
KVM: SVM: Flush the current TLB when transitioning from xAVIC => x2AVIC
KVM: x86: Fix ERAPS RAP clear on INVPCID single-context invalidation
KVM: arm64: Fix CONFIG_PKVM_DISABLE_STAGE2_ON_PANIC
KVM: selftests: Guard execinfo.h inclusion for non-glibc builds
KVM: x86: Rate-limit global clock updates on vCPU load
...
Jason A. Donenfeld says:
====================
WireGuard fixes for 7.1-rc6
Please find one small patch, fixing the order of adding padding onto a
packet, to ensure padding bytes get zeroed properly.
====================
Link: https://patch.msgid.link/20260529173134.3080773-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With how this is currently written, we add the trailer, zero it out, and
then add the header space on. If that header space requires a
reallocation + copy, the zeros in the trailer aren't copied, because the
skb len hasn't actually been yet expanded to cover that. Instead add the
padding at the end of the process rather than at the beginning.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20260529173134.3080773-2-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chris Adams reported that preserving insertion order for same-scope
addresses is causing SSH connections to be dropped after stopping a VM
while running NetworkManager.
NetworkManager caches the IPv6 address configuration, when a RA arrives,
it determines the list of addresses to configure and checks if the
addresses are already in the right order in the kernel. If they aren't,
NetworkManager removes and re-adds them to achieve the desired order.
As the order changes, NetworkManager is confused and reconfigures the
addresses on every update. In addition, this would also affect to cloud
tooling that relies on IPv6 addresses order to identify primary and
secondaries addresses.
This reverts commit cb3de96eea.
Fixes: cb3de96eea ("ipv6: preserve insertion order for same-scope addresses")
Reported-by: Chris Adams <linux@cmadams.net>
Closes: https://lore.kernel.org/netdev/20260521135310.GC977@cmadams.net/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260529112357.5079-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmoZWqgACgkQrB3Eaf9P
W7cz1A//RDEq8pvp1kefBC6YLM9nAEpiIS+gdBWjUty/zC2bpuvWPnEaDKXeZVVx
Vvo9ITV6BsgNsiUEOyM5ehsDknY9TZMFXSawQQWGiRZmGtP+wM3fesoklUDUz+QD
JBaPg7JEcGjFXPlr1X+MF+bvPVfyPaf/s8VEcatFfkPVV2JZPiENwLmxq/ZV3LWF
R5pB0Mz1AreRJQ3IZuUn8ae/UqUQ+GSP3VtI45lrNDWDBeVeP8zT3orm4Tv9ITYm
doNvbXWYhZNlXUcP0qZ887G2Kn6dbrUbsdp0dOnQDAQu2NR0+tYQWxhoCN5Ps3zl
OisDsNEp4aUzwFkwIE84E43rygD6wc7lx+BGgdFUM2FtmxRv7fUiIuvVuCtC87hv
CsK0SueSgog5x3Ltx/P5O+hn80wKAUqPMESb/7Oxja0rUXi251E7WLVNJdgV0t2y
OJMOMFm1uFwsckFBoSi54QNbJkFFK2lvdl+jQ068E7Cqf88LeqtNe56TOLr/Ut7I
UnQakEDnOgzi1HHcpOs/hycyqvPgvBqhRI6IwAtZZFUzQ/i+usmLUIP4AhQRsA9u
ffI/m+7uF4EJ4H+L/FxZds+AMGh28sL6a3muKpYgcHRJ/3bDPOGaL8NHyy+sTfFW
U6GpFqjv2sEWZM8bCN1g7ymNg+70a/xeFwu6/38+X3cP7bg+QgE=
=NQJ5
-----END PGP SIGNATURE-----
Merge tag 'ipsec-2026-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2026-05-29
1) xfrm: route MIGRATE notifications to caller's netns
Thread the caller's netns through km_migrate() so that
MIGRATE notifications go to the issuing netns, fixing both the
init_net listener leak and MOBIKE notifications inside
non-init netns. From Maoyi Xie.
2) xfrm: ipcomp: Free destination pages on acomp errors
Move the out_free_req label up so that allocated destination
pages are released on decompression errors, not only on success.
From Herbert Xu.
3) xfrm: Check for underflow in xfrm_state_mtu
Reject configurations that cause xfrm_state_mtu() to underflow,
preventing a negative TFCPAD value from becoming a memset size
that triggers an out-of-bounds write of several terabytes.
From David Ahern.
4) xfrm: ah: use skb_to_full_sk in async output callbacks
Convert the possibly-incomplete skb->sk to a full socket pointer
in async AH callbacks so that a request_sock or timewait_sock
never reaches xfrm_output_resume() downstream consumers.
From Michael Bommarito.
5) Add and revert: esp: fix page frag reference leak on skb_to_sgvec failure
The patch does not fix te issue completely.
6) xfrm: esp: restore combined single-frag length gate
Check the aligned post-trailer combined length against a page limit
in the fast path, preventing skb_page_frag_refill() from falling
back to a page too small for the destination scatterlist.
From Jingguo Tan.
7) xfrm: iptfs: reset runtime state when cloning SAs
Reinitialise the clone's mode_data runtime objects before
publishing it, preventing queued skbs from being freed with
list state copied from the original SA when migration fails.
From Shaomin Chen.
8) xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
Flush policy tables and drain the workqueue in a .pre_exit handler
so that cleanup_net() pays one RCU grace period per batch instead
of one per namespace, fixing stalls at high CLONE_NEWNET rates.
From Usama Arif.
9) xfrm: input: hold netns during deferred transport reinjection
Take a netns reference when queueing deferred transport reinjection
work and drop it after the callback completes, keeping the skb->cb
net pointer valid until the deferred work runs.
From Zhengchuan Liang.
* tag 'ipsec-2026-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
Revert "esp: fix page frag reference leak on skb_to_sgvec failure"
xfrm: input: hold netns during deferred transport reinjection
xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
xfrm: iptfs: reset runtime state when cloning SAs
xfrm: esp: restore combined single-frag length gate
esp: fix page frag reference leak on skb_to_sgvec failure
xfrm: ah: use skb_to_full_sk in async output callbacks
xfrm: Check for underflow in xfrm_state_mtu
xfrm: ipcomp: Free destination pages on acomp errors
xfrm: route MIGRATE notifications to caller's netns
====================
Link: https://patch.msgid.link/20260529092648.3878973-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When SKBFL_MANAGED_FRAG_REFS is set, frag pages are not refcounted but
their lifetime is controlled by the attached ubuf_info. To make a copy
of the skb_shared_info, we either should clear the flag and reference
the frags, or keep the flag and have frags unreferenced.
pskb_carve_inside_header() and pskb_carve_inside_nonlinear() don't
follow the rule and thus can leak page references. Let's clear
SKBFL_MANAGED_FRAG_REFS from the original skb to fix it. It's the
simplest way to address it, but there are more performant ways to do
that if it ever becomes a problem.
Link: https://lore.kernel.org/all/20260523085809.26331-1-nvminh232@clc.fitus.edu.vn/
Fixes: 753f1ca4e1 ("net: introduce managed frags infrastructure")
Reported-by: Minh Nguyen <minhnguyen.080505@gmail.com>
Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/1e2086aa69217d7f9c8da3d38f5be7160f1b4cd1.1779993185.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Found while auditing the same pattern Sashiko reported in
rt6_fill_node() [1]. Apply the same fix as
commit f8d8ce1b51 ("ipv6: fix possible infinite loop in fib6_info_uses_dev()").
Writers holding tb6_lock can list_del_rcu(&first->fib6_siblings)
without waiting for RCU readers; first->fib6_siblings.next then
still points into the old ring and this softirq-side walker never
reaches &first->fib6_siblings as its terminator. fib6_purge_rt()
always WRITE_ONCE()s first->fib6_nsiblings to 0 before
list_del_rcu(), so an inside-loop check is a reliable detach signal.
[1] https://sashiko.dev/#/patchset/20260526020227.4857-1-jiayuan.chen%40linux.dev
Fixes: d9ccb18f83 ("ipv6: Fix soft lockups in fib6_select_path under high next hop churn")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260527053133.180695-2-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sashiko reported this issue [1]. Apply the same fix as
commit f8d8ce1b51 ("ipv6: fix possible infinite loop in fib6_info_uses_dev()").
Writers holding tb6_lock can list_del_rcu(&rt->fib6_siblings)
without waiting for RCU readers; rt->fib6_siblings.next then still
points into the old ring and this softirq-side walker never reaches
&rt->fib6_siblings, causing a CPU stall. fib6_del_route() always
WRITE_ONCE()s rt->fib6_nsiblings to 0 before list_del_rcu(), so an
inside-loop check is a reliable detach signal.
[1] https://sashiko.dev/#/patchset/20260526020227.4857-1-jiayuan.chen%40linux.dev
Fixes: d9ccb18f83 ("ipv6: Fix soft lockups in fib6_select_path under high next hop churn")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260527053133.180695-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When bpf_msg_push_data() inserts data in the middle of a scatterlist
entry, it splits the original entry into a left fragment and a right
fragment.
The right fragment offset is page-local, but the code advances it with
`start`, which is the message-global insertion point. For inserts into a
non-first SG entry, this over-advances the offset and leaves the split
layout inconsistent.
Advance the right fragment offset by the fragment-local delta,
`start - offset`, which matches the length removed from the front of the
original entry.
Fixes: 6fff607e2f ("bpf: sk_msg program helper bpf_msg_push_data")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yuqi Xu <xuyq21@lenovo.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Link: https://patch.msgid.link/8b129d10566aa3eb43f61a8f9757bcf51707d324.1779636774.git.xuyq21@lenovo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
virtio_transport_send_pkt_info() allocates or reuses the zerocopy uarg
before entering the send loop, but virtio_transport_alloc_skb() still
fills the skb before it inherits that uarg. When fixed-buffer vectored
zerocopy hits MAX_SKB_FRAGS, io_sg_from_iter() may partially attach
managed frags and return -EMSGSIZE. The rollback path call kfree_skb()
to free an skb that carries SKBFL_MANAGED_FRAG_REFS but no uarg, so
skb_release_data() falls through to ordinary frag unref.
Pass the uarg into virtio_transport_alloc_skb() and bind it immediately
before virtio_transport_fill_skb(). This keeps control or no-payload skbs
untouched while ensuring success and rollback share one lifetime rule.
Fixes: 581512a6dc ("vsock/virtio: MSG_ZEROCOPY flag support")
Signed-off-by: Lin Ma <malin89@huawei.com>
Signed-off-by: Rongzhen Cui <cuirongzhen@huawei.com>
Signed-off-by: Jingguo Tan <tanjingguo@huawei.com>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260527023301.1075581-1-malin89@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use READ_ONCE() when reading entries/indices from the guest-accessible
Page State Change buffer to defend against TOCTOU bugs.
Don't bother with READ_ONCE()/WRITE_ONCE() for cases where KVM is writing
(and not consuming the result!), as the guest isn't supposed to touch the
buffer while it's being processed. I.e. using READ_ONCE() is all about
protecting against misbehaving guests.
Fixes: 9b54e248d2 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-11-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When processing Page State Change (PSC) requests, validate the PSC buffer
against the effective size of the scratch area, which could be less than
the maximum size if the guest provided a pointer that isn't exactly at the
start of the GHCB shared buffer.
Fixes: 9b54e248d2 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop explicitly passing the PSC buffer to snp_begin_psc(): it *must*
be the scratch area. This will allow fixing a variety of bugs without
further complicating the code.
No functional change intended.
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that all paths in KVM properly validate the length needed for the
scratch area, and are guaranteed to pass in a non-zero length, WARN if KVM
attempts to configured the scratch area with min_len==0 to guard against
future bugs.
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When setting the length of the GHCB scratch area, and the area is in the
GHCB shared buffer, set the effective length of the scratch area to the max
possible size given the start of the guest-provided pointer, and the end of
the shared buffer.
The code was "fine" when first introduced, as KVM doesn't consult the
length of the buffer when emulating MMIO, because the passed in @len always
specifies the *max* size required. But for PSC requests, the incoming @len
is just the minimum length (to process the header), and KVM needs to know
the full size of the scratch area to avoid buffer overflows (spoiler alert).
Opportunistically rename @len => @min_len to better reflect its role.
Fixes: 9b54e248d2 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When handling a Page State Change (PSC) #VMGEXIT use the size of the PSC
header as the minimum size for the scratch area. Per the GHCB spec, PSC
requests do NOT provide the length, i.e. using control->exit_info_2 for the
length is completely made up behavior. The existing code "works", e.g.
even though Linux-as-a-guest always passes '0', because KVM doesn't do
anything with the length when the request is in the GHCB's shared buffer.
Use the header as the min length. Once the header is retrieved, KVM can
use the specified indices to compute the full size of the request.
Fixes: 9b54e248d2 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly ignore Port I/O requests of length '0' (or count '0'), so that
setting up the software scratch area (and other code) doesn't have to
worry about underflowing the length, and to allow for WARNing on trying
to configure the scratch area with len==0.
Fixes: 291bd20d5d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using GHCB v2+, reject MMIO requests that are larger than 8 bytes.
Per the GHCB spec:
SW_EXITINFO2 must be less than or equal to 0x7fffffff for version 1 and
less than or equal to 0x8 for all other versions.
Fixes: 4af663c2f6 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly ignore MMIO requests of length '0', so that setting up the
software scratch area (and other code) doesn't have to worry about
underflowing the length, and to allow for special casing '0' in the
future.
Fixes: 8f423a80d2 ("KVM: SVM: Support MMIO for an SEV-ES guest")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As per the GHCB spec, when using GHCB v2+ require the software scratch area
to reside in the GHCB's shared buffer. Note, things like Page State Change
(PSC) requests _rely_ on this behavior, as the guest can't provide a length
when making the request, i.e. the size of the guest payload is bounded by
the size of the shared buffer.
Failure to force usage of the GHCB, and a slew of other flaws, lets a
malicious SNP guest corrupt host kernel heap memory, and leak host heap
layout information.
setup_vmgexit_scratch() allocates a buffer via kvzalloc(exit_info_2),
where exit_info_2 is guest-controlled. With exit_info_2=24, this yields
a 24-byte allocation in kmalloc-cg-32 (32-byte slab objects). The buffer
holds an 8-byte psc_hdr followed by 8-byte psc_entry structs, so only
entries[0] and entries[1] are in-bounds.
snp_begin_psc() validates end_entry against VMGEXIT_PSC_MAX_COUNT (253)
but NOT against the actual buffer size:
idx_end = hdr->end_entry;
if (idx_end >= VMGEXIT_PSC_MAX_COUNT) { // checks 253, not buffer
snp_complete_psc(svm, ...);
return 1;
}
for (idx = idx_start; idx <= idx_end; idx++) {
entry_start = entries[idx]; // OOB when idx >= 2
The guest sets end_entry=10+, causing the host to iterate entries[2+]
which are OOB into adjacent slab objects. For each OOB entry:
- The host reads 8 bytes (OOB READ / info leak oracle)
- If the data passes PSC validation, __snp_complete_one_psc() writes
cur_page = 1 or 512 into the entry (OOB WRITE, sev.c:3806)
- If validation fails, the error response reveals whether adjacent
memory is zero vs non-zero (information disclosure to guest)
The guest controls allocation size (exit_info_2), entry range
(cur_entry/end_entry), and can fire unlimited VMGEXITs to repeatedly
hit different slab positions.
By exploiting the variety of bugs, a malicious SEV-SNP guest can:
- OOB read adjacent kmalloc-cg-32 objects (heap layout disclosure)
- OOB write cur_page bits into adjacent objects (heap corruption)
- Trigger use-after-free conditions across VMGEXITs
E.g. with KASAN enabled, a single insmod of the PoC guest module
produces 73 KASAN reports:
BUG: KASAN: slab-out-of-bounds in snp_begin_psc+0x126/0x890
Read of size 8 at addr ffff888219ffb5e0 by task qemu-system-x86/2199
BUG: KASAN: slab-out-of-bounds in snp_begin_psc+0x468/0x890
Write of size 8 at addr ffff888351566648 by task qemu-system-x86/2199
The buggy address belongs to the object at ffff888XXXXXXXXX
which belongs to the cache kmalloc-cg-32 of size 32
The buggy address is located N bytes to the right of
allocated 32-byte region [ffff888XXXXXXXXX, ffff888XXXXXXXXX)
Breakdown:
62 slab-out-of-bounds (reads + writes past allocation)
7 slab-use-after-free
4 use-after-free
All credit to Stan for the wonderful description and reproducer!
Reported-by: Stan Shaw <shawstan96@gmail.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Peter Gonda <pgonda@google.com>
Cc: Jacky Li <jackyli@google.com>
Fixes: 4af663c2f6 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
[sean: write changelog]
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmoZxo8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplONEADDu3cC/oIuivuKb/vW1x9Gy+heRkbV7PKQ
AqpvdsOODmFko9wbZ5Jzydiq1gqNoBIenl0rAD/lQAeWz6o1iR8cFWL6vTVCAwUl
nUUXW8189e/LffwgjoEHHQfKImUuNNYfNsbQCStUI2jPQNaxbOOQtlkg+EkiyCm1
E54ENp+qX18dsor7PlChzFdBU53Y5LFnJAivGpE/Pp0XfBjoNPiqIehYaPXidxws
XXl0mLGFOcxNgQ8C8pmo5koSI9KjLR4qLXT7dP44d+liwu9nIoLxHBn5050TNAw8
8p0mTl3oEdsIl3YJmzmlHCRGBWnHxxbMmTVFPRsn6sWKyGgZNfEfr/8vwTHo4T/+
uofPR21fUTzmEVFbSWa9h709DJguSQx0zCLex/MrUa4jY1k26RY2R1IfakFRs6xF
RA9LNnjjoHhuzb6cCe4ZKTANXajgsUSg4p1n5+d/0cBAekGEWkWMGxPMe4snRkdu
aW+1RVrzJeUkFBS2pLRjCg/Y376TD4M6Lg4r7xReP0dJQP5KihIp+Hg2aCS1tMQP
erIFcuvTjbSoAKw5FyGi/mACScZKMml/bnaMkQQNrYIW4CZXN473DHcfEu7DQhjY
S1VK4SGwTOoktqLFM5vXA5yNv3qaUyOjB/JR11P+twaDjug75oEvS0NCnXqvyXfe
SzznzpjEVg==
=YY45
-----END PGP SIGNATURE-----
Merge tag 'block-7.1-20260529' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fix from Jens Axboe:
"Just a single fix for the block side, making a slight tweak to a fix
from this cycle"
* tag 'block-7.1-20260529' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
blk-mq: reinsert cached request to the list
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmoZxb4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgDQEADbTyZ2P1pOKJwr7Gt5HETXXuvsocb7Z+CY
qfdfVgTz3aUcIg1ui+2QbpNU6l5zQKlp8IA7igjLHMhiif9nNzZTUPHayBrlhTo2
gS/RNOLtwGEv5oz+7i+2JAl+SyNRLQzK+UH0kmePIZp0bQPr1ciS13Ndb33Eus/K
FmtfeRJEsT6fZ+Q+QooxiN5n5jRdcB/WKSX5cwoIOnKJxKCyms/C8hUsG8Tz+0L/
2Ys8W06as3tE5eWjwpOXo5sb2d0zhr6tiqPUNpwEVbaBzLiBqI2wQW9gAy2ZXa1Z
vgCSceKOtt1I9/tf0y3xOy/V6LgUe+fp8X5V4yNPiDOwDTVrTYDOfT0qeqWJpwF+
4W+ZaIjHCNY2q/lbdeEuHYOOstvc5hShH4AJFQqbhUCXyB/qivO0rwQtZ75pGZRF
RhYJ8VdBdC740OyAdHPyTTUxWHAHr3+9lA10DsEw2dMxeDDmqdNTUrnw5b4/OVFl
jk2/cC1MmxtnPWPFK4NQGj2BMWpTqNfTykD7WQWOHUwvtCj7ZQHH6g4qfovuu2VR
Jt0vQE2uAab440knpQZCbURkgRrlZLLNYImhtbgTUyN/Jsk7vrV5/bevFwGWsA4y
CiwHsdztx62J2OsAbNBGw/a4qvMn7X3cLeylEeJiWyQxtkpb8JgdmxDjBGDavwLB
H5rtGBToWg==
=tYXi
-----END PGP SIGNATURE-----
Merge tag 'io_uring-7.1-20260529' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fix from Jens Axboe:
"Just a single fix for a regression introduced in this cycle, where
we should ensure the node is visible before the entry is added to
the tctx list"
* tag 'io_uring-7.1-20260529' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/tctx: set ->io_uring before publishing the tctx node
- Restore CONFIG_PKVM_DISABLE_STAGE2_ON_PANIC to its former glory by
making sure the config symbol is correctly spelled out in the code
- Don't reset the AArch32 view of the PMU counters to zero when the
guest is writing to them
- Fix an assorted collection of memory leaks in the newly added tracing
code
- Fix the capping of ZCR_EL2 which could be used in an unsanitised way
by an L2 guest
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmoZbFAACgkQI9DQutE9
ekMqHA/+MUfOMP8C8q87SVWsb8JrtMoZKBl6YWsn19kSGXkR14HVTLwmTWsYCNfi
KvuOf5p0ALvCt9KhE8EUpsIScwdcLMJWjwKbzKkgooX9YwlgMLxA5w3He6GgyTzB
0fCqQ/2H/frJP1kdQI567Zc4eO7JNcMJfI+/lrZxUM6xbo0KyhehQcFlTBVm5/h/
RptX9n1NqdJ1qqbJw7UkAAt0hDMIAvDxZyTKGYTQla/+ckK+UEa6UUl/1pfXPs6Y
Domxu7OO9qQFSxR3KfFlHEml4lrLxNTiCFzoBgPCrf83vAkK0Ceudj3luCs7UDMX
j4sH44ddHDoO/8BmueNzsQ+gHTqmDaQGRu3G8m/4EN0dCnD5TsQ9FtaB7LErRFze
wZCTxiAuAxAaQC41H/Nul3S+7D92/vRW664b7HPf92uvD4QpGA96qvafnz9SK8AN
CyiB4EBElau+D4nOBJ7MO9vu0doWwnTXRHSIvJv8mRkkYedhEv9G8OX8QOgLGi2r
/uYtMYRcZ6tglvVhQzcOwpHe//oJ7CGxfACgwUhH2XjL1V3tDuCtCfQheo242UAe
nBUfs1R38UyaWKZmlkzymHygAprw2s/idAIJV7i80rDGwAuNTdMV3SyjC1so7ZGP
1grcr7yi2aPr3phA1RWD5bbGZhTl8a+fzyKrwIltz3rTAaV7lOk=
=bNQ/
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 7.1, take #4
- Restore CONFIG_PKVM_DISABLE_STAGE2_ON_PANIC to its former glory by
making sure the config symbol is correctly spelled out in the code
- Don't reset the AArch32 view of the PMU counters to zero when the
guest is writing to them
- Fix an assorted collection of memory leaks in the newly added tracing
code
- Fix the capping of ZCR_EL2 which could be used in an unsanitised way
by an L2 guest
- Include the kernel's linux/mman.h in KVM selftests to ensure MADV_COLLAPSE
is defined, as older libc versions may not provide it.
- Include execinfo.h if and only if KVM selftests are building against glibc,
and provide a test_dump_stack() for non-glibc builds.
- Fudge around an RCU splat in the emegerncy reboot code that is technically
a legitimate flaw, but in practice is a non-issue and fixing the flaw, e.g.
by adding locking, would incur meaningful risk, i.e. do more harm than good.
- Rate-limit global clock updates once again (but without delayed work), as
KVM was subtly relying on the old rate-limiting for NPT correction to guard
against "update storms" when running without a master clock on systems with
overcommitted CPUs.
- Fix a brown paper bag goof where KVM checked if ERAPS is "dirty" instead of
marking it dirty when emulating INVPCID.
- Flush the TLB when transitioning from xAVIC => x2AVIC to ensure the CPU TLB
doesn't contain AVIC-tagged entries for the APIC base GPA.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmoZtdgACgkQOlYIJqCj
N/26sw/+IWOA5AxyoNW/lKAhhkzDTzGWrNCQkpMv+F4tOUbHYniTxI/pv4L3eMvf
ZLUXijYxhpJtnblLtrnPpSFl5tll4xQdMUv7+fljgpYmy6+erQHodtCgRi5wHDbM
NlD7DWOgwmpvzYLcybq1RfjZ3n+OBRvq95haQ6Ph4FtoYuIomtJ5tF2mnMlyxlc/
aIK5wzQ/JeYdQxwwz1ctlHkgE5bPnS+Sxr33+MRFQ5cIpuwdoS9zYRITNBM107kg
bLeei8Cxh91sgEidgwS8JToLvaEQH8AodkROjcScllwUxYsshPKsHeH7sTMbCOVd
DiH9VbheZo7d4kb6pvhGsY891ec00dR5E/l2gZYLWHg4v0lINTw6uBdoJuq3t2TO
Q3KmGVaUWz+c6dY/0qntVpws35zG106S8Pp4mx/1EnUHbJKZYDsUMC1ppwhrr3Pz
WEyQ9PFXhOyoSbrtOaEfU+wsFPeAfT9eYADu7oV1t7l75TJAKW1EEaSGfzOO/crj
3GK3vRq2B1cMHX9c4fwhSs4h8k5JvKlI/mtGPxZN3khVorx9dv/rTqOoeQEsFS5+
8s5XcNPPJlKfNXcu3Jq6rn8U/JA2HnbH298Nk5uXTCfTrZtDgbOnI8YVYWnoadOl
8xJoie5ccEsysVj1npNNh61LNMF1XBUUC+eNn0I1o0NzeRauxF8=
=QQUn
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-fixes-7.1-rc6' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 7.1-rcN
- Include the kernel's linux/mman.h in KVM selftests to ensure MADV_COLLAPSE
is defined, as older libc versions may not provide it.
- Include execinfo.h if and only if KVM selftests are building against glibc,
and provide a test_dump_stack() for non-glibc builds.
- Fudge around an RCU splat in the emegerncy reboot code that is technically
a legitimate flaw, but in practice is a non-issue and fixing the flaw, e.g.
by adding locking, would incur meaningful risk, i.e. do more harm than good.
- Rate-limit global clock updates once again (but without delayed work), as
KVM was subtly relying on the old rate-limiting for NPT correction to guard
against "update storms" when running without a master clock on systems with
overcommitted CPUs.
- Fix a brown paper bag goof where KVM checked if ERAPS is "dirty" instead of
marking it dirty when emulating INVPCID.
- Flush the TLB when transitioning from xAVIC => x2AVIC to ensure the CPU TLB
doesn't contain AVIC-tagged entries for the APIC base GPA.
cxl/test: Update mock dev array before calling platform_device_add()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmoZsZYACgkQYGjFFmlT
OEqWGQ//dw5RvgcSxTF9+bUE/kyBJD37UfsmUXMAbC8AhVbBgQsXaWUM72azi/s2
Ht+lYgQbtyGC5RmK+A0GCOLh6UtYP3WW49eii7otbJUafdbXHwNiTxdkUXCqX2Mn
voE5Sw8bClNq3g2dIQuXDAPi0AFYzpMpvqjyzB8v1BzxqrFCZm8IUIubjcPwsan1
E6vQZ0arAGBqgWsbYyOkvWHHyhKIp7ymI2yQ8xljjoPeqHvcVbpFeJUSwS1XDh0d
HciWK0VQZhRvonP5xM7lJLN5RMIBXBnPk98LWQO4xgmxuKOxVkxapsHY2yfrzkE3
PFqlVgddPN/wT3XgPXP/1B2TvqqPJEUHImU8YlMNL3PT6IJhhyNcf5qbvr29rdSq
T2ysQCDrBZDXe4FqvVc02JkUuY9/yJ83Z4CY6iTRCSUIuAMk5dL0Y4dOA/O1hLbc
fnxbbj0bV9tgpjt2bOyPqcs/3p7eRkteJHJ8FoCyZFBdGk2hROoyD3vvCH8DI10/
kZ/ZUnLuBYxRIDwSIbv9yIsTKHz9eO58gBrf3zEm/1HuWY0e4R4CCllGKrwiBJ89
Q+TpJwCWQ7nwyf5uhI3WK2dAAZO22U+rW1noGLbbRTDX66HSFVk1dUXqPw9p7V4R
AiaJksa0apPSp69wiVeBy2rVZz3lxSnRlvCvKra7qZ26N45iqFw=
=a0MQ
-----END PGP SIGNATURE-----
Merge tag 'cxl-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) fixes from Dave Jiang:
- cxl/test: update mock dev array before calling platform_device_add()
* tag 'cxl-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/test: Update mock dev array before calling platform_device_add()
A collection of recent small fixes and quirks.
We still see a bit more changes than wished, but most of them are
device-specific ones that are pretty safe to apply, while a core fix
is a typical UAF fix for PCM core that was recently caught by fuzzer;
so overall nothing looks really worrisome.
* Core:
- Fix a UAF in PCM OSS proc interface
* HD-audio:
- Fix memory leaks in CS35L56 driver
- Various device-specific quirks for Realtek and CS420x codecs
* USB-audio:
- Quirk for TAE1160 USB Audio
- Fix for Scarlett2 Gen4 direct monitor gain
* ASoC:
- Fixes for QCom q6asm-dai, Intel bytcht_es8316, and simple-mux codec
* FireWire:
- Fix for Motu DSP event queue protection
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmoZTLUOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8RMxAAmdvBYW7xB64onopKCTjdHewCx3nsK9V9jiWi
0xsQDEs1+JukXG4Bag06R2W9EJLrdFewa88QnuwtEG+wMxJO30oJxm/b6k5EkwNu
ai8Vd8pFkcxmpAC7cL/ROtQurqW9Tfr+w9qGUTcS12tRuvQV+AjfHvGSlwqiS3K0
ZFPozyPVD6kQ5g3io/GH3HW76ZTLB+jtVx+VYvQsJ43Bii3g2pfNW6HlWQyundPt
Q6vLgUHEhvf9fgA76YztQedtoxBxQyuWrZCOV2+qzobW5sJHdOAlh4XmR6VjYzui
Gp6Xn4U1wPsP722kqzm+uK6QDVsaGGzKOuwKWYpGBwRYVol0CY/YFC1ZcT19VCMI
k6J93OMYZRoAl+ECbbWQjhVN4g8Ut7CQlvcZpPVwr5+HgcoitvuBt41KGUesKFkp
0h8kgy/X59JCJJ+AaZlxKOKoPL+WeVCXm1z9GF3VcS1h9kT1ZXfB1d3nSi9383xR
ho+YJYQQiyvXK7ZfESKZgChmC9LBSU5joW77HslXU5tM9ZkLGW1w8O7Zasas7cSm
xeNhhR6J3AnfTLX2XHDKWR/w32kqaEYTUr+t1Z997Vm33D7idcOtQIr20AgQon+0
EfKLijHFQB0HCkqgndMXdgjnus9TnXOoJE9aQudRr8rSUgu3dONrPguv+z23y25n
9u7awX4=
=IHLg
-----END PGP SIGNATURE-----
Merge tag 'sound-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of recent small fixes and quirks.
We still see a bit more changes than wished, but most of them are
device-specific ones that are pretty safe to apply, while a core fix
is a typical UAF fix for PCM core that was recently caught by fuzzer;
so overall nothing looks really worrisome.
Core:
- Fix a UAF in PCM OSS proc interface
HD-audio:
- Fix memory leaks in CS35L56 driver
- Various device-specific quirks for Realtek and CS420x codecs
USB-audio:
- Quirk for TAE1160 USB Audio
- Fix for Scarlett2 Gen4 direct monitor gain
ASoC:
- Fixes for QCom q6asm-dai, Intel bytcht_es8316, and simple-mux codec
FireWire:
- Fix for Motu DSP event queue protection"
* tag 'sound-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: codecs: simple-mux: Fix enum control bounds check
ALSA: usb-audio: Add iface reset and delay quirk for TAE1160 USB Audio
ALSA: hda/cs420x: Add CS4208 fixup for iMac16,1
ALSA: hda/realtek: add quirk for HP Dragonfly Folio G3 2-in-1
ALSA: hda/realtek: Fix speaker output on ASUS ROG Strix G615LP
ASoC: qcom: q6asm-dai: use pointer type with kzalloc_obj()
ASoC: qcom: q6asm-dai: remove unnecessary braces
ASoC: qcom: q6asm-dai: fix error handling in prepare and set_params
ASoC: qcom: q6asm-dai: close stream only when running
ASoC: qcom: q6asm-dai: do not set stream state in event and trigger callbacks
ASoC: Intel: bytcht_es8316: Fix MCLK leak on init errors
ALSA: hda/realtek: Limit mic boost on Positivo DN140
ALSA: scarlett2: Fix 2i2 Gen 4 direct monitor gain on firmware 2417
ALSA: pcm: oss: Fix setup list UAF on proc write error
ALSA: hda: cs35l56: Fix system name string leaks
ALSA: hda/realtek: Add HDA_CODEC_QUIRK for Lenovo Yoga Slim 7 14AGP11
ALSA: hda/realtek: Fix incorrect comment for ALC299_FIXUP_PREDATOR_SPK
ALSA: firewire-motu: Protect register DSP event queue positions
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmoYZ3EACgkQ7MXwXhnZ
SjY8QQ/8DjopbIJ9mJzOLRRzCyKf/+JthOL48Vg7lu7uftRkgZ/IfOp9n7dWguG2
yiOI2/yyQDL3wsd3DoUEtMwTPhd/JNpc7qJ+N19l/tpcnup+PaDxpiZWD1k1zKz6
cNROdNtGKyBlSl80qWNDN65u/AkEv7QTcvZS2Nw/Kh+0jmiOf/FqplyjNwHXsDsi
VCSfkpAAPz3zaHaJcaWMMImV/s6MOwVAI+2PlO/Kbgv62BrzGXTJBdiWK2LIbxPh
Udo/fXwnX/I/SuQ+j5IGRMJOokbWXugyKnstpDIv/1brVEmrTzp0CB0yoKGNTjy0
S8/ZYJBsnCAXMA9pcpzM86jtB2yrA/HHwtOo+mH2AD3z9r25qHTiWbTh9/tnSAiW
MYNcwvpbgvFHmwWrjpt7jrndlFhbXT2fHjyMZg3j9tmisXZCqwmU972J1jXTb8wL
xjhkKJFjjgTkRwXXx3BP51I+W+15P1DKhsAOqJS2K19At5dgZTD5KY6wQXkDVbk7
c+o9Duq7PoBGzHz/gl1Z2nqA5KEgP17HQIAhM+HEjvrmOscnKpGdHLTpMSwqyKhF
e/4yLeZdpvn6KB/B1fXYIPhTAmjKI2JxNNsae4ir7m7S6HxwamXZ406A05NoCGIQ
XDm3B1xDYcS4NYKE9G6oO5+HcanZD1QBAJBa5k8COTyIQ42svnY=
=pFq/
-----END PGP SIGNATURE-----
Merge tag 'hid-for-linus-2026052801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- buffer overflow fix for lenovo (Kean) and wacom (Lee Jones) drivers
- segfaults prevention in lenovo-go driver when used with an emulated
device (Louis Clinckx)
- cleanup of resources in u2fzero (Myeonghun Pak)
- a quirk for a USB mouse and a cleanup in hid.h (hlleng and Liu Kai)
* tag 'hid-for-linus-2026052801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: Fix OOB write in wacom_hid_set_device_mode()
HID: lenovo-go: drop dead NULL check on to_usb_interface()
HID: lenovo-go: reject non-USB transports in probe
HID: lenovo: Fix buffer over-read and unaligned access in X12 Tab raw_event handler
HID: quirks: Add ALWAYS_POLL quirk for SIGMACHIP USB mouse
HID: remove duplicate hid_warn_ratelimited definition
HID: u2fzero: free allocated URB on probe errors
Sashiko identified a possible infinite loop [1].
ACPM IPC sequence numbers are tracked via a 64-bit bitmap. Previously,
acpm_prepare_xfer() used a do...while loop to search for a free
sequence number.
If all 63 available sequence numbers are leaked due to transient
hardware timeouts or mailbox failures, the bitmap becomes full.
The next call to acpm_prepare_xfer() would enter an infinite loop.
Fix this by utilizing the kernel's optimized bitmap search functions
(find_next_zero_bit / find_first_zero_bit). If the pool is completely
exhausted, log the failure and return -EBUSY to allow the kernel to
fail gracefully instead of hanging.
Furthermore, drop the allocation loop entirely. Because
acpm_prepare_xfer() is strictly called under the 'tx_lock' mutex,
sequence number allocations are perfectly serialized. If
find_next_zero_bit() locates a free bit, a single
test_and_set_bit_lock() is mathematically guaranteed to succeed.
To enforce this locking invariant, wrap the allocation in a
WARN_ON_ONCE. If the atomic set fails, it indicates the driver's
mutex serialization is fundamentally broken. The warning generates a
stack trace for debugging, while returning -EIO immediately aborts the
transfer to prevent silent payload corruption.
Cc: stable@vger.kernel.org
Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-7-43b5ee7f1674@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Sashiko identified memory ordering races in [1].
The ACPM driver uses a globally shared 'bitmap_seqnum' to track
available sequence numbers. Even though threads now strictly free their
own sequence numbers, the allocation and freeing of these bits across
concurrent threads are effectively lockless operations and require
explicit LKMM memory barriers.
Previously, the driver used plain bitwise operators (test_bit, set_bit,
clear_bit), which lack ordering guarantees. This creates two race
conditions on weakly ordered architectures like ARM64:
1. Polling Release Violation: The polling thread copies its payload and
calls clear_bit(). Without a release barrier, the CPU can reorder
the memory operations, making the cleared bit globally visible
before the payload reads have fully completed.
2. TX Acquire Violation: The TX thread loops on test_bit(), calls
set_bit(), and then wipes the payload buffer via memset(). Without
an acquire barrier, the CPU can speculatively execute the memset()
before the bit is safely and formally claimed.
If these reorderings overlap, a new TX thread can claim the sequence
number and overwrite the buffer while the original polling thread is
still actively reading from it.
Fix this by upgrading the bitwise operators. Wrap the TX allocation in
test_and_set_bit_lock() to establish formal LKMM Acquire semantics, and
pair it with clear_bit_unlock() in the polling path to enforce Release
semantics.
Cc: stable@vger.kernel.org
Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260423-acpm-fixes-sashiko-reports-v1-0-2217b790925e%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-6-43b5ee7f1674@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Sashiko identified severe races in the polling state machine [1].
In the ACPM driver's polling mode, threads waited for responses by
monitoring the globally shared 'bitmap_seqnum'. This caused false
timeouts because if a thread processed its response and freed the
sequence number, a concurrent TX thread could immediately reallocate
it before the polling thread woke up.
Additionally, the driver suffered from a cross-thread Use-After-Free
(UAF) preemption race. Previously, acpm_get_rx() cleared the sequence
number of whichever RX message it drained from the hardware queue. This
meant Thread A could globally free Thread B's sequence slot while
Thread B was asleep. A new Thread C could then steal the slot,
overwrite the buffer, and leave Thread B to wake up to corrupted state
or a timeout.
Fix this by rewriting the polling state machine:
1. Decouple polling from the global allocator by introducing a per-slot
'completed' flag, synchronized via smp_store_release() and
smp_load_acquire().
2. Strip acpm_get_saved_rx() out of acpm_get_rx() to make it a pure
queue-draining function. Introduce a 'native_match' boolean argument
which evaluates to true only if the thread natively processed its
own sequence number during the call. This explicitly informs the
polling loop whether it must retrieve its payload from the
cross-thread cache.
3. Centralize the cache fallback and sequence number free (clear_bit)
inside the polling loop. Crucially, the free operation now strictly
targets the thread's own TX sequence number (xfer->txd[0]), rather
than the drained RX sequence number. This enforces strict ownership:
a thread only ever frees its own allocated sequence slot, and only
at the exact moment it completes its poll, eliminating the UAF
window.
Furthermore, explicitly guard the 'native_match' assignment with an
if (rx_seqnum == tx_seqnum) check, even for zero-length (no payload)
responses. While an unguarded assignment wouldn't crash (because the
cache fallback acpm_get_saved_rx() safely returns early on zero-length
transfers) doing so would "lie" to the state machine. If a thread
drained the queue and found another thread's zero-length message,
setting native_match = true would falsely convince the polling loop
that it natively handled its own response. Maintaining a rigorous state
machine requires that native_match is only set when a thread explicitly
processes its own sequence number.
Cc: stable@vger.kernel.org
Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260429-acpm-fixes-sashiko-reports-v3-0-47cf74ab09ad%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-5-43b5ee7f1674@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
This reverts commit 937f3e6b51.
The change to format propagation in the BRx broke configuration of the
DRM pipeline. Revert it to fix the regression.
The original commit was meant to fix a v4l2-compliance failure, with no
known userspace applications being affected beside test tools. Reverting
is the simplest option, a more comprehensive fix can be developed (and
tested more thoroughly) later.
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Closes: https://lore.kernel.org/linux-media/CA+V-a8t481xuwava0nb7uY9CUPqFWZ_8EP0xrK3BgumP7HDcLg@mail.gmail.com
Fixes: 937f3e6b51 ("media: renesas: vsp1: brx: Fix format propagation")
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/T2H
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260506215650.1897177-3-laurent.pinchart+renesas@ideasonboard.com
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
This reverts commit 133ac42af0.
The change to format initialization, along with the change to format
propagation in the BRx in commit 937f3e6b51 ("media: renesas: vsp1:
brx: Fix format propagation"), broke configuration of the DRM pipeline.
Revert it to fix the regression.
The original commit was meant to fix a v4l2-compliance failure, with no
known userspace applications being affected beside test tools. Reverting
is the simplest option, a more comprehensive fix can be developed (and
tested more thoroughly) later.
Fixes: 133ac42af0 ("media: renesas: vsp1: Initialize format on all pads")
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/T2H
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260506215650.1897177-2-laurent.pinchart+renesas@ideasonboard.com
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
ZCR_EL2 can be updated by a VHE guest hypervisor either using ZCR_EL2
(which traps) or ZCR_EL1 (which does not trap). KVM handles both in
different way:
- on ZCR_EL2 trap, ZCR_EL2.LEN is immediately capped at the VM's own
VL limit. This has the potential to break existing SW that relies
on the full LEN field to be stateful.
- on ZCR_EL1 access, we do absolutely nothing.
On restoring the SVE context for an L2 guest, we directly restore the
guest hypervisor's view of ZCR_EL2 into the physical ZCR_EL2. If the
guest's view of the register was updated using the ZCR_EL2 accessor,
the value has already been sanitised (with the caveat mentioned above).
But if the guest used ZCR_EL1, the raw value is written into the HW,
and the L2 guest can now access VLs that it shouldn't.
Fix all the above by moving the VL capping to the restore points,
ensuring that:
- the HW is always programmed with a capped value, irrespective of
the accessor being used,
- the ZCR_EL2.LEN field is always completely stateful, irrespective
of the accessor being used.
Additionally, move ZCR_EL2 to be a sanitised register, ensuring that
only the LEN field is actually stateful. This requires some creative
construction of the RES0 mask, as the sysreg generation script does
not yet generate RAZ/WI fields.
Fixes: b3d29a8230 ("KVM: arm64: nv: Handle ZCR_EL2 traps")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260529-kvm-arm64-fix-zcr-len-nv-v2-1-86cad51992bd@kernel.org
[maz: rewrote commit message, tidy up access_zcr_el2()]
Signed-off-by: Marc Zyngier <maz@kernel.org>
This reverts commit 2982e599ff.
The patch does not fully fix the issue and the Author does
not match the 'Signed-off-by:' tag, so revert it for now.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Fix integer overflow issues in the dumb buffer creation path:
1. drm_mode_create_dumb() does not bound width, height, or bpp
before passing them to driver callbacks. Downstream helpers
(e.g. drm_gem_dma_dumb_create_internal) perform pitch/size
alignment in u32 arithmetic that can overflow for extreme
values. Add hard limits: width and height < 8192, bpp <= 32.
No legitimate software rendering use case exceeds these.
2. drm_mode_align_dumb() uses roundup(pitch, hw_pitch_align)
without checking for overflow. If pitch is near U32_MAX,
roundup() wraps to a small value, making subsequent
check_mul_overflow() pass with a much smaller pitch than
intended. Add an overflow check after roundup.
3. drm_mode_align_dumb() uses ALIGN(size, hw_size_align) which
only works correctly for power-of-two alignment values.
Replace with roundup() which works for any alignment.
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
If cma_activate_area() fails after allocating only part of the range
bitmaps, the cleanup path still has to release the reserved pages when
CMA_RESERVE_PAGES_ON_ERROR is clear.
That is still worth doing even in this __init path. A bitmap_zalloc()
failure does not necessarily mean the system cannot make further progress:
freeing the reserved CMA pages can return a substantial amount of memory
to the buddy allocator and may relieve the temporary memory shortage that
caused the allocation failure in the first place.
However, the cleanup path currently uses the bitmap-freeing bound for page
release as well. That is only correct for ranges whose bitmap allocation
already succeeded. The failed range and all later ranges still keep their
reserved pages, so a partial bitmap allocation failure can permanently
leak them.
Fix this by releasing reserved pages for all ranges. Use the saved
early_pfn[] value for ranges whose bitmap allocation already succeeded and
for the failed range, and use cmr->early_pfn for later ranges whose bitmap
allocation was never attempted.
Link: https://lore.kernel.org/20260523060123.2207992-1-songmuchun@bytedance.com
Fixes: c009da4258 ("mm, cma: support multiple contiguous ranges, if requested")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page can
trigger a recursive spinlock self-deadlock (AA deadlock) on hugetlb_lock
when racing with a concurrent unmap:
thread#0 thread#1
-------- --------
madvise(folio, MADV_HWPOISON)
-> poisons the folio successfully
madvise(folio, MADV_HWPOISON) unmap(folio)
try_memory_failure_hugetlb
get_huge_page_for_hwpoison
spin_lock_irq(&hugetlb_lock) <- held
__get_huge_page_for_hwpoison
hugetlb_update_hwpoison()
-> MF_HUGETLB_FOLIO_PRE_POISONED
goto out:
folio_put()
refcount: 1 -> 0
free_huge_folio()
spin_lock_irqsave(&hugetlb_lock)
-> AA DEADLOCK!
The out: path in __get_huge_page_for_hwpoison() calls folio_put() to drop
the GUP reference while the hugetlb_lock is still held by the hugetlb.c
wrapper get_huge_page_for_hwpoison(). If concurrent unmap has released
the page table mapping reference, folio_put() drops the folio refcount to
zero, triggering free_huge_folio() which attempts to re-acquire the
non-recursive hugetlb_lock.
Fix this by moving hugetlb_lock acquisition from the hugetlb.c wrapper
into get_huge_page_for_hwpoison(). Place spin_unlock_irq() before the
folio_put() at the out: label so the folio is always released outside the
lock.
[akpm@linux-foundation.org: fix race, rename label per Miaohe]
Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@huawei.com
Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@huawei.com
Link: https://lore.kernel.org/20260522010305.4099834-1-mawupeng1@huawei.com
Fixes: 405ce05123 ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Two sites in mm/hugetlb.c allocate a hugetlb folio via
alloc_hugetlb_folio() (consuming a VMA reservation) and then call
copy_user_large_folio(), which became int-returning in commit 1cb9dc4b47
("mm: hwpoison: support recovery from HugePage copy-on-write faults") and
can now fail (e.g. -EHWPOISON on a hwpoisoned source page). On the
failure path, folio_put() restores the global hugetlb pool count through
free_huge_folio(), but the per-VMA reservation map entry is left marked
consumed:
- hugetlb_mfill_atomic_pte() resubmission path (UFFDIO_COPY)
- copy_hugetlb_page_range() fork-time CoW path when
hugetlb_try_dup_anon_rmap() fails (rare: pinned hugetlb anon
folio under fork)
User-visible effect: on UFFDIO_COPY into a private hugetlb VMA where the
resubmission copy fails, the reservation for that address is leaked from
the VMA's reserve map. A subsequent fault at the same address takes the
no-reservation path, and under hugetlb pool pressure the task is SIGBUSed
at an address it had previously reserved. The fork-time CoW path leaks
the same way in the child VMA's reserve map, though it requires the much
rarer combination of pinned hugetlb anon page + hwpoisoned source.
Add the missing restore_reserve_on_error() call before folio_put() on both
error paths.
Link: https://lore.kernel.org/20260520044912.6751-1-devnexen@gmail.com
Fixes: 1cb9dc4b47 ("mm: hwpoison: support recovery from HugePage copy-on-write faults")
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: yuehaibing <yuehaibing@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
cma_activate_area() can fail after allocating range bitmaps. Its cleanup
path frees those bitmaps, but only clears cma->count and
cma->available_count. It leaves cma->nranges and each range's count in
place, so cma_debugfs_init() can still register debugfs files for an area
that never activated successfully.
That exposes two problems. Reading the bitmap file can make debugfs walk
a freed range bitmap and trigger an invalid memory access. Reading
maxchunk can also take cma->lock even though that lock is initialized only
on the successful activation path.
Fix this by creating debugfs entries only for CMA areas that reached
CMA_ACTIVATED.
c009da4258 introduced the invalid access to bitmap file. 2e32b94760
introduced the invalid access to cma->lock. This change applies to both
issues. So I added two Fixes tags.
Link: https://lore.kernel.org/20260520061025.3971821-1-songmuchun@bytedance.com
Fixes: c009da4258 ("mm, cma: support multiple contiguous ranges, if requested")
Fixes: 2e32b94760 ("mm: cma: add functions to get region pages counters")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Stefan Strogin <stefan.strogin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Harry Yoo reported that get_random_u32_below() is not safe to call in the
nmi context and memcg charge draining can happen in nmi context.
More specifically get_random_u32_below() is neither reentrant- nor
NMI-safe: it acquires a per-cpu local_lock via local_lock_irqsave() on the
batched_entropy_u32 state. An NMI that lands on a CPU mid-update of the
ChaCha batch state and recurses into the random subsystem would corrupt
that state. The memcg_stock local_trylock prevents re-entry on the percpu
stock itself, but cannot protect an unrelated subsystem's per-cpu lock.
Replace the random pick with a per-cpu round-robin counter stored in
memcg_stock_pcp and serialized by the same local_trylock that already
guards cached[] and nr_pages[]. No atomics, no random calls, no extra
locks needed.
Link: https://lore.kernel.org/20260521223751.3794625-1-shakeel.butt@linux.dev
Fixes: f735eebe55 ("memcg: multi-memcg percpu charge cache")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reported-by: Harry Yoo <harry@kernel.org>
Closes: https://lore.kernel.org/4e20f643-6983-4b6e-b12d-c6c4eb20ae0c@kernel.org/
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 081056dc00 ("mm/hugetlb: unshare page tables during VMA split,
not before") changed the locking model around hugetlbfs PMD unsharing on
VMA split, but did not update the function which asserts the locks,
hugetlb_vma_assert_locked().
This function asserts that either the hugetlb VMA lock is held (if a
shared mapping) or that the reservation map lock is held (if private).
If you get an unfortunate race between something which results in one of
these locks being released and a hugetlb VMA split and you have
CONFIG_LOCKDEP enabled, you can therefore see a false positive assertion
arise when there is in fact no issue.
Since this change introduced a new take_locks parameter to
hugetlb_unshare_pmds(), which, when set to false, indicates that locking
is sufficient, simply pass this to the unsharing logic and predicate the
lock assertions on this.
This is safe, as we already asserted the file rmap lock and the VMA write
lock prior to this (implying exclusive mmap write lock), so we cannot be
raced by either rmap or page fault page table walkers which the asserted
locks are intended to protect against (we don't mind GUP-fast).
Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
check_locks parameter, and update hugetlb_unshare_pmds() to pass this
parameter to it.
This leaves all other callers of huge_pmd_unshare() still correctly
asserting the locks.
The below reproducer will trigger the assert in a kernel with
CONFIG_LOCKDEP enabled by racing process teardown (which will release the
hugetlb lock) against a hugetlb split.
void execute_one(void)
{
void *ptr;
pid_t pid;
/*
* Create a hugetlb mapping spanning a PUD entry.
*
* We force the hugetlb page allocation with populate and
* noreserve.
*
* |---------------------|
* | |
* |---------------------|
* 0 PUD boundary
*/
ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
MAP_FIXED | MAP_SHARED | MAP_ANON |
MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
-1, 0);
if (ptr == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
/*
* Fork but with a bogus stack pointer so we try to execute code in
* a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
*
* The clone will cause PMD page table sharing between the
* processes first via:
* copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
*
* Then tear down and release the hugetlb 'VMA' lock via:
* exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
*/
pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
if (pid < 0) {
perror("clone");
exit(EXIT_FAILURE);
} if (pid == 0) {
/* Pop stack... */
return;
}
/*
* We are the parent process.
*
* Race the child process's teardown with a PMD unshare.
*
* We do this by triggering:
*
* __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
*
* Which, importantly, doesn't hold the hugetlb VMA lock (nor can
* it), meaning we assert in hugetlb_vma_assert_locked().
*
* .
* |----------.----------|
* | . |
* |----------.----------|
* 0 . PUD boundary
*/
mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
}
int main(void)
{
int i;
/* Kick off fork children. */
for (i = 0; i < NUM_FORKS; i++) {
pid_t pid = fork();
if (pid < 0) {
perror("fork");
exit(EXIT_FAILURE);
}
/* Fork children do their work and exit. */
if (!pid) {
int j;
for (j = 0; j < NUM_ITERS; j++)
execute_one();
return EXIT_SUCCESS;
}
}
/* If we succeeded, wait on children. */
for (i = 0; i < NUM_FORKS; i++)
wait(NULL);
return EXIT_SUCCESS;
}
[ljs@kernel.org: account for the !CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING case]
Link: https://lore.kernel.org/agWZsPGYid08uU6O@lucifer
Link: https://lore.kernel.org/20260513085658.45264-1-ljs@kernel.org
Fixes: 081056dc00 ("mm/hugetlb: unshare page tables during VMA split, not before")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 8871389da1 introduces common pcs dts properties which writes
rx=normal,tx=normal polarity to register SGMSYS_QPHY_WRAP_CTRL of switch.
This is initialized with tx-bit set and so change inverts polarity
compared to before.
It looks like mt7531 has tx polarity inverted in hardware and set tx-bit
by default to restore the normal polarity.
The MT7531 datasheet quite clearly states:
Register 000050EC QPHY_WRAP_CTRL -- QPHY wrapper control
Reset value: 0x00000501
BIT 1 RX_BIT_POLARITY -- RX bit polarity control
1'b0: normal
1'b1: inverted
BIT 0 TX_BIT_POLARITY -- TX bit polarity control (TX default inversed
in MT7531)
1'b0: normal
1'b1: inverted
Till this patch the register write was only called when mediatek,pnswap
property was set which cannot be done for switch because the fw-node param
was always NULL from switch driver in the mtk_pcs_lynxi_create call.
Do not configure switch side like it's done before.
Fixes: 8871389da1 ("net: pcs: pcs-mtk-lynxi: deprecate "mediatek,pnswap"")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260526153239.30194-1-linux@fw-web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- hci_core: Rework hci_dev_do_reset() to use hci_sync functions
- hci_conn: Fix memory leak in hci_le_big_terminate()
- hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close
- hci_sync: Reset device counters in hci_dev_close_sync()
- hci_sync: fix UAF in hci_le_create_cis_sync
- L2CAP: Fix possible crash on l2cap_ecred_conn_rsp
- L2CAP: fix chan ref leak in l2cap_chan_timeout() on !conn
- L2CAP: use chan timer to close channels in cleanup_listen()
- L2CAP: clear chan->ident on ECRED reconfiguration success
- ISO: fix UAF in iso_recv_frame
- ISO: serialize iso_sock_clear_timer with socket lock
- HIDP: fix missing length checks in hidp_input_report()
- 6lowpan: check skb_clone() return value in send_mcast_pkt()
- btusb: Allow firmware re-download when version matches
- hci_qca: Use 100 ms SSR delay for rampatch and NVM loading
-----BEGIN PGP SIGNATURE-----
iQJNBAABCgA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmoYP6oZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKX9SD/9+09g+LpkrBUyTsqaMiWii
4BNwbGG1kTzof/3pxmjxsJNbOi1u7wpA6gyGo9Opaaz3IjE5ZYnwblhSCjf7i0Qz
Txl8p5DPmTBHBevZLDA0OrTtxAtNKlOzGQLs/nrWikHvEBy0F20e5defcv9IMaAy
XAyv++EyZFznqwGU0iV2avXF4Gnfs6F1WL5f42jl0zwahB3pKZZik1HYOkg9JsIm
9/6qlBStf4VxATF+vN5OqY9K6xV2bPVC/htoBP4oMUbaDXbgZgDoTYScyvBOktBZ
DenyFzWogGKRWrOxgWA9RYnQ0VU+Orx0C5gf+DLbUY2uMGMlEwVAOBUUbDUsgsz8
m97hfJamNcpVv9+40v8QfYc1gF+k5OYQx1UCAvRSdmnkdycukeZXfScwx/35jHqj
BFExP7b5UtphUfjUiP/nEqZogiV4HXPoWF+X4Wnoe80dNhX0JGsjBjlBc6Da4ZOF
zdn9l+Qc/OvaMJV2OZIHFu28BBpgBAp+ujOhMRtlvSgvi1XUkf1InFEWdIg+NpiE
8JOD63huNXeGzKLhwyMfIQ8zsMNwQpn+gnE4vqHal6GhYhAMhSBtJxLBAdc8aZQR
91SFDl96ghNrRXfQlWfN7g2TdTZKCUzD5ZmOl1s/iv1rqENYA6WCwLXHe/6I03mT
hpx6ALMEjS8qud43JFfOOg==
=7Nv9
-----END PGP SIGNATURE-----
Merge tag 'for-net-2026-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_core: Rework hci_dev_do_reset() to use hci_sync functions
- hci_conn: Fix memory leak in hci_le_big_terminate()
- hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close
- hci_sync: Reset device counters in hci_dev_close_sync()
- hci_sync: fix UAF in hci_le_create_cis_sync
- L2CAP: Fix possible crash on l2cap_ecred_conn_rsp
- L2CAP: fix chan ref leak in l2cap_chan_timeout() on !conn
- L2CAP: use chan timer to close channels in cleanup_listen()
- L2CAP: clear chan->ident on ECRED reconfiguration success
- ISO: fix UAF in iso_recv_frame
- ISO: serialize iso_sock_clear_timer with socket lock
- HIDP: fix missing length checks in hidp_input_report()
- 6lowpan: check skb_clone() return value in send_mcast_pkt()
- btusb: Allow firmware re-download when version matches
- hci_qca: Use 100 ms SSR delay for rampatch and NVM loading
* tag 'for-net-2026-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_sync: Reset device counters in hci_dev_close_sync()
Bluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close
Bluetooth: hci_core: Rework hci_dev_do_reset() to use hci_sync functions
Bluetooth: ISO: serialize iso_sock_clear_timer with socket lock
Bluetooth: ISO: fix UAF in iso_recv_frame
Bluetooth: L2CAP: Fix possible crash on l2cap_ecred_conn_rsp
Bluetooth: l2cap: clear chan->ident on ECRED reconfiguration success
Bluetooth: hci_qca: Use 100 ms SSR delay for rampatch and NVM loading
Bluetooth: hci_sync: fix UAF in hci_le_create_cis_sync
Bluetooth: 6lowpan: check skb_clone() return value in send_mcast_pkt()
Bluetooth: btusb: Allow firmware re-download when version matches
Bluetooth: HIDP: fix missing length checks in hidp_input_report()
Bluetooth: L2CAP: use chan timer to close channels in cleanup_listen()
Bluetooth: L2CAP: fix chan ref leak in l2cap_chan_timeout() on !conn
Bluetooth: hci_conn: Fix memory leak in hci_le_big_terminate()
====================
Link: https://patch.msgid.link/20260528131839.462344-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sctp_wait_for_connect() drops and re-acquires the socket lock while
waiting for the association to reach ESTABLISHED state. During this
window, another thread can peeloff the association to a new socket via
getsockopt(SCTP_SOCKOPT_PEELOFF), changing asoc->base.sk. After
re-acquiring the old socket lock, sctp_wait_for_connect() returns
success without noticing the migration — the caller then accesses
the association under the wrong lock in sctp_datamsg_from_user().
Add the same sk != asoc->base.sk check that sctp_wait_for_sndbuf()
already has, returning an error if the association was migrated while
we slept.
Fixes: 668c9beb90 ("sctp: implement assign_number for sctp_stream_interleave")
Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20260527032411.60959-1-kipreyyy@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dipayaan Roy says:
====================
net: mana: Fix NULL dereferences during teardown after attach failure
When mana_attach() fails (e.g. during queue allocation), the error
cleanup frees apc->tx_qp and apc->rxqs and sets them to NULL. Multiple
subsequent teardown paths can then dereference these NULL pointers,
causing kernel panics.
Patch 1 adds NULL guards in the low-level teardown functions
(mana_fence_rqs, mana_destroy_vport, mana_dealloc_queues) so they are
safe to call regardless of queue initialization state. This covers all
callers: mana_remove(), mana_change_mtu() recovery, and internal error
paths in mana_alloc_queues().
Patch 2 adds an early exit in mana_detach() for already-detached ports,
making it safe for non-close callers. This allows the queue reset
handler to safely retry mana_attach() without redundant teardown.
====================
Link: https://patch.msgid.link/20260525081129.1230035-1-dipayanroy@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When mana_per_port_queue_reset_work_handler() runs after a previous
detach succeeded but attach failed, the port is left in a detached
state with apc->tx_qp and apc->rxqs already freed. Calling
mana_detach() again unconditionally leads to NULL pointer dereferences
during queue teardown.
Add an early exit in mana_detach() when the port is already in
detached state (!netif_device_present) for non-close callers, making
it safe to call idempotently. This allows the queue reset handler and
other recovery paths to simply retry mana_attach() without redundant
teardown.
Fixes: 3b194343c2 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20260525081129.1230035-3-dipayanroy@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When queue allocation fails partway through, the error cleanup frees
and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as
mana_remove(), mana_change_mtu() recovery, and internal error handling
in mana_alloc_queues() can subsequently call into functions that
dereference these pointers without NULL checks:
- mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer
dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26).
- mana_destroy_vport() iterates apc->rxqs without a NULL check.
- mana_fence_rqs() iterates apc->rxqs without a NULL check.
- mana_dealloc_queues() iterates apc->tx_qp without a NULL check.
Add NULL guards for apc->rxqs in mana_fence_rqs(),
mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a
NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue
draining when TX queues were never allocated or already freed.
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20260525081129.1230035-2-dipayanroy@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix CAAM driver probe failures caused by missing SoC information by
retrieving the match data directly through of_machine_get_match_data(),
which provides the correct SoC-specific data.
-----BEGIN PGP SIGNATURE-----
iQHFBAABCgAvFiEEJS45w2QNr0ezLVaoNF3oRQ23YkwFAmoYj2MRHGZyYW5rLmxp
QG54cC5jb20ACgkQNF3oRQ23YkxPwQwAq95b24f+LRSLuQjyfQVi6SbuGwufVtMp
zsrwiZKfyWAkvxHa7CLHKaIC7x60J3YTUc5BdB7vQ/uDaZZwEnn/1rnUD4vvjTkd
vODYZ2DYdZERE3UrljAdxEcuXBH496s+eDybVf4oVk5V5/rQ6TCbnK+rxXB+fXS+
dkzUOL1j4fX1XsaGNICsCLJuSNLsxr9ja9CNJLFpr6fnncnmx8kKHSP1PHKseF+N
qwOw5Iv2Cwp4EQZGfm54j/k7pAMlJljzWy+X5qNInUsvEBMUKa97apiMVBCfYfTB
qDIYGA6KNd/7kMMkam2DusrOLitFpQtUFtjlMwee3BoYls3d2OkdpH1EilyoZoEg
MCF7wd7PzD7jenVC4Z1S1BW0AdG+n3Pyxnh9iW3grljYsXE/sFRx/U13JN32EqoU
vWshLNvNX4XH4iZKDNt+mvoCTzcnlkIZYcM4mR+JjXUpmFhcsOIpAFD2RhE/0jta
qJOLvCddLfqkf3VPXK/sRRE48Pkw8kJs
=freo
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoYurAACgkQmmx57+YA
GNkMKQ//f/kQ6VT0fovp6yM60tEAPsVtt5U8zpSsQpuv1joJqS9FgSx13Z6XlLsq
IIyGPxlzsTnCke3UHo0poiGUlNrOvY87fQ8ZeU7ItpxoKLSHmLUwXZAzmlVy+yj/
WSly0Ns3VURXhlbdQ14mQE+kxZTOBb9L02K3iIGjgIda5fbG+12IvB0cPxPUv3Oi
+Uw2H4D5S17FxYlQoVtxPL6D/GcqBVSSCaqVHJdvpS4wmoCKvycK9yYymJnbRxIf
/QBwpfR6QON4wxpFk7lF0s7+DWfk88qeZOvf+sDyMRGbDLsNU9C6zmjVD6y2lT3N
kHjrnZUqdg3DgwHhc/+sXJgW4cNsDisz/L27lHwMDzzFuvz6+SB7xXn1k0cDuyGt
5HVmdTuRRAYyolzTXCPgxvP1j11xGcOueNR7fQDbQKPDUm9OpgT0X0fPaG/w6H35
qq1VBxJDMV+0DPOgBzHniC2CNBdADZ6Og14wQPEdVTsQUaWH66aXS8iQrcKbjwEw
rDFS3eiOu8s+iP4wccSgKvJs0vn0fRmc2AVA74awwmoyUowwA2lT5hWKhKVuWlJN
Zs4Bw/bk+DQAeCzUvswYdEnb1tTO4l95jHh9b4p3qkU9N8vZOmsQSwoy2XjzseLo
Z14PuXI4R5+4LXgR26doMqonYvVkvblqETt4IkBGrUzUyaRifMM=
=Kz7A
-----END PGP SIGNATURE-----
Merge tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux into arm/fixes
i.MX SoC fixes for v7.1
Fix CAAM driver probe failures caused by missing SoC information by
retrieving the match data directly through of_machine_get_match_data(),
which provides the correct SoC-specific data.
* tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux:
soc: imx8m: Fix match data lookup for soc device
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
These FIS partition offsets were never right: the comment clearly
states the FIS index is at 0xfe0000 and 0x7f * 0x200000 is
0xfe0000.
Tested on the iTian SQ201.
Fixes: d88b11ef91 ("ARM: dts: Fix up SQ201 flash access")
Fixes: b5a923f8c7 ("ARM: dts: gemini: Switch to redboot partition parsing")
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
A number of targets now depends on the M.2 PCIe power sequencing driver,
enable this to keep these devices functional with a defconfig build.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCgAsFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmoVtMkOHGJqb3JuQGty
eW8uc2UACgkQCx85Pw2ZrcUL2w/8CkfdKhdB8BOe/X9t72nNwjVwq/+BTTlpZxR0
pNn5TvBhJWedzLq7Q2R7RU9CF74wGosT9JAX8jb45NHoemH4ia7aj5uHks2rMIU9
i+XI78VT9MwF13ls9yCjLGcqquowEpQzVMNfGwLZO5dp46wZ59GPIUcwoOmr6VqF
G1Ps3T1mkkALB1RDQitad6Ey3xaMZqA38mFhfByz/ELGSuhWaG7sjGSy6HVEVtzd
RIdFEdx+6Ez08V6tvKP82ljWljKDCeAlOggD9BbpTUVq1zGCdqQIaeG/O1nNJzaf
buQH5iqI7HGFtEmXTHCRFzUUKdD/hi5HyYvIF7PWBswqAfoC70Gb9NozFYnNvXac
eCJK6NNc6jCt42W4iZGAD74lr0UDGK2L+ymztJfRU+0v/PTxY40NmIX0GwRN2JKc
zwg6IKhyhwg9IOoBAQpVJLgVx7dUBk8sVj3yJAk/B+C8vOVoCQbde6FIJf6pJRS7
Ms8ZeJ4eGOcVTgdrzaJTxWN5xJzD7Y92KelzlhGN/l35QbYkf9DwEtIm+g816qBX
be6suWXMhZ+aYmrrAJB4yaXfqghDHC2pAaykH3b1J17FyvZIwWTDEJmcxTMHqaha
i+2l8Elrz/VlqC4ugrm0tzbQ6xIv7BeImUcC3fPKi/Whoxv/oZDbaR+RYDMs0R3n
SL15hhw=
=+XU9
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoYukwACgkQmmx57+YA
GNnagRAAj360SPhrfkWQ94jIqQNoVThrfD3t25JhgSyrrghYQiRf6zTSCRrIDqBl
21JzjbYhT/pd+wvs5AVlyTaY00fVIJtv81s3s5CYUKkonKK0wEG7y7bmHTb2HQIr
1t76jS0Zbj4+HE0eYcvfVQD++sZ+dL1b93HmGDjTm47lsUvdd2OsfaCChS/OkT4H
u9q/PT8iekXgEFY1Q4A6z+D8XE+k0UFsalZaCcEn/eo11ZluBNhX00TsGoIIYDyR
OkfRsyjqc3X3zAqLvsiNUlTtSrAKXXFIUUEGpvEUWRapCgiwqk0XHPSVfExn7C4r
ZSORwoxI7tIbTjNvn9iKZS2KGDUDr8fpe5TKrJvB1IM+4JnzNSU1sSTxrG21jCqA
hvIATBYzHfdE3525A9ANauMbG42zKWvQFwG0/Sege0t6W9Q895jjOH+bH2ku2mFb
/bQ3WF7IsoSFv48u47ISPZJMSZOjC12gJ+NIlouzt0JNmPQJI6Hu5adY9T2OrWtu
ovq6CnZO52V+uTcM6W/VxrdhstUVXJQY3zhbVPSWPu94sfes3a4wmDkZELFy9GdN
SFVZoL1gK4qojrjgLSniBnL0MNpv71coh1cx0j6xonDZqvWrXi9wtx6dpFFTVYyB
UWBZiOFnqeYttwJ1+FZilUsYET9BZt9xpFQBSo2eyVlMGF80n2U=
=M655
-----END PGP SIGNATURE-----
Merge tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 defconfig fixes for v7.1
A number of targets now depends on the M.2 PCIe power sequencing driver,
enable this to keep these devices functional with a defconfig build.
* tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: defconfig: Enable PCI M.2 power sequencing driver
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Add missing power-domain and iface clocks for the ICE node of Eliza and
Milos to avoid the validation errors that resulted from late binding
changes. Also drop the reference clock for the USB QMP PHYs, for the
same reason.
Avoid touching the 20'th I2C bus on the Hamoa-based (X Elite) Dell
laptops, as this conflicts with the battery management firmware.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCgAsFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmoVs6MOHGJqb3JuQGty
eW8uc2UACgkQCx85Pw2ZrcUHfA/+MoI7gAkKpGLLY3z7ZZUfcAgjD9shIaH2lvZU
JV8gvlP3QZSngm65qeADClnzD8aca/ufrp3tyW6aqQUccWmVfXDOV/K2L2DrjYRQ
8m5KyeZD0te1RensSfmHD0J74IhAfwX3vuHFONjRWSQ9+BGRubgnqYoFUG8qVVkZ
jhMFdDerjUBmwiQsg7jkhQrnc3SjUyuXLstB78KBejJ9oHMch0c6Nrt/LeX/T04w
hpldA+C3Gx5R7RAWTNQtQiVzR7uAnbgJci3hmJ0S7prCCPuNEF2lc7pCXaLwqbIu
/wsjorqDE39TizuLkDPdTgD/p9Ec8p4kkBlnNsgVrHuHYYQKP5TF8t/JmllMOy5X
U+96eYNENqjXkezKO69/6yqeTsqJAL1rNRIcauF/p0eNNOU4wUBnEsXnxoxK7TPM
HuvD48wLIxT4eK4nnna7397ZwuSzkvpPxAzL0Opq6nNKMreGVoBLZ2hQgiZXv1tC
zgSmT4Jr4LKk3Cl6RM6KmwpZ1H7DcQB8Me6tgfkApcNbbNLci/D70pG19ikmh8TX
rk18Cfnp5jJvjgE0Hv00PzuzkzmPj2nRTIExb4Cw17n+LAkyC2eifu1O7J7FPW4a
mxkxe9EQ2OdfAKy3jrysvvqMpIK1mH0d1COoKz9OGSrXubYearjhtuC8KaSrZNo5
4LiykTY=
=/uFW
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoYucwACgkQmmx57+YA
GNmK7Q/9EbFgk/uOiiyEhjPVudQHSR/JxEs7TQMbnxEOMmXvQ01FHnRlfqDSgZqx
X+7UgrFCTIHSIaD/O29WIapW2TN4OGBI2gwHNsnWFCWbC8ZZGo+HoU3V7Fjzk8D+
9A36G3myRerPuKWmPHQSkjfQHUKkmWUmLBCXGBwRxd3YZvjSZxeaYfDm+n9CVsIi
NelVe42QvwP2U+FPBYXARfawn9MEBY267s8PBVaqz+k159gqSfCS5UHHQjzg1d/F
KD7FX0y3B7EkmwpyEElapjPiXs3Mi/5PYa+C9f0zCIdRfU6H70cHcNO2OlXucq5A
PQ79gdTI7Om9uH4gdXEyGg7Qcu2+rz0WKjFjyYBxYpgsEpCYxUVjkRpCZ1Nn34iz
nDqVKY3jmiAws1omP51YPtFNLVw3K0ZmKMka1M7AR7wdK2115l+XUSDze6pTXazL
RBO6DCVZEs9AikhOX+p5VF3Brmupf9lhM7Ixml4E/EOu2jAsfTnKe+3v1IUxAQ4K
rllnn19NntTg7qIcuX+jDAkxY0mwD1JVSR4UYAahPHFfOZYqq++gLE9BcT0IGj47
mo6wL01Niu7OU5hkQlq6fJib1tyiv12zdHYOJt7u85eiT6hXwxjx4iMEnTROCTuI
uaV9Qzfd3EujRnwho4QnOrV8bNwcS9GpBXNz/MaYrjOrxEmn9G0=
=qWqc
-----END PGP SIGNATURE-----
Merge tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 DeviceTree fixes for v7.1
Add missing power-domain and iface clocks for the ICE node of Eliza and
Milos to avoid the validation errors that resulted from late binding
changes. Also drop the reference clock for the USB QMP PHYs, for the
same reason.
Avoid touching the 20'th I2C bus on the Hamoa-based (X Elite) Dell
laptops, as this conflicts with the battery management firmware.
* tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
arm64: dts: qcom: x1-dell-thena: remove i2c20 (battery SMBus) and reserve its pins
arm64: dts: qcom: glymur: Drop RPMh CXO clocks from QMP PHYs
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The Qualcomm ICE driver suffers from race conditions between probe() and
get() and will in certain cases return the wrong error code, which
results in storage drivers failing to probe. Fix these issues.
Also correct the DeviceTree binding, to ensure that relevant clocks are
described and voted for, to prevent the driver from accessing unclocked
hardware during boot.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCgAsFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmoVsj8OHGJqb3JuQGty
eW8uc2UACgkQCx85Pw2ZrcVLlhAAjW6WvbknpvjbJEwVjsFCdi9wtplLibtTBU+q
7HcPCdtN+kWSu1H1F6b/B5ugpfdroy+UK0P6tV2glOQW5h5JIVwCktT4ucPgSrld
UyAcx9EuupyYFy/qCLz6gZ82JMAjzKyFDa2F4IEp3VbAUFneVOgg87XXDPvD8tmU
9VsbLsXJYUaFGixFqpuG2fzSZGH6GSSd493HziyKZL43kskfH4cJOQ6LiQL5RhhE
GuEs9Oqerv+uisS3dq4nqf7pbZEP+SetlQOuj/LcGP0gibZf4W0gvx0RuN9gj0oJ
THeOdn1cf3mEY3wnCwrCTm4t5qn+k85EP5nVN2gmQTSgU/QIkDYNeNUydFn0rvPJ
LlaSMaU5zDI68s+xXfm+8uGZecdRV97VLifUc0OBBh2pdy3gXgNlmctyEKES/lYR
kUVakLcFszElfdsChRNw230MzM5wm9wjsag3KtBua5pqiGiMsfVgU59ejBR9tI6C
tQZrPZotk4mMcljO1f9GO5swJKt5mznQ1XVabTz8NzNHJycBkCJSIA1IDxzjATt0
9IQqtK8fULVtD5gPpwF1tEigIEy0mL1QDv4qDZWS2KkElYOrRWQVwRoP1Ba3ZqSI
gnBTVlULNftodxbLW32ujgBDBQoEAeNBkA9Nfy9HTiYapbCqrlr7fne29+YTj2lz
dR7+CXw=
=QdJR
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoYt/4ACgkQmmx57+YA
GNkgIA//S0tL8vyZ4kVKetG1ZUSSTZkX1IzmNLyiroaIk1mIFIepnaLI37MU6Yiq
RZUZsXPsHX2mPOVXSqIA0imwKn5fAVR8HgoQE8Tng0U2t9KTEW9NCgvPjLWlwpPe
kKZlxnk3wxoD+tlEBYisjc44MLLo2pS+lrrFGd9Ki03r+Gt+jnfb/icxMnHYRSIu
fBrsPaAzPsGuIzGUrNJcm0FzDMcLq2SJ9cIka8QUQSJR0+rugvC0thZxExAo1zt8
UnlMrpTDcuCTzxgVdiQMj9ThhplvAqobOrgadq6h91OQtPKwGaFQ8cXnTBvX1ILk
tFhNKTNp3hOxbPYy4km5+TCfmjI1X4b6i9hQGeP/3FTsB8L/twYQd135SDeV6Mmy
/zgDWFd2ZyEI/DtMauOTFs8bt3Cq9Lpq0wzwOxXdknZapXDbhdidIJloX+bAmS0D
BI6uRXEnEJQPWeHd452gJSJ9tJLqYRbSiw0KH0T430YnhwN4wx5owMyFPmGawVUj
D6AuGaWwpTtX1velVQD6cGGMxjw2RYfMw4tNWsxB9c6BoB8dR+K5jXGi0vfE5Hk0
6Lh09zvsc2hfIlT5gELMEkiKbEyrkAD48wUh9ZAZ1LMwIChZ6sq2JnTtaZzufetX
HdnsUDOVCgcrjb4UEMDUq6VNwXM2M/7PKmS2q/rRdDrU7OP7hhM=
=o97B
-----END PGP SIGNATURE-----
Merge tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm driver fixes for v7.1
The Qualcomm ICE driver suffers from race conditions between probe() and
get() and will in certain cases return the wrong error code, which
results in storage drivers failing to probe. Fix these issues.
Also correct the DeviceTree binding, to ensure that relevant clocks are
described and voted for, to prevent the driver from accessing unclocked
hardware during boot.
* tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found
scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get()
mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get()
soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL
soc: qcom: ice: Return -ENODEV if the ICE platform device is not found
soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get()
soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE
dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix three issues in the ACPI button driver: a possible crash due to
a button press after unloading the driver (introduced during the 6.15
development cycle), function keys breakage on Toshiba Tecra X40 due to
missing ACPI events (introduced during the 7.0 development cycle), and
a missing probe rollback path item that has not been added by mistake
during a recent update.
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmoYhl8SHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1mfsH/A4/8aBIuxemIAoyoAQwwjQGoIkdhDDV
vyHBpOxzXwlOB0dCi/vuVdkMX+ADt2/0roh82mnXRiIv47CF/ePrQSnLiIvtN32b
cY7xbSN8n7d2Vjp5k3cWBU/CL1tCGq6c/DEnFzMwtzhkt7qCuaDDfedqjg1/dZvg
IrbexlAveRO9Qhtep8wCrxJsmZULXFcQe2qSMS82UkbqeJzR4lz3ovLocdEQqP9Q
YRipKrv4M5/3H63ajbxJQZaP87WY731tM4Ny2zxO05dek9b70o+dHyjRGMRqqYZD
eTKo8ck90aLNsjiXXC6eqVOkcxZv3kbLkpY4Exblmmp0lBnzszhRe0w=
=Th+J
-----END PGP SIGNATURE-----
Merge tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
"Fix three issues in the ACPI button driver: a possible crash due to a
button press after unloading the driver (introduced during the 6.15
development cycle), function keys breakage on Toshiba Tecra X40 due to
missing ACPI events (introduced during the 7.0 development cycle), and
a missing probe rollback path item that has not been added by mistake
during a recent update"
* tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: button: Add missing device class clearing on probe failures
ACPI: button: Enable wakeup GPEs for ACPI buttons at probe time
ACPI: button: Fix ACPI GPE handler leak during removal
Fix a possible amd-pstate-ut cpufreq driver crash introduced by
a recent update (K Prateek Nayak)
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmoYhVgSHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1mRIIAJhyAd7iABFrh2uHIGQjF4wb4BfRQ/Sk
bStrkwQf4mqTQ8QCIzz7cS882VytJvvMtgUm/0KjedfwV4ZHtJbu1YKuAzKyKxVH
dz7wYpUKXmhoQnPjij3xG5UK++khq14PyUKKyHGiUAtJmnhvEaWu07kh5K2pC51y
uuJ9z6v+JXiu0m2H/nXtEAkzwnWwqdGSl16f0hFUGJlKeMeA6MStP/vZVomo7hiX
G6mYJVS2eYMhjcpetm055X6zMg5IgHdYVhGOlC6G+HPB6/zs5/VZUfhibmfzi1cj
yolk8WbjoQDwVXStYofI6YFFv/Bw+B6mblYGb0HwQcUec58taARMSmY=
=Ha7c
-----END PGP SIGNATURE-----
Merge tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a possible amd-pstate-ut cpufreq driver crash introduced by a
recent update (K Prateek Nayak)"
* tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq/amd-pstate-ut: Disable dynamic_epp after the mode switch
Current release - regressions:
- netfilter: walk fib6_siblings under RCU
Previous releases - regressions:
- netlink: fix sending unassigned nsid after assigned one
- bridge: fix sleep in atomic context in netlink path
- sched: fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
- ipv4: fix net->ipv4.sysctl_local_reserved_ports UaF
- eth: tun: free page on short-frame rejection in tun_xdp_one()
Previous releases - always broken:
- skbuff: fix missing zerocopy reference in pskb_carve helpers
- handshake: drain pending requests at net namespace exit
- ethtool:
- rss: avoid modifying the RSS context response
- module: avoid leaking a netdev ref on module flash errors
- coalesce: cap profile updates at NET_DIM_PARAMS_NUM_PROFILES
- netfilter: fix dst corruption in same register operation
- nfc: hci: fix out-of-bounds read in HCP header parsing
- ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo()
- eth: vti: use ip6_tnl.net in vti6_changelink().
- eth: vxlan: do not reuse cached ip_hdr() value after skb_tunnel_check_pmtu()
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmoYVTISHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkYfoP/jBxdUf2IirOjl/vjJFm7cXzcCdTWreb
HmlvVRPF0YDuwQEjaZA+Ed/+wi0QIiyckI60Ltpfz9DbSm3ugstfUxPNWKVb5HZQ
TI1diAa+uTmaXndC5Kb56U/KNMcMZOJ0FZwHheU2mC/7USpB9S/gaGYf2vxCOF9B
huMrCuvoHhASxaL6W1xyYR3P4ouGS9XoQU/sGRWAynpi45BZdFF/Y8W2YrCk0IKc
SwkWbId2Ek6/2+f3pWKYbE88UEjpNh2U6K+kcAgy/UN3N0+tb91kuOrn/5Z+WjE7
3ZdEBvALj6K0P7BxsR64M1ikVgm2KcZAn8UH5UOqkzlP3VGWHYbbk/4KvEGD1oJF
p0lauztIkPPdq16Dau8v+KHw5UU4vBpEDo3323hh7kcSIu7cJkWSVxo7/WDjokzT
HlIZtzKpXwCUSSCNmV3y3zXR/Xl41HOzU5lZv6f8P2hkMfyIu9te9lXF6Foc6r2u
Ng0oVkevURpGhqpKQKxRtaApPrfOCYFkN4aVzvm5haxhFcughJZmQcjVbu03l4CM
/nddhYop7D2NdnZzSdlBO1bK/KBebZCYlSKZJGjdL7zqIOQAjjw9UoW0rU+84pkU
dcvFBPm+iWAhvwWEGaUrnuNcYth/umNMTzC4domLUyPrVydSUH0zi0RQYc9mXffR
EvWEj952b4o0
=IBwj
-----END PGP SIGNATURE-----
Merge tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"This is again significantly bigger than the same point into the
previous cycle, but at least smaller than last week.
I'm not aware of any pending regression for the current cycle.
Including fixes from netfilter.
Current release - regressions:
- netfilter: walk fib6_siblings under RCU
Previous releases - regressions:
- netlink: fix sending unassigned nsid after assigned one
- bridge: fix sleep in atomic context in netlink path
- sched: fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
- ipv4: fix net->ipv4.sysctl_local_reserved_ports UaF
- eth: tun: free page on short-frame rejection in tun_xdp_one()
Previous releases - always broken:
- skbuff: fix missing zerocopy reference in pskb_carve helpers
- handshake: drain pending requests at net namespace exit
- ethtool:
- rss: avoid modifying the RSS context response
- module: avoid leaking a netdev ref on module flash errors
- coalesce: cap profile updates at NET_DIM_PARAMS_NUM_PROFILES
- netfilter: fix dst corruption in same register operation
- nfc: hci: fix out-of-bounds read in HCP header parsing
- ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo()
- eth:
- vti: use ip6_tnl.net in vti6_changelink().
- vxlan: do not reuse cached ip_hdr() value after
skb_tunnel_check_pmtu()"
* tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
dpll: zl3073x: make frequency monitor a per-device attribute
dpll: zl3073x: use __dpll_device_change_ntf() and remove change_work
dpll: export __dpll_device_change_ntf() for use under dpll_lock
net/handshake: Drain pending requests at net namespace exit
net/handshake: Verify file-reference balance in submit paths
net/handshake: Close the submit-side sock_hold race
net/handshake: hand off the pinned file reference to accept_doit
net/handshake: Take a long-lived file reference at submit
net/handshake: Pass negative errno through handshake_complete()
nvme-tcp: store negative errno in queue->tls_err
net/handshake: Use spin_lock_bh for hn_lock
net: skbuff: fix missing zerocopy reference in pskb_carve helpers
net: hibmcge: move dma_rmb() after dma_sync_single_for_cpu() in RX path
net: hibmcge: disable Relaxed Ordering to fix RX packet corruption
selftests/tc-testing: Add netem test case exercising loops
selftests/tc-testing: Add mirred test cases exercising loops
net/sched: act_mirred: Fix return code in early mirred redirect error paths
net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow
net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
net/sched: fix packet loop on netem when duplicate is on
...
Nicholas Carlini reports that the keyring code calls assoc_array_find()
in find_key_to_update() without holding the RCU read lock, while the
assoc_array_gc() code really is designed around removing the node from
the tree and then freeing it after an RCU grace-period.
The regular key handling doesn't see this because holding the keyring
semaphore hides any lifetime issues, but the persistent key handling
uses a different model.
Instead of extending the keyring locking, just do the simple RCU locking
that the assoc_array was designed for.
Reported-by: Nicholas Carlini <npc@anthropic.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wacom_hid_set_device_mode() currently assumes that the HID_DG_INPUTMODE
usage is always located in the first field (field[0]) of the feature report.
However, a device can specify HID_DG_INPUTMODE in a different field.
If HID_DG_INPUTMODE is in a field other than the first one and the first
field has a report_count smaller than the usage_index of HID_DG_INPUTMODE,
this leads to an out-of-bounds write to r->field[0]->value.
Fix this by storing the field index of HID_DG_INPUTMODE in 'struct
hid_data' during feature mapping. In wacom_hid_set_device_mode(), use
this stored field index to access the correct field and add bounds
checks to ensure both the field index and the value index are within
valid ranges before writing.
Cc: stable@vger.kernel.org
Fixes: 5ae6e89f74 ("HID: wacom: implement the finger part of the HID generic handling")
Tested-by: Ping Cheng <ping.cheng@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Once FD_ADD() returns, the fd is live in the file descriptor table
and a thread sharing that table can close() it before DMA_BUF_TRACE()
runs. The close drops the last reference, __fput() frees the dma_buf,
and the tracepoint then dereferences dmabuf to take dmabuf->name_lock
-- slab-use-after-free.
Split FD_ADD() back into get_unused_fd_flags() + fd_install() and
emit the tracepoint between them. While the fdtable slot is reserved
with a NULL file pointer, a racing close() returns -EBADF without
entering __fput(), so the dma_buf stays alive across the trace. Same
approach as commit 2d76319c4c ("dma-buf: fix UAF in dma_buf_put()
tracepoint").
This undoes the FD_ADD() conversion done in commit 34dfce523c
("dma: convert dma_buf_fd() to FD_ADD()"); FD_ADD() has no place to
hook the tracepoint safely.
Reported-by: syzbot+7f4987d0afb97dd090cb@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7f4987d0afb97dd090cb
Fixes: 281a226314 ("dma-buf: add some tracepoints to debug.")
Cc: stable@vger.kernel.org # 7.0.x
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260523181446.69525-1-devnexen@gmail.com
Prevent _regmap_update_bits() from accessing hardware when the register
map is in cache-only mode.
Unlike regmap_raw_read() and _regmap_read(), the volatile
_regmap_update_bits() fast path bypasses the cache_only check. This can
result in unexpected hardware accesses while the device is suspended.
Return -EBUSY to ensure behavior is consistent with other cache-only
access paths.
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260528053204.46783-1-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Clang recently added support for -Wattribute-alias [1], which results in
the same warnings that necessitated commit bee2003177 ("disable
-Wattribute-alias warning for SYSCALL_DEFINEx()") for GCC.
kernel/time/itimer.c:325:1: error: alias and aliasee have different types 'long (unsigned int)' and 'long (typeof (__builtin_choose_expr((__builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0LL)) || __builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0ULL))), 0LL, 0L)))' (aka 'long (long)') [-Werror,-Wattribute-alias]
325 | SYSCALL_DEFINE1(alarm, unsigned int, seconds)
| ^
include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
| ^
include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
| ^
include/linux/syscalls.h:251:18: note: expanded from macro '__SYSCALL_DEFINEx'
251 | __attribute__((alias(__stringify(__se_sys##name)))); \
| ^
kernel/time/itimer.c:325:1: note: aliasee is declared here
include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
| ^
include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
| ^
include/linux/syscalls.h:255:18: note: expanded from macro '__SYSCALL_DEFINEx'
255 | asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
| ^
<scratch space>:16:1: note: expanded from here
16 | __se_sys_alarm
| ^
Disable the warnings in the same way for clang-23 and newer. Disable the
warning about unknown warning options to avoid breaking the build for
versions of clang-23 that do not have -Wattribute-alias, such as ones
deployed by vendors like Android or CI systems or when bisecting LLVM
between llvmorg-23-init and release/23.x.
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2163
Link: 40da6920a0 [1]
Link: https://patch.msgid.link/20260515-syscall-disable-attribute-alias-for-clang-v1-1-9a9d95d41df6@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Prevent possible use after free in supplicant communication.
-----BEGIN PGP SIGNATURE-----
iQJOBAABCgA4FiEE0qerISgy2SKkqO79Wr/6JGat8H4FAmoPEQQaHGplbnMud2lr
bGFuZGVyQGxpbmFyby5vcmcACgkQWr/6JGat8H6i7g/+IKeDvegjBiud2okBgEq3
5yMYhmHP/ycgnAaQkpqvtZP41I4ar0LAbresNWEvmuAR71rirWSCDKRHbm0xv8Ww
UJ1xyDxnIdNQe8sBBqrYe4QBq80l9gKpX5Tmi8eQkFGfDX+DWet0fdYBW/pKoSaI
M/ciBORdbwYFPPb7hGBpuZ/tUuFNlRo4tcW14mM9v64m5pOe94RR92Dshf8fCwWk
+4HmMBldBCUHz1j4hjzwEunq3aMVFHobUozssk3A0eLIpEPla0OqeO4vT1UpskfL
nXiJCvioKH/X6/xKzhuumPN7Bvkq1eMZAQy+4N4wNJShzJJtqokAjDBUjc42576l
7xKoNNa+mSNfrpu9RABC0hdpcTRxIbcgE70tViyeUIPOfhryAAlhaUe/iN1O4HYf
dfwqWVGwosGUbXb8vZUvggR1GmLSLrN5H/PhXg7hb6RDveDD2eSgHWsh/+CGDrYq
quWOamOeodblIrjOnRKsBVLuBLXJ3muIsED9HrCfChlS/EMGQYSWqr7KP+MIBQuQ
Q9TzrBZ0j7m2MOwEzmLssa5nmbIxB7bnVOO/46xqQv4ktRC335/IYVpsQrD0AJ2B
XM4raWikXLNIxNAwkzzDzP2Tu67zKxNtiPi1z4S0UeMY90NSJZ5Z4E7AEe8iSL1P
Z3oXjhxDyv7aJFzYnxBApgI=
=aOSI
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoYQvUACgkQmmx57+YA
GNkRcxAAjiVXyEFwLErRTJlQ9Tq00axugkt0YJW5LxDixuqXjfmPp6TpsATm1Np7
Xx2eplkEhMZiIgP0nDQkT/31kFlLUd3nOQiJv9Cz8ZwwZulCWMlMeTPZ773MwlmO
rIuo+cBaQhkYp3e3mglWLuAXR4MtW4GQzZ42jRXYvOz+3HrG3Zb8C/CfOQfMlFU0
17M+HMBq6FAGjJEhH1sZEb19WEE87ZiH2aee1+HQ+FgzfDtW0QD+9bhlxpIsFkak
NRKoD58ixobSy1SuMSPQ4liK76VEdP6UQDZ2R4IyhO6/QM1ZE5JRpAf9RWAy4Mqi
4gcHrBXdME2hJUrXLtV6PdQMXPq9+/yphs+JWXwRfns+uL9ltDYwxk14Tc+hshGk
jM0qcfPvF/s/6Med+oCnSGQn8CYYOblwi/qhpWLqyl6Rrpag1kkTqVXmOsJ0J71m
J59fIEjqcYkv8y+46Pj3ZxNXVgXlhREpsp/gg/FAkJ5bNpKxWqdqPba4gy/rhZFh
x2TOpfzMHMQS9MrfbLtlu4HshbwDqEyiThTgUc5c1njYwb3Mao//QxwmjLBQesWY
sZSfl+26rF/fyPcaHne5nZ3JSKE0FGxnHot+jPDS6OURAhF8AwhKFjLnzIFnmWhy
kqsrVQCrDySP1rzm23A9pFcpx7mlR4tZM4q4xRb4DujPHiIYK9w=
=jbl0
-----END PGP SIGNATURE-----
Merge tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes
OP-TEE fix for v7.1
Prevent possible use after free in supplicant communication.
* tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
tee: optee: prevent use-after-free when the client exits before the supplicant
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Address several teardown issues and resource leaks in the driver's remove
path and error handling:
1. Debounce clock reference leak: The debounce clock (bank->db_clk) is
obtained using of_clk_get() which increments the clock's reference
count, but clk_put() is never called. Register a devm action to
cleanly release it on unbind. Note that of_clk_get(..., 1) remains
necessary over devm_clk_get() because the DT binding does not define
clock-names, precluding name-based lookup.
2. Unregistered chained IRQ handler: The chained IRQ handler is not
disconnected in remove(). If a stray interrupt fires after the driver
is removed, the kernel attempts to execute a stale handler, leading
to a panic. Fix this by clearing the handler in remove().
3. IRQ domain leak: The linear IRQ domain and its generic chips are
allocated manually during probe but never removed. Remove the IRQ
domain during driver teardown to free the associated generic chips
and mappings.
Fixes: 936ee2675e ("gpio/rockchip: add driver for rockchip gpio")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
Link: https://patch.msgid.link/20260526171050.12785-3-scardracs@disroot.org
[Bartosz: don't emit an error message on devres allocation failure]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
The bank->clk was previously obtained via of_clk_get() and manually
prepared/enabled. However, it was missing a corresponding clk_put() in
both the error paths and the remove function, leading to a reference leak.
Convert the allocation to devm_clk_get_enabled(), which also properly
propagates failures from clk_prepare_enable() that were previously ignored.
The GPIO bank device uses the same OF node as the previous of_clk_get()
call, so devm_clk_get_enabled(dev, NULL) correctly resolves the same
clock provider entry.
Fix the reference leak and simplify the code by removing the manual
clk_disable_unprepare() calls in the probe error paths and in the
remove function.
Fixes: 936ee2675e ("gpio/rockchip: add driver for rockchip gpio")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
Link: https://patch.msgid.link/20260526171050.12785-2-scardracs@disroot.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
If *ppos is non-zero (user-space write split over multiple calls to
write()) then simple_write_to_buffer() won't initialize the start of the
buffer. Really, non-zero values for *ppos aren't going to work at all.
Check for that and return -EINVAL at the start of the function.
Fixes: 91581c4b3f ("gpio: virtuser: new virtual testing driver for the GPIO API")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://patch.msgid.link/ahP3BJWWy-m_qI0X@stanley.mountain
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
By the time gpio_device_teardown_shared() is called, the parent device
is gone from the global list of GPIO devices and all outstanding SRCU
read-side critical sections have completed. That means that no
concurrent gpio_find_and_request() can call
gpio_shared_add_proxy_lookup() for this device at this time. There's
also no risk of the parent device being re-bound to the driver before
the unbinding completes (including the child devices).
Lockdep produces a false-positive report about a possible circular
dependency as it doesn't know the ordering guarantee. Not taking the
ref->lock in gpio_device_teardown_shared() silences it and is safe to do.
Cc: stable@vger.kernel.org
Fixes: ea513dd3c0 ("gpio: shared: make locking more fine-grained")
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-2-76bca088f8c0@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Commit 710abda580 ("gpio: shared: call gpio_chip::of_xlate() if set")
used the mutex embedded in struct gpio_shared_entry to protect the
offset field which now can be modified after assignment. The critical
section however is too wide and introduced a potential deadlock on the
removal of the shared GPIO proxy's parent.
Make the critical section shorter - only protect the offset when it's
being read.
While at it: mention the fact that the entry lock is now also used to
protect against concurrent access to the offset field in the structure's
documentation.
Cc: stable@vger.kernel.org
Fixes: 710abda580 ("gpio: shared: call gpio_chip::of_xlate() if set")
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-1-76bca088f8c0@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Before resetting or closing the device, protocol counters should also be
zeroed.
Fixes: d0b137062b ("Bluetooth: hci_sync: Rework init stages")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Since hci_dev_close_sync() can now be called during the reset path, we
should also set HCI_CMD_DRAIN_WORKQUEUE. This avoids queuing timeouts
while the hdev workqueue is being drained.
Fixes: 877afadad2 ("Bluetooth: When HCI work queue is drained, only queue chained work")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The current HCI reset function in hci_core.c duplicates most of the work
done by hci_dev_close_sync(), and doesn't handle LE, advertising or
discovery.
Instead of porting these to hci_dev_do_reset(), directly call the
close/open functions from hci_sync to reset the hdev. MGMT now notifies
when a user performs a reset.
Suggested-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
iso_sock_close() calls iso_sock_clear_timer() before acquiring
lock_sock(sk).
iso_sock_clear_timer() reads iso_pi(sk)->conn twice without the
socket lock held:
if (!iso_pi(sk)->conn)
return;
cancel_delayed_work(&iso_pi(sk)->conn->timeout_work);
Concurrently, iso_conn_del() executes under lock_sock(sk) and calls
iso_chan_del(), which sets iso_pi(sk)->conn to NULL and may result in
the final reference to the connection being dropped:
CPU0 CPU1
---- ----
iso_sock_clear_timer()
if (conn != NULL) ... lock_sock(sk)
iso_chan_del()
iso_pi(sk)->conn = NULL
cancel_delayed_work(conn) /* NULL deref or UAF */
iso_pi(sk)->conn is not stable across the unlock window, causing a
NULL pointer dereference or use-after-free.
Serialize iso_sock_clear_timer() with the socket lock by moving it
inside lock_sock()/release_sock(), matching the pattern used in
iso_conn_del() and all other call sites.
Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
iso_recv_frame reads conn->sk under iso_conn_lock but releases the lock
before using sk, with no reference held. A concurrent iso_sock_kill()
can free sk in that window, causing use-after-free on sk->sk_state and
sock_queue_rcv_skb().
Fix by replacing the bare pointer read with iso_sock_hold(conn), which
calls sock_hold() while the spinlock is held, atomically elevating the
refcount before the lock drops. Add a drop_put label so sock_put() is
called on all exit paths where the hold succeeded.
Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
If dcid is received for an already-assigned destination CID the spec
requires that both channels to be discarded, but calling l2cap_chan_del
may invalidate the tmp cursor created by list_for_each_entry_safe and
in fact it is the wrong procedure as the chan->dcid may be assigned
previously it really needs to be disconnected.
Calling l2cap_chan_clone directly may still lead to l2cap_chan_del so
instead schedule l2cap_chan_timeout with delay 0 to close the channel
asynchronously.
Fixes: 15f02b9105 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
l2cap_ecred_reconf_rsp() returns early on success without clearing
chan->ident. Every other L2CAP response handler (l2cap_ecred_conn_rsp,
l2cap_le_connect_rsp, l2cap_config_rsp) clears chan->ident after a
successful transaction to prevent the channel from matching subsequent
responses with the recycled ident value.
A remote attacker that completed a reconfiguration as the peer can
replay a failure response with the stale ident, causing the kernel to
match and destroy the already-established channel via
l2cap_chan_del(chan, ECONNRESET).
Clear chan->ident for all matching channels on success, and harden the
failure path by using l2cap_chan_hold_unless_zero() consistent with
other L2CAP handlers (l2cap_le_command_rej, __l2cap_get_chan_by_ident).
Fixes: 15f02b9105 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
spi_mem_supports_op() accepts a const struct spi_mem_op pointer but
casts away const internally to call spi_mem_adjust_op_freq(). This
mutates the caller's op template, which causes stale max_freq values
when callers reuse persistent templates - subsequent calls won't
re-apply the device frequency cap since spi_mem_adjust_op_freq()
skips non-zero values.
Fix by operating on a stack-local copy instead.
Fixes: a4f8e70d75 ("spi: spi-mem: add spi_mem_adjust_op_freq() in spi_mem_supports_op()")
Cc: Tianyu Xu <xtydtc@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://patch.msgid.link/20260527173736.2243004-1-s-k6@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Ivan Vecera says:
====================
dpll: zl3073x: various fixes
Three fixes for the zl3073x DPLL driver.
Patch 1 exports __dpll_device_change_ntf() for use by drivers that
need to send device change notifications from within callbacks
already running under dpll_lock.
Patch 2 replaces the change_work workqueue mechanism with direct
calls to __dpll_device_change_ntf(), eliminating a race condition
where the work handler could dereference a freed dpll_dev pointer
during device teardown.
Patch 3 moves the freq_monitor flag from per-DPLL to per-device
scope to match the hardware behavior where frequency measurement
registers are shared across all DPLL channels.
====================
Link: https://patch.msgid.link/20260526074525.1451008-1-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The frequency monitoring feature uses shared hardware registers
that measure input reference frequencies independently of
individual DPLL channels. However, the freq_monitor flag was
incorrectly placed in the per-DPLL structure, causing each
channel to track its own enable/disable state independently.
Since the DPLL core calls measured_freq_get() only for the first
pin registration, the measured_freq_check() in the periodic worker
was gated by the per-DPLL freq_monitor flag of whichever channel
happens to be checked. If the first DPLL channel had frequency
monitoring disabled while another had it enabled, measurements
were never reported.
Move freq_monitor from struct zl3073x_dpll to struct zl3073x_dev
so all DPLL channels share a single flag, matching the hardware
behavior. Update freq_monitor_set() to notify other DPLL devices
about the change (like phase_offset_avg_factor_set() already does)
and remove the mode-dependent guard in zl3073x_dpll_changes_check()
since all input pin monitoring (pin state, phase offset, FFO, and
measured frequency) works correctly in all DPLL modes.
Fixes: bfc923b642 ("dpll: zl3073x: implement frequency monitoring")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260526074525.1451008-4-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The change_work was introduced to send device change notifications
from DPLL device callbacks without deadlocking on dpll_lock, since
the callbacks are already invoked under that lock. Now that
__dpll_device_change_ntf() is exported for callers that already
hold dpll_lock, use it directly and remove the change_work
infrastructure entirely.
This eliminates a race condition where change_work could be
re-scheduled after cancel_work_sync() during device teardown,
potentially causing the handler to dereference a freed or NULL
dpll_dev pointer.
Fixes: 9363b48376 ("dpll: zl3073x: Allow to configure phase offset averaging factor")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260526074525.1451008-3-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Export __dpll_device_change_ntf() so that drivers can send device
change notifications from within device callbacks, which are already
called under dpll_lock. Using dpll_device_change_ntf() in that
context would deadlock.
Add lockdep_assert_held() to catch misuse without the lock held.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260526074525.1451008-2-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vasant has a long history of providing valuable feedback and testing
results for the AMD IOMMU code. Still, too often he gets not Cc'ed on
code changes, so make his reviewer status official.
Acked-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This round of fixes is mostly Sirini's Qualcomm cleanups that have been
in review for a while, we also have a couple of small fixes from Cássio.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmoYKn4ACgkQJNaLcl1U
h9DlaQf+Oo/7gs3X3tdCWaAP0F4rLa+7JO6tjTLnPdvr8nvbbHMUc93Prm8pgf6z
w3BDVU3xoP4k6te8qKc0p4RzCoVBSSUrabdT6C0fZyr7ToQpMn+JSOU9onVYtAm/
XdO7+W5HLUnB1XV5jbA3oiWGUibOOQhqGO9eQMswSrY4kTQlFpAKTIRgPB11lF8t
FJTCG9g/RVmnNxTU9VdpkLGJpAINy9GtduayGOocAKiaoqtWwciPh+9YhHAhi+b4
uXac71EuIrPeTeIxdAJCe3f7tFHqehidrhXQiX+2jFESeQ3q9XlGB14p8Yu8MNNh
l7irc2BzLK2MxNgUb9TvJDPbpOMTxw==
=MhQh
-----END PGP SIGNATURE-----
Merge tag 'asoc-fix-v7.1-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v7.1
This round of fixes is mostly Sirini's Qualcomm cleanups that have been
in review for a while, we also have a couple of small fixes from Cássio.
Chuck Lever says:
====================
net/handshake: anchor request lifetime to a pinned file reference
handshake_nl_accept_doit() has accumulated four follow-on fixes
since 3b3009ea8a ("net/handshake: Create a NETLINK service for
handling handshake requests"): 7ea9c1ec66, 7798b59409,
fe67b063f6, and dabac51b81. Each was a local refcount or
NULL-check correction; none moved where the file reference is
owned, and the same code keeps producing the same class of bug.
Reworking the ownership is what breaks the pattern.
For the duration of a request, sock->file has no single owner.
Submit publishes the request without taking a file reference;
accept_doit acquires one inside the handler, after the request
has already left the pending list. The consumer can drop its
own reference at any time, including the moment between
handshake_req_next() popping the request and accept_doit
reaching get_file(). The submit-side sock_hold() pins only
struct sock; struct socket and sock->file remain under the
consumer's control via the file descriptor.
This series places the file reference under unambiguous
ownership. handshake_req_submit() pins it on the request and
completion or cancel drops it (patches 4-5); the submit-side
sock_hold() then becomes redundant, and dropping it also closes
a publish-before-pin race the late sock_hold itself opened
(patch 6). The handshake_complete() API and its consumers move
to a uniform negative-errno sign convention (patch 3), with the
matching sign correction in nvme-tcp (patch 2). Patch 1
hardens hn_lock for BH context, the netns-exit drain fix
builds on the new file-pin infrastructure (patch 8), and new
KUnit file-count assertions verify the refcount contract
(patch 7).
Three things in this restructuring want a careful look. In
handshake_complete(), the fput() of the request's file
reference has to come after hp_done() -- fput() can transitively
run handshake_sk_destruct() and free the request, so the patch
stashes hr_file in a local first. handshake_sk_destruct()
itself is kept on purpose: it owns rhashtable removal and
kfree, and remains the backstop if a consumer path bypasses
handshake_complete() entirely. Third, handshake_req_next() now
returns its request with an extra get_file() held under
hn_lock; accept_doit must consume that reference (FD_PREPARE on
success, explicit fput on the fdf.err path), and any future
caller has to honor the same contract.
v2: https://patch.msgid.link/20260521-handshake-file-pin-v2-0-b9dadc472840@oracle.com
v1: https://patch.msgid.link/20260518-handshake-file-pin-v1-0-4bbcb7e62fda@oracle.com
====================
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-0-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The arguments to list_splice_init() in handshake_net_exit() are
reversed. The call moves the local empty "requests" list onto
hn->hn_requests, leaving the local list empty, so the subsequent
drain loop runs zero iterations. Pending handshake requests that
had not yet been accepted are not torn down when the net namespace
is destroyed; each one keeps a reference on a socket file and on
the handshake_req allocation.
Pass the source and destination in the documented order
(list_splice_init(list, head) moves list onto head) so the pending
list is transferred to the local scratch list and drained through
handshake_complete().
Fixing the splice direction exposes a list-corruption race. After
the splice each req->hr_list still has non-empty link pointers,
threading the stack-local scratch list rather than hn_requests.
A concurrent handshake_req_cancel() -- for example, from sunrpc's
TLS timeout on a kernel socket whose netns reference was not
taken -- finds the request through the rhashtable, calls
remove_pending(), and sees !list_empty(&req->hr_list).
__remove_pending_locked() then list_del_init()s an entry off the
scratch list while the drain iterates, corrupting it. The same
call arriving after the drain loop has run list_del() on an
entry hits LIST_POISON instead.
Have remove_pending() check HANDSHAKE_F_NET_DRAINING under
hn_lock and report not-found when drain is in progress. The
drain has already taken ownership; handshake_complete()'s existing
test_and_set on HANDSHAKE_F_REQ_COMPLETED still arbitrates
between drain and cancel for who calls the consumer's hp_done. Use
list_del_init() rather than list_del() in the drain so req->hr_list
does not carry LIST_POISON after drain releases the entry.
The DRAINING guard in remove_pending() makes cancel return false,
but cancel still falls through to test_and_set_bit on
HANDSHAKE_F_REQ_COMPLETED and drops the request's hr_file reference.
Without another pin, if that is the last reference, sk_destruct frees
the request while it is still linked on the drain loop's local list.
Pin each request's hr_file under hn_lock before releasing the list,
and drop that drain pin after the loop finishes with the request.
Fixes: 3b3009ea8a ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-8-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The new file-reference contract on struct handshake_req is silently
breakable: a missing get_file() at submit or a missing fput() on an
error path leaves the file leaked but does not crash the test, so
the existing absence-of-crash checks pass either way.
Snapshot file_count(filp) before each handshake_req_submit() in
the submit-success, EAGAIN, EBUSY, and cancel tests, and assert
the expected balance after submit and again after cancel. The
already-completed cancel test also asserts the post-complete
balance, which pins down that handshake_complete() drops the
reference and that the subsequent cancel does not double-fput.
The destroy test gets the same treatment before __fput_sync(),
which double-checks that cancel's fput() ran and the only
remaining reference is the one sock_alloc_file() established.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-7-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
handshake_req_submit() publishes the request via
handshake_req_hash_add() and __add_pending_locked(), drops
hn_lock, and calls handshake_genl_notify() (which can sleep)
before taking sock_hold() on req->hr_sk. A fast tlshd ACCEPT
followed by DONE can drive handshake_complete()'s sock_put()
into the window between the spin_unlock and the late
sock_hold(); on a system where the consumer's fd held the
only sk reference, the late sock_hold() then operates on an
sk whose refcount has reached zero.
The preceding two patches install an explicit file reference
on struct handshake_req. That file pins sock->file, which
pins the embedded struct socket, which defers inet_release()'s
sock_put(). As long as hr_file is held, sk cannot reach refcount
zero from the consumer side, and the submit-side sock_hold()
with its matching sock_put() calls in handshake_complete() and
handshake_req_cancel() is now redundant.
Drop all three. The file reference already keeps each request's
socket alive, and the lifetime story is contained in a single
get_file()/fput() pair.
Fixes: 3b3009ea8a ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-6-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
handshake_req_next() removes the request from the per-net
pending list and drops hn_lock before handshake_nl_accept_doit()
reads req->hr_sk->sk_socket and dereferences sock->file (once in
FD_PREPARE() and again in get_file()). In that window a
consumer running tls_handshake_cancel() followed by sockfd_put()
(svc_sock_free) or __fput_sync() (xs_reset_transport) releases
sock->file. sock_release() then runs sock_orphan(), zeroing
sk_socket, and frees the struct socket. The accept-side code
either reads NULL through sk_socket or chases freed memory.
The submit-side sock_hold() does not prevent this. sk_refcnt
protects struct sock, but struct socket and sock->file are
independently refcounted via the file descriptor the consumer
owns. Pinning sk leaves sock and sock->file unprotected.
Retarget the accept-side dereferences at req->hr_file, which was
pinned at submit time, instead of req->hr_sk->sk_socket->file.
Pinning on its own is not sufficient: a consumer that cancels
between handshake_req_next() returning and accept_doit reaching
FD_PREPARE() takes the !remove_pending() branch in
handshake_req_cancel() and drops hr_file before the accept side
takes its own reference. Hand off an additional file reference
inside handshake_req_next(), under hn_lock, so the accept side
operates on a reference that no concurrent handshake_req_cancel()
can revoke. FD_PREPARE() consumes that handed-off reference,
either by transferring it to the new fd in fd_publish() or by
dropping it in the cleanup destructor on error; the explicit
get_file() that previously balanced FD_PREPARE() is therefore
redundant and goes away.
Update handshake_req_cancel_test2 and _test3 to simulate the
FD_PREPARE() consumption with an fput() so the kunit file-count
assertions stay balanced.
Reported-by: Chris Mason <clm@meta.com>
Fixes: 3b3009ea8a ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-5-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
handshake_nl_accept_doit() needs the file pointer backing
req->hr_sk->sk_socket to survive the window between
handshake_req_next() and the subsequent FD_PREPARE() and get_file().
The submit-side sock_hold() does not provide that. sk_refcnt keeps
struct sock alive, but struct socket is owned by sock->file: when
the consumer fputs the last file reference, sock_release() tears
the socket down regardless of any sock_hold.
Add an hr_file pointer to struct handshake_req and acquire an
explicit reference on sock->file during handshake_req_submit().
handshake_complete() and handshake_req_cancel() release the
reference on the completion-bit-winning path.
The submit error path must also release the file reference, but
after rhashtable insertion a concurrent handshake_req_cancel() can
discover the request and race the error path. Gate the error-path
cleanup -- sk_destruct restoration, fput, and request destruction
-- with test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED), the same
serialization handshake_complete() and handshake_req_cancel()
already use. When cancel has already claimed ownership, the submit
error path returns without touching the request; socket teardown
handles final destruction.
The accept-side dereferences are not yet retargeted; that change
comes in the next patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-4-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
handshake_complete() declares status as unsigned int and
tls_handshake_done() negates that value (-status) before handing
it to the TLS consumer. Consumers match on negative errno
constants -- xs_tls_handshake_done() has
switch (status) {
case 0:
case -EACCES:
case -ETIMEDOUT:
lower_transport->xprt_err = status;
break;
default:
lower_transport->xprt_err = -EACCES;
}
so the API as designed expects callers to pass positive errno
values that the tlshd shim then negates.
Three internal callers in handshake_nl_accept_doit(), the
net-exit drain, and a kunit test follow kernel convention and
pass negative errnos -- -EIO, -ETIMEDOUT, -ETIMEDOUT. The
implicit conversion to unsigned int turns -ETIMEDOUT into
0xFFFFFF92; the subsequent -status in tls_handshake_done()
wraps back to 110, the consumer's switch falls through, and
the xprt reports -EACCES on what should be -ETIMEDOUT or -EIO.
Fix the API rather than the call sites. The natural kernel
convention is negative errno in, negative errno out. Change
handshake_complete() and hp_done to take int status, drop the
negation in tls_handshake_done(), and negate once in
handshake_nl_done_doit() where status arrives from the wire
as an unsigned netlink attribute. The three internal callers
were already correct under that convention and need no change.
At the same wire boundary, declare MAX_ERRNO as the netlink
policy upper bound for HANDSHAKE_A_DONE_STATUS. Attribute
validation rejects out-of-range values before
handshake_nl_done_doit() runs, and negating a bounded u32 there
stays within int range -- closing the UBSAN-visible signed-
integer overflow that an unconstrained u32 would invoke.
Fixes: 3b3009ea8a ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-3-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
nvme_tcp_tls_done() assigns queue->tls_err in three branches. The
ENOKEY lookup failure and the EOPNOTSUPP initializer both store
negative errnos. The third branch, reached when the handshake
layer reports a non-zero status, stores -status.
The handshake layer delivers status to the consumer callback as a
negative errno; the other in-tree consumers --
xs_tls_handshake_done() and the nvmet target callback -- treat
their status argument that way. The extra negation in
nvme_tcp_tls_done() flips the sign, leaving tls_err as a positive
value (for instance, +EIO), which nvme_tcp_start_tls() then
returns to its caller.
Drop the extra negation so queue->tls_err uniformly carries a
negative errno on failure.
Fixes: be8e82caa6 ("nvme-tcp: enable TLS handshake upcall")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-2-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
nvmet_tcp_state_change(), a socket callback that runs in BH context,
can reach handshake_req_cancel() via nvmet_tcp_schedule_release_queue()
and tls_handshake_cancel(). handshake_req_cancel() acquires
hn->hn_lock with plain spin_lock(). If a process-context thread on
the same CPU holds hn->hn_lock when a softirq invokes the cancel path,
the lock attempt deadlocks. This is the only caller that invokes
tls_handshake_cancel() from BH context; every other consumer calls it
from process context.
Deferring the cancel to process context in the NVMe target is not
straightforward: nvmet_tcp_schedule_release_queue() must call
tls_handshake_cancel() atomically with its state transition to
DISCONNECTING. If the cancel were deferred, the handshake completion
callback could fire in the window before the cancel runs, observe the
unexpected state, and return without dropping its kref on the queue.
Reworking that interlock is considerably more invasive than hardening
the handshake lock. Convert all hn->hn_lock acquisitions from
spin_lock/spin_unlock to spin_lock_bh/spin_unlock_bh so the lock is
never taken with softirqs enabled.
Fixes: 675b453e02 ("nvmet-tcp: enable TLS handshake upcall")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-1-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
the old skb_shared_info header into a new buffer via memcpy(), which
includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
Neither function calls net_zcopy_get() for the new shinfo, creating an
unaccounted holder: every skb_shared_info with destructor_arg set will
call skb_zcopy_clear() once when freed, but the corresponding
net_zcopy_get() was never called for the new copy. Repeated calls
drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
TX skbs still hold live destructor_arg pointers.
KASAN reports use-after-free on a freed ubuf_info_msgzc:
BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810
Read of size 8 at addr ffff88801574d3e8 by task poc/220
Call Trace:
skb_release_data+0x77b/0x810
kfree_skb_list_reason+0x13e/0x610
skb_release_data+0x4cd/0x810
sk_skb_reason_drop+0xf3/0x340
skb_queue_purge_reason+0x282/0x440
rds_tcp_inc_free+0x1e/0x30
rds_recvmsg+0x354/0x1780
__sys_recvmsg+0xdf/0x180
Allocated by task 219:
msg_zerocopy_realloc+0x157/0x7b0
tcp_sendmsg_locked+0x2892/0x3ba0
Freed by task 219:
ip_recv_error+0x74a/0xb10
tcp_recvmsg+0x475/0x530
The skb consuming the late access still referenced the same uarg via
shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without
a refcount bump. This has been verified to be reliably exploitable: a
working proof-of-concept achieves full root privilege escalation from
an unprivileged local user on a default kernel configuration.
The fix follows the pattern of pskb_expand_head() which has the same
memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get()
is placed after skb_orphan_frags() succeeds, so the orphan error path
needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is
placed after all failure points and just before skb_release_data(), so
no error path needs cleanup at all -- matching pskb_expand_head() more
closely and avoiding the need for a balancing net_zcopy_put().
Fixes: 6fa01ccd88 ("skbuff: Add pskb_extract() helper function")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jijie Shao says:
====================
hibmcge: fix RX packet corruption issue
This series fixes an RX packet corruption issue observed when SMMU is
disabled on the hibmcge driver. The fixes include disabling PCI Relaxed
Ordering and correcting the order of DMA barrier operations in the RX
data sync path.
====================
Link: https://patch.msgid.link/20260525144525.94884-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The dma_rmb() barrier was placed before dma_sync_single_for_cpu(), which
is incorrect. DMA sync must complete first to make the buffer accessible
to the CPU, then the rmb barrier ensures subsequent descriptor reads
observe the latest data written by the hardware.
Reorder the operations so dma_sync_single_for_cpu() is called before
dma_rmb() to guarantee the driver reads consistent data from the DMA
buffer.
Fixes: f72e255940 ("net: hibmcge: Implement rx_poll function to receive packets")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260525144525.94884-3-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When SMMU is disabled, the hibmcge driver may receive corrupted packets.
The hardware writes packet data and descriptors to the same page, but
with Relaxed Ordering enabled, PCI write transactions may not be
strictly ordered. This can cause the driver to observe a valid
descriptor before the corresponding packet data is fully written.
Fix this by clearing PCI_EXP_DEVCTL_RELAX_EN in the PCI bridge control
register to ensure strict write ordering between packet data and
descriptors.
Fixes: f72e255940 ("net: hibmcge: Implement rx_poll function to receive packets")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260525144525.94884-2-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim says:
====================
net/sched: Fix packet loops in mirred and netem
This patchset adds a 2-bit per-skb tc_depth counter that travels with
the packet. The existing per-CPU mirred nest tracking loses state
when a packet is deferred through the backlog or moves between CPUs
via XPS/RPS. A per-skb field covers both cases.
Patch 1 adds the tc_depth field in a padding hole in sk_buff.
Patches 2-3 revert the check_netem_in_tree() fix and its tests,
which broke legitimate multi-netem configurations.
Patch 4 uses tc_depth to stop netem duplicate recursion.
Patch 5 uses tc_depth to catch mirred ingress redirect loops.
Patch 6 fixes the infinite loop in the mirred egress blockcast case.
Patch 7 fixes drop stats in early return error scenarios in tcf_mirred_act
for redirect (caught by Sashiko [1]).
Patches 8-9 add mirred and netem test cases.
[1] https://sashiko.dev/#/patchset/20260413082027.2244884-1-hxzene%40gmail.com
====================
Link: https://patch.msgid.link/20260525122556.973584-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add a netem nested duplicate test case to validate that it won't
cause an infinite loop
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-10-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Since retval is set as TC_ACT_STOLEN in the mirred redirect case, returning
retval in cases where redirect failed will make the callers not register
the skb as being dropped.
Fix this by returning TC_ACT_SHOT instead in such scenarios.
Fixes: 16085e48cb ("net/sched: act_mirred: Create function tcf_mirred_to_dev and improve readability")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260413082027.2244884-1-hxzene%40gmail.com
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-8-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
tcf_mirred_act() checks sched_mirred_nest against MIRRED_NEST_LIMIT (4)
to prevent deep recursion. However, when the action uses blockcast
(tcfm_blockid != 0), the function returns at the tcf_blockcast() call
BEFORE reaching the counter increment. As a result, the recursion
counter never advances and the limit check is entirely bypassed.
When two devices share a TC egress block with a mirred blockcast rule,
a packet egressing on device A is mirrored to device B via blockcast;
device B's egress TC re-enters tcf_mirred_act() via blockcast and
mirrors back to A, creating an unbounded recursion loop:
tcf_mirred_act -> tcf_blockcast -> tcf_mirred_to_dev -> dev_queue_xmit
-> sch_handle_egress -> tcf_classify -> tcf_mirred_act -> (repeat)
This recursion continues until the kernel stack overflows.
The bug is reachable from an unprivileged user via
unshare(CLONE_NEWUSER | CLONE_NEWNET): user namespaces grant
CAP_NET_ADMIN in the new network namespace, which is sufficient to
create dummy devices, attach clsact qdiscs with shared blocks, and
install mirred blockcast filters.
BUG: TASK stack guard page was hit at ffffc90000b7fff8
Oops: stack guard page: 0000 [#1] SMP KASAN NOPTI
CPU: 2 UID: 1000 PID: 169 Comm: poc Not tainted 7.0.0-rc7-next-20260410
RIP: 0010:xas_find+0x17/0x480
Call Trace:
xa_find+0x17b/0x1d0
tcf_mirred_act+0x640/0x1060
tcf_action_exec+0x400/0x530
basic_classify+0x128/0x1d0
tcf_classify+0xd83/0x1150
tc_run+0x328/0x620
__dev_queue_xmit+0x797/0x3100
tcf_mirred_to_dev+0x7b1/0xf70
tcf_mirred_act+0x68a/0x1060
[repeating ~30+ times until stack overflow]
Kernel panic - not syncing: Fatal exception in interrupt
Fix this by incrementing sched_mirred_nest before calling
tcf_blockcast() and decrementing it on return, mirroring the
non-blockcast path. This ensures subsequent recursive entries see the
updated counter and are correctly limited by MIRRED_NEST_LIMIT.
Fixes: fe946a751d ("net/sched: act_mirred: add loop detection")
Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
Link: https://patch.msgid.link/20260525122556.973584-7-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When mirred redirects to ingress (from either ingress or egress) the loop
state from sched_mirred_dev array dev is lost because of 1) the packet
deferral into the backlog and 2) the fact the sched_mirred_dev array is
cleared. In such cases, if there was a loop we won't discover it.
Here's a simple test to reproduce:
ip a add dev port0 10.10.10.11/24
tc qdisc add dev port0 clsact
tc filter add dev port0 egress protocol ip \
prio 10 matchall action mirred ingress redirect dev port1
tc qdisc add dev port1 clsact
tc filter add dev port1 ingress protocol ip \
prio 10 matchall action mirred egress redirect dev port0
ping -c 1 -W0.01 10.10.10.10
Fixes: fe946a751d ("net/sched: act_mirred: add loop detection")
Tested-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-6-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When netem duplicates a packet it re-enqueues the copy at the root qdisc.
If another netem sits in the tree the copy can be duplicated
again, recursing until the stack or memory is exhausted.
The original duplication guard temporarily zeroed q->duplicate around
the re-enqueue, but that does not cover all cases because it is
per-qdisc state shared across all concurrent enqueue paths
and is not safe without additional locking.
Use the skb tc_depth field introduced in an earlier patch:
- increment it on the duplicate before re-enqueue
- skip duplication for any skb whose tc_depth is already non-zero.
This marks the packet itself rather than mutating qdisc state,
therefore it is safe regardless of tree topology or concurrency.
Fixes: 0afb51e728 ("[PKT_SCHED]: netem: reinsert for duplication")
Reported-by: William Liu <will@willsroot.io>
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Closes: https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/
Co-developed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: William Liu <will@willsroot.io>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-5-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This reverts commit ecdec65ec7.
The tests added were related to check_netem_in_tree() which was
just reverted in the previous patch.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-4-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This reverts commit ec8e0e3d7a.
The original patch rejects any tree containing two netems when
either has duplication set, even when they sit on unrelated classes
of the same classful parent. That broke configurations that have
worked since netem was introduced.
The re-entrancy problem the original commit was trying to solve is
handled by later patch using tc_depth flag.
Doing this revert will (re)expose the original bug with multiple
netem duplication. When this patch is backported make sure
and get the full series.
Fixes: ec8e0e3d7a ("net/sched: Restrict conditions for adding duplicating netems to qdisc tree")
Reported-by: Ji-Soo Chung <jschung2@proton.me>
Reported-by: Gerlinde <lrGerlinde@mailfence.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220774
Reported-by: zyc zyc <zyc199902@zohomail.cn>
Closes: https://lore.kernel.org/all/19adda5a1e2.12410b78222774.9191120410578703463@zohomail.cn/
Reported-by: Manas Ghandat <ghandatmanas@gmail.com>
Closes: https://lore.kernel.org/netdev/f69b2c8f-8325-4c2e-a011-6dbc089f30e4@gmail.com/
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-3-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add a 2-bit per-skb tc depth field to track packet loops across the stack.
The previous per-CPU loop counters like MIRRED_NEST_LIMIT
assume a single call stack and lose state in two cases:
1) When a packet is queued and reprocessed later (e.g., egress->ingress
via backlog), the per-cpu state is gone by the time it is dequeued.
2) With XPS/RPS a packet may arrive on one CPU and be processed on
another.
A per-skb field solves both by travelling with the packet itself.
The field fits in existing padding, using 2 bits that were previously a
hole:
pahole before(-) and after (+) diff looks like:
__u8 slow_gro:1; /* 132: 3 1 */
__u8 csum_not_inet:1; /* 132: 4 1 */
__u8 unreadable:1; /* 132: 5 1 */
+ __u8 tc_depth:2; /* 132: 6 1 */
- /* XXX 2 bits hole, try to pack */
/* XXX 1 byte hole, try to pack */
__u16 tc_index; /* 134 2 */
There used to be a ttl field which was removed as part of tc_verd in commit
aec745e2c5 ("net-tc: remove unused tc_verd fields"). It was already
unused by that time, due to remove earlier in commit c19ae86a51 ("tc: remove
unused redirect ttl").
The first user of this field is netem, which increments tc_depth on
duplicated packets before re-enqueueing them at the root qdisc. On
re-entry, netem skips duplication for any skb with tc_depth already set,
bounding recursion to a single level regardless of tree topology.
The other user is mirred which increments it on each pass
and limits to depth to MIRRED_DEFER_LIMIT (3).
The new field was called ttl in earlier versions of this patch
but renamed to tc_depth to avoid confusion with IP ttl.
Note (looking at you Sashiko! Dont ignore me and continue bringing this up):
1. Since both mirred and netem utilize the same 2-bit tc_depth field it is
possible when netem and mirred are used together that netem qdisc to skip
the duplication step. This is a known trade-off, as a 2-bit field cannot
independently track both features' recursion depths and it is not considered
sane to have a setup that addresses both features on at the same time.
2. skb_scrub_packet does not clear tc_depth. This means a packet's loop history
is preserved even across namespaces. While this might be restrictive for
some topologies, it is also design intent to provide robustness against loops
across namespaces.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260525122556.973584-2-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
It was missed that idt_do_interrupt_irqoff() gets compiled on x84_64;
this is a problem for CFI builds because it includes an unadorned
indirect call. It is however completely dead code.
Rework things to not emit this function at all.
Fixes: 0701c9e17b ("x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260526090631.GA4149641@noisy.programming.kicks-ass.net
gcc-16 has gained some more advanced inter-procedual optimization
techniques that enable it to inline the dummy_tlb_add_page() and
dummy_tlb_flush() function pointers into a specialized version of
__arm_v7s_unmap:
WARNING: modpost: vmlinux: section mismatch in reference: __arm_v7s_unmap+0x2cc (section: .text) -> dummy_tlb_add_page (section: .init.text)
ERROR: modpost: Section mismatches detected.
>From what I can tell, the transformation is correct, as this is only
called when __arm_v7s_unmap() is called from arm_v7s_do_selftests(),
which is also __init. Since __arm_v7s_unmap() however is not __init,
gcc cannot inline the inner function calls directly.
In debug_objects_selftest(), the same thing happens. Both the
caller and the leaf function are __init, but the IPA pulls
it into a non-init one:
WARNING: modpost: vmlinux: section mismatch in reference: lookup_object_or_alloc+0x7c (section: .text.lookup_object_or_alloc) -> is_static_object (section: .init.text)
Marking the affected functions as not "__init" would reliably avoid this
issue but is not a good solution because it removes an otherwise correct
annotation. I tried marking the functions as 'noinline', but that ended
up not covering all the affected configurations.
With some more experimenting, I found that marking these functions as
__attribute__((noipa)) is both logical and reliable.
In order to keep the syntax readable, add a custom macro for this in
include/linux/compiler_attributes.h next to other related macros and
use it to annotate both files.
Link: https://lore.kernel.org/all/abRB6g-48ZX6Yl2r@willie-the-truck/
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: linux-kbuild@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
ipv6_rpl_srh_decompress() computes:
outhdr->hdrlen = (((n + 1) * sizeof(struct in6_addr)) >> 3);
hdrlen is __u8. For n >= 127 the result exceeds 255 and silently
truncates. With n=127 (cmpri=15, cmpre=15, pad=0, hdrlen=16):
(128 * 16) >> 3 = 256, truncated to 0 as __u8
The caller in ipv6_rpl_srh_rcv() then places the compressed header
at buf + ((ohdr->hdrlen + 1) << 3). With hdrlen=0 this is buf + 8,
but the decompressed region occupies buf[0..2055] (8-byte header
plus 128 full addresses). The compressed header overlaps the
decompressed data, and ipv6_rpl_srh_compress() writes into this
overlap, corrupting the routing header of the forwarded packet.
The existing guard at exthdrs.c:546 checks (n + 1) > 255, which
prevents n+1 from overflowing unsigned char (the segments_left
field), but does not prevent the computed hdrlen from overflowing
__u8. n=127 passes because 128 <= 255, yet hdrlen=256 does not
fit.
Tighten the bound to (n + 1) > 127. This caps n at 126, giving
hdrlen = (127 * 16) >> 3 = 254, which fits in __u8. The compressed
header then lands at buf + ((254 + 1) << 3) = buf + 2040, exactly
past the decompressed region (buf[0..2039]). No overlap. 127
segments is well beyond any realistic RPL deployment.
Fixes: 8610c7c6e3 ("net: ipv6: add support for rpl sr exthdr")
Signed-off-by: Rahul Chandelkar <rc@rexion.ai>
Link: https://patch.msgid.link/20260525154031.2290876-1-rc@rexion.ai
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
ethtool: more bug fixes
Last week I sent two patch sets - one fixing bugs in RSS handling,
and one fixing CMIS / module handling. This set contains the remaining
fixes. There's a concentration of fixes around PHY and timestamp config
handling but not enough to break those out as separate sets.
====================
Link: https://patch.msgid.link/20260526153533.2779187-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Netlink fallback path for reading module EEPROM
(fallback_set_params()) validates that offset < eeprom_len,
but does not check that offset + length stays within eeprom_len.
The ioctl equivalent (ethtool_get_any_eeprom() in ioctl.c) has
always enforced both bounds:
if (eeprom.offset + eeprom.len > total_len)
return -EINVAL;
This could lead to surprises in both drivers and device FW.
Add the missing offset + length validation to fallback_set_params(),
mirroring the ioctl.
Similarly - ethtool core in general, and ethtool_get_any_eeprom()
in particular tries to zero-init all buffers passed to the drivers
to avoid any extra work of zeroing things out. eeprom_fallback()
uses a plain kmalloc(), change it to zalloc.
Fixes: 96d971e307 ("ethtool: Add fallback to get_module_eeprom from netlink command")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
All ethtool driver op calls should be sandwiched between
ethnl_ops_begin() / ethnl_ops_complete(). In Netlink eeprom code,
if the paged access failed we fall back to old API, but we
first call _complete() and the fallback never does its own
ethnl_ops_begin(). Move the fallback into the _begin() / _complete()
section.
Fixes: 96d971e307 ("ethtool: Add fallback to get_module_eeprom from netlink command")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
strset_prepare_data() passes ETHTOOL_A_HEADER_FLAGS (3) as the header
attribute to ethnl_req_get_phydev(). This is incorrect, in the main
attr space 3 is ETHTOOL_A_STRSET_COUNTS_ONLY, not the request
header attr. The correct constant is ETHTOOL_A_STRSET_HEADER (1).
ethnl_req_get_phydev() only uses this value for the extack,
so this is not a "functionally visible"(?) bug.
Fixes: e96c93aa4b ("net: ethtool: strset: Allow querying phy stats by index")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The goto err label leads to:
genlmsg_cancel(skb, ehdr);
return ret;
If ethnl_tsinfo_prepare_dump() failed, it has not started a genlmsg.
There's nothing to cancel, and passing an error pointer to
genlmsg_cancel() would cause a crash.
Fixes: b9e3f7dc9e ("net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tsinfo_prepare_data() has two code paths: a "by-PHC" path for
user-specified hardware timestamping providers, and the old path.
Commit 89e281ebff ("ethtool: init tsinfo stats if requested") added
ethtool_stats_init() to mark stat slots as ETHTOOL_STAT_NOT_SET before
the driver callback populates them, but placed the call inside the
old-path block.
When commit b9e3f7dc9e ("net: ethtool: tsinfo: Enhance tsinfo to
support several hwtstamp by net topology") added the by-PHC early
return, it landed above the stats initialization. On that path
the stats array retains the zero-fill from ethnl_init_reply_data()'s
zalloc. This leads to the reply including a stats nest with four
zero-valued attributes that should have been absent.
Reject GET requests for stats with HWTSTAMP_PROVIDER or dump.
Fixes: b9e3f7dc9e ("net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tsconfig_prepare_data() calls ethnl_ops_begin(), we need to call
ethnl_ops_complete() before returning the error.
Fixes: 6e9e2eed4f ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config")
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
pse_prepare_data() is missing ethnl_ops_complete() if
ethnl_req_get_phydev() returned an error. Move getting
phydev up so that we don't have to worry about this
(similar order to linkstate_prepare_data()).
Note that phydev may still be NULL (this is checked in
pse_get_pse_attributes()), the goal isn't really to avoid
the _begin() / _complete() calls, only to simplify the error
handling.
While at it propagate the original error. Why this code
overrides the error with -ENODEV but !phydev generates
-EOPNOTSUPP is unclear to me...
Fixes: 31748765be ("net: ethtool: pse-pd: Target the command to the requested PHY")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
linkstate_prepare_data() calls ethnl_req_get_phydev() before
ethnl_ops_begin(), but routes its error path through "goto out"
which calls ethnl_ops_complete().
Fixes: fe55b1d401 ("ethtool: linkstate: migrate linkstate functions to support multi-PHY setups")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A couple of trivial bugs in error handling in tsconfig_send_reply().
If we failed to allocate rskb we need to set the error.
If we did allocate it but failed to send it - we need to remember
to free it.
Fixes: 6e9e2eed4f ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config")
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ethnl_update_profile() walks the ETHTOOL_A_PROFILE_IRQ_MODERATION
nest list with an index 'i' and writes new_profile[i++] without
bounding i. The destination is kmemdup()'d at NET_DIM_PARAMS_NUM_PROFILES
entries (5), but the Netlink nest count is entirely user-controlled.
Netlink policies do not have support for constraining the number
of nested entries (or number of multi-attr entries).
Fixes: f750dfe825 ("ethtool: provide customized dim profile management")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260526153533.2779187-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel says:
====================
bridge: Fix sleep in atomic context
Under certain circumstances the bridge driver can call
dev_set_promiscuity() while holding the bridge spin lock. This is a
problem as dev_set_promiscuity() might sleep.
Patches #1-#2 fix the problem in the netlink and sysfs configuration
paths by only taking the lock where it is actually needed, thereby
avoiding calling dev_set_promiscuity() from an atomic context.
Patch #3 adds test cases for both configuration paths in rtnetlink.sh
which already includes test cases for similar issues.
Note that dev_set_promiscuity() can sleep either when it takes the net
device mutex or when calling netif_rx_mode_sync(). I encountered the
problem with the latter, but blamed the former since it came earlier.
====================
Link: https://patch.msgid.link/20260526064818.272516-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add two test cases that always pass, but trigger sleeping in atomic
context BUGs without "bridge: Fix sleep in atomic context in netlink
path" and "bridge: Fix sleep in atomic context in sysfs path".
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260526064818.272516-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the start of the git history, brport_store() always acquired the
bridge lock. Back then this decision made sense: The bridge lock
protects the STP state of the bridge and its ports and at that time the
function was only used by two STP related attributes (cost and
priority).
Nowadays, brport_store() processes a lot more attributes and most of
them do not need the bridge lock:
* Bridge flags: Only require RTNL. Read locklessly by the data path.
Annotations can be added in net-next.
* FDB port flushing: Only requires the FDB lock.
* Multicast attributes: Only require the multicast lock.
* Group forward mask: Only requires RTNL. Read locklessly by the data
path. Annotations can be added in net-next.
* Backup port: Only requires RTNL. Read locklessly by the data path.
This is a problem as the bridge calls dev_set_promiscuity() when certain
bridge port flags change and this function can sleep since the commit
cited below, resulting in a splat such as [1].
Fix this by reducing the scope of the bridge lock and only take it when
processing the two STP related attributes that require it. Remove the
now stale comment from br_switchdev_set_port_flag(). The
SWITCHDEV_F_DEFER flag can be removed in net-next.
[1]
BUG: sleeping function called from invalid context at net/core/dev_addr_lists.c:1262
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 372, name: bash
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
5 locks held by bash/372:
#0: ffff88810c51c3f0 (sb_writers#7){.+.+}-{0:0}, at: ksys_write (fs/read_write.c:740)
#1: ffff888115ce9480 (&of->mutex){+.+.}-{4:4}, at: kernfs_fop_write_iter (fs/kernfs/file.c:343)
#2: ffff88810b9fd330 (kn->active#37){.+.+}-{0:0}, at: kernfs_fop_write_iter (fs/kernfs/file.c:80 fs/kernfs/file.c:344)
#3: ffffffffa59473a0 (rtnl_mutex){+.+.}-{4:4}, at: brport_store (net/bridge/br_sysfs_if.c:326)
#4: ffff8881099d2d58 (&br->lock){+...}-{3:3}, at: brport_store (./include/linux/spinlock.h:348 net/bridge/br_sysfs_if.c:345)
Preemption disabled at:
0x0
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
__might_resched.cold (kernel/sched/core.c:9163)
netif_rx_mode_run (net/core/dev_addr_lists.c:1262)
netif_rx_mode_sync (net/core/dev_addr_lists.c:1428)
dev_set_promiscuity (net/core/dev_api.c:289)
br_manage_promisc (net/bridge/br_if.c:135 net/bridge/br_if.c:172)
br_port_flags_change (net/bridge/br_if.c:242 net/bridge/br_if.c:747)
store_learning (net/bridge/br_sysfs_if.c:79 net/bridge/br_sysfs_if.c:235)
brport_store (net/bridge/br_sysfs_if.c:346)
kernfs_fop_write_iter (fs/kernfs/file.c:352)
new_sync_write (fs/read_write.c:595)
vfs_write (fs/read_write.c:688)
ksys_write (fs/read_write.c:740)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Fixes: 78cd408356 ("net: add missing instance lock to dev_set_promiscuity")
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260526064818.272516-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the introduction of the netlink configuration path for bridge
ports in commit 25c71c75ac ("bridge: bridge port parameters over
netlink"), br_setport() was always called with the bridge lock held
around it. Back then this decision made sense: The bridge lock protects
the STP state of the bridge and its ports and at that time the function
only processed three STP related netlink attributes (cost, priority and
state).
Nowadays, br_setport() processes a lot more attributes and most of them
do not need the bridge lock:
* Bridge flags: Only require RTNL. Read locklessly by the data path.
Annotations can be added in net-next.
* FDB port flushing: Only requires the FDB lock.
* Multicast attributes: Only require the multicast lock.
* Group forward mask: Only requires RTNL. Read locklessly by the data
path. Annotations can be added in net-next.
* Backup port and NHID: Only require RTNL. Read locklessly by the data
path.
This is a problem as the bridge calls dev_set_promiscuity() when certain
bridge port flags change and this function can sleep since the commit
cited below, resulting in a splat such as [1].
Fix this by reducing the scope of the bridge lock and only take it when
processing the three STP related attributes that require it. This is
consistent with the multicast attributes where each attribute acquires
the multicast lock instead of having one critical section for all
relevant attributes.
[1]
BUG: sleeping function called from invalid context at net/core/dev_addr_lists.c:1262
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 356, name: bridge
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
2 locks held by bridge/356:
#0: ffffffff919473a0 (rtnl_mutex){+.+.}-{4:4}, at: rtnetlink_rcv_msg (net/core/rtnetlink.c:80 net/core/rtnetlink.c:7002)
#1: ffff888115072d58 (&br->lock){+...}-{3:3}, at: br_setlink (./include/linux/spinlock.h:348 net/bridge/br_netlink.c:1117)
Preemption disabled at:
0x0
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
__might_resched.cold (kernel/sched/core.c:9163)
netif_rx_mode_run (net/core/dev_addr_lists.c:1262)
netif_rx_mode_sync (net/core/dev_addr_lists.c:1428)
dev_set_promiscuity (net/core/dev_api.c:289)
br_manage_promisc (net/bridge/br_if.c:135 net/bridge/br_if.c:172)
br_port_flags_change (net/bridge/br_if.c:242 net/bridge/br_if.c:747)
br_setport (net/bridge/br_netlink.c:1000)
br_setlink (net/bridge/br_netlink.c:1118)
rtnl_bridge_setlink (net/core/rtnetlink.c:5572)
rtnetlink_rcv_msg (net/core/rtnetlink.c:7005)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
netlink_unicast (net/netlink/af_netlink.c:1318 net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
__sock_sendmsg (net/socket.c:787 (discriminator 4) net/socket.c:802 (discriminator 4))
____sys_sendmsg (net/socket.c:2698)
___sys_sendmsg (net/socket.c:2752)
__sys_sendmsg (net/socket.c:2784)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Fixes: 78cd408356 ("net: add missing instance lock to dev_set_promiscuity")
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260526064818.272516-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzbot reported a kernel paging request crash in
can_rx_unregister() inside net/can/af_can.c. The crash occurs
because a virtual CAN device (vxcan) is being enslaved to a
bonding master.
During the enslavement process, the bonding driver mutates
and modifies the network device states to fit an Ethernet-like
aggregation model. However, CAN devices operate on a completely
different Layer 2 architecture, relying on the CAN mid-layer
private data structure (can_ml_priv) instead of standard
Ethernet structures. Since bonding does not initialize or
maintain these CAN structures, subsequent operations on the
half-enslaved interface (such as closing associated sockets
via isotp_release) lead to a null-pointer dereference when
accessing the CAN receiver lists.
Bonding CAN interfaces is architecturally invalid as CAN lacks
MAC addresses, ARP capabilities, and standard Ethernet
link-layer mechanisms. While generic loopback devices are
blocked globally in net/core/dev.c, virtual CAN devices
bypass this check because they do not carry the IFF_LOOPBACK
flag, despite acting as local software-loopbacks.
Fix this by explicitly blocking network devices of type
ARPHRD_CAN from being enslaved at the very beginning of
bond_enslave(). This prevents illegal state mutations,
eliminates the resulting KASAN crashes, and avoids potential
memory leaks from incomplete socket cleanups.
As the CAN support has been added a long time after bonding
the Fixes-tag points to the introduction of ARPHRD_CAN that
would have needed a specific handling in bonding_main.c.
Fixes: cd05acfe65 ("[CAN]: Allocate protocol numbers for PF_CAN")
Reported-by: syzbot+8ed98cbd0161632bce95@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8ed98cbd0161632bce95
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20260526-bonding-candev-v1-1-ba1df400918a@hartkopp.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With CONFIG_CALL_DEPTH_TRACKING enabled on an x86 retbleed-affected platform
(eg: Skylake), with retbleed=stuff, registering a dynamic ftrace trampoline
crashes on the first call into the traced function:
BUG: unable to handle page fault for address: ffff88817ae18880
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 4b53067 P4D 4b53067 PUD 0
Oops: Oops: 0002 [#1] SMP PTI
CPU: 3 UID: 0 PID: 187 Comm: usleep Not tainted 7.0.10 #243 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014
Code: 24 78 00 00 00 00 48 89 ea 48 89 54 24 20 48 8b b4 24 b8 00 00 00 48 8b bc 24 b0 00 00 00 48 89 bc 24 80 00 00 00 48 83 ef 05 <65> 48 c1 3d 1f a8 b6 02 05 48 8b 15 f6 00 00 00 4c 89 3c 24 4c 89
Call Trace:
<TASK>
? find_held_lock
? exc_page_fault
? lock_release
? __x64_sys_clock_nanosleep
? lockdep_hardirqs_on_prepare
? trace_hardirqs_on
__x64_sys_clock_nanosleep
do_syscall_64
? exc_page_fault
? call_depth_return_thunk
entry_SYSCALL_64_after_hwframe
...
Kernel panic - not syncing: Fatal exception
This small reproducer allows to easily trigger the crash:
# echo 'p __x64_sys_clock_nanosleep' > /sys/kernel/tracing/kprobe_events
# echo 1 > /sys/kernel/tracing/events/kprobes/p___x64_sys_clock_nanosleep_0/enable
# usleep 1
Monitoring the crash under GDB points to the exact instruction in charge of
incrementing the call depth:
sarq $5, %gs:__x86_call_depth(%rip)
This instruction matches the one inserted by the ftrace_regs_caller from
ftrace_64.S. This emitted code was likely working fine until the introduction
of
59bec00ace ("x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR()"):
it has made the call depth accounting addressing relative to $rip, instead of
being based on an absolute address.
As this code exact location depends on where the trampoline lives in memory,
the corresponding displacement needs to be adjusted at runtime to actually
correctly find the per-cpu __x86_call_depth value, otherwise the targeted
address is wrong, leading to the page fault seen above.
Fix the %rip-relative displacement of the copied CALL_DEPTH_ACCOUNT
instruction (from ftrace_regs_caller) by calling text_poke_apply_relocation(),
as it is done for example by the x86 BPF JIT compiler through
x86_call_depth_emit_accounting(). This corrects both CALL_DEPTH_ACCOUNT slots,
in ftrace_caller and ftrace_regs_caller.
[ bp: Massage. ]
Fixes: 59bec00ace ("x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR()")
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260527-fix_call_depth_in_trampoline-v1-1-1c1abc8ae310@bootlin.com
compiling with W=2 pointed out that "written may be used uninitialized"
Fixes: 20d72b00ca ("netfs: Fix the request's work item to not require a ref")
Cc: stable@vger.kernel.org
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
cifs_copy_folioq_to_iter() copies a requested number of bytes from
a folio queue into the destination iterator. Since the encrypted
SMB2 READ path was changed to pass the server-declared payload
length (data_len) instead of the larger folioq buffer length, the
caller can ask for fewer bytes than the folio queue holds.
In that case the helper continues walking the remaining folios after
data_size has reached zero and calls copy_folio_to_iter() with
len = 0, which is unnecessary work.
The helper also returns 0 (success) when the folio queue is
exhausted before data_size bytes have been copied. The caller has
no way to distinguish that from a full copy and the reported
transfer count ends up larger than the amount of data placed in the
iterator.
Add an early exit when data_size reaches zero, and return an error
when the folio queue is exhausted before all requested bytes have
been copied.
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When bt_en is pulled high by hardware, the host does not re-download
the firmware after SSR. The controller loads the rampatch and NVM
internally.
On HMT chip, the rampatch is ~264 KB and the NVM is ~9.4 KB. The
loading process takes approximately 70 ms. The previous 50 ms delay is
too short, causing the controller to not respond to the reset command
sent by the host, which leads to BT initialization failure:
Bluetooth: hci0: QCA memdump Done, received 458752, total 458752
Bluetooth: hci0: mem_dump_status: 2
Bluetooth: hci0: Opcode 0x0c03 failed: -110
Increase the delay to 100 ms, which was confirmed as a safe value by
the controller, to ensure the controller has finished loading the
firmware before the host sends commands.
Steps to reproduce:
1. Trigger SSR and wait for SSR to complete:
hcitool cmd 0x3f 0c 26
2. Run "bluetoothctl power on" and observe that BT fails to start.
Fixes: fce1a9244a ("Bluetooth: hci_qca: Fix SSR (SubSystem Restart) fail when BT_EN is pulled up by hw")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
hci_le_create_cis_sync() dereferences conn->conn_timeout after releasing
both rcu_read_lock() and hci_dev_lock(hdev). The conn pointer was
obtained from an RCU-protected iteration over hdev->conn_hash.list and
is not valid once these locks are dropped. A concurrent disconnect can
free the hci_conn between the unlock and the dereference, causing a
use-after-free read.
The cancellation mechanism in hci_conn_del() cannot prevent this because
hci_le_create_cis_pending() queues hci_create_cis_sync with data=NULL:
hci_cmd_sync_queue(hdev, hci_create_cis_sync, NULL, NULL);
While hci_conn_del() dequeues with data=conn:
hci_cmd_sync_dequeue(hdev, NULL, conn, NULL);
Since NULL != conn, the lookup in _hci_cmd_sync_lookup_entry() never
matches, and the pending work item is not cancelled.
Fix this by saving conn->conn_timeout into a local variable while the
locks are still held, so the stale conn pointer is never dereferenced
after unlock.
This is the same class of bug as the one fixed by commit 035c25007c
("Bluetooth: hci_sync: Fix UAF on le_read_features_complete") which
addressed the identical pattern in a different function.
This vulnerability was identified using 0sec.ai, an open-source
automated security auditing platform (https://github.com/0sec-labs).
Fixes: c09b80be6f ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED")
Cc: stable@vger.kernel.org
Reported-by: Doruk Tan Ozturk <doruk@0sec.ai>
Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The skb_clone() function can return NULL if memory allocation fails.
send_mcast_pkt() calls skb_clone() without checking the return value, which
can lead to a NULL pointer dereference in send_pkt() when it dereferences
skb->data.
Add a NULL check after skb_clone() and skip the peer if the clone fails.
Fixes: 18722c2470 ("Bluetooth: Enable 6LoWPAN support for BT LE devices")
Signed-off-by: Zhao Dongdong <zhaodongdong@kylinos.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The Bluetooth host decides whether to download firmware by reading the
controller firmware download completion flag and firmware version
information.
If a USB error occurs during the firmware download process (for example
due to a USB disconnect), the download is aborted immediately. An
incomplete firmware transfer does not cause the controller to set the
download completion flag, but the firmware version information may be
updated at an early stage of the download process.
In this case, after USB reconnection, the host attempts to re-download
the firmware because the download completion flag is not set. However,
since the controller reports the same firmware version as the target
firmware, the download is skipped. This ultimately results in the
firmware not being properly updated on the controller.
This change removes the restriction that skips firmware download when
the versions are equal. It covers scenarios where the USB connection
can be disconnected at any time and ensures that firmware download can
be retriggered after USB reconnection, allowing the Bluetooth firmware
to be correctly and completely updated.
Fixes: 3267c884ce ("Bluetooth: btusb: Add support for QCA ROME chipset family")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
hidp_input_report() reads keyboard and mouse payload data from an skb
without first verifying that skb->len contains enough data.
hidp_recv_intr_frame() pulls the 1-byte HIDP header before dispatching
to hidp_input_report(). If a paired device sends a truncated packet,
the handler reads beyond the valid skb data, resulting in an
out-of-bounds read of skb data. The OOB bytes may be interpreted as
phantom key presses or spurious mouse movement.
Replace the open-coded length tracking and pointer arithmetic with
skb_pull_data() calls. skb_pull_data() returns NULL if the requested
bytes are not present, eliminating the need for a manual size variable
and the separate skb->len guard.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
l2cap_chan_close() removes the channel from conn->chan_l, which
must be done under conn->lock. cleanup_listen() runs under the
parent sk_lock, so acquiring conn->lock would invert the
established conn->lock -> chan->lock -> sk_lock order.
Instead of calling l2cap_chan_close() directly, schedule
l2cap_chan_timeout with delay 0 to close the channel
asynchronously. The timeout handler already acquires conn->lock
and chan->lock in the correct order.
The timer is only armed when chan->conn is still set: if it is
already NULL, l2cap_conn_del() has already processed this channel
(l2cap_chan_del + l2cap_sock_teardown_cb + l2cap_sock_close_cb),
so there is nothing left to do. If l2cap_conn_del() races in
after the timer is armed, __clear_chan_timer() inside
l2cap_chan_del() cancels it; if the timer has already fired, the
handler returns harmlessly because chan->conn was cleared.
Fixes: 3df91ea20e ("Bluetooth: Revert to mutexes from RCU list")
Cc: <stable@vger.kernel.org> # 0b58004: Bluetooth: fix UAF in l2cap_sock_cleanup_listen() vs l2cap_conn_del()
Signed-off-by: Siwei Zhang <oss@fourdim.xyz>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
__set_chan_timer() takes a l2cap_chan reference via l2cap_chan_hold()
before scheduling the delayed work. The normal path in
l2cap_chan_timeout() drops this reference with l2cap_chan_put() at the
end, but the early return when chan->conn is NULL skips the put,
leaking the reference.
Add the missing l2cap_chan_put() before the early return.
Fixes: adf0398cee ("Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Siwei Zhang <oss@fourdim.xyz>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
hci_le_big_terminate() allocates iso_list_data via kzalloc_obj but
returns 0 without freeing it when neither pa_sync_term nor big_sync_term
flags are set after evaluating the PA and BIG sync connection state.
This early-return path was introduced when hci_le_big_terminate() was
refactored to take struct hci_conn instead of raw u8 parameters, adding
PA/BIG flag evaluation logic. The existing kfree() on hci_cmd_sync_queue
failure does not cover this path.
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Cc: stable@vger.kernel.org
Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Otherwise we don't invalidate page tables on next CS.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b6444d1bcbc34f6f2a31a3aab3059be082f3683e)
Cc: stable@vger.kernel.org
The notifier sequence must only be read once or otherwise we could work
with invalid pages.
While at it also fix the coding style, e.g. drop the pre-initialized
return value and use the common define for 2G range.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c08972f555945cda57b0adb72272a37910153390)
Cc: stable@vger.kernel.org
Use arrays instead of list for userq_vas since we have fixed no
of bos. Also, we dont have to worry to free that memory later
since this array would be free along with queue only.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ef7dc711a664b0c548ecfdf13a00436b7446b8e7)
mqd_destroy cleans up queue core objects like mqd and fw_object
which are needed for any pending fence to signal properly.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4ad65d610096498c8e265615aba42b3c47441bb5)
get_queue_ids() computes array_size = num_queues * sizeof(uint32_t),
which could overflow on 32-bit size_t build. using array_size()
instead, it saturates to SIZE_MAX on overflow.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2d57a0475f085c08b49312dfd8edcb461845f285)
Cc: stable@vger.kernel.org
Remove the amdgpu_userq_create/destroy_object wrappers and
use directly the kernel bo allocation function which does all the
things which are done in wrapper.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit deb02080ca5d3f015cf71e56067a39ef2f141998)
When no displays are connected, there is no vblank
happening so the power management code shouldn't
worry about it.
This fixes a regression that caused the memory clock
to be stuck at maximum when there were no displays
connected to a SI GPU.
Fixes: 9003a07468 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)")
Fixes: 9d73b107a6 ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6d87e0199f7b83735b56e422d59f170a201897a8)
Cc: stable@vger.kernel.org
CRIU restore ioctls are meant to be called by CRIU with no
existing drm file. There's an error path
for if the drm file unexpectedly exists. It was positioned so
it was missing a fput(drm_file).
Do that check earlier, as soon as we have the pdd.
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2bab781dac78916c5cc8de76345a4102449267d7)
Cc: stable@vger.kernel.org
Use snprintf() with sizeof(fs_info.debugfs_name) so a long RAS block
name plus the "_err_inject" suffix cannot overflow the 32-byte buffer.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1a58070fda26857a8f6acc0ab05428e60d5c6844)
Thread 1: Running amdgpu_userq_destroy which eventually remove
the queue from door bell and set userq_mgr = NULL.
Thread2: An interrupt might have scheduled the hang_detect_work
which still need userq_mgr to be valid but could get an NULL
ptrs.
To fix that make sure we cancel the hang_detect_work again before
setting userq_mgr to NULL.
Along with that we also need all the queue va to remain valid till
we could be running anything on the queue and hence moving the
userq_va post hang_detect handler is cancelled.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1a66ceb98b137d18d303b9889f0e7d8c4db73943)
Fix the code to make it an uninterruptible reservation
for root bo.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d409ab4e387d94b2e593d558b54b7bfd315e0e75)
Unpin the wptr_obj->obj when amdgpu_ttm_alloc_gart fails.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d8145c437ccdc2d91c579787290f82788172bea0)
amdgpu_userq_get_doorbell_index returns a uint64 type index
as well as a int type failure values. Simplifying this and
using a int type return value and getting the index in input pointer
of type uint64 type.
Also since it's used at once place making it static would be better.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e947ec9d0529d5f93dbdb33cd197347f6a7b2922)
The process_info could be NULL if user doesn't call kfd_ioctl_acquire_vm
before calling kfd_ioctl_svm.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 83a26c812e0529eb040d31a76f73e33e637243d4)
Cc: stable@vger.kernel.org
[Why&How]
dccg21_init() calls dccg2_init() which hardcodes 100MHz refclk values
for MICROSECOND_TIME_BASE_DIV and MILLISECOND_TIME_BASE_DIV. DCN21
uses 48MHz refclk, so the wrong values corrupt DCCG timing and cause eDP
link training failure on cold boot.
Write the correct 48MHz values directly instead of calling dccg2_init().
v2:
Fixed typo
Fixes: e6e2b956fc ("drm/amd/display: Add missing DCCG register entries for DCN20-DCN316")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5272
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5311
Reported-by: Max Chernoff <git@maxchernoff.ca>
Tested-by: Max Chernoff <git@maxchernoff.ca>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 08236c3ef284cd2d110e5e3d51fc9615e551f9dc)
Cc: stable@vger.kernel.org
mutex fence_drv_lock is destroyed in amdgpu_userq_fence_driver_free
also in one of the jump condition mutex_destroy is also called leading
to double mutex_destroy.
So rearranging the code so amdgpu_userq_fence_driver_free takes care
of the clean up along with mutex_destroy.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 384dbef269d101e5b671fc7b942c56734cd1d186)
Unpin and unref the door bell obj if queue creation fails before
initialization is complete.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8c7506f7ba945f21e5abe7f8eac0a3acca6b5330)
kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL) at
amdgpu_gem.c:1050 uses the user-supplied num_entries directly without
any upper bounds check. Since num_entries is a __u32 and
sizeof(drm_amdgpu_gem_vm_entry) is 32 bytes, a large num_entries
produces an allocation exceeding INT_MAX, triggering
WARNING in __kvmalloc_node_noprof(), causing a kernel WARNING,
TAINT_WARN, and panic on CONFIG_PANIC_ON_WARN=y systems.
Add a size bounds check before we invoke the kvzalloc() to
reject oversized num_entries early with -EINVAL.
Fixes: 4d82724f7f ("drm/amdgpu: Add mapping info option for GEM_OP ioctl")
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1fe7bf5457f6efd7be60b17e23163ba54341d73d)
Cc: stable@vger.kernel.org
The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl()
holds three cleanup-tracked resources before calling kvcalloc():
the drm_gem_object reference from drm_gem_object_lookup(), the
drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and
the drm_exec lock on the per-process VM root page directory via
amdgpu_vm_lock_pd(). All three are released by the out_exec
label that every other error path in this function jumps to.
The kvcalloc() failure path returns -ENOMEM directly, skipping
out_exec and leaking all three.
The leaked per-process VM root PD dma_resv lock is the
load-bearing leak: any subsequent operation on the same VM
(further GEM ops, command-submission, eviction, TTM shrinker
callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is
DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local
denial of service against the caller's GPU context, reachable
by any process with /dev/dri/renderD* access.
Route the failure through out_exec so drm_exec_fini() and
drm_gem_object_put() run.
Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega
(Lucienne): the failing ioctl returns -ENOMEM and a second
GET_MAPPING_INFO on the same fd then blocks in
drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the
caller does not reap the task; the fd-release path during
process exit goes through amdgpu_gem_object_close() ->
drm_exec_prepare_obj() on the same lock, leaving the task in D
state until the box is rebooted. The patched kernel was not
rebuilt and re-tested on this hardware; the fix is mechanical.
Tested on a single Lucienne / Vega box only.
Ziyi Guo posted an independent INT_MAX-bound check for
args->num_entries in the same branch [1]; the two patches are
complementary and can land in either order.
Fixes: 4d82724f7f ("drm/amdgpu: Add mapping info option for GEM_OP ioctl")
Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.edu/ # [1]
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b69d3256d79de15f54c322986ff4da68f1d65b0a)
Cc: stable@vger.kernel.org
Wa_16023105232 programs the register IDLEDLY. The register is reset
whenever the engine is reset. Therefore it should be added to the GuC
save-restore register list for it to be restored after reset.
Fixes: 7c53ff050b ("drm/xe: Apply Wa_16023105232")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260522163531.1365540-2-balasubramani.vivekanandan@intel.com
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
(cherry picked from commit df1cfe24743a93b71eab27687e148ab8ae9b69e3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
During trace remote loading, hyp_trace_load() allocates the descriptor
pages but fails to store the allocated size in trace_buffer->desc_size.
As a result, when unloading the trace buffer, hyp_trace_unload() calls
free_pages_exact() with a size of 0 which fails to free the memory.
Fix this by updating the descriptor size in trace_buffer->desc_size.
Fixes: 3aed038aac ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260521124613.911067-4-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
When sharing the trace buffer with the hypervisor, if sharing a page
fails, the rollback path in hyp_trace_buffer_share_hyp() misses
unsharing the metadata page (meta_va) which was successfully shared
before entering the page sharing loop.
Additionally, if a failure occurs, the cleanup calls
hyp_trace_buffer_unshare_hyp() with an incorrect CPU index. Since that
CPU's pages were already rolled back locally in the loop, this leads to
duplicate unsharing attempts.
Fix both issues affecting the rollback.
Fixes: 3aed038aac ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260521124613.911067-3-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
As the hyp_trace_buffer_unshare_hyp() function name suggests we should
unshare all the previously shared pages, otherwise we leak hyp-shared
pages which won't be reusable for hyp memory.
Fix the typo by calling __unshare_page() on the meta-page, ensuring all
previously shared pages are correctly unshared.
Fixes: 3aed038aac ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp")
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260521124613.911067-2-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
simple_mux_control_put() rejects values greater than e->items, but
enum control values are zero based. For the two-entry mux used by this
driver, valid values are 0 and 1, so value 2 must be rejected as well.
Accepting e->items can store an invalid mux state, pass it to the GPIO
setter, and pass it on to the DAPM mux update path where it is used as
an index into the enum text array.
Use the same >= e->items check used by the ASoC enum helpers.
Fixes: 342fbb7578 ("ASoC: add simple-mux")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260527-asoc-simple-mux-enum-bounds-v1-1-3f805b9fc671@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
AArch32 writes to PMU event counters cannot update the top 32 bits,
even when PMUv3p5 makes the counters 64-bit. KVM therefore needs to
preserve the existing high half and only update the low half written by
the guest, unless the caller explicitly forces a full reset through
PMCR.P.
The current code masks @val down to the old high half before taking
lower_32_bits(val), which means the low half is always zero. As a
result, AArch32 writes to event counters discard the guest-provided low
32 bits instead of storing them.
Build the new value from the old high 32 bits and the low 32 bits of
the value supplied by the guest.
Fixes: 26d2d0594d ("KVM: arm64: PMU: Do not let AArch32 change the counters' top 32 bits")
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260526074640.791991-1-maqianga@uniontech.com
Cc: stable@vger.kernel.org
Setting up the interface when suspended/resumeing fail on this card.
Adding a reset and delay quirk will eliminate this problem.
usb 1-1: new full-speed USB device number 2 using xhci-hcd
usb 1-1: New USB device found, idVendor=25aa, idProduct=600b
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1: Product: TAE1159
usb 1-1: Manufacturer: Generic
usb 1-1: SerialNumber: 20210726905926
Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
Link: https://patch.msgid.link/TYUPR06MB621736D7C85D43200E54E740D2082@TYUPR06MB6217.apcprd06.prod.outlook.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The 21.5" Retina 4K iMac (Late 2015, DMI product name "iMac16,1") ships
with a Cirrus Logic CS4208 codec wired to an external speaker amplifier
enabled through codec GPIO0 -- the same arrangement as the late-2013
MacBookPro 11,x. Without a matching entry in cs4208_mac_fixup_tbl[] the
fixup picker logs:
snd_hda_codec_cs420x hdaudioC1D0: CS4208: picked fixup for codec SSID 106b:0000
i.e. an empty fixup name, GPIO0 stays low, the external amp is never
powered up, and the internal speakers are silent on a stock kernel.
The codec SSID reported by hardware is 0x106b:0x7f00. Reusing CS4208_MBP11
(GPIO0 + SPDIF switch fixup) makes the internal speakers and S/PDIF
output work out of the box, removing the need for users to set
`options snd_hda_intel model=mbp11` via /etc/modprobe.d/.
Tested on iMac16,1 (kernel 6.17.0): four internal drivers
(Left tweeter, Left woofer, Right tweeter, Right woofer, exposed as the
4 channels of the analog-surround-40 ALSA profile) produce audio after
the fixup is applied.
Signed-off-by: Jakub Pisarczyk <pisarz77@gmail.com>
Link: https://patch.msgid.link/20260526201830.34097-1-pisarz77@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
ip6_datagram_recv_specific_ctl() builds IPV6_{HOPOPTS,DSTOPTS,RTHDR}
cmsgs (and their IPV6_2292* legacy counterparts) by trusting the
on-wire hdrlen byte (ptr[1]) when computing the put_cmsg() length.
The length was validated only at parse time (ipv6_parse_hopopts(),
etc.). An nftables payload-write expression can rewrite hdrlen after
parsing and before the skb reaches recvmsg; the write itself is
in-bounds but put_cmsg() then reads up to ((hdrlen+1) << 3) = 2040
bytes from an 8-byte header. nftables is reachable from an
unprivileged user namespace, so this is an unprivileged
slab-out-of-bounds read:
BUG: KASAN: slab-out-of-bounds in put_cmsg+0x3ac/0x540
put_cmsg+0x3ac/0x540
udpv6_recvmsg+0xca0/0x1250
sock_recvmsg+0xdf/0x190
____sys_recvmsg+0x1b1/0x620
Add ipv6_get_exthdr_len() which validates that at least two bytes
are accessible before reading the hdrlen field, then checks the
computed length against skb_tail_pointer(skb), returning 0 on
failure. Extension headers are kept in the linear skb area by
pskb_may_pull() during input, so skb_tail_pointer() is the correct
bound.
Use ipv6_get_exthdr_len() at all non-AH call sites: the five
standalone cmsg blocks (HbH, 2292HbH, 2292DSTOPTS x2, 2292RTHDR)
and the three standard cases in the extension-header walk loop
(DSTOPTS, ROUTING, default). AH retains an inline bounds check
because its length formula differs ((ptr[1]+2)<<2).
The walk loop also gets a pre-read bounds check at the top to
validate ptr before any case accesses ptr[0] or ptr[1].
When the walk loop detects a corrupted header, return from the
function instead of continuing to process later socket options.
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260523143245.2281415-1-tpluszz77@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
FSCTL_SET_SPARSE in fsctl_set_sparse() modifies the file's sparse
attribute and saves it through xattr without any permission checks.
This exposes two issues:
1) A client on a read-only share can change the sparse attribute
on files it opened, even though the share is read-only.
Other FSCTL write operations already check
test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE),
but FSCTL_SET_SPARSE does not.
2) Even on writable shares, clients without FILE_WRITE_DATA or
FILE_WRITE_ATTRIBUTES access should not modify the sparse
attribute. Similar handle-level checks exist in other functions
but are missing here.
Add both share-level writable check and per-handle access check.
Use goto out on error to avoid leaking file references.
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steve French <smfrench@gmail.com>
Signed-off-by: Sean Shen <grayhat@foxmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd_query_inode_status() and ksmbd_lookup_fd_inode() both take a
reference on a ksmbd_inode via __ksmbd_inode_lookup() (which performs
atomic_inc_not_zero()) and later release it using a bare
atomic_dec(&ci->m_count). Unlike ksmbd_inode_put(), a bare
atomic_dec() does not check whether the reference count has reached
zero, so if the caller happens to drop the last reference, the
ksmbd_inode is leaked: it stays in the global inode hash table with
m_count == 0, future __ksmbd_inode_lookup() calls reject it via
atomic_inc_not_zero(), and ksmbd_inode_free() is never invoked.
The race is:
T1: __ksmbd_inode_lookup() -> atomic_inc_not_zero(): m_count = 2
T2: ksmbd_inode_put() -> atomic_dec_and_test(): m_count = 1
(not freed)
T1: atomic_dec(&ci->m_count) -> m_count = 0
return (LEAK)
In ksmbd_lookup_fd_inode() the matched-fp path (which now also uses
ksmbd_inode_put()) cannot currently reach m_count == 0 because the
matched ksmbd_file holds its own reference on ci, but converting it to
the proper API keeps the three call sites consistent and avoids
future regressions if the locking changes.
Because ksmbd_inode_put() may free the ksmbd_inode if this drops the
last reference, the call must happen after up_read(&ci->m_lock) on the
two affected paths in ksmbd_lookup_fd_inode(). On the no-match path
this is a pure reordering; on the matched path ksmbd_fp_get() is
moved above the unlock so that the returned ksmbd_file is pinned
before the inode reference is released.
Signed-off-by: Aleksandr Golovnya <cofedish@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Commit d07b26f392 ("ksmbd: require minimum ACE size in
smb_check_perm_dacl()") introduced a transposed bounds check:
if (offsetof(struct smb_ace, sid) + aces_size < CIFS_SID_BASE_SIZE)
Since offsetof(..sid) is 8 and CIFS_SID_BASE_SIZE is 8, this evaluates
to `aces_size < 0`. Because `aces_size` is always non-negative, this
check becomes dead code and never breaks the loop.
Worse, that commit removed the old 4-byte guard, meaning the loop now
reads `ace->size` (offset 2) even when `aces_size` is 0-3 bytes. This
re-opens a 2-byte heap out-of-bounds (OOB) read past the pntsd allocation
during subsequent SMB2_CREATE operations.
Fix this by properly transposing the comparison to require at least
16 bytes (8-byte offset + 8-byte SID base), matching the correct form
used in smb_inherit_dacl().
Fixes: d07b26f392 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()")
Cc: stable@vger.kernel.org
Signed-off-by: Ali Ganiyev <ali.qaniyev@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
In some cases, iptunnel_pmtud_check_icmp() can be called while
skb transport header is not set.
This triggers an out-of-bound access, because
(typeof(skb->transport_header))~0U is 65535.
Access the icmp header based on IPv4 network header,
after making sure icmp->type is present in skb linear part.
Note that iptunnel_pmtud_check_icmpv6()) is fine.
Fixes: 4cb47a8644 ("tunnels: PMTU discovery support for directly bridged IP packets")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260522115512.1519110-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
skb_tunnel_check_pmtu() can change skb->head.
Reusing old_iph afer skb_tunnel_check_pmtu() can cause an UAF.
Use instead ip_hdr(skb) as done in drivers/net/bareudp.c
and drivers/net/geneve.c.
Found by Sashiko.
Fixes: 4cb47a8644 ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://patch.msgid.link/20260525203642.2389723-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sashiko found that iptunnel_pmtud_build_icmp() and
iptunnel_pmtud_build_icmpv6() were caching ip_hdr() and ipv6_hdr()
before an skb_cow() call which can reallocate skb->head.
Fix this possible UAF by initializing the local variables
after the skb_cow() call.
Remove skb_reset_network_header() calls which were not needed.
Fixes: 4cb47a8644 ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://patch.msgid.link/20260525201335.2361845-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
AN8811HB needs a MCU soft-reset cycle before firmware loading begins.
Assert the MCU (hold it in reset) and immediately deassert (release)
via a dedicated PBUS register pair (0x5cf9f8 / 0x5cf9fc), accessed
through a registered mdio_device at PHY-addr+8.
Add __air_pbus_reg_write() as a low-level helper taking a struct
mdio_device *, create and register the PBUS mdio_device in
an8811hb_probe() and store it in priv->pbusdev, then implement
an8811hb_mcu_assert() / _deassert() on top of it. Add
an8811hb_remove() to unregister the PBUS device on teardown. Wire
both calls into an8811hb_load_firmware() and en8811h_restart_mcu()
so every firmware load or MCU restart on AN8811HB correctly sequences
the reset control registers.
Fixes: 5afda1d734 ("net: phy: air_en8811h: add Airoha AN8811HB support")
Signed-off-by: Lucien Jheng <lucienzx159@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260524063915.47961-1-lucienzx159@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A reader in l2tp_session_get_by_ifname() can return a pointer to a
session whose refcount has reached zero. The getter takes its
reference with plain refcount_inc(), but every other session getter
in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the
corresponding _get_next variants) uses refcount_inc_not_zero()
because the IDR/RCU lookup can race with refcount_dec_and_test() ->
l2tp_session_free() -> kfree_rcu(). The ifname getter is the only
outlier; the inconsistency was raised on-list after 979c017803
("l2tp: use list_del_rcu in l2tp_session_unhash").
A reader inside rcu_read_lock_bh() that matches session->ifname can
be preempted between the strcmp() and the refcount_inc(). If the
last reference drops on another CPU in that window, the reader's
refcount_inc() runs on a counter that has reached zero. refcount_t
catches the addition-on-zero, prints "refcount_t: addition on 0;
use-after-free", saturates the counter, and returns the saturated
pointer to the caller. Session memory is held live by the in-flight
RCU read section, but the kfree_rcu() callback queued from
l2tp_session_free() will free it once the grace period closes; a
caller that dereferences the returned session past that point hits
a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU
sleeping lock and the preemption window is real; on stock PREEMPT
kernels local_bh_disable() is a preempt_count increment that closes
the cross-CPU race in practice (see below).
Use refcount_inc_not_zero() and continue the list walk on failure,
matching the other session getters in the file. The ifname getter
is the only session getter in net/l2tp/ that still uses the bare
refcount_inc() pattern; this change restores file-internal
consistency. The success path is unchanged.
Fixes: abe7a1a7d0 ("l2tp: improve tunnel/session refcount helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260523023423.2568972-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the 'clock-frequency' property is missing from the device tree,
the driver falls back to DAVINCI_I2C_DEFAULT_BUS_FREQ. However, this
macro was defined in kHz (100), whereas the device tree property is
expected in Hz.
The probe function divided the fallback value by 1000, causing
integer truncation that resulted in dev->bus_freq = 0. This triggered
a deterministic division-by-zero kernel panic when calculating clock
dividers later in the probe sequence.
Fix this by redefining DAVINCI_I2C_DEFAULT_BUS_FREQ in Hz (100000)
to match the expected device tree property unit, allowing the existing
division logic to work correctly for both cases.
Fixes: b04ce63859 ("i2c: davinci: kill platform data")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://lore.kernel.org/all/20260514044726.57297C2BCB7@smtp.kernel.org/
Signed-off-by: Chaitanya Sabnis <chaitanya.msabnis@gmail.com>
Cc: <stable@vger.kernel.org> # v6.14+
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260526102240.4949-1-chaitanya.msabnis@gmail.com
A previous commit removed an optimization out of caution for a scenario
that turns out not to be real: all the "queue_exit" goto's are safe to
reinsert the request into the cached_rq's plug list as they are either
from a non-blocking path, or a successful merge that already holds the
queue reference. This optimization is most needed for small sequential
workloads that successfully merge into larger requests.
Fixes: dc278e9bf2 ("blk-mq: pop cached request if it is usable")
Suggested-by: Ming Lei <tom.leiming@gmail.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://patch.msgid.link/20260526153531.2365935-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
CXL test environment hits the following error sometimes.
cxl_mem mem9: endpoint7 failed probe
All mock memdevs are platform firmware devices added by cxl_test module,
and cxl_test module also provides a platform device driver for them to
create a memdev device to CXL subsystem. cxl_test module uses
cxl_rcd/mem_single/mem arrays to store different types of mock memdevs.
CXL drivers calls registered mock functions for a mock memdev by
checking if a given memdev is in these arrays.
When cxl_test module adds these mock memdevs, it always calls
platform_device_add() before adding them to a suitable mock memdev
array. However, there is a small window where CXL drivers calls mock
function for a added memdev before it added to a mock memdev array. In
above case, cxl endpoint driver considers a added memdev was not a mock
memdev, then calling devm_cxl_endpoint_decoders_setup() for it rather
than mock_endpoint_decoders_setup().
An appropriate solution is that adding a new mock device to a mock
device array before calling platform_device_add() for it. It can
guarantee the new mock device is visible to CXL subsystem.
This patch introduces a new helped called cxl_mock_platform_device_add()
to handle the issue, and uses the function for all mock devices addition.
Fixes: 3a2b97b321 ("cxl/test: Improve init-order fidelity relative to real-world systems")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260520121457.234404-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Issues reported with v7.1-rc:
- Tighten bounds checking for sunrpc cache hash tables
- Don't report key material in the ftrace log
Issues that need expedient stable backports:
- Fix lockd's implementation of the NLM TEST procedure
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmoV9/kACgkQM2qzM29m
f5fdpxAAtT3hl4wKNJsVLhFlFhG+9ABL74fwaQ06j5vTSIgXqPm12NuO5YbrkC78
ZzV/B/YqoHLAw/t8Pgq2taBBuSeLF+H8JqjJRDYE5H2NB/KQOT8n9KTLZtac4/1V
Dvrk3mP2h12Q//BC3pF9bU9gMR1DO/+yLt9SkH+dtqcW+dWxiyVZWtK0eESIsMfh
IzkHNKOS0edMZmHl5O7VZSlbyq1jPA4hTZT+NCG7JwnK6YqSkpRGDiZdZIT2FBEI
C9a9hZHoP9JAJs9fR+xzTCVsIPpNW9OO3fknR2Lg7IScssVc1GIpqjU+g1O1XSVf
XsMfAl+pEipDBpULu46KM1TDqAKtjaAx8Z+hDmiPxSOCKWuPn/9LMdzwVrzC7Bw8
S7ftOxUZQLHtbS8Y0eECzwK9tdfBUHN26LAJfvg4P5ZOIsFoUj0LeDryPy0r9xxb
aEdEI8wro0O3p0krjtW2i+FJB8dtlKEu19LT6PN4MQtmv5a+DY4Hypt4Xovol0i+
eEugZVmLYE11b52ZFfcTcXf8n89jiWg7rgRBdBdy+vQl/32dKK3SMSIB/zCZYmBc
JZNywtri6JHeJjkohWJ4xmwrmMaDj4hNr3OqWh7bOQTHleg7igpCuy+9/LHzEF6G
BX4DgMJ6LqcdG8p4biGr2I2NF/+MJpXO5kNAdS44wpP983T26WE=
=onHe
-----END PGP SIGNATURE-----
Merge tag 'nfsd-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Regressions:
- Tighten bounds checking for sunrpc cache hash tables
- Don't report key material in the ftrace log
Stable fix:
- Fix lockd's implementation of the NLM TEST procedure"
* tag 'nfsd-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
lockd: fix TEST handling when not all permissions are available.
NFSD: Report whether fh_key was actually updated
sunrpc: prevent out-of-bounds read in __cache_seq_start()
Fixes use-after-free in kunit debugfs when using kunit.filter when
the executor frees dynamically allocated resources after running
boot-time tests. This resulted in fatal hardware exception due to
invalidation of capability flags on the reclaimed memory on some
architectures such as CHERI RISC-V that support the feature, and
silent memory corruption on others.
Fix for this UAF couples the coupling the lifetime of the filtered
suite memory allocation to the lifetime of the kunit subsystem and its
associated VFS nodes. Ownership of the boot-time suite_set is now
transferred to a global tracker ('kunit_boot_suites'), and the memory
is cleanly released in kunit_exit() during module teardown.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmoV6vEACgkQCwJExA0N
QxzyHBAA2POZWgHKwHip5nGp/Q+c3ttDYAnWno/ZZRARrC/bzDYxno4BLBw8oZK3
Zr9TsKYyu3JjIb6GEzS0WMNmRi/W/kfPlEELIoVFDh46EucErsoYPoyp0tNw1YgW
52lw+j9NiPHAaeev8duzOlSucCthcmw43F+cS2Ru6nsVszq8fi0LzCq9KCnrgnjT
3qzyCpzqzd7Kn/ex7DNXabJKB4jm4w3eGpyeYCYnH/02Q+pHAp8Rfftd7Mw9Q4b9
lhIlOXg1UisYfTpoTHNi960u8McbYu80jx9K6JJ6mXN4Nw/UXwoWhJsZPARYstN1
uI0TXzAaOsBOsiNSuLCkYQUSdXokqv/1eMLgL6/dMe93IqFgGYW4nqDL/RhOBiMM
e/40odgt+ep9LhAClEqHJXZtc+dtC33C6K9ZxYGzAJVk7jQ7MhVzkNh2Yk2Q6qOP
Ro+IwTWAWrVzN+l6PDltsZw6+r+2jKPB9Gcb/lSvW3XiUdNKzYL2aXiiYUqV+DgJ
U1SQhTeByN1xq0hbvCj/m7AQZp9k7T5S4OUGhuRFept5jCN2wueYWLSimRVM6wbN
rQq43XWx8z2AYeIARKFquE4SKrOb27eHHk4nDxFIk9ofNL7HQdEyScS/N4j+dN36
u5fNZOOU5pc73HU68uxK3e9wYf3cncN1RTo8fBwivOZFBn2meys=
=VKMy
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fix from Shuah Khan:
"Fix a use-after-free in kunit debugfs when using kunit.filter when the
executor frees dynamically allocated resources after running boot-time
tests. This resulted in fatal hardware exception due to invalidation
of capability flags on the reclaimed memory on some architectures such
as CHERI RISC-V that support the feature, and silent memory corruption
on others.
The fix for this couples the lifetime of the filtered suite memory
allocation to the lifetime of the kunit subsystem and its associated
VFS nodes. Ownership of the boot-time suite_set is now transferred to
a global tracker ('kunit_boot_suites'), and the memory is cleanly
released in kunit_exit() during module teardown"
* tag 'linux_kselftest-kunit-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: fix use-after-free in debugfs when using kunit.filter
Patch in Fixes: causes the usual:
unchecked MSR access error: RDMSR from 0x17 at ... (intel_get_platform_id)
Call Trace:
early_init_intel
early_cpu_init
setup_arch
_printk
start_kernel
x86_64_start_reservations
x86_64_start_kernel
common_startup_64
because the kernel is booted in a guest.
In order to avoid it, this MSR access needs to be prevented when running
virtualized. That is usually done by checking X86_FEATURE_HYPERVISOR but
for this particular case it is too early yet.
The platform ID needs to be read as early as when microcode is loaded on
the BSP:
load_ucode_bsp ... -> get_microcode_blob ... -> intel_find_matching_signature
and by that time, CPUID leafs haven't been parsed yet.
The microcode loader already has logic to check early whether the kernel
is running virtualized so make that globally available to arch/x86/. The
query whether running virtualized is getting more and more prominent in
recent times so might as well make it an arch-global var which the rest
of the code can use.
Fixes: d8630b67ca ("x86/cpu: Add platform ID to CPU info structure")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Binbin Wu <binbin.wu@linux.intel.com>
Link: https://lore.kernel.org/all/20260430020953.1405535-1-binbin.wu@linux.intel.com
Since the QPIC-SPI-NAND flash controller present in ipq5210 is the same
as the one found in ipq9574, document the ipq5210 compatible and with
ipq9574 as the fallback.
Signed-off-by: Varadarajan Narayanan <varadarajan.narayanan@oss.qualcomm.com>
Link: https://patch.msgid.link/20260514-ipq5210-nand-v1-1-cbdd7492e826@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
post-7.1 issues or aren't considered suitable for backporting.
All patches are singletons - please see the individual changelogs for
details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCahTZ2QAKCRDdBJ7gKXxA
ju+UAQDUga+l95O1iOnrraKFWvT1ghQKTgbNxGMwefHjVLLFBQD+Ln2wPfz73Ks7
H8WK0k5D0g+6lKs6tFGAALdQnTU0BAU=
=MYsv
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"13 hotfixes. 9 are for MM. 9 are cc:stable and the remaining 4 address
post-7.1 issues or aren't considered suitable for backporting.
All patches are singletons - please see the individual changelogs for
details"
* tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
Revert "mm: introduce a new page type for page pool in page type"
mm/vmalloc: do not trigger BUG() on BH disabled context
MAINTAINERS, mailmap: change email for Eugen Hristev
mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page
kernel/fork: validate exit_signal in kernel_clone()
mm: memcontrol: propagate NMI slab stats to memcg vmstats
mm/damon/sysfs-schemes: delete tried region in regions_rmdirs()
mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one
zram: fix use-after-free in zram_writeback_endio
memfd: deny writeable mappings when implying SEAL_WRITE
ipc: limit next_id allocation to the valid ID range
Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"
MAINTAINERS: .mailmap: update after GEHC spin-off
Jakub Kicinski says:
====================
ethtool: module: fix a handful of small bugs
I've been poking at the locking in ethtool and it appears
that the FW flashing is not currently taking the ops lock.
Existing drivers which implement module FW flashing seem
to have their own locking, so this series doesn't actually
add the ops lock (I'll add it in net-next). But a number
of other errors have been surfaced in the process.
====================
Link: https://patch.msgid.link/20260522231312.1710836-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
cmis_fw_update_start_download() copies start_cmd_payload_size bytes
from the firmware blob into the CDB LPL vendor_data[] payload without
validating that the FW has enough data.
Since the start_cmd_payload_size can only be ~120B an image too short
is most likely corrupted, so reject it.
Fixes: c4f78134d4 ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The CMIS firmware update code reads start_cmd_payload_size from
the module's FW Management Features CDB reply and uses it directly
as the byte count for memcpy. The destination buffer is 112 bytes
(ETHTOOL_CMIS_CDB_LPL_MAX_PL_LENGTH - 8). So a malicious
module (or corrupted response) can cause a OOB write later on in
cmis_fw_update_start_download().
Let's error out. If modules that expect longer LPL writes actually
exist we should revisit.
struct cmis_cdb_start_fw_download_pl's definition has to move,
no change there.
Fixes: c4f78134d4 ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ethtool_cmis_cdb_compose_args() accepts msleep_pre_rpl as u16 but stores
it into the u8 field ethtool_cmis_cdb_cmd_args::msleep_pre_rpl, silently
truncating values >= 256. Seven of the nine call sites pass 1000 ms
(it's the third argument from the end).
Fixes: a39c84d796 ("ethtool: cmis_cdb: Add a layer for supporting CDB commands")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Malicious SFP module could respond with rpl_len longer than
what cmis_cdb_process_reply() expected, leading to OOB writes.
Malicious HW is a bit theoretical but some modules may just
be buggy and/or the reads may occasionally get corrupted,
so let's protect the kernel.
The existing check protects from short replies. We need to
protect from long ones, too. All callers that pass a non-zero
rpl_exp_len cast the reply payload to a fixed-layout struct
and read fields at fixed offsets, with no version negotiation
or short-reply handling:
- cmis_cdb_validate_password()
- cmis_cdb_module_features_get()
- cmis_fw_update_fw_mng_features_get()
so let's assume that responses longer than expected do not
have to be handled gracefully here. Add a warning message
to make the debug easier in case my understanding is wrong...
Note that page_data->length (argument of kmalloc) comes from
last arg to ethtool_cmis_page_init() which is rpl_exp_len.
Note2 that AIs also like to point out overflows in args->req.payload
itself (which is a fixed-size 120 B buffer, on the stack),
but callers should be reading structs defined by the standard,
so protecting from requests for more data than max seem like
defensive programming.
Fixes: a39c84d796 ("ethtool: cmis_cdb: Add a layer for supporting CDB commands")
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a single Netlink socket issues MODULE_FW_FLASH_ACT against multiple
devices, ethnl_sock_priv_set() overwrites sk_priv->dev on each call,
retaining only the last one. The socket priv is used on socket close,
to walk the global work list and mark the uncompleted flashing work
as "orphaned". Otherwise if another socket reuses the PID it will
unexpectedly receive the flashing notifications.
Don't record the device, record net pointer instead. The purpose of
the dev is to scope the work to a netns, anyway. If we store netns
the overrides are safe/a nop since all flashed devices must be in
the same netns as the socket.
Fixes: 32b4c8b53e ("ethtool: Add ability to flash transceiver modules' firmware")
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ethnl_set_module_validate() inspects module_fw_flash_in_progress
but validate is meant for _input_ validation, not state validation.
rtnl_lock is not held, yet. Move the check into ethnl_set_module().
Fixes: 32b4c8b53e ("ethtool: Add ability to flash transceiver modules' firmware")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When reviewing other changes Gemini points out that we currently
update module_fw_flash_in_progress without holding any locks.
Since module_fw_flash_in_progress is part of a bitfield this
is not great, updates to other fields may be lost.
We could use a bool and sprinkle some READ_ONCE/WRITE_ONCE here
but seems like the issue is rather than the work is an unusual
writer. The other writers already hold the right locks. So just
very briefly take these locks when the work completes.
Note that nothing ever cancels the FW update work, so there's
no concern with deadlocks vs cancel.
Fixes: 32b4c8b53e ("ethtool: Add ability to flash transceiver modules' firmware")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
module_flash_fw_schedule() is missing undo for setting
the "in_progress" flag and taking the netdev reference.
Delay taking these, the device can't disappear while
we are holding rtnl_lock.
Fixes: 32b4c8b53e ("ethtool: Add ability to flash transceiver modules' firmware")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When validate() fails we are skipping over ethnl_ops_complete()
even tho we already called ethnl_ops_begin().
Fixes: 32b4c8b53e ("ethtool: Add ability to flash transceiver modules' firmware")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://patch.msgid.link/20260522231312.1710836-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
ethtool: rss: fix a handful of small bugs
Fix a handful of small bugs in the ethtool Netlink RSS code.
====================
Link: https://patch.msgid.link/20260522230647.1705600-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We wait with filling the reply for new RSS context creation
until after the driver ->create_rxfh_context call. The driver
needs to fill some of the defaults in the context. The failure
of rss_fill_reply() is somewhat theoretical, but doesn't take
much effort to handle it properly. Call ->remove_rxfh_context().
If the driver's remove callback fails (some implementations like sfc
can return real command errors from firmware RPCs) - skip the xa_erase
and kfree, leaving the context in the xarray. This matches how
ethnl_rss_delete_doit() behaves.
Fixes: a166ab7816 ("ethtool: rss: support creating contexts via Netlink")
Link: https://patch.msgid.link/20260522230647.1705600-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rss_get_data_alloc() allocates a single buffer that backs both the
indirection table and the hash key, but only assigned data->indir_table
when indir_size was nonzero. The expectation was that no driver
implements RSS without supporting indirection table but apparently
enic does just that (it's the only such in-tree driver).
enic has get_rxfh_key_size but no get_rxfh_indir_size.
data->indir_table stays as NULL, hkey gets set but rss_get_data_free()
kfree(data->indir_table) is a nop and the allocation leaks.
Always store the allocation base in data->indir_table so the free path
is unambiguous. No caller treats indir_table as a sentinel; everything
keys off indir_size.
Fixes: 7112a04664 ("ethtool: add netlink based get rss support")
Link: https://patch.msgid.link/20260522230647.1705600-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rss_prepare_get() allocates the indirection table and hash key buffer
via rss_get_data_alloc(), then calls ops->get_rxfh() to populate them.
If get_rxfh() fails, the function returns an error without freeing
the allocation.
Fixes: 4f038a6a02 ("net: ethtool: Don't call .cleanup_data when prepare_data fails")
Link: https://patch.msgid.link/20260522230647.1705600-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rss_set_prep_indir() compares the new indirection table against the
current one to determine whether any update is needed. The memcmp
call passes data->indir_size as the length argument, but indir_size
is the number of u32 entries, not the byte count.
Fixes: c0ae03588b ("ethtool: rss: initial RSS_SET (indirection table handling)")
Link: https://patch.msgid.link/20260522230647.1705600-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remember to set ret before jumping out if someone tries
to delete a context on a device which doesn't support
contexts.
Fixes: fbe09277fa ("ethtool: rss: support removing contexts via Netlink")
Link: https://patch.msgid.link/20260522230647.1705600-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gemini says that we're modifying the RSS_CREATE response skb.
I think it's right, the comment says that unicast() should
unshare the skb but I'm not entirely sure what I meant there.
netlink_trim() does a copy but only if skb is not well sized
(it's at least 2x larger than necessary for the payload).
Fixes: a166ab7816 ("ethtool: rss: support creating contexts via Netlink")
Link: https://patch.msgid.link/20260522230647.1705600-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
socfpga_smp_prepare_cpus() looks up the Cortex-A9 SCU node with
of_find_compatible_node(), which returns a node reference that must be
released with of_node_put().
The function maps the SCU registers and then returns without dropping
that reference, leaking the node on both the success path and the
of_iomap() failure path.
Drop the reference once the mapping attempt is complete. The returned
MMIO mapping does not depend on keeping the device node reference held.
Fixes: 122694a0c7 ("ARM: socfpga: use of_iomap to map the SCU")
Cc: stable@vger.kernel.org
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
pskb_trim_rcsum_slow() keeps CHECKSUM_COMPLETE valid by subtracting
the checksum of the bytes removed from the skb tail. That assumes the
removed bytes can be read.
io_uring zcrx skbs may contain unreadable net_iov frags. With fbnic
header/data split, small TCP/IPv4 packets can carry Ethernet padding
in such a frag. ip_rcv_core() trims the skb to iph->tot_len before TCP
sees it, and the CHECKSUM_COMPLETE adjustment then calls
skb_checksum() on the padding.
This is exposed by IPv4 because small TCP/IPv4 frames can be shorter
than the Ethernet minimum payload. TCP/IPv6 frames are large enough in
the normal zcrx path, so they do not hit the same padding trim.
Keep the existing checksum adjustment for readable skbs. If the
remaining packet is fully linear, drop CHECKSUM_COMPLETE and let the
stack validate the packet after trimming. If unreadable payload would
remain, fail the trim; the checksum cannot be adjusted without reading
the trimmed tail.
Also clear skb->unreadable when trimming removes all frags.
Fixes: 65249feb6b ("net: add support for skbs with unreadable frags")
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260522120643.242974-1-bjorn@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Dan reported a possible NULL pointer dereference in amd-pstate-ut.c from
static analysis and sure enough, running amd-pstate-ut in active mode
with amd_dynamic_epp=enable results in a crash as a reult of the policy
reference being set to NULL early, before disabling dynamic EPP.
Kalpana also reported seeing amd-pstate-ut error out with -EBUSY for
"amd_pstate_ut_epp" test when starting from the passive mode and
amd_dynamic_epp=enable in the command line. The reason for the failure
is that the command line enables dynamic_epp by default after the mode
switch and the modifications to EPP values are blocked when running in
dynamic EPP mode.
Solution to both problems is to toggle off dynamic_epp *after* the mode
switch when the driver grabs the policy reference again since the unit
test is in full control of the policy after that point.
The final restoration step will reset the dynamic_epp state via mode
switch based on the initial conditions of the system.
Reported-by: Kalpana Shetty <kalpana.shetty@amd.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/linux-pm/ahEq0CvdBX0T7_cO@stanley.mountain/
Fixes: f9f16835d4 ("cpufreq/amd-pstate-ut: Drop policy reference before driver switch")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260523055503.7651-1-kprateek.nayak@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Maoyi Xie says:
====================
ip6_vti: vti6_changelink and vti6_siocdevprivate netns fixes
1/2 carries forward Eric Dumazet's Reviewed-by. Only the Fixes
tag changes there. 2/2 changes the Fixes tag and adds the
ns_capable hunk.
====================
Link: https://patch.msgid.link/20260521130555.3421684-1-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
After patch 1/2 in this series, vti6_update() unlinks and relinks
the tunnel through t->net. vti6_siocdevprivate() still uses
dev_net(dev) for the collision lookup. For a tunnel moved through
IFLA_NET_NS_FD, dev_net(dev) is the new netns, not t->net.
SIOCCHGTUNNEL on a migrated tunnel then runs:
net = dev_net(dev) /* migrated netns */
t = vti6_locate(net, &p1, false) /* misses target in t->net */
...
t = netdev_priv(dev)
vti6_update(t, &p1, false) /* mutates t->net's hash */
A caller in the migrated netns picks params that match a tunnel
in the creation netns. The lookup in dev_net(dev) finds nothing.
vti6_update() prepends the migrated tunnel at the head of the
creation netns hash bucket for those params. Later lookups in
the creation netns resolve to the migrated device. xfrm receive
delivers the matched packets through a device the caller controls.
Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net). Cross tenant scope on container hosts.
Switch the SIOCCHGTUNNEL path on a non fallback device to use
t->net for the lookup. The lookup now matches the netns
vti6_update() operates on.
Also add ns_capable(self->net->user_ns, CAP_NET_ADMIN) before
the lookup. The check at the top of the case is against
dev_net(dev)->user_ns, which after migration is the attacker's
netns. A caller there can pick params absent from self->net,
the lookup returns NULL, t becomes self, and vti6_update()
inserts the device into the creation netns hash. The new check
requires CAP_NET_ADMIN in the creation netns user_ns too.
SIOCADDTUNNEL and SIOCCHGTUNNEL on the fallback device keep
dev_net(dev), which equals init_net there.
Fixes: 61220ab349 ("vti6: Enable namespace changing")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Xiao Liang <shaw.leon@gmail.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
Link: https://patch.msgid.link/20260521130555.3421684-3-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
ip netns add ns1
ip netns add ns2
ip -n ns1 link add vti6_test type vti6 remote ::1 local ::2 key 7
ip -n ns1 link set vti6_test netns ns2
ip -n ns2 link set vti6_test type vti6 remote ::3 local ::4 key 9
ip netns del ns2
ip netns del ns1
[ 132.495484] ------------[ cut here ]------------
[ 132.497609] kernel BUG at net/core/dev.c:12376!
Commit 61220ab349 ("vti6: Enable namespace changing") dropped
NETIF_F_NETNS_LOCAL from vti6 devices. A vti6 tunnel can then
move through IFLA_NET_NS_FD. After the move dev_net(dev) points
at the new netns while t->net stays at the creation netns.
vti6_changelink() and vti6_update() still use dev_net(dev) and
dev_net(t->dev). They unlink from one per netns hash and relink
into another. The creation netns is left with a stale entry.
cleanup_net() of that netns later walks freed memory.
Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net). Cross tenant scope on container hosts.
Fixes: 61220ab349 ("vti6: Enable namespace changing")
Reported-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260521130555.3421684-2-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
__team_change_mode() clears team->ops with memset() before restoring
safe dummy handlers via team_adjust_ops(). A concurrent team_xmit()
running under RCU on another CPU can read team->ops.transmit during
this window and call a NULL function pointer, crashing the kernel.
The race requires a mode change (CAP_NET_ADMIN) concurrent with
transmit on the team device.
BUG: kernel NULL pointer dereference, address: 0000000000000000
Oops: 0010 [#1] SMP KASAN NOPTI
RIP: 0010:0x0
Call Trace:
team_xmit (drivers/net/team/team_core.c:1853)
dev_hard_start_xmit (net/core/dev.c:3904)
__dev_queue_xmit (net/core/dev.c:4871)
packet_sendmsg (net/packet/af_packet.c:3109)
__sys_sendto (net/socket.c:2265)
The original code assumed that no ports means no traffic, so mode
changes could freely memset()/memcpy() the ops. AF_PACKET with
forced carrier breaks that assumption.
Prevent the race instead of making it safe: replace memset()/memcpy()
with per-field updates that never touch transmit or receive. Those
two handlers are managed solely by team_adjust_ops(), which already
installs dummies when tx_en_port_count == 0 (always true during mode
change since no ports are present). WRITE_ONCE/READ_ONCE prevent
store/load tearing on the handler pointers.
synchronize_net() before exit_op() drains in-flight readers that may
still reference old mode state from before port removal switched the
handlers to dummies.
Fixes: 3d249d4ca7 ("net: introduce ethernet teaming device")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260521081159.1491563-3-bestswngs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Transport-mode reinjection stores a struct net pointer in skb->cb and
uses it later from xfrm_trans_reinject(). That pointer must stay valid
until the deferred callback runs.
Take a netns reference when queueing deferred reinjection work and drop
it after the callback completes. Use maybe_get_net() so the queueing
path does not revive a namespace that is already being torn down.
This keeps the existing workqueue design and fixes the netns lifetime
handling in one place for all users of xfrm_trans_queue_net().
Fixes: 7b3801927e ("xfrm: introduce xfrm_trans_queue_net")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Assisted-by: Codex:gpt-5.4
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The struct pernet_operations docstring in include/net/net_namespace.h
explicitly warns against blocking RCU primitives in .exit handlers:
Exit methods using blocking RCU primitives, such as
synchronize_rcu(), should be implemented via exit_batch.
[...]
Please, avoid synchronize_rcu() at all, where it's possible.
Note that a combination of pre_exit() and exit() can
be used, since a synchronize_rcu() is guaranteed between
the calls.
xfrm_policy_fini() violates this: it calls synchronize_rcu() before
freeing the policy_bydst hash tables (so no RCU reader is mid-
traversal at free time), but runs from xfrm_net_ops.exit -- once per
namespace -- so a cleanup_net() of N namespaces pays N full RCU
grace periods serially.
Use the documented pre_exit/exit split. Move the policy flush (and
the workqueue drains it depends on) into a new .pre_exit handler;
xfrm_policy_fini() then runs in .exit and frees the hash tables
after the synchronize_rcu_expedited() that cleanup_net() guarantees
between the two phases. Providing O(1) RCU grace periods per batch
instead of O(N).
Observed on Linux 6.18 with a workload doing unshare(CLONE_NEWNET)
at ~13/sec sustained: cleanup_net() and the netns_wq rescuer kthread
both stuck in xfrm_policy_fini()'s synchronize_rcu(), >300k struct
net accumulated in the cleanup queue, Percpu in /proc/meminfo climbed
to 130+ GB on 256-CPU hosts, and memcg OOMs followed. setup_net and
__put_net counts were balanced, ruling out a refcount leak.
Fixes: 069daad4f2 ("xfrm: Wait for RCU readers during policy netns exit")
Signed-off-by: Usama Arif <usama.arif@linux.dev>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
iptfs_clone_state() clones the IPTFS mode data with kmemdup(). This
copies runtime objects which must not be shared with the original SA,
including the embedded sk_buff_head, hrtimers, spinlock, and in-flight
reassembly/reorder state.
If xfrm_state_migrate() fails after clone_state() but before the later
init_state() call has reinitialized those fields, the cloned state can be
destroyed by xfrm_state_gc_task() with list and timer state copied from the
original SA. With queued packets this lets the clone splice and free skbs
owned by the original IPTFS queue, leading to use-after-free and
double-free reports in iptfs_destroy_state() and skb release paths.
Reinitialize the clone's runtime state before publishing it through
x->mode_data. Because clone_state() now publishes a destroyable mode_data
object before init_state(), take the mode callback module reference there.
Avoid taking it again from __iptfs_init_state() for the same object.
Fixes: 0e4fbf013f ("xfrm: iptfs: add user packet (tunnel ingress) handling")
Cc: stable@vger.kernel.org
Signed-off-by: Shaomin Chen <eeesssooo020@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
We are observing following warnings:
*ERROR* power well DC_off state mismatch (refcount 0/enabled 1)
gen9_dc_off_power_well_enabled is considering target state DC_STATE_DISABLE
as DC_OFF power well being enabled. Fix this by using wakeref for the
purpose.
To achieve this we need to modify notification code as well. Currently it
is possible that PSR gets notified vblank enable/disable twice on same
status. This is currently not a problem as it is just triggering call to
intel_display_power_set_target_dc_state with same target state as a
parameter. When using wakeref this becomes a problem due to reference
counting. Fix this storing vbank status on last notification and use that
to ensure there are no more than one notification with same vblank status.
v2: ensure there is no subsequent notifications with same status
Fixes: aa451abcff ("drm/i915/display: Prevent DC6 while vblank is enabled for Panel Replay")
Cc: <stable@vger.kernel.org> # v6.13+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Link: https://patch.msgid.link/20260520104944.239797-2-jouni.hogander@intel.com
(cherry picked from commit 35485ac56d878192a3829a58cb26503125ec7104)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Currently we are blocking DC states only when Panel Replay is enabled on
vblank enable. It may happen that Panel Replay is getting enabled when
vblank is already enabled. Fix this by blocking DC states always if Panel
Replay is supported.
While at it take care of possible dual eDP case by looping all encoders
supporting PSR.
Fixes: 0c427ac78a ("drm/i915/psr: Add interface to notify PSR of vblank enable/disable")
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Link: https://patch.msgid.link/20260520104944.239797-1-jouni.hogander@intel.com
(cherry picked from commit eb5911f990554f7ce947dd53df00c114362e4465)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
PTL with physically disconnected display was observed to have 40s longer
execution time when testing xe_fault_injection@xe_guc_mmio_send_recv.
The issue has not been seen when reverting commit 40a9f77a28 ("Revert
"drm/i915/dp: change aux_ctl reg read to polling read"").
Apparently the configuration suffers from not having AUX enabled when
using interrupts. One probable cause can be xe enabling interrupts too
late: interrupts need memory allocations which currently can't be done
before the display FB takeover is done.
As for now, use polling for AUX in case interrupts are unavailable.
Fixes: 40a9f77a28 ("Revert "drm/i915/dp: change aux_ctl reg read to polling read"")
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Michał Grzelak <michal.grzelak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260416163744.288107-1-michal.grzelak@intel.com
(cherry picked from commit 05e0550b65cd1604bd515fbc65f522bce4c10a87)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
TLDR: The bo->ttm object might be changed by calling ttm_bo_validate(),
move casting it to an i915_tt object later to actually get the right
pointer.
A user reported hitting the following bug under heavy use on DG2:
[26620.095550] Oops: general protection fault, probably for non-canonical address 0xa56b6b6b6b6b6b8b: 0000 1 SMP NOPTI
[26620.095556] CPU: 2 UID: 0 PID: 631 Comm: Xorg Not tainted 6.18.8 #1 PREEMPT(lazy)
[26620.095558] Hardware name: ASRock B850M Steel Legend WiFi/B850M Steel Legend WiFi, BIOS 3.50 09/18/2025
[26620.095559] RIP: 0010:i915_ttm_purge+0x84/0x100 [i915]
[26620.095604] Code: 00 00 00 48 8d 54 24 10 48 89 e6 48 89 fb e8 83 aa ae ff 85 c0 75 6f 48 83 bb a8 01 00 00 00 74 2c 48 8b 45 78 48 85 c0 74 23 <48> 8b 78 20 48 c7 c2 ff ff ff ff 31 f6 e8 7a 73 e3 e0 48 8b 7d 78
[26620.095605] RSP: 0018:ffffc90005fd7430 EFLAGS: 00010282
[26620.095607] RAX: a56b6b6b6b6b6b6b RBX: ffff8881f46c3dc0 RCX: 0000000000000000
[26620.095608] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 00000000ffffffff
[26620.095609] RBP: ffff888289610f00 R08: 0000000000000001 R09: ffff88823b022000
[26620.095609] R10: ffff888103029b28 R11: ffff8881fc7f3800 R12: ffff88810b6150d0
[26620.095609] R13: ffff888289610f00 R14: 0000000000000000 R15: ffff8881f46c3dc0
[26620.095610] FS: 00007f1004d86900(0000) GS:ffff88901c858000(0000) knlGS:0000000000000000
[26620.095611] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[26620.095611] CR2: 00007f0fdf489000 CR3: 000000035b0c1000 CR4: 0000000000750ef0
[26620.095612] PKRU: 55555554
[26620.095612] Call Trace:
[26620.095615] <TASK>
[26620.095615] i915_ttm_move+0x2b9/0x420 [i915]
[26620.095642] ? ttm_tt_init+0x65/0x80 [ttm]
[26620.095644] ? i915_ttm_tt_create+0xc6/0x150 [i915]
[26620.095667] ttm_bo_handle_move_mem+0xb6/0x160 [ttm]
[26620.095669] ttm_bo_evict+0x100/0x150 [ttm]
[26620.095671] ? preempt_count_add+0x64/0xa0
[26620.095673] ? _raw_spin_lock+0xe/0x30
[26620.095675] ? _raw_spin_unlock+0xd/0x30
[26620.095675] ? i915_gem_object_evictable+0xb7/0xd0 [i915]
[26620.095704] ttm_bo_evict_cb+0x6e/0xd0 [ttm]
[26620.095705] ttm_lru_walk_for_evict+0xa6/0x200 [ttm]
[26620.095708] ttm_bo_alloc_resource+0x185/0x4f0 [ttm]
[26620.095709] ? init_object+0x62/0xd0
[26620.095712] ttm_bo_validate+0x7a/0x180 [ttm]
[26620.095713] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095714] __i915_ttm_get_pages+0xb0/0x170 [i915]
[26620.095737] i915_ttm_get_pages+0x9f/0x150 [i915]
[26620.095759] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915]
[26620.095786] ? alloc_debug_processing+0xd0/0x100
[26620.095787] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095788] ? i915_vma_instance+0xa0/0x4e0 [i915]
[26620.095822] __i915_gem_object_get_pages+0x2f/0x40 [i915]
[26620.095848] i915_vma_pin_ww+0x706/0x980 [i915]
[26620.095875] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915]
[26620.095904] eb_validate_vmas+0x170/0xa00 [i915]
[26620.095930] i915_gem_do_execbuffer+0x1201/0x2b40 [i915]
[26620.095953] ? alloc_debug_processing+0xd0/0x100
[26620.095954] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095955] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915]
[26620.095977] ? __wake_up_sync_key+0x32/0x50
[26620.095979] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915]
[26620.096001] ? __slab_alloc.isra.0+0x67/0xc0
[26620.096003] i915_gem_execbuffer2_ioctl+0x11a/0x240 [i915]
Results from decode_stacktrace.sh pointed to dereference of a file pointer
field of a i915 TTM page vector container associated with an object being
purged on eviction. That path is taken when the object is marked as no
longer needed.
Code analysis revealed a possibility of the i915 TTM page vector container
being replaced with a new instance inside a function that purges content
of the object, should it be still busy. That function is called,
indirectly via a more general function that changes the object's placement
and caching policy, before the problematic dereference, but still after
a pointer to the container is captured, rendering the pointer no longer
valid.
Fix the issue by capturing the pointer to the container only after its
potential replacement.
v2: Move the container_of() inside the if block (Sebastian),
- a simplified version of the commit description that explains briefly
why the change is necessary (Christian).
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/14882
Fixes: 7ae034590c ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.17+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20260508122612.469227-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 4462966a93eb185849b7f174f0d0de53476d00a4)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
If port->irq_high is -1 (fsl,imx21-gpio compatible) and gpio_idx is >= 16
enable_irq_wake() is called with -1 which is wrong.
Fixes: 5f6d1998ad ("gpio: mxc: release the parent IRQ in runtime suspend")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260526063504.25916-1-alexander.stein@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Commit 91e74fa8b1 ("kho: make sure preservations do not span multiple
NUMA nodes") made sure preservations from kho_preserve_pages() do not
span multiple NUMA nodes. If they do, the order is reduced and tried
again.
The same logic was not implemented for kho_unpreserve_pages(). This can
result in unpreserve calculating a different order than preserve, and
thus not actually unpreserving the pages.
Fix this by moving the order calculation logic to
__kho_preserve_pages_order() and use it from both preserve and
unpreserve paths.
Move __kho_unpreserve() down to avoid having a forward declaration. Its
users are further down in the file anyway. Also, it results in grouping
for all the page-level preservation and unpreservation functions. This
unfortunately makes the diff hard to read, but the main change in
__kho_unpreserve() is to call __kho_preserve_pages_order() instead of
open-coding the order calculation.
Fixes: 91e74fa8b1 ("kho: make sure preservations do not span multiple NUMA nodes")
Cc: stable@vger.kernel.org
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://patch.msgid.link/20260519133332.2498092-1-pratyush@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
KHO_TREE_MAX_DEPTH is calculated as:
DIV_ROUND_UP(KHO_ORDER_0_LOG2 - KHO_BITMAP_SIZE_LOG2,
KHO_TABLE_SIZE_LOG2) + 1
For systems with 16KB pages (e.g. arm64 with CONFIG_ARM64_16K_PAGES=y or
LoongArch), this gives a depth of 4. Since levels are 0 based, with
depth = 4 the effective top level is 3 and the top-level shift at bit 39.
PAGE_SHIFT = 14
KHO_BITMAP_SIZE_LOG2 = PAGE_SHIFT + 3 = 17
KHO_TABLE_SIZE_LOG2 = log(2; (1 << PAGE_SHIFT) / 8) = 11
shift = ((3 - 1) * KHO_TABLE_SIZE_LOG2) + KHO_BITMAP_SIZE_LOG2 = 39
The order-0 bit sits at bit 50 (KHO_ORDER_0_LOG2 = 64 - PAGE_SHIFT =
50). When inserting or reading a key, the index extracted at the top
level is:
(1 << 50) >> 39 = 2048
2048 is exactly the table size (PAGE_SIZE / sizeof(phys_addr_t) = 2048
for 16KB pages), so it wraps to 0, aliasing the order bit to index 0
and losing it silently.
On the second kernel, kho_radix_decode_key() sees a key without the
order bit, calls fls64() on the wrong bit, computes a wrong order and
thus a garbage physical address. phys_to_page() of that address faults
in kho_preserved_memory_reserve(), causing a kernel panic early in boot.
Fix by adding +1 to the DIV_ROUND_UP numerator so the formula accounts
for the order bit itself, giving depth 5 for 16KB pages. The top-level
shift becomes 50, and (1 << 50) >> 50 = 1, which is nonzero and
unambiguous. For 4KB and 64KB page sizes the depth is unchanged.
Link: https://patch.msgid.link/20260509024415.33190-1-dongtai.guo@linux.dev
Fixes: 3f2ad90060 ("kho: adopt radix tree for preserved memory tracking")
Tested-by: Kexin Liu <liukexin@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
[rppt: added actual math to the changelog]
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
The simple_write_to_buffer() will only initialize data starting from
the *pos offset so if it's non-zero then the first part of the buffer
uninitialized. Really, if *pos is non-zero then this code won't work
so just check for that at the start of the function.
Fixes: 320323d2e5 ("accel/ivpu: Add debugfs interface for setting HWS priority bands")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/ahP24m6Mii9EDL7Q@stanley.mountain
This option was never meant to be used in production because it solely
clears the X86_FEATURE kernel-internal representation of what CPUID bits
it has detected and doesn't do any *proper* feature disablement like
clearing CR4.CET in the user shadow stack case, for example.
So remove its documentation so that it doesn't get used in production
and people get silly ideas. It is meant strictly for debugging; and if
a chicken bit for properly disabling a feature is warranted, then that
would need proper enablement.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://patch.msgid.link/20260520202508.160112-1-bp@kernel.org
Ensure the entire TLV header is linearized before access by adding
sizeof(struct hsr_sup_tlv) to the pskb_may_pull() calls. Without this,
a truncated frame could cause an out-of-bounds access.
Fixes: eafaa88b3e ("net: hsr: Add support for redbox supervision frames")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260523130330.61880-1-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rvu_mbox_handler_rep_event_notify() in drivers/net/ethernet/marvell/
octeontx2/af/rvu_rep.c queues a sender-controlled REP_EVENT_NOTIFY
request body verbatim, and rvu_rep_up_notify() then forwards
event->pcifunc (the nested body field, distinct from the
AF-normalised header pcifunc) into rvu_get_pfvf(), rvu_get_pf() and
the AF->PF mailbox device index without any bounds check.
A VF attached to a PF that has been put into switchdev
representor mode reaches this path: the VF mailbox handler
otx2_pfvf_mbox_handler() forwards every message id including
MBOX_MSG_REP_EVENT_NOTIFY to AF without an allowlist, and the AF
dispatcher rewrites only msg->pcifunc, leaving struct
rep_event::pcifunc attacker-controlled. The sibling
rvu_mbox_handler_esw_cfg() refuses requests whose header pcifunc
is not rvu->rep_pcifunc; this handler has no equivalent gate.
An out-of-range body pcifunc selects an &rvu->pf[]/&rvu->hwvf[]
element past the allocated array and, for RVU_EVENT_MAC_ADDR_CHANGE,
turns into a six-byte attacker-chosen OOB ether_addr_copy() target
inside the queued worker; KASAN reports a slab-out-of-bounds write
in rvu_rep_wq_handler.
Reject malformed requests at the handler entry by gating on
is_pf_func_valid(), which is already the canonical PF/VF range check
in this driver; expose it via rvu.h so callers in rvu_rep.c can use
it instead of open-coding the same range arithmetic.
Fixes: b8fea84a04 ("octeontx2-pf: Add support to sync link state between representor and VFs")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260520154157.1439319-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRnH8MwLyZDhyYfesYTAyx9YGnhbQUCahRHEhQcbXBhdG9ja2FA
cmVkaGF0LmNvbQAKCRATAyx9YGnhbTeRAQC469Voj6ymkvJ6QqQRaz3u71nofiyw
ZwqHyq+HfoOXtgD/UPll03KkKRvZapI9+YWwaugH4FLcw2tZLUb15bBugQ8=
=59tH
-----END PGP SIGNATURE-----
Merge tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull hpfs fix from Mikulas Patocka:
- Fix a crash on corrupted filesystem
* tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
hpfs: fix a crash if hpfs_map_dnode_bitmap fails
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRnH8MwLyZDhyYfesYTAyx9YGnhbQUCahQ/khQcbXBhdG9ja2FA
cmVkaGF0LmNvbQAKCRATAyx9YGnhbfzPAP9KuvzdAfxkV3jixoATrLYjWRxXeLO8
VvUo60xhIan1WAEAyd46noLrXwINvdCYzdHDUeLNa5pb4mclwYcwk69dBw0=
=1g2Q
-----END PGP SIGNATURE-----
Merge tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:
- fix crashes in dm-vdo if GFP_NOWAIT allocation fails
* tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: use GFP_NOIO for blkdev_issue_zeroout on format path
In macsec_post_decrypt(), when pn is U32_MAX, pn + 1 overflows u32 to 0
and the first branch never fires. If next_pn_halves.lower is also in the
upper half, pn_same_half(pn, lower) is true and the XPN else-if does not
fire either, leaving next_pn_halves unchanged. An attacker that captures
the legitimate frame carrying pn == 0xFFFFFFFF on an XPN association
can then replay it indefinitely, since lowest_pn never rises above
the captured pn and macsec_decrypt() reconstructs the same IV.
Extend the XPN else-if to also fire when pn + 1 wraps to 0, so receipt
of pn == U32_MAX advances next_pn_halves to (upper + 1, 0).
Fixes: a21ecf0e03 ("macsec: Support XPN frame handling - IEEE 802.1AEbw")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Link: https://patch.msgid.link/SYBPR01MB78813FD49E58F253B989F197AF012@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- tools/bootconfig: Fix buf leaks in apply_xbc
If data memory allocation failed, free the buf before return.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmoTCt0bHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b8NEH/RBpnMMjF488qq3Uv8a7
0VpUd2YGvzKjybO0t/reOYtVGNbnbUlAqIF/nAZpynGJ9e8/u/+FUOq4D7GTBD9J
18wwhf9mtNTyo+awzclpf6O7aUu+xM0+J4O1uWHKsY6nEyp//koLGOW+jQ1ju5JK
b7jHuvLeWvEhJ0pg0CahdCqdpAPSe064zue2szHop0gywnuaYX7MjNLVdk5MHfdq
XbiKb2M15ZllD4hYBCHKBsJevtxcjdBx5Jwj3YKzt4K9gedNnIUE8Lliar7inGE4
syCQtc1V8Un/v5XVk+24rPGtRnDGrBASWcfgKURIv/bFXqqFI2wzl0+Po9c7R12L
u6U=
=NBvF
-----END PGP SIGNATURE-----
Merge tag 'bootconfig-fixes-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig fix from Masami Hiramatsu:
- Fix buf leak in apply_xbc
* tag 'bootconfig-fixes-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/bootconfig: Fix buf leaks in apply_xbc
ip6_parse_tlv() caches skb_network_header(skb) in nh while walking
IPv6 TLVs.
ipv6_dest_hao() may call pskb_expand_head() for a cloned skb, which can
move the skb head and invalidate the cached network header pointer.
Refresh nh after ipv6_dest_hao() returns so any trailing padding or TLVs
are parsed from the current skb head.
This matches the existing pattern used in ip6_parse_tlv() after helpers
that can modify skb header storage.
Fixes: a831f5bbc8 ("[IPV6] MIP6: Add inbound interface of home address option.")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/7aba1debc2196189172499e5769802b026f8caf8.1779247873.git.zcliangcn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When memblock_free() is called after memblock_discard() on architectures
that don't select ARCH_KEEP_MEMBLOCK, it tries to update memblock.reserved
that was already discarded and it causes use-after-free, for example
[ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
[ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
[ 8.514775] Call Trace:
[ 8.514775] <TASK>
[ 8.514775] kasan_report+0xb2/0x1b0
[ 8.514775] memblock_isolate_range+0x4ac/0x650
[ 8.514775] memblock_phys_free+0xc4/0x190
[ 8.514775] housekeeping_late_init+0x257/0x280
[ 8.514775] do_one_initcall+0xaa/0x470
[ 8.514775] do_initcalls+0x1b4/0x1f0
[ 8.514775] kernel_init_freeable+0x4b5/0x550
[ 8.514775] kernel_init+0x1c/0x150
[ 8.514775] ret_from_fork+0x5dc/0x8e0
[ 8.514775] ret_from_fork_asm+0x1a/0x30
[ 8.514775] </TASK>
Make sure memblock_free() updates memblock.reserved only when called early
enough or when ARCH_KEEP_MEMBLOCK is enabled.
Reported-by: Waiman Long <longman@redhat.com>
Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/all/20260505051821.1107133-1-longman@redhat.com
Tested-by: Waiman Long <longman@redhat.com>
Tested-by: Breno Leitao <leitao@debian.org>
Fixes: 87ce9e83ab ("memblock, treewide: make memblock_free() handle late freeing")
Link: https://patch.msgid.link/20260513105122.502506-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJdBAABCABHFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmoQL+UbFIAAAAAABAAO
bWFudTIsMi41KzEuMTIsMiwyDRxmd0BzdHJsZW4uZGUACgkQcJGo2a1f9gDS/RAA
rLqfCwVendInfbwAZ/ldOdfRWgpMISHHgG0mYtoSUMtaDs4X5s2azOFk4A3pClZF
Or8SgVhbBSlrEOSxJKYimH81oqXc15mHpj0BYUcqZsCJoDBAcbwafh2alVKUrzoY
JaKrwEVgpGH8wXsjExmzXXWsRrQfYGlI+IvoCn2PlfY0Ex7lQdXIv1brpmkkHJ3Q
Ho9Juf0yywe0sh/tLqH9qG7aDImkafz32lGWjjFkTMaSTqfNH1ENNLWIdskObmpk
4YVncco8zrrRqHg76HIJdqaY9UminWognlIrlTHbB3G1VoD5fCEHlmREWBPtEE7V
qqB13ec3DSJgrNY/hR0mwz5TV+dPiD8M39SWtlIhwJIQYOGUgNimsuAR/Nze5Gdl
j7tedKViS8MOlDINhHhVBVKbrr74B6f4L+5gSIRWUEJKSP/VIqbDQ0AI86FqPUiN
shEPCOtNxcJDalBALWHqBd7vm50OWiB6ZcqjabOzX5vyHsgRMPsaqYAbHVW36uhP
NGLQywu7oq+4AKJX6OALEMIP6QtiKtV6q5p93b5ICv0/wlv+S/M/i81PP86DJ7n0
ZUZ2vKRaupqTObb8Au4Qow96/KRZgBPaZy3lNt35KFKG/xLfOcXF309OfD8xmhjV
fOCje1LMktxoaMtFuFiLHH8SPn48AwTxZwQglkToMFc=
=aOa1
-----END PGP SIGNATURE-----
Merge tag 'nf-26-05-22' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: updates for net
Patches 7+8 fix a regression from 7.1-rc1. Everything else
is from 2.6.x to 5.3 releases. There are additional known
issues with these patches (drive-by-findings in related code).
There are many old bugs all over netfilter and our ability to review
feature patches has come to a complete halt due to lack of time.
There are further security bugs that we cannot address
due to lack of time, maintainers and reviewers.
Other remarks: The xtables 32bit compat interface is already
off in many vendor kernels, the plan is to remove it soon.
1) Prevent RST packets with invalid sequence numbers from forcing TCP
connections into the CLOSE state without a direction check.
From Hamza Mahfooz.
2) Re-derive the TCP header pointer after skb_ensure_writable in
synproxy_tstamp_adjust. Prevent use-after-free and invalid checksum
updates caused by stale pointers during buffer expansion.
From Chris Mason.
3) Fix a race condition causing keymap list corruption in conntracks gre/pptp
helper.
4) Use raw_smp_processor_id() in xt_cpu to prevent splats under
PREEMPT_RCU.
5) Disable netfilter payload mangling in user namespaces (nft_payload.c
and nf_queue).
TCP option mangling via nft_exthdr.c remains enabled.
There will be followups here to restrict resp. revalidate
headers.
6) Fix an out-of-bounds read in ebtables's compat_mtw_from_user function.
7) Use list_for_each_entry_rcu() to traverse fib6_siblings in
nft_fib6_info_nh_uses_dev(). Ensure safe list walking under RCU.
8) Fix an out-of-bounds read in nft_fib_ipv6 caused by incorrect list
traversal.
9) Add nft_fib_nexthop selftest to netfilter. Cover nexthop enumeration for
single, group, and multipath route shapes.
All three nft_fib6 fixes from Jiayuan Chen.
10) Fix destination corruption in shift operations when source and destination
registers overlap. Reject partial register overlap for all operations
from control plane. From Fernando Fernandez Mancera.
* tag 'nf-26-05-22' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fix dst corruption in same register operation
selftests: netfilter: add nft_fib_nexthop test
netfilter: nft_fib_ipv6: handle routes via external nexthop
netfilter: nft_fib_ipv6: walk fib6_siblings under RCU
netfilter: ebtables: fix OOB read in compat_mtw_from_user
netfilter: disable payload mangling in userns
netfilter: xt_cpu: prefer raw_smp_processor_id
netfilter: nf_conntrack_gre: fix gre keymap list corruption
netfilter: synproxy: refresh tcphdr after skb_ensure_writable
netfilter: conntrack: tcp: do not force CLOSE on invalid-seq RST without direction check
====================
Link: https://patch.msgid.link/20260522104257.2008-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlx5_cmd_hws_packet_reformat_alloc() handles
MLX5_REFORMAT_TYPE_REMOVE_HDR by looking up a matching HWS remove-header
action.
If mlx5_fs_get_action_remove_header_vlan() returns NULL, the code only
logs an error and continues. The function then returns success with a NULL
HWS action stored in the packet-reformat object.
Return an error when no matching remove-header action is available.
Fixes: aecd9d1020 ("net/mlx5: fs, add HWS packet reformat API function")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260506000054.51797-1-prathameshdeshpande7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> says:
Here is the set of patches, that fixes one of the isssue reported by
Richard Acayan, while doing fix for the reported issue, found various
other issues in the existing code.
This set contains some of those cleanups along with few trivial coding
style patches which looked uncomfortable to read.
Patch 1 should be enough to fix the issue reported.
Tested this is on UNO-Q.
Link: https://patch.msgid.link/20260518092347.3446946-1-srinivas.kandagatla@oss.qualcomm.com
The ASM_CLIENT_EVENT_DATA_WRITE_DONE case does not declare any local
variables or require a separate scope, so drop the unnecessary braces.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260518092347.3446946-5-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Fix error handling in q6asm_dai_compr_set_params() and q6asm_dai_prepare()
for both CMD_CLOSE and q6asm_unmap_memory_regions().
In both the functions, we are doing q6asm_audio_client_free in failure
cases, which means if prepare or set_params fail, we can never recover.
Now open and close are done in respective dai_open/close functions.
Fixes: 2a9e92d371 ("ASoC: qdsp6: q6asm: Add q6asm dai driver")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260518092347.3446946-4-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
q6asm_dai_close() and q6asm_dai_compr_free() currently issue CMD_CLOSE
whenever prtd->state is non-zero.
After prepare() closes an existing stream, the state is updated to
Q6ASM_STREAM_STOPPED. Since this state is also non-zero, the close and
free paths can send CMD_CLOSE again for a stream that has already been
closed.
Restrict CMD_CLOSE to the Q6ASM_STREAM_RUNNING state so the command is
sent only when the ASM stream is still active.
Fixes: 2a9e92d371 ("ASoC: qdsp6: q6asm: Add q6asm dai driver")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260518092347.3446946-3-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The q6asm-dai stream state is used by prepare() to decide whether an
existing stream setup needs to be closed before opening/configuring a new
one. Updating the state from trigger or asynchronous DSP callbacks can make
that state stale or incorrect relative to the actual setup lifetime.
In particular, setting Q6ASM_STREAM_STOPPED on STOP or EOS completion can
make prepare() believe there is no active setup to close, which can result
in opening/configuring the same stream more than once.
Keep stream state updates tied to prepare(), where the stream is actually
closed and reopened, and stop changing it from trigger and EOS callbacks.
Fixes: bfbb12dfa1 ("ASoC: qcom: q6asm-dai: perform correct state check before closing")
Cc: Stable@vger.kernel.org
Closes: https://lore.kernel.org/all/afS7rTHdc9TyIeLx@rdacayan/
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260518092347.3446946-2-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
byt_cht_es8316_init() enables MCLK before configuring the codec sysclk
and creating the headset jack. If either of those later steps fails, the
function returns without disabling MCLK, leaving the clock enabled after
card registration fails.
Track whether this driver enabled MCLK and disable it on the init error
paths. Add the matching DAI link exit callback so the same clock enable
is also balanced when ASoC cleans up a successfully initialized link.
Fixes: a03bdaa565 ("ASoC: Intel: add machine driver for BYT/CHT + ES8316")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260519-asoc-bytcht-es8316-mclk-leak-v1-1-b4a11cdc2afd@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
hyperv_receive_sub() reads msg->vid_hdr.type and dispatches into one
of four message-type branches without knowing how many bytes the host
wrote into hv->recv_buf. The completion path then runs
memcpy(hv->init_buf, msg, VMBUS_MAX_PACKET_SIZE), so the consumer that
wakes on wait_for_completion_timeout() can read up to 16 KiB of
residue from a prior message as if it were the response payload.
Pass bytes_recvd into hyperv_receive_sub() and reject any packet that
does not cover the pipe + synthvid header. A single switch on
msg->vid_hdr.type then computes the type-specific payload size: the
three completion-driving types (SYNTHVID_VERSION_RESPONSE,
SYNTHVID_RESOLUTION_RESPONSE, SYNTHVID_VRAM_LOCATION_ACK) fall through
to a shared exit that requires that size before memcpy/complete, while
SYNTHVID_FEATURE_CHANGE validates its own payload and returns before
reading is_dirt_needed. Unknown types are dropped.
SYNTHVID_RESOLUTION_RESPONSE is variable length: the host fills
resolution_count entries, not the full SYNTHVID_MAX_RESOLUTION_COUNT
array. Validate the fixed prefix first so resolution_count can be
read, bound it against the array, then require only the count-sized
array, so the shorter responses the host actually sends are accepted.
Only run the sub-handler when vmbus_recvpacket() returned success. The
memcpy length is bytes_recvd, which is bounded by VMBUS_MAX_PACKET_SIZE
only on a successful receive; on -ENOBUFS vmbus_recvpacket() instead
reports the required length, which can exceed hv->recv_buf, so copying
bytes_recvd would read and write past the 16 KiB buffers. Gating on the
success return keeps the copy bounded. The nonzero-return path is itself
a malformed-message case and is now logged rather than silently skipped;
channel recovery is not attempted.
Rejected packets are reported via drm_err_ratelimited() rather than
silently dropped, matching the CoCo-hardened pattern in
hv_kvp_onchannelcallback().
Fixes: 76c56a5aff ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
Cc: stable@vger.kernel.org # 5.14+
Signed-off-by: Berkant Koc <me@berkoc.com>
Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Link: https://patch.msgid.link/8200dbc199c7a9b75ac7e8af6c748d2189b5ebd5.1779542874.git.me@berkoc.com
A SYNTHVID_RESOLUTION_RESPONSE with resolution_count > 64 walks past
the supported_resolution[SYNTHVID_MAX_RESOLUTION_COUNT] array in the
parse loop. Bound resolution_count against the array size, folded
into the existing zero-check.
When the WIN10 resolution probe fails, the caller in
hyperv_connect_vsp() left hv->screen_*_max / preferred_* unpopulated,
which sets mode_config.max_width / max_height to 0 and makes
drm_internal_framebuffer_create() reject every userspace framebuffer
with -EINVAL. The pre-WIN10 branch had the same gap for
preferred_width / preferred_height. Use a single post-probe fallback
guarded by screen_width_max == 0 so both paths converge on the WIN8
defaults.
Signed-off-by: Berkant Koc <me@berkoc.com>
Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline
Fixes: 76c56a5aff ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
Cc: stable@vger.kernel.org # 5.14+
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Link: https://patch.msgid.link/6945b22419c7d404b4954a113de2ac9c900dba93.1779542874.git.me@berkoc.com
Commit e18947038b ("ACPI: driver: Do not set acpi_device_class()
unnecessarily") modified acpi_button_remove() to clear the device class
field in struct acpi_device on driver removal, but it should also have
updated the rollback path in acpi_button_probe(), which it didn't do,
so do it now.
Fixes: e18947038b ("ACPI: driver: Do not set acpi_device_class() unnecessarily")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/6167713.MhkbZ0Pkbq@rafael.j.wysocki
Prior to commit 57c31e6d62 ("ACPI: scan: Use acpi_setup_gpe_for_wake()
for buttons"), ACPI button wakeup GPEs having handler methods remained
enabled after acpi_wakeup_gpe_init(), but currently they are not enabled
because acpi_setup_gpe_for_wake() disables them.
That causes function keys to stop working on some systems [1] and there
may be other related issues elsewhere.
To address that, make the ACPI button driver enable wakeup GPEs for ACPI
buttons so long as they have handler methods. While this does not
restore the old behavior exactly (the ACPI button driver needs to be
bound to the button devices for the GPEs to be enabled), it should be
sufficient to restore the missing functionality.
For this purpose, introduce acpi_enable_gpe_cond() that enables
a GPE if its dispatch type matches the supplied one and modify
acpi_button_probe() to use that function for enabling the GPEs in
question.
Fixes: 57c31e6d62 ("ACPI: scan: Use acpi_setup_gpe_for_wake() for buttons")
Reported-by: Nick <nick@kousu.ca>
Closes: https://lore.kernel.org/linux-acpi/E2OXET.4X5GTP37VTNC3@kousu.ca/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Nick <nick@kousu.ca>
Cc: 7.0+ <stable@vger.kernel.org> # 7.0+
Link: https://patch.msgid.link/9629117.CDJkKcVGEf@rafael.j.wysocki
Commit a7e23ec17f ("ACPI: button: Install notifier for system events
as well") changed the ACPI notify handler type for ACPI buttons to
ACPI_ALL_NOTIFY, but it forgot to update acpi_button_remove() to reflect
that change. This leads to leaking the notify handler past driver
removal, which may cause a kernel crash to occur if ACPI notify on
the given device is triggered after removing the driver, and causes a
subsequent probe of the given device with the same driver to fail.
Address this by updating the acpi_remove_notify_handler() call in
acpi_button_remove() as appropriate.
Fixes: a7e23ec17f ("ACPI: button: Install notifier for system events as well")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Cc: 6.15+ <stable@vger.kernel.org> # 6.15+
Link: https://patch.msgid.link/7954431.EvYhyI6sBW@rafael.j.wysocki
The internal mic boost on the Positivo DN140 is too high.
Fix this by applying the ALC269_FIXUP_LIMIT_INT_MIC_BOOST fixup to the machine
to limit the gain.
Signed-off-by: Edson Juliano Drosdeck <edson.drosdeck@gmail.com>
Link: https://patch.msgid.link/20260524185324.28959-1-edson.drosdeck@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Firmware 2417 for the Scarlett 4th Gen 2i2 moved the direct monitor
gain parameter by 4 bytes, from offset 0x2a0 to 0x2a4, breaking the
"Direct Monitor X Mix Y" controls.
Special-case the offset in the get/set config helpers when the
running firmware is 2417 or later.
Fixes: 4e809a2996 ("ALSA: scarlett2: Add support for Solo, 2i2, and 4i4 Gen 4")
Cc: <stable@vger.kernel.org>
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/ahIWTueUlWA5xiV+@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
cs35l56_hda_read_acpi() gets an allocated ACPI _SUB string from
acpi_get_subsystem_id(). On success, that string is used to create the
firmware system name.
Several error paths after the _SUB lookup can return without releasing
the allocated string. This includes speaker ID lookup errors other than
-ENOENT, and errors after a firmware system name has been allocated.
Use scoped cleanup for the temporary _SUB string and make
cs35l56->system_name device-managed. This releases the temporary _SUB
string on every error path and lets devres release the firmware system
name on probe failure and device removal.
Fixes: 6f03b446cb ("ALSA: hda: cs35l56: Add support for speaker id")
Fixes: 40b1c2f9b2 ("ALSA: hda/cs35l56: Workaround bad dev-index on Lenovo Yoga Book 9i GenX")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260522-alsa-cs35l56-system-name-leak-v4-1-a6154dd09cd9@gmail.com
The BIOS on the Lenovo Yoga Slim 7 14AGP11 (AMD Ryzen AI / Kraken
Point chassis; board LNVNB161216, product 83QS) programs the PCI
subsystem ID of the HDA function as 17aa:0000. As a result no entry
in alc269_fixup_tbl[] matches via SND_PCI_QUIRK, the fixup falls back
to the generic auto-routing path, and the bass speaker pin is left
mis-routed. Laptop speakers sound noticeably thin.
The codec's own internal subsystem ID register reports 0x17aa394c
correctly, so an HDA_CODEC_QUIRK entry (which matches on the codec
SSID rather than on the PCI SSID) binds the chassis to the existing
ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN fixup. This mirrors the same
workaround already in place for the closely-related Yoga 7 2-in-1
14AKP10 and 16AKP10 entries earlier in the table.
With this change the kernel log goes from
ALC287: picked fixup for PCI SSID 17aa:0000
to
ALC287: picked fixup alc287-yoga9-bass-spk-pin
and speaker routing matches what the firmware intended. Verified by
the reporter against the equivalent modprobe override
(model=,alc287-yoga9-bass-spk-pin).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221438
Signed-off-by: Kris Kater <kris@kater.nu>
Link: https://patch.msgid.link/20260522060902.9423-1-kris@kater.nu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The comment for the pin configuration 0x21 in the fixup
ALC299_FIXUP_PREDATOR_SPK states "use as headset mic, without its own
jack detect", but the fixup name and the actual usage indicate that the
pin is meant to be used as internal speaker. Correct the comment to
avoid confusion.
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260522060742.1384390-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The register DSP event queue is updated under parser->lock, but
snd_motu_register_dsp_message_parser_count_event() reads pull_pos and
push_pos without the lock.
snd_motu_register_dsp_message_parser_copy_event() also reads both queue
positions before taking the lock.
Protect these accesses with parser->lock as well. This keeps the hwdep
poll/read path consistent with the producer side and with the cached
meter/parameter accessors.
Fixes: 634ec0b290 ("ALSA: firewire-motu: notify event for parameter change in register DSP model")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260521-alsa-firewire-motu-event-locking-v1-1-708e1c2b5e56@gmail.com
io_register_iowq_max_workers() walks ctx->tctx_list under ctx->tctx_lock
and dereferences each node's task->io_uring without a NULL check:
list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
tctx = node->task->io_uring;
if (WARN_ON_ONCE(!tctx->io_wq))
continue;
...
}
__io_uring_add_tctx_node() installs the node into ctx->tctx_list (via
io_tctx_install_node(), which does the list_add() under tctx_lock) and
only assigns current->io_uring = tctx afterwards. A task doing its first
io_uring operation on a shared ring therefore has a window in which its
node is already visible on ctx->tctx_list while node->task->io_uring is
still NULL. A concurrent IORING_REGISTER_IOWQ_MAX_WORKERS on the same
ring reads that NULL and dereferences tctx->io_wq:
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
RIP: io_register_iowq_max_workers io_uring/register.c:423
Publish current->io_uring = tctx before installing the node, so any node
visible on ctx->tctx_list always has a valid task->io_uring.
Fixes: 7880174e1e ("io_uring/tctx: clean up __io_uring_add_tctx_node() error handling")
Signed-off-by: Lim HyeonJun <shja0831@gmail.com>
Link: https://patch.msgid.link/20260524110853.115634-1-shja0831@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Here are a number of fixes for memory corruption and information leaks
due to missing endpoint and transfer sanity checks dating back to
simpler times when we trusted our hardware.
Included are also a fix for a recently added modem device id entry and
some new modem devices ids.
All but the last five commits have been in linux-next and with no
reported issues.
-----BEGIN PGP SIGNATURE-----
iJEEABYKADkWIQQHbPq+cpGvN/peuzMLxc3C7H1lCAUCahFk5RsUgAAAAAAEAA5t
YW51MiwyLjUrMS4xMiwyLDIACgkQC8XNwux9ZQhGwQEA8P0jl8Bfd7tSnjqDplH7
mVipmM3I8MBMN7ZiurSRfbMBAIOplkKAn6I3wyDoY2n+lTLUVWHcVeYFCtprkQ3Z
pAoN
=3zZY
-----END PGP SIGNATURE-----
Merge tag 'usb-serial-7.1-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB serial fixes for 7.1-rc5
Here are a number of fixes for memory corruption and information leaks
due to missing endpoint and transfer sanity checks dating back to
simpler times when we trusted our hardware.
Included are also a fix for a recently added modem device id entry and
some new modem devices ids.
All but the last five commits have been in linux-next and with no
reported issues.
* tag 'usb-serial-7.1-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: cypress_m8: validate interrupt packet headers
USB: serial: safe_serial: fix memory corruption with small endpoint
USB: serial: omninet: fix memory corruption with small endpoint
USB: serial: mxuport: fix memory corruption with small endpoint
USB: serial: cypress_m8: fix memory corruption with small endpoint
USB: serial: option: add missing RSVD(5) flag for Rolling RW135R-GL
USB: serial: option: add MeiG SRM813Q
USB: serial: mct_u232: fix missing interrupt-in transfer sanity check
USB: serial: mct_u232: fix memory corruption with small endpoint
USB: serial: keyspan: fix missing indat transfer sanity check
USB: serial: digi_acceleport: fix memory corruption with small endpoints
USB: serial: belkin_sa: validate interrupt status length
cypress_read_int_callback() parses the interrupt-in buffer according to
the selected Cypress packet format. Format 1 has a two-byte status/count
header and format 2 has a one-byte combined status/count header. The
usb-serial core sizes the interrupt-in buffer from the endpoint
descriptor's wMaxPacketSize, and successful interrupt transfers can
complete short when URB_SHORT_NOT_OK is not set.
Check that the completed packet contains the selected header before
reading it. Malformed short reports are ignored and the interrupt URB is
resubmitted through the existing retry path, preventing out-of-bounds
header-byte reads.
KASAN report as below:
KASAN slab-out-of-bounds in cypress_read_int_callback+0x240/0x7f0
Read of size 1
Call trace:
cypress_read_int_callback() (drivers/usb/serial/cypress_m8.c:1009)
__usb_hcd_giveback_urb()
dummy_timer()
Fixes: 3416eaa1f8 ("USB: cypress_m8: Packet format is separate from characteristic size")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Fixes: 3416eaa1f8 ("USB: cypress_m8: Packet format is separate from characteristic size")
Cc: stable@vger.kernel.org # 2.6.26
[ johan: use constants in header length sanity checks ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Make sure that the bulk-out buffer size is at least eight bytes to avoid
user-controlled slab corruption in "safe" mode should a malicious device
report a smaller size.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Make sure that the bulk-out buffers are at least as large as the
hardcoded transfer size to avoid user-controlled slab corruption should
a malicious device report a smaller endpoint max packet size than
expected.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Make sure that the bulk-out endpoint max packet size is at least eight
bytes to avoid user-controlled slab corruption should a malicious device
report a smaller size.
Fixes: ee467a1f20 ("USB: serial: add Moxa UPORT 12XX/14XX/16XX driver")
Cc: stable@vger.kernel.org # 3.14
Cc: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
chap_server_compute_hash() allocates client_digest as
kzalloc(chap->digest_size) and then, for BASE64-encoded responses,
passes chap_r directly to chap_base64_decode() without checking whether
the input length could produce more than digest_size bytes of output.
chap_base64_decode() writes to the destination unconditionally as long
as there is input to consume. With MAX_RESPONSE_LENGTH set to 128 and
the "0b" prefix stripped by extract_param(), up to 127 base64 characters
can reach the decoder. 127 characters decode to 95 bytes. For SHA-256
(digest_size=32) this overflows client_digest by 63 bytes; for MD5
(digest_size=16) the overflow is 79 bytes.
The length check at line 344 fires after the write has already happened.
The HEX branch in the same switch statement already validates the length
up front. Apply the same approach to the BASE64 branch: strip trailing
base64 padding characters, then reject any input whose data length
exceeds DIV_ROUND_UP(digest_size * 4, 3) before calling the decoder.
Stripping trailing '=' before the comparison handles both padded and
unpadded encodings. chap_base64_decode() already returns early on '=',
so the full original string is still passed to the decoder unchanged.
The mutual CHAP path decodes CHAP_C into initiatorchg_binhex, which is
kzalloc(CHAP_CHALLENGE_STR_LEN). extract_param() caps initiatorchg at
CHAP_CHALLENGE_STR_LEN characters, so at most CHAP_CHALLENGE_STR_LEN-1
base64 characters reach the decoder. The maximum decoded size,
DIV_ROUND_UP((CHAP_CHALLENGE_STR_LEN-1) * 3, 4), is less than
CHAP_CHALLENGE_STR_LEN, so no overflow is possible there. A comment is
added at the call site to document this.
Fixes: 1e57338834 ("scsi: target: iscsi: Support base64 in CHAP")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandru Hossu <hossu.alexandru@gmail.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Link: https://patch.msgid.link/20260521151121.808477-1-hossu.alexandru@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
iscsi_encode_text_output() concatenates "key=value\0" records into
login->rsp_buf, an 8192-byte kzalloc(MAX_KEY_VALUE_PAIRS) buffer
allocated in iscsit_alloc_login_setup_buffer(). The three sprintf() call
sites in this function (lines 1398, 1411, 1424 in v7.1-rc2) never check
the remaining buffer capacity:
*length += sprintf(output_buf, "%s=%s", er->key, er->value);
*length += 1;
output_buf = textbuf + *length;
The 8192-byte ceiling at iscsi_target_check_login_request() bounds the
*input* Login PDU payload, but a single PDU can carry up to 2048 minimal
four-byte "a=b\0" pairs, each unknown key expanding to a 16-byte
"a=NotUnderstood\0" output record via iscsi_add_notunderstood_response().
2048 * 16 = 32 KiB of output into an 8 KiB buffer, producing a ~24 KiB
heap overrun in the kmalloc-8k slab.
The fix introduces a static iscsi_encode_text_record() helper that uses
snprintf() with a per-call bounds check against the remaining buffer,
and threads a u32 textbuf_size parameter through
iscsi_encode_text_output(). Both call sites in
iscsi_target_handle_csg_zero() (PHASE_SECURITY) and
iscsi_target_handle_csg_one() (PHASE_OPERATIONAL) pass
MAX_KEY_VALUE_PAIRS. On overflow the encoder logs the condition, calls
iscsi_release_extra_responses() to drop queued records, and returns -1;
both caller sites now emit ISCSI_STATUS_CLS_INITIATOR_ERR /
ISCSI_LOGIN_STATUS_INIT_ERR via iscsit_tx_login_rsp() before returning,
so the initiator sees an explicit failed-login response rather than a
silent connection drop. (Prior to this patch only the PHASE_OPERATIONAL
caller did that; the PHASE_SECURITY caller is converted to the same
shape.)
Fixes: e48354ce07 ("iscsi-target: Add iSCSI fabric support for target v4.1")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Two latent bugs in the Text-phase handler, both present since the
original LIO integration in commit e48354ce07 ("iscsi-target: Add
iSCSI fabric support for target v4.1"):
1) DataDigest CRC buffer overread (4 bytes past text_in).
text_in is kzalloc()'d at ALIGN(payload_length, 4). rx_size is then
incremented by ISCSI_CRC_LEN to make room for the received DataDigest
in the iovec, but the same (now-bumped) rx_size is passed as the
buffer length to iscsit_crc_buf():
if (conn->conn_ops->DataDigest) {
...
rx_size += ISCSI_CRC_LEN;
}
...
if (conn->conn_ops->DataDigest) {
data_crc = iscsit_crc_buf(text_in, rx_size, 0, NULL);
iscsit_crc_buf() walks rx_size bytes of text_in with crc32c(), so
when DataDigest is negotiated it reads 4 bytes past the end of the
text_in allocation. KASAN reproduces this directly on the unpatched
mainline tree as slab-out-of-bounds in crc32c() called from the Text
PDU path. The OOB bytes feed crc32c() and are then compared against
the initiator-supplied checksum, so the value does not flow back to
the attacker, but the kernel does read past the buffer on every Text
PDU with DataDigest=CRC32C.
Fix by passing the actual padded payload length
(ALIGN(payload_length, 4)) that was used for the kzalloc().
2) Stale cmd->text_in_ptr re-free (double-free) on ERL>0 bad DataDigest
drop.
On DataDigest mismatch with ErrorRecoveryLevel > 0 the handler
silently drops the PDU and lets the initiator plug the CmdSN gap:
kfree(text_in);
return 0;
cmd->text_in_ptr still points at the freed buffer. The next Text
Request on the same ITT re-enters iscsit_setup_text_cmd(), which
unconditionally does
kfree(cmd->text_in_ptr);
cmd->text_in_ptr = NULL;
freeing the same pointer a second time. Session teardown via
iscsit_release_cmd() has the same shape and hits the same double-free
if the connection is dropped before a second Text Request arrives.
On an unmodified mainline tree the bug-1 CRC overread fires first on
the initial valid Text Request and perturbs the subsequent state, so
#4 was isolated by building a kernel with only the bug-1 hunk of this
patch applied plus temporary printk() observability around the three
relevant kfree() sites. The observability prints are not part of
this patch. On that build, a three-PDU Text Request sequence after
login produces two back-to-back splats:
BUG: KASAN: double-free in iscsit_setup_text_cmd+0x??
BUG: KASAN: double-free in iscsit_release_cmd+0x??
showing the same pointer freed in the ERL>0 drop path and again in
iscsit_setup_text_cmd() (next Text Request on the same ITT) and once
more in iscsit_release_cmd() (session teardown). On distro kernels
with CONFIG_SLAB_FREELIST_HARDENED=y (default) the double-free
becomes a remote kernel BUG(); on non-hardened kernels it corrupts
the slab freelist.
Fix by clearing cmd->text_in_ptr after the kfree() in the ERL>0 drop
path. With both hunks applied #4 is directly observable on the stock
tree without observability printks; fixing bug-1 alone would mask #4
less, not more, so the hunks are submitted together.
Both fixes are one-liners. The Text PDU state machine is unchanged and
the wire protocol is unaffected.
Fixes: e48354ce07 ("iscsi-target: Add iSCSI fabric support for target v4.1")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ipv4_sysctl_exit_net() is currently freeing net->ipv4.sysctl_local_reserved_ports
too soon.
Only after unregister_net_sysctl_table() we can be sure no threads can possibly
use the sysctls, including /proc/sys/net/ipv4/ip_local_reserved_ports.
Fixes: 122ff243f5 ("ipv4: make ip_local_reserved_ports per netns")
Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260521122147.3584624-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On 32-bit architectures, both skb_queue_len() and SKB_TRUESIZE(0) evaluate
to 32-bit values. The multiplication can overflow before being assigned to
the u64 skb_overhead variable, making the skb overhead check ineffective.
Cast skb_queue_len() to u64 so the multiplication is always performed in
64-bit arithmetic.
This issue was reported by Sashiko while reviewing another patch.
Fixes: 059b7dbd20 ("vsock/virtio: fix potential unbounded skb queue")
Closes: https://sashiko.dev/#/patchset/20260518090656.134588-1-sgarzare%40redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260521124732.125771-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/scsi/fcoe/fcoe_ctlr.c::fcoe_ctlr_recv_clr_vlink() advanced the
descriptor cursor by an attacker-supplied fip_dlen without ever
requiring dlen >= sizeof(struct fip_desc) in the default branch. The
named descriptor cases (FIP_DT_MAC, FIP_DT_NAME, FIP_DT_VN_ID) checked
their per-type minimum lengths, but a FIP_DT_NON_CRITICAL descriptor
(fip_dtype >= 128, which the standard requires receivers to silently
ignore) skipped that check entirely.
An unauthenticated L2 peer on the FCoE control VLAN could hang
fcoe_ctlr_recv_work on an fcoe, qedf, or bnx2fc initiator indefinitely
by emitting one FIP CVL frame whose single descriptor had fip_dtype ==
FIP_DT_NON_CRITICAL and fip_dlen == 0: the cursor advanced zero bytes
per iteration and the loop condition rlen >= sizeof(*desc) stayed true
forever, blocking every subsequent FIP frame on that controller.
Tighten the outer dlen guard to also reject dlen < sizeof(struct
fip_desc), so a malformed descriptor whose length cannot even cover the
descriptor header is rejected before the switch. This is the same
lower-bound the named cases already apply and is the minimum scope that
closes the loop.
Fixes: 97c8389d54 ("[SCSI] fcoe, libfcoe: Add support for FIP. FCoE discovery and keep-alive.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260518144307.2820961-1-michael.bommarito@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
An adjacent Fibre Channel fabric actor that can deliver an FPIN ELS
frame to an lpfc or qla2xxx Linux initiator can trigger a non-return in
the generic FC transport. This is not a local userspace or IP network
path; the attacker must be able to inject fabric traffic, for example as
a compromised switch or fabric controller, or as a same-zone N_Port on a
fabric that permits source spoofing.
The Link-Integrity and Peer-Congestion FPIN walkers used a u8 loop
counter against the 32-bit on-wire pname_count field, and did not bound
pname_count by the descriptor body already validated by the TLV walker.
A pname_count of 256 therefore wraps the counter and keeps the loop
condition true indefinitely.
Factor the shared pname_list[] walk into one helper, widen the counter
to u32, and clamp pname_count against the entries that fit in the
descriptor body before iterating.
Fixes: 3dcfe0de5a ("scsi: fc: Parse FPIN packets and update statistics")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260520133015.1018937-1-michael.bommarito@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A "\n" at the end of the sdev_printk() string appears to have been
inadvertently removed. Add it back for correct log message formatting.
Fixes: a743b12022 ("scsi: scsi_debug: Stop printing extra function name in debug logs")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260519205356.1040855-1-emilne@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add NULL check for scmd_local in the MPI2_FUNCTION_SCSI_IO_REQUEST case
to handle firmware duplicate/stale completions.
When firmware sends a duplicate completion for a command that was
already processed and returned to the pool, the driver accesses NULL
scmd pointer causing a crash.
Timeline of the bug:
1. Command completes normally, megasas_return_cmd_fusion() called
2. This sets cmd->scmd = NULL and clears io_request with memset(..., 0,
...)
3. Firmware sends duplicate/stale completion for same SMID (firmware
bug)
4. Driver processes reply descriptor again
5. Cleared io_request has Function = 0 (MPI2_FUNCTION_SCSI_IO_REQUEST)
6. Switch statement matches SCSI_IO_REQUEST case by accident
7. Accesses megasas_priv(NULL scmd)->status -> crash at offset 0x228
The offset 0x228 = sizeof(struct scsi_cmnd) 0x220 + offsetof(status)
0x8.
This issue was observed on PERC H330 Mini running firmware 25.5.9.0001
after 3+ days of heavy I/O load.
Crash signature:
BUG: unable to handle kernel NULL pointer dereference at 0x228
RIP: complete_cmd_fusion+0x428
Function: megasas_priv(cmd_fusion->scmd)->status
Add defensive check to skip processing when scmd_local is NULL. This
handles duplicate completions from firmware and prevents accessing freed
command structures. The check protects all scmd_local uses in both the
SCSI_IO path and the fallthrough LDIO path.
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Link: https://patch.msgid.link/agWAgtk6rtHqNWb5@machine1
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The extremely slow boots reported July 2014 in bug 79901:
https://bugzilla.kernel.org/show_bug.cgi?id=79901
for Promise VTrak E610f 3U 16-bay FC RAID enclosure occur also with the
Promise VTrak E310f 2U 12-bay FC RAID enclosure. The 2014 patch:
https://bugzilla.kernel.org/attachment.cgi?id=144101&action=diff
added support for the BLIST_NO_RSOC flag and specified that flag for the
Promise VTrak E610f. This current patch simply adds the E310f to that
same list.
One curiosity is the additional BLIST_SPARSELUN flag. This was also in
the 2014 patch for the E610f, and was already in place for *all* Promise
devices since 2007 due to commit e0b2e597d5 ("[SCSI] stex: fix id
mapping issue") which added the line:
{"Promise", "", NULL, BLIST_SPARSELUN}
The 2007 commit message talks of issues with SuperTrak EX (stex) but the
added line did not limit itself to that particular device family. The
current patch for E310F, like the 2014 patch for E610f, adds
BLIST_NO_RSOC while preserving BLIST_SPARSELUN from 2007.
Signed-off-by: Alexander Perlis <aperlis@math.lsu.edu>
Suggested-by: Nikkos Svoboda <nsvoboda@math.lsu.edu>
Link: https://patch.msgid.link/20260512231254.27530-1-aperlis@math.lsu.edu
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While a SCSI host is in a recovery state, scsi_mq_requeue_cmd() will not
set the requeue list for a requeued command to be kicked in the future.
The expectation is a call to scsi_run_host_queues() will kick all SCSI
devices once the recovery state is cleared.
However, scsi_run_host_queues() uses shost_for_each_device() which uses
scsi_device_get() and so will ignore devices in a partially removed
state like SDEV_CANCEL. But these devices may also have requeued
requests, leaving their requests stuck from not being kicked and causing
the removal process of the device to hang.
scsi_run_host_queues() needs to run against more devices than the macro
shost_for_each_device() allows. Instead of using the too limiting
scsi_device_get() state checks, only ignore devices in SDEV_DEL state or
when unable to acquire a reference. Attempt to run the queues for all
other devices when scsi_run_host_queues() is called.
Fixes: 8b566edbdb ("scsi: core: Only kick the requeue list if necessary")
Signed-off-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260515180941.9698-1-djeffery@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mirror iucv_sock_setsockopt() and wrap the whole switch in
lock_sock()/release_sock(). The pre-existing SO_MSGLIMIT-only lock
becomes redundant and is removed.
Any AF_IUCV HIPER user can potentially crash the kernel by racing
recvmsg() with getsockopt(SO_MSGSIZE): the SO_MSGSIZE arm dereferences
iucv->hs_dev->mtu after iucv_sock_close() (called from the racing
recvmsg()) has set hs_dev to NULL, producing a NULL pointer dereference
oops.
Suggested-by: Stanislav Fomichev <sdf.kernel@gmail.com>
Fixes: 51363b8751 ("af_iucv: allow retrieval of maximum message size")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Tested-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://patch.msgid.link/20260521-af_iucv_fix2-v1-1-f16b1c510aa9@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
INIT_HLIST_HEAD(&smc_v*_hashinfo.ht) are called after smc_nl_init(),
proto_register() and sock_register(). This can lead to smc_v*_hashinfo.ht
being reset even though hash entries already exist and are being used,
possibly resulting in a corrupted list.
Remove unnecessary and dangerous re-initialisation of smc_v*_hashinfo.ht in
smc_init(); it is implicitly initialised to zero anyhow. Add
HLIST_HEAD_INIT to the definitions for clarity.
Fixes: f16a7dd5cf ("smc: netlink interface for SMC sockets")
Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Link: https://patch.msgid.link/20260521145639.10317-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ilya Maximets says:
====================
netlink: fixes for cross-namespace nsid reporting
While working on some new features for OVS and OVN we discovered that
self-referential NSIDs get unintentionally allocated in the system as
well as unexpectedly reported for local events on all-nsid listeners.
More details in the patches. They change user-visible behavior, but
the current behavior is arguably a bug, as it makes it hard to use
all-nsid sockets without a decent amount of extra unrelated work of
tracking when new NSIDs are allocated for your local namespace.
Tests are added to check the expected behavior and YNL is extended
to support all-nsid sockets in the tests.
====================
Link: https://patch.msgid.link/20260520172317.175168-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test subscribes to link events from all namespaces and makes
sure that local events do not carry NSID in their ancillary data
(even if there is a self-referential NSID allocated for the local
namespace), and remote events do.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20260520172317.175168-5-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In most cases, notifications on sockets with NETLINK_LISTEN_ALL_NSID
do not contain NSID in their ancillary data in case the event is local
to the listener.
However, when a self-referential NSID is allocated for a namespace,
every local notification starts sending this ID to the user space.
This is problematic, because the listener cannot tell if those
notifications are local or not anymore without making extra requests
to figure out if the provided NSID is local or not. The listener
can also not figure out the local NSID beforehand as it can be
allocated at any point in time by other processes, changing the
structure of the future notifications for everyone.
The value is practically not useful, since it's the namespace's own
ID that the application has to obtain from other sources in order to
figure out if it's the same or not. So, for the application it's
just an extra busy work with no benefits. Moreover, applications
that do not know about this quirk may be mishandling notifications
with NSID set as notifications from remote namespaces. This is the
case for ovs-vswitchd and the iproute2's 'ip monitor' that stops
printing 'current' and starts printing the nsid number mid-session.
Lack of clear documentation for this behavior is also not helping.
A search though open-source projects doesn't reveal any projects
that use NETNSA_NSID_NOT_ASSIGNED and rely on metadata to contain
self-referential NSIDs (expected, since the value is not useful).
Quite the opposite, as already mentioned, there are few applications
that rely on NSID to not be present in local events.
Since the value is not useful and actively harmful in some cases,
let's not report it for local events, making the notifications more
consistent.
Also adding some blank lines for readability.
Fixes: 59324cf35a ("netlink: allow to listen "all" netns")
Reported-by: Matteo Perin <matteo.perin@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20260520172317.175168-3-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the current skb is not shared, it is re-used directly for all the
sockets subscribed to the notification. If we have remote all-nsid
socket receiving a message first, then the 'nsid_is_set' will be
set to 'true'. If the nsid is NOT_ASSIGNED for the next socket in
the list, the 'nsid_is_set' will remain 'true' and the negative value
is be delivered to the user space. All subsequent nsid values will be
delivered as well, since there is no code path that sets the flag
back to 'false'.
Fix that by always dropping the flag to 'false' first.
Fixes: 7212462fa6 ("netlink: don't send unknown nsid")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20260520172317.175168-2-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
vsock_poll() reads vsk->peer_shutdown before taking the socket lock
to set EPOLLHUP and EPOLLRDHUP, then reads it again after taking
the lock to report EOF readability. A shutdown packet can update
peer_shutdown while poll is waiting for the lock, so one poll invocation
can report EOF readability without the corresponding HUP/RDHUP bits.
For connectible sockets, take one peer_shutdown snapshot after
lock_sock() and use it for all peer-shutdown-derived poll bits. For
datagram sockets, which do not take lock_sock() in poll(), take one
lockless READ_ONCE() snapshot and pair it with WRITE_ONCE() on the
writer side.
This keeps the peer-shutdown-derived bits internally consistent for each
poll pass.
Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Ziyu Zhang <ziyuzhang201@gmail.com>
Link: https://patch.msgid.link/20260519165636.62542-1-ziyuzhang201@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When build_skb() fails in tun_xdp_one(), the function sets ret to
-ENOMEM and jumps to the out label, which returns without freeing the
page that vhost_net_build_xdp() allocated for the frame. As with the
short-frame rejection path, tun_sendmsg() discards the per-buffer error
and still returns total_len, so vhost_tx_batch() takes the success path
and never frees the page. Each build_skb() failure in a batch leaks one
page-frag chunk.
Free the page before taking the error path, matching the put_page() the
other error exits of tun_xdp_one() already perform.
Fixes: 043d222f93 ("tuntap: accept an array of XDP buffs through sendmsg()")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
and returns -ENOMEM when build_skb() fails. Both paths jump to the err
label without freeing the page that vhost_net_build_xdp() allocated for
the frame. tap_sendmsg() discards the per-buffer return value and always
returns 0, so vhost_tx_batch() takes the success path and never frees
the page; each rejected frame in a batch leaks one page-frag chunk.
Free the page on both error paths, before the skb is built. This is the
tap counterpart of the same leak in tun_xdp_one().
Fixes: 0efac27791 ("tap: accept an array of XDP buffs through sendmsg()")
Fixes: ed7f2afdd0 ("tap: add missing verification for short frame")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The input buffer size is pcu->max_in_size, but pcu->max_out_size is
passed to usb_free_coherent().
Change size to match the allocation size.
Fixes: 628329d524 ("Input: add IMS Passenger Control Unit driver")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://patch.msgid.link/20260522085412.45430-2-fourier.thomas@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
tun_xdp_one() returns -EINVAL on a frame shorter than ETH_HLEN without
freeing the page that vhost_net_build_xdp() allocated for it.
tun_sendmsg() discards that -EINVAL and still returns total_len, so
vhost_tx_batch() takes the success path and never frees the page; each
short frame in a batch leaks one page-frag chunk.
A local process that can open /dev/net/tun and /dev/vhost-net can hit
this path: it attaches a tun/tap device as the vhost-net backend and
feeds TX descriptors whose length minus the virtio-net header is below
ETH_HLEN. Each kick leaks the page-frag chunks for that batch, and a
tight submission loop exhausts host memory and triggers an OOM panic.
Free the page before returning -EINVAL, matching the XDP-program error
path in the same function.
Fixes: 049584807f ("tun: add missing verification for short frame")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260520160020.375349-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make sure that the interrupt-out endpoint max packet size is at least
eight bytes to avoid user-controlled slab corruption or NULL-pointer
dereference should a malicious device report a smaller size.
Fixes: 3416eaa1f8 ("USB: cypress_m8: Packet format is separate from characteristic size")
Cc: stable@vger.kernel.org # 2.6.26
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
For lshift and rshift, the shift operations are performed in a loop over
32-bit words. The loop calculates the shifted value and write it to dst,
and then immediately reads from src to calculate the carry for the next
iteration. Because src and dst could point to the same memory location,
the carry is incorrectly calculated using the newly modified dst value
instead of the original src value.
Adding a temporary local variable to cache the original value before
writing to dst and using it for the carry calculation solves the
problem. In addition, partial overlap is rejected from control plane for
all kind of operations including byteorder. This was tested with the
following bytecode:
table test_table ip flags 0 use 1 handle 1
ip test_table test_chain use 3 type filter hook input prio 0 policy accept packets 0 bytes 0 flags 1
ip test_table test_chain 2
[ immediate reg 1 0x44332211 0x88776655 ]
[ bitwise reg 1 = ( reg 1 << 0x08000000 ) ]
[ cmp eq reg 1 0x66443322 0x00887766 ]
[ counter pkts 0 bytes 0 ]
ip test_table test_chain 4 3
[ immediate reg 1 0x44332211 0x88776655 ]
[ bitwise reg 1 = ( reg 1 << 0x08000000 ) ]
[ cmp eq reg 1 0x55443322 0x00887766 ]
[ counter pkts 21794 bytes 1917798 ]
Fixes: 567d746b55 ("netfilter: bitwise: add support for shifts.")
Acked-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Functional coverage of nft_fib6_eval()'s nexthop enumeration over
three route shapes:
1) single external nexthop (nhid)
2) external nexthop group (nhid -> group)
3) old-style multipath (nexthop ... nexthop ...)
Each scenario places one nexthop on the input device (veth0). For
(2) and (3) the matching nexthop is the second member, so the walk
has to traverse beyond the primary nh. Two nft counters on prerouting
verify the data path: one increments only when fib reports veth0 as
the oif, the other counts "missing" results and must stay at zero.
./nft_fib_nexthop.sh
PASS: single external nexthop (nhid -> veth0)
PASS: nexthop group (dummy0 + veth0)
PASS: old-style multipath (sibling on veth0)
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
nft_fib6_info_nh_uses_dev() runs from nft_fib6_eval() in softirq under
rcu_read_lock(). fib6_siblings is modified by writers that hold
tb6_lock but do not wait for RCU readers, so the sibling walk should
use list_for_each_entry_rcu(): it adds READ_ONCE() on the ->next
pointer and lets CONFIG_PROVE_RCU_LIST validate the locking.
No functional change for non-debug builds.
Fixes: 1c32b24c23 ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Luxiao Xu says:
The function compat_mtw_from_user() converts ebtables extensions from
32-bit user structures to kernel native structures. However, it lacks
proper validation of the user-supplied match_size/target_size.
When certain extensions are processed, the kernel-side translation
logic may perform memory accesses based on the extension's expected
size. If the user provides a size smaller than what the extension
requires, it results in an out-of-bounds read as reported by KASAN.
This fix introduces a check to ensure match_size is at least as large
as the extension's required compatsize. This covers matches, watchers,
and targets, while maintaining compatibility with standard targets.
AFAIU this is relevant for matches that need to go though
match->compat_from_user() call. Those that use plain memcpy with the
user-provided size are ok because the caller checks that size vs the
start of the next rule entry offset (which itself is checked vs. total
size copied from userspace).
The ->compat_from_user() callbacks assume they can read compatsize bytes,
so they need this extra check.
Based on an earlier patch from Luxiao Xu.
Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Luxiao Xu <rakukuip@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Several parts of network stack rely on iph->ihl validation
done by network stack before PRE_ROUTING.
Disable this feature for user namespaces for now.
tcp option handling is likely safe even for LOCAL_IN, so this
this leaves tcp option mangling via nft_exthdr.c as-is.
I don't think these are the only means to alter packets, but these
appear to be relatively prominent.
This could be relaxed later. Example:
- allow userns for ingress hook.
- allow userns if base is transport header.
Also, we should revalidate or restrict generally:
- Don't allow linklayer writes to spill into network header
- restrict ipv4 and ipv6 to 'known safe' writes, e.g.
saddr/daddr/check/tos
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Tested-by: Qi Tang <tpluszz77@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20260515100411.3141-1-fw@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
With PREEMPT_RCU we get splat:
BUG: using smp_processor_id() in preemptible [..]
caller is cpu_mt+0x53/0xd0 net/netfilter/xt_cpu.c:37
CPU: 1 .. Comm: syz.3.1377 #0 PREEMPT(full)
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
check_preemption_disabled+0xd3/0xe0 lib/smp_processor_id.c:47
cpu_mt+0x53/0xd0 net/netfilter/xt_cpu.c:37
[..]
Just use raw version instead.
This is similar to 14d14a5d29 ("netfilter: nft_meta: use raw_smp_processor_id()").
Fixes: 0ca743a559 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: syzbot+690d3e3ffa7335ac10eb@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Quoting reporter:
A race between GRE keymap insertion and destruction can corrupt the
kernel list or use a freed object. `nf_ct_gre_keymap_add()` publishes a
new keymap pointer before the embedded `list_head` is linked, while
`nf_ct_gre_keymap_destroy()` can concurrently delete and free that
same object. An unprivileged user can reach this through the PPTP
conntrack helper by racing PPTP control messages or helper teardown,
leading to KASAN-detectable list corruption/UAF in kernel context.
## Root Cause Analysis
`exp_gre()` installs GRE expectations for a PPTP control flow and then
adds two GRE keymap entries [..]
The add path publishes `ct_pptp_info->keymap[dir]` before linking the
embedded list node [..]
Concurrent teardown deletes that partially initialized object.
Make add/destroy symmetric: install both, destroy both while under lock.
Furthermore, we should refuse to publish a new mapping in case ct is going
away, else we may leak the allocation.
The "retrans" detection is strange: existing mapping is checked for key
equality with the new mapping, then for "is on the list" via list walk.
But I can't see how an existing keymap entry can be NOT on list.
Change this to only check if we're asked to map same tuple again -- if so,
skip re-install, else signal failure.
Last, add a bug trap for the keymap list; it has to be empty when namespace
is going away.
Reported-by: Leo Lin <leo@depthfirst.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
synproxy_tstamp_adjust() rewrites the TCP timestamp option in place
and then patches the TCP checksum via inet_proto_csum_replace4() on
the caller-supplied tcphdr pointer. Both ipv4_synproxy_hook() and
ipv6_synproxy_hook() obtain that pointer with skb_header_pointer()
before calling in, so it may either alias skb->head directly or
point at the caller's on-stack _tcph buffer.
Between obtaining the pointer and using it, the function calls
skb_ensure_writable(skb, optend), which on a cloned or non-linear
skb invokes pskb_expand_head() and frees the old skb->head. After
that point the cached th is stale:
caller (ipv[46]_synproxy_hook)
th = skb_header_pointer(skb, ..., &_tcph)
synproxy_tstamp_adjust(skb, protoff, th, ...)
skb_ensure_writable(skb, optend)
pskb_expand_head() /* kfree(old skb->head) */
...
inet_proto_csum_replace4(&th->check, ...)
/* writes into freed head, or
into the caller's stack copy
leaving the on-wire checksum
stale */
The option bytes are written through skb->data and are fine; only
the checksum update goes through th and so lands in the wrong
place. The result is either a write into freed slab memory or a
packet leaving with a checksum that does not match its payload.
Fix by re-deriving th from skb->data + protoff immediately after
skb_ensure_writable() succeeds, so the subsequent checksum update
targets the linear, writable header.
Fixes: 48b1de4c11 ("netfilter: add SYNPROXY core/target")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
An unintended behavior in the TCP conntrack state machine allows a
connection to be forced into the CLOSE state using an RST packet with an
invalid sequence number.
Specifically, after a SYN packet is observed, an RST with an invalid SEQ
can transition the conntrack entry to TCP_CONNTRACK_CLOSE, regardless of
whether the RST corresponds to the expected reply direction. The relevant
code path assumes the RST is a response to an outgoing SYN, but does not
validate packet direction or ensure that a matching SYN was actually sent
in the opposite direction.
As a result, a crafted packet sequence consisting of a SYN followed by an
invalid-sequence RST can prematurely terminate an active NAT entry. This
makes connection teardown easier than intended.
So, tighten the state transition logic to ensure that RST-triggered
CLOSE transitions only occur when the RST is a valid response to a
previously observed SYN in the correct direction.
Cc: stable@vger.kernel.org
Fixes: 9fb9cbb108 ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
After a kexec/kdump reboot, the macb Ethernet controller fails to
receive any packets, causing DHCP to hang indefinitely and the network
interface to be unusable despite link being up.
The root cause is that RP1's level-triggered MSI-X interrupt sources
(such as macb on hwirq 6) may have their internal state machines stuck
in the "waiting for IACK" state. This happens because the previous
kernel crashed before sending the acknowledgment for a pending level
interrupt.
In this stuck state, RP1 will not generate new MSI-X writes even though
the interrupt source remains asserted. Since no new MSI-X is sent, the
GIC never sees a new edge, the chained IRQ handler is never invoked,
and the interrupt is permanently lost.
Fix this by sending MSIX_CFG_IACK in rp1_irq_activate(). This
unconditionally resets the MSI-X state machine back to idle when a
child device requests its interrupt. If the interrupt source is still
asserted, RP1 will immediately issue a new MSI-X with the freshly
configured msg_addr/msg_data, and normal interrupt delivery resumes.
Writing IACK when the state machine is already idle (i.e., on a normal
cold boot) is harmless — it has no effect.
Fixes: 49d63971f9 ("misc: rp1: RaspberryPi RP1 misc driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Link: https://patch.msgid.link/20260518073405.2115003-1-xiaolei.wang@windriver.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The parport subsystem registers port devices before they are fully
initialised, resulting in a race condition where client drivers such
as lp can attach to ports that are not completely initialised or even
being torn down.
When the port and client drivers are built as modules and loaded
around the same time during boot, this occasionally results in a
crash. I was able to make this happen reliably in a VM with a
PC-style parallel port by patching parport_pc to fail probing:
> --- a/drivers/parport/parport_pc.c
> +++ b/drivers/parport/parport_pc.c
> @@ -2069,7 +2069,7 @@ static struct parport *__parport_pc_probe_port(unsigned long int base,
> if (!p)
> goto out3;
>
> - base_res = request_region(base, 3, p->name);
> + base_res = NULL;
> if (!base_res)
> goto out4;
>
and then running:
while true; do
modprobe lp & modprobe parport_pc
wait
rmmod lp parport_pc
done
for a few seconds.
In the long term I think port registration should be changed to put
the call to device_add() inside parport_announce_port(), but since the
latter currently cannot fail this will require changing all port
drivers.
For now, add a flag to indicate whether a port has been "announced"
and only try to attach client drivers to ports when the flag is set.
Fixes: 6fa45a2268 ("parport: add device-model to parport subsystem")
Closes: https://bugs.debian.org/1130365
Closes: https://lore.kernel.org/all/6ba903ad-9897-42bb-8c2d-337385cc3746@molgen.mpg.de/
Cc: stable <stable@kernel.org>
Signed-off-by: Ben Hutchings <benh@debian.org>
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://patch.msgid.link/afo6uBv68GDevbMD@decadent.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
uio_pci_sva allocates struct uio_pci_sva_dev with devm_kzalloc() in
probe(), but then calls kfree(udev) both on the probe() error path
(label out_free) and again in remove().
Because devm_kzalloc() allocations are devres-managed and are freed
automatically when the device is detached (including after a failing
probe() and during driver unbind), the explicit kfree() can lead to a
double free.
If probe() fails after devm_kzalloc(), the error path frees udev and
devres cleanup will free it again when the core unwinds the partially
bound device. On normal driver removal, remove() frees udev and devres
will free it again when the device is detached.
This issue was identified by a static analysis tool I developed and
confirmed by manual review. Fix by removing the manual kfree() calls
and dropping the now-unused label.
Fixes: 3397c3cd85 ("uio: Add SVA support for PCI devices via uio_pci_generic_sva.c")
Cc: stable <stable@kernel.org>
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Link: https://patch.msgid.link/20260505150256.614071-1-lgs201920130244@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In 6c37bebd8c, we switched to looping over the list and dropping each
individual node, ostensibly without the lock held in the loop body.
If the kernel were using Rust Edition 2024, the comment would be
accurate, and the lock would not be held across the drop. However, the
kernel is currently using 2021, so tail expression lifetime extension
results in the lock being held across the drop. Explicitly binding the
expression result to a variable makes the lockguard no longer part of a
tail expression, causing the lock to be dropped before entering the loop
body.
This was detected via `CONFIG_PROVE_LOCKING` identifying an invalid wait
context at the drop site.
Reported-by: David Stevens <stevensd@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Cc: stable <stable@kernel.org>
Fixes: 6c37bebd8c ("rust_binder: avoid mem::take on delivered_deaths")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20260403-lockhold-v1-1-c332b56cd8ae@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an outdated transaction is removed from `oneway_todo` due to
`TF_UPDATE_TXN`, its `Allocation` is dropped. The current implementation
of `Allocation::drop` calls `pending_oneway_finished()`, assuming the
transaction was executed. This leads to premature execution of the next
queued one-way transaction.
Fix this by taking the `oneway_node` from the `Allocation` of the
outdated transaction before it is dropped. This prevents
`Allocation::drop` from signaling completion.
We do not call `take_oneway_node()` from `Transaction::cancel` because
it's actually correct to call `pending_oneway_finished()` on cancel if
the transaction did not come from `oneway_todo`. This ensures that if
`BINDER_THREAD_EXIT` is invoked and cancels a oneway transaction, then
the next transaction is taken from `oneway_todo`.
This bug does not lead to any issues in the kernel, but may lead to
Binder delivering transactions to userspace earlier than userspace
expected to receive them.
Cc: stable <stable@kernel.org>
Fixes: eafedbc7c0 ("rust_binder: add Rust Binder driver")
Assisted-by: Antigravity:gemini
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20260414-tf-update-txn-fix-v1-1-d2b83303acc9@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Enable modular build since the driver now has a proper module_exit()
handler. There's nothing specific to DZ hardware to prevent driver
unloading and reloading from working.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062331420.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prevent a crash from happening as the first serial port is initialised:
Console: switching to mono frame buffer device 160x64
fb0: PMAG-AA frame buffer device at tc0
DECstation Z85C30 serial driver version 0.10
CPU 0 Unable to handle kernel paging request at virtual address 0000002c, epc == 803ab00c, ra == 803aafe0
Oops[#1]:
CPU: 0 PID: 1 Comm: swapper Not tainted 6.4.0-rc3-00031-g84a9582fd203-dirty #57
$ 0 : 00000000 10012c00 803aaeb0 00000000
$ 4 : 80e12f60 80e12f50 80e12f58 81000030
$ 8 : 00000000 805ff37c 00000000 33433538
$12 : 65732030 00000006 80c2915d 6c616972
$16 : 80e12f00 807b7630 00000000 00000000
$20 : 00000004 00000348 000001a0 807623b8
$24 : 00000018 00000000
$28 : 80c24000 80c25d60 8078b148 803aafe0
Hi : 00000000
Lo : 00000000
epc : 803ab00c serial_base_ctrl_add+0x78/0xf4
ra : 803aafe0 serial_base_ctrl_add+0x4c/0xf4
Status: 10012c03 KERNEL EXL IE
Cause : 00000008 (ExcCode 02)
BadVA : 0000002c
PrId : 00000440 (R4400SC)
Modules linked in:
Process swapper (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000)
Stack : 80760000 00000cc0 00400044 00400040 803aa02c 80d61ab8 00000000 807b7630
80760000 807623b8 807b7628 803aa644 80386998 00000000 80e17780 80220f68
80e17780 80d61ab8 80c17d80 80e17780 80e17780 8063c798 80e17780 80383fa0
00000010 80e17780 00000000 80386998 807a0000 00000000 00400040 8038f848
807623b8 80d61ab8 00000004 80e17780 00000000 803a68e4 80c25e2c 803bb884
...
Call Trace:
[<803ab00c>] serial_base_ctrl_add+0x78/0xf4
[<803aa644>] serial_core_register_port+0x174/0x69c
[<8077e9ac>] zs_init+0xc8/0xfc
[<800404d4>] do_one_initcall+0x40/0x2ac
[<8076cecc>] kernel_init_freeable+0x1e4/0x270
[<80605bec>] kernel_init+0x20/0x108
[<800431e8>] ret_from_kernel_thread+0x14/0x1c
Code: 2442aeb0 ae120024 ae0200d0 <8c67002c> 50e00001 8c670000 3c06806e 3c05806e afb30010
---[ end trace 0000000000000000 ]---
(report at the offending commit) -- where a pointer is dereferenced that
has been derived from a null pointer to the port's parent device.
Since no device is available with legacy probing and it's not anymore a
preferable way to discover devices anyway, switch the driver to using a
platform device and use it as the port's parent device. Update resource
handling accordingly and only request the actual span of addresses used
within the slot, which will have had its resource already requested by
generic platform device code.
Use platform_driver_probe() not just because SCC devices are fixed with
solder on board and not straightforward to remove, but foremost because
the associated TTY's major device number is the same as used by the dz
driver and the first driver to claim it will prevent the other one from
using it. Either one DZ device or some SCC devices will be present in a
given system but never both at a time, and therefore we want the major
device number to be claimed by the first driver to actually successfully
bind to its device and platform_driver_probe() is a way to fulfil that.
An unfortunate consequence of the switch to a platform device is we now
hand the console over from the bootconsole much later in the bootstrap.
The firmware console handler appears good enough though to work so late
and in particular with interrupts enabled.
Since there is one way only remaining to reach zs_reset() now, remove
the port initialisation marker as no longer needed and go through the
channel reset unconditionally.
Fixes: 84a9582fd2 ("serial: core: Start managing serial controllers to enable runtime PM")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # needs to use .remove_new for <= 6.10
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062328480.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prevent a crash from happening as the first serial port is initialised:
Console: switching to colour frame buffer device 160x64
tgafb: SFB+ detected, rev=0x02
fb0: Digital ZLX-E1 frame buffer device at 0x1e000000
DECstation DZ serial driver version 1.04
CPU 0 Unable to handle kernel paging request at virtual address 000000bc, epc == 8048b3a4, ra == 80470a78
Oops[#1]:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-dirty #35 NONE
$ 0 : 00000000 1000ac00 00000004 804707ac
$ 4 : 00000000 80e20850 80e20858 81000030
$ 8 : 00000000 8072c81c 00000008 fefefeff
$12 : 6c616972 00000006 80c5917f 69726420
$16 : 80e20800 00000000 808f8968 80e20800
$20 : 00000000 807f5a90 808b0094 808d3bc8
$24 : 00000018 80479030
$28 : 80c2e000 80c2fd70 00000069 80470a78
Hi : 00000004
Lo : 00000000
epc : 8048b3a4 __dev_fwnode+0x0/0xc
ra : 80470a78 serial_base_ctrl_add+0xa0/0x168
Status: 1000ac04 IEp
Cause : 30000008 (ExcCode 02)
BadVA : 000000bc
PrId : 00000220 (R3000)
Modules linked in:
Process swapper/0 (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000)
Stack : 00400044 00400040 8046f4cc 00000000 808a6148 808a0000 808f8968 8086983c
808e0000 8046fc84 1000ac01 00000028 80e20700 802ba3f8 80e20700 80d34a94
80c1b900 80e20700 80e20700 80e20700 80e20700 80444650 00000000 00000000
00000000 807f5a90 808b0094 80447080 00400040 808e0000 80d34a94 808a6148
80d34a94 00000004 80e20700 00000000 8076974c 80469810 80c2fe3c 1000ac01
...
Call Trace:
[<8048b3a4>] __dev_fwnode+0x0/0xc
[<80470a78>] serial_base_ctrl_add+0xa0/0x168
[<8046fc84>] serial_core_register_port+0x1c8/0x974
[<808c6af0>] dz_init+0x74/0xc8
[<800470e0>] do_one_initcall+0x44/0x2d4
[<808b111c>] kernel_init_freeable+0x258/0x308
[<8072e434>] kernel_init+0x20/0x114
[<80049cd0>] ret_from_kernel_thread+0x14/0x1c
Code: 27bd0018 03e00008 2402ffea <8c8200bc> 03e00008 00000000 27bdffc0 afbe0038 afb30024
---[ end trace 0000000000000000 ]---
-- where a pointer is dereferenced that has been derived from a null
pointer to the port's parent device.
Since no device is available with legacy probing and it's not anymore a
preferable way to discover devices anyway, switch the driver to using a
platform device and use it as the port's parent device. Update resource
handling accordingly and only request the actual span of addresses used
within the slot, which will have had its resource already requested by
generic platform device code.
Use platform_driver_probe() not just because the DZ device is fixed with
solder on board and not straightforward to remove, but foremost because
the associated TTY's major device number is the same as used by the zs
driver and the first driver to claim it will prevent the other one from
using it. Either one DZ device or some SCC devices will be present in a
given system but never both at a time, and therefore we want the major
device number to be claimed by the first driver to actually successfully
bind to its device and platform_driver_probe() is a way to fulfil that.
An unfortunate consequence of the switch to a platform device is we now
hand the console over from the bootconsole much later in the bootstrap.
The firmware console handler appears good enough though to work so late
and in particular with interrupts enabled.
Conversely only starting the console port so late lets the reset code
fully utilise our delay handlers, so switch from udelay() to fsleep()
for transmitter draining so as to avoid busy-waiting for an excessive
amount of time.
Fixes: 84a9582fd2 ("serial: core: Start managing serial controllers to enable runtime PM")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # needs to use .remove_new for <= 6.10
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062326540.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Switch the driver to using the channel reset rather than hardware reset,
simplifying handling by removing an interference between channels that
causes the other channel to become uninitialised afterwards.
There is little difference between the two kinds of reset in terms of
register settings that result, and we initialise the whole register set
right away anyway. However this prevents a hang from happening should
the console output handler in the firmware try to access the other port
whose transmitter has been disabled and line parameters messed up.
For example this will happen if the keyboard port (port A) is chosen for
the system console, unusually but not insanely for a headless system, as
the port is wired to a standard DA-15 connector and an adapter can be
easily made. Or with the next change in place this would happen for the
regular console port (port B), since the keyboard port (port A) will be
initialised first.
Just remove the unnecessary complication then, a channel reset is good
enough. We still need the initialisation marker, now per channel rather
than per SCC, as for the console port zs_reset() will be called twice:
once early on via zs_serial_console_init() for the console setup only,
and then again via zs_config_port() as the port is associated with a TTY
device.
Fixes: 8b4a40809e ("zs: move to the serial subsystem")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v2.6.23+
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062323430.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Calling zs_reset() in the course of setting up the serial device causes
line parameters to be reset and the transmitter disabled. We've been
lucky in that no message is usually produced to the kernel log between
this call and the later call to uart_set_options() in the course of
console setup done by zs_serial_console_init(), or the system would hang
as the console output handler in the firmware tried to access a port the
transmitter of which has been disabled and line parameters messed up.
This will change with the next change to the driver, so fix zs_reset()
such that line parameters are set for 9600n8 console operation as with
the system firmware and the transmitter re-enabled after reset. This
also means zs_pm() serves no purpose anymore, so drop it.
Fixes: 8b4a40809e ("zs: move to the serial subsystem")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v2.6.23+
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062308040.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Calling dz_reset() in the course of setting up the serial device causes
line parameters to be reset and the transmitter disabled. We've been
lucky in that no message is usually produced to the kernel log between
this call and the later call to uart_set_options() in the course of
console setup done by dz_serial_console_init(), or the system would hang
as the console output handler in the firmware tried to access a port the
transmitter of which has been disabled and line parameters messed up.
This will change with the next change to the driver, so fix dz_reset()
such that line parameters are set for 9600n8 console operation as with
the system firmware and the transmitter re-enabled after reset. This
also means dz_pm() serves no purpose anymore, so drop it.
Fixes: e6ee512f5a ("dz.c: Resource management")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v2.6.25+
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062302010.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the DZ interface as implemented by the DC7085 gate array the serial
transmitters are double buffered, meaning that at the time a transmitter
is ready to accept the next character there is one in the transmit shift
register still being sent to the line. Issuing a master clear at this
time causes this character to be lost, so wait an extra amount of time
sufficient for the transmit shift register to drain at 9600bps, which is
the baud rate setting used by the firmware console.
Mind the specified 1.4us TRDY recovery time in the course and continue
using iob() as the completion barrier, since the platforms involved use
a write buffer that can delay and combine writes, and reorder them with
respect to reads regardless of the MMIO locations accessed and we still
lack a platform-independent handler for that.
When called from dz_serial_console_init() this is too early for fsleep()
to work and even before lpj has been calculated and therefore the delay
is actually not sufficient for the transmitter to drain and is merely a
placeholder now. This will be addressed in a follow-up change.
Fixes: e6ee512f5a ("dz.c: Resource management")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v2.6.25+
Link: https://patch.msgid.link/alpine.DEB.2.21.2605062259080.46195@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dw8250_handle_irq() calls serial8250_handle_irq_locked() with the port
lock held via guard(uart_port_lock_irqsave). The guard destructor is
plain uart_port_unlock_irqrestore(), so a SysRq character captured into
port->sysrq_ch by uart_prepare_sysrq_char() is dropped without ever
being dispatched to handle_sysrq().
This is the same regression pattern as in serial8250_handle_irq(),
introduced when 883c5a2bc9 ("serial: 8250_dw: Rework
dw8250_handle_irq() locking and IIR handling") moved the function to
the guard()-based locking scheme without using the sysrq-aware unlock
helper.
Switch to guard(uart_port_lock_check_sysrq_irqsave) so that captured
sysrq_ch is dispatched on scope exit, matching the fix in
serial8250_handle_irq().
Fixes: 883c5a2bc9 ("serial: 8250_dw: Rework dw8250_handle_irq() locking and IIR handling")
Cc: stable@vger.kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jacques Nilo <jnilo@free.fr>
Link: https://patch.msgid.link/ed56fcaf4af24e4ed011a7bce206e0182acb761c.1778675349.git.jnilo@free.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
serial8250_handle_irq() captures a SysRq character into port->sysrq_ch
inside serial8250_handle_irq_locked() via uart_prepare_sysrq_char()
(reached from serial8250_read_char()). Dispatch of that captured
character to handle_sysrq() is expected to happen at port-unlock time,
through uart_unlock_and_check_sysrq[_irqrestore]().
After commit 8324a54f60 ("serial: 8250: Add
serial8250_handle_irq_locked()") the function was reduced to a wrapper
that takes the port lock via guard(uart_port_lock_irqsave) whose
destructor is plain uart_port_unlock_irqrestore(). The sysrq-aware
unlock helper is no longer called, so port->sysrq_ch is captured but
never dispatched: BREAK + SysRq key is consumed silently.
This was the very condition Johan Hovold's 853a9ae29e ("serial:
8250: fix handle_irq locking", 2021) introduced
uart_unlock_and_check_sysrq_irqrestore() to address.
Switch to the new guard(uart_port_lock_check_sysrq_irqsave), whose
destructor is the sysrq-aware unlock helper, restoring the pre-split
behaviour. Update the Context: comment on serial8250_handle_irq_locked()
so future HW-specific 8250 wrappers know to use the same guard or the
explicit sysrq-aware unlock.
Verified on RTL8196E with CONFIG_MAGIC_SYSRQ_SERIAL=y: BREAK + 'h' on
the console UART produces the SysRq help dump in dmesg and the brk
counter in /proc/tty/driver/serial increments correctly.
Fixes: 8324a54f60 ("serial: 8250: Add serial8250_handle_irq_locked()")
Cc: stable@vger.kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jacques Nilo <jnilo@free.fr>
Link: https://patch.msgid.link/52692ae6c3501f7940347cef364ad7fcacaab7e5.1778675349.git.jnilo@free.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
uart_handle_break() and uart_prepare_sysrq_char() (in
include/linux/serial_core.h) capture a SysRq character into
port->sysrq_ch while the port lock is held and rely on the unlock
helper -- uart_unlock_and_check_sysrq_irqrestore() -- to dispatch the
captured character to handle_sysrq() on scope exit.
The existing guard(uart_port_lock_irqsave) cannot be used by IRQ
handlers that process RX, because its destructor calls plain
uart_port_unlock_irqrestore() and silently drops port->sysrq_ch.
Add a dedicated guard(uart_port_lock_check_sysrq_irqsave) variant
whose destructor is the sysrq-aware unlock helper. The lock side is
identical to uart_port_lock_irqsave -- only the unlock-time behaviour
differs. Callers that may capture SysRq characters must use
guard(uart_port_lock_check_sysrq_irqsave); the existing
guard(uart_port_lock_irqsave) keeps its current plain-unlock semantics
for the many callers that do not process RX.
The new macro is placed after the CONFIG_MAGIC_SYSRQ_SERIAL block so
both definitions of uart_unlock_and_check_sysrq_irqrestore() (sysrq
enabled and disabled) are visible at expansion time. When
CONFIG_MAGIC_SYSRQ_SERIAL=n the destructor degenerates to plain
uart_port_unlock_irqrestore(), so there is no overhead.
No functional change on its own; users are converted in the following
patches.
Cc: stable@vger.kernel.org
Signed-off-by: Jacques Nilo <jnilo@free.fr>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/3849af4bc55d5d2a424fa850844e94d641b2f8a6.1778675349.git.jnilo@free.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sashiko identified a deadlock when the console flow is engaged [1].
When console flow control is enabled (UPF_CONS_FLOW),
s3c24xx_serial_stop_tx() calls s3c24xx_serial_rx_enable() and
s3c24xx_serial_start_tx() calls s3c24xx_serial_rx_disable().
The serial core framework invokes the .stop_tx() and .start_tx()
callbacks with the port->lock spinlock already held. Furthermore, all
internal driver paths that invoke stop_tx (such as the DMA TX
completion handler s3c24xx_serial_tx_dma_complete() or the PIO TX IRQ
handler s3c24xx_serial_tx_irq()) also acquire port->lock prior to
calling it. (Note that s3c24xx_serial_start_tx() is only invoked by the
serial core).
However, s3c24xx_serial_rx_enable() and s3c24xx_serial_rx_disable()
unconditionally attempt to acquire port->lock again using
uart_port_lock_irqsave(). Since spinlocks are not recursive, this
causes a deadlock on the same CPU when console flow control is engaged.
Remove the redundant lock acquisition from both rx helper functions.
Cc: stable <stable@kernel.org>
Fixes: b497549a03 ("[ARM] S3C24XX: Split serial driver into core and per-cpu drivers")
Reported-by: John Ogness <john.ogness@linutronix.de>
Closes: https://sashiko.dev/#/patchset/20260506121606.5805-1-john.ogness%40linutronix.de [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260515-samsung-tty-flow-control-deadlock-v1-1-93255edbc9bc@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
altera_jtaguart_probe() maps the register window before registering the
UART port, but it ignores failures from uart_add_one_port(). If port
registration fails, probe still returns success and the mapping remains
live until a later remove path that is not part of probe failure cleanup.
Return the uart_add_one_port() error and unmap the register window on
that failure path.
This issue was identified during our ongoing static-analysis research while
reviewing kernel code.
Fixes: 5bcd601049 ("serial: Add driver for the Altera JTAG UART")
Cc: stable <stable@kernel.org>
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Link: https://patch.msgid.link/20260512065837.79528-1-mhun512@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The VENDOR_CLASS_DATA_IFACE and ALWAYS_POLL_CTRL quirk flags added in
commit f58752ebcb ("USB: cdc-acm: Add quirks for Yoga Book 9 14IAH10
INGENIC touchscreen") were placed inside the acm_ctrl_msg() function
rather than in the header with the other quirk flags. Then, their
values (BIT(9) and BIT(10)) collided with NO_UNION_12 which is already
BIT(9).
Move the definitions to drivers/usb/class/cdc-acm.h where they belong
and shift them to BIT(10) and BIT(11) to avoid the overlap.
Fixes: f58752ebcb ("USB: cdc-acm: Add quirks for Yoga Book 9 14IAH10 INGENIC touchscreen")
Cc: stable <stable@kernel.org>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Link: https://patch.msgid.link/20260522091357.1301196-1-guanwentao@uniontech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're not allowed to dereference "urb" after calling
usb_hcd_giveback_urb() so save the urb->status ahead of time.
Fixes: 7359d482eb ("staging: HCD files for the DWC2 driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://patch.msgid.link/ag1NwBpqT4IEQcdJ@stanley.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a system contains multiple USB controllers, the global ci_role_switch
variable may be overwritten by subsequent driver initialization code.
This can cause issues in the following cases:
- The 2nd ci_hdrc_probe() sees ci_role_switch.fwnode as non-NULL even
though the "usb-role-switch" property is not present for the controller.
- When the ci_hdrc device is unbound and bound again, ci_role_switch
fwnode will not be reassigned, and the old value will be used instead.
Convert ci_role_switch to a local variable to fix these issues.
Fixes: 05559f10ed ("usb: chipidea: add role switch class support")
Cc: stable <stable@kernel.org>
Acked-by: Peter Chen <peter.chen@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://patch.msgid.link/20260427075755.3611217-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ffs_epfile_dmabuf_io_complete() calls usb_ep_free_request() on the
completed request but leaves priv->req, the back-pointer that
ffs_dmabuf_transfer() set on submission, pointing at the freed
memory. A later FUNCTIONFS_DMABUF_DETACH ioctl or
ffs_epfile_release() on the close path still sees priv->req
non-NULL under ffs->eps_lock:
if (priv->ep && priv->req)
usb_ep_dequeue(priv->ep, priv->req);
so usb_ep_dequeue() is called on a freed usb_request.
On dummy_hcd the dequeue path only walks a live queue and
pointer-compares, so the freed pointer reads without faulting and
KASAN requires an explicit check at the FunctionFS call site to
surface the use-after-free. On SG-capable in-tree UDCs the
dequeue path dereferences the supplied request immediately:
* chipidea's ep_dequeue() does
container_of(req, struct ci_hw_req, req) and reads
hwreq->req.status before acquiring its own lock.
* cdnsp's cdnsp_gadget_ep_dequeue() reads request->status first.
The narrower option of clearing priv->req via cmpxchg() in the
completion does not close the race: the completion runs without
eps_lock, so a cancel path holding eps_lock can still observe
priv->req non-NULL, race a concurrent completion that clears and
frees, and pass the freed pointer to usb_ep_dequeue(). A slightly
longer fix that moves the free into the cleanup work is needed.
Same class of lifetime race as the recent usbip-vudc timer fix [1].
Take eps_lock in the sole place that mutates priv->req from the
callback direction by moving usb_ep_free_request() out of the
completion into ffs_dmabuf_cleanup(), the existing work handler
scheduled by ffs_dmabuf_signal_done() on
ffs->io_completion_wq. Clear priv->req there under eps_lock
before freeing, and only clear if priv->req still names our
request (a subsequent ffs_dmabuf_transfer() on the same
attachment may have queued a new one).
This keeps the existing dummy_hcd sync-dequeue invariant: the
completion callback is still invoked by the UDC without
eps_lock held (dummy_hcd drops its own lock before calling the
callback), and the callback now takes no f_fs lock at all.
Serialization against the cancel path happens in cleanup, which
runs from the workqueue with no f_fs lock held on entry.
The priv ref count protects the containing ffs_dmabuf_priv:
ffs_dmabuf_transfer() takes a ref via ffs_dmabuf_get(), cleanup
drops it via ffs_dmabuf_put(), so priv stays live for the
cleanup even after the cancel path's list_del + ffs_dmabuf_put.
The ffs_dmabuf_transfer() error path no longer frees usb_req
inline: fence->req and fence->ep are set before usb_ep_queue(),
so ffs_dmabuf_cleanup() (scheduled by the error-path
ffs_dmabuf_signal_done()) owns the free regardless of whether
the queue succeeded.
Reproduced under KASAN on both detach and close paths against
dummy_hcd with an observability hook
(kasan_check_byte(priv->req) immediately before usb_ep_dequeue)
at the two FunctionFS cancel sites to surface the stale-pointer
access; the hook is not part of this patch. The KASAN
allocator / free stacks in the captured splats identify the
same request: alloc in dummy_alloc_request, free in
dummy_timer, fault reached from ffs_epfile_release (close) and
from the FUNCTIONFS_DMABUF_DETACH ioctl (detach). With the
patch applied, both paths are silent under the same hook.
The bug is reached from the FunctionFS device node, which in
real deployments is owned by the privileged gadget daemon
(adbd, UMS, composite gadget services, etc.); it is not
reachable from unprivileged userspace or from a USB host on the
cable. FunctionFS mounts default to GLOBAL_ROOT_UID, but the
filesystem supports uid=, gid=, and fmode= delegation to a
non-root gadget daemon, so on real deployments the attacker may
be a less-privileged service rather than root.
Fixes: 7b07a2a7ca ("usb: gadget: functionfs: Add DMABUF import interface")
Link: https://lore.kernel.org/all/20260417163552.807548-1-michael.bommarito@gmail.com/ [1]
Cc: stable <stable@kernel.org>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260419161227.1587668-1-michael.bommarito@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ffs_ep0_read() allocates its control-OUT data buffer with
kmalloc() (not kzalloc) at the Length value from the Setup
packet, then copies that full len to userspace regardless of
how many bytes were actually received:
data = kmalloc(len, GFP_KERNEL);
...
ret = __ffs_ep0_queue_wait(ffs, data, len);
if ((ret > 0) && (copy_to_user(buf, data, len)))
ret = -EFAULT;
__ffs_ep0_queue_wait() returns req->actual, which on a short
control OUT transfer is strictly less than len. The
copy_to_user() call still copies len bytes, so on a short OUT
the last (len - ret) bytes of the kmalloc() buffer --
uninitialised slab residue -- are delivered to the FunctionFS
daemon.
Short ep0 OUT completions are specified USB control-transfer
behavior and are produced by in-tree UDCs:
* dwc2 continues on req->actual < req->length for ep0 DATA OUT
(short-not-ok is the only ep0-OUT stall path).
* aspeed_udc ends ep0 OUT on rx_len < ep->ep.maxpacket.
* renesas_usbf logs "ep0 short packet" and completes the
request.
* dwc3 stalls on short IN but not on short OUT.
A short ep0 OUT is therefore not evidence of a broken UDC; it is
a normal condition f_fs has to cope with. The sibling gadgetfs
implementation in drivers/usb/gadget/legacy/inode.c already does
this correctly via min(len, dev->req->actual) before
copy_to_user(). This patch brings f_fs.c to the same safe
pattern rather than trimming at a defensive layer.
The bug is reached from the FunctionFS device node, which in
real deployments is owned by the privileged gadget daemon
(adbd, UMS, composite gadget services, etc.); it is not
reachable from unprivileged userspace. Linux host stacks
normally reject short-wLength control OUTs before they reach
the gadget, so reproducing this required a build that
bypasses that host-side check. With the bypass in place, a
1-byte payload on a 64-byte Setup produces 63 bytes of
non-canary slab residue in the daemon's read buffer.
Fix by copying only ret (actually received) bytes to
userspace.
Fixes: ddf8abd259 ("USB: f_fs: the FunctionFS driver")
Cc: stable <stable@kernel.org>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260419160359.1577270-1-michael.bommarito@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The `dummy_hub_control()` function handles USB hub class requests
to the virtual root hub. The `GetPortStatus` case returns -EPIPE for
requests with `wIndex != 1`, since the virtual root hub has only a
single port. However, the `ClearPortFeature` and `SetPortFeature`
cases lack the same check.
Fix this by extending the `wIndex != 1` rejection to both cases,
matching the existing behavior of `GetPortStatus`.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@kernel.org>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Seungjin Bae <eeodqql09@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://patch.msgid.link/20260518234314.1889396-1-eeodqql09@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The EIC7700 USB requires a USB PHY reset operation; otherwise, the USB
will not work. The reason why the USB driver that was applied can work
properly is that the USB PHY has already been reset in ESWIN's U-Boot.
However, the proper functioning of the USB driver should not be dependent
on the bootloader. Therefore, it is necessary to incorporate the USB PHY
reset signal into the DT bindings.
This patch does not introduce any backward incompatibility since the dts
is not upstream yet. As array of reset operations are used in the driver,
no modifications to the USB controller driver are needed.
Fixes: c640a4239d ("dt-bindings: usb: Add ESWIN EIC7700 USB controller")
Cc: stable <stable@kernel.org>
Signed-off-by: Hang Cao <caohang@eswincomputing.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260415064238.1784-1-caohang@eswincomputing.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch follows up Zheng Wang's 2023 report of a use-after-free in
vudc_remove(). The original thread stalled on Shuah Khan's request for
runtime testing of the unplug/unbind path. This patch supplies that
testing and keeps Zheng's original fix shape.
In vudc_probe(), v_init_timer() binds udc->tr_timer.timer to v_timer().
usbip_sockfd_store() starts the timer via v_start_timer()/v_kick_timer().
vudc_remove() can then free the containing struct vudc while the timer is
still pending or executing.
KASAN confirms the race on an unpatched x86_64 QEMU guest with
CONFIG_KASAN=y, CONFIG_USBIP_VUDC=y, CONFIG_USB_ZERO=y, and a tight loop
that repeatedly writes a socket fd to usbip_sockfd, closes the socket
pair, and unbinds/rebinds usbip-vudc.0:
BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x8ba/0x8e0
Write of size 8 at addr ffff888001b80740 by task trigger_and_unb/239
Allocated by task 239:
vudc_probe+0x4d/0xaa0
Freed by task 239:
kfree+0x18f/0x520
device_release_driver_internal+0x388/0x540
unbind_store+0xd9/0x100
This lands in the timer core rather than v_timer() itself because the
embedded timer_list is being walked after its containing struct vudc has
already been freed. The underlying lifetime bug is the same one Zheng
reported.
With v_stop_timer() called from vudc_remove() and the timer deleted
synchronously, the same harness completed 5000 bind/unbind iterations
with no KASAN report.
Fixes: b6a0ca1118 ("usbip: vudc: Add UDC specific ops")
Cc: stable <stable@kernel.org>
Reported-by: Zheng Wang <zyytlz.wz@163.com>
Closes: https://lore.kernel.org/linux-usb/20230317100954.2626573-1-zyytlz.wz@163.com/
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/20260417163552.807548-1-michael.bommarito@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The deprecated 'usb-phy' property is documented already in usb.yaml and
doesn't need a type definition here. It just needs constraints on how
many entries there are.
As this is a host controller, reference usb-hcd.yaml which then
references usb.yaml.
Fixes: 70fcdc82cf ("dt-bindings: usb: ti,omap4-musb: convert to DT schema")
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260508182556.1759173-1-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PNY Elite Portable SSD (USB ID 154b:f009) is a sibling of the
already-quirked PNY Pro Elite SSDs (154b:f00b and 154b:f00d). Like its
siblings, it uses a Phison-based USB-SATA bridge that exhibits
firmware bugs when bound to the uas driver.
Without quirks, the device fails to complete READ CAPACITY commands
when accessed over UAS on a SuperSpeed (USB 3) port. The device
enumerates and reports as a SCSI direct-access device, but reports
zero logical blocks and never finishes spin-up:
usb 2-3: new SuperSpeed USB device number 8 using xhci_hcd
usb 2-3: New USB device found, idVendor=154b, idProduct=f009
usb 2-3: Product: PNY ELITE PSSD
usb 2-3: Manufacturer: PNY
scsi host0: uas
scsi 0:0:0:0: Direct-Access PNY PNY ELITE PSSD 0
sd 0:0:0:0: [sda] Spinning up disk...
[...10+ seconds of polling, no progress...]
sd 0:0:0:0: [sda] Read Capacity(16) failed: hostbyte=DID_ERROR
sd 0:0:0:0: [sda] Read Capacity(10) failed: hostbyte=DID_ERROR
sd 0:0:0:0: [sda] 0 512-byte logical blocks: (0 B/0 B)
Tested each individual quirk to find the minimum that fixes this:
- US_FL_NO_ATA_1X alone: device hangs on spin-up
- US_FL_NO_REPORT_OPCODES alone: works on USB 2.0, hangs on USB 3.0
- US_FL_NO_ATA_1X | US_FL_NO_REPORT_OPCODES: works on both
With both quirks the device enumerates correctly while still using
the uas driver, and delivers full UAS throughput (~281 MB/s
sequential read on a USB 3.0 Gen 1 port).
The existing PNY Pro Elite entries (f00b, f00d) only set NO_ATA_1X,
but this device additionally chokes on REPORT OPCODES under
SuperSpeed.
Signed-off-by: Sam Burkels <sam@1a38.nl>
Acked-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@kernel.org>
Link: https://patch.msgid.link/20260501132346.86572-1-sam@1a38.nl
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Lenovo ThinkPad USB-C Dock Gen2 (17ef:a391, 17ef:a392) hub
controllers exhibit link instability when USB Link Power Management
is enabled, similar to the dock's Ethernet adapter (17ef:a387) which
already carries USB_QUIRK_NO_LPM.
When the dock reconnects after a transient disconnect, the hub
controllers enter LPM states between re-enumeration retries, causing
repeated disconnect/reconnect cycles lasting up to two minutes.
Disabling LPM for these devices restores stable enumeration.
Signed-off-by: Stephen J. Fuhry <fuhrysteve@gmail.com>
Cc: stable <stable@kernel.org>
Link: https://patch.msgid.link/20260513171419.44849-1-fuhrysteve@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB488 subclass specification requires interrupt wMaxPacketSize to
be 0x02, unless the device sends vendor-specific notifications.
Endpoints that advertise less than 2 bytes for wMaxPacketSize are
unlikely to work with the current driver, as URBs will not have enough
space for interrupt headers. Considering that any notification URBs will
be ignored by the driver, reject these endpoints early during probe.
Fixes: 041370cce8 ("USB: usbtmc: refactor endpoint retrieval")
Cc: stable <stable@kernel.org>
Suggested-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260505-usbtmc-iin-size-v3-2-a36113f62db7@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
USBTMC devices can use an optional interrupt endpoint for notification
messages. These typically contain two-byte headers indicating the
payload format, but the driver does not check if these headers are
present before accessing the data buffers. In cases where the URB
actual_length is not enough to fit these headers, the driver will either
cause an out-of-bounds read, or consume stale leftover data from a
previous notification.
Fix by checking if actual_data contains enough bytes for the headers,
otherwise resubmit URB to the interrupt endpoint.
Fixes: dbf3e7f654 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Reported-by: syzbot+abbfd103085885cf16a2@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=abbfd103085885cf16a2
Cc: stable <stable@kernel.org>
Suggested-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260505-usbtmc-iin-size-v3-1-a36113f62db7@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a USB device is unplugged from the dual-role port, the device-mode
path in tegra_xhci_id_work() explicitly clears both SS and HS port power
via direct hub_control ClearPortFeature(POWER) calls. This preempts the
xHCI controller's normal disconnect processing -- PORT_CSC is never
generated, the USB core never sees the disconnect, and the device remains
in its internal tree as a ghost visible in lsusb.
Add an otg_set_port_power flag to control whether the dual-role switch
path performs explicit port power management. SoCs that need it
(Tegra124 / Tegra210 / Tegra186) set the flag; later SoCs (Tegra194 and
beyond) rely on the PHY mode change to handle disconnect naturally and
skip all port power calls.
Within the port power path, otg_reset_sspi additionally gates the SSPI
reset sequence on host-mode entry for SoCs that require it.
Flags set per SoC:
Tegra124, Tegra186 -> otg_set_port_power
Tegra210 -> otg_set_port_power, otg_reset_sspi
Tegra194 and later -> (none)
Fixes: f836e78430 ("usb: xhci-tegra: Add OTG support")
Cc: stable <stable@kernel.org>
Signed-off-by: Wei-Cheng Chen <weichengc@nvidia.com>
Link: https://patch.msgid.link/20260505112630.217704-1-weichengc@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
uvc_function_bind() walks &opts->extension_units twice without holding
opts->lock:
- directly, for the iExtension string-descriptor fixup loop;
- indirectly, four times via uvc_copy_descriptors() (once per speed),
where the helper iterates uvc->desc.extension_units (which aliases
&opts->extension_units) to size and emit XU descriptors.
The configfs side (uvcg_extension_make / uvcg_extension_drop, in
drivers/usb/gadget/function/uvc_configfs.c) takes opts->lock around its
list_add_tail / list_del operations. A privileged userspace process
that holds the configfs subtree open and writes the gadget UDC name
to bind the function while concurrently rmdir()'ing an extensions
subdir can race uvcg_extension_drop() against the bind-time list walks
and dereference a freed struct uvcg_extension.
Hold opts->lock from the start of the XU string-descriptor fixup
through the last uvc_copy_descriptors() call, releasing on the
descriptor-error path via a new error_unlock label that drops the
lock before falling through to the existing error label. This
matches the locking discipline of the configfs callbacks and removes
the only remaining unsynchronised reader of the XU list during bind.
Reachability: only privileged processes that can mount configfs and
write to gadget UDC files can trigger the race, so this is a
correctness fix rather than a security boundary.
Fixes: 0525210c98 ("usb: gadget: uvc: Allow definition of XUs in configfs")
Cc: stable <stable@kernel.org>
Signed-off-by: Kai Aizen <kai.aizen.dev@gmail.com>
Link: https://patch.msgid.link/20260430175643.67120-1-kai.aizen.dev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
usb_initialize_gadget() installs gadget_release() as the release
callback for the embedded gadget device. The struct net2280 instance is
therefore released through gadget_release() when the gadget device's last
reference is dropped.
The probe error path calls net2280_remove(), which tears down the
partially initialized device and drops the gadget reference with
usb_put_gadget(). Calling kfree(dev) afterwards can free the same object
again.
Drop the explicit kfree() and let the gadget device release callback
handle the final free. This issue was found by a static analysis tool
I am developing.
Fixes: f770fbec41 ("USB: UDC: net2280: Fix memory leaks")
Cc: stable <stable@kernel.org>
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://patch.msgid.link/20260427153651.337846-1-lgs201920130244@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hidg_alloc() initializes hidg->dev with device_initialize() before
calling dev_set_name(). If dev_set_name() fails, the function currently
jumps to err_unlock and returns without calling put_device().
This leaves the device reference unbalanced and prevents hidg_release()
from being called. Calling put_device() here is also safe, since
hidg_release() only frees resources owned by hidg.
The issue was identified by a static analysis tool I developed and
confirmed by manual review.
Route the dev_set_name() failure path through err_put_device so the
device reference is dropped properly.
Fixes: 89ff3dfac6 ("usb: gadget: f_hid: fix f_hidg lifetime vs cdev")
Cc: stable <stable@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Johan Hovold johan@kernel.org
Link: https://patch.msgid.link/20260413142119.2977716-1-lgs201920130244@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In omap2430_probe(), of_node_put(np) is called prematurely before the
last access to np, leading to a use-after-free if the node's reference
count drops to zero. Move the of_node_put() calls after the last use of
np in both the success and error paths.
Fixes: ffbe2feac5 ("usb: musb: omap2430: Fix probe regression for missing resources")
Cc: stable <stable@kernel.org>
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20260409101104.480623-1-vulab@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ESP out-of-place fast path appends the trailer in esp_output_head()
before esp_output_tail() allocates the destination page frag. The
head-side gate currently checks skb->data_len and tailen separately, but
the tail code allocates a single destination frag from the combined
post-trailer skb->data_len.
Reject the page-frag fast path when the combined aligned length exceeds a
page. Otherwise skb_page_frag_refill() may fall back to a single page while
the destination sg still spans the combined skb->data_len.
Restore this combined-length page gate for both IPv4 and IPv6.
Fixes: 5bd8baab08 ("esp: limit skb_page_frag_refill use to a single page")
Cc: stable@vger.kernel.org
Signed-off-by: Lin Ma <malin89@huawei.com>
Signed-off-by: Chenyuan Mi <michenyuan@huawei.com>
Signed-off-by: Jingguo Tan <tanjingguo@huawei.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
In esp_output_tail(), when esp->inplace is false, the old skb page frags
are replaced with a new page from the xfrm page_frag cache. The source
scatterlist (sg) is built from the old frags before the replacement, and
esp_ssg_unref() is responsible for releasing the old page references
after the crypto operation completes.
However, if the second skb_to_sgvec() call (which builds the destination
scatterlist from the new page) fails, the code jumps to error_free which
only calls kfree(tmp). The old page frag references captured in the
source scatterlist are never released:
1. sg[] is built from old frags via skb_to_sgvec() (no extra get_page)
2. nr_frags is set to 1 and frag[0] is replaced with the new page
3. Second skb_to_sgvec() fails -> goto error_free
4. kfree(tmp) frees the sg[] memory but old frags are not unref'd
5. kfree_skb() only releases frag[0] (the new page), not the old ones
Fix this by adding a bool parameter to esp_ssg_unref() that, when true,
unconditionally unrefs the source scatterlist frags without checking
req->src and req->dst, since those fields are not yet initialized by
aead_request_set_crypt() at the point of the error. Existing callers
pass false to preserve the original behavior.
The same issue exists in both esp4 and esp6 as the code is identical.
Fixes: cac2661c53 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a ("esp6: Avoid skb_cow_data whenever possible")
Signed-off-by: Alessandro Schino <7991aleschino@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
This reverts commit db359fccf2 ("mm: introduce a new page type for page
pool in page type") and a part of 735a309b4b ("net: add net_iov_init()
and use it to initialize ->page_type").
Netpp page_type'ed pages might be used in mapping so as to use @_mapcount.
However, since @page_type and @_mapcount are union'ed in struct page,
these two can't be used at the same time. Revert the commit introducing
page_type for Netpp for now.
The patch will be retried once @page_type and @_mapcount get allowed to be
used at the same time.
The revert also includes removal of @page_type initialization part
introduced by commit 735a309b4b ("net: add net_iov_init() and use it
to initialize ->page_type"), which will be restored on the retry.
Link: https://lore.kernel.org/20260515034701.17027-1-byungchul@sk.com
Fixes: db359fccf2 ("mm: introduce a new page type for page pool in page type")
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
Closes: https://lore.kernel.org/all/982b9bc1-0a0a-4fc5-8e3a-3672db2b29a1@nvidia.com
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Mark Bloch <mbloch@nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Toke Hoiland-Jorgensen <toke@redhat.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__get_vm_area_node() currently triggers a BUG() if in_interrupt() returns
true. However, in_interrupt() also reports true when BH are disabled.
The bridge code can call rhashtable_lookup_insert_fast() with bottom
halves disabled:
__vlan_add()
-> br_fdb_add_local()
spin_lock_bh(&br->hash_lock); <-- Disable BH
-> fdb_add_local()
-> fdb_create()
-> rhashtable_lookup_insert_fast()
-> kvmalloc()
-> vmalloc()
-> __get_vm_area_node()
-> BUG_ON(in_interrupt())
spin_unlock_bh(&br->hash_lock)
this triggers the BUG() despite the caller not being in NMI or
hard IRQ context.
Replace the in_interrupt() check with in_nmi() || in_hardirq().
Link: https://lore.kernel.org/20260515153009.2296191-1-urezki@gmail.com
Fixes: c6307674ed ("mm: kvmalloc: add non-blocking support for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reported-by: syzbot+8b12fc6e0fb139765b58@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69ff8c7c.050a0220.1036b8.000b.GAE@google.com/
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When migrate_vma_insert_huge_pmd_page() jumps to unlock_abort due
to a PMD check failure, the pgtable allocated earlier via
pte_alloc_one() is never freed, causing a memory leak.
Added free_abort label to release the pgtable in error path.
Link: https://lore.kernel.org/20260501115122.23288-1-nueralspacetech@gmail.com
Fixes: a30b48bf1b ("mm/migrate_device: implement THP migration of zone device pages")
Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When a child process exits, it sends exit_signal to its parent via
do_notify_parent(). The clone() syscall constructs exit_signal as:
(lower_32_bits(clone_flags) & CSIGNAL)
CSIGNAL is 0xff, so values in the range 65-255 are possible. However,
valid_signal() only accepts signals up to _NSIG (64 on x86_64). A
non-zero non-valid exit_signal acts the same as exit_signal == 0: the
parent process is not signaled when the child terminates.
The syzkaller reproducer triggers this by calling clone() with flags=0x80,
resulting in exit_signal = (0x80 & CSIGNAL) = 128, which exceeds _NSIG and
is not a valid signal.
The v1 of this patch added the check only in the clone() syscall handler,
which is incomplete. kernel_clone() has other callers such as
sys_ia32_clone() which would remain unprotected. Move the check to
kernel_clone() to cover all callers.
Since the valid_signal() check is now in kernel_clone() and covers all
callers including clone3(), the same check in copy_clone_args_from_user()
becomes redundant and is removed. The higher 32bits check for clone3() is
kept as it is clone3() specific.
Note that this is a user-visible change: previously, passing an invalid
exit_signal to clone() was silently accepted. The man page for clone()
does not document any defined behavior for invalid exit_signal values, so
rejecting them with -EINVAL is the correct behavior. It is unlikely that
any sane application relies on passing an invalid exit_signal.
[oleg@redhat.com: the comment above kernel_clone() should be updated]
Link: https://lore.kernel.org/abwvgU17W8wuW2-J@redhat.com
Link: https://lore.kernel.org/20260316151956.563558-1-kartikey406@gmail.com
Fixes: 3f2c788a13 ("fork: prevent accidental access to clone3 features")
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bbe6b99feefc3a0842de
Tested-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/20260307064202.353405-1-kartikey406@gmail.com/T/ [v1]
Link: https://lore.kernel.org/all/20260316104536.558108-1-kartikey406@gmail.com/T/ [v2]
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
flush_nmi_stats() drains per-node NMI slab atomics into the per-node
lruvec_stats, but does not propagate them to the memcg-level vmstats.
For non NMI case, account_slab_nmi_safe() calls mod_memcg_lruvec_state()
which updates both per-node lruvec_stats and memcg-level vmstats, so
flush_nmi_stats() needs to flush to per-node lruvec_stats as well as
memcg-level vmstats.
So fix this by flushing to the memcg-level vmstats for NMI too.
Link: https://lore.kernel.org/20260518082830.599102-1-alex@ghiti.fr
Fixes: 940b01fc8d ("memcg: nmi safe memcg stats for specific archs")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
DAMON sysfs maintains the DAMOS tried region directory objects via a
linked list. When the user requests refresh of the directories, DAMON
sysfs removes all the region directories first, and then generate updated
regions directory on the empty space. The removal function
(damon_sysfs_scheme_regions_rm_dirs()) only puts the kobj objects.
Deletion of the container region object from the linked list is done
inside the kobj release callback function.
If somehow the callback invocation is delayed, the list will contain
regions list that gonna be freed. If the updated region directories
creation is started in this situation, the list can be corrupted and
use-after-free can happen.
Because the kobj objects are managed by only DAMON sysfs, the issue cannot
happen in normal situation. But, such delays can be made on kernels that
built with CONFIG_DEBUG_KOBJECT_RELEASE. On the kernel, the issue can
indeed be reproduced like below.
# damo start --damos_action stat
# cd /sys/kernel/mm/damon/admin/kdamonds/0/
# for i in {1..10}; do echo update_schemes_tried_regions > state; done
# dmesg | grep underflow
[ 89.296152] refcount_t: underflow; use-after-free.
Fix the issue by removing the region object from the list when
decrementing the reference count.
Also update damos_sysfs_populate_region_dir() to add the region object to
the list only after the kobject_init_and_add() is success, so that fail of
kobject_init_and_add() is not leaving the deallocated object on the list.
The issue was discovered [1] by Sashiko.
Link: https://lore.kernel.org/20260518152559.93038-1-sj@kernel.org
Link: https://lore.kernel.org/20260513011920.119183-1-sj@kernel.org [1]
Fixes: 9277d0367b ("mm/damon/sysfs-schemes: implement scheme region directory")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> # 6.2.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Initialize nr_pages to 1 at the start of each loop iteration, like
folio_referenced_one() does.
Without this, nr_pages computed by a previous folio_unmap_pte_batch() call
can be reused on a later iteration that does not run
folio_unmap_pte_batch() again.
mmap a 64K large folio with MAP_ANONYMOUS | MAP_DROPPABLE, then call
madvise(MADV_FREE), then make the last page device-exclusive via
HMM_DMIRROR_EXCLUSIVE.
Trigger node reclaim through sysfs. Now, in try_to_unmap_one(), we will
first clear the first 15 out of 16 entries mapping the lazyfree folio.
This will set nr_pages to 15. In the next pvmw walk, this nr_pages gets
reused on a device-exclusive pte, thus potentially corrupting folio
refcount/mapcount.
At the moment, I have a userspace program which can make the kernel spit
out a trace, but the blow up is in folio_referenced_one(), because there
are existing bugs in the interaction between device-private and rmap
(which too I am investigating). I did a one liner kernel change to avoid
going into folio_referenced_one(), and the kernel blows up at
folio_remove_rmap_ptes in try_to_unmap_one which is what I wanted.
Note that the bug is there not since file folio batching but lazyfree
folio batching, since device-exclusive only works for anonymous folios.
Userspace visible effect is simply kernel crashing somewhere due to
refcount/mapcount corruption.
Link: https://lore.kernel.org/20260518063656.3721056-1-dev.jain@arm.com
Fixes: 354dffd295 ("mm: support batched unmap for lazyfree large folios during reclamation")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Harry Yoo <harry@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A crash was observed in zram_writeback_endio due to a NULL pointer
dereference in wake_up. The root cause is a race condition between the
bio completion handler (zram_writeback_endio) and the writeback task.
In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
releasing wb_ctl->done_lock. This creates a race window where the
writeback task can see num_inflight become 0, return, and free wb_ctl
before zram_writeback_endio calls wake_up().
CPU 0 (zram_writeback_endio) CPU 1 (writeback_store)
============================ ============================
zram_writeback_slots
zram_submit_wb_request
zram_submit_wb_request
wait_event(wb_ctl->done_wait)
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
wake_up(&wb_ctl->done_wait);
zram_complete_done_reqs
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
while (num_inflight) > 0)
spin_lock(&wb_ctl->done_lock);
list_del(&req->entry);
spin_unlock(&wb_ctl->done_lock);
// num_inflight becomes 0
atomic_dec(num_inflight);
// Leave zram_writeback_slots
// Free wb_ctl
release_wb_ctl(wb_ctl);
// UAF crash!
wake_up(&wb_ctl->done_wait);
This patch fixes this race by using RCU. By protecting wb_ctl with
rcu_read_lock() in zram_writeback_endio and using kfree_rcu() to free it,
we ensure that wb_ctl remains valid during the execution of
zram_writeback_endio.
Link: https://lore.kernel.org/20260512074918.2606208-1-richardycc@google.com
Fixes: f405066a1f ("zram: introduce writeback bio batching")
Signed-off-by: Richard Chang <richardycc@google.com>
Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin Liu <liumartin@google.com>
Cc: wang wei <a929244872@163.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When SEAL_EXEC is added, SEAL_WRITE is implied to make W^X. But the
implied seal is set after the check that makes sure the memfd can not have
any writable mappings. This means one can use SEAL_EXEC to apply
SEAL_WRITE while having writeable mappings.
This breaks the contract that SEAL_WRITE provides and can be used by an
attacker to pass a memfd that appears to be write sealed but can still be
modified arbitrarily.
Fix this by adding the implied seals before the call for
mapping_deny_writable() is done.
Link: https://lore.kernel.org/20260505133922.797635-1-pratyush@kernel.org
Fixes: c4f75bc8bd ("mm/memfd: add write seals when apply SEAL_EXEC to executable memfd")
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Jeff Xu <jeffxu@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The checkpoint/restore sysctl path can request the next SysV IPC id
through ids->next_id. ipc_idr_alloc() currently forwards that request to
idr_alloc() with an open-ended upper bound.
If the valid tail of the SysV IPC id space is full, the allocation can
spill beyond ipc_mni. The returned SysV IPC id still uses the normal
index encoding, so later lookup and removal can target the wrong slot.
This leaves the real IDR entry behind and breaks the IDR state for the
object.
The bug is in ipc_idr_alloc() in the checkpoint/restore path.
1. ids->next_id is passed to:
idr_alloc(&ids->ipcs_idr, new, ipcid_to_idx(next_id), 0, ...)
2. The zero upper bound makes the allocation effectively open-ended.
Once the valid SysV IPC tail is occupied, idr_alloc() can spill past
ipc_mni and allocate an entry beyond the valid IPC id range.
3. The new object id is still encoded with the narrower SysV IPC index
width:
new->id = (new->seq << ipcmni_seq_shift()) + idx
4. Later removal goes through ipc_rmid(), which uses:
ipcid_to_idx(ipcp->id)
That truncates the real IDR index. An object actually stored at a
high index can then be removed as if it lived at a low in-range
index.
5. For shared memory, shm_destroy() frees the current object anyway, but
the real high IDR slot is left behind as a dangling pointer.
6. A subsequent walk of /proc/sysvipc/shm reaches the stale IDR entry
and dereferences freed memory.
Prevent this by bounding the requested allocation to ipc_mni so the
checkpoint/restore path fails once the valid range is exhausted.
Link: https://lore.kernel.org/cover.1778336914.git.linpu5433@gmail.com
Link: https://lore.kernel.org/2eebe949bfa7d1f6e13b5be6a92c64c850ce9d45.1778336914.git.linpu5433@gmail.com
Fixes: 03f5956680 ("ipc: add sysctl to specify desired next object id")
Signed-off-by: Linpu Yu <linpu5433@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Cc: Kees Cook <kees@kernel.org>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This reverts commit ea52cb24cd ("mm/hugetlbfs: update hugetlbfs to use
mmap_prepare") with conflict resolution to account for changes in commit
ea52cb24cd ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare").
The patch incorrectly handled hugetlb VMA lock allocation at the
mmap_prepare stage, where a failed allocation occurring after mmap_prepare
is called might result in the lock leaking.
There is no risk of a merge causing a similar issues, as
VMA_DONTEXPAND_BIT is set for hugetlb mappings.
As a first step in addressing this issue, simply revert the change so we
can rework how we do this having corrected the underlying issues.
We maintain the VMA flags changes as best we can, accounting for the fact
that we were working with a VMA descriptor previously and propagating
like-for-like changes for this.
Note that we invoke vma_set_flags() and do not call vma_start_write() as
vm_flags_set() does. This is OK as it's being done in an .mmap hook where
the VMA is not yet linked into the tree so nobody else can be accessing
it.
Link: https://lore.kernel.org/20260512160643.266960-1-ljs@kernel.org
Fixes: ea52cb24cd ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reported-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Closes: https://lore.kernel.org/linux-mm/20260425070700.562229-1-25181214217@stu.xidian.edu.cn/
Acked-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Update my email address from @ge.com to @gehealthcare.com after GE
HealthCare was spun-off from GE.
Link: https://lore.kernel.org/20260506063335.3-1-ian.ray@gehealthcare.com
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge the fixes to add power-domain and correct clocks for the ICC block
in Eliza and Milos through a topic branch, to allow them to be merged
also into arm64-for-7.2 to resolve the merge conflicts that would
otherwise appear.
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Qualcomm in-line crypto engine (ICE) platform driver specifies and votes
for its own resources. Before accessing ICE hardware during probe, to
avoid potential unclocked register access issues (when clk_ignore_unused
is not passed on the kernel command line), in addition to the 'core' clock
the 'iface' clock should also be turned on by the driver. This can only be
done if the GCC_UFS_PHY_GDSC power domain is enabled. Specify both the
GCC_UFS_PHY_GDSC power domain and the 'iface' clock in the ICE node for
eliza.
Fixes: af20af39fc ("arm64: dts: qcom: Introduce Eliza Soc base dtsi")
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Fixes: 54a4f0239f ("KVM: MMU: make kvm_mmu_zap_page() return the
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-13-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Qualcomm in-line crypto engine (ICE) platform driver specifies and votes
for its own resources. Before accessing ICE hardware during probe, to
avoid potential unclocked register access issues (when clk_ignore_unused
is not passed on the kernel command line), in addition to the 'core' clock
the 'iface' clock should also be turned on by the driver. This can only be
done if the UFS_PHY_GDSC power domain is enabled. Specify both the
UFS_PHY_GDSC power domain and the 'iface' clock in the ICE node for milos.
Fixes: 04bb374333 ("arm64: dts: qcom: milos: Add UFS nodes")
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-12-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Flush the current TLB when xAVIC *or* x2AVIC is activated, as KVM is
(apparently) responsible for purging TLB entries when transitioning from
xAVIC to x2AVIC. The APM says a whole lot of nothing about TLB flushing
with respect to (x2)AVIC, but empirical data strongly suggests hardware
also does a whole lot of nothing.
Failure to flush the TLB when enabling x2AVIC can lead to guest accesses
to the APIC base address getting incorrectly redirected to the virtual
APIC page. The flaw most visibly manifests as failures in KVM-Unit-Test's
verify_disabled_apic_mmio() testcase when x2APIC is enabled (though for
reasons unknown, the test only reliably fails with EFI builds).
Fixes: 0ccf3e7cb9 ("KVM: SVM: Flush the "current" TLB when activating AVIC")
Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Link: https://patch.msgid.link/20260515171536.1841645-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use kvm_register_mark_dirty() instead of kvm_register_is_dirty() to
actually mark VCPU_EXREG_ERAPS as dirty when emulating
INVPCID_TYPE_SINGLE_CTXT. kvm_register_is_dirty() is a read-only
predicate whose return value is discarded, making the call a no-op.
Without this fix, a single-context INVPCID will not trigger a RAP clear
on the next VMRUN, breaking the ERAPS security guarantee.
Fixes: db5e824964 ("KVM: SVM: Virtualize and advertise support for ERAPS")
Signed-off-by: Emily Ehlert <ehemily@amazon.de>
Link: https://patch.msgid.link/20260518135956.82569-1-ehemily@amazon.de
Signed-off-by: Sean Christopherson <seanjc@google.com>
The F_GETLK fcntl can work with either read access or write access or
both. It can query F_RDLCK and F_WRLCK locks in either case.
However lockd currently treats F_GETLK similar to F_SETLK in that read
access is required to query an F_RDLCK lock and write access is required
to query a F_WRLCK lock.
This is wrong and can cause problems - e.g. when qemu accesses a
read-only (e.g. iso) filesystem image over NFS (though why it queries
if it can get a write lock - I don't know. But it does, and this works
with local filesystems).
So we need TEST requests to be handled differently. To do this:
- change nlm_do_fopen() to accept O_RDWR as a mode and in that case
succeed if either a O_RDONLY or O_WRONLY file can be opened.
- change nlm_lookup_file() to accept a mode argument from caller,
instead of deducing base on lock time, and pass that on to nlm_do_fopen()
- change nlm4svc_retrieve_args() and nlmsvc_retrieve_args() to detect
TEST requests and pass O_RDWR as a mode to nlm_lookup_file, passing
the same mode as before for other requests. Also set
lock->fl.c.flc_file to whichever file is available for TEST requests.
- change nlmsvc_testlock() to also not calculate the mode, but to use
whatever was stored in lock->fl.c.flc_file.
This behaviour of lockd - requesting O_WRONLY access to TEST for
exclusive locks - has been present at least since git history began.
However it was hidden until recently because knfsd ignored the access
requested by lockd and required only READ access for all locking
requests (unless the underlying filesystem provided an f_op->open
function which checked access permissions).
The commit mentioned in Fixes: below changed nfsd_permission() to NOT
override the access request for LOCK requests and this exposed the bug
that we are now fixing.
Note that there is another issue that this patch does not address.
The flock(.., LOCK_EX) call is permitted on a read-only file descriptor.
Linux NFS maps this to NLM locking as whole-file byte-range locks.
nfsd will see this as though it were fcntl( F_SETLK (F_WRLCK)) and will
now require write access, which it might not be able to get.
It is not clear if this is a problem in practice, or what the best
solution might be. So no attempt is made to address it.
Reported-by: Tj <tj.iam.tj@proton.me>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1128861
Fixes: 4cc9b9f2bf ("nfsd: refine and rename NFSD_MAY_LOCK")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The nfsd_ctl_fh_key_set tracepoint was introduced to capture
operator activity on the filehandle signing key. Earlier revisions
logged the key bytes verbatim; the version that landed hashes the
16 key bytes through crc32_le and stores the result.
CRC32 is a linear projection of its input rather than a one-way
function, and truncating any hash of fixed-size secret material
leaves the key recoverable under offline brute force when the
threat model includes an attacker with access to the trace ring.
The operational question the fingerprint was meant to answer is
whether a NFSD_CMD_THREADS_SET call that carries an
NFSD_A_SERVER_FH_KEY attribute actually replaced the active key or
re-installed the value already in place. Answer that question
directly: compare the incoming key bytes against the current
nn->fh_key inside nfsd_nl_fh_key_set() and surface a single bit to
the tracepoint. The event now prints "updated" when the stored
key changed and "unmodified" otherwise. A first set that fails
kmalloc reports "unmodified" because no key was installed.
Reported-by: jaeyeong <fin@spl.team>
Fixes: 62346217fd ("NFSD: Add a key for signing filehandles")
Cc: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Commit 7b546bd899 ("sunrpc/cache: improve RCU safety in
cache_list walking.") changed the tail of __cache_seq_start()
to unconditionally store
*pos = ((long long)hash << 32) + 1
before returning, dropping a prior "hash >= cd->hash_size"
guard. When the while loop exits because every remaining
bucket was empty, hash equals cd->hash_size, so the stored
*pos is one position past the table's last valid bucket.
seq_read_iter() caches that index in m->index. A subsequent
pread(2) at the same file offset skips traverse() and hands
the stored index back to __cache_seq_start(), which decodes
hash = cd->hash_size and dereferences
cd->hash_table[cd->hash_size] -- one hlist_head past the end
of the kzalloc'd table.
KASAN reports an eight-byte slab-out-of-bounds read at the
tail of the 2048-byte hash_table allocation for the NFSD
export cache (EXPORT_HASHMAX * sizeof(struct hlist_head) ==
256 * 8).
Reject an input hash that is out of range before touching the
hash table. cache_seq_next() already bounds-checks its own
loop; the start routine needs to be symmetric.
Reported-by: syzbot+60cfa08822470bbebe44@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=60cfa08822470bbebe44
Fixes: 7b546bd899 ("sunrpc/cache: improve RCU safety in cache_list walking.")
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
rocket_ioctl_create_bo() inserts a GEM handle into the file's IDR via
drm_gem_handle_create() early on, then performs several operations that
can fail (sgt allocation, drm_mm insert, iommu_map). If any fail after
the handle is live, the error path calls drm_gem_shmem_object_free()
which kfree's the object without removing the handle from the IDR.
This leaves a dangling handle pointing to freed slab memory. Any
subsequent ioctl using that handle (PREP_BO, FINI_BO, SUBMIT) calls
drm_gem_object_lookup() and dereferences freed memory (UAF).
Fix by moving drm_gem_handle_create() to after all fallible operations
succeed, matching the pattern used by panfrost, lima, and etnaviv.
Also fix drm_mm_insert_node_generic() whose return value was silently
overwritten by iommu_map_sgtable() on the next line. Add the missing
error check.
[tomeu: Move handle creation to the very end]
Fixes: 658ebeac33 ("accel/rocket: Add IOCTL for BO creation")
Reported-by: Dhabaleshwar Das <dhabal123@gmail.com>
Signed-off-by: Dhabaleshwar Das <dhabal123@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://patch.msgid.link/20260521165720.2113571-1-tomeu@tomeuvizoso.net
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
When the kernel is booted with a kunit filter (e.g.,
kunit.filter="speed!=slow"), the kunit executor dynamically allocates
copies of the filtered test suites using kmalloc/kmemdup.
During the initial boot execution, kunit_debugfs_create_suite() creates
debugfs files (such as /sys/kernel/debug/kunit/<suite>/run) and
permanently stores a pointer to the dynamically allocated suite in the
inode's i_private field.
Previously, the executor freed this dynamically allocated suite_set
immediately after executing the boot-time tests. Because the debugfs
nodes were not destroyed, any subsequent interaction with the debugfs
`run` file from userspace triggered a use-after-free (UAF). On systems
with architectural capabilities, like CHERI RISC-V, this resulted in
an immediate fatal hardware exception due to the invalidation of the
capability tags on the reclaimed memory. On other architectures, it
resulted in silent memory corruption.
Fix this UAF by properly coupling the lifetime of the filtered suite
memory allocation to the lifetime of the kunit subsystem and its
associated VFS nodes. Ownership of the boot-time suite_set is now
transferred to a global tracker ('kunit_boot_suites'), and the memory
is cleanly released in kunit_exit() during module teardown.
Link: https://lore.kernel.org/r/20260507084854.233984-1-florian.schmaus@codasip.com
Fixes: e2219db280 ("kunit: add debugfs /sys/kernel/debug/kunit/<suite>/results display")
Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com>
Reviewed-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: David Gow <david@davidgow.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
to_usb_interface() is a container_of_const() macro: it performs
pointer arithmetic and never returns NULL. The if (!intf) and if
(intf) tests in get_endpoint_address() can never fire. Remove them
in both drivers.
No functional change.
Suggested-by: Derek J. Clark <derekjohn.clark@gmail.com>
Signed-off-by: Louis Clinckx <clinckx.louis@gmail.com>
Reviewed-by: Derek J. Clark <derekjoh.clark@gmail.com>
Tested-by: Derek J. Clark <derekjohn.clark@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
These drivers only match HID_USB_DEVICE() entries and assume the
underlying bus is USB. Make that explicit at probe by rejecting any
non-USB hdev, following the pattern used by other HID drivers.
Signed-off-by: Louis Clinckx <clinckx.louis@gmail.com>
Reviewed-by: Derek J. Clark <derekjoh.clark@gmail.com>
Tested-by: Derek J. Clark <derekjohn.clark@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
In lenovo_raw_event(), the X12 Tab keyboard handler reads a 4-byte
little-endian value from the raw HID report buffer but:
1. The size guard is size >= 3, while the access reads 4 bytes.
A malformed 3-byte report with ID 0x03 would over-read the
buffer by one byte.
2. Casting u8 *data directly to __le32 * can trigger unaligned
access faults on architectures like ARM, MIPS, and SPARC,
because HID input buffers carry no alignment guarantee.
(e.g. uhid payloads start at offset 6 in struct uhid_event,
giving only 2-byte alignment.)
Fix both by tightening the size check to >= 4 and replacing the
open-coded cast + le32_to_cpu() with get_unaligned_le32(), which
handles the LE-to-CPU conversion safely regardless of alignment.
Link: https://sashiko.dev/#/message/20260512044911.99B6DC2BCB0%40smtp.kernel.org
Assisted-by: CLAUDE:claude-4-sonnet
Signed-off-by: Kean <rh_king@163.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
A typo in the config guard in __hyp_do_panic broke the stage-2 disabling
and made backtraces for pKVM quite unreliable.
Fix that typo.
Fixes: 9019e82c7e ("KVM: arm64: Add PKVM_DISABLE_STAGE2_ON_PANIC")
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260520220830.273289-1-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
We only need to update the power_supply on power role change if the port
is connected, because otherwise the online status should be the same for
both cases.
Cc: stable <stable@kernel.org>
Fixes: 7616f006db ("usb: typec: ucsi: Update power_supply on power role change")
Signed-off-by: Myrrh Periwinkle <myrrhperiwinkle@qtmlabs.xyz>
Reported-and-tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patch.msgid.link/20260519-ucsi-fix-2-v1-2-6f1239535187@qtmlabs.xyz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UGREEN USB-C Multifunction Adapter Model CM512 (AKA "Revodok 107")
exposes two SVIDs: 0xff01 (DP Alt Mode) and 0x1d5c. The DISCOVER_MODES
step succeeds for 0xff01 and gets a NAK for 0x1d5c. Currently this
results in DP Alt Mode not being registered either, since the modes
are only registered once all of them have been discovered. The NAK
results in the processing being stopped and thus no Alt modes being
registered.
Improve the situation by handling the NAK gracefully and continue
processing the other modes.
Before this change, the TCPM log ends like this:
(more log entries before this)
[ 5.028287] AMS DISCOVER_SVIDS finished
[ 5.028291] cc:=4
[ 5.040040] SVID 1: 0xff01
[ 5.040054] SVID 2: 0x1d5c
[ 5.040082] AMS DISCOVER_MODES start
[ 5.040096] PD TX, header: 0x1b6f
[ 5.050946] PD TX complete, status: 0
[ 5.059609] PD RX, header: 0x264f [1]
[ 5.059626] Rx VDM cmd 0xff018043 type 1 cmd 3 len 2
[ 5.059640] AMS DISCOVER_MODES finished
[ 5.059644] cc:=4
[ 5.069994] Alternate mode 0: SVID 0xff01, VDO 1: 0x000c0045
[ 5.070029] AMS DISCOVER_MODES start
[ 5.070043] PD TX, header: 0x1d6f
[ 5.081139] PD TX complete, status: 0
[ 5.087498] PD RX, header: 0x184f [1]
[ 5.087515] Rx VDM cmd 0x1d5c8083 type 2 cmd 3 len 1
[ 5.087529] AMS DISCOVER_MODES finished
[ 5.087534] cc:=4
(no further log entries after this point)
After this patch the TCPM log looks exactly the same, but then
continues like this:
[ 5.100222] Skip SVID 0x1d5c (failed to discover mode)
[ 5.101699] AMS DFP_TO_UFP_ENTER_MODE start
(log goes on as the system initializes DP AltMode)
Cc: stable <stable@kernel.org>
Fixes: 41d9d75344 ("usb: typec: tcpm: add discover svids and discover modes support for sop'")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Link: https://patch.msgid.link/20260429-tcpm-discover-modes-nak-fix-v4-1-75945d0ed30f@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
causes the DMA engine to reposition its internal pointer to the next
Transfer Descriptor (TD) if it was already processing one.
This issue is consistently observed during the ADB identification
process on macOS hosts, where the host issues a Clear_Halt. Although
commit 4bf2dd6513 ("usb: cdns3: gadget: toggle cycle bit before reset
endpoint") attempted to avoid DMA advance by toggling the cycle bit,
trace logs show that on certain hosts like macOS, the DMA pointer
(EP_TRADDR) still shifts after EPRST:
cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <-- Should be f9c04000
cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
As shown above, the DMA pointer jumped to the next TD, causing
the controller to skip the initial TRBs of the request. This leads to
data misalignment and ADB protocol hangs on macOS.
Fix this by manually restoring the EP_TRADDR register to the starting
physical address of the current request after the EPRST operation is
complete.
Fixes: 7733f6c32e ("usb: cdns3: Add Cadence USB3 DRD Driver")
Cc: stable <stable@kernel.org>
Cc: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Yongchao Wu <yongchao.wu@autochips.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://patch.msgid.link/20260513160012.2547894-1-yongchao.wu@autochips.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The SIGMACHIP USB mouse with VID/PID 1c4f:0034 can disconnect and
re-enumerate repeatedly after it has been enumerated if its interrupt
endpoint is not continuously polled.
This was observed with the device reporting itself as "SIGMACHIP Usb
Mouse". Keeping the input event device open avoids the disconnects.
Add HID_QUIRK_ALWAYS_POLL for this device so the HID core keeps polling
it even when there is no userspace input consumer.
Cc: stable@vger.kernel.org
Signed-off-by: hlleng <a909204013@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
If devm_drm_dp_hpd_bridge_add() fails during fusb302_probe(), the original
code returned directly without cleaning up the resources.
Move bridge registration before the IRQ is requested and route bridge
registration failures through the existing TCPM unregister and fwnode
cleanup path.
Fixes: 5d79c52540 ("usb: typec: fusb302: add DRM DP HPD bridge support")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://patch.msgid.link/20260428-fusb-v2-1-aa3b5942cabb@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hid_warn_ratelimited macro is defined twice in include/linux/hid.h:
- first one added by commit 4051ead998 ("HID: rate-limit hid_warn to
prevent log flooding")
- second one added by commit 1d64624243 ("HID: core: Add
printk_ratelimited variants to hid_warn() etc")).
The second definition is correctly grouped with other ratelimited macros.
Remove the duplicate definition.
Fixes: 1d64624243 ("HID: core: Add printk_ratelimited variants to hid_warn() etc")
Signed-off-by: Liu Kai <lukace97@outlook.com>
[bentiss: edited commit message]
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
u2fzero_fill_in_urb() allocates dev->urb with usb_alloc_urb(), but
u2fzero_probe() ignored its return value and only freed the URB from
u2fzero_remove().
If LED or hwrng registration fails after the URB allocation, probe returns
an error and the driver core does not call .remove(), leaking the URB. A
failed URB setup was also allowed to continue probing with an unusable
device.
Check the URB setup result and add the missing probe-error unwind so the
URB is freed before returning from later errors.
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Commit 783ddaebd3 ("staging: comedi: comedi_test: support
scan_begin_src == TRIG_FOLLOW") neglected to add a test that
`scan_begin_src` has only one bit set. The allowed values are
`TRIG_FOLLOW` and `TRIG_TIMER`, but the code incorrectly also allows
`TRIG_FOLLOW | TRIG_TIMER`. Add a call to
`comedi_check_trigger_is_unique()` to check that only one trigger source
bit is set.
Fixes: 783ddaebd3 ("staging: comedi: comedi_test: support scan_begin_src == TRIG_FOLLOW")
Cc: stable <stable@kernel.org>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://patch.msgid.link/20260422162138.36003-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function checks and possibly modifies the description of an
asynchronous command to be run on the analog input subdevice of a comedi
device attached to the "comedi_test" driver, returning 0 if no
modifications were required, or a positive value that indicates which
step of the checking process it failed on. Step 4 fixes up various
argument values for various trigger sources.
There are two bugs in the fixing up of the `convert_arg` value to keep
the `scan_begin_arg` value within the range of `unsigned int` when
`scan_begin_src` and `convert_src` both have the value `TRIG_TIMER`,
which indicates that the corresponding `_arg` values hold a time period
in nanoseconds. The code also uses `scan_end_arg` which hold the number
of "conversions" within each "scan". The goal is to end up with the
scan period being less than or equal to the convert period multiplied by
the number of conversions per scan. It intends to do that by clamping
the `convert_arg` value to a maximum value of `UINT_MAX / scan_end_arg`
rounded down to a multiple of 1000 (`NSEC_PER_USEC`).
(The rounding from nanoseconds to microseconds is because the driver is
modelling a device that uses a 1 MHz clock for timing. This is partly
because that is a more typical timing base for real hardware devices
driven by comedi, and partly because the driver used to use `struct
timeval` internally.)
The first bug is that the code checks if `scan_begin_arg == TRIG_TIMER`
when it should be checking if `scan_begin_src == TRIG_TIMER`. The
bugged check will always fail because if `scan_begin_src == TRIG_TIMER`,
then `scan_begin_arg` will be at least 1000 (`NSEC_PER_USEC`), otherwise
`scan_begin_src == TRIG_FOLLOW` and `scan_begin_arg` will be 0. (N.B
`TRIG_TIMER` is defined as `0x10`.) The second bug is that is rounding
the maximum value down to a multiple of 1000000000 (`NSEC_PER_SEC`)
instead of 1000 (`NSEC_PER_USEC`), however this bug is not reached due
to the first bug. This patch fixes both bugs.
Fixes: 783ddaebd3 ("staging: comedi: comedi_test: support scan_begin_src == TRIG_FOLLOW")
Fixes: 5afdcad2f8 ("staging: comedi: comedi_test: limit maximum convert_arg")
Cc: stable <stable@kernel.org>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://patch.msgid.link/20260422144637.27692-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add the missing sanity check on the size of interrupt-in transfers to
avoid parsing stale or uninitialised slab data (and leaking it to user
space).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
The driver overrides the maximum transfer size for a specific device
which only accepts 16 byte packets for its 32 byte bulk-out endpoint.
Make sure to never increase the maximum transfer size to prevent slab
corruption should a malicious device report a smaller endpoint max
packet size than expected.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Add the missing sanity check on the size of usa49wg indat transfers to
avoid parsing stale or uninitialised slab data.
Fixes: 0ca1268e10 ("USB Serial Keyspan: add support for USA-49WG & USA-28XG")
Cc: stable@vger.kernel.org # 2.6.23
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
The Belkin interrupt callback treats interrupt data as a four-byte
status report and reads LSR/MSR fields at offsets 2 and 3. The
interrupt-in buffer length is derived from endpoint wMaxPacketSize, and
short interrupt transfers may complete successfully with a smaller
actual_length.
Check the completed interrupt packet length before parsing status
fields so short interrupt endpoints and short successful packets are
ignored instead of causing out-of-bounds or stale status-byte reads.
KASAN report as below:
BUG: KASAN: slab-out-of-bounds in belkin_sa_read_int_callback()
Read of size 1
Call trace:
belkin_sa_read_int_callback() (drivers/usb/serial/belkin_sa.c:202)
__usb_hcd_giveback_urb() (drivers/usb/core/hcd.c:1630)
dummy_timer() (?:?)
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
qcomtee_object_user_init() is a variadic function and when the function
return because there's no dispatch callback in QCOMTEE_OBJECT_TYPE_CB
case, there's no va_end to cleanup "ap" object initialized by va_start
and that can cause undefined behavior. So make sure to use va_end before
returning the error code when there's no dispatch callback.
This is reported by Coverity Scan as "Missing varargs init or cleanup".
Fixes: d6e290837e ("tee: add Qualcomm TEE driver")
Signed-off-by: Robertus Diawan Chris <robertusdchris@gmail.com>
Reviewed-by: Amirreza Zarrabi <amirreza.zarrabi@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
params_from_user() may acquire tee_shm references for MEMREF parameters
before failing after partially processing the supplied parameter array.
In tee_ioctl_supp_recv(), those references are currently not released on
that error path.
Fix this by freeing MEMREF references before returning when
params_from_user() fails.
Keep the final cleanup path in tee_ioctl_supp_recv() unchanged since
supp_recv() may consume and replace the supplied parameters, unlike the
other TEE ioctl callback paths.
Signed-off-by: Qihang <q.h.hack.winter@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
register_shm_helper() allocates shm before calling
iov_iter_npages(). If iov_iter_npages() returns 0, the function
jumps to err_ctx_put and leaks shm.
This can be triggered by TEE_IOC_SHM_REGISTER with
struct tee_ioctl_shm_register_data where length is 0.
Jump to err_free_shm instead.
Fixes: 7bdee41575 ("tee: Use iov_iter to better support shared buffer registration")
Cc: stable@vger.kernel.org
Cc: lvc-project@linuxtesting.org
Signed-off-by: Georgiy Osokin <g.osokin@auroraos.dev>
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
The tee_ioctl_object_invoke_arg structure has padding on some
architectures but not on x86-32 and a few others:
include/linux/tee.h:474:32: error: padding struct to align 'params' [-Werror=padded]
I expect that all current users of this are on architectures that do
have implicit padding here (arm64, arm, x86, riscv), so make the padding
explicit in order to avoid surprises if this later gets used elsewhere.
Fixes: d5b8b0fa17 ("tee: add TEE_IOCTL_PARAM_ATTR_TYPE_OBJREF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Tested-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
The connector number in a UCSI CCI notification is a 7-bit field
supplied by the PPM. ucsi_connector_change() uses it to index the
ucsi->connector[] array without checking it against the number of
connectors the PPM reported at init time, so a buggy or malicious PPM
(EC firmware, or an I2C-attached UCSI controller on the ccg / stm32g0 /
glink transports) can drive schedule_work() on memory past the end of
the array.
Reject connector numbers that are zero or exceed cap.num_connectors
before dereferencing the array.
Assisted-by: gkh_clanker_t1000
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Benson Leung <bleung@chromium.org>
Cc: Jameson Thies <jthies@google.com>
Cc: Nathan Rebello <nathan.c.rebello@gmail.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: Pooja Katiyar <pooja.katiyar@intel.com>
Cc: Hsin-Te Yuan <yuanhsinte@chromium.org>
Cc: Abel Vesa <abelvesa@kernel.org>
Cc: stable <stable@kernel.org>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://patch.msgid.link/2026051351-truck-steadfast-df48@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ucsi_displayport_vdm() handles a DP_CMD_CONFIGURE by copying the first
payload VDO from data[], but unlike the equivalent handler in
altmodes/displayport.c it does not check that count covers a VDO beyond
the header. A header-only Configure VDM (count == 1) would read one u32
past the caller's array.
In the normal UCSI path the caller controls count, so this is hardening
for non-standard delivery paths. NAK and bail when no configuration VDO
is present, matching the generic DP altmode driver's existing guard.
Assisted-by: gkh_clanker_t1000
Cc: Pooja Katiyar <pooja.katiyar@intel.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: stable <stable@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/2026051351-vividly-flattered-eb3d@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
svdm_consume_modes() checks pmdata->altmodes against the array size once
before the loop over the count, but forgot to check the bound at every
point in the loop.
In the well-behaved SVDM discovery flow this is harmless because each of
at most SVID_DISCOVERY_MAX SVIDs contributes at most MODE_DISCOVERY_MAX
modes, exactly filling altmode_desc[ALTMODE_DISCOVERY_MAX]. But the
CMDT_RSP_ACK handler in tcpm_pd_svdm() does not correlate an incoming
ACK with any request the port actually sent. Once port->partner is set,
an unsolicited Discover Modes ACK is consumed unconditionally. A broken
or malicious port partner can therefore drive altmodes to
ALTMODE_DISCOVERY_MAX - 1 via the normal flow, and then send one extra
Discover Modes ACK with seven VDOs. Because the pre-loop check passes,
the loop could then writes up to five entries past altmode_desc[]. For
mode_data_prime the next field in struct tcpm_port is the
partner_altmode[] pointer array, which then receives partner-chosen
SVID/VDO bytes.
Move the bound check inside the loop so the array can never be indexed
past ALTMODE_DISCOVERY_MAX regardless of how many VDOs the partner
supplies or how the function was reached.
Assisted-by: gkh_clanker_t1000
Cc: Badhri Jagan Sridharan <badhri@google.com>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@kernel.org>
Link: https://patch.msgid.link/2026051351-reshuffle-skillful-90af@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Properly validate the count passed from a device when calling
svdm_consume_identity() or svdm_consume_identity_sop_prime() as the
device-controlled value could index off of the static arrays, which
could leak data.
Assisted-by: gkh_clanker_t1000
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@kernel.org>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Link: https://patch.msgid.link/2026051350-plated-salute-0efe@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A broken/malicious port can transmit a CRC-valid frame whose header
advertises up to seven data objects but whose body carries fewer than
that. Check for this, and rightfully reject the message, instead of
reading from uninitialized stack memory.
Assisted-by: gkh_clanker_t1000
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: "André Draszik" <andre.draszik@linaro.org>
Cc: Badhri Jagan Sridharan <badhri@google.com>
Cc: Amit Sunil Dhamne <amitsd@google.com>
Cc: stable <stable@kernel.org>
Link: https://patch.msgid.link/2026051350-sitter-canopener-9045@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A broken/malicious device can send the incorrect count for a status
update VDO, which will cause the kernel to read uninitialized stack data
and send it off elsewhere.
Fix this up by correctly verifying the count for the update object.
Assisted-by: gkh_clanker_t1000
Cc: stable <stable@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/2026051350-reacquire-sculpture-4244@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wcove_read_rx_buffer() copies the PD RX FIFO into the caller's
struct pd_message with
for (i = 0; i < USBC_RXINFO_RXBYTES(info); i++)
regmap_read(wcove->regmap, USBC_RX_DATA + i, msg + i);
which has two problems:
USBC_RXINFO_RXBYTES() is a 5-bit field (max 31) while struct pd_message
is 30 bytes (__le16 header + __le32 payload[PD_MAX_PAYLOAD], packed).
The byte count latched in RXINFO is the number of bytes the port partner
put on the wire, so a malicious partner that transmits a 31-byte frame
can drive the loop one byte past the destination if the WCOVE BMC
receiver does not enforce the PD object-count limit in hardware. The
existing FIXME flagged this as unverified.
Independently, regmap_read() takes an unsigned int * and stores a full
unsigned int at the destination. Passing the byte pointer msg + i means
each iteration writes four bytes; the high three are zero (val_bits is
8) and are normally overwritten by the next iteration, but the final
iteration's high bytes are not. With RXBYTES == 30 the i == 29 iteration
already writes three zero bytes past msg, which sits on the IRQ thread's
stack in wcove_typec_irq().
Clamp the loop to sizeof(struct pd_message) and read each register into
a local before storing only its low byte, so the copy can never exceed
the destination regardless of what RXINFO reports.
Assisted-by: gkh_clanker_t1000
Cc: stable <stable@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/2026051347-clustered-deflected-9543@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This includes three patches that harden the XDomain property parser
against possible malicious hosts.
All these have been in linux-next with no reported issues.
-----BEGIN PGP SIGNATURE-----
iQJUBAABCgA+FiEEVTdhRGBbNzLrSUBaAP2fSd+ZWKAFAmoMN6kgHG1pa2Eud2Vz
dGVyYmVyZ0BsaW51eC5pbnRlbC5jb20ACgkQAP2fSd+ZWKA8Ag//ZxXcy8i84eU9
VIFBtR4jxwQXWJ6aOkJl/y3sZQXbOAY3CfbB54+0zFMx4qxSfCx4P4bMLYDqK64U
YjWF+sxlcFtLv+u9YMfBFzcutK8qA5K0YEFP894gWyYRqdLzA6gGoJt7Vn+zARxZ
eWTvmW8DJjy9Azh4opnfd8yvdh3wxslaGbU1/PRX7FLOPWxU0skn+N+T1583pJ4m
bUxNw+jyGRKz49a0Foo4Ql5RCHg2wTFSGikrzvJreWF7kpz9NFoL/bkzLBbj5sKy
Eyg8fxNAV2BVFRlp3CSHJsjRFEpg4o5LVNiRLHdeAATXHztsgFvqr3xRhQnfbuwa
P8rkse1rIW6uFCkrLfpMdL9cdmDfrTBKsEp1YmtmtRHGBri+c7umf0vrDSliX2SO
p/PfLKGHWeaIo1d8BDsnbsHIs+BLR7JZcjMJ4cIoUcCqTYzdtXibWTBVeHbT/CWm
vgXfPyQ0g9L1kLxvdNQ7UD5hIB+PvlEhGprU5xjteBhVkM2IwczGxo1rxKacPh0q
1qwF+/yKorGhVB5vFcjE3uQSSOaRw7kkfAO1biidPdNnXWPydEJM3RGQsB462dCL
XQRg46Ynes/K+Ya1n2TyI8hnnr+gz+qKl8hYqI4O99fDiiFGKgmpVP+0ap9q9nzZ
itAOtz3N2RdSFX5LmUfP8Q7dUGaPHtw=
=fKQQ
-----END PGP SIGNATURE-----
Merge tag 'thunderbolt-for-v7.1-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus
Mika writes:
thunderbolt: Fixes for v7.1-rc5
This includes three patches that harden the XDomain property parser
against possible malicious hosts.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v7.1-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: property: Cap recursion depth in __tb_property_parse_dir()
thunderbolt: property: Reject dir_len < 4 to prevent size_t underflow
thunderbolt: property: Reject u32 wrap in tb_property_entry_valid()
This is doing far too much math for the simple task of finding a
power of 2 that fully spans the given range. Use fls directly on
the xor which computes the common binary prefix.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v2-895748900b39+5303-iommupt_inv_vtd_jgg@nvidia.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Some ACPI-based platforms report incorrect IRQ trigger types (e.g.
IRQF_TRIGGER_HIGH), which can lead to interrupt storms.
Use the historically working rising-edge trigger on ACPI systems to
avoid this regression.
Device Tree-based systems continue to use the firmware-provided
trigger type.
Fixes: 57be33f85e ("nfc: nxp-nci: remove interrupt trigger type")
Signed-off-by: Carl Lee <carl.lee@amd.com>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Tested-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Link: https://patch.msgid.link/20260516-nfc-nxp-nci-i2c-restore-irq-trigger-fallback-v3-1-37ba4b6e9086@amd.com
Signed-off-by: David Heidelberg <david@ixit.cz>
POWER_SEQUENCING_PCIE_M2 driver handles power supply to the PCIe M.2
connectors and is required on wide variety of ARM64 platforms such as
Qcom Snapdragon X Elite laptops and Mediatek Dojo Chromebooks.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260514065017.11305-1-manivannan.sadhasivam@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Now since the devm_of_qcom_ice_get() API never returns NULL, remove the
NULL check and also simplify the error handling.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # UFS
Tested-by: Sumit Garg <sumit.garg@oss.qualcomm.com> # OP-TEE as TZ
Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-qcom-ice-fix-v7-5-2a595382185b@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Now since the devm_of_qcom_ice_get() API never returns NULL, remove the
NULL check and also simplify the error handling.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Sumit Garg <sumit.garg@oss.qualcomm.com> # OP-TEE as TZ
Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-qcom-ice-fix-v7-4-2a595382185b@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
devm_of_qcom_ice_get() currently returns NULL if ICE SCM is not available
or "qcom,ice" property is not found in DT. But this confuses the clients
since NULL doesn't convey the reason for failure. So return proper error
codes instead of NULL.
Reported-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Tested-by: Sumit Garg <sumit.garg@oss.qualcomm.com> # OP-TEE as TZ
Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-qcom-ice-fix-v7-3-2a595382185b@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
By the time the consumer driver calls devm_of_qcom_ice_get(), all the
platform devices for ICE nodes would've been created by
of_platform_default_populate().
So for the absence of any platform device, -ENODEV should not returned, not
-EPROBE_DEFER.
Fixes: 2afbf43a4a ("soc: qcom: Make the Qualcomm UFS/SDCC ICE a dedicated driver")
Tested-by: Sumit Garg <sumit.garg@oss.qualcomm.com> # OP-TEE as TZ
Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-qcom-ice-fix-v7-2-2a595382185b@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
The current platform driver design causes probe ordering races with
consumers (UFS, eMMC) due to ICE's dependency on SCM firmware calls. If ICE
probe fails (missing ICE SCM or DT registers), devm_of_qcom_ice_get() loops
with -EPROBE_DEFER, leaving consumers non-functional even when ICE should
be gracefully disabled. devm_of_qcom_ice_get() doesn't know if the ICE
driver probe has failed due to above reasons or it is waiting for the SCM
driver.
Moreover, there is no devlink dependency between ICE and consumer drivers
as 'qcom,ice' is not considered as a DT 'supplier'. So the consumer drivers
have no idea of when the ICE driver is going to probe.
To address these issues, store the error pointer in a global xarray with
ice node phandle as a key during probe in addition to the valid ice pointer
and synchronize both qcom_ice_probe() and of_qcom_ice_get() using a mutex.
If the xarray entry is NULL, then it implies that the driver is not
probed yet, so return -EPROBE_DEFER. If it has any error pointer, return
that error pointer directly. Otherwise, add the devlink as usual and return
the valid pointer to the consumer.
Xarray is used instead of platform drvdata, since driver core frees the
drvdata during probe failure. So it cannot be used to pass the error
pointer to the consumers.
Note that this change only fixes the standalone ICE DT node bindings and
not the ones with 'ice' range embedded in the consumer nodes, where there
is no issue.
Fixes: 2afbf43a4a ("soc: qcom: Make the Qualcomm UFS/SDCC ICE a dedicated driver")
Reported-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Tested-by: Sumit Garg <sumit.garg@oss.qualcomm.com> # OP-TEE as TZ
Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Cc: stable@vger.kernel.org # 6.4
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-qcom-ice-fix-v7-1-2a595382185b@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Core usually prints endpoint addresses with 0x%X format.
Change this code to use it too, instead of just %d.
Particularly for IN, 0x83 seems more readable than 131.
While at that, fix checkpatch warnings about multi-line
quoted strings, as well as missing or doubled whitespace
in those strings.
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Link: https://patch.msgid.link/20260518073258.6532bdd5.michal.pecio@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tao Xue found that some common devices violate USB 3.x section 9.6.7
by reporting wBytesPerInterval lower than the size of packets they
actually send. I confirmed that AX88179 may set it to 0 and RTL8153
CDC configuration sets it to 8 but sends both 8 and 16 byte packets:
S Ii:11:007:3 -115:128 16 <
C Ii:11:007:3 0:128 8 = a1000000 01000000
S Ii:11:007:3 -115:128 16 <
C Ii:11:007:3 0:128 16 = a12a0000 01000800 00000000 00000000
Most xHCI host controllers neglect interrupt bandwidth reservations
and let such devices exceed theirs, some fail the URB with EOVERFLOW.
Assume that wBytesPerInterval lower than wMaxPacketSize is bogus and
increase it to the worst case maximum on interrupt IN endpoints. This
solves xHCI problems and appears to have no other effect. Interrupt
transfers are not limited to one interval and drivers submit URBs of
class defined size without looking at wBytesPerInterval. Any multi-
interval transfer is considered terminated by a packet shorter than
wMaxPacketSize regardless of wBytesPerInterval - see USB3 8.10.3.
Stay in spec on OUT endpoints and isochronous. No buggy devices are
known and we don't want to risk sending more data than the device
is prepared to handle or confusing isoc drivers regarding altsetting
capacities guaranteed by the device itself. And don't complain when
wMaxPacketSize <= wBytesPerInterval < wMaxPacketSize * (bMaxBurst+1)
because enabling this seems to be the exact goal of the spec.
Reported-and-tested-by: Tao Xue <xuetao09@huawei.com>
Closes: https://lore.kernel.org/linux-usb/20260402021400.28853-1-xuetao09@huawei.com/
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Link: https://patch.msgid.link/20260518073207.5b7d26e7.michal.pecio@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no good reason to have wBytesPerInterval < wMaxPacketSize -
either one is too low or the other too high, and we may want to warn
about such descriptors. Start with cleaning up our own root hubs.
USB 3.2 section 10.15.1 sets wMaxPacketSize and wBytesPerInterval of
SuperSpeed hub status endpoints at 2 bytes, so reduce wMaxPacketSize
from its former value of 4, which was derived from USB 2.0 spec and
the kernel's USB_MAXCHILDREN limit. They don't apply because USB 3.2
10.15.2.1 specifies SuperSpeed hubs to have up to 15 ports.
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Link: https://patch.msgid.link/20260518073121.7bc1da0f.michal.pecio@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
do_flash() locates the first .cyacd record with
p = strnchr(fw->data, fw->size, ':');
while (p < eof) {
s = strnchr(p + 1, eof - p - 1, ':');
...
}
If the firmware image contains no ':' byte, strnchr() returns NULL.
NULL compares less than the valid kernel pointer eof, so the loop body
runs and strnchr() is called with p + 1 == (void *)1 and a length of
roughly (unsigned long)eof, causing a wonderful crash.
The not_signed_fw fallthrough earlier in do_flash() and the chip-state
branches in ccg_fw_update_needed() allow an unsigned blob to reach this
loop, so a root user who can place a crafted file under /lib/firmware
and write the do_flash sysfs attribute can trigger the oops.
Bail out with -EINVAL when the initial strnchr() returns NULL.
Assisted-by: gkh_clanker_t1000
Cc: stable <stable@kernel.org>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026051405-posture-shrill-7884@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The WebUSB GET_URL handler in composite_setup() narrows
landing_page_length to fit the host-supplied wLength using
landing_page_length = w_length
- WEBUSB_URL_DESCRIPTOR_HEADER_LENGTH + landing_page_offset;
If wLength is smaller than WEBUSB_URL_DESCRIPTOR_HEADER_LENGTH the
unsigned subtraction wraps, and the subsequent
memcpy(url_descriptor->URL,
cdev->landing_page + landing_page_offset,
landing_page_length - landing_page_offset);
ends up copying close to UINT_MAX bytes from cdev->landing_page into
cdev->req->buf. KASAN reports a slab-out-of-bounds in composite_setup
on the kmalloc-2k gadget_info allocation, and FORTIFY_SOURCE traps the
memcpy as a 4294967293-byte field-spanning write into
url_descriptor->URL (size 252).
A USB host can reach this from a single SETUP packet against any
gadget that has webusb/use=1 and a landingPage configured.
Handle the small-wLength case before the math: when the host requested
fewer bytes than the URL descriptor header, only the header is
meaningful and no URL bytes need to be copied. Setting
landing_page_length to landing_page_offset makes the existing memcpy a
no-op and leaves the descriptor returned to the host unchanged for all
larger wLength values.
Fixes: 93c473948c ("usb: gadget: add WebUSB landing page support")
Cc: stable <stable@kernel.org>
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Link: https://patch.msgid.link/20260512160530.352318-1-mendozayt13@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set the error code on these two error paths. The existing code returns
success.
Fixes: 77ed2f4538 ("usb: typec: tipd: Use read_power_status function in probe")
Fixes: 04041fd7d6 ("usb: typec: tipd: Read data status in probe and cache its value")
Cc: stable <stable@kernel.org>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/agL9o7wUK1dOVBTy@stanley.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PANEL_BOOT_MESSAGE dependency uses a quoted-string comparison
against the PANEL_CHANGE_MESSAGE bool symbol:
depends on PANEL_CHANGE_MESSAGE="y"
This is the only such pattern under drivers/auxdisplay/ (grep shows
no other Kconfig file in the tree uses depends on FOO="y" with
quotes for a plain bool symbol). The quoted form is parsed by
Kconfig but is not idiomatic; the common form for the same intent
is the unquoted tristate-style dependency:
depends on PANEL_CHANGE_MESSAGE
which evaluates true when PANEL_CHANGE_MESSAGE is y or m. Since
PANEL_CHANGE_MESSAGE is declared as bool (not tristate), there is
no behaviour change in practice: y is the only enabled value
either form can match.
Drop the quoted comparison so the dependency matches the prevailing
kernel Kconfig style and so it is obvious to readers that the
comparison works.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/CAHp75VfsA_LsbEKjxoeMdbhPbWj7OHZ7=0SYNA3c=ZLj_M94Bw@mail.gmail.com
Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
linedisp_display() unconditionally reads msg[count - 1] before
checking whether count is zero, so a write of zero bytes to the
message sysfs attribute hits msg[-1]:
write(fd, "", 0);
-> message_store(..., buf, count=0)
-> linedisp_display(linedisp, buf, count=0)
-> msg[count - 1] == '\n' ; OOB read
The kernfs write buffer for that store is a 1-byte allocation
(kernfs_fop_write_iter() does kmalloc(len + 1) with len == 0),
so msg[-1] is a 1-byte read before the slab object. On a
KASAN-enabled kernel this trips an out-of-bounds report and
panics; on stock kernels it silently reads adjacent slab data
and, if that byte happens to be '\n', the following count--
wraps ssize_t 0 to -1 and is then passed to kmemdup_nul().
linedisp_display() is reached from the message_store() sysfs
callback (drivers/auxdisplay/line-display.c message attribute,
mode 0644) and from the in-tree initial-message setup with
count == -1, so the OOB path is only userspace-triggerable via
zero-byte writes; vfs_write() does not short-circuit on
count == 0 and kernfs_fop_write_iter() dispatches the store
callback regardless.
Guard the trailing-newline trim with a count check. The
existing if (!count) block then takes the clear-display path
unchanged.
Affects every auxdisplay driver that registers via
linedisp_register() / linedisp_attach(): ht16k33, max6959,
img-ascii-lcd, seg-led-gpio.
Fixes: 7e76aece6f ("auxdisplay: Extract character line display core support")
Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The GMAC node incorrectly listed four clocks, including a separate tx_clk
and a TSU GCK clock sourced from ID 67. According to the SAM9X7 clocking
scheme, the GMAC uses only three clocks: HCLK, PCLK, and the TSU GCK
derived from the GMAC peripheral clock (ID 24).
Remove the unused tx_clk, update the clock-names accordingly, and correct
the assigned clock to use GCK 24 instead of GCK 67. This aligns the device
tree with the actual hardware clock topology and prevents misconfiguration
of the GMAC clock tree.
[root@SAM9X75 ~]$ cat /sys/kernel/debug/clk/clk_summary | grep gmac
gmac_gclk 1 1 1 266666666 0 0 50000 Y f802c000.ethernet tsu_clk
f802c000.ethernet tsu_clk
gmac_clk 2 2 0 266666666 0 0 50000 Y f802c000.ethernet hclk
f802c000.ethernet pclk
Fixes: 41af45af8b ("ARM: dts: at91: sam9x7: add device tree for SoC")
Signed-off-by: Mihai Sain <mihai.sain@microchip.com>
Link: https://lore.kernel.org/r/20260309075329.1528-5-mihai.sain@microchip.com
[claudiu.beznea: massaged the patch description]
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
When AH output is offloaded to an asynchronous crypto provider
(hardware accelerators such as AMD CCP, or a forced-async software
shim used for testing), the digest completion fires
ah_output_done() / ah6_output_done() on a workqueue. The egress
skb at that point may have been originated by a TCP listener
sending a SYN-ACK, which sets skb->sk to a request_sock via
skb_set_owner_edemux(); it may also have been originated by an
inet_timewait_sock retransmit. Neither is a full struct sock, and
passing the raw skb->sk to xfrm_output_resume() then forwards a
non-full socket through the rest of the xfrm output chain.
xfrm_output_resume() and its downstream consumers expect a full
sk where they dereference at all. The natural egress path
through ah_output_done() does not crash today because the
consumers that read past sock_common are either gated by
sk_fullsock() or short-circuit on flags that are clear on a fresh
request_sock; an exhaustive walk of the 50 most plausible
consumers under sch_fq, dev_queue_xmit, netfilter, tc-egress and
cgroup-egress BPF found no current unguarded deref. The bug is
still a real type confusion that future consumer changes could
turn into a memory-corruption primitive.
This is the same bug class fixed for ESP in commit 1620c88887
("xfrm: Fix the usage of skb->sk"). Apply the analogous fix to
AH: convert skb->sk to a full socket pointer (or NULL) via
skb_to_full_sk() before handing it to xfrm_output_resume().
The same async AH callbacks were touched recently for an
independent ESN-related ICV layout bug in commit ec54093e6a
("xfrm: ah: account for ESN high bits in async callbacks"); the
sk type-confusion addressed here is orthogonal. This patch is
part of an ongoing audit of the AH callback paths; an ah_output
ihl-validation hardening series is also currently under review on
netdev.
Reproduced under UML + KASAN + lockdep with a forced-async
hmac(sha1) shim that registers at priority 9999 and wraps the
sync in-tree hmac-sha1-lib. With the shim loaded, ah_output_done
runs on every SYN-ACK egress through a transport-mode AH SA and
skb->sk arrives as a request_sock (TCP_NEW_SYN_RECV); after this
patch, xfrm_output_resume() receives the listener (the result of
sk_to_full_sk()) and consumer derefs land on full-sock fields as
intended.
Fixes: 9ab1265d52 ("xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Usual mixed bag of ancient issues and more recent.
Lots of new contributors this cycle and some of that work has
uncovered bugs in the code they were looking at.
core,buffer
- Add missing dma_fence_put()
- hw-consumer - Fix a use after free in cleaning up a list.
core,inkern
- Fix return value handling in iio_read_channel_processed_scale() that
meant a correct result was treated as an error.
adi,ad3530r
- Fix powerdown mode strings for AD3531 and AD3531R.
adi,ad4695
- Fix ordering so by the time device transitions to offload mode
all setup transfers are done. This avoids issues with offload
controllers that cannot handle normal transfers after offload has
begun.
adi,ad5686
- Fix wrong initialization of reference bit for single channel parts.
- Fix off by one in check on input raw value
adi,adis16260
- Fix division by zero triggerable from sysfs.
adi,adis16550
- Fix a stack leak to userspace.
amlogic,meson-adc
- Fix a buffer allocation leak in an error path.
bosch,bmp280
- Fix a stack leak to userspace.
capella,cm3323
- Fix wrong return value rather than register value being written
data->reg_conf on write.
maxim,max5821
- Check correct length i2c_master_send() in max5821_sync_powerdown_mode().
mediatek,mt6359
- Fix potential uninitialized value.
nuvoton,npcm_adc
- Fix unbalance clk_disable_unprepare()
nxp,sar-adc
- Avoid a division by zero if the common clock framework is disabled.
- Fix a division by zero triggerable from sysfs.
- Ensure all of struct dma_slave_config is initialized.
qcom,spmi-adc-gen3
- Fix an off by one that leads to an out of bounds array read.
samsung,ssp_sensors
- Ensure work is cancelled during remove to avoid use after free.
sensiron,scd30
- Fix a division by zero triggerable from sysfs.
st,lsm6dsx
- Fix a stack leak to userspace.
st,magn
- Fix default value for data ready pin selection for devices that
have no data ready pin selection.
vishay,veml6070
- Close a resource leak in an error path.
winsen,mhz19b
- Reject over-sized serial messages from device.
xilinx,xadc
- Fix sequencer handling for dual MUX cases
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmoG/rkRHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0FohZ9w//c9Hycnr8BDjluSDZ0hlGzv1CTO3FeHwM
OgW/IXCLFK0y66xDkD08w8pwkn8Ol/wCrXoFitGinUhNc3MdK7lNtvvALiFnvU2f
OP4wAvlCWjJlMb1Q8BCAtl/0W2rlb9gpm13pyBH4401IoGnf9sMdVkNortCEse3P
0uH/0UdpFoN8Ke5WvwnC22gZiCF4SaJtyShcR9INyHz70uA8WUOXVCHloWuUZEeC
WrXRwBollXwa7mLiYkDRr6lle71+xQPFVi+1tTUBAFaXZtJlEA3vWMkKHX1I0ywy
6oANSvW/VQJkkZwgRUmANbqBffCL/CyOoLDC7plzELqMQXTh52Jw8Zxt+Xf/nq8u
dSYVxJ4tt7RGxCYGekQdGc1bKz2HwvIXGBUigGbxzZKj6z7rrqbfHq9sXs1FTi3n
PwF3rC43v3LAgtJ7yoS9U3KvZ08B/ydDkfvPlWEHXOveLWLpyVeMQf76JI9O2M7j
4baK99ngEQmrjwSHM9ZVS/cx/alLQjwegkmpsa/OOkuIHBTyECBerIAirwF52eOC
kGo7fcAHazGp3te20/51FNBgilD2RIlMtwEGXsxDOcN/nnA3e0tQVpDJeCXepP8t
8AzeRsMMII5GiLkqqwg/bGpJ0qNWCqqoNse5x/LXbKXGSCCffe3xahZFIzxP5FxK
y4lj71MPAS8=
=Ja6W
-----END PGP SIGNATURE-----
Merge tag 'iio-fixes-for-7.1a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:
iio-fixes-for-7.a IIO: 1st set of fixes for the 7.1 cycle
Usual mixed bag of ancient issues and more recent.
Lots of new contributors this cycle and some of that work has
uncovered bugs in the code they were looking at.
core,buffer
- Add missing dma_fence_put()
- hw-consumer - Fix a use after free in cleaning up a list.
core,inkern
- Fix return value handling in iio_read_channel_processed_scale() that
meant a correct result was treated as an error.
adi,ad3530r
- Fix powerdown mode strings for AD3531 and AD3531R.
adi,ad4695
- Fix ordering so by the time device transitions to offload mode
all setup transfers are done. This avoids issues with offload
controllers that cannot handle normal transfers after offload has
begun.
adi,ad5686
- Fix wrong initialization of reference bit for single channel parts.
- Fix off by one in check on input raw value
adi,adis16260
- Fix division by zero triggerable from sysfs.
adi,adis16550
- Fix a stack leak to userspace.
amlogic,meson-adc
- Fix a buffer allocation leak in an error path.
bosch,bmp280
- Fix a stack leak to userspace.
capella,cm3323
- Fix wrong return value rather than register value being written
data->reg_conf on write.
maxim,max5821
- Check correct length i2c_master_send() in max5821_sync_powerdown_mode().
mediatek,mt6359
- Fix potential uninitialized value.
nuvoton,npcm_adc
- Fix unbalance clk_disable_unprepare()
nxp,sar-adc
- Avoid a division by zero if the common clock framework is disabled.
- Fix a division by zero triggerable from sysfs.
- Ensure all of struct dma_slave_config is initialized.
qcom,spmi-adc-gen3
- Fix an off by one that leads to an out of bounds array read.
samsung,ssp_sensors
- Ensure work is cancelled during remove to avoid use after free.
sensiron,scd30
- Fix a division by zero triggerable from sysfs.
st,lsm6dsx
- Fix a stack leak to userspace.
st,magn
- Fix default value for data ready pin selection for devices that
have no data ready pin selection.
vishay,veml6070
- Close a resource leak in an error path.
winsen,mhz19b
- Reject over-sized serial messages from device.
xilinx,xadc
- Fix sequencer handling for dual MUX cases
* tag 'iio-fixes-for-7.1a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (31 commits)
iio: adc: viperboard: Fix error handling in vprbrd_iio_read_raw
iio: gyro: itg3200: fix i2c read into the wrong stack location
iio: dac: ad5686: fix powerdown control on dual-channel devices
iio: dac: ad5686: acquire lock when doing powerdown control
iio: temperature: tsys01: fix broken PROM checksum validation
iio: dac: ad3530r: Fix AD3531/AD3531R powerdown mode strings
iio: buffer: hw-consumer: fix use-after-free in error path
iio: dac: ad5686: fix input raw value check
iio: dac: ad5686: fix ref bit initialization for single-channel parts
iio: ssp_sensors: cancel delayed work_refresh on remove
iio: adc: meson-saradc: fix calibration buffer leak on error
iio: dac: max5821: fix return value check in powerdown sync
iio: adc: mt6359: fix unchecked return value in mt6358_read_imp
iio: adc: qcom-spmi-adc5-gen3: Fix off by one in adc5_gen3_get_fw_channel_data()
iio: imu: adis16550: fix stack leak in trigger handler
iio: imu: st_lsm6dsx: fix stack leak in tagged FIFO buffer
iio: pressure: bmp280: fix stack leak in bmp580 trigger handler
iio: adc: nxp-sar-adc: zero-initialize dma_slave_config
iio: light: cm3323: fix reg_conf not being initialized correctly
iio: magnetometer: st_magn: fix default DRDY pin selection for LIS2MDL
...
The driver proceeds to the reception phase even if the preceding
transmission fails.
This uses a goto error label for an early bail out and ensures the mutex is
properly unlocked in case of failure.
Fixes: ffd8a6e7a7 ("iio: adc: Add viperboard adc driver")
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Reviewed-by: Joshua Crofts <joshua.crofts1@gmail.com>
Reviewed-by: Maxwell Doose <m32285159@gmail.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
itg3200_read_all_channels() takes `__be16 *buf' as a parameter and
fills the i2c_msg destination as `(char *)&buf'. Since `buf' is the
parameter (a pointer), `&buf' is the address of the local pointer
slot on the stack of itg3200_read_all_channels(), not the address
of the caller's scan buffer. The (char *) cast hides the type
mismatch.
i2c_transfer() therefore writes ITG3200_SCAN_ELEMENTS * sizeof(s16)
= 8 bytes into the parameter's stack slot, which is discarded when
the function returns. The caller's scan buffer in
itg3200_trigger_handler() is never written to, so
iio_push_to_buffers_with_timestamp() pushes uninitialised stack
contents to userspace via /dev/iio:deviceX every scan -- both a
functional bug (no actual gyroscope or temperature data is
delivered through the triggered buffer) and an information leak.
The non-buffered read_raw() path is unaffected: it goes through
itg3200_read_reg_s16() which uses `&out' on a local s16 value,
where that is correct.
Drop the spurious `&' so the i2c read writes into the caller's
buffer.
Fixes: 9dbf091da0 ("iio: gyro: Add itg3200")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Fix powerdown control by using a proper bit shift for the powerdown mask
values. During initialization, powerdown bits are initialized so that
unused bits are set to 1 and the correct bit shift is used. Dual-channel
devices use one-hot encoding in the address and that reflects on the
position of the powerdown bits, which are not channel-index based
for that case. Quad-channel devices also use one-hot encoding for the
channel address but the result of log2(address) coincides with the channel
index value. Mask as 0x3U is used rather than 0x3, because shift can reach
value of 30 (last channel of a 16-channel device), which would mess with
the sign bit. The issue was introduced when first adding support for
dual-channel devices, which overlooked powerdown control differences.
Fixes: 7dc8faeab3 ("iio: dac: ad5686: add support for AD5338R")
Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Protect access of pwr_down_mode and pwr_down_mask fields with existing
mutex lock. Each channel exposes their own attributes for controlling
powerdown modes and powerdown state. This fixes potential race conditions
as those the write functions perform non-atomic read-modify-write
operations to those pwr_down_* fields. This issue exists since the ad5686
driver was first introduced.
Fixes: c2f37c8dca ("iio: dac: New driver for AD5686R, AD5685R, AD5684R Digital to analog converters")
Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The current implementation of tsys01_crc_valid() incorrectly sums the
first word (n_prom[0]) repeatedly instead of iterating over the 8 words
retrieved from the PROM. This leads to a checksum mismatch and probe
failure on hardware.
According to the TSYS01 datasheet, the PROM consists of 8 words. A valid
check must iterate through all 8 words to verify the integrity of the
calibration data. The current driver only checks the first word 8 times.
Note: This fix was identified during a code audit and is based on
datasheet specifications. It has not been tested on real hardware.
Fixes: 43e53407f6 ("Add tsys01 meas-spec driver support")
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The AD3531/AD3531R has different output operating modes from the
AD3530/AD3530R. According to the AD3531/AD3531R datasheet, the
powerdown modes are:
01: 500 Ohm output impedance
10: 3.85 kOhm output impedance
11: 16 kOhm output impedance
The driver currently uses the AD3530R modes (1k, 7.7k, 32k) for all
variants, which is incorrect for AD3531/AD3531R.
Add AD3531R-specific powerdown mode strings and assign them to the
AD3531/AD3531R chip variants.
Fixes: 93583174a3 ("iio: dac: ad3530r: Add driver for AD3530R and AD3531R")
Signed-off-by: Kim Seer Paller <kimseer.paller@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
In the err_put_buffers cleanup path of iio_hw_consumer_alloc(), the code
was using list_for_each_entry() to iterate through buffers while calling
iio_buffer_put() which can free the current buffer if refcount drops to 0.
The list_for_each_entry() loop macro then evaluates buf->head.next to
continue iteration, accessing the freed buffer.
Fix this by using list_for_each_entry_safe().
Fixes: 48b66f8f93 ("iio: Add hardware consumer buffer support")
Reported-by: sashiko <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260427-iio_buf-v1-1-2bbdac844647%40gmail.com
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Maxwell Doose <m32285159@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Fix range check for input raw value, which is off by one, i.e., for a
10-bit DAC the max valid value is 1023, but 1 << 10 equals 1024, which
passes the previous check, allowing an out-of-range write. The issue
exists since the ad5686 driver was first introduced.
Fixes: c2f37c8dca ("iio: dac: New driver for AD5686R, AD5685R, AD5684R Digital to analog converters")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The reference bit position was ignored when writing the register at the
probe() function (!!val was used). When such bit is 1, internal voltage
reference is disabled so that an external one can be used. For
multi-channel devices, bit 0 of the Internal Reference Setup command
behaves the same way, so AD5686_REF_BIT_MSK is created. The issue exists
since support for single-channel devices were first introduced.
Fixes: be1b24d245 ("iio:dac:ad5686: Add AD5691R/AD5692R/AD5693/AD5693R support")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The work_refresh may still be pending or running when the device is
removed, cancel the delayed work_refresh in remove path.
Fixes: 50dd64d57e ("iio: common: ssp_sensors: Add sensorhub driver")
Signed-off-by: Sanjay Chitroda <sanjayembeddedse@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
meson_sar_adc_temp_sensor_init() allocates a buffer with
nvmem_cell_read(), but the old code leaked it if
syscon_regmap_lookup_by_phandle() failed.
Fix this by adding missing kfree(buf).
Fixes: d6f2eac644 ("iio: adc: meson: no devm for nvmem_cell_get")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The function max5821_sync_powerdown_mode() returned the result of
i2c_master_send() directly. If a partial transfer occurred, it would
be incorrectly treated as a success by the caller.
While the caller currently handles the positive return value of 2 as
success, this patch refactors the function to return 0 on full success
and -EIO on short writes. This ensures robust error handling for
incomplete transfers and improves code maintainability by using
sizeof(outbuf).
Fixes: 4729889727 ("iio: add support of the max5821")
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
In mt6358_read_imp(), the variable val_v is passed to regmap_read()
but the return value is not checked. If the read fails, val_v remains
uninitialized and its random stack content is subsequently reported
as a measurement result.
Initialize val_v to zero to ensure a predictable value is reported
in case of bus failure and to prevent potential stack data leakage.
This also satisfies static analyzers that might otherwise flag the
variable as used uninitialized.
Fixes: 3587914bf6 ("iio: adc: Add support for MediaTek MT6357/8/9 Auxiliary ADC")
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The > in "if (chan > ADC5_MAX_CHANNEL)" should be >= to prevent an out
of bound read of the adc->data->adc_chans[] array.
Fixes: baff45179e ("iio: adc: Add support for QCOM PMIC5 Gen3 ADC")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
adis16550_trigger_handler() declares the scan data array on the stack
without initializing it. The memcpy() at the bottom fills only the
first 28 bytes (TEMP + 6 channels of GYRO/ACCEL data), and
iio_push_to_buffers_with_timestamp() writes the s64 timestamp at the
8-byte-aligned offset 32. Bytes 28-31 remain uninitialized stack data
which leaks to userspace on ever trigger.
Fix this all by just zero-initializing the structure on the stack.
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Michael Hennerich <Michael.Hennerich@analog.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: David Lechner <dlechner@baylibre.com>
Cc: "Nuno Sá" <nuno.sa@analog.com>
Cc: Andy Shevchenko <andy@kernel.org>
Fixes: e4570f4bb2 ("iio: imu: adis16550: align buffers for timestamp")
Cc: stable <stable@kernel.org>
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The tagged FIFO path declares iio_buff on the stack with __aligned(8)
but no initializer, but there is a hole in the structure, which will
then leak to userspace as ST_LSM6DSX_SAMPLE_SIZE bytes (6) will be
copied, but the space between that and the timestamp are not
initialized.
Commit c14edb4d0b ("iio:imu:st_lsm6dsx Fix alignment and data leak
issues") moved the untagged FIFO path to a kzalloc'd buffer in hw->scan,
but for the tagged path it only added the alignment qualifier and not
the initializer :(
Fix this by just zero-initializing the structure on the stack.
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: David Lechner <dlechner@baylibre.com>
Cc: "Nuno Sá" <nuno.sa@analog.com>
Cc: Andy Shevchenko <andy@kernel.org>
Fixes: c14edb4d0b ("iio:imu:st_lsm6dsx Fix alignment and data leak issues")
Cc: stable <stable@kernel.org>
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
bmp580_trigger_handler() declares its scan buffer on the stack without
an initializer and then memcpy()s 3 bytes of 24-bit sensor data into
each 4-byte __le32 field. The high byte of comp_temp and comp_press is
left uninitialized, and the channel storagebits is 32, so two bytes of
stack are pushed to userspace per scan.
This is a regression from when the buffer lived in the private data, the
move to a stack-local struct dropped the implicit zeroing.
bme280_trigger_handler() was fixed up to handle this bug, but this
driver was not fixed because there was no padding hole, but rather a
short-fill issue.
Fix this all by just zero-initializing the structure on the stack.
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: David Lechner <dlechner@baylibre.com>
Cc: "Nuno Sá" <nuno.sa@analog.com>
Cc: Andy Shevchenko <andy@kernel.org>
Fixes: 872c8014e0 ("iio: pressure: bmp280: drop sensor_data array")
Cc: stable <stable@kernel.org>
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
nxp_sar_adc_start_cyclic_dma() only fills the RX-side members of
dma_slave_config before passing it to dmaengine_slave_config().
Zero-initialize the structure so unused members do not contain stack
garbage. Some DMA engines consult optional dma_slave_config fields, so
leaving them uninitialized can cause DMA setup failures.
Fixes: 4434072a89 ("iio: adc: Add the NXP SAR ADC support for the s32g2/3 platforms")
Signed-off-by: Shuvam Pandey <shuvampandey1@gmail.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The code stores the return value of i2c_smbus_write_word_data()
in data->reg_conf; however, this value represents the result
of the write operation and not the value actually written to
the configuration register. This meant that the contents of
data->reg_conf did not truly reflect the contents
of the hardware register.
Instead, save the value of the register before the write
and use this value in the I2C write.
The bug was found by code inspection: i2c_smbus_write_word_data()
returns 0 on success, not the value written to the register.
Tested using i2c-stub on a Raspberry Pi 3B running a custom 6.19.10
kernel. Before loading the driver, the configuration register 0x00
CM3323_CMD_CONF was populated with 0x0030 using
`i2cset -y 11 0x10 0x00 0x0030 w`, encoding an integration time of 320ms
in bits[6:4].
Due to incorrect initialization of data->reg_conf in
cm3323_init(), the print of integration_time returns 0.040000
instead of the expected 0.320000. This happens because the read of the
integration_time depends on cm3323_get_it_bits() that is based on the
value of data->reg_conf, which is erroneously set to 0.
With this fix applied, data->reg_conf correctly saves 0x0030 after init
and the successive integration_time reports 0.320000 as expected.
Fixes: 8b05442637 ("iio: light: Add support for Capella CM3323 color sensor")
Cc: stable@vger.kernel.org
Signed-off-by: Aldo Conte <aldocontelk@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The device tree binding for st,lis2mdl does not support
st,drdy-int-pin property. However, when no platform data is provided
and the property is absent, the driver falls back to default_magn_pdata
which hardcodes drdy_int_pin = 2. This causes
`st_sensors_set_drdy_int_pin` to fail with -EINVAL because the LIS2MDL
sensor settings have no INT2 DRDY mask defined.
Fix this by checking the sensor's INT2 DRDY mask availability at
probe time and selecting the appropriate default pin. Sensors that
do not support INT2 DRDY will default to INT1, while all others
retain the existing default of INT2.
Fixes: 38934daf7b ("iio: magnetometer: st_magn: Provide default platform data")
Signed-off-by: Advait Dhamorikar <advaitd@mechasystems.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
iio_buffer_enqueue_dmabuf() allocates a struct iio_dma_fence (104 bytes,
kmalloc-128) via kmalloc_obj()+dma_fence_init(), which sets the initial
kref to 1. It then calls dma_resv_add_fence() which takes a second
reference (kref=2), and stores a raw pointer in block->fence.
On the success path the function returns without calling dma_fence_put()
to release the initial reference, so every buffer enqueue permanently
leaks one kmalloc-128 allocation.
The iio_buffer_cleanup() work item only releases the temporary reference
taken during completion signalling by iio_buffer_signal_dmabuf_done();
the initial reference from dma_fence_init() is never released.
With four iio_rwdev instances at 240kHz and 512 samples per buffer,
this produces ~1875 kmalloc-128 allocations per second matching the
observed slab growth exactly. A test with ftrace confirmed that the
dma_fence_destroy event was never triggered.
Fix by calling dma_fence_put() after dma_resv_add_fence(), transferring
ownership of the fence to the DMA reservation object. The DMA fence then
gets properly discarded after being signalled.
Fixes: 3e26d9f08f ("iio: core: Add new DMABUF interface infrastructure")
Originally-by: James Nuss <jamesnuss@nanometrics.ca>
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Add a validation check for the sampling frequency value before using it
as a divisor. A user writing zero or a negative value to the
sampling_frequency sysfs attribute triggers a division by zero in the
kernel.
Also prevent unsigned integer underflow when the computed cycle count is
smaller than NXP_SAR_ADC_CONV_TIME, which would wrap the u32 inpsamp to
a huge value.
Fixes: 4434072a89 ("iio: adc: Add the NXP SAR ADC support for the s32g2/3 platforms")
Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
A fix to plug a refcount leak in counter_alloc() if dev_set_name() fails
and the error path is taken.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSNN83d4NIlKPjon7a1SFbKvhIjKwUCaga4SQAKCRC1SFbKvhIj
K6BFAQCT+DwFTzJ8JVt4eXZIwzZ/B5KdTetOMVST6pjqwy9OiAEAhLhZeMFdkAxg
KEUApW65KeKCSZKUEJeDftIOZxMx4ws=
=28xw
-----END PGP SIGNATURE-----
Merge tag 'counter-fixes-for-7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wbg/counter into char-misc-linus
William writes:
Counter fixes for 7.1
A fix to plug a refcount leak in counter_alloc() if dev_set_name() fails
and the error path is taken.
* tag 'counter-fixes-for-7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wbg/counter:
counter: Fix refcount leak in counter_alloc() error path
Sashiko identified the leak at [1].
The ACPM driver allocates hardware mailbox channels using
`mbox_request_channel()` during `acpm_channels_init()`. However, the
driver lacked a `.remove` callback and did not free these channels on
subsequent error paths inside `acpm_probe()`.
Additionally, if `acpm_achan_alloc_cmds()` failed during the channel
initialization loop, the function returned immediately, bypassing the
manual cleanup and permanently leaking any channels successfully
requested in previous loop iterations.
Fix this by modifying `acpm_free_mbox_chans()` to match the `devres`
action signature and registering it via `devm_add_action_or_reset()`.
Cc: stable@vger.kernel.org
Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-2-43b5ee7f1674@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Sashiko identified a cross-thread RX length corruption bug when
reviewing the thermal addition to ACPM [1].
When multiple threads concurrently send IPC requests, the ACPM polling
mechanism can encounter responses belonging to other threads. To drain
the queue, the driver saves these concurrent responses into an internal
cache (`rx_data->cmd`) to be retrieved later by the owning thread.
Previously, the driver incorrectly used `xfer->rxcnt` (the expected
receive length of the *current* polling thread) when copying data for
*other* threads into this cache. If the threads expected responses of
different lengths, this resulted in buffer underflows (leading to reads
of uninitialized memory) or potential buffer overflows.
Fix this by replacing the boolean `response` flag in
`struct acpm_rx_data` with `rxcnt`, caching the exact expected receive
length for each specific transaction during transfer preparation. Use
this cached length when saving concurrent responses.
Consequently, ensure that `xfer->rxcnt` is explicitly zeroed in driver
helpers (e.g., `acpm_dvfs_set_xfer`) for fire-and-forget messages to
prevent uninitialized stack garbage from being interpreted as a massive
expected receive length.
Cc: stable@vger.kernel.org
Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Reported-by: Titouan Ameline de Cadeville <titouan.ameline@gmail.com>
Closes: https://lore.kernel.org/r/20260426210255.73674-1-titouan.ameline@gmail.com/
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-1-43b5ee7f1674@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Leo Lin reported OOB write issue in esp component:
xfrm_state_mtu() returns u32 but performs its arithmetic in unsigned
modulo-2^32 space using an attacker-influenced "header_len + authsize +
net_adj" subtracted from a small "mtu" argument. A nobody user can
install an IPv4 ESP tunnel SA with a large authentication key
(XFRMA_ALG_AUTH_TRUNC, e.g. hmac(sha512), 64-byte key, 64-byte trunc),
configure a small interface MTU (68 bytes), and set XFRMA_TFCPAD to a
large value. When a single UDP datagram is then sent through the
tunnel, xfrm_state_mtu() underflows to a near-2^32 value, and
esp_output() consumes it as a signed int via:
padto = min(x->tfcpad, xfrm_state_mtu(x, mtu_cached))
esp.tfclen = padto - skb->len (assigned to int)
esp.tfclen ends up negative (e.g. -207). It is sign-extended to size_t
when passed to memset() inside esp_output_fill_trailer(), producing a
~16 EB write of zeroes at skb_tail_pointer(skb). KASAN logs it as
"Write of size 18446744073709551537 at addr ffff888...".
Check for underflow and return 1. This causes the sendmsg attempt to
fail with ENETUNREACH.
Fixes: c5c2523893 ("[XFRM]: Optimize MTU calculation")
Reported-by: Leo Lin <leo@depthfirst.com>
Assisted-by: Codex:26.506.31004
Signed-off-by: David Ahern <dahern@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The Lenovo ThinkPad E490 (PNP ID: LEN2058) has a Synaptics TM3471-020
touchpad that supports SMBus/RMI4 mode but is not listed in
smbus_pnp_ids[]. Without this entry, RMI4 over SMBus is not enabled
by default, and the touchpad falls back to PS/2 mode.
Adding LEN2058 to the passlist enables automatic RMI4 detection without
requiring the psmouse.synaptics_intertouch parameter, and matches
the behavior of similar ThinkPad models already in the list
(E480/LEN2054, E580/LEN2055).
Tested on ThinkPad E490 with kernel 7.0.5-zen1 and Arch Linux.
RMI4 over SMBus is confirmed working without any kernel parameters.
Signed-off-by: Nicolás Bazaes <contacto@bazaes.cl>
Assisted-by: Claude:claude-sonnet-4-6
Link: https://patch.msgid.link/20260514013552.14234-1-contacto@bazaes.cl
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The backtrace() function and execinfo.h are GNU extensions available
in glibc but not in non-glibc C libraries such as musl. Building KVM
selftests with musl-gcc fails with:
lib/assert.c:9:10: fatal error: execinfo.h: No such file or directory
Fix this by guarding the inclusion of execinfo.h and the stack dumping
logic under #ifdef __GLIBC__. For non-glibc builds, provide a local
stub for test_dump_stack().
Suggested-by: Aqib Faruqui <aqibaf@amazon.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Hisam Mehboob <hisamshar@gmail.com>
Link: https://patch.msgid.link/20260409153846.1502656-2-hisamshar@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
commit 446fcce2a5 ("Revert "x86: kvm: rate-limit global clock updates"")
dropped the rate limiting for KVM_REQ_GLOBAL_CLOCK_UPDATE.
As a result, kvm_arch_vcpu_load() can queue global clock update requests
every time a vCPU is scheduled when the master clock is disabled or when
the vCPU is loaded for the first time.
Restore the throttling with a per-VM ratelimit state and gate
KVM_REQ_GLOBAL_CLOCK_UPDATE through __ratelimit(), so frequent vCPU
scheduling does not generate a steady stream of redundant clock update
requests.
Fixes: 446fcce2a5 ("Revert "x86: kvm: rate-limit global clock updates"")
Signed-off-by: Lei Chen <lei.chen@smartx.com>
Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Closes: https://lore.kernel.org/all/CAK8fFZ5gY8_Mw2A=iZVFNVKQNrXQzVsn-HTd+Me9K6ZfmdgA+Q@mail.gmail.com/
Link: https://patch.msgid.link/20260409142226.2581-1-lei.chen@smartx.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference()
through machine_crash_shutdown() with IRQs disabled but with RCU not
necessarily watching the crashing CPU, which triggers a suspicious
RCU usage splat on debug kernels (CONFIG_PROVE_RCU=y) during
panic/kdump:
WARNING: suspicious RCU usage
arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by tee/11119:
#0: ffff8881fa32c440 (sb_writers#3){.+.+}-{0:0}, at: ksys_write
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
lockdep_rcu_suspicious.cold+0x37/0x8f
x86_virt_invoke_kvm_emergency_callback+0x5f/0x70
x86_svm_emergency_disable_virtualization_cpu+0x2a/0x30
x86_virt_emergency_disable_virtualization_cpu+0x6b/0x90
native_machine_crash_shutdown+0x72/0x170
__crash_kexec+0x137/0x280
panic+0xce/0xd0
sysrq_handle_crash+0x1f/0x20
__handle_sysrq.cold+0x192/0x335
write_sysrq_trigger+0x8c/0xc0
proc_reg_write+0x1c3/0x3c0
vfs_write+0x1d0/0xf80
ksys_write+0x116/0x250
do_syscall_64+0x11c/0x1480
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
A truly correct fix is non-trivial: the RCU usage genuinely is wrong in
panic context (RCU may ignore the crashing CPU during synchronization),
and a concurrent KVM module unload could in principle race with the
callback read; see commit 2baa33a8dd ("KVM: x86: Leave user-return
notifier registered on reboot/shutdown") which notes that nothing
prevents module unload during panic/reboot.
However, the alternatives are worse:
- smp_store_release()/smp_load_acquire() handles ordering but not
liveness; the kernel still needs to keep the module text alive
while the callback is in flight.
- Taking a lock in the panic path is risky — any lock could be held
by a CPU that has already been NMI'd to a halt.
Use rcu_dereference_raw() to silence the splat and accept the
vanishingly small remaining race. Panic context inherently cannot
guarantee complete correctness; the goal here is to keep debug builds
quiet on the kdump path so the splat doesn't obscure the actual
kernel state being captured.
Reproducible on a debug kernel (CONFIG_PROVE_LOCKING=y, CONFIG_PROVE_RCU=y)
with kvm_amd or kvm_intel loaded by triggering kdump:
echo c > /proc/sysrq-trigger
Suggested-by: Sean Christopherson <seanjc@google.com>
Fixes: 428afac5a8 ("KVM: x86: Move bulk of emergency virtualizaton logic to virt subsystem")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260504235435.90957-1-mikhail.v.gavrilov@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Include both linux/mman.h (the kernel provided version) and sys/mman.h (the
libc provided version) throughout KVM selftests, by way of kvm_syscalls.h
(which should have been including sys/mman.h anyways). Pulling in the
kernel's version fixes compilation errors with the guest_memfd test on
older versions of libc due to a recent commit adding MADV_COLLAPSE testing.
In file included from include/kvm_util.h:8,
from guest_memfd_test.c:21:
guest_memfd_test.c: In function ‘test_collapse’:
guest_memfd_test.c:219:47: error: ‘MADV_COLLAPSE’ undeclared (first use in this function); did you mean ‘MADV_COLD’?
219 | TEST_ASSERT_EQ(madvise(mem, pmd_size, MADV_COLLAPSE), -1);
| ^~~~~~~~~~~~~
include/test_util.h:62:16: note: in definition of macro ‘TEST_ASSERT_EQ’
62 | typeof(a) __a = (a); \
| ^
guest_memfd_test.c:219:47: note: each undeclared identifier is reported only once for each function it appears in
219 | TEST_ASSERT_EQ(madvise(mem, pmd_size, MADV_COLLAPSE), -1);
| ^~~~~~~~~~~~~
include/test_util.h:62:16: note: in definition of macro ‘TEST_ASSERT_EQ’
62 | typeof(a) __a = (a); \
| ^
Route the includes through kvm_syscalls.h to try and avoid a future game
of whack-a-mole, i.e. so that future expansion of test coverage doesn't run
into the same problem.
To discourage use of sys/mman.h, opportunistically include the kernel's
version of mman.h in test_util.h as it only needs MAP_SHARED, i.e. only
needs the full set of kernel defs, not the libc syscall wrappers.
Fixes: 9830209b4a ("KVM: selftests: Test MADV_COLLAPSE on guest_memfd")
Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Closes: https://lore.kernel.org/all/20260427204313.50741-1-rick.p.edgecombe@intel.com
Link: https://patch.msgid.link/20260428012503.1213654-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
i2c20 is used by the battmgr service on the ADSP to communicate with the
SBS interface of the battery. Initializing it from Linux would break the
battmgr functionality when booted in EL2. Mark those pins as reserved.
Fixes: e7733b4211 ("arm64: dts: qcom: Add support for Dell Inspiron 7441 / Latitude 7455")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Signed-off-by: Val Packett <val@packett.cool>
Link: https://lore.kernel.org/r/20260312005731.12488-2-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Every platform driver can be forced to match a device that doesn't match
its list of device IDs because of device_match_driver_override(), so
platform drivers that rely on the existence of a device's ACPI companion
object need to verify its presence.
Accordingly, add a requisite ACPI_COMPANION() check against NULL to the
atlas_btns driver.
Fixes: b8303880b6 ("Input: atlas - convert ACPI driver to a platform one")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/8696590.T7Z3S40VBb@rafael.j.wysocki
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
On Glymur, all QMP PHYs except the one used by USB SS0 take their
reference clock from the TCSR clock controller. Since these TCSR clocks
already derive from RPMH_CXO_CLK as their sole parent, there is no need
to provide an extra `clkref` clock to the PHY nodes.
Drop the extra RPMh CXO clock inputs and use the TCSR clocks as the PHY
reference clocks instead.
This also fixes the devicetree schema validation, as the bindings do not
allow a separate `clkref` clock.
Fixes: 4eee57dd4d ("arm64: dts: qcom: glymur: Add USB related nodes")
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reported-by: Rob Herring <robh@kernel.org>
Closes: https://lore.kernel.org/r/20260410145205.GA554754-robh@kernel.org/
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260414-dts-glymur-drop-rpmh-cxo-clk-from-qmpphys-v1-1-ab12d77c4aec@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
When uart_flush_buffer() runs before the DMA completion IRQ is delivered,
the following race can occur (all steps serialized by uart_port_lock):
1. DMA starts: tx_remaining = N, kfifo contains N bytes
2. DMA completes in hardware; IRQ is pending but not yet delivered
3. uart_flush_buffer() acquires the port lock and calls kfifo_reset(),
making kfifo_len() = 0 while tx_remaining remains N
4. uart_flush_buffer() releases the port lock
5. DMA IRQ fires; handle_tx_dma() acquires the port lock and calls
uart_xmit_advance(uport, tx_remaining) on an empty kfifo
uart_xmit_advance() increments kfifo->out by tx_remaining. Since
kfifo_reset() already set both in and out to 0, out wraps past in,
causing kfifo_len() to return UART_XMIT_SIZE - tx_remaining. The next
start_tx_dma() call then submits a DMA transfer of stale buffer data.
Fix this by snapshotting kfifo_len() at the start of handle_tx_dma()
and skipping uart_xmit_advance() when fifo_len < tx_remaining, which
indicates the kfifo was reset by a preceding flush.
Fixes: 2aaa43c707 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
Cc: stable <stable@kernel.org>
Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260506-serial-dma-stale-tx-buf-v1-1-e3ccb360d719@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
lpuart_start_rx_dma() allocates sport->rx_ring.buf with kzalloc() and
then maps a scatterlist via dma_map_sg(). On three subsequent error
paths the function returns directly without releasing those resources:
- when dma_map_sg() returns 0 (-EINVAL):
ring->buf is leaked.
- when dmaengine_slave_config() fails:
ring->buf and the DMA mapping are leaked.
- when dmaengine_prep_dma_cyclic() returns NULL:
ring->buf and the DMA mapping are leaked.
The sole cleanup path, lpuart_dma_rx_free(), is only reached when
lpuart_dma_rx_use is set, and the caller lpuart_rx_dma_startup() clears
that flag on failure of lpuart_start_rx_dma(). So these resources are
permanently leaked on every failure in this function. Repeated port
open/close or termios changes under error conditions will slowly consume
memory and leave stale streaming DMA mappings behind.
Fix it by introducing two error labels that unmap the scatterlist and
free the ring buffer as appropriate. While here, replace the misleading
-EFAULT (bad userspace pointer) returned when dmaengine_prep_dma_cyclic()
fails with the more accurate -ENOMEM, matching how other dmaengine users
in the tree treat this failure.
No functional change on the success path.
Fixes: 5887ad43ee ("tty: serial: fsl_lpuart: Use cyclic DMA for Rx")
Cc: stable <stable@kernel.org>
Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260420135903.2062024-1-shitalkumar.gandhi@cambiumnetworks.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Include the definition of struct tty_driver in tty_port.h to keep the
header self-contained and avoid build breakage in case anyone includes
it before tty_driver.h.
Fixes: eb3b0d92c9 ("tty: tty_port: add workqueue to flip TTY buffer")
Cc: Xin Zhao <jackzxcui1989@163.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260506124323.186703-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UART_RX_PAR_EN is incorrectly defined as bit 3, which triggers false
framing errors (S_GP_IRQ_1_EN) and causes received data to be dropped
when parity is enabled and the parity bit is 0.
Define UART_RX_PAR_EN as bit 4 of the SE_UART_RX_TRANS_CFG register, as
specified in the reference manual.
Fixes: c4f528795d ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
Cc: stable <stable@kernel.org>
Signed-off-by: Prasanna S <prasanna.s@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260428-serial-bit-correct-v1-1-9131ad5b97d8@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The sci_request_port() function uses request_mem_region() to reserve
I/O memory, but in the error path when sci_remap_port() fails, it
incorrectly calls release_resource() instead of release_mem_region().
This mismatch can cause resource accounting issues. Fix it by using
the correct release function, consistent with sci_release_port().
Fixes: e265164708 ("serial: sh-sci: Handle port memory region reservations.")
Cc: stable <stable@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604032356.SzEjYkBC-lkp@intel.com/
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260421065737.724187-1-zenghongling@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a check for dma_alloc_coherent() failure to prevent a potential
NULL pointer dereference in dma_handle_rx(). Properly release DMA
channels and the PCI device reference using a goto ladder if the
allocation fails.
Fixes: 3c6a483275 ("Serial: EG20T: add PCH_UART driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Zhaoyang Yu <2426767509@qq.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/tencent_E328416B7CFD436F6029F2DF02AD7ED89C08@qq.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a thinko in the status interrupt handler that has caused counters
for the RI and DSR modem line transitions to be used for the other line
each.
Fixes: 8b4a40809e ("zs: move to the serial subsystem")
Cc: stable <stable@kernel.org>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2604101747110.29980@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since Qualcomm inline-crypto engine (ICE) is now a dedicated driver
de-coupled from the QCOM UFS driver, it explicitly votes for its required
clocks during probe. For scenarios where the 'clk_ignore_unused' flag is
not passed on the kernel command line, to avoid potential unclocked ICE
hardware register access during probe the ICE driver should additionally
vote on the 'iface' clock.
Also update the suspend and resume callbacks to handle un-voting and voting
on the 'iface' clock.
Fixes: 2afbf43a4a ("soc: qcom: Make the Qualcomm UFS/SDCC ICE a dedicated driver")
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-2-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
The DT bindings for inline-crypto engine do not specify the UFS_PHY_GDSC
power-domain and iface clock. Without enabling the iface clock and the
associated power-domain the ICE hardware cannot function correctly and
leads to unclocked hardware accesses being observed during probe.
Fix the DT bindings for inline-crypto engine to require the UFS_PHY_GDSC
power-domain and iface clock for new devices (Eliza and Milos) introduced
in the current release (7.1) with yet-to-stabilize ABI, while preserving
backward compatibility for older devices.
Fixes: 618195a7ac ("dt-bindings: crypto: qcom,inline-crypto-engine: Document the Eliza ICE")
Fixes: 85faec1e85 ("dt-bindings: crypto: qcom,inline-crypto-engine: document the Milos ICE")
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
A DIRECTORY entry's value field is used as the dir_offset for a
recursive call into __tb_property_parse_dir() with no depth counter.
A crafted peer that chains DIRECTORY entries into a back-reference
loop drives the parser until the kernel stack is exhausted and the
guard page fires. Any untrusted XDomain peer (cable, dock, in-line
inspector, adjacent host) that reaches the PROPERTIES_REQUEST
control-plane exchange can trigger this without authentication.
Thread a depth counter through tb_property_parse() and
__tb_property_parse_dir(), and reject blocks that exceed
TB_PROPERTY_MAX_DEPTH = 8. That is comfortably larger than any
observed legitimate XDomain layout.
Operators who do not need XDomain host-to-host discovery can disable
the path entirely with thunderbolt.xdomain=0 on the kernel command
line.
Fixes: cdae7c07e3 ("thunderbolt: Add support for XDomain properties")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
On the non-root path, __tb_property_parse_dir() takes dir_len from
entry->length (u16 widened to size_t). Two distinct OOB conditions
follow when entry->length < 4:
1. The non-root path begins with kmemdup(&block[dir_offset],
sizeof(*dir->uuid), ...) which always reads 4 dwords from
dir_offset. tb_property_entry_valid() only enforces
dir_offset + entry->length <= block_len, so a crafted entry
with dir_offset close to the end of the property block and
entry->length in 0..3 passes that gate but lets the UUID copy
run off the block (e.g. dir_offset = 497, dir_len = 3 in a
500-dword block reads block[497..501]).
2. After the kmemdup, content_len = dir_len - 4 underflows size_t
to ~SIZE_MAX, nentries becomes SIZE_MAX / 4, and the entry
walk runs OOB on each iteration until an entry fails
validation or the kernel oopses on an unmapped page.
Reject dir_len < 4 on the non-root path *before* the UUID kmemdup,
which closes both holes.
Also move INIT_LIST_HEAD(&dir->properties) up to immediately after
the dir allocation so the new error-return path (and the existing
uuid-alloc failure path) calling tb_property_free_dir() sees a
walkable list rather than the zero-initialized NULL next/prev that
list_for_each_entry_safe() would oops on.
Fixes: cdae7c07e3 ("thunderbolt: Add support for XDomain properties")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
entry->value is u32 and entry->length is u16; the sum is performed in
u32 and wraps. A malicious XDomain peer can pick
value = 0xffffff00, length = 0x100 so the sum 0x100000000 wraps to 0
and passes the > block_len check. tb_property_parse() then passes
entry->value to parse_dwdata() as a dword offset into the property
block, reading attacker-directed memory far past the allocation.
For TEXT-typed entries with the "deviceid" or "vendorid" keys this
lands in xd->device_name / xd->vendor_name and is readable back via
the per-XDomain device_name / vendor_name sysfs attributes; the leak
is NUL-bounded (kstrdup() stops at the first zero byte) and
untargeted (the attacker picks a delta, not an absolute address).
DATA-typed entries are parsed into property->value.data but not
generically surfaced to userspace.
Use check_add_overflow() so a wrapped sum is rejected.
Fixes: cdae7c07e3 ("thunderbolt: Add support for XDomain properties")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Move the out_free_req label up by a couple of lines so that the
allocated dst SG list gets freed on error as well as success.
Fixes: eb2953d269 ("xfrm: ipcomp: Use crypto_acomp interface")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reported-by: Yilin Zhu <zylzyl2333@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
In mxt_update_cfg(), the driver calculates the memory size needed to store
the configuration as data->mem_size - cfg.start_ofs. If data->mem_size is
less than or equal to cfg.start_ofs, this calculation will underflow or
result in a zero-size buffer, neither of which is valid for a configuration
update.
Add a check to return -EINVAL if data->mem_size is too small. While at it,
change the types of start_ofs and mem_size in struct mxt_cfg to u16 to
match the device address space.
Assisted-by: Gemini:gemini-3.1-pro
Link: https://patch.msgid.link/20260504185448.4055973-2-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When a configuration file provides an object size that is larger than the
driver's known mxt_obj_size(object), the driver intends to discard the
extra bytes.
The loop iterates using for (i = 0; i < size; i++). Inside the loop, the
condition to skip processing extra bytes is:
if (i > mxt_obj_size(object))
continue;
Since i is a 0-based index, the valid indices for the object are 0 through
mxt_obj_size(object) - 1.
When i == mxt_obj_size(object), the condition evaluates to false, and the
code processes the byte instead of discarding it.
This causes the code to calculate byte_offset = reg + i - cfg->start_ofs
and writes the byte there, overwriting exactly one byte of the adjacent
instance or object.
Update the boundary check to skip extra bytes correctly by using >=.
Fixes: 50a77c658b ("Input: atmel_mxt_ts - download device config using firmware loader")
Cc: stable@vger.kernel.org
Assisted-by: Gemini:gemini-3.1-pro
Reviewed-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://patch.msgid.link/20260504185448.4055973-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Instead of assigning the pci_device_id members using a list (which is
hard to read as you need to look at the order of the members in that
struct in parallel) use the PCI_VDEVICE() convenience macro to compact
the initialisation while improving readability.
Also drop trailing zeros that the compiler will care about then.
The change doesn't introduce binary changes to the compiled driver,
verified on both ARCH=x86 and ARCH=arm64.
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260507160051.3315630-2-u.kleine-koenig@baylibre.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The i.MX8M soc device is registered via platform_device_register_simple(),
so it is not associated with a Device Tree node and the imx8m_soc_driver
has no of_match_table.
As a result, device_get_match_data() always returns NULL when probing
the soc device.
Retrieve the match data directly from the machine compatible using
of_machine_get_match_data(imx8_soc_match), which provides the correct SoC
data.
Fixes: 2524b293a5 ("soc: imx8m: don't access of_root directly")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Both nfc_hci_recv_from_llc() and nci_hci_data_received_cb() read
packet->header from skb->data at function entry without first checking
that the buffer holds at least one byte. A malicious NFC peer can send
a 0-byte HCP frame that passes through the SHDLC layer and reaches
these functions, causing an out-of-bounds heap read of packet->header.
The same 0-byte frame, if queued as a non-final fragment, also causes
the reassembly loop to underflow msg_len to UINT_MAX, triggering
skb_over_panic() when the reassembled skb is written.
Fix this by adding a pskb_may_pull() check at the entry of each
function before packet->header is first accessed. The existing
pskb_may_pull() checks before the reassembled hcp_skb is cast to
struct hcp_packet remain in place to guard the 2-byte HCP message
header.
Fixes: 8b8d2e08bf ("NFC: HCI support")
Fixes: 11f54f2286 ("NFC: nci: Add HCI over NCI protocol support")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
Link: https://patch.msgid.link/20260505170712.96560-1-ashutoshdesai993@gmail.com
Signed-off-by: David Heidelberg <david@ixit.cz>
xfrm_send_migrate() in net/xfrm/xfrm_user.c and pfkey_send_migrate()
in net/key/af_key.c both hardcode &init_net for the multicast that
announces a successful XFRM_MSG_MIGRATE / SADB_X_MIGRATE.
XFRM_MSG_MIGRATE arrives on a per-netns NETLINK_XFRM socket, and the
rest of the xfrm/af_key netlink path was made netns-aware in 2008.
The other 14 multicast paths in xfrm_user.c route their event using
xs_net(x), xp_net(xp) or sock_net(skb->sk); only the migrate path
was missed.
Two consequences of the init_net hardcoding:
1. The notification (selector, old/new endpoint addresses, and the
km_address) is delivered to listeners on init_net's
XFRMNLGRP_MIGRATE / pfkey BROADCAST_ALL groups rather than on
the issuing netns. An IKE daemon running in init_net therefore
receives migration notifications originating from any other
netns on the host.
2. An IKE daemon running inside a non-init netns and subscribed
to its own XFRMNLGRP_MIGRATE / pfkey groups never receives the
notification of its own migration. IKEv2 MOBIKE / address-update
handling inside a netns is silently broken.
Thread struct net through km_migrate() and the xfrm_mgr.migrate
function pointer, drop the &init_net override in xfrm_send_migrate()
and pfkey_send_migrate(), and pass the caller's net (already in
scope in xfrm_migrate() via sock_net(skb->sk)) all the way down.
struct xfrm_mgr is in-tree only and not exported as a stable API,
so the function-pointer signature change is internal.
pfkey_broadcast() is already netns-aware via net_generic(net,
pfkey_net_id) since the pernet conversion. The five other
pfkey_broadcast() callers in af_key.c already pass xs_net(x),
sock_net(sk) or a per-netns net, so this only removes the
&init_net outlier.
Fixes: 5c79de6e79 ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
A race condition exists in the NFC LLCP connection state machine where
the connection acceptance packet (CC) can be processed concurrently with
socket release. This can lead to a use-after-free of the socket object.
When nfc_llcp_recv_cc() moves the socket from the connecting_sockets
list to the sockets list, it does so without holding the socket lock.
If llcp_sock_release() is executing concurrently, it might have already
unlinked the socket and dropped its references, which can result in
nfc_llcp_recv_cc() linking a freed socket into the live list.
Fix this by holding lock_sock() during the state transition and list
movement in nfc_llcp_recv_cc(). After acquiring the lock, check if
the socket is still hashed to ensure it hasn't already been unlinked
and marked for destruction by the release path. This aligns the locking
pattern with recv_hdlc() and recv_disc().
Fixes: a69f32af86 ("NFC: Socket linked list")
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260429134115.3558604-2-lee@kernel.org
Signed-off-by: David Heidelberg <david@ixit.cz>
llcp_sock_release() unconditionally unlinks the socket from the local
sockets list. However, if the socket is still in connecting state, it
is on the connecting list.
Fix this by checking the socket state and unlinking from the correct list.
Fixes: b4011239a0 ("NFC: llcp: Fix non blocking sockets connections")
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://patch.msgid.link/20260429134115.3558604-1-lee@kernel.org
Signed-off-by: David Heidelberg <david@ixit.cz>
Replace the ternary with a direct call to the regmap_assign_bits()
helper and save a couple lines of code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
After removing of_platform_default_populate() calls the atmel-ebi driver
was affected by deferred probing. platform_driver_probe() is
incompatible with deferred probing. This led to atmel-ebi driver
eventually not being probed on at91 sam9x60-curiosity and other sam9x60
based boards. Subsequently the nand-controller driver (nand-controller
being a child node of ebi) on that platform was not probed and thus raw
NAND flash was inaccessible, preventing devices to boot with rootfs on
raw NAND flash (e.g. with UBI/UBIFS).
Fixes: 0b0f7e6539 ("ARM: at91: remove unnecessary of_platform_default_populate calls")
Cc: stable@vger.kernel.org
Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Alexander Dahl <ada@thorsis.com>
Link: https://patch.msgid.link/20260429125930.844790-1-ada@thorsis.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
GFP_NOWAIT is inappropriate when blkdev_issue_zeroout may sleep and
bio_alloc can fail under pressure; use GFP_NOIO for clear_partition and
vdo_clear_layout zeroout calls.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: fc1d438267 ("dm vdo: save the formatted metadata to disk")
After device_initialize(), the lifetime of the embedded struct device
is expected to be managed through the device core reference counting.
In counter_alloc(), if dev_set_name() fails after device_initialize(),
the error path removes the chrdev, frees the ID, and frees the backing
allocation directly instead of releasing the device reference with
put_device(). This bypasses the normal device lifetime rules and may
leave the reference count of the embedded struct device unbalanced,
resulting in a refcount leak.
The issue was identified by a static analysis tool I developed and
confirmed by manual review.
Fix this by using put_device() in the dev_set_name() failure path and
let counter_device_release() handle the final cleanup.
Fixes: 4da08477ea ("counter: Set counter device name")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Link: https://lore.kernel.org/r/20260413134604.2861772-1-lgs201920130244@gmail.com
Signed-off-by: William Breathitt Gray <wbg@kernel.org>
Add the VID/PIDs for the ASUS ROG RAIKIRI II controller to xpad_device
and the VID to xpad_table. The controller has a physical PC/XBOX toggle
which switches between XBOX360 and XBOXONE protocols.
Signed-off-by: Dmitriy Zharov <contact@zharov.dev>
Link: https://patch.msgid.link/20260430183522.122151-1-contact@zharov.dev
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Ensure that the firmware file is large enough to contain the expected
number of pages and the signature (which resides at the end of the
firmware blob) before accessing them to prevent potential out-of-bounds
reads.
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/ae2dOgiFvXRm4BHo@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Add a zero check for val2 before using it as a divisor when setting the
sampling frequency. A user writing a zero fractional part to the
sampling_frequency sysfs attribute triggers a division by zero in the
kernel.
Fixes: 64b3d8b1b0 ("iio: chemical: scd30: add core driver")
Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The driver acquired the ADC clock with devm_clk_get() and read its
rate, but never called clk_prepare_enable(). The probe error path and
npcm_adc_remove() both called clk_disable_unprepare() unconditionally,
causing the clk framework's enable/prepare counts to underflow on
probe failure or module unbind.
The issue went unnoticed because NPCM BMC firmware leaves the ADC
clock enabled at boot, so the driver happened to work in practice.
Switch to devm_clk_get_enabled() so the clock is properly enabled
during probe and automatically released by the device-managed
cleanup, and drop the now-redundant clk_disable_unprepare() from
both the probe error path and remove().
While at it, drop the duplicate error message on devm_request_irq()
failure since the IRQ core already logs it.
Fixes: 9bf85fbc9d ("iio: adc: add NPCM ADC driver")
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
When Common Clock Framework is disabled, clk_get_rate() returns 0.
This is used as part of the divisor to perform nanosecond delays
with help of ndelay(). When the above condition occurs the compiler,
due to unspecified behaviour, is free to do what it wants to. Here
it saturates the value, which is logical from mathematics point of
view. However, the ndelay() implementation has set a reasonable
upper threshold and refuses to provide anything for such a long
delay. That's why code may not be linked under these circumstances.
To solve the issue, provide a wrapper that calls ndelay() when
the value is known not to be zero.
Fixes: 4434072a89 ("iio: adc: Add the NXP SAR ADC support for the s32g2/3 platforms")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603311958.ly6uROit-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The function iio_multiply_value returns IIO_VAL_INT (1) on success or a
negative error number on failure, while iio_read_channel_processed_scale
should return an error code or 0. This creates a situation where the
expected result is treated as an error. Fix this by checking the
iio_multiply_value result separately, instead of passing it as a return
value.
Fixes: 05f958d003 ("iio: Improve iio_read_channel_processed_scale() precision")
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Add a validation check for the sampling frequency value before using it
as a divisor. A user writing zero to the sampling_frequency sysfs
attribute triggers a division by zero in the kernel.
Fixes: 089a41985c ("staging: iio: adis16260 digital gyro driver")
Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
ad4695_enter_advanced_sequencer_mode() was called after
spi_offload_trigger_enable(). That is wrong because
ad4695_enter_advanced_sequencer_mode() issues regular SPI transfers to
put the ADC into advanced sequencer mode, and not all SPI offload capable
controllers support regular SPI transfers while offloading is enabled.
Fix this by calling ad4695_enter_advanced_sequencer_mode() before
spi_offload_trigger_enable(), so the ADC is fully configured before the
first CNV pulse can occur. This is consistent with the same constraint
that already applies to the BUSY_GP_EN write above it.
Update the error unwind labels accordingly: add err_exit_conversion_mode
so that a failure of spi_offload_trigger_enable() correctly exits
conversion mode before clearing BUSY_GP_EN.
Fixes: f09f140e3e ("iio: adc: ad4695: Add support for SPI offload")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Radu Sabau <radu.sabau@analog.com>
Cc: Stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The driver calls i2c_new_dummy_device() to create a dummy device,
then calls i2c_smbus_write_byte(). If i2c_smbus_write_byte() fails and
returns, the cleanup via devm_add_action_or_reset() was never registered,
so the dummy device leaks.
Switch to devm_i2c_new_dummy_device() which registers cleanup atomically
with device creation, eliminating the error-path window.
Fixes: 7501bff87c ("iio: light: veml6070: add action for i2c_unregister_device")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
mhz19b_receive_buf() appends each serdev chunk into the fixed
MHZ19B_CMD_SIZE receive buffer and advances buf_idx by len without
checking that the chunk fits in the remaining space. A large callback
can therefore overflow st->buf before the command path validates the
reply.
Reset the reply state before each command and reject oversized serial
replies before copying them into the fixed buffer. When an oversized
reply is detected, wake the waiter and report -EMSGSIZE instead of
overwriting st->buf.
Fixes: 4572a70b36 ("iio: chemical: Add support for Winsen MHZ19B CO2 sensor")
Cc: stable@vger.kernel.org
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Acked-by: Gyeyoung Baek <gye976@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
xadc_postdisable() unconditionally sets the sequencer to continuous
mode. For dual external multiplexer configurations this is incorrect:
simultaneous sampling mode is required so that ADC-A samples through
the mux on VAUX[0-7] while ADC-B simultaneously samples through the
mux on VAUX[8-15]. In continuous mode only ADC-A is active, so
VAUX[8-15] channels return incorrect data.
Since postdisable is also called from xadc_probe() to set the initial
idle state, the wrong sequencer mode is active from the moment the
driver loads.
The preenable path already uses xadc_get_seq_mode() which returns
SIMULTANEOUS for dual mux. Fix postdisable to do the same.
Fixes: bdc8cda1d0 ("iio:adc: Add Xilinx XADC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Christofer Jonason <christofer.jonason@guidelinegeo.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Salih Erim <salih.erim@amd.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
xpadone_process_packet() receives len directly from urb->actual_length
and uses it to index the share-button byte at data[len - 18] or
data[len - 26]. Since both len and data[0] are under the device's
control, a broken controller can send a GIP_CMD_INPUT packet with
actual_length < 18 (e.g. 5 bytes) and reach this code path, causing
accesses beyond the actual array.
Fix this by calculating the offset and checking bounds against the
packet length.
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 4ef4636707 ("Input: xpad - fix Share button on Xbox One controllers")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
nexio_read_data() pulls data_len and x_len from a packed __be16 header
in the device's interrupt packet and then walks packet->data[0..x_len)
and packet->data[x_len..data_len) comparing each byte against a
threshold.
Both fields are 16-bit on the wire (max 65535). The existing
adjustments shave at most 0x100 / 0x80 off, so the loop bound can still
reach roughly 0xfeff. The URB transfer buffer for NEXIO is rept_size
(1024) bytes from usb_alloc_coherent(), with the first 7 occupied by the
packed header — so packet->data[] has 1017 valid bytes. read_data()
callbacks are not given urb->actual_length, and nothing else bounds the
walk.
A device that lies about its length can get a ~64 KiB out-of-bounds read
past the coherent DMA allocation. The first index whose byte exceeds
NEXIO_THRESHOLD lands in begin_x / begin_y and from there into the
reported touch coordinates, so adjacent kernel memory contents leak to
userspace as ABS_X / ABS_Y events. Far enough out, the read can also
hit an unmapped page and fault.
Fix this all by clamping data_len to the buffer's data[] capacity and
x_len to data_len.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Fixes: 5197424cdc ("Input: usbtouchscreen - add NEXIO (or iNexio) support")
Cc: stable <stable@kernel.org>
Assisted-by: gkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026042026-chlorine-epidermis-fd6d@gregkh
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Commit 70b0d6b0a1 ("tee: optee: Fix supplicant wait loop") made the
client wait as killable so it can be interrupted during shutdown or
after a supplicant crash. This changes the original lifetime expectations:
the client task can now terminate while the supplicant is still processing
its request.
If the client exits first it removes the request from its queue and
kfree()s it, while the request ID remains in supp->idr. A subsequent
lookup on the supplicant path then dereferences freed memory, leading to
a use-after-free.
Serialise access to the request with supp->mutex:
* Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
looking up and touching the request.
* Let optee_supp_thrd_req() notice that the client has terminated and
signal optee_supp_send() accordingly.
With these changes the request cannot be freed while the supplicant still
has a reference, eliminating the race.
Fixes: 70b0d6b0a1 ("tee: optee: Fix supplicant wait loop")
Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@oss.qualcomm.com>
Tested-by: Ox Yeh <ox.yeh@mediatek.com>
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2026-03-02 14:36:50 +01:00
422 changed files with 5222 additions and 2429 deletions