Commit Graph

20446 Commits (3e082978c33151d576694deac8abde021ea669a8)

Author SHA1 Message Date
Brendan Jackman 953dad21bb tools: testing: support EXTRA_CFLAGS in shared.mk
This allows the user to set cflags when building tests that use this
shared build infrastructure.

For example, it enables building with -Werror so that patch-check scripts
will fail:

	make -C tools/testing/vma -j EXTRA_CFLAGS=-Werror

Link: https://lkml.kernel.org/r/20250828-b4-vma-no-atomic-h-v2-3-02d146a58ed2@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:25 -07:00
Brendan Jackman d794cd23dc tools: testing: allow importing arch headers in shared.mk
There is an arch/ tree under tools.  This contains some useful stuff, to
make that available, add it to the -I flags.  This requires $(SRCARCH),
which is provided by Makefile.arch, so include that..

There still aren't that many headers so also just smush all of them into
SHARED_DEPS instead of starting to do any header dependency hocus pocus.

Link: https://lkml.kernel.org/r/20250828-b4-vma-no-atomic-h-v2-2-02d146a58ed2@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:25 -07:00
Dev Jain 1580cd50b6 selftests/mm/uffd-stress: stricten constraint on free hugepages needed before the test
The test requires at least 2 * (bytes/page_size) hugetlb memory, since we
require identical number of hugepages for src and dst location.  Fix this.

Along with the above, as explained in patch "selftests/mm/uffd-stress:
Make test operate on less hugetlb memory", the racy nature of the test
requires that we have some extra number of hugepages left beyond what is
required.  Therefore, stricten this constraint.

Link: https://lkml.kernel.org/r/20250909061531.57272-3-dev.jain@arm.com
Fixes: 5a6aa60d18 ("selftests/mm: skip uffd hugetlb tests with insufficient hugepages")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:19 -07:00
Dev Jain 060b6c72ce selftests/mm/uffd-stress: make test operate on less hugetlb memory
Patch series "selftests/mm: uffd-stress fixes", v2.

This patchset ensures that the number of hugepages is correctly set in the
system so that the uffd-stress test does not fail due to the racy nature
of the test.  Patch 1 changes the hugepage constraint in the
run_vmtests.sh script, whereas patch 2 changes the constraint in the test
itself.


This patch (of 2):

We observed uffd-stress selftest failure on arm64 and intermittent
failures on x86 too:

running ./uffd-stress hugetlb-private 128 32

bounces: 17, mode: rnd read, ERROR: UFFDIO_COPY error: -12 (errno=12, @uffd-common.c:617) [FAIL]
not ok 18 uffd-stress hugetlb-private 128 32 # exit=1

For this particular case, the number of free hugepages from run_vmtests.sh
will be 128, and the test will allocate 64 hugepages in the source
location.  The stress() function will start spawning threads which will
operate on the destination location, triggering uffd-operations like
UFFDIO_COPY from src to dst, which means that we will require 64 more
hugepages for the dst location.

Let us observe the locking_thread() function.  It will lock the mutex kept
at dst, triggering uffd-copy.  Suppose that 127 (64 for src and 63 for
dst) hugepages have been reserved.  In case of BOUNCE_RANDOM, it may
happen that two threads trying to lock the mutex at dst, try to do so at
the same hugepage number.  If one thread succeeds in reserving the last
hugepage, then the other thread may fail in alloc_hugetlb_folio(),
returning -ENOMEM.  I can confirm that this is indeed the case by this
hacky patch:

:--- a/mm/hugetlb.c
; +++ b/mm/hugetlb.c
; @@ -6929,6 +6929,11 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
; 
;  		folio = alloc_hugetlb_folio(dst_vma, dst_addr, false);
;  		if (IS_ERR(folio)) {
; +			pte_t *actual_pte = hugetlb_walk(dst_vma, dst_addr, PMD_SIZE);
; +			if (actual_pte) {
; +				ret = -EEXIST;
; +				goto out;
; +			}
;  			ret = -ENOMEM;
;  			goto out;
;  		}

This code path gets triggered indicating that the PMD at which one thread
is trying to map a hugepage, gets filled by a racing thread.

Therefore, instead of using freepgs to compute the amount of memory, use
freepgs - (min(32, nr_cpus) - 1), so that the test still has some extra
hugepages to use.  The adjustment is a function of min(32, nr_cpus) - the
value of nr_parallel in the test - because in the worst case, nr_parallel
number of threads will try to map a hugepage on the same PMD, one will win
the allocation race, and the other nr_parallel - 1 threads will fail, so
we need extra nr_parallel - 1 hugepages to satisfy this request.  Note
that, in case the adjusted value underflows, there is a check for the
number of free hugepages in the test itself, which will fail:
get_free_hugepages() < bytes / page_size A negative value will be passed
on to bytes which is of type size_t, thus the RHS will become a large
value and the check will fail, so we are safe.

Link: https://lkml.kernel.org/r/20250909061531.57272-1-dev.jain@arm.com
Link: https://lkml.kernel.org/r/20250909061531.57272-2-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:19 -07:00
I Viswanath 79dfed0976 selftests/mm: use calloc instead of malloc in pagemap_ioctl.c
As per Documentation/process/deprecated.rst, dynamic size calculations
should not be performed in memory allocator arguments due to possible
overflows.

Replace malloc with calloc to avoid open-ended arithmetic and prevent
possible overflows.

Link: https://lkml.kernel.org/r/20250825170643.63174-1-viswanathiyyappan@gmail.com
Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed by: Donet Tom <donettom@linux.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:17 -07:00
Bala-Vignesh-Reddy a7498388b0 selftests: centralise maybe-unused definition in kselftest.h
Several selftests subdirectories duplicated the define __maybe_unused,
leading to redundant code.  Move to kselftest.h header and remove other
definitions.

This addresses the duplication noted in the proc-pid-vm warning fix

Link: https://lkml.kernel.org/r/20250821101159.2238-1-reddybalavignesh9979@gmail.com
Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Link:https://lore.kernel.org/lkml/20250820143954.33d95635e504e94df01930d0@linux-foundation.org/
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Mickal Salan <mic@digikod.net>	[landlock]
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:16 -07:00
ally heev 940b1be225 kselftest: mm: fix typos in test_vmalloc.sh
Fix simple typos in function name and console message.

Link: https://lkml.kernel.org/r/20250823170208.184149-1-allyheev@gmail.com
Signed-off-by: ally heev <allyheev@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:16 -07:00
Wei Yang c9615059ca selftests/mm: test that rmap behaves as expected
As David suggested, currently we don't have a high level test case to
verify the behavior of rmap.  This patch introduce the verification on
rmap by migration.

The general idea is if migrate one shared page between processes, this
would be reflected in all related processes.  Otherwise, we have problem
in rmap.

Currently it covers following four scenarios:

  * anonymous page
  * shmem page
  * pagecache page
  * ksm page

Link: https://lkml.kernel.org/r/20250819080047.10063-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:13 -07:00
Wei Yang b27f292de6 selftests/mm: put general ksm operation into vm_util
Patch series "test that rmap behaves as expected", v4.

As David suggested, currently we don't have a high level test case to
verify the behavior of rmap. This patch set introduce the verification
on rmap by migration.

Patch 1 is a preparation to move ksm related operations into vm_util.
Patch 2 is the new test case for rmap.

Currently it covers following four scenarios:

  * anonymous page
  * shmem page
  * pagecache page
  * ksm page


This patch (of 2):

There are some general ksm operations could be used by other related
test cases. Put them into vm_util for common use.

This is a preparation patch for later use.

Link: https://lkml.kernel.org/r/20250819080047.10063-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20250819080047.10063-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:13 -07:00
Zi Yan c55ed758e0 selftests/mm: check after-split folio orders in split_huge_page_test
Instead of just checking the existence of PMD folios before and after folio
split tests, use check_folio_orders() to check after-split folio orders.

The split ranges in split_thp_in_pagecache_to_order_at() are changed to
[addr, addr + pagesize) for every pmd_pagesize. It prevents folios within
the range being split multiple times due to debugfs split function always
perform splits with a pagesize step for a given range.

The following tests are not changed:
1. split_pte_mapped_thp: the test already uses kpageflags to check;
2. split_file_backed_thp: no vaddr available.

Link: https://lkml.kernel.org/r/20250818184622.1521620-6-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: wang lian <lianux.mm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:12 -07:00
Zi Yan fca418e59a selftests/mm: add check_after_split_folio_orders() helper
The helper gathers a folio order statistics of folios within a virtual
address range and checks it against a given order list. It aims to provide
a more precise folio order check instead of just checking the existence of
PMD folios.

The helper will be used the upcoming commit.

Link: https://lkml.kernel.org/r/20250818184622.1521620-5-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: wang lian <lianux.mm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:12 -07:00
Zi Yan bd66448f2a selftests/mm: reimplement is_backed_by_thp() with more precise check
and rename it to is_backed_by_folio().

is_backed_by_folio() checks if the given vaddr is backed a folio with
a given order. It does so by:
1. getting the pfn of the vaddr;
2. checking kpageflags of the pfn;

if order is greater than 0:
3. checking kpageflags of the head pfn;
4. checking kpageflags of all tail pfns.

pmd_order is added to split_huge_page_test.c and replaces max_order.

[ziy@nvidia.com: reduce code duplication, per David]
  Link: https://lkml.kernel.org/r/F54782D6-65A3-4D35-AE03-8ADE636EE258@nvidia.com
Link: https://lkml.kernel.org/r/20250818184622.1521620-4-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:12 -07:00
Zi Yan 72a07c0390 selftests/mm: mark all functions static in split_huge_page_test.c
All functions are only used within the file.

Link: https://lkml.kernel.org/r/20250818184622.1521620-3-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:11 -07:00
Usama Arif 6bb9614484 selftests: prctl: introduce tests for disabling THPs except for madvise
The test will set the global system THP setting to never, madvise or
always depending on the fixture variant and the 2M setting to inherit
before it starts (and reset to original at teardown).  The fixture setup
will also test if PR_SET_THP_DISABLE prctl call can be made with
PR_THP_DISABLE_EXCEPT_ADVISED and skip if it fails.

This tests if the process can:
- successfully get the policy to disable THPs expect for madvise.
- get hugepages only on MADV_HUGE and MADV_COLLAPSE if the global policy
  is madvise/always and only with MADV_COLLAPSE if the global policy is
  never.
- successfully reset the policy of the process.
- after reset, only get hugepages with:
  - MADV_COLLAPSE when policy is set to never.
  - MADV_HUGE and MADV_COLLAPSE when policy is set to madvise.
  - always when policy is set to "always".
- never get a THP with MADV_NOHUGEPAGE.
- repeat the above tests in a forked process to make sure  the policy is
  carried across forks.

Test results:
./prctl_thp_disable
TAP version 13
1..12
ok 1 prctl_thp_disable_completely.never.nofork
ok 2 prctl_thp_disable_completely.never.fork
ok 3 prctl_thp_disable_completely.madvise.nofork
ok 4 prctl_thp_disable_completely.madvise.fork
ok 5 prctl_thp_disable_completely.always.nofork
ok 6 prctl_thp_disable_completely.always.fork
ok 7 prctl_thp_disable_except_madvise.never.nofork
ok 8 prctl_thp_disable_except_madvise.never.fork
ok 9 prctl_thp_disable_except_madvise.madvise.nofork
ok 10 prctl_thp_disable_except_madvise.madvise.fork
ok 11 prctl_thp_disable_except_madvise.always.nofork
ok 12 prctl_thp_disable_except_madvise.always.fork

[usamaarif642@gmail.com: return after executing test in child process]
  Link: https://lkml.kernel.org/r/3dca2de4-9a6a-4efe-a86c-83f9509831fc@gmail.com
Link: https://lkml.kernel.org/r/20250815135549.130506-8-usamaarif642@gmail.com
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yafang <laoar.shao@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:06 -07:00
Usama Arif 681f45deca selftests: prctl: introduce tests for disabling THPs completely
The test will set the global system THP setting to never, madvise or
always depending on the fixture variant and the 2M setting to inherit
before it starts (and reset to original at teardown).  The fixture setup
will also test if PR_SET_THP_DISABLE prctl call can be made to disable all
THPs and skip if it fails.

This tests if the process can:
- successfully get the policy to disable THPs completely.
- never get a hugepage when the THPs are completely disabled
  with the prctl, including with MADV_HUGE and MADV_COLLAPSE.
- successfully reset the policy of the process.
- after reset, only get hugepages with:
  - MADV_COLLAPSE when policy is set to never.
  - MADV_HUGE and MADV_COLLAPSE when policy is set to madvise.
  - always when policy is set to "always".
- never get a THP with MADV_NOHUGEPAGE.
- repeat the above tests in a forked process to make sure
  the policy is carried across forks.

[usamaarif642@gmail.com: return after executing test in child process]
  Link: https://lkml.kernel.org/r/2d0ea708-ecba-4021-b6ca-e93f1413d60a@gmail.com
[usamaarif642@gmail.com: include linux/mman.h for prctl_thp_disable]
  Link: https://lkml.kernel.org/r/20250910204609.1720498-1-usamaarif642@gmail.com
  Link: https://lore.kernel.org/all/c8249725-e91d-4c51-b9bb-40305e61e20d@sirena.org.uk/
Link: https://lkml.kernel.org/r/20250815135549.130506-7-usamaarif642@gmail.com
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yafang <laoar.shao@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:06 -07:00
Usama Arif 49850bd026 selftest/mm: extract sz2ord function into vm_util.h
The function already has 2 uses and will have a 3rd one in prctl
selftests.  The pagesize argument is added into the function, as it's not
a global variable anymore.  No functional change intended with this patch.

Link: https://lkml.kernel.org/r/20250815135549.130506-6-usamaarif642@gmail.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yafang <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:06 -07:00
Aboorva Devarajan 19de1e5d11 selftests/mm: skip hugepage-mremap test if userfaultfd unavailable
Gracefully skip test if userfaultfd is not supported (ENOSYS) or not
permitted (EPERM), instead of failing.  This avoids misleading failures
with clear skip messages.

--------------
Before Patch
--------------
~ running ./hugepage-mremap
...
~ Bail out! userfaultfd: Function not implemented
~ Planned tests != run tests (1 != 0)
~ Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
~ [FAIL]
not ok 4 hugepage-mremap # exit=1

--------------
After Patch
--------------
~ running ./hugepage-mremap
...
~ ok 2 # SKIP userfaultfd is not supported/not enabled.
~ 1 skipped test(s) detected.
~ Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0
~ [SKIP]
ok 4 hugepage-mremap # SKIP

Link: https://lkml.kernel.org/r/20250816040113.760010-8-aboorvad@linux.ibm.com
Co-developed-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:04 -07:00
Aboorva Devarajan e36215431c selftests/mm: skip thuge-gen test if system is not setup properly
Make thuge-gen skip instead of fail when it can't run due to system
settings.  If shmmax is too small or no 1G huge pages are available, the
test now prints a warning and is marked as skipped.

-------------------
Before Patch:
-------------------
~ running ./thuge-gen
~ Bail out! Please do echo 262144 > /proc/sys/kernel/shmmax
~ Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
~ [FAIL]
not ok 28 thuge-gen ~ exit=1

-------------------
After Patch:
-------------------
~ running ./thuge-gen
~ ~ WARNING: shmmax is too small to run this test.
~ ~ Please run the following command to increase shmmax:
~ ~ echo 262144 > /proc/sys/kernel/shmmax
~ 1..0 ~ SKIP Test skipped due to insufficient shmmax value.
~ [SKIP]
ok 29 thuge-gen ~ SKIP

Link: https://lkml.kernel.org/r/20250816040113.760010-7-aboorvad@linux.ibm.com
Co-developed-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:04 -07:00
Aboorva Devarajan 2d941088f4 selftests/mm: fix child process exit codes in ksm_functional_tests
In ksm_functional_tests, test_child_ksm() returned negative values to
indicate errors.  However, when passed to exit(), these were interpreted
as large unsigned values (e.g, -2 became 254), leading to incorrect
handling in the parent process.  As a result, some tests appeared to be
skipped or silently failed.

This patch changes test_child_ksm() to return positive error codes (1, 2,
3) and updates test_child_ksm_err() to interpret them correctly. 
Additionally, test_prctl_fork_exec() now uses exit(4) after a failed
execv() to clearly signal exec failures.  This ensures the parent
accurately detects and reports child process failures.

--------------
Before patch:
--------------
- [RUN] test_unmerge
ok 1 Pages were unmerged
...
- [RUN] test_prctl_fork
- No pages got merged
- [RUN] test_prctl_fork_exec
ok 7 PR_SET_MEMORY_MERGE value is inherited
...
Bail out! 1 out of 8 tests failed
- Planned tests != run tests (9 != 8)
- Totals: pass:7 fail:1 xfail:0 xpass:0 skip:0 error:0

--------------
After patch:
--------------
- [RUN] test_unmerge
ok 1 Pages were unmerged
...
- [RUN] test_prctl_fork
- No pages got merged
not ok 7 Merge in child failed
- [RUN] test_prctl_fork_exec
ok 8 PR_SET_MEMORY_MERGE value is inherited
...
Bail out! 2 out of 9 tests failed
- Totals: pass:7 fail:2 xfail:0 xpass:0 skip:0 error:0

Link: https://lkml.kernel.org/r/20250816040113.760010-6-aboorvad@linux.ibm.com
Fixes: 6c47de3be3 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
Co-developed-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:04 -07:00
Donet Tom 7bc857ddee mm/selftests: fix split_huge_page_test failure on systems with 64KB page size
The split_huge_page_test fails on systems with a 64KB base page size. 
This is because the order of a 2MB huge page is different:

On 64KB systems, the order is 5.

On 4KB systems, it's 9.

The test currently assumes a maximum huge page order of 9, which is only
valid for 4KB base page systems.  On systems with 64KB pages, attempting
to split huge pages beyond their actual order (5) causes the test to fail.

In this patch, we calculate the huge page order based on the system's base
page size.  With this change, the tests now run successfully on both 64KB
and 4KB page size systems.

Link: https://lkml.kernel.org/r/20250816040113.760010-5-aboorvad@linux.ibm.com
Fixes: fa6c02315f ("mm: huge_memory: a new debugfs interface for splitting THP tests")
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:03 -07:00
Donet Tom 08c907c5bc selftest/mm: fix ksm_funtional_test failures
This patch fixes 2 issues.

1) After fork() in test_prctl_fork, the child process uses the file
   descriptors from the parent process to read ksm_stat and
   ksm_merging_pages.  This results in incorrect values being read (parent
   process ksm_stat and ksm_merging_pages will be read in child), causing
   the test to fail.

   This patch calls init_global_file_handles() in the child process to
   ensure that the current process's file descriptors are used to read
   ksm_stat and ksm_merging_pages.

2) All tests currently call ksm_merge to trigger page merging.  To
   ensure the system remains in a consistent state for subsequent tests,
   it is better to call ksm_unmerge during the test cleanup phase

   In the test_prctl_fork test, after a fork(), reading
   ksm_merging_pages in the child process returns a non-zero value because
   a previous test performed a merge, and the child's memory state is
   inherited from the parent.

   Although the child process calls ksm_unmerge, the ksm_merging_pages
   counter in the parent is reset to zero, while the child's counter
   remains unchanged.  This discrepancy causes the test to fail.

   To avoid this issue, each test should call ksm_unmerge during
   cleanup to ensure the counter is reset and the system is in a clean
   state for subsequent tests.

execv argument is an array of pointers to null-terminated strings.  In
this patch we also added NULL in the execv argument.

Link: https://lkml.kernel.org/r/20250816040113.760010-4-aboorvad@linux.ibm.com
Fixes: 6c47de3be3 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:03 -07:00
Donet Tom 0ef3783d75 selftests/mm: add support to test 4PB VA on PPC64
PowerPC64 supports a 4PB virtual address space, but this test was
previously limited to 512TB.  This patch extends the coverage up to the
full 4PB VA range on PowerPC64.

Memory from 0 to 128TB is allocated without an address hint, while
allocations from 128TB to 4PB use a hint address.

Link: https://lkml.kernel.org/r/20250816040113.760010-3-aboorvad@linux.ibm.com
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:03 -07:00
Donet Tom eda0bf339b mm/selftests: fix incorrect pointer being passed to mark_range()
Patch series "selftests/mm: Fix false positives and skip unsupported
tests", v4.

This patch series addresses false positives in the generic mm selftests
and skips tests that cannot run correctly due to missing features or
system limitations.


This patch (of 7):

In main(), the high address is stored in hptr, but for mark_range(), the
address passed is ptr, not hptr.  Fixed this by changing ptr[i] to hptr[i]
in mark_range() function call.

Link: https://lkml.kernel.org/r/20250816040113.760010-1-aboorvad@linux.ibm.com
Link: https://lkml.kernel.org/r/20250816040113.760010-2-aboorvad@linux.ibm.com
Fixes: b2a79f6213 ("selftests/mm: virtual_address_range: unmap chunks after validation")
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:03 -07:00
Enze Li a3f451ad33 selftests/damon/access_memory_even: remove unused header file
Since the time.h header file is not actually needed in this code, we can
safely remove its inclusion.

Link: https://lkml.kernel.org/r/20250814125417.659937-1-lienze@kylinos.cn
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:02 -07:00
Alexandre Ghiti 9d246d7410 selftests/damon: fix damon selftests by installing _common.sh
_common.sh was recently introduced but is not installed and then triggers
an error when trying to run the damon selftests:

selftests: damon: sysfs.sh
./sysfs.sh: line 4: _common.sh: No such file or directory

Install this file to avoid this error.

Link: https://lkml.kernel.org/r/20250812-alex-fixes_manual-v1-1-c4e99b1f80e4@rivosinc.com
Fixes: 511914506d ("selftests/damon: introduce _common.sh to host shared function")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Sang-Heon Jeon <ekffu200098@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Tested-by: Enze Li <lienze@kylinos.cn>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:59 -07:00
Colin Ian King 53c225ffa7 selftests/mm: fix spelling mistake "mrmeap" -> "mremap"
There are spelling mistakes in perror messages.  Fix these.

Link: https://lkml.kernel.org/r/20250813081333.1978096-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:59 -07:00
Lorenzo Stoakes 8166353fb8 mm: replace mm->flags with bitmap entirely and set to 64 bits
Now we have updated all users of mm->flags to use the bitmap accessors,
repalce it with the bitmap version entirely.

We are then able to move to having 64 bits of mm->flags on both 32-bit and
64-bit architectures.

We also update the VMA userland tests to ensure that everything remains
functional there.

No functional changes intended, other than there now being 64 bits of
available mm_struct flags.

Link: https://lkml.kernel.org/r/e1f6654e016d36c43959764b01355736c5cbcdf8.1755012943.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Namhyung kim <namhyung@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:58 -07:00
Wei Yang 4a12633e87 selftests/mm: do check_huge_anon() with a number been passed in
Currently it hard codes the number of hugepage to check for
check_huge_anon(), but it would be more reasonable to do the check based
on a number passed in.

Pass in the hugepage number and do the check based on it.

Link: https://lkml.kernel.org/r/20250809194209.30484-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:55 -07:00
Sang-Heon Jeon 10725cd2b0 selftests/damon: test no-op commit broke DAMON status
Add test to verify that DAMON status is not changed after a no-op commit.

[ekffu200098@gmail.com: change wrong json.dump usage to json.dumps]
  Link: https://lkml.kernel.org/r/20250816014033.190451-1-ekffu200098@gmail.com
Link: https://lkml.kernel.org/r/20250810124354.16456-1-ekffu200098@gmail.com
Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:55 -07:00
Mike Rapoport (Microsoft) 801295be01 selftest/kho: update generation of initrd
Use nolibc include directory rather than include a cumulative nolibc.h on
the compiler command line and replace use of 'sudo cpio' with
usr/gen_init_cpio.

While on it fix spelling of KHO_FINALIZE

Link: https://lkml.kernel.org/r/20250811082510.4154080-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:55 -07:00
David Hildenbrand 4c89792ea0 mm: rename vm_ops->find_special_page() to vm_ops->find_normal_page()
...  and hide it behind a kconfig option.  There is really no need for any
!xen code to perform this check.

The naming is a bit off: we want to find the "normal" page when a PTE was
marked "special".  So it's really not "finding a special" page.

Improve the documentation, and add a comment in the code where XEN ends up
performing the pte_mkspecial() through a hypercall.  More details can be
found in commit 923b2919e2 ("xen/gntdev: mark userspace PTEs as special
on x86 PV guests").

Link: https://lkml.kernel.org/r/20250811112631.759341-12-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Juegren Gross <jgross@suse.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:53 -07:00
Suren Baghdasaryan 41f1055816 selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified
Patch series " execute PROCMAP_QUERY ioctl under per-vma lock", v4.

With /proc/pid/maps now being read under per-vma lock protection we can
reuse parts of that code to execute PROCMAP_QUERY ioctl also without
taking mmap_lock.  The change is designed to reduce mmap_lock contention
and prevent PROCMAP_QUERY ioctl calls from blocking address space updates.

This patchset was split out of the original patchset [1] that introduced
per-vma lock usage for /proc/pid/maps reading.  It contains PROCMAP_QUERY
tests, code refactoring patch to simplify the main change and the actual
transition to per-vma lock.


This patch (of 3):

Extend /proc/pid/maps tearing tests to verify PROCMAP_QUERY ioctl operation
correctness while the vma is being concurrently modified.

Link: https://lkml.kernel.org/r/20250808152850.2580887-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250808152850.2580887-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: SeongJae Park <sj@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: T.J. Mercier <tjmercier@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Ye Bin <yebin10@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:48 -07:00
Pranav Tyagi 6de1ef1ca3 selftests/mm: use __auto_type in swap() macro
Replace typeof() with __auto_type in the swap() macro in uffd-stress.c. 
__auto_type was introduced in GCC 4.9 and reduces the compile time for all
compilers.  No functional changes intended.

Link: https://lkml.kernel.org/r/20250730142301.6754-1-pranav.tyagi03@gmail.com
Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:44 -07:00
Sudarsan Mahendran 35edbaa04a selftests/mm: pass filename as input param to VM_PFNMAP tests
Enable these tests to be run on other pfnmap'ed memory like NVIDIA's EGM.

Add '--' as a separator to pass in file path.  This allows passing of cmd
line arguments to kselftest_harness.  Use '/dev/mem' as default filename.

Existing test passes:
	pfnmap
	TAP version 13
	1..6
	# Starting 6 tests from 1 test cases.
	# PASSED: 6 / 6 tests passed.
	# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

Pass params to kselftest_harness:
	pfnmap -r pfnmap:mremap_fixed
	TAP version 13
	1..1
	# Starting 1 tests from 1 test cases.
	#  RUN           pfnmap.mremap_fixed ...
	#            OK  pfnmap.mremap_fixed
	ok 1 pfnmap.mremap_fixed
	# PASSED: 1 / 1 tests passed.
	# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

Pass non-existent file name as input:
	pfnmap -- /dev/blah
	TAP version 13
	1..6
	# Starting 6 tests from 1 test cases.
	#  RUN           pfnmap.madvise_disallowed ...
	#      SKIP      Cannot open '/dev/blah'

Pass non pfnmap'ed file as input:
	pfnmap -r pfnmap.madvise_disallowed -- randfile.txt
	TAP version 13
	1..1
	# Starting 1 tests from 1 test cases.
	#  RUN           pfnmap.madvise_disallowed ...
	#      SKIP      Invalid file: 'randfile.txt'. Not pfnmap'ed

Link: https://lkml.kernel.org/r/20250805013629.47629-1-sudarsanm@google.com
Signed-off-by: Sudarsan Mahendran <sudarsanm@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:54:43 -07:00
Sean Christopherson a001cd248a rseq/selftests: Use weak symbol reference, not definition, to link with glibc
Add "extern" to the glibc-defined weak rseq symbols to convert the rseq
selftest's usage from weak symbol definitions to weak symbol _references_.
Effectively re-defining the glibc symbols wreaks havoc when building with
-fno-common, e.g. generates segfaults when running multi-threaded programs,
as dynamically linked applications end up with multiple versions of the
symbols.

Building with -fcommon, which until recently has the been the default for
GCC and clang, papers over the bug by allowing the linker to resolve the
weak/tentative definition to glibc's "real" definition.

Note, the symbol itself (or rather its address), not the value of the
symbol, is set to 0/NULL for unresolved weak symbol references, as the
symbol doesn't exist and thus can't have a value.  Check for a NULL rseq
size pointer to handle the scenario where the test is statically linked
against a libc that doesn't support rseq in any capacity.

Fixes: 3bcbc20942 ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/all/87frdoybk4.ffs@tglx
2025-09-13 19:51:59 +02:00
Jakub Kicinski bd569dd935 netfilter pull request nf-next-25-09-11
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmjC0WsNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AOC2D/982GOJGI79ArhsM9FzEiJDp2QLxMnz7cw0t1Vg
 pU469JYA/He6VHMbhiqCBopSAskl36kMMGLbflPe+Zc1DNjconFD8wrDRBlUI+EI
 lR1dERMUyafuRMuuvKaE7A0ePyR7AD1JFf1znpo6BrRDMX8mbumcIA/L7/T1Rz2F
 0b9Phbj77EdKruRgD4eXCuDJibOr5j6FymNvDGR/bm3SovOsOM4gi2dQ3Y3B85ZK
 aUDDrss5acypBfu36w+340eMtr1GJKoaCKoK9ScnO58peCSo9C0b+o6xwUBPnHdw
 sQOhl6fErOfN+agGir357jLsBUb1H7Tg9decwv9qHp3jESei3kgAyoBF4SqH+48S
 Bz1E2M0A0Hxve3/I/Pp1Hz6X5r+V+gDtx71z4+acoZg5YjDMAKFJAfgiK6sFKqZP
 73QsvnU2omu2C5r8TAbryv01C5xmkbe3YyU8j4taZOBGk4YkPd5WjJELbTcVM4If
 dxDE6vQ9+GrrlBl9aidrMNxIhJuS/qjVIb0tv7oZPVql995AgPJG3F3gSO78OPBj
 V4dN9p/wPN9lRl5tj/odWiOfsdQiY7fEkqdXlO79sACQCzXC5O6zAwcnK37s80mQ
 QFSt3oiPFcacJJyHHYeYISy13HXlxwUgdxWC6T/sXMU5Jpic+3bWFxC2YgRcIsqZ
 ZE1JbA==
 =FXmW
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter: updates for net-next

1) Don't respond to ICMP_UNREACH errors with another ICMP_UNREACH
   error.
2) Support fetching the current bridge ethernet address.
   This allows a more flexible approach to packet redirection
   on bridges without need to use hardcoded addresses. From
   Fernando Fernandez Mancera.
3) Zap a few no-longer needed conditionals from ipvs packet path
   and convert to READ/WRITE_ONCE to avoid KCSAN warnings.
   From Zhang Tengfei.
4) Remove a no-longer-used macro argument in ipset, from Zhen Ni.

* tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_reject: don't reply to icmp error messages
  ipvs: Use READ_ONCE/WRITE_ONCE for ipvs->enable
  netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR support
  netfilter: ipset: Remove unused htable_bits in macro ahash_region
  selftest:net: fixed spelling mistakes
====================

Link: https://patch.msgid.link/20250911143819.14753-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-12 17:06:25 -07:00
Petr Machata dbd9134792 selftests: forwarding: Add test for BR_BOOLOPT_FDB_LOCAL_VLAN_0
Add a selftest to check the operation of this newly-introduced bridge
option.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/62294f96884ab5d341648eef21243fa099a2dee5.1757004393.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 19:02:51 -07:00
Petr Machata fa57032941 selftests: net: lib.sh: Don't defer failed commands
Usually the autodefer helpers in lib.sh are expected to be run in context
where success is the expected outcome. However when using them for feature
detection, failure can legitimately occur. But the failed command still
schedules a cleanup, which will likely fail again.

Instead, only schedule deferred cleanup when the positive command succeeds.

This way of organizing the cleanup has the added benefit that now the
return code from these functions reflects whether the command passed.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/af10a5bb82ea11ead978cf903550089e006d7e70.1757004393.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 19:02:51 -07:00
Petr Machata ed07c8f2b8 selftests: defer: Introduce DEFER_PAUSE_ON_FAIL
The fact that all cleanup (ideally) goes through the defer framework makes
debugging of these commands a bit tricky. However, this also gives us a
nice point to place a hook along the lines of PAUSE_ON_FAIL. When the
environment variable DEFER_PAUSE_ON_FAIL is set, and a cleanup command
results in non-zero exit status, show a bit of debuginfo and give the user
an opportunity to interrupt the execution altogether.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/2a07d24568ede6c42e4701657fa0b738e490fe59.1757004393.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 19:02:51 -07:00
Petr Machata d89d3b29ce selftests: defer: Allow spaces in arguments of deferred commands
Currently the way deferred commands are stored and invoked causes any
whitespace to act as an argument separator when the command is executed.
To make it possible to use spaces in deferred commands, store the commands
quoted, and then eval the string prior to execution.

Fixes: a6e263f125 ("selftests: net: lib: Introduce deferred commands")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/6c2523139a6f99103889c9c9fedcdc66a75441f4.1757004393.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 19:02:50 -07:00
Jason A. Donenfeld ff78bfe48b wireguard: selftests: select CONFIG_IP_NF_IPTABLES_LEGACY
This is required on recent kernels, where it is now off by default.
While we're here, fix some stray =m's that were supposed to be =y.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250910013644.4153708-5-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 18:52:21 -07:00
David Hildenbrand 30e1a1dfa2 wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config
It's no longer user-selectable (and the default was already "y"), so
let's just drop it.

It was never really relevant to the wireguard selftests either way.

Cc: Shuah Khan <shuah@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250910013644.4153708-4-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 18:52:21 -07:00
David Ahern 2f186dd558 selftests: Replace sleep with slowwait
Replace the sleep in kill_procs with slowwait.

Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250910025828.38900-2-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 17:57:21 -07:00
David Ahern 53d591730e selftests: Disable dad for ipv6 in fcnal-test.sh
Constrained test environment; duplicate address detection is not needed
and causes races so disable it.

Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250910025828.38900-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 17:57:21 -07:00
Jakub Kicinski fc3a281041 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc6).

Conflicts:

net/netfilter/nft_set_pipapo.c
net/netfilter/nft_set_pipapo_avx2.c
  c4eaca2e10 ("netfilter: nft_set_pipapo: don't check genbit from packetpath lookups")
  84c1da7b38 ("netfilter: nft_set_pipapo: use avx2 algorithm for insertions too")

Only trivial adjacent changes (in a doc and a Makefile).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11 17:40:13 -07:00
Puranjay Mohan 86f2225065 selftests/bpf: Add tests for arena fault reporting
Add selftests for testing the reporting of arena page faults through BPF
streams. Two new bpf programs are added that read and write to an
unmapped arena address and the fault reporting is verified in the
userspace through streams.

The added bpf programs need to access the user_vm_start in struct
bpf_arena, this is done by casting &arena to struct bpf_arena *, but
barrier_var() is used on this ptr before accessing ptr->user_vm_start;
to stop GCC from issuing an out-of-bound access due to the cast from
smaller map struct to larger "struct bpf_arena"

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250911145808.58042-7-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11 13:00:44 -07:00
Puranjay Mohan edd03fcd76 selftests: bpf: use __stderr in stream error tests
Start using __stderr directly in the bpf programs to test the reporting
of may_goto timeout detection and spin_lock dead lock detection.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250911145808.58042-6-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11 13:00:44 -07:00
Puranjay Mohan 744eeb2b27 selftests: bpf: introduce __stderr and __stdout
Add __stderr and __stdout to validate the output of BPF streams for bpf
selftests. Similar to __xlated, __jited, etc., __stderr/out can be used
in the BPF progs to compare a string (regex supported) to the output in
the bpf streams.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250911145808.58042-5-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11 13:00:44 -07:00
Alexei Starovoitov 5d87e96a49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5
Cross-merge BPF and other fixes after downstream PR.

No conflicts.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11 09:34:37 -07:00
Sean Christopherson aebc62b3de KVM: selftests: Add support for DIV and IDIV in the fastops test
Extend the fastops test coverage to DIV and IDIV, specifically to provide
coverage for #DE (divide error) exceptions, as #DE is the only exception
that can occur in KVM's fastops path, i.e. that requires exception fixup.

Link: https://lore.kernel.org/r/20250909202835.333554-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11 08:55:44 -07:00
Sean Christopherson fe08478b1d KVM: selftests: Dedup the gnarly constraints of the fastops tests (more macros!)
Add a fastop() macro along with macros to define its required constraints,
and use the macros to dedup the innermost guts of the fastop testcases.

No functional change intended.

Link: https://lore.kernel.org/r/20250909202835.333554-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11 08:55:44 -07:00
Sean Christopherson 9bf5da1ca4 KVM: selftests: Add coverage for 'b' (byte) sized fastops emulation
Extend the fastops test to cover instructions that operate on 8-bit data.
Support for 8-bit instructions was omitted from the original commit purely
due to complications with BT not having a r/m8 variant.  To keep the
RFLAGS.CF behavior deterministic and not heavily biased to '0' or '1',
continue using BT, but cast and load the to-be-tested value into a
dedicated 32-bit constraint.

Supporting 8-bit operations will allow using guest_test_fastops() as-is to
provide full coverage for DIV and IDIV.  For divide operations, covering
all operand sizes _is_ interesting, because KVM needs provide exception
fixup for each size (failure to handle a #DE could panic the host).

Link: https://lore.kernel.org/all/aIF7ZhWZxlkcpm4y@google.com
Link: https://lore.kernel.org/r/20250909202835.333554-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11 08:55:44 -07:00
Sean Christopherson 7b39b6c769 KVM: selftests: Add support for #DE exception fixup
Add support for handling #DE (divide error) exceptions in KVM selftests
so that the fastops test can verify KVM correctly handles #DE when
emulating DIV or IDIV on behalf of the guest.  Morph #DE to 0xff (i.e.
to -1) as a mostly-arbitrary vector to indicate #DE, so that '0' (the
real #DE vector) can still be used to indicate "no exception".

Link: https://lore.kernel.org/r/20250909202835.333554-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11 08:55:44 -07:00
Linus Torvalds db87bd2ad1 Including fixes from CAN, netfilter and wireless.
We have an IPv6 routing regression with the relevant fix still
 a WiP. This v2 includes a last-minute revert to avoid more
 problems.
 
 Current release - new code bugs:
 
   - wifi: nl80211: completely disable per-link stats for now
 
 Previous releases - regressions:
 
   - dev_ioctl: take ops lock in hwtstamp lower paths
 
   - netfilter:
     - fix spurious set lookup failures
     - fix lockdep splat due to missing annotation
 
   - genetlink: fix genl_bind() invoking bind() after -EPERM
 
   - phy: transfer phy_config_inband() locking responsibility to phylink
 
   - can: xilinx_can: fix use-after-free of transmitted SKB
 
   - hsr: fix lock warnings
 
   - eth: igb: fix NULL pointer dereference in ethtool loopback test
 
   - eth: i40e: fix Jumbo Frame support after iPXE boot
 
   - eth: macsec: sync features on RTM_NEWLINK
 
 Previous releases - always broken:
 
   - tunnels: reset the GSO metadata before reusing the skb
 
   - mptcp: make sync_socket_options propagate SOCK_KEEPOPEN
 
   - can: j1939: implement NETDEV_UNREGISTER notification hanidler
 
   - wifi: ath12k: fix WMI TLV header misalignment
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmjC4tYSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkNwcQAIh8d10PYJ6iJvyxqBKVNT2CTr4wVk27
 lT9rB9t4Jdq4EhuDdRmolVPrkXEt4FiGnqPOsYIK+tPN5j6fgPaksA/lCKr3HMhT
 N7qfhCzUFUJQqZEA3pBy5RnRIzNupdTP3rcXFG8QPnGfzdcOZ2m1Tu/36rjt6lmE
 lB+9pLQBjI15r67v08ZzEGHfTX4FqlnlCu/jbcYhXNF6erv3jZRboneytJ3fxbMW
 kRdmi9wctMprKmWVFmaA0OPkwigMBO8xILnYOCcFhQcGKLugc58YGsjzBJbp/yUA
 Qmxb3Gl9pFe97u/URoLPXUE+2hF1X7ydT9hMXrle/gcXVnmU6rN+4xQuEzVPBOK7
 J3qh8IH5joMLaQS4VKSD/Wh0RrtDHJdduLDcjtL8qRkFyS30FqY0USd8prMkIE7I
 /s9Wdi96nC3WEZQtZKg0mRrSeTtbRR0/KO5gEb6MHaAE8ffKJa7MsMZlrh9VoJvM
 PC4PJnOr0Qy5MNAA1MxBWK6Dcsnor4cBkvE35uMoCkUoxOeTnKuruo0kyZYqmrIh
 VOnKrb3+w/I4lDXgQ4kIcpyQfrqMky8R6EGvQvjm2c2OLWZBCmBRU1c0Fjw789O4
 jMdzjs7H9ild2KwBZK4e7KfdoUnlxIza2Yh4pFLUf2R3NqBZ2vQTkpoCF4AK+wzU
 636F5GITLcVe
 =XjZM
 -----END PGP SIGNATURE-----

Merge tag 'net-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from CAN, netfilter and wireless.

  We have an IPv6 routing regression with the relevant fix still a WiP.
  This includes a last-minute revert to avoid more problems.

  Current release - new code bugs:

   - wifi: nl80211: completely disable per-link stats for now

  Previous releases - regressions:

   - dev_ioctl: take ops lock in hwtstamp lower paths

   - netfilter:
       - fix spurious set lookup failures
       - fix lockdep splat due to missing annotation

   - genetlink: fix genl_bind() invoking bind() after -EPERM

   - phy: transfer phy_config_inband() locking responsibility to phylink

   - can: xilinx_can: fix use-after-free of transmitted SKB

   - hsr: fix lock warnings

   - eth:
       - igb: fix NULL pointer dereference in ethtool loopback test
       - i40e: fix Jumbo Frame support after iPXE boot
       - macsec: sync features on RTM_NEWLINK

  Previous releases - always broken:

   - tunnels: reset the GSO metadata before reusing the skb

   - mptcp: make sync_socket_options propagate SOCK_KEEPOPEN

   - can: j1939: implement NETDEV_UNREGISTER notification hanidler

   - wifi: ath12k: fix WMI TLV header misalignment"

* tag 'net-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  Revert "net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups"
  hsr: hold rcu and dev lock for hsr_get_port_ndev
  hsr: use hsr_for_each_port_rtnl in hsr_port_get_hsr
  hsr: use rtnl lock when iterating over ports
  wifi: nl80211: completely disable per-link stats for now
  net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups
  net: ethtool: fix wrong type used in struct kernel_ethtool_ts_info
  MAINTAINERS: add Phil as netfilter reviewer
  netfilter: nf_tables: restart set lookup on base_seq change
  netfilter: nf_tables: make nft_set_do_lookup available unconditionally
  netfilter: nf_tables: place base_seq in struct net
  netfilter: nft_set_rbtree: continue traversal if element is inactive
  netfilter: nft_set_pipapo: don't check genbit from packetpath lookups
  netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation
  can: rcar_can: rcar_can_resume(): fix s2ram with PSCI
  can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB
  can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails
  can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed
  can: j1939: implement NETDEV_UNREGISTER notification handler
  selftests: can: enable CONFIG_CAN_VCAN as a module
  ...
2025-09-11 08:54:42 -07:00
Linus Torvalds 02ffd6f89c bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjB/6gACgkQ6rmadz2v
 bToSCg//Z8Q2ToTV/BOLTzFYLvcTm2YRlqqIe3SFxyxLQCIhC0kxQAT94baVQHky
 /6ASbPjDWXdGVHNoMopA6lpMx22Tq4xi6qO5fzJHDuSqh5KTi8l5/GyJeA3egPzD
 7RIvKvvgePpCx0xm9rm5O5vvUeFrsxhQPRRiN/fsOibiTJjBpRAJDp9k+pvnK6mb
 HaZcHF+In5Vg7XozuHAUMzsp+4njzdLrMXL2Q54o2MrIoeBg8/oAnhLujskGMnXK
 mgUA+skW42IEkw+TYUu9888/5PMDkto3BZIx0plcAIVAIvcU5BFzLt11llQswgVl
 q740k50oRKrmwHyEVDwugV7WeGQMks48lMHtLKytYmdEhdTfEYUKHeBpcI87fUYy
 IpOdSUT49nBxOmGl59ccBcdzsndTjo7Zrl7dMf4umN0SSjfdohwj0uu7rmZCaOdd
 m/TxH13Ae7na4QzVx0N911qxBYw07uYNiq3Ati+x327ySozvvNfLIYK/sS/clJkd
 lOpz3kpjwgV+PUfv2NBqEJm4nSPTtW7fiEQ8p/yBvK90nB6NnHIbq9a2rPBKeDKx
 RpkDB9nhJ1J6OMKWkUDasMiP5tAXt9RrI+la/CgBxMcP/G4HxH6yf22Mf3Hzhe8V
 UfjAgHqXvmrjqpgIbO0AVkfwDvlOM37DGvY0H3bMFeOXCk0DDw4=
 =09/9
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:
 "A number of fixes accumulated due to summer vacations

   - Fix out-of-bounds dynptr write in bpf_crypto_crypt() kfunc which
     was misidentified as a security issue (Daniel Borkmann)

   - Update the list of BPF selftests maintainers (Eduard Zingerman)

   - Fix selftests warnings with icecc compiler (Ilya Leoshkevich)

   - Disable XDP/cpumap direct return optimization (Jesper Dangaard
     Brouer)

   - Fix unexpected get_helper_proto() result in unusual configuration
     BPF_SYSCALL=y and BPF_EVENTS=n (Jiri Olsa)

   - Allow fallback to interpreter when JIT support is limited (KaFai
     Wan)

   - Fix rqspinlock and choose trylock fallback for NMI waiters. Pick
     the simplest fix. More involved fix is targeted bpf-next (Kumar
     Kartikeya Dwivedi)

   - Fix cleanup when tcp_bpf_send_verdict() fails to allocate
     psock->cork (Kuniyuki Iwashima)

   - Disallow bpf_timer in PREEMPT_RT for now. Proper solution is being
     discussed for bpf-next. (Leon Hwang)

   - Fix XSK cq descriptor production (Maciej Fijalkowski)

   - Tell memcg to use allow_spinning=false path in bpf_timer_init() to
     avoid lockup in cgroup_file_notify() (Peilin Ye)

   - Fix bpf_strnstr() to handle suffix match cases (Rong Tao)"

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Skip timer cases when bpf_timer is not supported
  bpf: Reject bpf_timer for PREEMPT_RT
  tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()
  bpf: Allow fall back to interpreter for programs with stack size <= 512
  rqspinlock: Choose trylock fallback for NMI waiters
  xsk: Fix immature cq descriptor production
  bpf: Update the list of BPF selftests maintainers
  selftests/bpf: Add tests for bpf_strnstr
  selftests/bpf: Fix "expression result unused" warnings with icecc
  bpf: Fix bpf_strnstr() to handle suffix match cases better
  selftests/bpf: Extend crypto_sanity selftest with invalid dst buffer
  bpf: Fix out-of-bounds dynptr write in bpf_crypto_crypt
  bpf: Check the helper function is valid in get_helper_proto
  bpf, cpumap: Disable page_pool direct xdp_return need larger scope
2025-09-11 07:54:16 -07:00
Andres Urian Florez 496a6ed840 selftest:net: fixed spelling mistakes
Fixed spelling errors in test_redirect6() error message and
test_port_shadowing() comments

Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-11 15:40:55 +02:00
Ido Schimmel f7240999de selftests: traceroute: Add VRF tests
Create versions of the existing test cases where the routers generating
the ICMP error messages are using VRFs. Check that the source IPs of
these messages do not change in the presence of VRFs.

IPv6 always behaved correctly, but IPv4 fails when reverting "ipv4:
icmp: Fix source IP derivation in presence of VRFs".

Without IPv4 change:

 # ./traceroute.sh
 TEST: IPv6 traceroute                                               [ OK ]
 TEST: IPv6 traceroute with VRF                                      [ OK ]
 TEST: IPv4 traceroute                                               [ OK ]
 TEST: IPv4 traceroute with VRF                                      [FAIL]
         traceroute did not return 1.0.3.1
 $ echo $?
 1

The test fails because the ICMP error message is sent with the VRF
device's IP (1.0.4.1):

 # traceroute -n -s 1.0.1.3 1.0.2.4
 traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
  1  1.0.4.1  0.165 ms  0.110 ms  0.103 ms
  2  1.0.2.4  0.098 ms  0.085 ms  0.078 ms
 # traceroute -n -s 1.0.3.3 1.0.2.4
 traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
  1  1.0.4.1  0.201 ms  0.138 ms  0.129 ms
  2  1.0.2.4  0.123 ms  0.105 ms  0.098 ms

With IPv4 change:

 # ./traceroute.sh
 TEST: IPv6 traceroute                                               [ OK ]
 TEST: IPv6 traceroute with VRF                                      [ OK ]
 TEST: IPv4 traceroute                                               [ OK ]
 TEST: IPv4 traceroute with VRF                                      [ OK ]
 $ echo $?
 0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-9-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 12:22:38 +02:00
Ido Schimmel 2e6428100b selftests: traceroute: Test traceroute with different source IPs
When generating ICMP error messages, the kernel will prefer a source IP
that is on the same subnet as the destination IP (see
inet_select_addr()). Test this behavior by invoking traceroute with
different source IPs and checking that the ICMP error message is
generated with a source IP in the same subnet.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-8-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 12:22:38 +02:00
Ido Schimmel 5c9c78224f selftests: traceroute: Reword comment
Both of the addresses are configured as primary addresses, but the
kernel is expected to choose 10.0.1.1/24 as the source IP of the ICMP
error message since it is on the same subnet as the destination IP of
the message (10.0.1.3/24). Reword the comment to reflect that.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-7-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 12:22:38 +02:00
Ido Schimmel 47efbac9b7 selftests: traceroute: Use require_command()
Use require_command() so that the test will return SKIP (4) when a
required command is not present.

Before:

 # ./traceroute.sh
 SKIP: Could not run IPV6 test without traceroute6
 SKIP: Could not run IPV4 test without traceroute
 $ echo $?
 0

After:

 # ./traceroute.sh
 TEST: traceroute6 not installed                                    [SKIP]
 $ echo $?
 4

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-6-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 12:22:38 +02:00
Ido Schimmel c068ba9d3d selftests: traceroute: Return correct value on failure
The test always returns success even if some tests were modified to
fail. Fix by converting the test to use the appropriate library
functions instead of using its own functions.

Before:

 # ./traceroute.sh
 TEST: IPV6 traceroute                                               [FAIL]
 TEST: IPV4 traceroute                                               [ OK ]

 Tests passed:   1
 Tests failed:   1
 $ echo $?
 0

After:

 # ./traceroute.sh
 TEST: IPv6 traceroute                                               [FAIL]
         traceroute6 did not return 2000:102::2
 TEST: IPv4 traceroute                                               [ OK ]
 $ echo $?
 1

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-5-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 12:22:38 +02:00
Marc Harvey db1b600666 selftests: net: Add tests to verify team driver option set and get.
There are currently no kernel tests that verify setting and getting
options of the team driver.

In the future, options may be added that implicitly change other
options, which will make it useful to have tests like these that show
nothing breaks. There will be a follow up patch to this that adds new
"rx_enabled" and "tx_enabled" options, which will implicitly affect the
"enabled" option value and vice versa.

The tests use teamnl to first set options to specific values and then
gets them to compare to the set values.

Signed-off-by: Marc Harvey <marcharvey@google.com>
Link: https://patch.msgid.link/20250905040441.2679296-1-marcharvey@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-11 11:07:55 +02:00
Jakub Kicinski ccf78f7f05 linux-can-fixes-for-6.17-20250910
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEn/sM2K9nqF/8FWzzDHRl3/mQkZwFAmjBpO8THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAMdGXf+ZCRnJOOB/9r35uZBD1+ZZVpvoByhil8XY7K42Vv
 LrARJ8+kh1Q6mcuB4V/176GBPJ7JabSUTGqvqvYbJBAB5obONhLU5Bkb/7lM2z1G
 MOUe0DMiBP+J3K8zqXcEKhqeW0qQO6pqfnv8iIj97Kxwd5JRPJq8pMpCIC/RLzOY
 Ot7+fcsE+cSZt9VumnQTx/S1fzGoY1C9VCglSr9B0O6IiTZZBIPNg8EwIGvbm7oh
 uofnFRocRZaWQ6OErdRSTLK4hu11DeFUmEAA6YN2ikpAdshUsf3U8HYaz2LIEEuu
 zGflFXrMekXqDod1SvhklLPUY/ssC11dUon+4Cgz/yK6I+lFvD/cuLIP
 =/YxL
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.17-20250910' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-09-10

The 1st patch is by Alex Tran and fixes the Documentation of the
struct bcm_msg_head.

Davide Caratti's patch enabled the VCAN driver as a module for the
Linux self tests.

Tetsuo Handa contributes 3 patches that fix various problems in the
CAN j1939 protocol.

Anssi Hannula's patch fixes a potential use-after-free in the
xilinx_can driver.

Geert Uytterhoeven's patch fixes the rcan_can's suspend to RAM on
R-Car Gen3 using PSCI.

* tag 'linux-can-fixes-for-6.17-20250910' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: rcar_can: rcar_can_resume(): fix s2ram with PSCI
  can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB
  can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails
  can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed
  can: j1939: implement NETDEV_UNREGISTER notification handler
  selftests: can: enable CONFIG_CAN_VCAN as a module
  docs: networking: can: change bcm_msg_head frames member to support flexible array
====================

Link: https://patch.msgid.link/20250910162907.948454-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-10 19:29:40 -07:00
Jakub Kicinski 15c068cb21 selftests: net: replace sleeps in fcnal-test with waits
fcnal-test.sh already includes lib.sh, use relevant helpers
instead of sleeping. Replace sleep after starting nettest
as a server with wait_local_port_listen.

Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250909223837.863217-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-10 18:14:02 -07:00
Leon Hwang fbdd61c94b selftests/bpf: Skip timer cases when bpf_timer is not supported
When enable CONFIG_PREEMPT_RT, verifier will reject bpf_timer with
returning -EOPNOTSUPP.

Therefore, skip test cases when errno is EOPNOTSUPP.

cd tools/testing/selftests/bpf
./test_progs -t timer
125     free_timer:SKIP
456     timer:SKIP
457/1   timer_crash/array:SKIP
457/2   timer_crash/hash:SKIP
457     timer_crash:SKIP
458     timer_lockup:SKIP
459     timer_mim:SKIP
Summary: 5/0 PASSED, 6 SKIPPED, 0 FAILED

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20250910125740.52172-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-10 12:34:09 -07:00
Davide Caratti d013ebc349 selftests: can: enable CONFIG_CAN_VCAN as a module
A proper kernel configuration for running kselftest can be obtained with:

 $ yes | make kselftest-merge

Build of 'vcan' driver is currently missing, while the other required knobs
are already there because of net/link_netns.py [1]. Add a config file in
selftests/net/can to store the minimum set of kconfig needed for CAN
selftests.

[1] https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com

Fixes: 77442ffa83 ("selftests: can: Import tst-filter from can-tests")
Reviewed-by: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/fa4c0ea262ec529f25e5f5aa9269d84764c67321.1757516009.git.dcaratti@redhat.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-10 17:11:50 +02:00
Davidlohr Bueso c4272905c3 cxl/acpi: Rename CFMW coherency restrictions
ACPICA commit 710745713ad3a2543dbfb70e84764f31f0e46bdc

This has been renamed in more recent CXL specs, as
type3 (memory expanders) can also use HDM-DB for
device coherent memory.

Link: 710745713a
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250908160034.86471-1-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-10 08:09:10 -07:00
Matthieu Baerts (NGI0) e2cda6343b selftests: mptcp: join: allow more time to send ADD_ADDR
When many ADD_ADDR need to be sent, it can take some time to send each
of them, and create new subflows. Some CIs seem to occasionally have
issues with these tests, especially with "debug" kernels.

Two subtests will now run for a slightly longer time: the last two where
3 or more ADD_ADDR are sent during the test.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-3-824cc805772b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-09 18:57:45 -07:00
Matthieu Baerts (NGI0) 63c31d42cf selftests: mptcp: join: tolerate more ADD_ADDR
ADD_ADDR can be retransmitted, and with, the parent commit, these
retransmissions can be sent quicker: from 2 minutes to less than one
second.

To avoid false positives where retransmitted ADD_ADDR causes higher
counters than expected, it is required to be more tolerant. Errors are
now only reported when fewer ADD_ADDRs have been sent/received, except
if no ADD_ADDR are expected.

Before the parent commit, the tolerance was present for each tests where
the ADD_ADDR could be retransmitted in a reasonable time (1 sec). Now
that all tests can have retransmitted ADD_ADDR, it is normal to apply
the same tolerance for all tests.

An alternative could be to disable the ADD_ADDR retransmissions by
default, but that's changing the default kernel behaviour. Plus,
ADD_ADDR retransmissions can be required for some tests. To avoid adding
exceptions to many tests, it seems better to increase the tolerance.

Later, we could add a new MIB counter to identify the ADD_ADDR
retransmissions, and remove the tolerance when this counter is
available.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-2-824cc805772b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-09 18:57:45 -07:00
Matthieu Baerts (NGI0) ef1bd93b3b selftests: mptcp: shellcheck: support v0.11.0
This v0.11.0 version introduces SC2329:

  Warn when (non-escaping) functions are never invoked.

Except that, similar to SC2317, ShellCheck is currently unable to figure
out functions that are invoked via trap, or indirectly, when calling
functions via variables. It is then needed to disable this new SC2329.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250908-net-mptcp-misc-fixes-6-17-rc5-v1-3-5f2168a66079@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-09 18:39:52 -07:00
Jakub Kicinski 1c0353a6df selftests: net: speed up pmtu.sh by avoiding unnecessary cleanup
The pmtu test takes nearly an hour when run on a debug kernel
(10min on a normal kernel, so the debug slow down is quite significant).
NIPA tries to ensure all results are delivered by a certain deadline
so this prevents it from retrying the test in case of a flake.

Looks like one of the slowest operations in the test is calling out
to ./openvswitch/ovs-dpctl.py to remove potential leftover OvS interfaces.
Check whether the interfaces exist in the first place in sysfs,
since it can be done directly in bash it is very fast.

This should save us around 20-30% of the test runtime.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250906214535.3204785-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-09 16:26:44 -07:00
Jakub Kicinski a12fd5c31b selftests: net: run groups from fcnal-test in parallel
fcnal-test.sh takes almost hour and a half to finish.
The tests are already grouped into ipv4, ipv6 and other.
Run those groups separately.

Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250908201021.270681-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-09 15:34:11 -07:00
Rong Tao 6624fb2f33 selftests/bpf: Add tests for bpf_strnstr
Add tests for bpf_strnstr():

    bpf_strnstr("", "", 0) = 0
    bpf_strnstr("hello world", "hello", 5) = 0
    bpf_strnstr(str, "hello", 4) = -ENOENT
    bpf_strnstr("", "a", 0) = -ENOENT

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_2ED218F8082565C95D86A804BDDA8DBA200A@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-09 15:07:58 -07:00
Ilya Leoshkevich 5d40c038c8 selftests/bpf: Fix "expression result unused" warnings with icecc
icecc is a compiler wrapper that distributes compile jobs over a build
farm [1]. It works by sending toolchain binaries and preprocessed
source code to remote machines.

Unfortunately using it with BPF selftests causes build failures due to
a clang bug [2]. The problem is that clang suppresses the
-Wunused-value warning if the unused expression comes from a macro
expansion. Since icecc compiles preprocessed source code, this
information is not available. This leads to -Wunused-value false
positives.

obj_new_no_struct() and obj_new_acq() use the bpf_obj_new() macro and
discard the result. arena_spin_lock_slowpath() uses two macros that
produce values and ignores the results. Add (void) casts to explicitly
indicate that this is intentional and suppress the warning.

An alternative solution is to change the macros to not produce values.
This would work today for the arena_spin_lock_slowpath() issue, but in
the future there may appear users who need them. Another potential
solution is to replace these macros with functions. Unfortunately this
would not work, because these macros work with unknown types and
control flow.

[1] https://github.com/icecc/icecream
[2] https://github.com/llvm/llvm-project/issues/142614

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250829030017.102615-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-09 15:07:58 -07:00
Daniel Borkmann 3aa9b9a165 selftests/bpf: Extend crypto_sanity selftest with invalid dst buffer
Small cleanup and test extension to probe the bpf_crypto_{encrypt,decrypt}()
kfunc when a bad dst buffer is passed in to assert that an error is returned.

Also, encrypt_sanity() and skb_crypto_setup() were explicit to set the global
status variable to zero before any test, so do the same for decrypt_sanity().
Do not explicitly zero the on-stack err before bpf_crypto_ctx_create() given
the kfunc is expected to do it internally for the success case.

Before kernel fix:

  # ./vmtest.sh -- ./test_progs -t crypto
  [...]
  [    1.531200] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.533388] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #87/1    crypto_basic/crypto_release:OK
  #87/2    crypto_basic/crypto_acquire:OK
  #87      crypto_basic:OK
  test_crypto_sanity:PASS:skel open 0 nsec
  test_crypto_sanity:PASS:ip netns add crypto_sanity_ns 0 nsec
  test_crypto_sanity:PASS:ip -net crypto_sanity_ns -6 addr add face::1/128 dev lo nodad 0 nsec
  test_crypto_sanity:PASS:ip -net crypto_sanity_ns link set dev lo up 0 nsec
  test_crypto_sanity:PASS:open_netns 0 nsec
  test_crypto_sanity:PASS:AF_ALG init fail 0 nsec
  test_crypto_sanity:PASS:if_nametoindex lo 0 nsec
  test_crypto_sanity:PASS:skb_crypto_setup fd 0 nsec
  test_crypto_sanity:PASS:skb_crypto_setup 0 nsec
  test_crypto_sanity:PASS:skb_crypto_setup retval 0 nsec
  test_crypto_sanity:PASS:skb_crypto_setup status 0 nsec
  test_crypto_sanity:PASS:create qdisc hook 0 nsec
  test_crypto_sanity:PASS:make_sockaddr 0 nsec
  test_crypto_sanity:PASS:attach encrypt filter 0 nsec
  test_crypto_sanity:PASS:encrypt socket 0 nsec
  test_crypto_sanity:PASS:encrypt send 0 nsec
  test_crypto_sanity:FAIL:encrypt status unexpected error: -5 (errno 95)
  #88      crypto_sanity:FAIL
  Summary: 1/2 PASSED, 0 SKIPPED, 1 FAILED

After kernel fix:

  # ./vmtest.sh -- ./test_progs -t crypto
  [...]
  [    1.540963] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.542404] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #87/1    crypto_basic/crypto_release:OK
  #87/2    crypto_basic/crypto_acquire:OK
  #87      crypto_basic:OK
  #88      crypto_sanity:OK
  Summary: 2/2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/20250829143657.318524-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-09 15:07:57 -07:00
Jiayuan Chen f85981327a selftests/bpf: Fix incorrect array size calculation
The loop in bench_sockmap_prog_destroy() has two issues:

1. Using 'sizeof(ctx.fds)' as the loop bound results in the number of
   bytes, not the number of file descriptors, causing the loop to iterate
   far more times than intended.

2. The condition 'ctx.fds[0] > 0' incorrectly checks only the first fd for
   all iterations, potentially leaving file descriptors unclosed. Change
   it to 'ctx.fds[i] > 0' to check each fd properly.

These fixes ensure correct cleanup of all file descriptors when the
benchmark exits.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250909124721.191555-1-jiayuan.chen@linux.dev

Closes: https://lore.kernel.org/bpf/aLqfWuRR9R_KTe5e@stanley.mountain/
2025-09-09 09:23:47 -07:00
Brett A C Sheffield aeb8d48ea9 selftests: net: add test for ipv6 fragmentation
Add selftest for the IPv6 fragmentation regression which affected
several stable kernels.

Commit a18dfa9925 ("ipv6: save dontfrag in cork") was backported to
stable without some prerequisite commits.  This caused a regression when
sending IPv6 UDP packets by preventing fragmentation and instead
returning -1 (EMSGSIZE).

Add selftest to check for this issue by attempting to send a packet
larger than the interface MTU. The packet will be fragmented on a
working kernel, with sendmsg(2) correctly returning the expected number
of bytes sent.  When the regression is present, sendmsg returns -1 and
sets errno to EMSGSIZE.

Link: https://lore.kernel.org/stable/aElivdUXqd1OqgMY@karahi.gladserv.com
Signed-off-by: Brett A C Sheffield <bacs@librecast.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250903154925.13481-1-bacs@librecast.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-09 11:43:59 +02:00
Oscar Maes bf59028ea8 selftests: net: add test for destination in broadcast packets
Add test to check the broadcast ethernet destination field is set
correctly.

This test sends a broadcast ping, captures it using tcpdump and
ensures that all bits of the 6 octet ethernet destination address
are correctly set by examining the output capture file.

Co-developed-by: Brett A C Sheffield <bacs@librecast.net>
Signed-off-by: Brett A C Sheffield <bacs@librecast.net>
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Link: https://patch.msgid.link/20250902150240.4272-1-oscmaes92@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-09 11:29:58 +02:00
Thomas Weißschuh e82bf7570d selftests: vDSO: Drop vdso_test_clock_getres
vdso_test_abi provides the exact same functionality, properly uses
kselftest.h and explicitly calls into the vDSO without relying on the libc.

Drop the pointless testcase.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-8-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh 7262aa781f selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64()
To be y2038-safe, 32-bit userspace needs to explicitly call the 64-bit safe
time APIs. For this the 32-bit vDSOs contains a clock_gettime() variant
which always uses 64-bit time types.

Also test this vDSO function.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-7-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh 7b87dbf9d8 selftests: vDSO: vdso_test_abi: Test CPUTIME clocks
The structure is already there anyways, so test the CPUTIME clocks, too.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-6-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh 74b408ff06 selftests: vDSO: vdso_test_abi: Use explicit indices for name array
The array relies on the numeric values of the clock IDs.
When reading the code it is not obvious that the order is correct.

Make the code easier to read by using explicit indices.

While at it make the array static.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-5-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh d7516f25a9 selftests: vDSO: vdso_test_abi: Drop clock availability tests
The test uses the kselftest.h framework and declares in its testplan to
always execute 16 testcases. If any of the clockids were not available,
the testplan would not be satisfied anymore and the test would fail.
Apparently that never happened, so the clockids are always available.

Remove the pointless checks.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-4-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh 3afe371d32 selftests: vDSO: vdso_test_abi: Use ksft_finished()
The existing logic is just an open-coded ksft_finished().
Replace it with the real thing.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-3-90f499dd35f8@linutronix.de
2025-09-09 10:57:39 +02:00
Thomas Weißschuh 4b59a9f762 selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO
If AT_SYSINFO_EHDR is missing the whole test needs to be skipped.
Currently this results in the following output:

	TAP version 13
	1..16
	# AT_SYSINFO_EHDR is not present!

This output is incorrect, as "1..16" still requires the subtest lines to
be printed, which isn't done however.

Switch to the correct skipping functions, so the output now correctly
indicates that no subtests are being run:

	TAP version 13
	1..0 # SKIP AT_SYSINFO_EHDR is not present!

Fixes: 693f5ca08c ("kselftest: Extend vDSO selftest")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-2-90f499dd35f8@linutronix.de
2025-09-09 10:57:38 +02:00
Thomas Weißschuh 9f15e0f9ef selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper
The _rval register variable is meant to be an output operand of the asm
statement but is instead used as input operand.
clang 20.1 notices this and triggers -Wuninitialized warnings:

tools/testing/selftests/timers/auxclock.c:154:10: error: variable '_rval' is uninitialized when used here [-Werror,-Wuninitialized]
  154 |                 return VDSO_CALL(self->vdso_clock_gettime64, 2, clockid, ts);
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tools/testing/selftests/timers/../vDSO/vdso_call.h:59:10: note: expanded from macro 'VDSO_CALL'
   59 |                 : "r" (_rval)                                           \
      |                        ^~~~~
tools/testing/selftests/timers/auxclock.c:154:10: note: variable '_rval' is declared here
tools/testing/selftests/timers/../vDSO/vdso_call.h:47:2: note: expanded from macro 'VDSO_CALL'
   47 |         register long _rval asm ("r3");                                 \
      |         ^

It seems the list of input and output operands have been switched around.
However as the argument registers are not always initialized they can not
be marked as pure inputs as that would trigger -Wuninitialized warnings.
Adding _rval as another input and output operand does also not work as it
would collide with the existing _r3 variable.

Instead reuse _r3 for both the argument and the return value.

Fixes: 6eda706a53 ("selftests: vDSO: fix the way vDSO functions are called for powerpc")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-1-90f499dd35f8@linutronix.de
Closes: https://lore.kernel.org/oe-kbuild-all/202506180223.BOOk5jDK-lkp@intel.com/
2025-09-09 10:57:38 +02:00
Hangbin Liu c2377f1763 selftests: bonding: add test for LACP actor port priority
Add comprehensive selftest to verify:
- Per-port actor priority setting via ad_actor_port_prio
- Aggregator selection behavior with port_priority ad_select policy

Also move cmd_jq helper from forwarding/lib.sh to net/lib.sh for
broader reusability across network selftests.

Here is the result output
  # ./bond_lacp_prio.sh
  TEST: bond 802.3ad (ad_actor_port_prio setting)                     [ OK ]
  TEST: bond 802.3ad (ad_actor_port_prio select)                      [ OK ]
  TEST: bond 802.3ad (ad_actor_port_prio switch)                      [ OK ]

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250902064501.360822-4-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-09 10:56:02 +02:00
Alok Tiwari 665071186c KVM: selftests: Fix typo in hyperv cpuid test message
Fix a typo in hyperv_cpuid.c test assertion log:
replace "our of supported range" -> "out of supported range".

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://lore.kernel.org/r/20250824181642.629297-1-alok.a.tiwari@oracle.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-08 13:03:22 -07:00
Jakub Kicinski f3883b1ea5 selftests: net: move netlink-dumps back to progs
Commit 9bb88c6596 ("selftests: net: test extacks in netlink dumps")
moved netlink-dumps from TEST_GEN_PROGS to YNL_GEN_FILES.
But _FILES are not for tests, rather for utilities / helpers.
Create YNL_GEN_PROGS and include netlink-dumps there.
This makes netlink-dumps part of executed tests, again.

Link: https://patch.msgid.link/20250906211351.3192412-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-08 12:57:05 -07:00
Jakub Kicinski 27bc5eaf00 selftests: net: make the dump test less sensitive to mem accounting
Recent changes to make netlink socket memory accounting must
have broken the implicit assumption of the netlink-dump test
that we can fit exactly 64 dumps into the socket. Handle the
failure mode properly, and increase the dump count to 80
to make sure we still run into the error condition if
the default buffer size increases in the future.

Link: https://patch.msgid.link/20250906211351.3192412-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-08 12:57:05 -07:00
Feng Yang 93a83d0443 selftests/bpf: Fix the issue where the error code is 0
The error message printed here only uses the previous err value,
which results in it being printed as 0.
When bpf_map__attach_struct_ops encounters an error,
it uses libbpf_err_ptr(err) to set errno = -err and returns NULL.
Therefore, Using -errno can fix this issue.

Fix before:
run_subtest:FAIL:1019 bpf_map__attach_struct_ops failed for map pro_epilogue: err=0

Fix after:
run_subtest:FAIL:1019 bpf_map__attach_struct_ops failed for map pro_epilogue: err=-9

Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250908060810.1054341-1-yangfeng59949@163.com
2025-09-08 09:51:13 -07:00
Nikola Z. Ivanov 14a41628c4 selftests/arm64: Fix grammatical error in string literals
Fix grammatical error in <past tense verb> + <infinitive>
construct related to memory allocation checks.
In essence change "Failed to allocated" to "Failed to allocate".

Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 16:17:13 +01:00
Vivek Yadav 62e8a9fbaa kselftest/arm64: Add parentheses around sizeof for clarity
Added parentheses around sizeof to make the expression clearer
and improve readability. This change has no functional impact.

```
[command]
	./scripts/checkpatch.pl tools/testing/selftests/arm64/fp/sve-ptrace.c

[output]
	WARNING: sizeof *sve should be sizeof(*sve)
```

Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 16:01:22 +01:00
Vivek Yadav a940568ccd kselftest/arm64: Supress warning and improve readability
The comment was correct, but `checkpatch` script flagged it with a warning
as shown in the output section. The comment is slightly modified
to improve readability, which also suppresses the warning.

```
[command]
	./script/checkpatch.pl --strict -f tools/testing/selftests/arm64/fp/fp-stress.c

[output]
	WARNING: Possible repeated word: 'on'
```

Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 16:01:22 +01:00
Vivek Yadav 3198780eaf kselftest/arm64: Remove extra blank line
Remove an unnecessary blank line to improve code style consistency.

```
[command]
        ./scripts/checkpatch.pl --strict -f <path/to/file>

[output]
        CHECK: Please don't use multiple blank lines
	CHECK: Blank lines aren't necessary before a close brace '}'
```

Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 16:01:21 +01:00
Thomas Weißschuh a985fe6383 kselftest/arm64/gcs: Use nolibc's getauxval()
Nolibc now does have getauxval(), use it.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 15:54:39 +01:00
Thomas Weißschuh 740cdafd0d kselftest/arm64/gcs: Correctly check return value when disabling GCS
The return value was not assigned to 'ret', so the check afterwards
does not do anything.

Fixes: 3d37d4307e ("kselftest/arm64: Add very basic GCS test program")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 15:54:39 +01:00
Linus Torvalds f777d1112e vfs-6.17-rc6.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaL6SyQAKCRCRxhvAZXjc
 ouTGAQDGiTnaENiOzRhzNl1XONTRv8a1uV0pxg4W3fNdiRlxgQEA/O90/+nM48KC
 pdV3WHz5eGfcnMTpqgHxK6HYgwklJAY=
 =oKnm
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "fuse:

   - Prevent opening of non-regular backing files.

     Fuse doesn't support non-regular files anyway.

   - Check whether copy_file_range() returns a larger size than
     requested.

   - Prevent overflow in copy_file_range() as fuse currently only
     supports 32-bit sized copies.

   - Cache the blocksize value if the server returned a new value as
     inode->i_blkbits isn't modified directly anymore.

   - Fix i_blkbits handling for iomap partial writes.

     By default i_blkbits is set to PAGE_SIZE which causes iomap to mark
     the whole folio as uptodate even on a partial write. But fuseblk
     filesystems support choosing a blocksize smaller than PAGE_SIZE
     risking data corruption. Simply enforce PAGE_SIZE as blocksize for
     fuseblk's internal inode for now.

   - Prevent out-of-bounds acces in fuse_dev_write() when the number of
     bytes to be retrieved is truncated to the fc->max_pages limit.

  virtiofs:

   - Fix page faults for DAX page addresses.

  Misc:

   - Tighten file handle decoding from userns.

     Check that the decoded dentry itself has a valid idmapping in the
     user namespace.

   - Fix mount-notify selftests.

   - Fix some indentation errors.

   - Add an FMODE_ flag to indicate IOCB_HAS_METADATA availability.

     This will be moved to an FOP_* flag with a bit more rework needed
     for that to happen not suitable for a fix.

   - Don't silently ignore metadata for sync read/write.

   - Don't pointlessly log warning when reading coredump sysctls"

* tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fuse: virtio_fs: fix page fault for DAX page address
  selftests/fs/mount-notify: Fix compilation failure.
  fhandle: use more consistent rules for decoding file handle from userns
  fuse: Block access to folio overlimit
  fuse: fix fuseblk i_blkbits for iomap partial writes
  fuse: reflect cached blocksize if blocksize was changed
  fuse: prevent overflow in copy_file_range return value
  fuse: check if copy_file_range() returns larger than requested size
  fuse: do not allow mapping a non-regular backing file
  coredump: don't pointlessly check and spew warnings
  fs: fix indentation style
  block: don't silently ignore metadata for sync read/write
  fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability
  Please enter a commit message to explain why this merge is necessary,
  especially if it merges an updated upstream into a topic branch.
2025-09-08 07:53:01 -07:00
Bala-Vignesh-Reddy 50af02425a selftests: arm64: Fix -Waddress warning in tpidr2 test
Thanks to -Waddress, the compiler warns that the ksft_test_result()
invocations in the arm64 tpidr2 selftest are always true. Oops.

Fix the test by, err, actually running the test functions.

Fixes: 6d80cb7313 ("kselftest/arm64: Convert tpidr2 test to use kselftest.h")
Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 15:24:19 +01:00
Mark Brown 791d703bad kselftest/arm64: Log error codes in sve-ptrace
Use ksft_perror() to report error codes from failing ptrace operations to
make it easier to interpret logs when things go wrong.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 15:07:17 +01:00
Bala-Vignesh-Reddy a679e5683d selftests: arm64: Check fread return value in exec_target
Fix -Wunused-result warning generated when compiled with gcc 13.3.0,
by checking fread's return value and handling errors, preventing
potential failures when reading from stdin.

Fixes compiler warning:
warning: ignoring return value of 'fread' declared with attribute
'warn_unused_result' [-Wunused-result]

Fixes: 806a15b254 ("kselftests/arm64: add PAuth test for whether exec() changes keys")

Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2025-09-08 15:03:55 +01:00
Stanislav Fomichev 8c0b9ed240 selftests: ncdevmem: don't retry EFAULT
devmem test fails on NIPA. Most likely we get skb(s) with readable
frags (why?) but the failure manifests as an OOM. The OOM happens
because ncdevmem spams the following message:

  recvmsg ret=-1
  recvmsg: Bad address

As of today, ncdevmem can't deal with various reasons of EFAULT:
- falling back to regular recvmsg for non-devmem skbs
- increasing ctrl_data size (can't happen with ncdevmem's large buffer)

Exit (cleanly) with error when recvmsg returns EFAULT. This should at
least cause the test to cleanup its state.

Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250904182710.1586473-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-05 18:14:34 -07:00
Mykyta Yatsenko e12873ee85 selftests/bpf: Add BPF program dump in veristat
Add the ability to dump BPF program instructions directly from veristat.
Previously, inspecting a program required separate bpftool invocations:
one to load and another to dump it, which meant running multiple
commands.
During active development, it's common for developers to use veristat
for testing verification. Integrating instruction dumping into veristat
reduces the need to switch tools and simplifies the workflow.
By making this information more readily accessible, this change aims
to streamline the BPF development cycle and improve usability for
developers.
This implementation leverages bpftool, by running it directly via popen
to avoid any code duplication and keep veristat simple.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250905140835.1416179-1-mykyta.yatsenko5@gmail.com
2025-09-05 12:12:09 -07:00
Eric Dumazet 8bc316cf3a selftests/net: packetdrill: add tcp_close_no_rst.pkt
This test makes sure we do send a FIN on close()
if the receive queue contains data that was consumed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250903084720.1168904-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04 19:13:41 -07:00
Jakub Kicinski 5ef04a7b06 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc5).

No conflicts.

Adjacent changes:

include/net/sock.h
  c51613fa27 ("net: add sk->sk_drop_counters")
  5d6b58c932 ("net: lockless sock_i_ino()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04 13:33:00 -07:00
Srinivas Pandruvada 0115d06355 thermal: intel: selftests: workload_hint: Mask unsupported types
The workload hint may contain some other hints which are not defined.
So mask out unsupported types. Currently only lower 4 bits of workload
type hints are defined.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20250828201541.931425-1-srinivas.pandruvada@linux.intel.com
[ rjw: Subject cleanup ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-04 22:17:47 +02:00
Linus Torvalds d69eb204c2 Including fixes from netfilter, wireless and Bluetooth.
We're reverting the removal of a Sundance driver, a user has appeared.
 This makes the PR rather large in terms of LoC.
 
 There's a conspicuous absence of real, user-reported 6.17 issues.
 Slightly worried that the summer distracted people from testing.
 
 Previous releases - regressions:
 
  - ax25: properly unshare skbs in ax25_kiss_rcv()
 
 Previous releases - always broken:
 
  - phylink: disable autoneg for interfaces that have no inband,
    fix regression on pcs-lynx (NXP LS1088)
 
  - vxlan: fix null-deref when using nexthop objects
 
  - batman-adv: fix OOB read/write in network-coding decode
 
  - icmp: icmp_ndo_send: fix reversing address translation for replies
 
  - tcp: fix socket ref leak in TCP-AO failure handling for IPv6
 
  - mctp:
    - mctp_fraq_queue should take ownership of passed skb
    - usb: initialise mac header in RX path, avoid WARN
 
  - wifi: mac80211: do not permit 40 MHz EHT operation on 5/6 GHz,
    respect device limitations
 
  - wifi: wilc1000: avoid buffer overflow in WID string configuration
 
  - wifi: mt76:
    - fix regressions from mt7996 MLO support rework
    - fix offchannel handling issues on mt7996
    - fix multiple wcid linked list corruption issues
    - mt7921: don't disconnect when AP requests switch to a channel which
      requires radar detection
    - mt7925u: use connac3 tx aggr check in tx complete
 
  - wifi: intel:
    - improve validation of ACPI DSM data
    - cfg: restore some 1000 series configs
 
  - wifi: ath:
    - ath11k: a fix for GTK rekeying
    - ath12k: a missed WiFi7 capability (multi-link EMLSR)
 
  - eth: intel:
    - ice: fix races in "low latency" firmware interface for Tx timestamps
    - idpf: set mac type when adding and removing MAC filters
    - i40e: remove racy read access to some debugfs files
 
 Misc:
 
  - Revert "eth: remove the DLink/Sundance (ST201) driver"
 
  - netfilter: conntrack: helper: Replace -EEXIST by -EBUSY, avoid confusing
    modprobe
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmi5tR8ACgkQMUZtbf5S
 Irt/dA//YtVjqx/qVRADORDq2CF2SZk7P4Kcanci5gI36tP+7yVYOoJyGm1SFfSp
 MSv5Oxqn2U37we34FBgaIbkPOc6Bg1ZDDlzgs/RzGYzNbz9mZkYMHry0kJ8bjbeA
 c6tlOKJu49wG82K/3eMXRFQFUJZS0TeMEfZMGoqcKU8ziHC6+XpUzXj1IqBbTU/+
 C6JLGzpWnWHI0o97rscnRF4w1h2tl5FF1eBV6MiIJzP1nuBuK7bDkwbGvA5DdBOr
 nqWdi6iSfGwb/4XYMKeRMNucYSsZOwAt9DYAxlPb2AXhpKSO31OjZ1qp3e5Uhg4u
 n6Oj3q9tPNBitSqizZOAftay7xZJtXISpuFVm2pijgAz0G2SITvx7Z0kPKDw7rXa
 S6ZSmBhlamj/6uaOmmmIyG7zqMXarLKVCh6C+tU/+vCYVb1YhC5fcIU1QML68FxX
 9YfJFOf6Fn/j68MtRdknB5pnEfh0acMcKgie5b0725KBbRPM0/QlpLL5vj4UXZD5
 3yxpgQbgciU3HiYBDJM1vWwF0Z3Y89YDtZqjUavnrnrxRv+BkzTmGlmNuqICKm3f
 rUCjdNm8TDPUwKgSLeUbXF2u8Aapgizd7ptXcmKT2mbjSKdhqdvafUnrdPZaBZBA
 uKH0XLNezj2zxcQaTEVdV5mtDujZMoXPS/RnCrICaMfXkz5Whxg=
 =e0Jg
 -----END PGP SIGNATURE-----

Merge tag 'net-6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, wireless and Bluetooth.

  We're reverting the removal of a Sundance driver, a user has appeared.
  This makes the PR rather large in terms of LoC.

  There's a conspicuous absence of real, user-reported 6.17 issues.
  Slightly worried that the summer distracted people from testing.

  Previous releases - regressions:

   - ax25: properly unshare skbs in ax25_kiss_rcv()

  Previous releases - always broken:

   - phylink: disable autoneg for interfaces that have no inband, fix
     regression on pcs-lynx (NXP LS1088)

   - vxlan: fix null-deref when using nexthop objects

   - batman-adv: fix OOB read/write in network-coding decode

   - icmp: icmp_ndo_send: fix reversing address translation for replies

   - tcp: fix socket ref leak in TCP-AO failure handling for IPv6

   - mctp:
       - mctp_fraq_queue should take ownership of passed skb
       - usb: initialise mac header in RX path, avoid WARN

   - wifi: mac80211: do not permit 40 MHz EHT operation on 5/6 GHz,
     respect device limitations

   - wifi: wilc1000: avoid buffer overflow in WID string configuration

   - wifi: mt76:
       - fix regressions from mt7996 MLO support rework
       - fix offchannel handling issues on mt7996
       - fix multiple wcid linked list corruption issues
       - mt7921: don't disconnect when AP requests switch to a channel
         which requires radar detection
       - mt7925u: use connac3 tx aggr check in tx complete

   - wifi: intel:
       - improve validation of ACPI DSM data
       - cfg: restore some 1000 series configs

   - wifi: ath:
       - ath11k: a fix for GTK rekeying
       - ath12k: a missed WiFi7 capability (multi-link EMLSR)

   - eth: intel:
       - ice: fix races in "low latency" firmware interface for Tx timestamps
       - idpf: set mac type when adding and removing MAC filters
       - i40e: remove racy read access to some debugfs files

  Misc:

   - Revert "eth: remove the DLink/Sundance (ST201) driver"

   - netfilter: conntrack: helper: Replace -EEXIST by -EBUSY, avoid
     confusing modprobe"

* tag 'net-6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
  phy: mscc: Stop taking ts_lock for tx_queue and use its own lock
  selftest: net: Fix weird setsockopt() in bind_bhash.c.
  MAINTAINERS: add Sabrina to TLS maintainers
  gve: update MAINTAINERS
  ppp: fix memory leak in pad_compress_skb
  net: xilinx: axienet: Add error handling for RX metadata pointer retrieval
  net: atm: fix memory leak in atm_register_sysfs when device_register fail
  netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  selftests: netfilter: fix udpclash tool hang
  ax25: properly unshare skbs in ax25_kiss_rcv()
  mctp: return -ENOPROTOOPT for unknown getsockopt options
  net/smc: Remove validation of reserved bits in CLC Decline message
  ipv4: Fix NULL vs error pointer check in inet_blackhole_dev_init()
  net: thunder_bgx: decrement cleanup index before use
  net: thunder_bgx: add a missing of_node_put
  net: phylink: move PHY interrupt request to non-fail path
  net: lockless sock_i_ino()
  tools: ynl-gen: fix nested array counting
  wifi: wilc1000: avoid buffer overflow in WID string configuration
  wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result()
  ...
2025-09-04 09:59:15 -07:00
Leon Hwang 88a3bde432 selftests/bpf: Add case to test bpf_in_interrupt()
Add a timer test case to test 'bpf_in_interrupt()'.

cd tools/testing/selftests/bpf
./test_progs -t timer_interrupt
462     timer_interrupt:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20250903140438.59517-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:05:58 -07:00
Leon Hwang 4b69e31329 selftests/bpf: Introduce experimental bpf_in_interrupt()
Filtering pid_tgid is meanlingless when the current task is preempted by
an interrupt.

To address this, introduce 'bpf_in_interrupt()' helper function, which
allows BPF programs to determine whether they are executing in interrupt
context.

'get_preempt_count()':

* On x86, '*(int *) bpf_this_cpu_ptr(&__preempt_count)'.
* On arm64, 'bpf_get_current_task_btf()->thread_info.preempt.count'.

Then 'bpf_in_interrupt()' will be:

* If !PREEMPT_RT, 'get_preempt_count() & (NMI_MASK | HARDIRQ_MASK
  | SOFTIRQ_MASK)'.
* If PREEMPT_RT, '(get_preempt_count() & (NMI_MASK | HARDIRQ_MASK))
  | (bpf_get_current_task_btf()->softirq_disable_cnt & SOFTIRQ_MASK)'.

As for other archs, it can be added support by updating
'get_preempt_count()'.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20250903140438.59517-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:05:58 -07:00
Rong Tao abc8a952d4 selftests/bpf: Test kfunc bpf_strcasecmp
Add testsuites for kfunc bpf_strcasecmp.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Link: https://lore.kernel.org/r/tencent_81A1A0ACC04B68158C57C4D151C46A832B07@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:00:57 -07:00
Menglong Dong a85d888768 selftests/bpf: add benchmark testing for kprobe-multi-all
For now, the benchmark for kprobe-multi is single, which means there is
only 1 function is hooked during testing. Add the testing
"kprobe-multi-all", which will hook all the kernel functions during
the benchmark. And the "kretprobe-multi-all" is added too.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20250904021011.14069-4-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:00:25 -07:00
Menglong Dong adf6b57ce4 selftests/bpf: skip recursive functions for kprobe_multi
Some functions is recursive for the kprobe_multi and impact the benchmark
results. So just skip them.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20250904021011.14069-3-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:00:25 -07:00
Menglong Dong 8bad31edf5 selftests/bpf: move get_ksyms and get_addrs to trace_helpers.c
We need to get all the kernel function that can be traced sometimes, so we
move the get_syms() and get_addrs() in kprobe_multi_test.c to
trace_helpers.c and rename it to bpf_get_ksyms() and bpf_get_addrs().

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20250904021011.14069-2-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-04 09:00:25 -07:00
Kuniyuki Iwashima fd2004d82d selftest: net: Fix weird setsockopt() in bind_bhash.c.
bind_bhash.c passes (SO_REUSEADDR | SO_REUSEPORT) to setsockopt().

In the asm-generic definition, the value happens to match with the
bare SO_REUSEPORT, (2 | 15) == 15, but not on some arch.

arch/alpha/include/uapi/asm/socket.h:18:#define SO_REUSEADDR	0x0004
arch/alpha/include/uapi/asm/socket.h:24:#define SO_REUSEPORT	0x0200
arch/mips/include/uapi/asm/socket.h:24:#define SO_REUSEADDR	0x0004	/* Allow reuse of local addresses.  */
arch/mips/include/uapi/asm/socket.h:33:#define SO_REUSEPORT 0x0200	/* Allow local address and port reuse.  */
arch/parisc/include/uapi/asm/socket.h:12:#define SO_REUSEADDR	0x0004
arch/parisc/include/uapi/asm/socket.h:18:#define SO_REUSEPORT	0x0200
arch/sparc/include/uapi/asm/socket.h:13:#define SO_REUSEADDR	0x0004
arch/sparc/include/uapi/asm/socket.h:20:#define SO_REUSEPORT	0x0200
include/uapi/asm-generic/socket.h:12:#define SO_REUSEADDR	2
include/uapi/asm-generic/socket.h:27:#define SO_REUSEPORT	15

Let's pass SO_REUSEPORT only.

Fixes: c35ecb95c4 ("selftests/net: Add test for timing a bind request to a port with a populated bhash entry")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250903222938.2601522-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04 07:30:04 -07:00
Thomas Weißschuh bad53ae2dc vdso: Drop Kconfig GENERIC_VDSO_TIME_NS
All architectures implementing time-related functionality in the vDSO are
using the generic vDSO library which handles time namespaces properly.

Remove the now unnecessary Kconfig symbol.

Enables the use of time namespaces on architectures, which use the
generic vDSO but did not enable GENERIC_VDSO_TIME_NS, namely MIPS and arm.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-10-d9b65750e49f@linutronix.de
2025-09-04 11:23:50 +02:00
Florian Westphal 661a4f307f selftests: netfilter: fix udpclash tool hang
Yi Chen reports that 'udpclash' loops forever depending on compiler
(and optimization level used); while (x == 1) gets optimized into
for (;;).  Add volatile qualifier to avoid that.

While at it, also run it under timeout(1) and fix the resize script
to not ignore the timeout passed as second parameter to insert_flood.

Reported-by: Yi Chen <yiche@redhat.com>
Suggested-by: Yi Chen <yiche@redhat.com>
Fixes: 78a5883635 ("selftests: netfilter: add conntrack clash resolution test case")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-04 09:19:24 +02:00
Ricardo B. Marlière c9110e6f72 selftests/bpf: Fix count write in testapp_xdp_metadata_copy()
Commit 4b30209255 ("selftests/xsk: Add tail adjustment tests and support
check") added a new global to xsk_xdp_progs.c, but left out the access in
the testapp_xdp_metadata_copy() function. Since bpf_map_update_elem() will
write to the whole bss section, it gets truncated. Fix by writing to
skel_rx->bss->count directly.

Fixes: 4b30209255 ("selftests/xsk: Add tail adjustment tests and support check")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250829-selftests-bpf-xsk_regression_fix-v1-1-5f5acdb9fe6b@suse.com
2025-09-03 16:46:36 -07:00
Ricardo B. Marlière 2a912258c9 selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh
Currently, even if some subtests fails, the end result will still yield
"ok 1 selftests: bpf: test_xsk.sh". Fix it by exiting with 1 if there are
any failures.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20250828-selftests-bpf-test_xsk_ret-v1-1-e6656c01f397@suse.com
2025-09-03 16:44:22 -07:00
Gang Yan 3fff72f827 selftests: mptcp: add checks for fallback counters
Recently, some mib counters about fallback has been added, this patch
provides a method to check the expected behavior of these mib counters
during the test execution.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/571
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250902-net-next-mptcp-misc-feat-6-18-v2-2-fa02bb3188b1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-03 15:08:20 -07:00
Ido Schimmel 2c9fb925c2 selftests: net: Add a selftest for VXLAN with FDB nexthop groups
Add test cases for VXLAN with FDB nexthop groups, testing both IPv4 and
IPv6. Test basic Tx functionality as well as some corner cases.

Example output:

 # ./test_vxlan_nh.sh
 TEST: VXLAN FDB nexthop: IPv4 basic Tx                              [ OK ]
 TEST: VXLAN FDB nexthop: IPv6 basic Tx                              [ OK ]
 TEST: VXLAN FDB nexthop: learning                                   [ OK ]
 TEST: VXLAN FDB nexthop: IPv4 proxy                                 [ OK ]
 TEST: VXLAN FDB nexthop: IPv6 proxy                                 [ OK ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250901065035.159644-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:57:00 -07:00
Zongmin Zhou b0bc645122 selftests: net: avoid memory leak
The buffer be used without free,fix it to avoid memory leak.

Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250901054557.32811-1-min_halo@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:55:00 -07:00
Jakub Kicinski e2cf2d5baa selftests: drv-net: rss_ctx: make the test pass with few queues
rss_ctx.test_rss_key_indir implicitly expects at least 5 queues,
as it checks that the traffic on first 2 queues is lower than
the remaining queues when we use all queues. Special case fewer
queues.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250901173139.881070-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:38:26 -07:00
Jakub Kicinski 4022f92a2e selftests: drv-net: rss_ctx: use Netlink for timed reconfig
The rss_ctx test has gotten pretty flaky after I increased
the queue count in NIPA 2->3. Not 100% clear why. We get
a lot of failures in the rss_ctx.test_hitless_key_update case.

Looking closer it appears that the failures are mostly due
to startup costs. I measured the following timing for ethtool -X:
 - python cmd(shell=True)  : 150-250msec
 - python cmd(shell=False) :  50- 70msec
 - timed in bash           :  45- 55msec
 - YNL Netlink call        :   2-  4msec
 - .set_rxfh callback      :   1-  2msec

The target in the test was set to 200msec. We were mostly measuring
ethtool startup cost it seems. Switch to YNL since it's 100x faster.

Lower the pass criteria to 150msec, no real science behind this number
but we removed some overhead, drivers which previously passed 200msec
should easily pass 150msec now.

Separately we should probably follow up on defaulting to shell=False,
when script doesn't explicitly ask for True, because the overhead
is rather significant.

Switch from _rss_key_rand() to random.randbytes(), YNL takes a binary
array rather than array of ints.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250901173139.881070-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:38:26 -07:00
Jakub Kicinski bc1a767f69 selftests: net: py: don't default to shell=True
Overhead of using shell=True is quite significant.
Micro-benchmark of running ethtool --help shows that
non-shell run is 2x faster.

Runtime of the XDP tests also shows improvement:
this patch: 2m34s 2m21s 2m18s 2m18s
    before:     2m54s 2m36s 2m34s

Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250830184317.696121-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:09:36 -07:00
Jakub Kicinski c2e5108649 selftests: drv-net: adjust tests before defaulting to shell=False
Clean up tests which expect shell=True without explicitly passing
that param to cmd(). There seems to be only one such case, and
in fact it's better converted to a direct write.

Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250830184317.696121-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 16:09:36 -07:00
Breno Leitao 23313771c7 net: selftests: clean up tools/testing/selftests/net/lib/py/utils.py
This patch improves the utils.py module by removing unused imports
(errno, random), simplifying the fd_read_timeout() function by
eliminating unnecessary else clause, and cleaning up code style in the
defer class constructor.

Additionally, it renames the parameter in rand_port() from 'type' to
'stype' to avoid shadowing the built-in Python name 'type', improving
code clarity and preventing potential issues.

These changes enhance code readability and maintainability without
affecting functionality.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250901-fix-v1-1-df0abb67481e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02 15:51:16 -07:00
Linus Torvalds 8026aed072 17 hotfixes. 13 are cc:stable and the remainder address post-6.16 issues
or aren't considered necessary for -stable kernels.  11 of these fixes are
 for MM.
 
 This includes a three-patch series from Harry Yoo which fixes an
 intermittent boot failure which can occur on x86 systems.  And a two-patch
 series from Alexander Gordeev which fixes a KASAN crash on S390 systems.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaLY4WAAKCRDdBJ7gKXxA
 jp2qAP92JCCzscR87um+YSc4u6a/X6ucYWkzh9BGhM8bMT8p7wD/UhIuGbYRFLPw
 XbSDkAD6lKpujQkRAudRFQTcZcU7gwg=
 =mPUd
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "17 hotfixes. 13 are cc:stable and the remainder address post-6.16
  issues or aren't considered necessary for -stable kernels. 11 of these
  fixes are for MM.

  This includes a three-patch series from Harry Yoo which fixes an
  intermittent boot failure which can occur on x86 systems. And a
  two-patch series from Alexander Gordeev which fixes a KASAN crash on
  S390 systems"

* tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: fix possible deadlock in kmemleak
  x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()
  mm: introduce and use {pgd,p4d}_populate_kernel()
  mm: move page table sync declarations to linux/pgtable.h
  proc: fix missing pde_set_flags() for net proc files
  mm: fix accounting of memmap pages
  mm/damon/core: prevent unnecessary overflow in damos_set_effective_quota()
  kexec: add KEXEC_FILE_NO_CMA as a legal flag
  kasan: fix GCC mem-intrinsic prefix with sw tags
  mm/kasan: avoid lazy MMU mode hazards
  mm/kasan: fix vmalloc shadow memory (de-)population races
  kunit: kasan_test: disable fortify string checker on kasan_strings() test
  selftests/mm: fix FORCE_READ to read input value correctly
  mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE
  ocfs2: prevent release journal inode after journal shutdown
  rust: mm: mark VmaNew as transparent
  of_numa: fix uninitialized memory nodes causing kernel panic
2025-09-02 13:18:00 -07:00
Aleksa Sarai 5554d820f7
selftests/proc: add tests for new pidns APIs
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/20250805-procfs-pidns-api-v4-4-705f984940e7@cyphar.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-02 11:40:20 +02:00
Xing Guo e51bd0e595
selftests/fs/mount-notify: Fix compilation failure.
Commit c6d9775c20 ("selftests/fs/mount-notify: build with tools include
dir") introduces the struct __kernel_fsid_t to decouple dependency with
headers_install.  The commit forgets to define a macro for __kernel_fsid_t
and it will cause type re-definition issue.

Signed-off-by: Xing Guo <higuoxing@gmail.com>
Link: https://lore.kernel.org/20250813031647.96411-1-higuoxing@gmail.com
Acked-by: Amir Goldstein <amir73il@gmail.com>
Closes: https://lore.kernel.org/oe-lkp/202508110628.65069d92-lkp@intel.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-02 10:34:37 +02:00
Praveen Balakrishnan c85ae0240e selftests: net: fix spelling and grammar mistakes
Fix several spelling and grammatical mistakes in output messages from
the net selftests to improve readability.

Only the message strings for the test output have been modified. No
changes to the functional logic of the tests have been made.

Signed-off-by: Praveen Balakrishnan <praveen.balakrishnan@magd.ox.ac.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250828211100.51019-1-praveen.balakrishnan@magd.ox.ac.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-01 13:08:07 -07:00
Jakub Kicinski 49c2502b59 selftests: drv-net: csum: fix interface name for remote host
Use cfg.remote_ifname for arguments of remote command.
Without this UDP tests fail in NIPA where local interface
is called enp1s0 and remote enp0s4.

Fixes: 1d0dc857b5 ("selftests: drv-net: add checksum tests")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250830183842.688935-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-01 12:43:10 -07:00
Benjamin Berg ca38943e83 selftests/nolibc: remove outdated comment about construct order
The constructor order is not (and should not) be tested. Remove the
comment.

Fixes: a782d45c86 ("selftests/nolibc: stop testing constructor order")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://lore.kernel.org/r/20250731201225.323254-3-benjamin@sipsolutions.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-09-01 20:48:41 +02:00
Benjamin Berg 6d33ce3634 selftests/nolibc: fix EXPECT_NZ macro
The expect non-zero macro was incorrect and never used. Fix its
definition.

Fixes: 362aecb2d8 ("selftests/nolibc: add basic infrastructure to ease creation of nolibc tests")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://lore.kernel.org/r/20250731201225.323254-2-benjamin@sipsolutions.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-09-01 20:48:41 +02:00
Thomas Weißschuh 61a3cf7934 kselftest/arm64: tpidr2: Switch to waitpid() over wait4()
wait4() is deprecated, non-standard and about to be removed from nolibc.

Switch to the equivalent waitpid() call.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250821-nolibc-enosys-v1-6-4b63f2caaa89@weissschuh.net
2025-09-01 20:48:35 +02:00
Dan Carpenter 237bfb76c9 selftests/futex: Fix futex_wait() for 32bit ARM
On 32bit ARM systems gcc-12 will use 32bit timestamps while gcc-13 and later
will use 64bit timestamps.  The problem is that SYS_futex will continue
pointing at the 32bit system call.  This makes the futex_wait test fail like
this:

  waiter failed errno 110
  not ok 1 futex_wake private returned: 0 Success
  waiter failed errno 110
  not ok 2 futex_wake shared (page anon) returned: 0 Success
  waiter failed errno 110
  not ok 3 futex_wake shared (file backed) returned: 0 Success

Instead of compiling differently depending on the gcc version, use the
-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 options to ensure that 64bit timestamps
are used.  Then use ifdefs to make SYS_futex point to the 64bit system call.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/20250827130011.677600-6-bigeasy@linutronix.de
2025-09-01 16:26:55 +02:00
Gopi Krishna Menon 5fdb877b49 selftests/futex: Fix typos and grammar in futex_priv_hash
Fix multiple typos and small grammar issues in help text, comments and test
messages in the futex_priv_hash test.

Signed-off-by: Gopi Krishna Menon <krishnagopi487@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/20250827130011.677600-5-bigeasy@linutronix.de
2025-09-01 16:21:02 +02:00
Nai-Chen Cheng f8ef9c2402 selftests/futex: Fix format-security warnings in futex_priv_hash
Fix format-security warnings by using proper format strings when passing
message variables to ksft_exit_fail_msg(), ksft_test_result_pass(), and
ksft_test_result_skip() function.

Thus prevent potential security issues and eliminate compiler warnings when
building with -Wformat-security.

Signed-off-by: Nai-Chen Cheng <bleach1827@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250827130011.677600-4-bigeasy@linutronix.de
2025-09-01 16:20:08 +02:00
Waiman Long d8e2f91999 selftests/futex: Fix some futex_numa_mpol subtests
The "Memory out of range" subtest of futex_numa_mpol assumes that memory
access outside of the mmap'ed area is invalid. That may not be the case
depending on the actual memory layout of the test application. When that
subtest was run on an x86-64 system with latest upstream kernel, the test
passed as an error was returned from futex_wake(). On another PowerPC system,
the same subtest failed because futex_wake() returned 0.

  Bail out! futex2_wake(64, 0x86) should fail, but didn't

Looking further into the passed subtest on x86-64, it was found that an
-EINVAL was returned instead of -EFAULT. The -EINVAL error was returned
because the node value test with FLAGS_NUMA set failed with a node value
of 0x7f7f. IOW, the futex memory was accessible and futex_wake() failed
because the supposed node number wasn't valid. If that memory location
happens to have a very small value (e.g. 0), the test will pass and no
error will be returned.

Since this subtest is non-deterministic, drop it unless a guard page beyond
the mmap region is explicitly set.

The other problematic test is the "Memory too small" test. The futex_wake()
function returns the -EINVAL error code because the given futex address isn't
8-byte aligned, not because only 4 of the 8 bytes are valid and the other
4 bytes are not. So change the name of this subtest to "Mis-aligned futex" to
reflect the reality.

  [ bp: Massage commit message. ]

Fixes: 3163369407 ("selftests/futex: Add futex_numa_mpol")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/20250827130011.677600-3-bigeasy@linutronix.de
2025-09-01 16:12:54 +02:00
Thomas Huth 74db6cc331 powerpc: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers
While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembler code, __ASSEMBLY__ is a
macro that only gets defined by the Makefiles in the kernel.
This is bad since macros starting with two underscores are names
that are reserved by the C language. It can also be very confusing
for the developers when switching between userspace and kernelspace
coding, or when dealing with uapi headers that rather should use
__ASSEMBLER__  instead. So let's standardize now on the __ASSEMBLER__
macro that is provided by the compilers.

This is almost a completely mechanical patch (done with a simple
"sed -i" statement), apart from tweaking two comments manually in
arch/powerpc/include/asm/bug.h and arch/powerpc/include/asm/kasan.h
(which did not have proper underscores at the end) and fixing a
checkpatch error about spaces in arch/powerpc/include/asm/spu_csa.h.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250801082007.32904-3-thuth@redhat.com
2025-09-01 13:23:29 +05:30
Linus Torvalds c8bc81a52d Two arm64 fixes:
- CFI failure due to kpti_ng_pgd_alloc() signature mismatch
 
 - Underallocation bug in the SVE ptrace kselftest
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmiy2d8ACgkQa9axLQDI
 XvEPOxAAhMBXB6RoYlUe9Xwnkazz9TndSJy4buxEYQhC6dUtRdHwSiDEo8caaZpt
 lhIqLsmCL0SiP4SiZDobLhrcrD4U0sLNeZf9mkk6JnZ12ws3scEtZvBi2WhYn3rv
 hlAoH0WvhBgELm3RNyvuwhovZNixPj/aJ4keblHV3AuQZ2V9bN7f2rH9XMUDBP2R
 Q3yKUFxniyLh5UR/m2JD4lm5IvnzDT1uYdvChKviW5VMepY/UIerG6Y7fqfaKQIE
 h4S/QNqulOI+hfpRICowJ+Ydpb9oTgDBTWYLSCsKpjA5aLlkv1uVQSVHayyx7hc2
 XitLDcYNV8X2tBHa04Ip+TiRk1KM6UZvMsgGdnd6AzzucEjJg8glt4u4aM3gxkqq
 N2jgVFOiWj4xQi2Y3wDM74/PrywzD+TwhKUW15hjwkYONzm2Ff+Q0RN86y45lCON
 lchZ+khaQoZDG8eBuacsWlaIm0VGNySNsLbVNbj2+8fZWfWdCXo53DqMP71qzsWR
 bKSMlQTek1RoJYzC6zbJ4NSuPXJTPFc0SbLKGUIlNL+X5Zfu+kcI5scakczVb3FB
 gvqA0kek9dB/RIKUeXE2i9+1Ew9LsotnK8Woww/stYP+zUbq/Xsp6FVhk9kEZOls
 JWU90GbegChS/Lgc1C8CimsUiCKb4GtZ+x1iUCSdZNSZ7yK/V+g=
 =gHuh
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - CFI failure due to kpti_ng_pgd_alloc() signature mismatch

 - Underallocation bug in the SVE ptrace kselftest

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kselftest/arm64: Don't open code SVE_PT_SIZE() in fp-ptrace
  arm64: mm: Fix CFI failure due to kpti_ng_pgd_alloc function signature
2025-08-30 10:43:53 -07:00
Mark Brown d82aa5d350 kselftest/arm64: Don't open code SVE_PT_SIZE() in fp-ptrace
In fp-trace when allocating a buffer to write SVE register data we open
code the addition of the header size to the VL depeendent register data
size, which lead to an underallocation bug when we cut'n'pasted the code
for FPSIMD format writes. Use the SVE_PT_SIZE() macro that the kernel
UAPI provides for this.

Fixes: b84d2b2795 ("kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250812-arm64-fp-trace-macro-v1-1-317cfff986a5@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-08-30 11:31:11 +01:00
Takashi Iwai 14f628cb58 Merge branch 'for-linus' into for-next
Pull 6.17 devel branch for further auto-cleanup updates.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-30 09:42:45 +02:00
Liao Yuanhong 2a63607bfd vsock/test: Remove redundant semicolons
Remove unnecessary semicolons.

Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250828083938.400872-1-liaoyuanhong@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-29 19:10:52 -07:00
Florian Westphal d6a367ec6c netfilter: nft_flowtable.sh: re-run with random mtu sizes
Jakub says:
 nft_flowtable.sh is one of the most flake-atious test for netdev CI currently :(

The root cause is two-fold:
1. the failing part of the test is supposed to make sure that ip
   fragments are forwarded for offloaded flows.
   (flowtable has to pass them to classic forward path).
   path mtu discovery for these subtests is disabled.

2. nft_flowtable.sh has two passes.  One with fixed mtus/file size and
  one where link mtus and file sizes are random.

The CI failures all have same pattern:
  re-run with random mtus and file size: -o 27663 -l 4117 -r 10089 -s 54384840
  [..]
  PASS: dscp_egress: dscp packet counters match
  FAIL: file mismatch for ns1 -> ns2

In some cases this error triggers a bit ealier, sometimes in a later
subtest:
  re-run with random mtus and file size: -o 20201 -l 4555 -r 12657 -s 9405856
  [..]
  PASS: dscp_egress: dscp packet counters match
  PASS: dscp_fwd: dscp packet counters match
  2025/08/17 20:37:52 socat[18954] E write(7, 0x560716b96000, 8192): Broken pipe
  FAIL: file mismatch for ns1 -> ns2
  -rw------- 1 root root 9405856 Aug 17 20:36 /tmp/tmp.2n63vlTrQe

But all logs I saw show same scenario:
1. Failing tests have pmtu discovery off (i.e., ip fragmentation)
2. The test file is much larger than first-pass default (2M Byte)
3. peers have much larger MTUs compared to the 'network'.

These errors are very reproducible when re-running the test with
the same commandline arguments.

The timeout became much more prominent with
1d2fbaad7c ("tcp: stronger sk_rcvbuf checks"): reassembled packets
typically have a skb->truesize more than double the skb length.

As that commit is intentional and pmtud-off with
large-tcp-packets-as-fragments is not normal adjust the test to use a
smaller file for the pmtu-off subtests.

While at it, add more information to pass/fail messages and
also run the dscp alteration subtest with pmtu discovery enabled.

Link: https://netdev.bots.linux.dev/contest.html?test=nft-flowtable-sh
Fixes: f84ab63490 ("selftests: netfilter: nft_flowtable.sh: re-run with random mtu sizes")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20250822071330.4168f0db@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250828214918.3385-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-29 18:53:53 -07:00
Linus Torvalds 11e7861d68 ARM:
- Correctly handle 'invariant' system registers for protected VMs
 
 - Improved handling of VNCR data aborts, including external aborts
 
 - Fixes for handling of FEAT_RAS for NV guests, providing a sane
   fault context during SEA injection and preventing the use of
   RASv1p1 fault injection hardware
 
 - Ensure that page table destruction when a VM is destroyed gives an
   opportunity to reschedule
 
 - Large fix to KVM's infrastructure for managing guest context loaded
   on the CPU, addressing issues where the output of AT emulation
   doesn't get reflected to the guest
 
 - Fix AT S12 emulation to actually perform stage-2 translation when
   necessary
 
 - Avoid attempting vLPI irqbypass when GICv4 has been explicitly
   disabled for a VM
 
 - Minor KVM + selftest fixes
 
 RISC-V:
 
 - Fix pte settings within kvm_riscv_gstage_ioremap()
 
 - Fix comments in kvm_riscv_check_vcpu_requests()
 
 - Fix stack overrun when setting vlenb via ONE_REG
 
 x86:
 
 - Use array_index_nospec() to sanitize the target vCPU ID when handling PV
   IPIs and yields as the ID is guest-controlled.
 
 - Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as
   the common case, by far, is that at least one CPU will have entered the
   VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where
   the set of have_run_cpus is empty.
 
 Selftests (not KVM):
 
 - Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var()
   to fix a collision with linux/overflow.h.  The collision generates compiler
   warnings due to the two macros having different meaning.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmix3OMUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOZGAf+K+xTAhbMuY4bK5Sn93/QssYUVsFv
 wWc/q5FXUd8t21eAN+b/qhGF4d71eDuoIUNzOBwbJ9qY/0F42Xgihfr7BarSBBqD
 anqQBnhhtCyPCa1tF8SyBv34HewNKts3bgSxnwo2V2CBGWqomm6cZ9Uh3yALFBGJ
 kqHi0kKql+QL9G9DbRQ8lEJAPnCnktFFtA94T5B+o7yh1vvPeBsK40chH8bi19nh
 vCdoGhNLr+k+MoYpfJ8lyOJ7QctijJBK7OlsteksMvCXKQdfz1/X7TnoF11rb4yV
 MPfMUDOGlIVEBaVBkokyHXXPv0Fg4zGlt/SYzOZWRHIYgQNQ+aSscAKODA==
 =W51r
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Correctly handle 'invariant' system registers for protected VMs

   - Improved handling of VNCR data aborts, including external aborts

   - Fixes for handling of FEAT_RAS for NV guests, providing a sane
     fault context during SEA injection and preventing the use of
     RASv1p1 fault injection hardware

   - Ensure that page table destruction when a VM is destroyed gives an
     opportunity to reschedule

   - Large fix to KVM's infrastructure for managing guest context loaded
     on the CPU, addressing issues where the output of AT emulation
     doesn't get reflected to the guest

   - Fix AT S12 emulation to actually perform stage-2 translation when
     necessary

   - Avoid attempting vLPI irqbypass when GICv4 has been explicitly
     disabled for a VM

   - Minor KVM + selftest fixes

  RISC-V:

   - Fix pte settings within kvm_riscv_gstage_ioremap()

   - Fix comments in kvm_riscv_check_vcpu_requests()

   - Fix stack overrun when setting vlenb via ONE_REG

  x86:

   - Use array_index_nospec() to sanitize the target vCPU ID when
     handling PV IPIs and yields as the ID is guest-controlled.

   - Drop a superfluous cpumask_empty() check when reclaiming SEV
     memory, as the common case, by far, is that at least one CPU will
     have entered the VM, and wbnoinvd_on_cpus_mask() will naturally
     handle the rare case where the set of have_run_cpus is empty.

  Selftests (not KVM):

   - Rename the is_signed_type() macro in kselftest_harness.h to
     is_signed_var() to fix a collision with linux/overflow.h. The
     collision generates compiler warnings due to the two macros having
     different meaning"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits)
  KVM: arm64: nv: Fix ATS12 handling of single-stage translation
  KVM: arm64: Remove __vcpu_{read,write}_sys_reg_{from,to}_cpu()
  KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
  KVM: arm64: Simplify sysreg access on exception delivery
  KVM: arm64: Check for SYSREGS_ON_CPU before accessing the 32bit state
  RISC-V: KVM: fix stack overrun when loading vlenb
  RISC-V: KVM: Correct kvm_riscv_check_vcpu_requests() comment
  RISC-V: KVM: Fix pte settings within kvm_riscv_gstage_ioremap()
  KVM: arm64: selftests: Sync ID_AA64MMFR3_EL1 in set_id_regs
  KVM: arm64: Get rid of ARM64_FEATURE_MASK()
  KVM: arm64: Make ID_AA64PFR1_EL1.RAS_frac writable
  KVM: arm64: Make ID_AA64PFR0_EL1.RAS writable
  KVM: arm64: Ignore HCR_EL2.FIEN set by L1 guest's EL2
  KVM: arm64: Handle RASv1p1 registers
  arm64: Add capability denoting FEAT_RASv1p1
  KVM: arm64: Reschedule as needed when destroying the stage-2 page-tables
  KVM: arm64: Split kvm_pgtable_stage2_destroy()
  selftests: harness: Rename is_signed_type() to avoid collision with overflow.h
  KVM: SEV: don't check have_run_cpus in sev_writeback_caches()
  KVM: arm64: Correctly populate FAR_EL2 on nested SEA injection
  ...
2025-08-29 13:54:26 -07:00
Jakub Kicinski d23ad54de7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc4).

No conflicts.

Adjacent changes:

drivers/net/ethernet/intel/idpf/idpf_txrx.c
  02614eee26 ("idpf: do not linearize big TSO packets")
  6c4e684802 ("idpf: remove obsolete stashing code")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-29 11:48:01 -07:00
Paolo Bonzini 42a0305ab1 KVM/arm64 changes for 6.17, take #2
- Correctly handle 'invariant' system registers for protected VMs
 
  - Improved handling of VNCR data aborts, including external aborts
 
  - Fixes for handling of FEAT_RAS for NV guests, providing a sane
    fault context during SEA injection and preventing the use of
    RASv1p1 fault injection hardware
 
  - Ensure that page table destruction when a VM is destroyed gives an
    opportunity to reschedule
 
  - Large fix to KVM's infrastructure for managing guest context loaded
    on the CPU, addressing issues where the output of AT emulation
    doesn't get reflected to the guest
 
  - Fix AT S12 emulation to actually perform stage-2 translation when
    necessary
 
  - Avoid attempting vLPI irqbypass when GICv4 has been explicitly
    disabled for a VM
 
  - Minor KVM + selftest fixes
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCaLC0JBccb2xpdmVyLnVw
 dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFogJAQCyxHd5tuvXWWT/iC2EYFlPWYkU
 LOQbNhus16QjQ9f2ggD8CoA+6UAxzYW7ZU6IzYkDhJkN/3dKQEQhh8Cx0GXXRAs=
 =uky+
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.17-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 changes for 6.17, take #2

 - Correctly handle 'invariant' system registers for protected VMs

 - Improved handling of VNCR data aborts, including external aborts

 - Fixes for handling of FEAT_RAS for NV guests, providing a sane
   fault context during SEA injection and preventing the use of
   RASv1p1 fault injection hardware

 - Ensure that page table destruction when a VM is destroyed gives an
   opportunity to reschedule

 - Large fix to KVM's infrastructure for managing guest context loaded
   on the CPU, addressing issues where the output of AT emulation
   doesn't get reflected to the guest

 - Fix AT S12 emulation to actually perform stage-2 translation when
   necessary

 - Avoid attempting vLPI irqbypass when GICv4 has been explicitly
   disabled for a VM

 - Minor KVM + selftest fixes
2025-08-29 12:57:31 -04:00
Sebastian Andrzej Siewior 2e62688d58 selftests/futex: Remove the -g parameter from futex_priv_hash
The -g parameter was meant to the test the immutable global hash instead of the
private hash which has been made immutable. The global hash is tested as part
at the end of the regular test. The immutable private hash been removed.

Remove last traces of the immutable private hash.

Fixes: 16adc7f136 ("selftests/futex: Remove support for IMMUTABLE")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/20250827130011.677600-2-bigeasy@linutronix.de
2025-08-29 12:01:17 +02:00
Linus Torvalds d1cf752d58 block-6.17-20250828
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiwwhkQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnooD/92QqpW1IOn0hs0WGOiKl7se5JgBuLV9bd0
 j7M8EpGyFK63zeDgAO0/Qu2arEva9tlDaW9S/e78kt1PwQ7it82RSlPC47nKWEv+
 Xn5npn2SOEkyY/5/a+iVkOsSD++RIM/DnHNBe9vmU6wePR7AthG5Nhl6kOEi5cyw
 e1eLbwB/0eJZzhHYCaowLwdtnMCcwQAEh/FFYz7tSyrgOsDpovKid6D7s2Sv9sGA
 +lOzwimgszr3RF5bOMudU12RtP35gAvyF56iDVMQylHXybuaYYzzVAmxRLD5vrOf
 vC4HSzRxwUCZnIW62TTP29dAB9mik3va069e1xV1le94vdEsgj2HmuV4tpCzs3LN
 4a1q8oEC83QI+cixzsjxf00DaJSkk9msUNsqE+6rcEK0M6z0tRz3mbLXYiCo7V4Y
 eLD7eMsxXkJUSpTwWOIVYiXuM+OvSmFxIEoz4lGnQESKyV0dA86RT5TpOiUCrkW1
 G9nTccxVPZG3i8FKJXmgZLPmpviw+wwzdpjlVShqSGA++/bBXKwAnjAQPSsaVDIS
 HaqhG1IngL9sCcnAy8ZBTEy2TYibasUL48vfCeRhP2u2RZa21zRSOPNFRV9cVvRE
 /wOWQikOqC9ys7zvbLG0OfLQAlejGn1k+k6oEOI6P9x2a0vbjq/UZxIQSkdYEhXG
 x73gWYk28g==
 =/1eG
 -----END PGP SIGNATURE-----

Merge tag 'block-6.17-20250828' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - Fix a lockdep spotted issue on recursive locking for zoned writes, in
   case of errors

 - Update bcache MAINTAINERS entry address for Coly

 - Fix for a ublk release issue, with selftests

 - Fix for a regression introduced in this cycle, where it assumed
   q->rq_qos was always set if the bio flag indicated that

 - Fix for a regression introduced in this cycle, where loop retrieving
   block device sizes got broken

* tag 'block-6.17-20250828' of git://git.kernel.dk/linux:
  bcache: change maintainer's email address
  ublk selftests: add --no_ublk_fixed_fd for not using registered ublk char device
  ublk: avoid ublk_io_release() called after ublk char dev is closed
  block: validate QoS before calling __rq_qos_done_bio()
  blk-zoned: Fix a lockdep complaint about recursive locking
  loop: fix zero sized loop for block special file
2025-08-28 18:51:28 -07:00
Jakub Kicinski c158b5a570 selftests: drv-net: rss_ctx: fix the queue count check
Commit 0d6ccfe6b3 ("selftests: drv-net: rss_ctx: check for all-zero keys")
added a skip exception if NIC has fewer than 3 queues enabled,
but it's just constructing the object, it's not actually rising
this exception.

Before:

  # Exception| net.lib.py.utils.CmdExitFailure: Command failed: ethtool -X enp1s0 equal 3 hkey d1:cc:77:47:9d:ea:15:f2:b9:6c:ef:68:62:c0:45:d5:b0:99:7d:cf:29:53:40:06:3d:8e:b9:bc:d4:70:89:b8:8d:59:04:ea:a9:c2:21:b3:55:b8:ab:6b:d9:48:b4:bd:4c:ff:a5:f0:a8:c2
  not ok 1 rss_ctx.test_rss_key_indir

After:

  ok 1 rss_ctx.test_rss_key_indir # SKIP Device has fewer than 3 queues (or doesn't support queue stats)

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250827173558.3259072-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-28 16:09:14 -07:00
David Matlack 03e073bc4d vfio: selftests: Fix .gitignore for already tracked files
Fix the rules in tools/testing/selftests/vfio/.gitignore to not ignore
some already tracked files (.gitignore, Makefile, lib/libvfio.mk).

This change should be a no-op, since these files are already tracked by git
and thus git will not ignore updates to them even though they match the
ignore rules in the VFIO selftests .gitignore file.

However, they do generate warnings with W=1, as reported by the kernel test
robot.

  $ KBUILD_EXTRA_WARN=1 scripts/misc-check
  tools/testing/selftests/vfio/.gitignore: warning: ignored by one of the .gitignore files
  tools/testing/selftests/vfio/Makefile: warning: ignored by one of the .gitignore files
  tools/testing/selftests/vfio/lib/libvfio.mk: warning: ignored by one of the .gitignore files

Fix this by explicitly un-ignoring the tracked files.

Fixes: 292e9ee22b ("selftests: Create tools/testing/selftests/vfio")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508280918.rFRyiLEU-lkp@intel.com/
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250828185815.382215-1-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-28 13:40:51 -06:00
Ricardo B. Marlière 98857d111c selftests/bpf: Fix bpf_prog_detach2 usage in test_lirc_mode2
Commit e9fc3ce99b ("libbpf: Streamline error reporting for high-level
APIs") redefined the way that bpf_prog_detach2() returns. Therefore, adapt
the usage in test_lirc_mode2_user.c.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250828-selftests-bpf-v1-1-c7811cd8b98c@suse.com
2025-08-28 09:49:46 -07:00
Ming Lei 9b2785ea85 ublk selftests: add --no_ublk_fixed_fd for not using registered ublk char device
Add a new command line option --no_ublk_fixed_fd that excludes the ublk
control device (/dev/ublkcN) from io_uring's registered files array.
When this option is used, only backing files are registered starting
from index 1, while the ublk control device is accessed using its raw
file descriptor.

Add ublk_get_registered_fd() helper function that returns the appropriate
file descriptor for use with io_uring operations.

Key optimizations implemented:
- Cache UBLKS_Q_NO_UBLK_FIXED_FD flag in ublk_queue.flags to avoid
  reading dev->no_ublk_fixed_fd in fast path
- Cache ublk char device fd in ublk_queue.ublk_fd for fast access
- Update ublk_get_registered_fd() to use ublk_queue * parameter
- Update io_uring_prep_buf_register/unregister() to use ublk_queue *
- Replace ublk_device * access with ublk_queue * access in fast paths

Also pass --no_ublk_fixed_fd to test_stress_04.sh for covering
plain ublk char device mode.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250827121602.2619736-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-28 07:56:57 -06:00
Eric Dumazet 51132b99f0 udp: add drop_counters to udp socket
When a packet flood hits one or more UDP sockets, many cpus
have to update sk->sk_drops.

This slows down other cpus, because currently
sk_drops is in sock_write_rx group.

Add a socket_drop_counters structure to udp sockets.

Using dedicated cache lines to hold drop counters
makes sure that consumers no longer suffer from
false sharing if/when producers only change sk->sk_drops.

This adds 128 bytes per UDP socket.

Tested with the following stress test, sending about 11 Mpps
to a dual socket AMD EPYC 7B13 64-Core.

super_netperf 20 -t UDP_STREAM -H DUT -l10 -- -n -P,1000 -m 120
Note: due to socket lookup, only one UDP socket is receiving
packets on DUT.

Then measure receiver (DUT) behavior. We can see both
consumer and BH handlers can process more packets per second.

Before:

nstat -n ; sleep 1 ; nstat | grep Udp
Udp6InDatagrams                 615091             0.0
Udp6InErrors                    3904277            0.0
Udp6RcvbufErrors                3904277            0.0

After:

nstat -n ; sleep 1 ; nstat | grep Udp
Udp6InDatagrams                 816281             0.0
Udp6InErrors                    7497093            0.0
Udp6RcvbufErrors                7497093            0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250826125031.1578842-5-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-28 13:14:50 +02:00
Zi Yan 5bbc2b785e selftests/mm: fix FORCE_READ to read input value correctly
FORCE_READ() converts input value x to its pointer type then reads from
address x.  This is wrong.  If x is a non-pointer, it would be caught it
easily.  But all FORCE_READ() callers are trying to read from a pointer
and FORCE_READ() basically reads a pointer to a pointer instead of the
original typed pointer.  Almost no access violation was found, except the
one from split_huge_page_test.

Fix it by implementing a simplified READ_ONCE() instead.

Link: https://lkml.kernel.org/r/20250805175140.241656-1-ziy@nvidia.com
Fixes: 3f6bfd4789 ("selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));"")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-27 22:45:42 -07:00
Matt Fleming 737433c6a5 selftests/bpf: Add LPM trie microbenchmarks
Add benchmarks for the standard set of operations: LOOKUP, INSERT,
UPDATE, DELETE. Also include benchmarks to measure the overhead of the
bench framework itself (NOOP) as well as the overhead of generating keys
(BASELINE). Lastly, this includes a benchmark for FREE (trie_free())
which is known to have terrible performance for maps with many entries.

Benchmarks operate on tries without gaps in the key range, i.e. each
test begins or ends with a trie with valid keys in the range [0,
nr_entries). This is intended to cause maximum branching when traversing
the trie.

LOOKUP, UPDATE, DELETE, and FREE fill a BPF LPM trie from userspace
using bpf_map_update_batch() and run the corresponding benchmark
operation via bpf_loop(). INSERT starts with an empty map and fills it
kernel-side from bpf_loop(). FREE records the time to free a filled LPM
trie by attaching and destroying a BPF prog. NOOP measures the overhead
of the test harness by running an empty function with bpf_loop().
BASELINE is similar to NOOP except that the function generates a key.

Each operation runs 10,000 times using bpf_loop(). Note that this value
is intentionally independent of the number of entries in the LPM trie so
that the stability of the results isn't affected by the number of
entries.

For those benchmarks that need to reset the LPM trie once it's full
(INSERT) or empty (DELETE), throughput and latency results are scaled by
the fraction of a second the operation actually ran to ignore any time
spent reinitialising the trie.

By default, benchmarks run using sequential keys in the range [0,
nr_entries). BASELINE, LOOKUP, and UPDATE can use random keys via the
--random parameter but beware there is a runtime cost involved in
generating random keys. Other benchmarks are prohibited from using
random keys because it can skew the results, e.g. when inserting an
existing key or deleting a missing one.

All measurements are recorded from within the kernel to eliminate
syscall overhead. Most benchmarks run an XDP program to generate stats
but FREE needs to collect latencies using fentry/fexit on
map_free_deferred() because it's not possible to use fentry directly on
lpm_trie.c since commit c83508da56 ("bpf: Avoid deadlock caused by
nested kprobe and fentry bpf programs") and there's no way to
create/destroy a map from within an XDP program.

Here is example output from an AMD EPYC 9684X 96-Core machine for each
of the benchmarks using a trie with 10K entries and a 32-bit prefix
length, e.g.

  $ ./bench lpm-trie-$op \
  	--prefix_len=32  \
	--producers=1     \
	--nr_entries=10000

     noop: throughput   74.417 ± 0.032 M ops/s ( 74.417M ops/prod), latency   13.438 ns/op
 baseline: throughput   70.107 ± 0.171 M ops/s ( 70.107M ops/prod), latency   14.264 ns/op
   lookup: throughput    8.467 ± 0.047 M ops/s (  8.467M ops/prod), latency  118.109 ns/op
   insert: throughput    2.440 ± 0.015 M ops/s (  2.440M ops/prod), latency  409.290 ns/op
   update: throughput    2.806 ± 0.042 M ops/s (  2.806M ops/prod), latency  356.322 ns/op
   delete: throughput    4.625 ± 0.011 M ops/s (  4.625M ops/prod), latency  215.613 ns/op
     free: throughput    0.578 ± 0.006 K ops/s (  0.578K ops/prod), latency    1.730 ms/op

And the same benchmarks using random keys:

  $ ./bench lpm-trie-$op \
  	--prefix_len=32  \
	--producers=1     \
	--nr_entries=10000 \
	--random

     noop: throughput   74.259 ± 0.335 M ops/s ( 74.259M ops/prod), latency   13.466 ns/op
 baseline: throughput   35.150 ± 0.144 M ops/s ( 35.150M ops/prod), latency   28.450 ns/op
   lookup: throughput    7.119 ± 0.048 M ops/s (  7.119M ops/prod), latency  140.469 ns/op
   insert: N/A
   update: throughput    2.736 ± 0.012 M ops/s (  2.736M ops/prod), latency  365.523 ns/op
   delete: N/A
     free: N/A

Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20250827140149.1001557-1-matt@readmodwrite.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-27 17:28:14 -07:00
Puranjay Mohan 22b22bf9ee selftests/bpf: Enable timed may_goto tests for arm64
As arm64 JIT now supports timed may_goto instruction, make sure all
relevant tests run on this architecture. Some tests were enabled and
other required modifications to work properly on arm64.

 $ ./test_progs -a "stream*","*may_goto*",verifier_bpf_fastcall

 #404     stream_errors:OK
 [...]
 #406/2   stream_success/stream_cond_break:OK
 [...]
 #494/23  verifier_bpf_fastcall/may_goto_interaction_x86_64:SKIP
 #494/24  verifier_bpf_fastcall/may_goto_interaction_arm64:OK
 [...]
 #539/1   verifier_may_goto_1/may_goto 0:OK
 #539/2   verifier_may_goto_1/batch 2 of may_goto 0:OK
 #539/3   verifier_may_goto_1/may_goto batch with offsets 2/1/0:OK
 #539/4   verifier_may_goto_1/may_goto batch with offsets 2/0:OK
 #539     verifier_may_goto_1:OK
 #540/1   verifier_may_goto_2/C code with may_goto 0:OK
 #540     verifier_may_goto_2:OK
 Summary: 7/16 PASSED, 25 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20250827113245.52629-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-27 17:16:22 -07:00
Jiawei Zhao 69424097ee selftests/bpf: Enrich subtest_basic_usdt case in selftests to cover SIB handling logic
When using GCC on x86-64 to compile an usdt prog with -O1 or higher
optimization, the compiler will generate SIB addressing mode for global
array, e.g. "1@-96(%rbp,%rax,8)".

In this patch:
- enrich subtest_basic_usdt test case to cover SIB addressing usdt argument spec
  handling logic

Signed-off-by: Jiawei Zhao <phoenix500526@163.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250827053128.1301287-3-phoenix500526@163.com
2025-08-27 15:48:05 -07:00
Shubham Sharma d3abefe897 selftests/bpf: Fix typos and grammar in test sources
Fix spelling typos and grammar errors in BPF selftests source code.

Signed-off-by: Shubham Sharma <slopixelz@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250826125746.17983-1-slopixelz@gmail.com
2025-08-27 15:13:08 -07:00
Nandakumar Edamana 2660b9d477 bpf: Add selftest to check the verifier's abstract multiplication
Add new selftest to test the abstract multiplication technique(s) used
by the verifier, following the recent improvement in tnum
multiplication (tnum_mul). One of the newly added programs,
verifier_mul/mul_precise, results in a false positive with the old
tnum_mul, while the program passes with the latest one.

Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250826034524.2159515-2-nandakumar@nandakumar.co.in
2025-08-27 15:00:31 -07:00
David Matlack fd134b0f2f vfio: selftests: Add a script to help with running VFIO selftests
Introduce run.sh, a script to help with running VFIO selftests. The
script is intended to be used for both humans manually running VFIO
selftests, and to incorporate into test automation where VFIO selftests
may run alongside other tests. As such the script aims to be hermetic,
returning the system to the state it was before the test started.

The script takes as input the BDF of a device to use and a command to
run (typically the command would be a VFIO selftest). e.g.

  $ ./run.sh -d 0000:6a:01.0 ./vfio_pci_device_test

 or

  $ ./run.sh -d 0000:6a:01.0 -- ./vfio_pci_device_test

The script then handles unbinding device 0000:6a:01.0 from its current
driver, binding it to vfio-pci, running the test, unbinding from
vfio-pci, and binding back to the original driver.

When run.sh runs the provided test, it does so by appending the BDF as
the last parameter. For example:

  $ ./run.sh -d 0000:6a:01.0 -- echo hello

Results in the following being printed to stdout:

  hello 0000:6a:01.0

The script also supports a mode where it can break out into a shell so
that multiple tests can be run manually.

  $ ./run.sh -d 0000:6a:01.0 -s
  $ echo $VFIO_SELFTESTS_BDF
  $ ./vfio_pci_device_test
  $ exit

Choosing which device to use is up to the user.

In the future this script should be extensible to tests that want to use
multiple devices. The script can support accepting -d BDF multiple times
and parse them into an array, setup all the devices, pass the list of
BDFs to the test, and then cleanup all the devices.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-31-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:11 -06:00
David Matlack 8afcbe2047 vfio: selftests: Make iommufd the default iommu_mode
Now that VFIO selftests support iommufd, make it the default mode.
IOMMUFD is the successor to VFIO_TYPE1{,v2}_IOMMU and all new features
are being added there, so it's a slightly better fit as the default
mode.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-30-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:11 -06:00
David Matlack 61cbfe5014 vfio: selftests: Add iommufd mode
Add a new IOMMU mode for using iommufd directly. In this mode userspace
opens /dev/iommu and binds it to a device FD acquired through
/dev/vfio/devices/vfioX.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-29-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:11 -06:00
David Matlack d1a17495bb vfio: selftests: Add iommufd_compat_type1{,v2} modes
Add new IOMMU modes for using iommufd in compatibility mode with
VFIO_TYPE1_IOMMU and VFIO_TYPE1v2_IOMMU.

In these modes, VFIO selftests will open /dev/iommu and treats it as a
container FD (as if it had opened /dev/vfio/vfio) and the kernel
translates the container ioctls to iommufd calls transparently.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-28-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:10 -06:00
David Matlack 0969c685ba vfio: selftests: Add vfio_type1v2_mode
Add a new IOMMU mode for using VFIO_TYPE1v2_IOMMU.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-27-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:10 -06:00
David Matlack 892aff147a vfio: selftests: Replicate tests across all iommu_modes
Automatically replicate vfio_dma_mapping_test and vfio_pci_driver_test
across all supported IOMMU modes using fixture variants. Both of these
tests exercise DMA mapping to some degree so having automatic coverage
across all IOMMU modes will help catch bugs.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-26-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:10 -06:00
David Matlack 5df9bd6205 vfio: selftests: Encapsulate IOMMU mode
Encapsulate the "IOMMU mode" a test should use behind a new struct.
In the future this will be used to support other types of IOMMUs besides
VFIO_TYPE1_IOMMU, and allow users to select the mode on the command
line.

No functional change intended.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-25-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:09 -06:00
David Matlack 118e073ef6 vfio: selftests: Move helper to get cdev path to libvfio
Move the helper function to get the VFIO cdev path to libvfio so that it
can be used in libvfio in a subsequent commit.

No functional change intended.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-24-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:09 -06:00
David Matlack 35b05bd962 vfio: selftests: Add driver for Intel DSA
Add a driver to VFIO selftests for Intel DSA devices.

For now the driver only supports up to 32 batches and 1024 copies per
batch, which were the limits of the hardware this commit was tested
with. This is sufficient to generate 9+ minutes of DMA memcpys at a rate
of over 30 GB/s. This should be plenty to stress test VFIO and the IOMMU.

The driver does not yet support requesting interrupt handles, as this
commit was not tested against hardware that requires it.

Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-23-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:09 -06:00
David Matlack 2223587df5 vfio: selftests: Add driver for Intel CBDMA
Add a driver for the Intel CBDMA device. This driver is based on and
named after the Linux driver for this device (drivers/dma/ioat/) and
also based on previous work from Peter Shier <pshier@google.com>.

The driver aims to be as simple as possible. It uses a single descriptor
to issue DMA operations, and only supports the copy operation. For "DMA
storms", the driver kicks off the maximum number of maximum-sized DMA
operations. On Skylake server parts, this was 2^16-1 copies of size 2M
and lasts about 15 seconds.

Create symlinks to drivers/dma/ioat/{hw.h,registers.h} to get access to
various macros (e.g. IOAT_CHANCMD_RESET) and struct ioat_dma_descriptor.

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-20-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:08 -06:00
David Matlack fded8da4bc vfio: sefltests: Add vfio_pci_driver_test
Add a new selftest that tests all driver operations. This test serves
both as a demonstration of the driver framework, and also as a
correctness test for future drivers.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-14-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:05 -06:00
David Matlack 1b197032ac vfio: selftests: Add driver framework
Add a driver framework to VFIO selftests, so that devices can generate
DMA and interrupts in a common way that can be then utilized by tests.
This will enable VFIO selftests to exercise real hardware DMA and
interrupt paths, without needing any device-specific code in the test
itself.

Subsequent commits will introduce drivers for specific devices.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-13-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:05 -06:00
David Matlack 50d8fe805f vfio: selftests: Add a helper for matching vendor+device IDs
Add a helper function for matching a device against a given vendor and
device ID. This will be used in a subsequent commit to match devices
against drivers.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-12-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:05 -06:00
David Matlack 924947804f vfio: selftests: Enable asserting MSI eventfds not firing
Make it possible to assert that a given MSI eventfd did _not_ fire by
adding a helper to mark an eventfd non-blocking. Demonstrate this in
vfio_pci_device_test by asserting the MSI eventfd did not fire before
vfio_pci_irq_trigger().

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-11-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:04 -06:00
David Matlack 346cd58f1f vfio: selftests: Keep track of DMA regions mapped into the device
Keep track of the list of DMA regions that are mapped into the device
using a linked list and a new struct vfio_dma_region and use that to add
{__,}to_iova() for converting host virtual addresses into IOVAs.

This will be used in a subsequent commit to map multiple DMA regions
into a device that are then used by drivers.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-10-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:04 -06:00
Josh Hilke 47f861048e vfio: selftests: Validate 2M/1G HugeTLB are mapped as 2M/1G in IOMMU
Update vfio dma mapping test to verify that the IOMMU uses 2M and 1G
mappings when 2M and 1G HugeTLB pages are mapped into a device
respectively.

This validation is done by inspecting the contents of the I/O page
tables via /sys/kernel/debug/iommu/intel/. This validation is skipped if
that directory is not available (i.e. non-Intel IOMMUs).

Signed-off-by: Josh Hilke <jrhilke@google.com>
[reword commit message, refactor code]
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-9-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:04 -06:00
Josh Hilke 751f6b5d06 vfio: selftests: Add DMA mapping tests for 2M and 1G HugeTLB
Add test coverage of mapping 2M and 1G HugeTLB to vfio_dma_mapping_test
using a fixture variant. If there isn't enough HugeTLB memory available
for the test, just skip them.

Signed-off-by: Josh Hilke <jrhilke@google.com>
[switch from command line option to fixture variant]
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-8-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:03 -06:00
Josh Hilke a0fd0af504 vfio: selftests: Add test to reset vfio device.
Add a test to vfio_pci_device_test which resets the device. If reset is
not supported by the device, the test is skipped.

Signed-off-by: Josh Hilke <jrhilke@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-7-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:03 -06:00
Josh Hilke b477e7bcd2 vfio: selftests: Move vfio dma mapping test to their own file
Move the dma_map_unmap test from vfio_pci_device_test to a new test:
vfio_dma_mapping_test. We are going to add more complex dma mapping
tests, so it makes sense to separate this from the vfio pci device
test which is more of a sanity check for vfio pci functionality.

Signed-off-by: Josh Hilke <jrhilke@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-6-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:02 -06:00
Josh Hilke 790588f06e vfio: selftests: Test basic VFIO and IOMMUFD integration
Add a vfio test suite which verifies that userspace can bind and unbind
devices, allocate I/O address space, and attach a device to an IOMMU
domain using the cdev + IOMMUfd VFIO interface.

Signed-off-by: Josh Hilke <jrhilke@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-5-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:02 -06:00
David Matlack 16eadd7c12 vfio: selftests: Introduce vfio_pci_device_test
Introduce a basic VFIO selftest called vfio_pci_device_test to
demonstrate the functionality of the VFIO selftest library and provide
some test coverage of basic VFIO operations, including:

 - Mapping and unmapping DMA
 - Mapping and unmapping BARs
 - Enabling, triggering, and disabling MSI and MSI-x
 - Reading and writing to PCI config space

This test should work with most PCI devices, as long as they are bound
to vfio-pci.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-4-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:02 -06:00
David Matlack 19faf6fd96 vfio: selftests: Add a helper library for VFIO selftests
Add a basic helper library to be used by VFIO selftests.

The basic unit of the library is struct vfio_pci_device, which
represents a single PCI device that is bound to the vfio-pci driver. The
library currently only supports a single device per group and container,
and VFIO IOMMU types.

The code in this library was heavily based on prior work done by
Raghavendra Rao Ananta <rananta@google.com>, and the VFIO_ASSERT*()
macros were written by Vipin Sharma <vipinsh@google.com>.

Separate that Makefile rules for building the library into a separate
script so that the library can be built by and linked into KVM selftests
in a subsequent commit.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-3-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:01 -06:00
David Matlack 292e9ee22b selftests: Create tools/testing/selftests/vfio
Create the directory tools/testing/selftests/vfio with a stub Makefile
and hook it up to the top-level selftests Makefile.

This directory will be used in subsequent commits to host selftests for
the VFIO subsystem.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20250822212518.4156428-2-dmatlack@google.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-27 12:14:01 -06:00
Michal Koutný 3b0dec689a selftests: cgroup: Make test_pids backwards compatible
The predicates in test expect event counting from 73e75e6fc3
("cgroup/pids: Separate semantics of pids.events related to pids.max")
and the test would fail on older kernels. We want to have one version of
tests for all, so detect the feature and skip the test on old kernels.
(The test could even switch to check v1 semantics based on the flag but
keep it simple for now.)

Fixes: 9f34c56602 ("selftests: cgroup: Add basic tests for pids controller")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Tested-by: Sebastian Chlad <sebastian.chlad@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-27 05:59:52 -10:00
Christian Bruel 106fc08b30 selftests: pci_endpoint: Skip IRQ test if IRQ is out of range.
The pci_endpoint_test tests the entire MSI/MSI-X range, which generates
false errors on platforms that do not support the whole range.

Skip the test in such cases and report accordingly.

Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250804170916.3212221-4-christian.bruel@foss.st.com
2025-08-27 18:41:00 +05:30
Chen Linxuan 1a7b13781b selftests: filesystems: Add functional test for the abort file in fusectl
This patch add a simple functional test for the "abort" file
in fusectlfs (/sys/fs/fuse/connections/ID/abort).

A simple fuse daemon is added for testing.

Related discussion can be found in the link below.

Link: https://lore.kernel.org/all/CAOQ4uxjKFXOKQxPpxtS6G_nR0tpw95w0GiO68UcWg_OBhmSY=Q@mail.gmail.com/
Signed-off-by: Chen Linxuan <chenlinxuan@uniontech.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-08-27 14:29:43 +02:00
Sean Christopherson 42188667be KVM: selftests: Add guest_memfd testcase to fault-in on !mmap()'d memory
Add a guest_memfd testcase to verify that a vCPU can fault-in guest_memfd
memory that supports mmap(), but that is not currently mapped into host
userspace and/or has a userspace address (in the memslot) that points at
something other than the target guest_memfd range.  Mapping guest_memfd
memory into the guest is supposed to operate completely independently from
any userspace mappings.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250729225455.670324-25-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-08-27 04:41:34 -04:00
Fuad Tabba a11d7124b4 KVM: selftests: guest_memfd mmap() test when mmap is supported
Expand the guest_memfd selftests to comprehensively test host userspace
mmap functionality for guest_memfd-backed memory when supported by the
VM type.

Introduce new test cases to verify the following:

* Successful mmap operations: Ensure that MAP_SHARED mappings succeed
  when guest_memfd mmap is enabled.

* Data integrity: Validate that data written to the mmap'd region is
  correctly persistent and readable.

* fallocate interaction: Test that fallocate(FALLOC_FL_PUNCH_HOLE)
  correctly zeros out mapped pages.

* Out-of-bounds access: Verify that accessing memory beyond the
  guest_memfd's size correctly triggers a SIGBUS signal.

* Unsupported mmap: Confirm that mmap attempts fail as expected when
  guest_memfd mmap support is not enabled for the specific guest_memfd
  instance or VM type.

* Flag validity: Introduce test_vm_type_gmem_flag_validity() to
  systematically test that only allowed guest_memfd creation flags are
  accepted for different VM types (e.g., GUEST_MEMFD_FLAG_MMAP for
  default VMs, no flags for CoCo VMs).

The existing tests for guest_memfd creation (multiple instances, invalid
sizes), file read/write, file size, and invalid punch hole operations
are integrated into the new test_with_type() framework to allow testing
across different VM types.

Cc: James Houghton <jthoughton@google.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Shivank Garg <shivankg@amd.com>
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250729225455.670324-24-seanjc@google.com>
[Fix default vm_types to use BIT() - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-08-27 04:41:10 -04:00
Fuad Tabba 692f6ecf38 KVM: selftests: Do not use hardcoded page sizes in guest_memfd test
Update the guest_memfd_test selftest to use getpagesize() instead of
hardcoded 4KB page size values.

Using hardcoded page sizes can cause test failures on architectures or
systems configured with larger page sizes, such as arm64 with 64KB
pages. By dynamically querying the system's page size, the test becomes
more portable and robust across different environments.

Additionally, build the guest_memfd_test selftest for arm64.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Shivank Garg <shivankg@amd.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Suggested-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250729225455.670324-23-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-08-27 04:37:03 -04:00
Paolo Bonzini 0dc4a75150 KVM x86 fixes and a selftest fix for 6.17-rcN
- Use array_index_nospec() to sanitize the target vCPU ID when handling PV
    IPIs and yields as the ID is guest-controlled.
 
  - Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as
    the common case, by far, is that at least one CPU will have entered the
    VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where
    the set of have_run_cpus is empty.
 
  - Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var()
    to fix a collision with linux/overflow.h.  The collision generates compiler
    warnings due to the two macros having different implementations.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAminjaMACgkQOlYIJqCj
 N/0QDhAAhZgUqW2BGqGzOU/pjzXr0riJvVsNeAP85pcxygCc8qO8Hg1OWQz50YL5
 q4sitjZ+Ot39bSzjDMiwkrtuX25OsvlTnZDeN/liIim5rKMiYwvoKqQe5PPNxx5M
 NI4dc4B2AMedJ42gP7thBO+sMGf3J07445C69nJ4K9BppHAoZHH40grVeV0oDw+0
 XoujnyjI0KjghkbWgJlg51TZg6et14prjNZeiAuSulSTaMaPBfjPadkjlG1bsBFV
 lDeZypPvsh/ZLhhAgUFjZCKl7+XCKwKeze5MnpwqFYKhEBL8QqS11WGhyNmPFd1u
 spDe7MjMiNMOOyPlWpJktjMJXz908MJKjrn1Rd78iieqVeM1HQyhHAeC26a+A5Xi
 gFI9lrnNbZ4mlas/xyiX+Tld2yR4Ns3zF4D+eSM4KwII6MF+kEcF3j++U+PqLRvh
 M7r+OKjdvry9cIgHZ/5pa3VshdAfTE6EwNPakdPl+D0hVhPqKHIzi0H8rPmkuNwM
 aIKYSCa9SVmU6DS2vh0qWmTgYsc4Nk7W0bBmce7NftI+PDCYl7+GJAiZ1DBt4N9P
 +i9dKK19tYJCRButj5GZXnYzRpQ3WuPBzEv9C63GPwNaRuzAxvJP5ErrkxT2xE/5
 2WJgd+/J+JvQ14o8HtALc7fZckdWflGlN+pyvGyyQnkNRFpBNN0=
 =8zIJ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-fixes-6.17-rc7' of https://github.com/kvm-x86/linux into HEAD

* Use array_index_nospec() to sanitize the target vCPU ID when handling PV
  IPIs and yields as the ID is guest-controlled.

* Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as
  the common case, by far, is that at least one CPU will have entered the
  VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where
  the set of have_run_cpus is empty.

* Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var()
  to fix a collision with linux/overflow.h.  The collision generates compiler
  warnings due to the two macros having different implementations.
2025-08-27 04:25:40 -04:00
Paolo Bonzini 22b2ca023f KVM x86 fixes and a selftest fix for 6.17-rcN
- Use array_index_nospec() to sanitize the target vCPU ID when handling PV
    IPIs and yields as the ID is guest-controlled.
 
  - Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as
    the common case, by far, is that at least one CPU will have entered the
    VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where
    the set of have_run_cpus is empty.
 
  - Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var()
    to fix a collision with linux/overflow.h.  The collision generates compiler
    warnings due to the two macros having different implementations.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAminjaMACgkQOlYIJqCj
 N/0QDhAAhZgUqW2BGqGzOU/pjzXr0riJvVsNeAP85pcxygCc8qO8Hg1OWQz50YL5
 q4sitjZ+Ot39bSzjDMiwkrtuX25OsvlTnZDeN/liIim5rKMiYwvoKqQe5PPNxx5M
 NI4dc4B2AMedJ42gP7thBO+sMGf3J07445C69nJ4K9BppHAoZHH40grVeV0oDw+0
 XoujnyjI0KjghkbWgJlg51TZg6et14prjNZeiAuSulSTaMaPBfjPadkjlG1bsBFV
 lDeZypPvsh/ZLhhAgUFjZCKl7+XCKwKeze5MnpwqFYKhEBL8QqS11WGhyNmPFd1u
 spDe7MjMiNMOOyPlWpJktjMJXz908MJKjrn1Rd78iieqVeM1HQyhHAeC26a+A5Xi
 gFI9lrnNbZ4mlas/xyiX+Tld2yR4Ns3zF4D+eSM4KwII6MF+kEcF3j++U+PqLRvh
 M7r+OKjdvry9cIgHZ/5pa3VshdAfTE6EwNPakdPl+D0hVhPqKHIzi0H8rPmkuNwM
 aIKYSCa9SVmU6DS2vh0qWmTgYsc4Nk7W0bBmce7NftI+PDCYl7+GJAiZ1DBt4N9P
 +i9dKK19tYJCRButj5GZXnYzRpQ3WuPBzEv9C63GPwNaRuzAxvJP5ErrkxT2xE/5
 2WJgd+/J+JvQ14o8HtALc7fZckdWflGlN+pyvGyyQnkNRFpBNN0=
 =8zIJ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-fixes-6.17-rc7' of https://github.com/kvm-x86/linux into HEAD

KVM x86 fixes and a selftest fix for 6.17-rcN

 - Use array_index_nospec() to sanitize the target vCPU ID when handling PV
   IPIs and yields as the ID is guest-controlled.

 - Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as
   the common case, by far, is that at least one CPU will have entered the
   VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where
   the set of have_run_cpus is empty.

 - Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var()
   to fix a collision with linux/overflow.h.  The collision generates compiler
   warnings due to the two macros having different implementations.
2025-08-27 04:18:01 -04:00
David Gow 922d1dde44 kunit: tool: Accept --raw_output=full as an alias of 'all'
I can never remember whether --raw_output takes 'all' or 'full'. No
reason we can't support both.

For the record, 'all' is the recommended, documented option.

Link: https://lore.kernel.org/r/20250730031624.1911689-1-davidgow@google.com
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-08-26 23:35:41 -06:00
Jakub Kicinski a9d533fbba selftests: drv-net: ncdevmem: explicitly set HDS threshold to 0
Make sure we set HDS threshold to 0 if the device supports changing it.
It's required for ZC.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250825180447.2252977-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:28 -07:00
Jakub Kicinski 6351fadbd5 selftests: drv-net: ncdevmem: restore original HDS setting before exiting
Restore HDS settings if we modified them.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250825180447.2252977-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:28 -07:00
Jakub Kicinski b9f4f95298 selftests: drv-net: ncdevmem: restore old channel config
In case changing channel count with provider bound succeeds
unexpectedly - make sure we return to original settings.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250825180447.2252977-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:27 -07:00
Jakub Kicinski 6d04b36c73 selftests: drv-net: ncdevmem: save IDs of flow rules we added
In prep for more selective resetting of ntuple filters
try to save the rule IDs to a table.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250825180447.2252977-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:27 -07:00
Jakub Kicinski 6925f61714 selftests: drv-net: ncdevmem: remove use of error()
Using error() makes it impossible for callers to unwind their
changes. Replace error() calls with proper error handling.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250825180447.2252977-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:27 -07:00
Jakub Kicinski ee3ae27721 selftests: drv-net: hds: restore hds settings
The test currently modifies the HDS settings and doesn't restore them.
This may cause subsequent tests to fail (or pass when they should not).
Add defer()ed reset handling.

Link: https://patch.msgid.link/20250825175939.2249165-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26 17:35:10 -07:00
Ilya Leoshkevich 21bce56940 selftests/bpf: Remove may_goto tests from DENYLIST.s390x
The may_goto instruction is now fully supported on s390x, including the
timed implementation, so remove the respective test from the denylist.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821113339.292434-6-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-26 16:51:52 -07:00
Ilya Leoshkevich 7197dbcba2 selftests/bpf: Enable timed may_goto verifier tests on s390x
Now that the timed may_goto implementation is available on s390x,
enable the respective verifier tests.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821113339.292434-5-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-26 16:51:52 -07:00
Ilya Leoshkevich 1e4e6b9e26 selftests/bpf: Add __arch_s390x macro
Make it possible to limit certain tests to s390x, just like it's
already done for x86_64, arm64, and riscv64.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821113339.292434-4-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-26 16:51:52 -07:00
Ilya Leoshkevich b68dfcc12a selftests/bpf: Add a missing newline to the "bad arch spec" message
Fix error messages like this one:

  parse_test_spec:FAIL:569 bad arch spec: 's390x'process_subtest:FAIL:1153 Can't parse test spec for program 'may_goto_simple'

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821113339.292434-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-26 16:51:52 -07:00
Tiezhu Yang d0f27ff27c selftests/bpf: Remove entries from config.{arch} already present in config
`config.{arch}` had entries already present in `config`.

When generating the config used by vmtest, concatenate the `config` file
with the `config.{arch}` one, making those entries duplicated, so remove
those duplications.

Use the following command to get the differences:

  $ comm -1 -2  <(sort tools/testing/selftests/bpf/config.x86_64) <(sort tools/testing/selftests/bpf/config)
  $ comm -1 -2  <(sort tools/testing/selftests/bpf/config.aarch64) <(sort tools/testing/selftests/bpf/config)
  $ comm -1 -2  <(sort tools/testing/selftests/bpf/config.riscv64) <(sort tools/testing/selftests/bpf/config)
  $ comm -1 -2  <(sort tools/testing/selftests/bpf/config.ppc64el) <(sort tools/testing/selftests/bpf/config)
  $ comm -1 -2  <(sort tools/testing/selftests/bpf/config.s390x) <(sort tools/testing/selftests/bpf/config)

This is similar with commit 7a42af4b94 ("selftests/bpf: Remove entries
from config.s390x already present in config").

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250826065057.11415-1-yangtiezhu@loongson.cn
2025-08-26 13:47:03 +02:00
Oscar Maes bd0d9e751b selftests: net: add test for dst hint mechanism with directed broadcast addresses
Add a test for ensuring that the dst hint mechanism is used for
directed broadcast addresses.

This test relies on mausezahn for sending directed broadcast packets.
Additionally, a high GRO flush timeout is set to ensure that packets
will be received as lists.

The test determines if the hint mechanism was used by checking
the in_brd statistic using lnstat.

Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Link: https://patch.msgid.link/20250819174642.5148-3-oscmaes92@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25 16:07:16 -07:00
Alessandro Ratti e79012967b selftests: rtnetlink: skip tests if tools or feats are missing
Some rtnetlink selftests assume the presence of ifconfig and iproute2
support for the `proto` keyword in `ip address` commands. These
assumptions can cause test failures on modern systems (e.g. Debian
Bookworm) where:

 - ifconfig is not installed by default
 - The iproute2 version lacks support for address protocol

This patch improves test robustness by:

 - Skipping kci_test_promote_secondaries if ifconfig is missing
 - Skipping do_test_address_proto if ip address help does not mention
   proto

These changes ensure the tests degrade gracefully by reporting SKIP
instead of FAIL when prerequisites are not met, improving portability
across systems.

Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Alessandro Ratti <alessandro@0x65c.net>
Link: https://patch.msgid.link/20250822140633.891360-2-alessandro@0x65c.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25 15:22:18 -07:00
Jakub Kicinski 992e9f53a0 selftests: drv-net: xdp: make sure we're actually testing native XDP
Kernel tries to be helpful and attach the XDP program in generic
mode if the driver has no BPF ndo at all. Since the xdp.py tests
all have "native" in their names this can be quite confusing.
Force native / "drv" attachment. Note that netdevsim re-uses
the generic handler as its "native" handler, so we'll maintain
the test coverage of the generic mode that way. No need to test
both explicitly, I reckon.

Link: https://patch.msgid.link/20250822195645.1673390-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25 10:55:55 -07:00
Greg Kroah-Hartman 5b9057cfaf Merge 6.17-rc3 into char-misc-next
We need the char/misc/iio fixes in here as well to build on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-25 09:20:17 +02:00
Tiffany Yang 7b281a4582 cgroup: selftests: Add tests for freezer time
Test cgroup v2 freezer time stat. Freezer time accounting should
be independent of other cgroups in the hierarchy and should increase
iff a cgroup is CGRP_FREEZE (regardless of whether it reaches
CGRP_FROZEN).

Skip these tests on systems without freeze time accounting.

Signed-off-by: Tiffany Yang <ynaffit@google.com>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-22 07:50:52 -10:00
Paul Chaignon 0780f54ab1 selftests/bpf: Tests for is_scalar_branch_taken tnum logic
This patch adds tests for the new jeq and jne logic in
is_scalar_branch_taken. The following shows the first test failing
before the previous patch is applied. Once the previous patch is
applied, the verifier can use the tnum values to deduce that instruction
7 is dead code.

  0: call bpf_get_prandom_u32#7  ; R0_w=scalar()
  1: w0 = w0                     ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  2: r0 >>= 30                   ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0; 0x3))
  3: r0 <<= 30                   ; R0_w=scalar(smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000))
  4: r1 = r0                     ; R0_w=scalar(id=1,smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000)) R1_w=scalar(id=1,smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000))
  5: r1 += 1024                  ; R1_w=scalar(smin=umin=umin32=1024,smax=umax=umax32=0xc0000400,smin32=0x80000400,smax32=0x40000400,var_off=(0x400; 0xc0000000))
  6: if r1 != r0 goto pc+1       ; R0_w=scalar(id=1,smin=umin=umin32=1024,smax=umax=umax32=0xc0000000,smin32=0x80000400,smax32=0x40000000,var_off=(0x400; 0xc0000000)) R1_w=scalar(smin=umin=umin32=1024,smax=umax=umax32=0xc0000000,smin32=0x80000400,smax32=0x40000400,var_off=(0x400; 0xc0000000))
  7: r10 = 0
  frame pointer is read only

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/550004f935e2553bdb2fb1f09cbde7d0452112d0.1755694148.git.paul.chaignon@gmail.com
2025-08-22 18:12:34 +02:00
Dimitri Daskalakis bbd885b193 selftests: drv-net: xdp: Validate single-buff XDP_TX in multi-buff mode
Validate that drivers with multi-buff XDP programs properly reinitialize
xdp_buff between packets.

Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Link: https://patch.msgid.link/20250821014023.1481662-4-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-22 07:46:27 -07:00
Dimitri Daskalakis d06d70eb6a selftests: drv-net: xdp: Add a single-buffer XDP_TX test.
Test single-buffer XDP_TX for packets with various payload sizes.
Update the socat TX command to generate packets with 0 length payloads.

Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Link: https://patch.msgid.link/20250821014023.1481662-3-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-22 07:46:27 -07:00
Dimitri Daskalakis 91aacd8cef selftests: drv-net: xdp: Extract common XDP_TX setup/validation.
In preparation of single-buffer XDP_TX tests, refactor common test code
into the _test_xdp_native_tx method. Add support for multiple payload
sizes, and additional validation for RX packet count. Pass the -n flag
to echo to avoid adding an extra byte into the TX packet.

Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Link: https://patch.msgid.link/20250821014023.1481662-2-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-22 07:46:27 -07:00
Linus Torvalds a2e94e8079 block-6.17-20250822
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiobRkQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnvDEAC6ybsqvNAOSV1Tdk1EQZ/mIrmIb7tVrp/P
 zRReWTK9jF7kzOLn2Mqgu0c4RFLCMABXmPb5F2aLx72uSxMSFq2sI9QZCgGzZQeZ
 yjOIxFBAPsdgr+gyIOdS3zH04+IKfJw20ojJb83irCgd5M1hpmVwzZ3iGMq8Gs9q
 VJQYvKny7tjjpuLpk3DWl7t1J0YV+0sGQhk3iZdWEHrui7mqmfh6DkeB5forTu6z
 Gn5e4DNbZvmcvkJQ+Rnkua1UmTZ4hr/+3YV9mqzsWYv+1hOTx/uomGbY7DjSdSyK
 vWWNwN97sgAjwhaFgWvB2iRk1pdAb4A3zP+NV1MXheOhHnAT3C6i43DaS1fivone
 YKLEqy4v3IzB5WcdlwclJW2qizoLtopu7A4pRURv9v+Q0wb4Q2YM0gRum59QgxZN
 +YUhglR5ucazYPmIAxOZMaU/WMIN6m4h3hRa1RkFRNXkBvPGxV2fQxi8exX0QWqf
 oxSSfImO0QVjYPlAL7oi0eWwHtqXtebXXdrUNozQdnrEQnimTrxPAuSnfRIv63un
 swlaCzfqXXhtl25t9p6Sx7xM7aKF2k7tYnZdSM7JjiOS7KXHFaZcYt3YcoFfdLc7
 X/vtT9OQWwnEtqzFKnK8EvcjSN+4KbXwI4neVLmsWK81dwqI2huScB+Xe5eBPidU
 6d6dZzUikA==
 =mbqK
 -----END PGP SIGNATURE-----

Merge tag 'block-6.17-20250822' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
 "A set of fixes for block that should go into this tree. A bit larger
  than what I usually have at this point in time, a lot of that is the
  continued fixing of the lockdep annotation for queue freezing that we
  recently added, which has highlighted a number of little issues here
  and there. This contains:

   - MD pull request via Yu:

       - Add a legacy_async_del_gendisk mode, to prevent a user tools
         regression. New user tools releases will not use such a mode,
         the old release with a new kernel now will have warning about
         deprecated behavior, and we prepare to remove this legacy mode
         after about a year later

       - The rename in kernel causing user tools build failure, revert
         the rename in mdp_superblock_s

       - Fix a regression that interrupted resync can be shown as
         recover from mdstat or sysfs

   - Improve file size detection for loop, particularly for networked
     file systems, by using getattr to get the size rather than the
     cached inode size.

   - Hotplug CPU lock vs queue freeze fix

   - Lockdep fix while updating the number of hardware queues

   - Fix stacking for PI devices

   - Silence bio_check_eod() for the known case of device removal where
     the size is truncated to 0 sectors"

* tag 'block-6.17-20250822' of git://git.kernel.dk/linux:
  block: avoid cpu_hotplug_lock depedency on freeze_lock
  block: decrement block_rq_qos static key in rq_qos_del()
  block: skip q->rq_qos check in rq_qos_done_bio()
  blk-mq: fix lockdep warning in __blk_mq_update_nr_hw_queues
  block: tone down bio_check_eod
  loop: use vfs_getattr_nosec for accurate file size
  loop: Consolidate size calculation logic into lo_calculate_size()
  block: remove newlines from the warnings in blk_validate_integrity_limits
  block: handle pi_tuple_size in queue_limits_stack_integrity
  selftests: ublk: Use ARRAY_SIZE() macro to improve code
  md: fix sync_action incorrect display during resync
  md: add helper rdev_needs_recovery()
  md: keep recovery_cp in mdp_superblock_s
  md: add legacy_async_del_gendisk mode
2025-08-22 09:29:51 -04:00
Linus Torvalds 6eba757ce9 20 hotfixes. 10 are cc:stable and the remainder address post-6.16 issues
or aren't considered necessary for -stable kernels.  17 of these fixes are
 for MM.
 
 As usual, singletons all over the place, apart from a three-patch series
 of KHO followup work from Pasha which is actually also a bunch of
 singletons.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaKfFVwAKCRDdBJ7gKXxA
 jvZGAQCCRTRgwnYsH0op9Rlxs72zokENbErSzXweWLez31pNpAD/S7bVSjjk1mXr
 BQ24ZadKUUomWkghwCusb9VomMeneg0=
 =+uBT
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2025-08-21-18-17' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "20 hotfixes. 10 are cc:stable and the remainder address post-6.16
  issues or aren't considered necessary for -stable kernels. 17 of these
  fixes are for MM.

  As usual, singletons all over the place, apart from a three-patch
  series of KHO followup work from Pasha which is actually also a bunch
  of singletons"

* tag 'mm-hotfixes-stable-2025-08-21-18-17' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/mremap: fix WARN with uffd that has remap events disabled
  mm/damon/sysfs-schemes: put damos dests dir after removing its files
  mm/migrate: fix NULL movable_ops if CONFIG_ZSMALLOC=m
  mm/damon/core: fix damos_commit_filter not changing allow
  mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn
  MAINTAINERS: mark MGLRU as maintained
  mm: rust: add page.rs to MEMORY MANAGEMENT - RUST
  iov_iter: iterate_folioq: fix handling of offset >= folio size
  selftests/damon: fix selftests by installing drgn related script
  .mailmap: add entry for Easwar Hariharan
  selftests/mm: add test for invalid multi VMA operations
  mm/mremap: catch invalid multi VMA moves earlier
  mm/mremap: allow multi-VMA move when filesystem uses thp_get_unmapped_area
  mm/damon/core: fix commit_ops_filters by using correct nth function
  tools/testing: add linux/args.h header and fix radix, VMA tests
  mm/debug_vm_pgtable: clear page table entries at destroy_args()
  squashfs: fix memory leak in squashfs_fill_super
  kho: warn if KHO is disabled due to an error
  kho: mm: don't allow deferred struct page with KHO
  kho: init new_physxa->phys_bits to fix lockdep
2025-08-22 08:54:34 -04:00
Nikola Z. Ivanov c08e42c9a4 selftests/alsa: remove 0/NULL global variable assignment
Remove 0/NULL global variable assignment in mixer-test.c and pcm-test.c

Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20250821200132.1218850-1-zlatistiv@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-22 08:20:45 +02:00
Mark Brown 01860bcc53 KVM: arm64: selftests: Sync ID_AA64MMFR3_EL1 in set_id_regs
When we added coverage for ID_AA64MMFR3_EL1 we didn't add it to the list
of registers we read in the guest, do so.

Fixes: 0b593ef12a ("KVM: arm64: selftests: Catch up set_id_regs with the kernel")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250818-kvm-arm64-selftests-mmfr3-idreg-v1-1-2f85114d0163@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-21 16:36:30 -07:00
Marc Zyngier 0843e0ced3 KVM: arm64: Get rid of ARM64_FEATURE_MASK()
The ARM64_FEATURE_MASK() macro was a hack introduce whilst the
automatic generation of sysreg encoding was introduced, and was
too unreliable to be entirely trusted.

We are in a better place now, and we could really do without this
macro. Get rid of it altogether.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250817202158.395078-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-21 16:31:56 -07:00
Jakub Kicinski 4dba4a936f bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCaKdsDgAKCRAraS+Z+3y5
 E85CAP0aLdMc7glIhzXITfY8If4HntXCKyDjEABDKlCKMea3MAD/d/lIHXkfL7vE
 ZCZMhFOpDVdQ1ZojpfEs7wLipfbl/Ao=
 =iUY0
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-08-21

We've added 9 non-merge commits during the last 3 day(s) which contain
a total of 13 files changed, 1027 insertions(+), 27 deletions(-).

The main changes are:

1) Added bpf dynptr support for accessing the metadata of a skb,
   from Jakub Sitnicki.
   The patches are merged from a stable branch bpf-next/skb-meta-dynptr.
   The same patches have also been merged into bpf-next/master.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Cover metadata access from a modified skb clone
  selftests/bpf: Cover read/write to skb metadata at an offset
  selftests/bpf: Cover write access to skb metadata via dynptr
  selftests/bpf: Cover read access to skb metadata via dynptr
  selftests/bpf: Parametrize test_xdp_context_tuntap
  selftests/bpf: Pass just bpf_map to xdp_context_test helper
  selftests/bpf: Cover verifier checks for skb_meta dynptr type
  bpf: Enable read/write access to skb metadata through a dynptr
  bpf: Add dynptr type for skb metadata
====================

Link: https://patch.msgid.link/20250821191827.2099022-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21 15:37:16 -07:00
Linus Torvalds d72052ac09 sched_ext: Fixes for v6.17-rc2
- Fix a subtle bug during SCX enabling where a dead task skips init but
   doesn't skip sched class switch leading to invalid task state transition
   warning.
 
 - Cosmetic fix in selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaKdWkg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGWI2AP9e+OTPPHa+sHeM7g3ngigF44nyvvRIPIMJHmZO
 7CYT9AD/e+YI+atHzo5iSBcpGwjW8BSLc0ozdrkI0N7XFLXC4go=
 =7Ti1
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix a subtle bug during SCX enabling where a dead task skips init
   but doesn't skip sched class switch leading to invalid task state
   transition warning

 - Cosmetic fix in selftests

* tag 'sched_ext-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  selftests/sched_ext: Remove duplicate sched.h header
  sched/ext: Fix invalid task state transitions on class switch
2025-08-21 16:02:35 -04:00
Jakub Kicinski a9af709fda Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21 11:33:15 -07:00
Hengqi Chen 21aeabb682 selftests/bpf: Use vmlinux.h for BPF programs
Some of the bpf test progs still use linux/libc headers.
Let's use vmlinux.h instead like the rest of test progs.
This will also ease cross compiling.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250821030254.398826-1-hengqi.chen@gmail.com
2025-08-21 11:32:25 -07:00
Jiri Olsa 9ffc7a635c selftests/seccomp: validate uprobe syscall passes through seccomp
Adding uprobe checks into the current uretprobe tests.

All the related tests are now executed with attached uprobe
or uretprobe or without any probe.

Renaming the test fixture to uprobe, because it seems better.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-22-jolsa@kernel.org
2025-08-21 20:09:26 +02:00
Jiri Olsa 52718438af selftests/bpf: Fix uprobe syscall shadow stack test
Now that we have uprobe syscall working properly with shadow stack,
we can remove testing limitations for shadow stack tests and make
sure uprobe gets properly optimized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250821141557.13233-1-jolsa@kernel.org
2025-08-21 20:09:25 +02:00
Jiri Olsa 3abf4298c6 selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe
Changing the test_uretprobe_regs_change test to test both uprobe
and uretprobe by adding entry consumer handler to the testmod
and making it to change one of the registers.

Making sure that changed values both uprobe and uretprobe handlers
propagate to the user space.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-20-jolsa@kernel.org
2025-08-21 20:09:25 +02:00
Jiri Olsa 275eae6789 selftests/bpf: Add uprobe_regs_equal test
Changing uretprobe_regs_trigger to allow the test for both
uprobe and uretprobe and renaming it to uprobe_regs_equal.

We check that both uprobe and uretprobe probes (bpf programs)
see expected registers with few exceptions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-19-jolsa@kernel.org
2025-08-21 20:09:25 +02:00
Jiri Olsa 875e1705ad selftests/bpf: Add optimized usdt variant for basic usdt test
Adding optimized usdt variant for basic usdt test to check that
usdt arguments are properly passed in optimized code path.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-18-jolsa@kernel.org
2025-08-21 20:09:24 +02:00
Jiri Olsa c11661bd9a selftests/bpf: Add uprobe syscall sigill signal test
Make sure that calling uprobe syscall from outside uprobe trampoline
results in sigill signal.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-17-jolsa@kernel.org
2025-08-21 20:09:24 +02:00
Jiri Olsa c8be59667c selftests/bpf: Add hit/attach/detach race optimized uprobe test
Adding test that makes sure parallel execution of the uprobe and
attach/detach of optimized uprobe on it works properly.

By default the test runs for 500ms, which is adjustable by using
BPF_SELFTESTS_UPROBE_SYSCALL_RACE_MSEC env variable.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-16-jolsa@kernel.org
2025-08-21 20:09:24 +02:00
Jiri Olsa d5c86c3370 selftests/bpf: Add uprobe/usdt syscall tests
Adding tests for optimized uprobe/usdt probes.

Checking that we get expected trampoline and attached bpf programs
get executed properly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-15-jolsa@kernel.org
2025-08-21 20:09:24 +02:00
Jiri Olsa 7932c4cf57 selftests/bpf: Rename uprobe_syscall_executed prog to test_uretprobe_multi
Renaming uprobe_syscall_executed prog to test_uretprobe_multi
to fit properly in the following changes that add more programs.

Plus adding pid filter and increasing executed variable.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-14-jolsa@kernel.org
2025-08-21 20:09:23 +02:00
Jiri Olsa 4e7005223e selftests/bpf: Reorg the uprobe_syscall test function
Adding __test_uprobe_syscall with non x86_64 stub to execute all the tests,
so we don't need to keep adding non x86_64 stub functions for new tests.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-13-jolsa@kernel.org
2025-08-21 20:09:23 +02:00
Jiri Olsa 17c3b00157 selftests/bpf: Import usdt.h from libbpf/usdt project
Importing usdt.h from libbpf/usdt project.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250720112133.244369-12-jolsa@kernel.org
2025-08-21 20:09:23 +02:00
Cryolitia PukNgae e5b71dd3ad selftests: net: fix memory leak in tls.c
To free memory and close fd after use

Suggested-by: Jun Zhan <zhanjun@uniontech.com>
Signed-off-by: Cryolitia PukNgae <cryolitia@uniontech.com>
Link: https://patch.msgid.link/20250819-memoryleak-v1-1-d4c70a861e62@uniontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21 10:57:12 -07:00
Linus Torvalds 6439a0e64c Including fixes from Bluetooth.
Current release - fix to a fix:
 
  - usb: asix_devices: fix PHY address mask in MDIO bus initialization
 
 Current release - regressions:
 
  - Bluetooth: fixes for the split between BIS_LINK and PA_LINK
 
  - Revert "net: cadence: macb: sama7g5_emac: Remove USARIO CLKEN flag",
    breaks compatibility with some existing device tree blobs
 
  - dsa: b53: fix reserved register access in b53_fdb_dump()
 
 Current release - new code bugs:
 
  - sched: dualpi2: run probability update timer in BH to avoid deadlock
 
  - eth: libwx: fix the size in RSS hash key population
 
  - pse-pd: pd692x0: improve power budget error paths and handling
 
 Previous releases - regressions:
 
  - tls: fix handling of zero-length records on the rx_list
 
  - hsr: reject HSR frame if skb can't hold tag
 
  - bonding: fix negotiation flapping in 802.3ad passive mode
 
 Previous releases - always broken:
 
  - gso: forbid IPv6 TSO with extensions on devices with only IPV6_CSUM
 
  - sched: make cake_enqueue return NET_XMIT_CN when past buffer_limit,
    avoid packet drops with low buffer_limit, remove unnecessary WARN()
 
  - sched: fix backlog accounting after modifying config of a qdisc
    in the middle of the hierarchy
 
  - mptcp: improve handling of skb extension allocation failures
 
  - eth: mlx5:
    - fixes for the "HW Steering" flow management method
    - fixes for QoS and device buffer management
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAminVEUACgkQMUZtbf5S
 Irtt/g//YoT8G6Wzv3unTf+qYl/pia7936VqW/W51A7UyCZp2bfDi8/r8eIl+bCf
 8KbUH2nSTUPEsngRrzoXhbVAXs6AFe3WKA89efCeal+GFPPA43A5+BZKaXBRE14M
 DgSCo1mYjrVIMENv33LttjQOYjLIBu4finciqIkLj/rOeyRZUURlModQW+aA800K
 QhXKt0BYniL7tg4ChW3/TQMNpRVLnfae6cu/CKjv/rmzdioVZRChC/cGPI0xctpq
 kAIJa49i6dNGtNMWFtS+YaDjTFWS05R9y5bSiylLUnqply9+sWrF1VZIGHXMOcv8
 uzxLIlZhvSQycETMinPf0w02H3PuuqpKH8X+1c77Z7/wHmCn5ht2QVv1BwgFu4hv
 MZm+zRYKZDdkdYCqujD/XmqF/D7t49LIql8WctxwgWus2yTfg089+9H2VHEj2kyl
 S6FBI6f3ZwwLGh6MCAcqswbGc/6QSVFg2ndW6QFxl85Py3fTefV0Y3OknSkXqom/
 2azpWNX6zoKfB6QSW4ZrfPFvuDAV8Ai5OJCTpw/3gkbquUMgx+mN2wf2/t8hPbci
 Qkh5Y9/44gQFY9KaLx8N7Xr1FBTQ4c21QnG67gvflw3mNMdr3ot0isGTCtjty/eC
 abQ+KvQwxy8rVnlunYC8REAAFxQi3AtCiFBE/vioVApNQtsqm68=
 =waf0
 -----END PGP SIGNATURE-----

Merge tag 'net-6.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth.

  Current release - fix to a fix:

   - usb: asix_devices: fix PHY address mask in MDIO bus initialization

  Current release - regressions:

   - Bluetooth: fixes for the split between BIS_LINK and PA_LINK

   - Revert "net: cadence: macb: sama7g5_emac: Remove USARIO CLKEN
     flag", breaks compatibility with some existing device tree blobs

   - dsa: b53: fix reserved register access in b53_fdb_dump()

  Current release - new code bugs:

   - sched: dualpi2: run probability update timer in BH to avoid
     deadlock

   - eth: libwx: fix the size in RSS hash key population

   - pse-pd: pd692x0: improve power budget error paths and handling

  Previous releases - regressions:

   - tls: fix handling of zero-length records on the rx_list

   - hsr: reject HSR frame if skb can't hold tag

   - bonding: fix negotiation flapping in 802.3ad passive mode

  Previous releases - always broken:

   - gso: forbid IPv6 TSO with extensions on devices with only IPV6_CSUM

   - sched: make cake_enqueue return NET_XMIT_CN when past buffer_limit,
     avoid packet drops with low buffer_limit, remove unnecessary WARN()

   - sched: fix backlog accounting after modifying config of a qdisc in
     the middle of the hierarchy

   - mptcp: improve handling of skb extension allocation failures

   - eth: mlx5:
       - fixes for the "HW Steering" flow management method
       - fixes for QoS and device buffer management"

* tag 'net-6.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
  netfilter: nf_reject: don't leak dst refcount for loopback packets
  net/mlx5e: Preserve shared buffer capacity during headroom updates
  net/mlx5e: Query FW for buffer ownership
  net/mlx5: Restore missing scheduling node cleanup on vport enable failure
  net/mlx5: Fix QoS reference leak in vport enable error path
  net/mlx5: Destroy vport QoS element when no configuration remains
  net/mlx5e: Preserve tc-bw during parent changes
  net/mlx5: Remove default QoS group and attach vports directly to root TSAR
  net/mlx5: Base ECVF devlink port attrs from 0
  net: pse-pd: pd692x0: Skip power budget configuration when undefined
  net: pse-pd: pd692x0: Fix power budget leak in manager setup error path
  Octeontx2-af: Skip overlap check for SPI field
  selftests: tls: add tests for zero-length records
  tls: fix handling of zero-length records on the rx_list
  net: airoha: ppe: Do not invalid PPE entries in case of SW hash collision
  selftests: bonding: add test for passive LACP mode
  bonding: send LACPDUs periodically in passive mode after receiving partner's LACPDU
  bonding: update LACP activity flag after setting lacp_active
  Revert "net: cadence: macb: sama7g5_emac: Remove USARIO CLKEN flag"
  ipv6: sr: Fix MAC comparison to be constant-time
  ...
2025-08-21 13:51:15 -04:00
Jakub Kicinski a61a3e961b selftests: tls: add tests for zero-length records
Test various combinations of zero-length records.
Unfortunately, kernel cannot be coerced into producing those,
so hardcode the ciphertext messages in the test.

Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20250820021952.143068-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21 07:52:31 -07:00
Hangbin Liu 87951b5664 selftests: bonding: add test for passive LACP mode
Add a selftest to verify bonding behavior when `lacp_active` is set to `off`.

The test checks the following:
- The passive LACP bond should not send LACPDUs before receiving a partner's
  LACPDU.
- The transmitted LACPDUs must not include the active flag.
- After transitioning to EXPIRED and DEFAULTED states, the passive side should
  still not initiate LACPDUs.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250815062000.22220-4-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-21 09:35:21 +02:00
Kuniyuki Iwashima a5c10aa3d1 selftests/net: packetdrill: Support single protocol test.
Currently, we cannot write IPv4 or IPv6 specific packetdrill tests
as ksft_runner.sh runs each .pkt file for both protocols.

Let's support single protocol test by checking --ip_version in the
.pkt file.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250819231527.1427361-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-20 19:39:25 -07:00
Hangbin Liu 781bf2cc06 selftests: rtnetlink: print device info on preferred_lft test failure
Even with slowwait used to avoid system sleep in the preferred_lft test,
failures can still occur after long runtimes.

Print the device address info when the test fails to provide better
troubleshooting data.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250819074749.388064-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-20 19:28:08 -07:00
Hangbin Liu eacb6e408d selftests: net: bpf_offload: print loaded programs on mismatch
The test sometimes fails due to an unexpected number of loaded programs. e.g

  FAIL: 2 BPF programs loaded, expected 1
    File "/usr/libexec/kselftests/net/./bpf_offload.py", line 940, in <module>
      progs = bpftool_prog_list(expected=1)
    File "/usr/libexec/kselftests/net/./bpf_offload.py", line 187, in bpftool_prog_list
      fail(True, "%d BPF programs loaded, expected %d" %
    File "/usr/libexec/kselftests/net/./bpf_offload.py", line 89, in fail
      tb = "".join(traceback.extract_stack().format())

However, the logs do not show which programs were actually loaded, making it
difficult to debug the failure.

Add printing of the loaded programs when a mismatch is detected to help
troubleshoot such errors. The list is printed on a new line to avoid breaking
the current log format.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250819073348.387972-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-20 19:28:03 -07:00
Alex Tran 6b4b1d577e selftests/net/socket.c: removed warnings from unused returns
socket.c: In function ‘run_tests’:
socket.c:59:25: warning: ignoring return value of ‘strerror_r’ \
declared with attribute ‘warn_unused_result’ [-Wunused-result]
59 | strerror_r(-s->expect, err_string1, ERR_STRING_SZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

socket.c:60:25: warning: ignoring return value of ‘strerror_r’ \
declared with attribute ‘warn_unused_result’ [-Wunused-result]
60 | strerror_r(errno, err_string2, ERR_STRING_SZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

socket.c:73:33: warning: ignoring return value of ‘strerror_r’ \
declared with attribute ‘warn_unused_result’ [-Wunused-result]
73 | strerror_r(errno, err_string1, ERR_STRING_SZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

changelog:
v2
- const char* messages and fixed patch warnings of max 75 chars
  per line

Signed-off-by: Alex Tran <alex.t.tran@gmail.com>
Link: https://patch.msgid.link/20250819025227.239885-1-alex.t.tran@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-20 19:26:21 -07:00
Sean Christopherson dce1b33ed7 selftests: harness: Rename is_signed_type() to avoid collision with overflow.h
Rename is_signed_type() to is_signed_var() to avoid colliding with a macro
of the same name defined by tools' linux/overflow.h.  This fixes warnings
(and presumably potential test failures) in tests that utilize the
selftests harness and happen to (indirectly) include overflow.h.

  In file included from tools/include/linux/bits.h:34,
                   from tools/include/linux/bitops.h:14,
                   from tools/include/linux/hashtable.h:13,
                   from include/kvm_util.h:11,
                   from x86/userspace_msr_exit_test.c:11:
  tools/include/linux/overflow.h:31:9: error: "is_signed_type" redefined [-Werror]
     31 | #define is_signed_type(type)       (((type)(-1)) < (type)1)
        |         ^~~~~~~~~~~~~~
  In file included from include/kvm_test_harness.h:11,
                   from x86/userspace_msr_exit_test.c:9:
  ../kselftest_harness.h:754:9: note: this is the location of the previous definition
    754 | #define is_signed_type(var)       (!!(((__typeof__(var))(-1)) < (__typeof__(var))1))
        |         ^~~~~~~~~~~~~~

Use a separate definition, at least for now, as many selftests build
without tools/include in their include path.

Fixes: fc92099902 ("tools headers: Synchronize linux/bits.h with the kernel sources")
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250624231930.583689-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-08-20 08:04:09 -07:00
Eric Biggers 490a9591b5 selftests: net: Explicitly enable CONFIG_CRYPTO_SHA1 for IPsec
xfrm_policy.sh, nft_flowtable.sh, and vrf-xfrm-tests.sh use 'ip xfrm'
with SHA-1, either 'auth sha1' or 'auth-trunc hmac(sha1)'.  That
requires CONFIG_CRYPTO_SHA1, which CONFIG_INET_ESP intentionally doesn't
select (as per its help text).  Previously, the config for these tests
relied on CONFIG_CRYPTO_SHA1 being selected by the unrelated option
CONFIG_IP_SCTP.  Since CONFIG_IP_SCTP is being changed to no longer do
that, instead add CONFIG_CRYPTO_SHA1 to the configs explicitly.

Reported-by: Paolo Abeni <pabeni@redhat.com>
Closes: https://lore.kernel.org/r/766e4508-aaba-4cdc-92b4-e116e52ae13b@redhat.com
Suggested-by: Florian Westphal <fw@strlen.de>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20250818205426.30222-2-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19 19:36:24 -07:00
Jakub Kicinski 51992f99f0 selftests: drv-net: ncdevmem: make configure_channels() support combined channels
ncdevmem tests that the kernel correctly rejects attempts
to deactivate queues with MPs bound.

Make the configure_channels() test support combined channels.
Currently it tries to set the queue counts to rx N tx N-1,
which only makes sense for devices which have IRQs per ring
type. Most modern devices used combined IRQs/channels with
both Rx and Tx queues. Since the math is total Rx == combined+Rx
setting Rx when combined is non-zero will be increasing the total
queue count, not decreasing as the test intends.

Note that the test would previously also try to set the Tx
ring count to Rx - 1, for some reason. Which would be 0
if the device has only 2 queues configured.

With this change (device with 2 queues):
  setting channel count rx:1 tx:1
  YNL set channels: Kernel error: 'requested channel counts are too low for existing memory provider setting (2)'

Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250815231513.381652-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19 17:49:35 -07:00
Jakub Kicinski eddc821f98 selftests: drv-net: tso: increase the retransmit threshold
We see quite a few flakes during the TSO test against virtualized
devices in NIPA. There's often 10-30 retransmissions during the
test. Sometimes as many as 100. Set the retransmission threshold
at 1/4th of the wire frame target.

Link: https://patch.msgid.link/20250815224100.363438-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19 17:49:21 -07:00
Sang-Heon Jeon 0cc2a4880c selftests/damon: fix selftests by installing drgn related script
drgn_dump_damon_status is not installed during kselftest setup.  It can
break other tests which depend on drgn_dump_damon_status.  Install
drgn_dump_damon_status files to fix broken test.

Link: https://lkml.kernel.org/r/20250812140046.660486-1-ekffu200098@gmail.com
Fixes: f3e8e1e513 ("selftests/damon: add drgn script for extracting damon status")
Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-19 16:35:55 -07:00
Lorenzo Stoakes 742d3663a5 selftests/mm: add test for invalid multi VMA operations
We can use UFFD to easily assert invalid multi VMA moves, so do so,
asserting expected behaviour when VMAs invalid for a multi VMA operation
are encountered.

We assert both that such operations are not permitted, and that we do not
even attempt to move the first VMA under these circumstances.

We also assert that we can still move a single VMA regardless.

We then assert that a partial failure can occur if the invalid VMA appears
later in the range of multiple VMAs, both at the very next VMA, and also at
the end of the range.

As part of this change, we are using the is_range_valid() helper more
aggressively. Therefore, fix a bug where stale buffered data would hang
around on success, causing subsequent calls to is_range_valid() to
potentially give invalid results.

We simply have to fflush() the stream on success to resolve this issue.

Link: https://lkml.kernel.org/r/c4fb86dd5ba37610583ad5fc0e0c2306ddf318b9.1754218667.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-19 16:35:55 -07:00
Lorenzo Stoakes 9a6a6a3191 tools/testing: add linux/args.h header and fix radix, VMA tests
Commit 857d18f23a ("cleanup: Introduce ACQUIRE() and ACQUIRE_ERR() for
conditional locks") accidentally broke the radix tree, VMA userland tests
by including linux/args.h which is not present in the tools/include
directory.

This patch copies this over and adds an #ifdef block to avoid duplicate
__CONCAT declaration in conflict with system headers when we ultimately
include this.

Link: https://lkml.kernel.org/r/20250811052654.33286-1-lorenzo.stoakes@oracle.com
Fixes: 857d18f23a ("cleanup: Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks") 
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-19 16:35:54 -07:00
Gopi Krishna Menon 05f297c3e3 KVM: selftests: fix minor typo in cpumodel_subfuncs
Specifically, fix spelling of "available" in main function.

Signed-off-by: Gopi Krishna Menon <krishnagopi487@gmail.com>
Link: https://lore.kernel.org/r/20250813154751.5725-1-krishnagopi487@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-08-19 10:50:59 -07:00
Linus Torvalds 055f213075 vfs-6.17-rc3.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaKRttQAKCRCRxhvAZXjc
 onwmAP98oaMku7CttHEVwJj8KD7luXvZWbvB23TPGmF6BNWg9wEAraks5EzZZJy3
 +4xWn10b6R+gXUqvwqr+bf0ufk3c+gc=
 =Nbg0
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix two memory leaks in pidfs

 - Prevent changing the idmapping of an already idmapped mount without
   OPEN_TREE_CLONE through open_tree_attr()

 - Don't fail listing extended attributes in kernfs when no extended
   attributes are set

 - Fix the return value in coredump_parse()

 - Fix the error handling for unbuffered writes in netfs

 - Fix broken data integrity guarantees for O_SYNC writes via iomap

 - Fix UAF in __mark_inode_dirty()

 - Keep inode->i_blkbits constant in fuse

 - Fix coredump selftests

 - Fix get_unused_fd_flags() usage in do_handle_open()

 - Rename EXPORT_SYMBOL_GPL_FOR_MODULES to EXPORT_SYMBOL_FOR_MODULES

 - Fix use-after-free in bh_read()

 - Fix incorrect lflags value in the move_mount() syscall

* tag 'vfs-6.17-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  signal: Fix memory leak for PIDFD_SELF* sentinels
  kernfs: don't fail listing extended attributes
  coredump: Fix return value in coredump_parse()
  fs/buffer: fix use-after-free when call bh_read() helper
  pidfs: Fix memory leak in pidfd_info()
  netfs: Fix unbuffered write error handling
  fhandle: do_handle_open() should get FD with user flags
  module: Rename EXPORT_SYMBOL_GPL_FOR_MODULES to EXPORT_SYMBOL_FOR_MODULES
  fs: fix incorrect lflags value in the move_mount syscall
  selftests/coredump: Remove the read() that fails the test
  fuse: keep inode->i_blkbits constant
  iomap: Fix broken data integrity guarantees for O_SYNC writes
  selftests/mount_setattr: add smoke tests for open_tree_attr(2) bug
  open_tree_attr: do not allow id-mapping changes without OPEN_TREE_CLONE
  fs: writeback: fix use-after-free in __mark_inode_dirty()
2025-08-19 09:54:47 -07:00
Sean Christopherson e2bcf62a2e KVM: selftests: Move Intel and AMD module param helpers to x86/processor.h
Move the x86 specific helpers for getting kvm_{amd,intel} module params to
x86 where they belong.  Expose the module-agnostic helpers globally, there
is nothing secret about the logic.

Link: https://lore.kernel.org/r/20250806225159.1687326-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-08-19 08:00:29 -07:00
James Houghton a585b87614 KVM: selftests: Fix signedness issue with vCPU mmap size check
Check that the return value of KVM_GET_VCPU_MMAP_SIZE is non-negative
before comparing with sizeof(kvm_run). If KVM_GET_VCPU_MMAP_SIZE fails,
it will return -1, and `-1 > sizeof(kvm_run)` is true, so the ASSERT
passes.

There are no other locations in tools/testing/selftests/kvm that make
the same mistake.

Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250711001742.1965347-1-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-08-19 07:43:56 -07:00
Jakub Kicinski 0283b8f134 selftests: drv-net: test the napi init state
Test that threaded state (in the persistent NAPI config) gets updated
even when NAPI with given ID is not allocated at the time.

This test is validating commit ccba9f6baa ("net: update NAPI threaded
config even for disabled NAPIs").

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250815013314.2237512-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-19 15:46:04 +02:00
Li Li f37b55ded8 binder: add transaction_report feature entry
Add "transaction_report" to the binderfs feature list, to help userspace
determine if the "BINDER_CMD_REPORT" generic netlink api is supported by
the binder driver.

Signed-off-by: Li Li <dualli@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20250727182932.2499194-5-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-19 12:53:01 +02:00
Martin KaFai Lau 5c42715e63 Merge branch 'bpf-next/skb-meta-dynptr' into 'bpf-next/master'
Merge 'skb-meta-dynptr' branch into 'master' branch. No conflict.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-08-18 17:59:26 -07:00
Martin KaFai Lau 7e1371023a Merge branch 'bpf-next/skb-meta-dynptr' into 'bpf-next/net'
Merge 'skb-meta-dynptr' branch into 'net' branch. No conflict.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-08-18 17:58:21 -07:00
Matthieu Baerts (NGI0) 3259889fd3 selftests: mptcp: sockopt: fix C23 extension warning
GCC was complaining about the new label:

  mptcp_inq.c:79:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
     79 |         int err = getaddrinfo(node, service, hints, res);
        |         ^

  mptcp_sockopt.c:166:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
    166 |         int err = getaddrinfo(node, service, hints, res);
        |         ^

Simply declare 'err' before the label to avoid this warning.

Fixes: dd367e81b7 ("selftests: mptcp: sockopt: use IPPROTO_MPTCP for getaddrinfo")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-8-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-18 17:39:59 -07:00
Matthieu Baerts (NGI0) 2eefbed30d selftests: mptcp: connect: fix C23 extension warning
GCC was complaining about the new label:

  mptcp_connect.c:187:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
    187 |         int err = getaddrinfo(node, service, hints, res);
        |         ^

Simply declare 'err' before the label to avoid this warning.

Fixes: a862771d1a ("selftests: mptcp: use IPPROTO_MPTCP for getaddrinfo")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-7-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-18 17:39:59 -07:00
Geliang Tang f92199f551 selftests: mptcp: disable add_addr retrans in endpoint_tests
To prevent test instability in the "delete re-add signal" test caused by
ADD_ADDR retransmissions, disable retransmissions for this test by setting
net.mptcp.add_addr_timeout to 0.

Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-6-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-18 17:39:58 -07:00
Matthieu Baerts (NGI0) 452690be7d selftests: mptcp: pm: check flush doesn't reset limits
This modification is linked to the parent commit where the received
ADD_ADDR limit was accidentally reset when the endpoints were flushed.

To validate that, the test is now flushing endpoints after having set
new limits, and before checking them.

The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.

Fixes: 01cacb00b3 ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-3-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-18 17:39:58 -07:00
Jakub Sitnicki 403fae5978 selftests/bpf: Cover metadata access from a modified skb clone
Demonstrate that, when processing an skb clone, the metadata gets truncated
if the program contains a direct write to either the payload or the
metadata, due to an implicit unclone in the prologue, and otherwise the
dynptr to the metadata is limited to being read-only.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-9-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:43 -07:00
Jakub Sitnicki bd1b51b319 selftests/bpf: Cover read/write to skb metadata at an offset
Exercise r/w access to skb metadata through an offset-adjusted dynptr,
read/write helper with an offset argument, and a slice starting at an
offset.

Also check for the expected errors when the offset is out of bounds.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-8-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:43 -07:00
Jakub Sitnicki ed93360807 selftests/bpf: Cover write access to skb metadata via dynptr
Add tests what exercise writes to skb metadata in two ways:
1. indirectly, using bpf_dynptr_write helper,
2. directly, using a read-write dynptr slice.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-7-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:43 -07:00
Jakub Sitnicki 153f6bfd48 selftests/bpf: Cover read access to skb metadata via dynptr
Exercise reading from SKB metadata area in two new ways:
1. indirectly, with bpf_dynptr_read(), and
2. directly, with bpf_dynptr_slice().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-6-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:42 -07:00
Jakub Sitnicki dd9f6cfb4e selftests/bpf: Parametrize test_xdp_context_tuntap
We want to add more test cases to cover different ways to access the
metadata area. Prepare for it. Pull up the skeleton management.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-5-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:42 -07:00
Jakub Sitnicki 6dfd5e01e1 selftests/bpf: Pass just bpf_map to xdp_context_test helper
Prepare for parametrizing the xdp_context tests. The assert_test_result
helper doesn't need the whole skeleton. Pass just what it needs.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-4-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:42 -07:00
Jakub Sitnicki 0e74eb4d57 selftests/bpf: Cover verifier checks for skb_meta dynptr type
dynptr for skb metadata behaves the same way as the dynptr for skb data
with one exception - writes to skb_meta dynptr don't invalidate existing
skb and skb_meta slices.

Duplicate those the skb dynptr tests which we can, since
bpf_dynptr_from_skb_meta kfunc can be called only from TC BPF, to cover the
skb_meta dynptr verifier checks.

Also add a couple of new tests (skb_data_valid_*) to ensure we don't
invalidate the slices in the mentioned case, which are specific to skb_meta
dynptr.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-3-8a39e636e0fb@cloudflare.com
2025-08-18 10:29:42 -07:00
Thomas Weißschuh 850047b197 selftests/nolibc: always compile the kernel with GCC
LLVM/clang can not build the kernel for all architectures supported by
nolibc. The current setup uses the same compiler to build the kernel as
is used for nolibc-test. This prevents using the full qemu-system tests
for LLVM builds.

Instead always build the kernel with GCC. For the nolibc testsuite the
kernel does not need to be built with LLVM.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250719-nolibc-llvm-system-v1-3-1730216ce171@weissschuh.net
2025-08-18 16:05:41 +02:00
Thomas Weißschuh 1a5b40317d selftests/nolibc: don't pass CC to toplevel Makefile
The toplevel Makefile is capable of calculating CC from CROSS_COMPILE
and/or ARCH.

Stop passing the unnecessary variable.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250719-nolibc-llvm-system-v1-2-1730216ce171@weissschuh.net
2025-08-18 16:05:40 +02:00
Thomas Weißschuh 32042f638c selftests/nolibc: deduplicate invocations of toplevel Makefile
Various targets of the testsuite call back into the toplevel kernel
Makefile. These calls use various parameters and are quite long.

Introduce a common variable to make future changes smaller and the lines
shorter.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250719-nolibc-llvm-system-v1-1-1730216ce171@weissschuh.net
2025-08-18 16:05:40 +02:00
Thomas Weißschuh 2be3fd903a selftests/nolibc: be more specific about variables affecting nolibc-test
Only one of these variables is used.
$CC is preferred over $CROSS_COMPILE.

Make this clear in the help message.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20250817093905.GA14213@1wt.eu/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-08-18 16:05:39 +02:00
Ilya Leoshkevich 1274163035 selftests/bpf: Clobber a lot of registers in tailcall_bpf2bpf_hierarchy tests
Clobbering a lot of registers and stack slots helps exposing tail call
counter overwrite bugs in JITs.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-5-iii@linux.ibm.com
2025-08-18 15:08:30 +02:00
Akhilesh Patil 0227af355b selftests: ublk: Use ARRAY_SIZE() macro to improve code
Use ARRAY_SIZE() macro while calculating size of an array to improve
code readability and reduce potential sizing errors.
Implement this suggestion given by spatch tool by running
coccinelle script - scripts/coccinelle/misc/array_size.cocci
Follow ARRAY_SIZE() macro usage pattern in ublk.c introduced by,
commit ec12009318 ("selftests: ublk: fix ublk_find_tgt()")
wherever appropriate to maintain consistency.

Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/aKGihYui6/Pcijbk@bhairav-test.ee.iitb.ac.in
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-18 05:36:29 -06:00
Thomas Weißschuh 1201f6fb5b tools/nolibc: fix error return value of clock_nanosleep()
clock_nanosleep() returns a positive error value. Unlike other libc
functions it *does not* return -1 nor set errno.

Fix the return value and also adapt nanosleep().

Fixes: 7c02bc4088 ("tools/nolibc: add support for clock_nanosleep() and nanosleep()")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250731-nolibc-clock_nanosleep-ret-v1-1-9e4af7855e61@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-08-17 11:42:09 +02:00
Jakub Kicinski 715c7a36d5 selftests: tls: make the new data_steal test less flaky
The CI has hit a couple of cases of:

  RUN           global.data_steal ...
 tls.c:2762:data_steal:Expected recv(cfd, buf2, sizeof(buf2), MSG_DONTWAIT) (20000) == -1 (-1)
 data_steal: Test terminated by timeout
          FAIL  global.data_steal

Looks like the 2msec sleep is not long enough. Make the sleep longer,
and then instead of second sleep wait for the thieving process to exit.
That way we can be sure it called recv() before us.

While at it also avoid trying to steal more than a record, this seems
to be causing issues in manual testing as well.

Fixes: d7e82594a4 ("selftests: tls: test TCP stealing data from under the TLS socket")
Link: https://patch.msgid.link/20250814194323.2014650-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-15 18:07:27 -07:00
Yureka Lilian 7f8fa9d370 selftests/bpf: Add test for DEVMAP reuse
The test covers basic re-use of a pinned DEVMAP map,
with both matching and mismatching parameters.

Signed-off-by: Yureka Lilian <yuka@yuka.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250814180113.1245565-4-yuka@yuka.dev
2025-08-15 16:52:52 -07:00
Matt Bobrowski c80d797206 bpf/selftests: Fix test_tcpnotify_user
Based on a bisect, it appears that commit 7ee9887703 ("timers:
Implement the hierarchical pull model") has somehow inadvertently
broken BPF selftest test_tcpnotify_user. The error that is being
generated by this test is as follows:

	FAILED: Wrong stats Expected 10 calls, got 8

It looks like the change allows timer functions to be run on CPUs
different from the one they are armed on. The test had pinned itself
to CPU 0, and in the past the retransmit attempts also occurred on CPU
0. The test had set the max_entries attribute for
BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
likely to be in range. With the change to allow timers to run on other
CPUs, the current CPU tasked with performing the retransmit might be
bumped and in turn fall out of range, as the event will be filtered
out via __bpf_perf_event_output() using:

    if (unlikely(index >= array->map.max_entries))
            return -E2BIG;

A possible change would be to explicitly set the max_entries attribute
for perf_event_map in test_tcpnotify_kern.c to a value that's at least
as large as the number of CPUs. As it turns out however, if the field
is left unset, then the libbpf will determine the number of CPUs available
on the underlying system and update the max_entries attribute accordingly
in map_set_def_max_entries().

A further problem with the test is that it has a thread that continues
running up until the program exits. The main thread cleans up some
LIBBPF data structures, while the other thread continues to use them,
which inevitably will trigger a SIGSEGV. This can be dealt with by
telling the thread to run for as long as necessary and doing a
pthread_join on it before exiting the program.

Finally, I don't think binding the process to CPU 0 is meaningful for
this test any more, so get rid of that.

Fixes: 435f90a338 ("selftests/bpf: add a test case for sock_ops perf-event notification")
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/aJ8kHhwgATmA3rLf@google.com
2025-08-15 13:05:29 -07:00
Ido Schimmel 5e0b2177bd selftest: forwarding: router: Add a test case for IPv4 link-local source IP
Add a test case which checks that packets with an IPv4 link-local source
IP are forwarded and not dropped.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/3c2e0b17d99530f57bef5ddff9af284fa0c9b667.1755174341.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-15 10:44:48 -07:00
Thomas Weißschuh bd80c4d6e4 kunit: tool: Parse skipped tests from kselftest.h
Skipped tests reported by kselftest.h use a different format than KTAP,
there is no explicit test name. Normally the test name is part of the
free-form string after the SKIP keyword:

	ok 3 # SKIP test: some reason

Extend the parser to handle those correctly. Use the free-form string as
test name instead.

Link: https://lore.kernel.org/r/20250813-kunit-kselftesth-skip-v1-1-57ae3de4f109@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-08-15 11:43:12 -06:00
Pu Lehui dc0fe95614 selftests/bpf: Enable arena atomics tests for RV64
Enable arena atomics tests for RV64.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20250719091730.2660197-11-pulehui@huaweicloud.com
2025-08-15 10:46:51 +02:00
William Liu 8c06cbdcba selftests/tc-testing: Check backlog stats in gso_skb case
Add tests to ensure proper backlog accounting in hhf, codel, pie, fq,
fq_pie, and fq_codel qdiscs. We check for the bug pattern originally
found in fq, fq_pie, and fq_codel, which was an underflow in the tbf
parent backlog stats upon child qdisc removal.

Signed-off-by: William Liu <will@willsroot.io>
Reviewed-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250812235808.45281-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14 17:52:29 -07:00
Ido Schimmel 51ca1e67f4 selftests: net: Test bridge backup port when port is administratively down
Test that packets are redirected to the backup port when the primary
port is administratively down.

With the previous patch:

 # ./test_bridge_backup_port.sh
 [...]
 TEST: swp1 administratively down                                    [ OK ]
 TEST: No forwarding out of swp1                                     [ OK ]
 TEST: Forwarding out of vx0                                         [ OK ]
 TEST: swp1 administratively up                                      [ OK ]
 TEST: Forwarding out of swp1                                        [ OK ]
 TEST: No forwarding out of vx0                                      [ OK ]
 [...]
 Tests passed:  89
 Tests failed:   0

Without the previous patch:

 # ./test_bridge_backup_port.sh
 [...]
 TEST: swp1 administratively down                                    [ OK ]
 TEST: No forwarding out of swp1                                     [ OK ]
 TEST: Forwarding out of vx0                                         [FAIL]
 TEST: swp1 administratively up                                      [ OK ]
 TEST: Forwarding out of swp1                                        [ OK ]
 [...]
 Tests passed:  85
 Tests failed:   4

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812080213.325298-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14 17:45:36 -07:00
Jakub Kicinski f09fc24dd9 selftests: drv-net: wait for carrier
On fast machines the tests run in quick succession so even
when tests clean up after themselves the carrier may need
some time to come back.

Specifically in NIPA when ping.py runs right after netpoll_basic.py
the first ping command fails.

Since the context manager callbacks are now common NetDrvEpEnv
gets an ip link up call as well.

Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250812142054.750282-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14 17:44:46 -07:00
Paul E. McKenney bd89367e05 torture: Add --do-normal parameter to torture.sh help text
The --do-normal parameter was missing from the torture.sh script's help
text, so this commit adds it.  Hopefully better late than never!

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2025-08-14 15:26:30 -07:00
Paul E. McKenney e95f6ccdbc rcutorture: Fix jitter.sh spin time
An embarrassing syntax error in jitter.sh makes for fixed spin time.
This commit therefore makes it be variable, as intended, albeit with
very coarse-grained adjustment.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2025-08-14 15:26:30 -07:00
Jakub Kicinski f24775c325 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc2).

No conflicts.

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
  d7a276a576 ("net: stmmac: rk: convert to suspend()/resume() methods")
  de1e963ad0 ("net: stmmac: rk: put the PHY clock on remove")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14 12:13:00 -07:00
Linus Torvalds 63467137ec Including fixes from Netfilter and IPsec.
Current release - regressions:
 
   - netfilter: nft_set_pipapo:
     - don't return bogus extension pointer
     - fix null deref for empty set
 
 Current release - new code bugs:
 
   - core: prevent deadlocks when enabling NAPIs with mixed kthread config
 
   - eth: netdevsim: Fix wild pointer access in nsim_queue_free().
 
 Previous releases - regressions:
 
   - page_pool: allow enabling recycling late, fix false positive warning
 
   - sched: ets: use old 'nbands' while purging unused classes
 
   - xfrm:
     - restore GSO for SW crypto
     - bring back device check in validate_xmit_xfrm
 
   - tls: handle data disappearing from under the TLS ULP
 
   - ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
 
   - eth: bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
 
   - eth: hv_netvsc: fix panic during namespace deletion with VF
 
 Previous releases - always broken:
 
   - netfilter: fix refcount leak on table dump
 
   - vsock: do not allow binding to VMADDR_PORT_ANY
 
   - sctp: linearize cloned gso packets in sctp_rcv
 
   - eth: hibmcge: fix the division by zero issue
 
   - eth: microchip: fix KSZ8863 reset problem
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmidxhgSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkPckP/AhJ0kARPgo72OhElbW6KvkHhXVNUTJb
 6m15j9Z8ybQPBupSorxUEBx7pxczv/GBloN1xoJqUY/p7B6kLO2HDOpaReNFjZvF
 J9hPl6ZF6CWzwgfcUfwI3UB1zhALKyHDClclfd8FBoKjAYXCrZXuPv/AV4oqYsA1
 y7g24zpI76Cu+M+Nf5YhrlIVSmQ1/DXX8gifdcHFYnAKmCn7KxNv2lwvm2/yE2lL
 9/Xl/D1cG/BiAaCUUXR4BP8RU5gdW72+lM3qmC/QFO/7jcBaoE89Y9Anona8p0PQ
 dqDerHd0GDUH9QA6bht3asCS+IW+Zo2gf25o53OzlYIMAxDmEZLUBCwetJhvNJBq
 DUQ6agzfNRxsCnlOc4JhMOqNr7rdU7d+9c9KuZWA/m8KRWdlvTYGJd/qzSlTWOhq
 s9228dl+4oTb9Mnq8Bqafi42+TImeOyFRW9ZgF8ptjlF0l/lyv6moIrRCmVXppRZ
 awABNDdG+i004XmAOAeOhjbUT7clLkLr+KEnsfH16qCa2o3dM6rlhvWYp2sucVJf
 SyRvMdz5VqMLgruefpQS/DuK52UklpRawgvgngzU6UDYQUaxQKToeusMjRU7xUnW
 hVI1y7/oNH6+r7Zr/iLTLKRR007B+RVC7VSbeMpxmAW+n6puMb+z7RnrJlnFapGM
 qqqtk2/jItuK
 =Ydk1
 -----END PGP SIGNATURE-----

Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from Netfilter and IPsec.

  Current release - regressions:

   - netfilter: nft_set_pipapo:
      - don't return bogus extension pointer
      - fix null deref for empty set

  Current release - new code bugs:

   - core: prevent deadlocks when enabling NAPIs with mixed kthread
     config

   - eth: netdevsim: Fix wild pointer access in nsim_queue_free().

  Previous releases - regressions:

   - page_pool: allow enabling recycling late, fix false positive
     warning

   - sched: ets: use old 'nbands' while purging unused classes

   - xfrm:
      - restore GSO for SW crypto
      - bring back device check in validate_xmit_xfrm

   - tls: handle data disappearing from under the TLS ULP

   - ptp: prevent possible ABBA deadlock in ptp_clock_freerun()

   - eth:
      - bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
      - hv_netvsc: fix panic during namespace deletion with VF

  Previous releases - always broken:

   - netfilter: fix refcount leak on table dump

   - vsock: do not allow binding to VMADDR_PORT_ANY

   - sctp: linearize cloned gso packets in sctp_rcv

   - eth:
      - hibmcge: fix the division by zero issue
      - microchip: fix KSZ8863 reset problem"

* tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  net: usb: asix_devices: add phy_mask for ax88772 mdio bus
  net: kcm: Fix race condition in kcm_unattach()
  selftests: net/forwarding: test purge of active DWRR classes
  net/sched: ets: use old 'nbands' while purging unused classes
  bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
  netdevsim: Fix wild pointer access in nsim_queue_free().
  net: mctp: Fix bad kfree_skb in bind lookup test
  netfilter: nf_tables: reject duplicate device on updates
  ipvs: Fix estimator kthreads preferred affinity
  netfilter: nft_set_pipapo: fix null deref for empty set
  selftests: tls: test TCP stealing data from under the TLS socket
  tls: handle data disappearing from under the TLS ULP
  ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
  ixgbe: prevent from unwanted interface name changes
  devlink: let driver opt out of automatic phys_port_name generation
  net: prevent deadlocks when enabling NAPIs with mixed kthread config
  net: update NAPI threaded config even for disabled NAPIs
  selftests: drv-net: don't assume device has only 2 queues
  docs: Fix name for net.ipv4.udp_child_hash_entries
  riscv: dts: thead: Add APB clocks for TH1520 GMACs
  ...
2025-08-14 07:14:30 -07:00
Jakub Kicinski 26dbe030ff selftests: drv-net: add test for RSS on flow label
Add a simple test for checking that RSS on flow label works,
and that its rejected for IPv4 flows.

 # ./tools/testing/selftests/drivers/net/hw/rss_flow_label.py
 TAP version 13
 1..2
 ok 1 rss_flow_label.test_rss_flow_label
 ok 2 rss_flow_label.test_rss_flow_label_6only
 # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250811234212.580748-5-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-14 11:40:21 +02:00
Davide Caratti 774a2ae661 selftests: net/forwarding: test purge of active DWRR classes
Extend sch_ets.sh to add a reproducer for problematic list deletions when
active DWRR class are purged by ets_qdisc_change() [1] [2].

[1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
[2] https://lore.kernel.org/netdev/f3b9bacc73145f265c19ab80785933da5b7cbdec.1754581577.git.dcaratti@redhat.com/

Suggested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/489497cb781af7389011ca1591fb702a7391f5e7.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-13 18:11:48 -07:00
Andre Carvalho acfea93610 selftests: netconsole: Validate interface selection by MAC address
Extend the existing netconsole cmdline selftest to also validate that
interface selection can be performed via MAC address.

The test now validates that netconsole works with both interface name
and MAC address, improving test coverage.

Suggested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Andre Carvalho <asantostc@gmail.com>
Link: https://patch.msgid.link/20250812-netcons-cmdline-selftest-v2-1-8099fb7afa9e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-13 17:28:27 -07:00
Ido Schimmel 5e88777a38 selftests: forwarding: Add a test for FDB activity notification control
Test various aspects of FDB activity notification control:

* Transitioning of an FDB entry from inactive to active state.

* Transitioning of an FDB entry from active to inactive state.

* Avoiding the resetting of an FDB entry's last activity time (i.e.,
  "updated" time) using the "norefresh" keyword.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812071810.312346-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-13 17:09:54 -07:00
Amery Hung 07866544e4 selftests/bpf: Copy test_kmods when installing selftest
Commit d6212d82bf ("selftests/bpf: Consolidate kernel modules into
common directory") consolidated the Makefile of test_kmods. However,
since it removed test_kmods from TEST_GEN_PROGS_EXTENDED, the kernel
modules required by bpf selftests are now missing from kselftest_install
when "make install". Fix it by adding test_kmod to TEST_GEN_FILES.

Fixes: d6212d82bf ("selftests/bpf: Consolidate kernel modules into common directory")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250812175039.2323570-1-ameryhung@gmail.com
2025-08-13 15:53:01 -07:00
Linus Torvalds 91325f31af 12 hotfixes. 5 are cc:stable and the remainder address post-6.16 issues
or aren't considered necessary for -stable kernels.  10 of these fixes are
 for MM.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaJwLpgAKCRDdBJ7gKXxA
 js1sAP0c/XlVJhICq9aNJluu4Nj7cKTlzN7nvD/YRivZrG8XJQD/Q5nvFY6yeOdi
 /HxCAdTDY5HsWv28nEbvnxKKbuFligU=
 =mtKs
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "12 hotfixes. 5 are cc:stable and the remainder address post-6.16
  issues or aren't considered necessary for -stable kernels.

  10 of these fixes are for MM"

* tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  proc: proc_maps_open allow proc_mem_open to return NULL
  mm/mremap: avoid expensive folio lookup on mremap folio pte batch
  userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
  mm: pass page directly instead of using folio_page
  selftests/proc: fix string literal warning in proc-maps-race.c
  fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats
  mm/smaps: fix race between smaps_hugetlb_range and migration
  mm: fix the race between collapse and PT_RECLAIM under per-vma lock
  mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
  MAINTAINERS: add Masami as a reviewer of hung task detector
  mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
  kasan/test: fix protection against compiler elision
2025-08-13 08:28:33 -07:00
Jakub Kicinski d7e82594a4 selftests: tls: test TCP stealing data from under the TLS socket
Check a race where data disappears from the TCP socket after
TLS signaled that its ready to receive.

  ok 6 global.data_steal
  #  RUN           tls_basic.base_base ...
  #            OK  tls_basic.base_base

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250807232907.600366-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:59:06 -07:00
Jakub Kicinski c378c497f3 selftests: drv-net: devmem: flip the direction of Tx tests
The Device Under Test should always be the local system.
While the Rx test gets this right the Tx test is sending
from remote to local. So Tx of DMABUF memory happens on remote.

These tests never run in NIPA since we don't have a compatible
device so we haven't caught this.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:27:43 -07:00
Jakub Kicinski 6e9a12f85a selftests: net: terminate bkg() commands on exception
There is a number of:

  with bkg("socat ..LISTEN..", exit_wait=True)

uses in the tests. If whatever is supposed to send the traffic
fails we will get stuck in the bkg(). Try to kill the process
in case of exception, to avoid the long wait.

A specific example where this happens is the devmem Tx tests.

Reviewed-by: Joe Damato <joe@dama.to>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:27:42 -07:00
Jakub Kicinski 424e96de30 selftests: drv-net: devmem: add / correct the IPv6 support
We need to use bracketed IPv6 addresses for socat.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:27:42 -07:00
Jakub Kicinski a94e9cf79c selftests: drv-net: devmem: remove sudo from system() calls
The general expectations for network HW selftests is that they
will be run as root. sudo doesn't seem to work on NIPA VMs.
While it's probably something solvable in the setup I think we should
remove the sudos. devmem is the only networking test using sudo.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:27:42 -07:00
Jakub Kicinski 27e5b560a8 selftests: drv-net: add configs for zerocopy Rx
Looks like neither IO_URING nor UDMABUF are enabled even tho
iou-zcrx.py and devmem.py (respectively) need those.
IO_URING gets enabled by default but UDMABUF is missing.

Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250811231334.561137-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:27:42 -07:00
Kuniyuki Iwashima 1838731f10 selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.
-Wall and -Wflex-array-member-not-at-end caught some warnings that
will be fixed in later patches.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:01:53 -07:00
Kuniyuki Iwashima fd9faac372 selftest: af_unix: Silence -Wall warning for scm_pid.c.
-Wall found 2 unused variables in scm_pid.c:

scm_pidfd.c: In function ‘parse_cmsg’:
scm_pidfd.c:140:13: warning: unused variable ‘data’ [-Wunused-variable]
  140 |         int data = 0;
      |             ^~~~
scm_pidfd.c: In function ‘cmsg_check_dead’:
scm_pidfd.c:246:15: warning: unused variable ‘client_pid’ [-Wunused-variable]
  246 |         pid_t client_pid;
      |               ^~~~~~~~~~

Let's remove these variables.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:01:53 -07:00
Kuniyuki Iwashima 9a58d8e682 selftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_rights.c.
scm_rights.c has no problem in functionality, but when compiled with
-Wflex-array-member-not-at-end, it shows this warning:

scm_rights.c: In function ‘__send_fd’:
scm_rights.c:275:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  275 |                 struct cmsghdr cmsghdr;
      |                                ^~~~~~~

Let's silence it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:01:53 -07:00
Kuniyuki Iwashima 942224e6ba selftest: af_unix: Silence -Wflex-array-member-not-at-end warning for scm_inq.c.
scm_inq.c has no problem in functionality, but when compiled with
-Wflex-array-member-not-at-end, it shows this warning:

scm_inq.c:15:24: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
   15 |         struct cmsghdr cmsghdr;
      |                        ^~~~~~~

Let's silence it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250811215432.3379570-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12 18:01:53 -07:00
Martin KaFai Lau 9e293d47bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Cross merge bpf/master after 6.17-rc1.

No conflict.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-08-12 12:14:02 -07:00
Jakub Kicinski bda053d644 selftests: drv-net: don't assume device has only 2 queues
The test is implicitly assuming the device only has 2 queues.
A real device will likely have more. The exact problem is that
because NAPIs get added to the list from the head, the netlink
dump reports them in reverse order. So the naive napis[0] will
actually likely give us the _last_ NAPI, not the first one.
Re-enable all the NAPIs instead of hard-coding 2 in the test.
This way the NAPIs we operated on will always reappear,
doesn't matter where they were in the registration order.

Fixes: e6d7626881 ("net: Update threaded state in napi config in netif_set_threaded")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250809001205.1147153-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 14:43:05 +02:00
Sukrut Heroorkar ab5ac789ef selftests/proc: fix string literal warning in proc-maps-race.c
This change resolves non literal string format warning invoked for
proc-maps-race.c while compiling.

proc-maps-race.c:205:17: warning: format not a string literal and no format arguments [-Wformat-security]
 205 |                 printf(text);
     |                 ^~~~~~
proc-maps-race.c:209:17: warning: format not a string literal and no format arguments [-Wformat-security]
 209 |                 printf(text);
     |                 ^~~~~~
proc-maps-race.c: In function `print_last_lines':
proc-maps-race.c:224:9: warning: format not a string literal and no format arguments [-Wformat-security]
 224 |         printf(start);
     |         ^~~~~~

Add string format specifier %s for the printf calls in both
print_first_lines() and print_last_lines() thus resolving the warnings.

The test executes fine after this change thus causing no effect to the
functional behavior of the test.

Link: https://lkml.kernel.org/r/20250804225633.841777-1-hsukrut3@gmail.com
Fixes: aadc099c48 ("selftests/proc: add verbose mode for /proc/pid/maps tearing tests")
Signed-off-by: Sukrut Heroorkar <hsukrut3@gmail.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:00:59 -07:00
Wake Liu bc4c0a48bd selftests/net: Ensure assert() triggers in psock_tpacket.c
The get_next_frame() function in psock_tpacket.c was missing a return
statement in its default switch case, leading to a compiler warning.

This was caused by a `bug_on(1)` call, which is defined as an
`assert()`, being compiled out because NDEBUG is defined during the
build.

Instead of adding a `return NULL;` which would silently hide the error
and could lead to crashes later, this change restores the original
author's intent. By adding `#undef NDEBUG` before including <assert.h>,
we ensure the assertion is active and will cause the test to abort if
this unreachable code is ever executed.

Signed-off-by: Wake Liu <wakel@google.com>
Link: https://patch.msgid.link/20250809062013.2407822-1-wakel@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:12:23 -07:00
Wake Liu c36748e873 selftests/net: Replace non-standard __WORDSIZE with sizeof(long) * 8
The `__WORDSIZE` macro, defined in the non-standard `<bits/wordsize.h>`
header, is a GNU extension and not universally available with all
toolchains, such as Clang when used with musl libc.

This can lead to build failures in environments where this header is
missing.

The intention of the code is to determine the bit width of a C `long`.
Replace the non-portable `__WORDSIZE` with the standard and portable
`sizeof(long) * 8` expression to achieve the same result.

This change also removes the inclusion of the now-unused
`<bits/wordsize.h>` header.

Signed-off-by: Wake Liu <wakel@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:08:32 -07:00
Jiapeng Chong e69980bd16 selftests/sched_ext: Remove duplicate sched.h header
./tools/testing/selftests/sched_ext/hotplug.c: sched.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22941
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11 08:24:08 -10:00
Nam Cao 2319f9d0aa
selftests/coredump: Remove the read() that fails the test
Resolve a conflict between
  commit 6a68d28066 ("selftests/coredump: Fix "socket_detect_userspace_client" test failure")
and
  commit 994dc26302 ("selftests/coredump: fix build")

The first commit adds a read() to wait for write() from another thread to
finish. But the second commit removes the write().

Now that the two commits are in the same tree, the read() now gets EOF and
the test fails.

Remove this read() so that the test passes.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/20250811074957.4079616-1-namcao@linutronix.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11 15:43:31 +02:00
Aleksa Sarai df579e4711
selftests/filesystems: add basic fscontext log tests
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-2-8d91d6242dc3@cyphar.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11 14:52:41 +02:00
Aleksa Sarai 81e4b9cf36
selftests/mount_setattr: add smoke tests for open_tree_attr(2) bug
There appear to be no other open_tree_attr(2) tests at the moment, but
as a minimal solution just add some additional checks in the existing
MOUNT_ATTR_IDMAP tests to make sure that open_tree_attr(2) cannot be
used to bypass the tested restrictions that apply to mount_setattr(2).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-2-0ec7bc05646c@cyphar.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11 14:51:49 +02:00
Marc Zyngier 85acc29f90 KVM: arm64: selftest: Add standalone test checking for KVM's own UUID
Tinkering with UUIDs is a perilious task, and the KVM UUID gets
broken at times. In order to spot this early enough, add a selftest
that will shout if the expected value isn't found.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250721130558.50823-1-jackabt.amazon@gmail.com
Link: https://lore.kernel.org/r/20250806171341.1521210-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-08 01:30:16 -07:00
Linus Torvalds 3781648824 Previous releases - regressions:
- netlink: avoid infinite retry looping in netlink_unicast()
 
 Previous releases - always broken:
 
  - packet: fix a race in packet_set_ring() and packet_notifier()
 
  - ipv6: reject malicious packets in ipv6_gso_segment()
 
  - sched: mqprio: fix stack out-of-bounds write in tc entry parsing
 
  - net: drop UFO packets (injected via virtio) in udp_rcv_segment()
 
  - eth: mlx5: correctly set gso_segs when LRO is used, avoid false
    positive checksum validation errors
 
  - netpoll: prevent hanging NAPI when netcons gets enabled
 
  - phy: mscc: fix parsing of unicast frames for PTP timestamping
 
  - number of device tree / OF reference leak fixes
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiUvMcACgkQMUZtbf5S
 IrucuA//bQGZdQkpRo/2zWDFS4wN7hV8PkR+kI0F4rUyNQjFDyUOrccYnhHl8VRn
 DfnmVC9oiWxwuW2QgrgH1KKSw9dxhDPVmhLezAHaZv9XAPPgPO/Yb2Dr3oKHF+yK
 +/QW57FN8rNg8mHUzMS/26Y+rH6OubTaf3MKcsT2uZhuIXKHScOUedmr6EeEkp/L
 c32MpWfcVC7M2jjvCH+HO4k4RawWaB8W93iMUrLMSVI1oE3Tsjyhl5Cv9TBixh7g
 3KZW98qVqMXKBQr0QTF4LBR0IWgKOS4KMVJqrgQ3CZszbDWbmBsFfi8olr/AS0Nt
 vgk6hTd8vVHXa+sOvdFpDbocdjXBq6vf5bDd9p57bt3JyxFMfFUWbG1eDj8bUpMI
 YuKAhQg9mTr6R8DaLwqANY8zrSET2FiYbNkUsP9TD8++q2j34U5/a468cxymfjJ/
 90vSrBGUpwxYPhx2B+Slu1SZJ1g0NbUL34jrKBuSNloVoZDRuzq2zd310BigG/35
 T5CTr5OH+USlFjL6KpgGHnygiMQB/h2WgEjWbFoZHnIb+exVKCgS6IiNBVV3b2OK
 Nv6iVyFKPdKJ8qxeaiYMWA16wG8buOytlBn/1hYKLBvyVKAAx9Ib4ao2TK9w/1Vs
 50sX21h+Awj2a/17sXlSXA+RVdeBcAzHdsGWHr1ql31htn2Xl10=
 =nwy7
 -----END PGP SIGNATURE-----

Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
  Previous releases - regressions:

   - netlink: avoid infinite retry looping in netlink_unicast()

  Previous releases - always broken:

   - packet: fix a race in packet_set_ring() and packet_notifier()

   - ipv6: reject malicious packets in ipv6_gso_segment()

   - sched: mqprio: fix stack out-of-bounds write in tc entry parsing

   - net: drop UFO packets (injected via virtio) in udp_rcv_segment()

   - eth: mlx5: correctly set gso_segs when LRO is used, avoid false
     positive checksum validation errors

   - netpoll: prevent hanging NAPI when netcons gets enabled

   - phy: mscc: fix parsing of unicast frames for PTP timestamping

   - a number of device tree / OF reference leak fixes"

* tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
  pptp: fix pptp_xmit() error path
  net: ti: icssg-prueth: Fix skb handling for XDP_PASS
  net: Update threaded state in napi config in netif_set_threaded
  selftests: netdevsim: Xfail nexthop test on slow machines
  eth: fbnic: Lock the tx_dropped update
  eth: fbnic: Fix tx_dropped reporting
  eth: fbnic: remove the debugging trick of super high page bias
  net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect
  dt-bindings: net: Replace bouncing Alexandru Tachici emails
  dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET
  net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing
  Revert "net: mdio_bus: Use devm for getting reset GPIO"
  selftests: net: packetdrill: xfail all problems on slow machines
  net/packet: fix a race in packet_set_ring() and packet_notifier()
  benet: fix BUG when creating VFs
  net: airoha: npu: Add missing MODULE_FIRMWARE macros
  net: devmem: fix DMA direction on unmapping
  ipa: fix compile-testing with qcom-mdt=m
  eth: fbnic: unlink NAPIs from queues on error to open
  net: Add locking to protect skb->dev access in ip_output
  ...
2025-08-08 07:03:25 +03:00
Amery Hung ba7000f1c3 selftests/bpf: Test multi_st_ops and calling kfuncs from different programs
Test multi_st_ops and demonstrate how different bpf programs can call
a kfuncs that refers to the struct_ops instance in the same source file
by id. The id is defined as a global vairable and initialized before
attaching the skeleton. Kfuncs that take the id can hide the argument
with a macro to make it almost transparent to bpf program developers.

The test involves two struct_ops returning different values from
.test_1. In syscall and tracing programs, check if the correct value is
returned by a kfunc that calls .test_1.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250806162540.681679-4-ameryhung@gmail.com
2025-08-06 16:01:58 -07:00
Amery Hung eeb52b6279 selftests/bpf: Add multi_st_ops that supports multiple instances
Current struct_ops in bpf_testmod only support attaching single instance.
Add multi_st_ops that supports multiple instances. The struct_ops uses map
id as the struct_ops id and will reject attachment with an existing id.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250806162540.681679-3-ameryhung@gmail.com
2025-08-06 16:01:56 -07:00
Linus Torvalds adf12a394c Perf fixes for perf_mmap() reference counting to prevent potential
reference count leaks which are caused by:
 
  - VMA splits, which change the offset or size of a mapping, which causes
    perf_mmap_close() to ignore the unmap or unmap the wrong buffer.
 
  - Several internal issues of perf_mmap(), which can cause reference count
    leaks in the perf mmap, corrupt accounting or cause leaks in perf
    drivers.
 
 The main fix is to prevent VMA splits by implementing the [may_]split()
 callback for vm operations. The other issues are addressed by rearranging
 code, early returns on failure and invocation of cleanups.
 
 Also provide a selftest to validate the fixes.
 
 The reference counting should be converted to refcount_t, but that requires
 larger refactoring of the code and will be done once these fixes are
 upstream.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiSd0gTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoWCbD/9niCHArXIWfdIp1K2ZwM2+tsCU7Ntl
 XnkfnaPoCVpQQXcAN11WIEK6DlwaiY2sfOco1cpEqr5px0Hv5qzM+OGm0r3KQBpr
 9d1Ox8+tqUOIxw5zsKFXpw6/WX4zzdxGHjzU/T0H+fPP3jB+hj/Q3hOk4u10+f3v
 7f3Q4sOfkmOauQez2HEaDwUip6lLZFEaf8IK0tYEkOJxcStwsC2TnLvmlEmOA0Yx
 PnAXOicrpbe9d8KNq6VxU0OtV6XAT+YJtf9T5cTNR1NhIkqyaMwbdzkuh9RZgxAE
 oRblaAHubAUMmv2DgYOTUGoYivsXY13XOtjfXdLmxt19HmkSOyaCFO8nJgjAPOL7
 gxGXS7zKxhNac7bfVgBANPUHOOWtV30H5CqYOzxaPlQs8gzOsl8l+NDZuwVlP4P6
 CMdN3rz3eMpnMpuzy0mmUJhowytKDA8N81yamCP5L9hWWZVfp4boZfIXMMLtJdQa
 nv/T2HxLL8HweFrI6Wd7YDhXMKhsNDAqJvtSv0z+5U+PWWd9rcOFsgS9sUHIiJuB
 pLvNwLxPntzF6qw4qIp1W1AHfLz2VF/tR8WyINpEZe4oafP1TccI+aLQdIJ/vVqp
 gQ0bCTiZb16IGsHruu4L9C0fe40TdSuiwEK5X9Opk4aP11oagsqQ+GxzssvQZnZc
 Jx2XqouabWBBvQ==
 =B9L/
 -----END PGP SIGNATURE-----

Merge tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

Pull perf fixes from Thomas Gleixner:
 "Perf fixes for perf_mmap() reference counting to prevent potential
  reference count leaks which are caused by:

   - VMA splits, which change the offset or size of a mapping, which
     causes perf_mmap_close() to ignore the unmap or unmap the wrong
     buffer.

   - Several internal issues of perf_mmap(), which can cause reference
     count leaks in the perf mmap, corrupt accounting or cause leaks in
     perf drivers.

  The main fix is to prevent VMA splits by implementing the
  [may_]split() callback for vm operations.

  The other issues are addressed by rearranging code, early returns on
  failure and invocation of cleanups.

  Also provide a selftest to validate the fixes.

  The reference counting should be converted to refcount_t, but that
  requires larger refactoring of the code and will be done once these
  fixes are upstream"

* tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git:
  selftests/perf_events: Add a mmap() correctness test
  perf/core: Prevent VMA split of buffer mappings
  perf/core: Handle buffer mapping fail correctly in perf_mmap()
  perf/core: Exit early on perf_mmap() fail
  perf/core: Don't leak AUX buffer refcount on allocation failure
  perf/core: Preserve AUX buffer allocation failure result
2025-08-06 04:41:21 +03:00
Samiullah Khawaja e6d7626881 net: Update threaded state in napi config in netif_set_threaded
Commit 2677010e77 ("Add support to set NAPI threaded for individual
NAPI") added support to enable/disable threaded napi using netlink. This
also extended the napi config save/restore functionality to set the napi
threaded state. This breaks netdev reset for drivers that use napi
threaded at device level and also use napi config save/restore on
napi_disable/napi_enable. Basically on netdev with napi threaded enabled
at device level, a napi_enable call will get stuck trying to stop the
napi kthread. This is because the napi->config->threaded is set to
disabled when threaded is enabled at device level.

The issue can be reproduced on virtio-net device using qemu. To
reproduce the issue run following,

  echo 1 > /sys/class/net/threaded
  ethtool -L eth0 combined 1

Update the threaded state in napi config in netif_set_threaded and add a
new test that verifies this scenario.

Tested on qemu with virtio-net:
 NETIF=eth0 ./tools/testing/selftests/drivers/net/napi_threaded.py
 TAP version 13
 1..2
 ok 1 napi_threaded.change_num_queues
 ok 2 napi_threaded.enable_dev_threaded_disable_napi_threaded
 # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0

Fixes: 2677010e77 ("Add support to set NAPI threaded for individual NAPI")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250804164457.2494390-1-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05 17:46:15 -07:00
Ido Schimmel 8d22aea8af selftests: netdevsim: Xfail nexthop test on slow machines
A lot of test cases in the file are related to the idle and unbalanced
timers of resilient nexthop groups and these tests are reported to be
flaky on slow machines running debug kernels.

Rather than marking a lot of individual tests with xfail_on_slow(),
simply mark all the tests. Note that the test is stable on non-debug
machines and that with debug kernels we are mainly interested in the
output of various sanitizers in order to determine pass / fail.

Before:

 # make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
 	TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
 	TEST_GEN_PROGS="" run_tests
 [...]
 # TEST: Bucket migration after idle timer (with delete)               [FAIL]
 #       Group expected to still be unbalanced
 [...]
 not ok 1 selftests: drivers/net/netdevsim: nexthop.sh # exit=1

After:

 # make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
 	TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
 	TEST_GEN_PROGS="" run_tests
 [...]
 # TEST: Bucket migration after idle timer (with delete)               [XFAIL]
 #       Group expected to still be unbalanced
 [...]
 ok 1 selftests: drivers/net/netdevsim: nexthop.sh

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20250729160609.02e0f157@kernel.org/
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250804114320.193203-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05 16:11:03 -07:00
Lorenzo Stoakes 084d2ac403 selftests/perf_events: Add a mmap() correctness test
Exercise various mmap(), munmap() and mremap() invocations, which might
cause a perf buffer mapping to be split or truncated.

To avoid hard coding the perf event and having dependencies on
architectures and configuration options, scan through event types in sysfs
and try to open them. On success, try to mmap() and if that succeeds try to
mmap() the AUX buffer.

In case that no AUX buffer supporting event is found, only test the base
buffer mapping. If no mappable event is found or permissions are not
sufficient, skip the tests.

Reserve a PROT_NONE region for both rb and aux tests to allow testing the
case where mremap unmaps beyond the end of a mapped VMA to prevent it from
unmapping unrelated mappings.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>            
Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
2025-08-05 21:55:29 +02:00
Linus Torvalds da23ea194d Significant patch series in this pull request:
- The 4 patch series "mseal cleanups" from Lorenzo Stoakes erforms some
   mseal cleaning with no intended functional change.
 
 - The 3 patch series "Optimizations for khugepaged" from David
   Hildenbrand improves khugepaged throughput by batching PTE operations
   for large folios.  This gain is mainly for arm64.
 
 - The 8 patch series "x86: enable EXECMEM_ROX_CACHE for ftrace and
   kprobes" from Mike Rapoport provides a bugfix, additional debug code and
   cleanups to the execmem code.
 
 - The 7 patch series "mm/shmem, swap: bugfix and improvement of mTHP
   swap in" from Kairui Song provides bugfixes, cleanups and performance
   improvememnts to the mTHP swapin code.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+6HQAKCRDdBJ7gKXxA
 jv7lAQCAKE5dUhdZ0pOYbhBKTlDapQh2KqHrlV3QFcxXgknEoQD/c3gG01rY3fLh
 Cnf5l9+cdyfKxFniO48sUPx6IpriRg8=
 =HT5/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "mseal cleanups" (Lorenzo Stoakes)

     Some mseal cleaning with no intended functional change.

   - "Optimizations for khugepaged" (David Hildenbrand)

     Improve khugepaged throughput by batching PTE operations for large
     folios. This gain is mainly for arm64.

   - "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport)

     A bugfix, additional debug code and cleanups to the execmem code.

   - "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song)

     Bugfixes, cleanups and performance improvememnts to the mTHP swapin
     code"

* tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits)
  mm: mempool: fix crash in mempool_free() for zero-minimum pools
  mm: correct type for vmalloc vm_flags fields
  mm/shmem, swap: fix major fault counting
  mm/shmem, swap: rework swap entry and index calculation for large swapin
  mm/shmem, swap: simplify swapin path and result handling
  mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
  mm/shmem, swap: tidy up swap entry splitting
  mm/shmem, swap: tidy up THP swapin checks
  mm/shmem, swap: avoid redundant Xarray lookup during swapin
  x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations
  x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations
  execmem: drop writable parameter from execmem_fill_trapping_insns()
  execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP)
  execmem: move execmem_force_rw() and execmem_restore_rox() before use
  execmem: rework execmem_cache_free()
  execmem: introduce execmem_alloc_rw()
  execmem: drop unused execmem_update_copy()
  mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
  mm/rmap: add anon_vma lifetime debug check
  mm: remove mm/io-mapping.c
  ...
2025-08-05 16:02:07 +03:00
Jakub Kicinski 5ef7fdf52c selftests: net: packetdrill: xfail all problems on slow machines
We keep seeing flakes on packetdrill on debug kernels, while
non-debug kernels are stable, not a single flake in 200 runs.
Time to give up, debug kernels appear to suffer from 10msec
latency spikes and any timing-sensitive test is bound to flake.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250801181638.2483531-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04 17:21:44 -07:00
Linus Torvalds 806381e1a2 powerpc fixes for 6.17 #2
- Fixes for several issues in the powernv PCI hotplug path
  - Fix htmldoc generation for htm.rst in toctree
  - Add jit support for load_acquire and store_release in ppc64 bpf jit
 
 Thanks to: Bjorn Helgaas, Hari Bathini, Puranjay Mohan, Saket Kumar Bhaskar,
 Shawn Anastasio, Timothy Pearson, Vishal Parmar
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmiQDOcACgkQpnEsdPSH
 ZJRqTBAAkpaZRrbzX6P1fjoTb89esIfY7YSymmXrgMwoF1fgTMzPSjg2Lzwzcjfb
 Ednlo8Hy+Ei2SDvctNaqtOcZuidn81AJN41aFIu44S/Mj1cde/cuBMKUTqx+ZBO0
 gI4Bd4+V7yEv484PF7ZJga9K1VM85THCLgVWJ00VNHVHyvxgAuFiXdWmh6/qMUyi
 thvvLR+ANuAQz3S4VwbBg3AifDl6LXx2s5VB30xYxnPKzFNKZmnGXKwuOJH7rQe2
 J8v99n0tcXW1tRGE4pVykzXg4EXL5zgWT9fJ5EZxbeXaW9sqMxi4VjO4jSsrSZ+K
 q2v362Dyjgygel9aC2rzN8Q+P5horX2QBR7knJJGa0VtztUiPWKR8za7vGLbzlcm
 rUvnTuoY1dwFB72Sy3amALalvWscssL/1sHazvRv65RcciW7/PZNVEiZ0xh1RKqb
 J8nWlb+iNBf2z12qLmS5DvUaveaZG1eyndLeD/knsEC49DEOoEy3t7QM1F7KscRR
 mYPsEpfjF/D0r+vzb6zl2ykwhJf/t7BNu7MdXH7xbIpj5iwtrhSUfvmx6g2MThzA
 Vee2QvQACscdop3W/6xgATH4xoq96v1XxMmCLnZ/HVl2PorxO27ad4EMNO4sBG8c
 5agWHT7EnoUgwNF30DRtIHd7jNK/jt8++3kZx6CG9hdSboAM/pM=
 =tTms
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

 - Fixes for several issues in the powernv PCI hotplug path

 - Fix htmldoc generation for htm.rst in toctree

 - Add jit support for load_acquire and store_release in ppc64 bpf jit

Thanks to Bjorn Helgaas, Hari Bathini, Puranjay Mohan, Saket Kumar
Bhaskar, Shawn Anastasio, Timothy Pearson, and Vishal Parmar

* tag 'powerpc-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc64/bpf: Add jit support for load_acquire and store_release
  docs: powerpc: add htm.rst to toctree
  PCI: pnv_php: Enable third attention indicator state
  PCI: pnv_php: Fix surprise plug detection and recovery
  powerpc/eeh: Make EEH driver device hotplug safe
  powerpc/eeh: Export eeh_unfreeze_pe()
  PCI: pnv_php: Work around switches with broken presence detection
  PCI: pnv_php: Clean up allocated IRQs on unplug
2025-08-03 19:15:04 -07:00
Linus Torvalds e991acf1bc Significant patch series in this pull request:
- The 2 patch series "squashfs: Remove page->mapping references" from
   Matthew Wilcox gets us closer to being able to remove page->mapping.
 
 - The 5 patch series "relayfs: misc changes" from Jason Xing does some
   maintenance and minor feature addition work in relayfs.
 
 - The 5 patch series "kdump: crashkernel reservation from CMA" from Jiri
   Bohac switches us from static preallocation of the kdump crashkernel's
   working memory over to dynamic allocation.  So the difficulty of
   a-priori estimation of the second kernel's needs is removed and the
   first kernel obtains extra memory.
 
 - The 5 patch series "generalize panic_print's dump function to be used
   by other kernel parts" from Feng Tang implements some consolidation and
   rationalizatio of the various ways in which a faiing kernel splats
   information at the operator.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+82gAKCRDdBJ7gKXxA
 jj4JAP9xb+w9DrBY6sa+7KTPIb+aTqQ7Zw3o9O2m+riKQJv6jAEA6aEwRnDA0451
 fDT5IqVlCWGvnVikdZHSnvhdD7TGsQ0=
 =rT71
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
     us closer to being able to remove page->mapping

   - "relayfs: misc changes" (Jason Xing) does some maintenance and
     minor feature addition work in relayfs

   - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
     us from static preallocation of the kdump crashkernel's working
     memory over to dynamic allocation. So the difficulty of a-priori
     estimation of the second kernel's needs is removed and the first
     kernel obtains extra memory

   - "generalize panic_print's dump function to be used by other
     kernel parts" (Feng Tang) implements some consolidation and
     rationalization of the various ways in which a failing kernel
     splats information at the operator

* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
  tools/getdelays: add backward compatibility for taskstats version
  kho: add test for kexec handover
  delaytop: enhance error logging and add PSI feature description
  samples: Kconfig: fix spelling mistake "instancess" -> "instances"
  fat: fix too many log in fat_chain_add()
  scripts/spelling.txt: add notifer||notifier to spelling.txt
  xen/xenbus: fix typo "notifer"
  net: mvneta: fix typo "notifer"
  drm/xe: fix typo "notifer"
  cxl: mce: fix typo "notifer"
  KVM: x86: fix typo "notifer"
  MAINTAINERS: add maintainers for delaytop
  ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
  ucount: fix atomic_long_inc_below() argument type
  kexec: enable CMA based contiguous allocation
  stackdepot: make max number of pools boot-time configurable
  lib/xxhash: remove unused functions
  init/Kconfig: restore CONFIG_BROKEN help text
  lib/raid6: update recov_rvv.c zero page usage
  docs: update docs after introducing delaytop
  ...
2025-08-03 16:23:09 -07:00
Lorenzo Stoakes f225b34f1e mm/mseal: always define VM_SEALED
Patch series "mseal cleanups", v4.

Perform a number of cleanups to the mseal logic.  Firstly, VM_SEALED is
treated differently from every other VMA flag, it really doesn't make
sense to do this, so we start by making this consistent with everything
else.

Next we place the madvise logic where it belongs - in mm/madvise.c.  It
really makes no sense to abstract this elsewhere.  In doing so, we go to
great lengths to explain very clearly the previously very confusing logic
as to what sealed mappings are impacted here.

In doing so, we retain existing logic regarding treatment of madvise()
discard operations for a sealed, read-only MAP_PRIVATE file-backed
mapping.  This is something we likely need to revisit.

We then abstract out and explain the 'are there are any gaps in this range
in the mm?' check being performed as a prerequisite to mseal being
performed.

Finally, we simplify the actual mseal logic which is really quite
straightforward.

No functional change is intended.


This patch (of 4):

There is no reason to treat VM_SEALED in a special way, in each other case
in which a VMA flag is unavailable due to configuration, we simply assign
that flag to VM_NONE, so make VM_SEALED consistent with all other VMA
flags in this respect.

Additionally, use the next available bit for VM_SEALED, 42, rather than
arbitrarily putting it at 63 and update the declaration to match all other
VMA flags.

No functional change intended.

Link: https://lkml.kernel.org/r/cover.1753431105.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/aeb398a77029b6e7377cd944328bc9bbc3c90537.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:09 -07:00
Suresh K C d6a511dea4 selftests: cachestat: add tests for mmap, refactor and enhance mmap test for cachestat validation
Add a cohesive test case that verifies cachestat behavior with
memory-mapped files using mmap().  Also refactor the test logic to reduce
redundancy, improve error reporting, and clarify failure messages for both
shmem and mmap file types.

[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20250709174657.6916-1-suresh.k.chandrappa@gmail.com
Signed-off-by: Suresh K C <suresh.k.chandrappa@gmail.com>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Tested-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:09 -07:00
wang lian b50e37889f selftests/mm: add process_madvise() tests
Add tests for process_madvise(), focusing on verifying behavior under
various conditions including valid usage and error cases.

[lianux.mm@gmail.com: v7]
  Link: https://lkml.kernel.org/r/20250729113109.12272-1-lianux.mm@gmail.com
Link: https://lkml.kernel.org/r/20250729113109.12272-1-lianux.mm@gmail.com
Link: https://lkml.kernel.org/r/20250721114614.40996-1-lianux.mm@gmail.com
Signed-off-by: wang lian <lianux.mm@gmail.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Mark Brown <broonie@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Zi Yan <ziy@nvidia.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:08 -07:00
Mike Rapoport (Microsoft) b753522bed kho: add test for kexec handover
Testing kexec handover requires a kernel driver that will generate some
data and preserve it with KHO on the first boot and then restore that data
and verify it was preserved properly after kexec.

To facilitate such test, along with the kernel driver responsible for data
generation, preservation and restoration add a script that runs a kernel
in a VM with a minimal /init.  The /init enables KHO, loads a kernel image
for kexec and runs kexec reboot.  After the boot of the kexeced kernel,
the driver verifies that the data was properly preserved.

[rppt@kernel.org: fix section mismatch]
  Link: https://lkml.kernel.org/r/aIiRC8fXiOXKbPM_@kernel.org
Link: https://lkml.kernel.org/r/20250727083733.2590139-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:01:41 -07:00
Amery Hung 7841811417 selftests/bpf: Test concurrent task local data key creation
Test thread-safety of tld_create_key(). Since tld_create_key() does
not rely on locks but memory barriers and atomic operations to protect
the shared metadata, the thread-safety of the function is non-trivial.
Make sure concurrent tld_key_create(), both valid and invalid, can not
race and corrupt metatada, which may leads to TLDs not being thread-
specific or duplicate TLDs with the same name.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20250730185903.3574598-5-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-01 18:00:46 -07:00
Amery Hung 120f1a950e selftests/bpf: Test basic task local data operations
Test basic operations of task local data with valid and invalid
tld_create_key().

For invalid calls, make sure they return the right error code and check
that the TLDs are not inserted by running tld_get_data("
value_not_exists") on the bpf side. The call should a null pointer.

For valid calls, first make sure the TLDs are created by calling
tld_get_data() on the bpf side. The call should return a valid pointer.

Finally, verify that the TLDs are indeed task-specific (i.e., their
addresses do not overlap) with multiple user threads. This done by
writing values unique to each thread, reading them from both user space
and bpf, and checking if the value read back matches the value written.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20250730185903.3574598-4-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-01 18:00:46 -07:00
Amery Hung 31e838e1cd selftests/bpf: Introduce task local data
Task local data defines an abstract storage type for storing task-
specific data (TLD). This patch provides user space and bpf
implementation as header-only libraries for accessing task local data.

Task local data is a bpf task local storage map with two UPTRs:

- tld_meta_u, shared by all tasks of a process, consists of the total
  count and size of TLDs and an array of metadata of TLDs. A TLD
  metadata contains the size and name. The name is used to identify a
  specific TLD in bpf programs.

- u_tld_data points to a task-specific memory. It stores TLD data and
  the starting offset of data in a page.

  Task local design decouple user space and bpf programs. Since bpf
  program does not know the size of TLDs in compile time, u_tld_data
  is declared as a page to accommodate TLDs up to a page. As a result,
  while user space will likely allocate memory smaller than a page for
  actual TLDs, it needs to pin a page to kernel. It will pin the page
  that contains enough memory if the allocated memory spans across the
  page boundary.

The library also creates another task local storage map, tld_key_map,
to cache keys for bpf programs to speed up the access.

Below are the core task local data API:

                   User space                          BPF
  Define TLD       TLD_DEFINE_KEY(), tld_create_key()  -
  Init TLD object  -                                   tld_object_init()
  Get TLD data     tld_get_data()                      tld_get_data()

- TLD_DEFINE_KEY(), tld_create_key()

  A TLD is first defined by the user space with TLD_DEFINE_KEY() or
  tld_create_key(). TLD_DEFINE_KEY() defines a TLD statically and
  allocates just enough memory during initialization. tld_create_key()
  allows creating TLDs on the fly, but has a fix memory budget,
  TLD_DYN_DATA_SIZE.

  Internally, they all call __tld_create_key(), which iterates
  tld_meta_u->metadata to check if a TLD can be added. The total TLD
  size needs to fit into a page (limit of UPTR), and no two TLDs can
  have the same name. If a TLD can be added, u_tld_meta->cnt is
  increased using cmpxchg as there may be other concurrent
  __tld_create_key(). After a successful cmpxchg, the last available
  tld_meta_u->metadata now belongs to the calling thread. To prevent
  other threads from reading incomplete metadata while it is being
  updated, tld_meta_u->metadata->size is used to signal the completion.

  Finally, the offset, derived from adding up prior TLD sizes is then
  encapsulated as an opaque object key to prevent user misuse. The
  offset is guaranteed to be 8-byte aligned to prevent load/store
  tearing and allow atomic operations on it.

- tld_get_data()

  User space programs can pass the key to tld_get_data() to get a
  pointer to the associated TLD. The pointer will remain valid for the
  lifetime of the thread.

  tld_data_u is lazily allocated on the first call to tld_get_data().
  Trying to read task local data from bpf will result in -ENODATA
  during tld_object_init(). The task-specific memory need to be freed
  manually by calling tld_free() on thread exit to prevent memory leak
  or use TLD_FREE_DATA_ON_THREAD_EXIT.

- tld_object_init() (BPF)

  BPF programs need to call tld_object_init() before calling
  tld_get_data(). This is to avoid redundant map lookup in
  tld_get_data() by storing pointers to the map values on stack.
  The pointers are encapsulated as tld_object.

  tld_key_map is also created on the first time tld_object_init()
  is called to cache TLD keys successfully fetched by tld_get_data().

  bpf_task_storage_get(.., F_CREATE) needs to be retried since it may
  fail when another thread has already taken the percpu counter lock
  for the task local storage.

- tld_get_data() (BPF)

  BPF programs can also get a pointer to a TLD with tld_get_data().
  It uses the cached key in tld_key_map to locate the data in
  tld_data_u->data. If the cached key is not set yet (<= 0),
  __tld_fetch_key() will be called to iterate tld_meta_u->metadata
  and find the TLD by name. To prevent redundant string comparison
  in the future when the search fail, the tld_meta_u->cnt is stored
  in the non-positive range of the key. Next time, __tld_fetch_key()
  will be called only if there are new TLDs and the search will start
  from the newly added tld_meta_u->metadata using the old
  tld_meta_u-cnt.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20250730185903.3574598-3-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-01 18:00:46 -07:00
Linus Torvalds a6923c06a3 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiNNksACgkQ6rmadz2v
 bTrKRhAAnju4bbFRHU88Y68p6Meq/jxgjxHZAkTqZA0Nvbu2cItPRL7XHAAhTWE7
 OBEIm3UKCH4gs4fY8rDHiIgnnaQavXUmvXZblOIOjxnqRKJpU3px+wwJvGFq5Enq
 WP6UZV8tj+O2tNfNNYS+mgQvvIpUISHGpKimvx7ede3e1U3cJBkppbT3gooMHYuc
 5s1QtYHWaPY/1DpkHgqJ2UPGcbT9/HSPGMHRNaHKjQTcNcLcrj7RRjchgXqcc7Vs
 hVijvVrLiuK0MyU42ritmaqvjjgD6hKPZguRQe2/hAtrOo0Alf+4mXkMgam7simN
 iHfGc7nhw1xAFTPj4WXahja89G00FdDN5NR37Rgurm/i2fY7BuXAkMjiMiwGB3C3
 jk2wG3RSifYeC2rxhkYJdqcx8Cz6m+pjgyJ2o9Jy5dn426VXg/kzkUXpl6u5jaPZ
 SmKoo9Xu1r7xqTaUc9kk8pJI5Xt9vD5oQjF2KQuPZXxNidiwW6k2OGbW+wF26nEi
 Q6pfDu3pvHAd/UE6cD5yFe97o3Cc2XfGwI/Sv2k99UVPvNcvfAvVo9fsItHBhCPn
 zHkihW2S0zmbBlhcrB+PrLclNgLleP9JukFN+5scc0a9lbQxIm6v2TNKGlBfDQtO
 I+Kn266oqT4BEgnQGlCQquINnQAdmS8VMnnunGOu6+rwPUtkI7E=
 =XLHS
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix kCFI failures in JITed BPF code on arm64 (Sami Tolvanen, Puranjay
   Mohan, Mark Rutland, Maxwell Bland)

 - Disallow tail calls between BPF programs that use different cgroup
   local storage maps to prevent out-of-bounds access (Daniel Borkmann)

 - Fix unaligned access in flow_dissector and netfilter BPF programs
   (Paul Chaignon)

 - Avoid possible use of uninitialized mod_len in libbpf (Achill
   Gilgenast)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test for unaligned flow_dissector ctx access
  bpf: Improve ctx access verifier error message
  bpf: Check netfilter ctx accesses are aligned
  bpf: Check flow_dissector ctx accesses are aligned
  arm64/cfi,bpf: Support kCFI + BPF on arm64
  cfi: Move BPF CFI types and helpers to generic code
  cfi: add C CFI type macro
  libbpf: Avoid possible use of uninitialized mod_len
  bpf: Fix oob access in cgroup local storage
  bpf: Move cgroup iterator helpers to bpf.h
  bpf: Move bpf map owner out of common struct
  bpf: Add cookie object to bpf maps
2025-08-01 17:13:26 -07:00
Linus Torvalds d41e5839d8 cxl for v6.17
- Add documentation template for CXL conventions to document CXL platform quirks
 - Replace mutex_lock_io() with mutex_lock() for mailbox
 - Add location limit for fake CFMWS range for cxl_test, ARM platform enabling
 - CXL documentation typo and clarity fixes
 - Use correct format specifier for function cxl_set_ecs_threshold()
 - Make cxl_bus_type constant
 - Introduce new helper cxl_resource_contains_addr() to check address availability
 - Fix wrong DPA checking for PPR operation
 - Remove core/acpi.c and CXL core dependency on ACPI
 - Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks
   - Add CXL updates utilizing ACQUIRE() macro to remove gotos and improve
     readability
 - Add return for the dummy version of cxl_decoder_detach() without CONFIG_CXL_REGION
 - CXL events updates for spec r3.2
 - Fix return of __cxl_decoder_detach() error path
 - CXL debugfs documentation fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmiM6yQACgkQYGjFFmlT
 OErolBAAnUcYHhpmPJV46KWIhahOx5yLzPTVTzDjpeYQVsbkWLNSTBAL2ygKZt3d
 1zfWSI2f11xyA7nMWf0/2oP6oVRhrN9ty+NtoO118olqT909KQ3PGFnCGbYLLIUb
 SxPnZCxdWQ7xOq3KUgnF+2TSYoL+pa5Gv1m2bGCgf29KgPMrOk6wl/IoAwLEFfCU
 HVgIPDOo9IeYJkTZEZFSTPP4tr+JXygoFrd5k2+kj/GMqQJ5iDMYZxN4dPLhalic
 e4wOXQRMrD23defdegg6VDDWfftBGU3yQqixDNN7sFpWIM/hsnOMVPSz9sCbO0Bg
 2130sTPpsJXMVMUJoniPO5Ad3clhM4M8PoZ/QAO1J8Uodwv5wnKKwAWYFVmu6JU1
 mpn5xroWci1twXXxXCdlyM+sVA9TqvyFbndYvVWo5CTAyCkVHdx1oAvUbV0YzDXN
 5oeJTz42Y9PfawzTDiE5cMlGMGWux/VCiY9n909hg66gmdkUU/7zjjGYPEGT3tNK
 GjCQ0SGIss6bNE9b0g55of/o0O+U00n+8TdvzCkqcc/lG0Yqt5jDGQb94cZKNVn0
 OX01iyZj87w8vyroA1g2vi0aF4HB+88X7GKeiLuQCLe3iyYSkhNVz5EFyS5dQi7H
 r5gN5u7Fq9BfCVIWA6Hh1AU4BGVR1An3hxfib8dV7Ik0Bv1X9jA=
 =79ot
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:
 "The most significant changes in this pull request is the series that
  introduces ACQUIRE() and ACQUIRE_ERR() macros to replace conditional
  locking and ease the pain points of scoped_cond_guard().

  The series also includes follow on changes that refactor the CXL
  sub-system to utilize the new macros.

  Detail summary:

   - Add documentation template for CXL conventions to document CXL
     platform quirks

   - Replace mutex_lock_io() with mutex_lock() for mailbox

   - Add location limit for fake CFMWS range for cxl_test, ARM platform
     enabling

   - CXL documentation typo and clarity fixes

   - Use correct format specifier for function cxl_set_ecs_threshold()

   - Make cxl_bus_type constant

   - Introduce new helper cxl_resource_contains_addr() to check address
     availability

   - Fix wrong DPA checking for PPR operation

   - Remove core/acpi.c and CXL core dependency on ACPI

   - Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks

   - Add CXL updates utilizing ACQUIRE() macro to remove gotos and
     improve readability

   - Add return for the dummy version of cxl_decoder_detach() without
     CONFIG_CXL_REGION

   - CXL events updates for spec r3.2

   - Fix return of __cxl_decoder_detach() error path

   - CXL debugfs documentation fix"

* tag 'cxl-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (28 commits)
  Documentation/ABI/testing/debugfs-cxl: Add 'cxl' to clear_poison path
  cxl/region: Fix an ERR_PTR() vs NULL bug
  cxl/events: Trace Memory Sparing Event Record
  cxl/events: Add extra validity checks for CVME count in DRAM Event Record
  cxl/events: Add extra validity checks for corrected memory error count in General Media Event Record
  cxl/events: Update Common Event Record to CXL spec rev 3.2
  cxl: Fix -Werror=return-type in cxl_decoder_detach()
  cleanup: Fix documentation build error for ACQUIRE updates
  cxl: Convert to ACQUIRE() for conditional rwsem locking
  cxl/region: Consolidate cxl_decoder_kill_region() and cxl_region_detach()
  cxl/region: Move ready-to-probe state check to a helper
  cxl/region: Split commit_store() into __commit() and queue_reset() helpers
  cxl/decoder: Drop pointless locking
  cxl/decoder: Move decoder register programming to a helper
  cxl/mbox: Convert poison list mutex to ACQUIRE()
  cleanup: Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks
  cxl: Remove core/acpi.c and cxl core dependency on ACPI
  cxl/core: Using cxl_resource_contains_addr() to check address availability
  cxl/edac: Fix wrong dpa checking for PPR operation
  cxl/core: Introduce a new helper cxl_resource_contains_addr()
  ...
2025-08-01 15:47:06 -07:00
Paul Chaignon d8d2d9d12f selftests/bpf: Test for unaligned flow_dissector ctx access
This patch adds tests for two context fields where unaligned accesses
were not properly rejected.

Note the new macro is similar to the existing narrow_load macro, but we
need a different description and access offset. Combining the two
macros into one is probably doable but I don't think it would help
readability.

vmlinux.h is included in place of bpf.h so we have the definition of
struct bpf_nf_ctx.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/bf014046ddcf41677fb8b98d150c14027e9fddba.1754039605.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-08-01 14:47:39 -07:00
Eric Dumazet 7cbd49795d selftests: avoid using ifconfig
ifconfig is deprecated and not always present, use ip command instead.

Fixes: e0f3b3e5c7 ("selftests: Add test cases for vlan_filter modification during runtime")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250730115313.3356036-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-01 14:40:19 -07:00
Linus Torvalds 0bd0a41a51 pci-v6.17-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmiL3OkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vz9bhAAqiD9REYlNUgGX/bEBgCVPFdtjjTz
 FpSLzG23vWd2J0FEy04qtQWH9j71IXnM+yMybzsMe9SsPt2HhczzSCIMpPj0FZNN
 ccOf3gA/KqPux7FORrS3mpM8OO4ICt3XZhCji3nNg5iW5XlH+NrQKPVxRlvBB0rP
 +7RxSjDClUdZ97QSSmp1uZ7Qh1qyV0Ht0qjPMwecrnB2kApt4ZaMphAaKPEjX/4f
 RgZPFqbIpRWt9e87Z8ADr5c2jokZAzIV0zauQ2fhbjBkTcXIXL3yOzUbR+ngBWDD
 oq21rXJBUCQheA7J6j2SKabgF9AZaI5NI9ERld5vJ1inXSZCyuyKopN1AzuKZquG
 N+jyYJqZC99ePvMLbTWs/spU58J03A6TOwaJNE3ISRgbnxFkhvLl7h68XuTDonZm
 hYGloXXUj+i+rh7/eJIDDWa9MTpEvl2p1zc6EDIZ/umlnHwg9rGlGQVARMCs6Ist
 EiJQEtjMMlXiBJMkFhpxesOdyonGkxAL9WtT6MoEOFF7dqgsTqSKiDUPa+6MHV+I
 tsTB630J3ROsWGfQD1uJI2BrCm+op4j6faamH6UMqCrUU0TUZMHiRR3qVWbM6qgU
 /WL1gZ96uy5I7UoE0+gH+wMhMClO2BnsxffocToDE5wOYpGDd5BwPEoY8ej8U2lu
 CBMCkMor1jDtS8Y=
 =ipv3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Allow built-in drivers, not just modular drivers, to use async
     initial probing (Lukas Wunner)

   - Support Immediate Readiness even on devices with no PM Capability
     (Sean Christopherson)

   - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the
     required delay between a reset and sending config requests to a
     device (Niklas Cassel)

   - Add pci_is_display() to check for "Display" base class and use it
     in ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

   - Allow 'isolated PCI functions' (multi-function devices without a
     function 0) for LoongArch, similar to s390 and jailhouse (Huacai
     Chen)

  Power control:

   - Add ability to enable optional slot clock for cases where the PCIe
     host controller and the slot are supplied by different clocks
     (Marek Vasut)

  PCIe native device hotplug:

   - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
     misinterpreting a config read failure after a device has been
     removed (Lukas Wunner)

   - Avoid creating a useless PCIe port service device for pciehp if the
     slot is handled by the ACPI hotplug driver (Lukas Wunner)

   - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug
     ports (Lukas Wunner)

  Virtualization:

   - Save VF resizable BAR state and restore it after reset (Michał
     Winiarski)

   - Allow IOV resources (VF BARs) to be resized (Michał Winiarski)

   - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size
     (Michał Winiarski)

  Endpoint framework:

   - Add RC-to-EP doorbell support using platform MSI controller,
     including a test case (Frank Li)

   - Allow BAR assignment via configfs so platforms have flexibility in
     determining BAR usage (Jerome Brunet)

  Native PCIe controller drivers:

   - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie,
     axis,artpec6-pcie, marvell,armada-3700-pcie, st,spear1340-pcie to
     DT schema format (Rob Herring)

   - Use dev_fwnode() instead of of_fwnode_handle() to remove OF
     dependency in altera (fixes an unused variable), designware-host,
     mediatek, mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma,
     xilinx-nwl (Jiri Slaby, Arnd Bergmann)

   - Convert aardvark, altera, brcmstb, designware-host, iproc,
     mediatek, mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx,
     xilinx-dma, xilinx-nwl from using pci_msi_create_irq_domain() to
     using msi_create_parent_irq_domain() instead; this makes the
     interrupt controller per-PCI device, allows dynamic allocation of
     vectors after initialization, and allows support of IMS (Nam Cao)

  APM X-Gene PCIe controller driver:

   - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug
     bits, use device-managed memory allocations, and clean things up
     (Marc Zyngier)

   - Probe xgene-msi as a standard platform driver rather than a
     subsys_initcall (Marc Zyngier)

  Broadcom STB PCIe controller driver:

   - Add optional DT 'num-lanes' property and if present, use it to
     override the Maximum Link Width advertised in Link Capabilities
     (Jim Quinlan)

  Cadence PCIe controller driver:

   - Use PCIe Message routing types from the PCI core rather than
     defining private ones (Hans Zhang)

  Freescale i.MX6 PCIe controller driver:

   - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu)

   - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features
     (Richard Zhu)

   - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can
     trigger doorbel on Endpoint (Frank Li)

   - Remove apps_reset (LTSSM_EN) from
     imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug
     regression on i.MX8MM (Richard Zhu)

   - Delay Endpoint link start until configfs 'start' written (Richard
     Zhu)

  Intel VMD host bridge driver:

   - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo)

  Qualcomm PCIe controller driver:

   - Add DT binding and driver support for SA8255p, which supports ECAM
     for Configuration Space access (Mayank Rana)

   - Update DT binding and driver to describe PHYs and per-Root Port
     resets in a Root Port stanza and deprecate describing them in the
     host bridge; this makes it possible to support multiple Root Ports
     in the future (Krishna Chaitanya Chundru)

   - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang)

   - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang)

   - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT
     bindings (Konrad Dybcio)

   - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue
     Zhang)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Rockchip PCIe controller driver:

   - Drop unused PCIe Message routing and code definitions (Hans Zhang)

   - Remove several unused header includes (Hans Zhang)

   - Use standard PCIe config register definitions instead of
     rockchip-specific redefinitions (Geraldo Nascimento)

   - Set Target Link Speed to 5.0 GT/s before retraining so we have a
     chance to train at a higher speed (Geraldo Nascimento)

  Rockchip DesignWare PCIe controller driver:

   - Prevent race between link training and register update via DBI by
     inhibiting link training after hot reset and link down (Wilfred
     Mallawa)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Sophgo PCIe controller driver:

   - Add DT binding and driver for Sophgo SG2044 PCIe controller driver
     in Root Complex mode (Inochi Amaoto)

  Synopsys DesignWare PCIe controller driver:

   - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
     Ports that support > 5.0 GT/s. Slower Ports still rely on the
     not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while
     waiting for the Link (Niklas Cassel)"

* tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits)
  dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset
  dt-bindings: PCI: Remove 83xx-512x-pci.txt
  dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
  dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
  dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
  dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
  dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
  PCI: Move is_pciehp check out of pciehp_is_native()
  PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
  PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
  PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
  selftests: pci_endpoint: Add doorbell test case
  misc: pci_endpoint_test: Add doorbell test case
  PCI: endpoint: pci-epf-test: Add doorbell test support
  PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
  PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
  PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
  PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode
  PCI: vmd: Switch to msi_create_parent_irq_domain()
  PCI: vmd: Convert to lock guards
  ...
2025-08-01 13:59:07 -07:00
Ido Schimmel f8fded7536 selftests: net: Fix flaky neighbor garbage collection test
The purpose of the "Periodic garbage collection" test case is to make
sure that "extern_valid" neighbors are not flushed during periodic
garbage collection, unlike regular neighbor entries.

The test case is currently doing the following:

1. Changing the base reachable time to 10 seconds so that periodic
   garbage collection will run every 5 seconds.

2. Changing the garbage collection stale time to 5 seconds so that
   neighbors that have not been used in the last 5 seconds will be
   considered for removal.

3. Waiting for the base reachable time change to take effect.

4. Adding an "extern_valid" neighbor, a non-"extern_valid" neighbor and
   a bunch of other neighbors so that the threshold ("thresh1") will be
   crossed and stale neighbors will be flushed during garbage
   collection.

5. Waiting for 10 seconds to give garbage collection a chance to run.

6. Checking that the "extern_valid" neighbor was not flushed and that
   the non-"extern_valid" neighbor was flushed.

The test sometimes fails in the netdev CI because the non-"extern_valid"
neighbor was not flushed. I am unable to reproduce this locally, but my
theory that since we do not know exactly when the periodic garbage
collection runs, it is possible for it to run at a time when the
non-"extern_valid" neighbor is still not considered stale.

Fix by moving the addition of the two neighbors before step 3 and by
reducing the garbage collection stale time to 1 second, to ensure that
both neighbors are considered stale when garbage collection runs.

Fixes: 171f2ee31a ("selftests: net: Add a selftest for externally validated neighbor entries")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20250728093504.4ebbd73c@kernel.org/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250731110914.506890-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-01 13:51:41 -07:00
Linus Torvalds 7a64bdfaf3 sound updates #2 for 6.17-rc1
For catching up the remaining stuff for 6.17: only small updates
 and the rest are mostly small fixes.
 
 - Fixes in HD-audio codec driver Kconfig, so that configurations
   can be more easily/safely carried between different versions
 - Fixes in ASoC SDCA, FSL xcvr, AW88399
 - ASoC IMX WM8524 support
 - HD-audio and USB-audio quirks and fixes
 - A minor selftest fix
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmiMtFsOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE/VTA/+M9QtrWs3IUGUncud9ku4Jd7YnKqR+PE8hUey
 Y5rPXpj1jH7wx3bXtSbIzrn5rr2uaZ4LwX6cygmXxh9xwKJDOyCXCQ7v9X19wVFu
 cw9SYFnokBVXgQ0fibXixps0YB4Ll86mflckv7TgVT0W4kYmu3XLDRsmtdnXKRrG
 7LOdXON+6jbluTQGXJcxh3pXNOMmeJlQ6FxajHgPQ91iQlB5fZc6tEiLZ06fzOaE
 BOyWNwitXBcsUjitHralJNUfY1POvSBBai13chY3L2xkntVR7uoojia507+YyCqA
 dqGErDclSo85dHd2ZLwGKsT63tPSgPMfP6q7BS08uZrkqrMAH9KsSyg3HB7SHXcm
 VvS9Is4ThovPAkq0naaGoT3UIHNuLBAG0dPy56U6YHRQa0rmp/ZBBGhLTvW735lF
 uugtfDmimoZ12YmC2DkalptcOeOelB+V6CVWYF9D4558DGqLTwHtvn4zs5/oR7yt
 4r6qNiBzrcKqxcMjlXgQYHHdhkCbiqbXw87daDhoLJtBtIXASpRfRBJpOx+lDe9Y
 LCRDq9juGsclDQdr+rzeO7NVhLrEiZmkabenYr6VWZSCjle7d5ro7yRc8qLAlu0Z
 WRUqw7P0wnIZYgSXieaMaEBndiFo0PJljPJGYS+lGjVKduNM9eTdR12PNwZeVRP+
 wFwtKzM=
 =Q+iZ
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull more sound updates from Takashi Iwai:
 "For catching up the remaining stuff for 6.17: only small updates and
  the rest are mostly small fixes.

   - Fixes in HD-audio codec driver Kconfig, so that configurations can
     be more easily/safely carried between different versions

   - Fixes in ASoC SDCA, FSL xcvr, AW88399

   - ASoC IMX WM8524 support

   - HD-audio and USB-audio quirks and fixes

   - A minor selftest fix"

* tag 'sound-6.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: usb: scarlett2: Fix missing NULL check
  mips: Update HD-audio configs again
  LoongArch: Update HD-audio codec configs
  arm: Update HD-audio configs again
  selftests: ALSA: fix memory leak in utimer test
  ALSA: usb-audio: Add DSD support for Comtrue USB Audio device
  ALSA: hda/hdmi: Enable drivers as default
  ALSA: hda/cirrus: Enable drivers as default
  ALSA: hda/realtek: Enable drivers as default
  ALSA: hda/realtek - Fix mute LED for HP Victus 16-d1xxx (MB 8A26)
  ALSA: hda/realtek - Fix mute LED for HP Victus 16-s0xxx
  ALSA: hda: Fix the wrong register was used for DVC of TAS2770
  ALSA: scarlett2: Add retry on -EPROTO from scarlett2_usb_tx()
  ALSA: hda/realtek - Fix mute LED for HP Victus 16-r1xxx
  ASoC: codecs: Add acpi_match_table for aw88399 driver
  ASoC: dt-bindings: atmel,at91-ssc: add microchip,sam9x7-ssc
  ASoC: imx-card: Add WM8524 support
  ASoC: fsl_xcvr: get channel status data with firmware exists
  ASoC: fsl_xcvr: get channel status data when PHY is not exists
  ASoC: SDCA: Add support for -cn- value properties
  ...
2025-08-01 12:26:24 -07:00
Linus Torvalds b80a75cf69 hid-for-linus-2025073101
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEL65usyKPHcrRDEicpmLzj2vtYEkFAmiL2DAACgkQpmLzj2vt
 YEk9aRAApXAYs/LNZ6hS2FvqxSKJhThLEm4sib6CvdQ82Er92aqrZEutkpb8Rq0/
 KMPLYTv9QsZgFKilZJnxIRwZRpy37C7SGO8Y3PSARpakxpCPe76RyqcY+DghrB9+
 kBRUzfDq9JR8RM8lcxw5dgEjgI3cz/5gS8OFh/LOVqe2nqub3U3Dr2kQkDE3DSiG
 BqIPhIG//TW0al6IxJQVL59egzVqzzqcgK/efBwT7WjToaCjXuZTClDiKASVLcmW
 p/xvsPjIBeVuEdOTfboiHowOs3RyXa3zWlHS76/x9JXVDPOrPcleJyoxEedU4BBG
 selhkdVEYcCOYMlzz7/p1BrIkkpczWfye5rSGQ05qmOeApMAxicKRp0DfTLm0EGa
 bdmDLIncSBRBb5iATrRS+rQcV6PSzLCaB9+rGiWS/EQPFCKh54SZSNvl5rw0WCHY
 +b9zmxmr2KG7B8eQQXx0/x6RNCC3gANqkVO4+TXamcGhJaoW2M0+xT9tzid0sL+S
 Y7t5C8T4KoI4dc2be76KNs/LlR6F5LK1E0HTkapdG+ZoiLc/PsqK0pnGc0aXd+/z
 Ee246MZEMsURgOEA68dEm4XBdh2xp9xVZIsZtaWcfSSQZJnfpx89v9Tl4MUlfqzj
 8BPLeYxxg9gxfxpGcP+oGMjXbeKDjSbmRJroOP4jw/WdPZ2rs0s=
 =kYcB
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-2025073101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID updates from Jiri Kosina:

 - hardening of HID core parser against conversion to 0 bits in s32ton()
   by buggy/malicious devices (Alan Stern)

 - fix for potential NULL pointer dereference in hid-apple that could be
   caused by malicious device with APPLE_MAGIC_BACKLIGHT quirk present
   triggering overflow in data field (Qasim Ijaz)

 - support for Wake-on-touch in intel-thc (Even Xu)

 - support for "Input max input size control" and "Input interrupt
   delay" I2C features in order to improve compatibility of THC devices
   with legacy HIDI2C touch devices (Even Xu)

 - support for Touch Bars on x86 MacBook Pros (Kerem Karabay)

 - support for XP-PEN Artist 22R Pro (Joshua Goins)

 - third party trackpart support for MacBookPro15,1 (Aditya Garg)

 - Apple Magic Keyboard A311[89] USB-C support (Aditya Garg, Grigorii
   Sokoli)

 - support for operating modes in amd-sfh (Basavaraj Natikar)

 - avoid setting up battery timer for Apple and Magicmouse devices
   without battery (Aditya Garg)

 - fix for behavior of the hid-mcp2221 driver for !CONFIG_IIO cases
   (Heiko Schocher)

 - other assorted fixups and device ID additions

* tag 'hid-for-linus-2025073101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (54 commits)
  HID: core: Harden s32ton() against conversion to 0 bits
  HID: apple: validate feature-report field count to prevent NULL pointer dereference
  HID: core: Improve the kerneldoc for hid_report_len()
  selftests/hid: sync python tests to hid-tools 0.10
  selftests/hid: sync the python tests to hid-tools 0.8
  selftests/hid: run ruff format on the python part
  HID: magicmouse: use secs_to_jiffies() for battery timeout
  HID: apple: use secs_to_jiffies() for battery timeout
  HID: magicmouse: avoid setting up battery timer when not needed
  HID: apple: avoid setting up battery timer for devices without battery
  HID: amd_sfh: Enable operating mode
  HID: uclogic: Add support for XP-PEN Artist 22R Pro
  HID: rate-limit hid_warn to prevent log flooding
  HID: replace scnprintf() with sysfs_emit()
  HID: uclogic: make read-only array reconnect_event static const
  HID: mcp-2221: Replace manual comparison with min() macro
  HID: intel-thc-hid: Separate max input size control conditional list
  HID: mcp2221: set gpio pin mode
  HID: multitouch: add device ID for Apple Touch Bar
  HID: multitouch: specify that Apple Touch Bar is direct
  ...
2025-07-31 21:26:05 -07:00
Linus Torvalds 6a68cec16b sched_ext: Changes for v6.17
- Add support for cgroup "cpu.max" interface.
 
 - Code organization cleanup so that ext_idle.c doesn't depend on the
   source-file-inclusion build method of sched/.
 
 - Drop UP paths in accordance with sched core changes.
 
 - Documentation and other misc changes.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaIqnxg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGUh5AQC6YM7ggRPYRmy28m5B0nubpKtCHqPOAHSd/QbY
 MCiThgD+JuE9ewg3wYO/jvJx3NyIRB1McMnAaG59hf6R0Plh5Qo=
 =TeLF
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext updates from Tejun Heo:

 - Add support for cgroup "cpu.max" interface

 - Code organization cleanup so that ext_idle.c doesn't depend on the
   source-file-inclusion build method of sched/

 - Drop UP paths in accordance with sched core changes

 - Documentation and other misc changes

* tag 'sched_ext-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Fix scx_bpf_reenqueue_local() reference
  sched_ext: Drop kfuncs marked for removal in 6.15
  sched_ext, rcu: Eject BPF scheduler on RCU CPU stall panic
  kernel/sched/ext.c: fix typo "occured" -> "occurred" in comments
  sched_ext: Add support for cgroup bandwidth control interface
  sched_ext, sched/core: Factor out struct scx_task_group
  sched_ext: Return NULL in llc_span
  sched_ext: Always use SMP versions in kernel/sched/ext_idle.h
  sched_ext: Always use SMP versions in kernel/sched/ext_idle.c
  sched_ext: Always use SMP versions in kernel/sched/ext.h
  sched_ext: Always use SMP versions in kernel/sched/ext.c
  sched_ext: Documentation: Clarify time slice handling in task lifecycle
  sched_ext: Make scx_locked_rq() inline
  sched_ext: Make scx_rq_bypassing() inline
  sched_ext: idle: Make local functions static in ext_idle.c
  sched_ext: idle: Remove unnecessary ifdef in scx_bpf_cpu_node()
2025-07-31 16:29:46 -07:00
Linus Torvalds 6aee5aed2e cgroup: Changes for v6.17
- Allow css_rstat_updated() in NMI context to enable memory accounting for
   allocations in NMI context.
 
 - /proc/cgroups doesn't contain useful information for cgroup2 and was
   updated to only show v1 controllers. This unfortunately broke something in
   the wild. Add an option to bring back the old behavior to ease transition.
 
 - selftest updates and other cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaIqlxQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGcTMAQDUlGf50ATWB9hDU7zUG4lVn8s8n8/+x8QFGHn4
 e4NERQD9FpU/jLN+cwGgspKo+L9qpu/1g+t36cJLcOuEKKoaQwI=
 =FLwx
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - Allow css_rstat_updated() in NMI context to enable memory accounting
   for allocations in NMI context.

 - /proc/cgroups doesn't contain useful information for cgroup2 and was
   updated to only show v1 controllers. This unfortunately broke
   something in the wild. Add an option to bring back the old behavior
   to ease transition.

 - selftest updates and other cleanups.

* tag 'cgroup-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Add compatibility option for content of /proc/cgroups
  selftests/cgroup: fix cpu.max tests
  cgroup: llist: avoid memory tears for llist_node
  selftests: cgroup: Fix missing newline in test_zswap_writeback_one
  selftests: cgroup: Allow longer timeout for kmem_dead_cgroups cleanup
  memcg: cgroup: call css_rstat_updated irrespective of in_nmi()
  cgroup: remove per-cpu per-subsystem locks
  cgroup: make css_rstat_updated nmi safe
  cgroup: support to enable nmi-safe css_rstat_updated
  selftests: cgroup: Fix compilation on pre-cgroupns kernels
  selftests: cgroup: Optionally set up v1 environment
  selftests: cgroup: Add support for named v1 hierarchies in test_core
  selftests: cgroup_util: Add helpers for testing named v1 hierarchies
  Documentation: cgroup: add section explaining controller availability
  cgroup: Drop sock_cgroup_classid() dummy implementation
2025-07-31 16:04:19 -07:00
Linus Torvalds beace86e61 Summary of significant series in this pull request:
- The 4 patch series "mm: ksm: prevent KSM from breaking merging of new
   VMAs" from Lorenzo Stoakes addresses an issue with KSM's
   PR_SET_MEMORY_MERGE mode: newly mapped VMAs were not eligible for
   merging with existing adjacent VMAs.
 
 - The 4 patch series "mm/damon: introduce DAMON_STAT for simple and
   practical access monitoring" from SeongJae Park adds a new kernel module
   which simplifies the setup and usage of DAMON in production
   environments.
 
 - The 6 patch series "stop passing a writeback_control to swap/shmem
   writeout" from Christoph Hellwig is a cleanup to the writeback code
   which removes a couple of pointers from struct writeback_control.
 
 - The 7 patch series "drivers/base/node.c: optimization and cleanups"
   from Donet Tom contains largely uncorrelated cleanups to the NUMA node
   setup and management code.
 
 - The 4 patch series "mm: userfaultfd: assorted fixes and cleanups" from
   Tal Zussman does some maintenance work on the userfaultfd code.
 
 - The 5 patch series "Readahead tweaks for larger folios" from Ryan
   Roberts implements some tuneups for pagecache readahead when it is
   reading into order>0 folios.
 
 - The 4 patch series "selftests/mm: Tweaks to the cow test" from Mark
   Brown provides some cleanups and consistency improvements to the
   selftests code.
 
 - The 4 patch series "Optimize mremap() for large folios" from Dev Jain
   does that.  A 37% reduction in execution time was measured in a
   memset+mremap+munmap microbenchmark.
 
 - The 5 patch series "Remove zero_user()" from Matthew Wilcox expunges
   zero_user() in favor of the more modern memzero_page().
 
 - The 3 patch series "mm/huge_memory: vmf_insert_folio_*() and
   vmf_insert_pfn_pud() fixes" from David Hildenbrand addresses some warts
   which David noticed in the huge page code.  These were not known to be
   causing any issues at this time.
 
 - The 3 patch series "mm/damon: use alloc_migrate_target() for
   DAMOS_MIGRATE_{HOT,COLD" from SeongJae Park provides some cleanup and
   consolidation work in DAMON.
 
 - The 3 patch series "use vm_flags_t consistently" from Lorenzo Stoakes
   uses vm_flags_t in places where we were inappropriately using other
   types.
 
 - The 3 patch series "mm/memfd: Reserve hugetlb folios before
   allocation" from Vivek Kasireddy increases the reliability of large page
   allocation in the memfd code.
 
 - The 14 patch series "mm: Remove pXX_devmap page table bit and pfn_t
   type" from Alistair Popple removes several now-unneeded PFN_* flags.
 
 - The 5 patch series "mm/damon: decouple sysfs from core" from SeongJae
   Park implememnts some cleanup and maintainability work in the DAMON
   sysfs layer.
 
 - The 5 patch series "madvise cleanup" from Lorenzo Stoakes does quite a
   lot of cleanup/maintenance work in the madvise() code.
 
 - The 4 patch series "madvise anon_name cleanups" from Vlastimil Babka
   provides additional cleanups on top or Lorenzo's effort.
 
 - The 11 patch series "Implement numa node notifier" from Oscar Salvador
   creates a standalone notifier for NUMA node memory state changes.
   Previously these were lumped under the more general memory on/offline
   notifier.
 
 - The 6 patch series "Make MIGRATE_ISOLATE a standalone bit" from Zi Yan
   cleans up the pageblock isolation code and fixes a potential issue which
   doesn't seem to cause any problems in practice.
 
 - The 5 patch series "selftests/damon: add python and drgn based DAMON
   sysfs functionality tests" from SeongJae Park adds additional drgn- and
   python-based DAMON selftests which are more comprehensive than the
   existing selftest suite.
 
 - The 5 patch series "Misc rework on hugetlb faulting path" from Oscar
   Salvador fixes a rather obscure deadlock in the hugetlb fault code and
   follows that fix with a series of cleanups.
 
 - The 3 patch series "cma: factor out allocation logic from
   __cma_declare_contiguous_nid" from Mike Rapoport rationalizes and cleans
   up the highmem-specific code in the CMA allocator.
 
 - The 28 patch series "mm/migration: rework movable_ops page migration
   (part 1)" from David Hildenbrand provides cleanups and
   future-preparedness to the migration code.
 
 - The 2 patch series "mm/damon: add trace events for auto-tuned
   monitoring intervals and DAMOS quota" from SeongJae Park adds some
   tracepoints to some DAMON auto-tuning code.
 
 - The 6 patch series "mm/damon: fix misc bugs in DAMON modules" from
   SeongJae Park does that.
 
 - The 6 patch series "mm/damon: misc cleanups" from SeongJae Park also
   does what it claims.
 
 - The 4 patch series "mm: folio_pte_batch() improvements" from David
   Hildenbrand cleans up the large folio PTE batching code.
 
 - The 13 patch series "mm/damon/vaddr: Allow interleaving in
   migrate_{hot,cold} actions" from SeongJae Park facilitates dynamic
   alteration of DAMON's inter-node allocation policy.
 
 - The 3 patch series "Remove unmap_and_put_page()" from Vishal Moola
   provides a couple of page->folio conversions.
 
 - The 4 patch series "mm: per-node proactive reclaim" from Davidlohr
   Bueso implements a per-node control of proactive reclaim - beyond the
   current memcg-based implementation.
 
 - The 14 patch series "mm/damon: remove damon_callback" from SeongJae
   Park replaces the damon_callback interface with a more general and
   powerful damon_call()+damos_walk() interface.
 
 - The 10 patch series "mm/mremap: permit mremap() move of multiple VMAs"
   from Lorenzo Stoakes implements a number of mremap cleanups (of course)
   in preparation for adding new mremap() functionality: newly permit the
   remapping of multiple VMAs when the user is specifying MREMAP_FIXED.  It
   still excludes some specialized situations where this cannot be
   performed reliably.
 
 - The 3 patch series "drop hugetlb_free_pgd_range()" from Anthony Yznaga
   switches some sparc hugetlb code over to the generic version and removes
   the thus-unneeded hugetlb_free_pgd_range().
 
 - The 4 patch series "mm/damon/sysfs: support periodic and automated
   stats update" from SeongJae Park augments the present
   userspace-requested update of DAMON sysfs monitoring files.  Automatic
   update is now provided, along with a tunable to control the update
   interval.
 
 - The 4 patch series "Some randome fixes and cleanups to swapfile" from
   Kemeng Shi does what is claims.
 
 - The 4 patch series "mm: introduce snapshot_page" from Luiz Capitulino
   and David Hildenbrand provides (and uses) a means by which debug-style
   functions can grab a copy of a pageframe and inspect it locklessly
   without tripping over the races inherent in operating on the live
   pageframe directly.
 
 - The 6 patch series "use per-vma locks for /proc/pid/maps reads" from
   Suren Baghdasaryan addresses the large contention issues which can be
   triggered by reads from that procfs file.  Latencies are reduced by more
   than half in some situations.  The series also introduces several new
   selftests for the /proc/pid/maps interface.
 
 - The 6 patch series "__folio_split() clean up" from Zi Yan cleans up
   __folio_split()!
 
 - The 7 patch series "Optimize mprotect() for large folios" from Dev
   Jain provides some quite large (>3x) speedups to mprotect() when dealing
   with large folios.
 
 - The 2 patch series "selftests/mm: reuse FORCE_READ to replace "asm
   volatile("" : "+r" (XXX));" and some cleanup" from wang lian does some
   cleanup work in the selftests code.
 
 - The 3 patch series "tools/testing: expand mremap testing" from Lorenzo
   Stoakes extends the mremap() selftest in several ways, including adding
   more checking of Lorenzo's recently added "permit mremap() move of
   multiple VMAs" feature.
 
 - The 22 patch series "selftests/damon/sysfs.py: test all parameters"
   from SeongJae Park extends the DAMON sysfs interface selftest so that it
   tests all possible user-requested parameters.  Rather than the present
   minimal subset.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaIqcCgAKCRDdBJ7gKXxA
 jkVBAQCCn9DR1QP0CRk961ot0cKzOgioSc0aA03DPb2KXRt2kQEAzDAz0ARurFhL
 8BzbvI0c+4tntHLXvIlrC33n9KWAOQM=
 =XsFy
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "As usual, many cleanups. The below blurbiage describes 42 patchsets.
  21 of those are partially or fully cleanup work. "cleans up",
  "cleanup", "maintainability", "rationalizes", etc.

  I never knew the MM code was so dirty.

  "mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
     addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
     mapped VMAs were not eligible for merging with existing adjacent
     VMAs.

  "mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
     adds a new kernel module which simplifies the setup and usage of
     DAMON in production environments.

  "stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
     is a cleanup to the writeback code which removes a couple of
     pointers from struct writeback_control.

  "drivers/base/node.c: optimization and cleanups" (Donet Tom)
     contains largely uncorrelated cleanups to the NUMA node setup and
     management code.

  "mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
     does some maintenance work on the userfaultfd code.

  "Readahead tweaks for larger folios" (Ryan Roberts)
     implements some tuneups for pagecache readahead when it is reading
     into order>0 folios.

  "selftests/mm: Tweaks to the cow test" (Mark Brown)
     provides some cleanups and consistency improvements to the
     selftests code.

  "Optimize mremap() for large folios" (Dev Jain)
     does that. A 37% reduction in execution time was measured in a
     memset+mremap+munmap microbenchmark.

  "Remove zero_user()" (Matthew Wilcox)
     expunges zero_user() in favor of the more modern memzero_page().

  "mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
     addresses some warts which David noticed in the huge page code.
     These were not known to be causing any issues at this time.

  "mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park)
     provides some cleanup and consolidation work in DAMON.

  "use vm_flags_t consistently" (Lorenzo Stoakes)
     uses vm_flags_t in places where we were inappropriately using other
     types.

  "mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
     increases the reliability of large page allocation in the memfd
     code.

  "mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
     removes several now-unneeded PFN_* flags.

  "mm/damon: decouple sysfs from core" (SeongJae Park)
     implememnts some cleanup and maintainability work in the DAMON
     sysfs layer.

  "madvise cleanup" (Lorenzo Stoakes)
     does quite a lot of cleanup/maintenance work in the madvise() code.

  "madvise anon_name cleanups" (Vlastimil Babka)
     provides additional cleanups on top or Lorenzo's effort.

  "Implement numa node notifier" (Oscar Salvador)
     creates a standalone notifier for NUMA node memory state changes.
     Previously these were lumped under the more general memory
     on/offline notifier.

  "Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
     cleans up the pageblock isolation code and fixes a potential issue
     which doesn't seem to cause any problems in practice.

  "selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
     adds additional drgn- and python-based DAMON selftests which are
     more comprehensive than the existing selftest suite.

  "Misc rework on hugetlb faulting path" (Oscar Salvador)
     fixes a rather obscure deadlock in the hugetlb fault code and
     follows that fix with a series of cleanups.

  "cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
     rationalizes and cleans up the highmem-specific code in the CMA
     allocator.

  "mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
     provides cleanups and future-preparedness to the migration code.

  "mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
     adds some tracepoints to some DAMON auto-tuning code.

  "mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
     does that.

  "mm/damon: misc cleanups" (SeongJae Park)
     also does what it claims.

  "mm: folio_pte_batch() improvements" (David Hildenbrand)
     cleans up the large folio PTE batching code.

  "mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
     facilitates dynamic alteration of DAMON's inter-node allocation
     policy.

  "Remove unmap_and_put_page()" (Vishal Moola)
     provides a couple of page->folio conversions.

  "mm: per-node proactive reclaim" (Davidlohr Bueso)
     implements a per-node control of proactive reclaim - beyond the
     current memcg-based implementation.

  "mm/damon: remove damon_callback" (SeongJae Park)
     replaces the damon_callback interface with a more general and
     powerful damon_call()+damos_walk() interface.

  "mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
     implements a number of mremap cleanups (of course) in preparation
     for adding new mremap() functionality: newly permit the remapping
     of multiple VMAs when the user is specifying MREMAP_FIXED. It still
     excludes some specialized situations where this cannot be performed
     reliably.

  "drop hugetlb_free_pgd_range()" (Anthony Yznaga)
     switches some sparc hugetlb code over to the generic version and
     removes the thus-unneeded hugetlb_free_pgd_range().

  "mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
     augments the present userspace-requested update of DAMON sysfs
     monitoring files. Automatic update is now provided, along with a
     tunable to control the update interval.

  "Some randome fixes and cleanups to swapfile" (Kemeng Shi)
     does what is claims.

  "mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
     provides (and uses) a means by which debug-style functions can grab
     a copy of a pageframe and inspect it locklessly without tripping
     over the races inherent in operating on the live pageframe
     directly.

  "use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
     addresses the large contention issues which can be triggered by
     reads from that procfs file. Latencies are reduced by more than
     half in some situations. The series also introduces several new
     selftests for the /proc/pid/maps interface.

  "__folio_split() clean up" (Zi Yan)
     cleans up __folio_split()!

  "Optimize mprotect() for large folios" (Dev Jain)
     provides some quite large (>3x) speedups to mprotect() when dealing
     with large folios.

  "selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
     does some cleanup work in the selftests code.

  "tools/testing: expand mremap testing" (Lorenzo Stoakes)
     extends the mremap() selftest in several ways, including adding
     more checking of Lorenzo's recently added "permit mremap() move of
     multiple VMAs" feature.

  "selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
     extends the DAMON sysfs interface selftest so that it tests all
     possible user-requested parameters. Rather than the present minimal
     subset"

* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
  MAINTAINERS: add missing headers to mempory policy & migration section
  MAINTAINERS: add missing file to cgroup section
  MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
  MAINTAINERS: add missing zsmalloc file
  MAINTAINERS: add missing files to page alloc section
  MAINTAINERS: add missing shrinker files
  MAINTAINERS: move memremap.[ch] to hotplug section
  MAINTAINERS: add missing mm_slot.h file THP section
  MAINTAINERS: add missing interval_tree.c to memory mapping section
  MAINTAINERS: add missing percpu-internal.h file to per-cpu section
  mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
  selftests/damon: introduce _common.sh to host shared function
  selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
  selftests/damon/sysfs.py: test non-default parameters runtime commit
  selftests/damon/sysfs.py: generalize DAMON context commit assertion
  selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
  selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
  selftests/damon/sysfs.py: test DAMOS filters commitment
  selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
  selftests/damon/sysfs.py: test DAMOS destinations commitment
  ...
2025-07-31 14:57:54 -07:00
Jiri Kosina ddb7a62af2 Merge branch 'for-6.17/selftests' into for-linus
- upgrade the python scripts in hid-selftests to match hid-tools
  version 0.10 (Benjamin Tissoires)
2025-07-31 22:53:29 +02:00
Jiri Kosina e9ef810dfe Merge branch 'for-6.17/amd-sfh' into for-linus
- add support for operating modes (Basavaraj Natikar)
2025-07-31 22:36:25 +02:00
Linus Torvalds c93529ad4f iommufd 6.17 merge window pull
- IOMMU HW now has features to directly assign HW command queues to a
   guest VM. In this mode the command queue operates on a limited set of
   invalidation commands that are suitable for improving guest invalidation
   performance and easy for the HW to virtualize.
 
   This PR brings the generic infrastructure to allow IOMMU drivers to
   expose such command queues through the iommufd uAPI, mmap the doorbell
   pages, and get the guest physical range for the command queue ring
   itself.
 
 - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built on
   the new iommufd command queue features. It works with the existing SMMU
   driver support for cmdqv in guest VMs.
 
 - Many precursor cleanups and improvements to support the above cleanly,
   changes to the general ioctl and object helpers, driver support for
   VDEVICE, and mmap pgoff cookie infrastructure.
 
 - Sequence VDEVICE destruction to always happen before VFIO device
   destruction. When using the above type features, and also in future
   confidential compute, the internal virtual device representation becomes
   linked to HW or CC TSM configuration and objects. If a VFIO device is
   removed from iommufd those HW objects should also be cleaned up to
   prevent a sort of UAF. This became important now that we have HW backing
   the VDEVICE.
 
 - Fix one syzkaller found error related to math overflows during iova
   allocation
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaIpl9AAKCRCFwuHvBreF
 YS5tAP9MDIRML5a/2IOhzcsc4LiDkWTMKm2m1wcRYd+iU2aFVQEAjdghINLHrUlx
 HVuIDvNvWIUED/oTAp5kCxQ7PBFN4gU=
 =NmCO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This broadly brings the assigned HW command queue support to iommufd.
  This feature is used to improve SVA performance in VMs by avoiding
  paravirtualization traps during SVA invalidations.

  Along the way I think some of the core logic is in a much better state
  to support future driver backed features.

  Summary:

   - IOMMU HW now has features to directly assign HW command queues to a
     guest VM. In this mode the command queue operates on a limited set
     of invalidation commands that are suitable for improving guest
     invalidation performance and easy for the HW to virtualize.

     This brings the generic infrastructure to allow IOMMU drivers to
     expose such command queues through the iommufd uAPI, mmap the
     doorbell pages, and get the guest physical range for the command
     queue ring itself.

   - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built
     on the new iommufd command queue features. It works with the
     existing SMMU driver support for cmdqv in guest VMs.

   - Many precursor cleanups and improvements to support the above
     cleanly, changes to the general ioctl and object helpers, driver
     support for VDEVICE, and mmap pgoff cookie infrastructure.

   - Sequence VDEVICE destruction to always happen before VFIO device
     destruction. When using the above type features, and also in future
     confidential compute, the internal virtual device representation
     becomes linked to HW or CC TSM configuration and objects. If a VFIO
     device is removed from iommufd those HW objects should also be
     cleaned up to prevent a sort of UAF. This became important now that
     we have HW backing the VDEVICE.

   - Fix one syzkaller found error related to math overflows during iova
     allocation"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits)
  iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
  iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
  iommufd: Rename some shortterm-related identifiers
  iommufd/selftest: Add coverage for vdevice tombstone
  iommufd/selftest: Explicitly skip tests for inapplicable variant
  iommufd/vdevice: Remove struct device reference from struct vdevice
  iommufd: Destroy vdevice on idevice destroy
  iommufd: Add a pre_destroy() op for objects
  iommufd: Add iommufd_object_tombstone_user() helper
  iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
  iommufd/selftest: Test reserved regions near ULONG_MAX
  iommufd: Prevent ALIGN() overflow
  iommu/tegra241-cmdqv: import IOMMUFD module namespace
  iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set
  iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support
  iommu/tegra241-cmdqv: Add user-space use support
  iommu/tegra241-cmdqv: Do not statically map LVCMDQs
  iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
  iommu/tegra241-cmdqv: Use request_threaded_irq
  iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
  ...
2025-07-31 12:43:08 -07:00
WangYuli 6260da0468 selftests: ALSA: fix memory leak in utimer test
Free the malloc'd buffer in TEST_F(timer_f, utimer) to prevent
memory leak.

Fixes: 1026392d10 ("selftests: ALSA: Cover userspace-driven timers with test")
Reported-by: Jun Zhan <zhanjun@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://patch.msgid.link/DE4D931FCF54F3DB+20250731100222.65748-1-wangyuli@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-07-31 17:01:53 +02:00
Linus Torvalds 63eb28bb14 ARM:
- Host driver for GICv5, the next generation interrupt controller for
   arm64, including support for interrupt routing, MSIs, interrupt
   translation and wired interrupts.
 
 - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
   GICv5 hardware, leveraging the legacy VGIC interface.
 
 - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
   userspace to disable support for SGIs w/o an active state on hardware
   that previously advertised it unconditionally.
 
 - Map supporting endpoints with cacheable memory attributes on systems
   with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
   maintenance on the address range.
 
 - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
   hypervisor to inject external aborts into an L2 VM and take traps of
   masked external aborts to the hypervisor.
 
 - Convert more system register sanitization to the config-driven
   implementation.
 
 - Fixes to the visibility of EL2 registers, namely making VGICv3 system
   registers accessible through the VGIC device instead of the ONE_REG
   vCPU ioctls.
 
 - Various cleanups and minor fixes.
 
 LoongArch:
 
 - Add stat information for in-kernel irqchip
 
 - Add tracepoints for CPUCFG and CSR emulation exits
 
 - Enhance in-kernel irqchip emulation
 
 - Various cleanups.
 
 RISC-V:
 
 - Enable ring-based dirty memory tracking
 
 - Improve perf kvm stat to report interrupt events
 
 - Delegate illegal instruction trap to VS-mode
 
 - MMU improvements related to upcoming nested virtualization
 
 s390x
 
 - Fixes
 
 x86:
 
 - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC,
   PIC, and PIT emulation at compile time.
 
 - Share device posted IRQ code between SVM and VMX and
   harden it against bugs and runtime errors.
 
 - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1)
   instead of O(n).
 
 - For MMIO stale data mitigation, track whether or not a vCPU has access to
   (host) MMIO based on whether the page tables have MMIO pfns mapped; using
   VFIO is prone to false negatives
 
 - Rework the MSR interception code so that the SVM and VMX APIs are more or
   less identical.
 
 - Recalculate all MSR intercepts from scratch on MSR filter changes,
   instead of maintaining shadow bitmaps.
 
 - Advertise support for LKGS (Load Kernel GS base), a new instruction
   that's loosely related to FRED, but is supported and enumerated
   independently.
 
 - Fix a user-triggerable WARN that syzkaller found by setting the vCPU
   in INIT_RECEIVED state (aka wait-for-SIPI), and then putting the vCPU
   into VMX Root Mode (post-VMXON).  Trying to detect every possible path
   leading to architecturally forbidden states is hard and even risks
   breaking userspace (if it goes from valid to valid state but passes
   through invalid states), so just wait until KVM_RUN to detect that
   the vCPU state isn't allowed.
 
 - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
   APERF/MPERF reads, so that a "properly" configured VM can access
   APERF/MPERF.  This has many caveats (APERF/MPERF cannot be zeroed
   on vCPU creation or saved/restored on suspend and resume, or preserved
   over thread migration let alone VM migration) but can be useful whenever
   you're interested in letting Linux guests see the effective physical CPU
   frequency in /proc/cpuinfo.
 
 - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been
   created, as there's no known use case for changing the default
   frequency for other VM types and it goes counter to the very reason
   why the ioctl was added to the vm file descriptor.  And also, there
   would be no way to make it work for confidential VMs with a "secure"
   TSC, so kill two birds with one stone.
 
 - Dynamically allocation the shadow MMU's hashed page list, and defer
   allocating the hashed list until it's actually needed (the TDP MMU
   doesn't use the list).
 
 - Extract many of KVM's helpers for accessing architectural local APIC
   state to common x86 so that they can be shared by guest-side code for
   Secure AVIC.
 
 - Various cleanups and fixes.
 
 x86 (Intel):
 
 - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest.
   Failure to honor FREEZE_IN_SMM can leak host state into guests.
 
 - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to prevent
   L1 from running L2 with features that KVM doesn't support, e.g. BTF.
 
 x86 (AMD):
 
 - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
   nested SVM MSRPM offsets tracker can't handle an MSR (which is pretty
   much a static condition and therefore should never happen, but still).
 
 - Fix a variety of flaws and bugs in the AVIC device posted IRQ code.
 
 - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware
   supports) instead of rejecting vCPU creation.
 
 - Extend enable_ipiv module param support to SVM, by simply leaving
   IsRunning clear in the vCPU's physical ID table entry.
 
 - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by
   erratum #1235, to allow (safely) enabling AVIC on such CPUs.
 
 - Request GA Log interrupts if and only if the target vCPU is blocking,
   i.e. only if KVM needs a notification in order to wake the vCPU.
 
 - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
   vCPU's CPUID model.
 
 - Accept any SNP policy that is accepted by the firmware with respect to
   SMT and single-socket restrictions.  An incompatible policy doesn't put
   the kernel at risk in any way, so there's no reason for KVM to care.
 
 - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and
   use WBNOINVD instead of WBINVD when possible for SEV cache maintenance.
 
 - When reclaiming memory from an SEV guest, only do cache flushes on CPUs
   that have ever run a vCPU for the guest, i.e. don't flush the caches for
   CPUs that can't possibly have cache lines with dirty, encrypted data.
 
 Generic:
 
 - Rework irqbypass to track/match producers and consumers via an xarray
   instead of a linked list.  Using a linked list leads to O(n^2) insertion
   times, which is hugely problematic for use cases that create large
   numbers of VMs.  Such use cases typically don't actually use irqbypass,
   but eliminating the pointless registration is a future problem to
   solve as it likely requires new uAPI.
 
 - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *",
   to avoid making a simple concept unnecessarily difficult to understand.
 
 - Decouple device posted IRQs from VFIO device assignment, as binding a VM
   to a VFIO group is not a requirement for enabling device posted IRQs.
 
 - Clean up and document/comment the irqfd assignment code.
 
 - Disallow binding multiple irqfds to an eventfd with a priority waiter,
   i.e.  ensure an eventfd is bound to at most one irqfd through the entire
   host, and add a selftest to verify eventfd:irqfd bindings are globally
   unique.
 
 - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues
   related to private <=> shared memory conversions.
 
 - Drop guest_memfd's .getattr() implementation as the VFS layer will call
   generic_fillattr() if inode_operations.getattr is NULL.
 
 - Fix issues with dirty ring harvesting where KVM doesn't bound the
   processing of entries in any way, which allows userspace to keep KVM
   in a tight loop indefinitely.
 
 - Kill off kvm_arch_{start,end}_assignment() and x86's associated tracking,
   now that KVM no longer uses assigned_device_count as a heuristic for
   either irqbypass usage or MDS mitigation.
 
 Selftests:
 
 - Fix a comment typo.
 
 - Verify KVM is loaded when getting any KVM module param so that attempting
   to run a selftest without kvm.ko loaded results in a SKIP message about
   KVM not being loaded/enabled (versus some random parameter not existing).
 
 - Skip tests that hit EACCES when attempting to access a file, and rpint
   a "Root required?" help message.  In most cases, the test just needs to
   be run with elevated permissions.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmiKXMgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMhMQf/QDhC/CP1aGXph2whuyeD2NMqPKiU
 9KdnDNST+ftPwjg9QxZ9mTaa8zeVz/wly6XlxD9OQHy+opM1wcys3k0GZAFFEEQm
 YrThgURdzEZ3nwJZgb+m0t4wjJQtpiFIBwAf7qq6z1VrqQBEmHXJ/8QxGuqO+BNC
 j5q/X+q6KZwehKI6lgFBrrOKWFaxqhnRAYfW6rGBxRXxzTJuna37fvDpodQnNceN
 zOiq+avfriUMArTXTqOteJNKU0229HjiPSnjILLnFQ+B3akBlwNG0jk7TMaAKR6q
 IZWG1EIS9q1BAkGXaw6DE1y6d/YwtXCR5qgAIkiGwaPt5yj9Oj6kRN2Ytw==
 =j2At
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Host driver for GICv5, the next generation interrupt controller for
     arm64, including support for interrupt routing, MSIs, interrupt
     translation and wired interrupts

   - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
     GICv5 hardware, leveraging the legacy VGIC interface

   - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
     userspace to disable support for SGIs w/o an active state on
     hardware that previously advertised it unconditionally

   - Map supporting endpoints with cacheable memory attributes on
     systems with FEAT_S2FWB and DIC where KVM no longer needs to
     perform cache maintenance on the address range

   - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the
     guest hypervisor to inject external aborts into an L2 VM and take
     traps of masked external aborts to the hypervisor

   - Convert more system register sanitization to the config-driven
     implementation

   - Fixes to the visibility of EL2 registers, namely making VGICv3
     system registers accessible through the VGIC device instead of the
     ONE_REG vCPU ioctls

   - Various cleanups and minor fixes

  LoongArch:

   - Add stat information for in-kernel irqchip

   - Add tracepoints for CPUCFG and CSR emulation exits

   - Enhance in-kernel irqchip emulation

   - Various cleanups

  RISC-V:

   - Enable ring-based dirty memory tracking

   - Improve perf kvm stat to report interrupt events

   - Delegate illegal instruction trap to VS-mode

   - MMU improvements related to upcoming nested virtualization

  s390x

   - Fixes

  x86:

   - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O
     APIC, PIC, and PIT emulation at compile time

   - Share device posted IRQ code between SVM and VMX and harden it
     against bugs and runtime errors

   - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups
     O(1) instead of O(n)

   - For MMIO stale data mitigation, track whether or not a vCPU has
     access to (host) MMIO based on whether the page tables have MMIO
     pfns mapped; using VFIO is prone to false negatives

   - Rework the MSR interception code so that the SVM and VMX APIs are
     more or less identical

   - Recalculate all MSR intercepts from scratch on MSR filter changes,
     instead of maintaining shadow bitmaps

   - Advertise support for LKGS (Load Kernel GS base), a new instruction
     that's loosely related to FRED, but is supported and enumerated
     independently

   - Fix a user-triggerable WARN that syzkaller found by setting the
     vCPU in INIT_RECEIVED state (aka wait-for-SIPI), and then putting
     the vCPU into VMX Root Mode (post-VMXON). Trying to detect every
     possible path leading to architecturally forbidden states is hard
     and even risks breaking userspace (if it goes from valid to valid
     state but passes through invalid states), so just wait until
     KVM_RUN to detect that the vCPU state isn't allowed

   - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling
     interception of APERF/MPERF reads, so that a "properly" configured
     VM can access APERF/MPERF. This has many caveats (APERF/MPERF
     cannot be zeroed on vCPU creation or saved/restored on suspend and
     resume, or preserved over thread migration let alone VM migration)
     but can be useful whenever you're interested in letting Linux
     guests see the effective physical CPU frequency in /proc/cpuinfo

   - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been
     created, as there's no known use case for changing the default
     frequency for other VM types and it goes counter to the very reason
     why the ioctl was added to the vm file descriptor. And also, there
     would be no way to make it work for confidential VMs with a
     "secure" TSC, so kill two birds with one stone

   - Dynamically allocation the shadow MMU's hashed page list, and defer
     allocating the hashed list until it's actually needed (the TDP MMU
     doesn't use the list)

   - Extract many of KVM's helpers for accessing architectural local
     APIC state to common x86 so that they can be shared by guest-side
     code for Secure AVIC

   - Various cleanups and fixes

  x86 (Intel):

   - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest.
     Failure to honor FREEZE_IN_SMM can leak host state into guests

   - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to
     prevent L1 from running L2 with features that KVM doesn't support,
     e.g. BTF

  x86 (AMD):

   - WARN and reject loading kvm-amd.ko instead of panicking the kernel
     if the nested SVM MSRPM offsets tracker can't handle an MSR (which
     is pretty much a static condition and therefore should never
     happen, but still)

   - Fix a variety of flaws and bugs in the AVIC device posted IRQ code

   - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware
     supports) instead of rejecting vCPU creation

   - Extend enable_ipiv module param support to SVM, by simply leaving
     IsRunning clear in the vCPU's physical ID table entry

   - Disable IPI virtualization, via enable_ipiv, if the CPU is affected
     by erratum #1235, to allow (safely) enabling AVIC on such CPUs

   - Request GA Log interrupts if and only if the target vCPU is
     blocking, i.e. only if KVM needs a notification in order to wake
     the vCPU

   - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to
     the vCPU's CPUID model

   - Accept any SNP policy that is accepted by the firmware with respect
     to SMT and single-socket restrictions. An incompatible policy
     doesn't put the kernel at risk in any way, so there's no reason for
     KVM to care

   - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and
     use WBNOINVD instead of WBINVD when possible for SEV cache
     maintenance

   - When reclaiming memory from an SEV guest, only do cache flushes on
     CPUs that have ever run a vCPU for the guest, i.e. don't flush the
     caches for CPUs that can't possibly have cache lines with dirty,
     encrypted data

  Generic:

   - Rework irqbypass to track/match producers and consumers via an
     xarray instead of a linked list. Using a linked list leads to
     O(n^2) insertion times, which is hugely problematic for use cases
     that create large numbers of VMs. Such use cases typically don't
     actually use irqbypass, but eliminating the pointless registration
     is a future problem to solve as it likely requires new uAPI

   - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a
     "void *", to avoid making a simple concept unnecessarily difficult
     to understand

   - Decouple device posted IRQs from VFIO device assignment, as binding
     a VM to a VFIO group is not a requirement for enabling device
     posted IRQs

   - Clean up and document/comment the irqfd assignment code

   - Disallow binding multiple irqfds to an eventfd with a priority
     waiter, i.e. ensure an eventfd is bound to at most one irqfd
     through the entire host, and add a selftest to verify eventfd:irqfd
     bindings are globally unique

   - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues
     related to private <=> shared memory conversions

   - Drop guest_memfd's .getattr() implementation as the VFS layer will
     call generic_fillattr() if inode_operations.getattr is NULL

   - Fix issues with dirty ring harvesting where KVM doesn't bound the
     processing of entries in any way, which allows userspace to keep
     KVM in a tight loop indefinitely

   - Kill off kvm_arch_{start,end}_assignment() and x86's associated
     tracking, now that KVM no longer uses assigned_device_count as a
     heuristic for either irqbypass usage or MDS mitigation

  Selftests:

   - Fix a comment typo

   - Verify KVM is loaded when getting any KVM module param so that
     attempting to run a selftest without kvm.ko loaded results in a
     SKIP message about KVM not being loaded/enabled (versus some random
     parameter not existing)

   - Skip tests that hit EACCES when attempting to access a file, and
     print a "Root required?" help message. In most cases, the test just
     needs to be run with elevated permissions"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (340 commits)
  Documentation: KVM: Use unordered list for pre-init VGIC registers
  RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()
  RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs
  RISC-V: perf/kvm: Add reporting of interrupt events
  RISC-V: KVM: Enable ring-based dirty memory tracking
  RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap
  RISC-V: KVM: Delegate illegal instruction fault to VS mode
  RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs
  RISC-V: KVM: Factor-out g-stage page table management
  RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence
  RISC-V: KVM: Introduce struct kvm_gstage_mapping
  RISC-V: KVM: Factor-out MMU related declarations into separate headers
  RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect()
  RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()
  RISC-V: KVM: Don't flush TLB when PTE is unchanged
  RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH
  RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()
  RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init()
  RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value
  KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
  ...
2025-07-30 17:14:01 -07:00
Linus Torvalds 2223228bb1 ktest updates for v6.17
- Add new -D option that allows to override variables and options
 
   For example:
 
     ./ktest.pl -DPATCH_START:=HEAD~1 -DOUTPUT_DIR=/work/build/urgent config
 
   The above sets the variable "PATCH_START" to HEAD~1 and the OUTPUT_DIR
   option to "/work/build/urgent".
 
   This is useful because currently the only way to make a slight change to a
   config file is by modifying that config file. For one time changes, this
   can be annoying. Having a way to do a one time override from the command
   line simplifies the workflow.
 
   Temp variables (PATCH_START) will override every temp variable in the
   config file,  whereas options will act like a normal OVERRIDE option and
   will only affect the session they define.
 
      -DBUILD_OUTPUT=/work/git/linux.git
 
   Replaces the default BUILD_OUTPUT option.
 
      '-DBUILD_OUTPUT[2]=/work/git/linux.git'
 
   Only replaces the BUILD_OUTPUT variable for test #2.
 
 - If an option contains itself, just drop it instead of going into an
   infinite loop and failing to parse (it doesn't crash, it detects the
   recursion after 100 iterations anyway).
 
   Some configs may define a variable with the same name as the option:
 
      ADD_CONFIG := $(ADD_CONFIG)
 
   But if the option doesn't exist, it the above will fail to parse. In these
   cases, just ignore evaluating the option inside the definition of another
   option if it has the same name.
 
 - Display the BUILD_DIR and OUTPUT_DIR options at the start of every test
 
   It is useful to know which kernel source and what destination a test is
   using when it starts, in case a mistake is made. This makes it easier to
   abort the test if the wrong source or destination is being used instead of
   waiting until the test completes.
 
 - Add new PATCHCHECK_SKIP option
 
   When testing a series of commits that also includes changes to the Linux
   tools directory, it is useless to test the changes in tools as they may
   not affect the kernel itself. Doing tests on the kernel for changes that
   do not affect the kernel is a waste of time.
 
   Add a PATCHCHECK_SKIP that takes a series of shas that will be skipped
   while doing the individual commit tests.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIjO+hQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qmJiAP93udmfI+j4+IcyrYg6hkm2F+stakot
 7TgM/YCgB8tTSAEA4iMGkAGxEUlHVv5Has7JFW1ajsqk2ZHlXlu9lSxvVwg=
 =JDuF
 -----END PGP SIGNATURE-----

Merge tag 'ktest-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest

Pull ktest updates from Steven Rostedt:

 - Add new -D option that allows to override variables and options

   For example:

     ./ktest.pl -DPATCH_START:=HEAD~1 -DOUTPUT_DIR=/work/build/urgent config

   The above sets the variable "PATCH_START" to HEAD~1 and the
   OUTPUT_DIR option to "/work/build/urgent".

   This is useful because currently the only way to make a slight change
   to a config file is by modifying that config file. For one time
   changes, this can be annoying. Having a way to do a one time override
   from the command line simplifies the workflow.

   Temp variables (PATCH_START) will override every temp variable in the
   config file, whereas options will act like a normal OVERRIDE option
   and will only affect the session they define.

      -DBUILD_OUTPUT=/work/git/linux.git

   Replaces the default BUILD_OUTPUT option.

      '-DBUILD_OUTPUT[2]=/work/git/linux.git'

   Only replaces the BUILD_OUTPUT variable for test #2.

 - If an option contains itself, just drop it instead of going into an
   infinite loop and failing to parse (it doesn't crash, it detects the
   recursion after 100 iterations anyway).

   Some configs may define a variable with the same name as the option:

      ADD_CONFIG := $(ADD_CONFIG)

   But if the option doesn't exist, it the above will fail to parse. In
   these cases, just ignore evaluating the option inside the definition
   of another option if it has the same name.

 - Display the BUILD_DIR and OUTPUT_DIR options at the start of every
   test

   It is useful to know which kernel source and what destination a test
   is using when it starts, in case a mistake is made. This makes it
   easier to abort the test if the wrong source or destination is being
   used instead of waiting until the test completes.

 - Add new PATCHCHECK_SKIP option

   When testing a series of commits that also includes changes to the
   Linux tools directory, it is useless to test the changes in tools as
   they may not affect the kernel itself. Doing tests on the kernel for
   changes that do not affect the kernel is a waste of time.

   Add a PATCHCHECK_SKIP that takes a series of shas that will be
   skipped while doing the individual commit tests.

* tag 'ktest-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest.pl: Add new PATCHCHECK_SKIP option to skip testing individual commits
  ktest.pl: Always display BUILD_DIR and OUTPUT_DIR at the start of tests
  ktest.pl: Prevent recursion of default variable options
  ktest.pl: Have -D option work without a space
  ktest.pl: Allow command option -D to override temp variables
  ktest.pl: Add -D option to override options
2025-07-30 15:59:36 -07:00
Linus Torvalds b7dbc2e813 Probes updates for v6.17:
- Stack usage reduction for probe events:
    - Allocate string buffers from the heap for uprobe, eprobe, kprobe,
      and fprobe events to avoid stack overflow.
    - Allocate traceprobe_parse_context from the heap to prevent
      potential stack overflow.
    - Fix a typo in the above commit.
 
  - New features for eprobe and tprobe events:
    - Add support for arrays in eprobes.
    - Support multiple tprobes on the same tracepoint.
 
  - Improve efficiency:
    - Register fprobe-events only when it is enabled to reduce overhead.
    - Register tracepoints for tprobe events only when enabled to
      resolve a lock dependency.
 
  - Code Cleanup:
    - Add kerneldoc for traceprobe_parse_event_name() and __get_insn_slot().
    - Sort #include alphabetically in the probes code.
    - Remove the unused 'mod' field from the tprobe-event.
    - Clean up the entry-arg storing code in probe-events.
 
  - Selftest update
    - Enable fprobe events before checking enable_functions in selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmiJ2DQbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bSfkH/06Zn5I55rU85FKSBQll
 FN4hipmef/9Nd13skDwpEuFyzLPNS4P1up/UBUuyDQUTlO74+t2zSFO2dpcNrWmu
 sPTenQ+6h82H3K591WTIC23VzF54syIbFLXEj8iMBALT3wyU4Nn0bs4DCbnTo5HX
 R3NVo77rk6wxNJoKYOtT6ALf/lHonuNlGF+KTUGWP8UbWsIY3fIp0RWWy572M0bt
 +YBE8D8RIVrw+ZY+vNKn1LdZdWlR1ton518XDf1gV9isTCfKErcd/6HJKwuj5q2v
 qMgwiaKK+Gne/ylAKmWLEg2oNDo7kpyfW+612oiECitgZkqxOXhyYYfWgRt1lFNp
 Wb8=
 =E+Z6
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
 "Stack usage reduction for probe events:
   - Allocate string buffers from the heap for uprobe, eprobe, kprobe,
     and fprobe events to avoid stack overflow
   - Allocate traceprobe_parse_context from the heap to prevent
     potential stack overflow
   - Fix a typo in the above commit

  New features for eprobe and tprobe events:
   - Add support for arrays in eprobes
   - Support multiple tprobes on the same tracepoint

  Improve efficiency:
   - Register fprobe-events only when it is enabled to reduce overhead
   - Register tracepoints for tprobe events only when enabled to resolve
     a lock dependency

  Code Cleanup:
   - Add kerneldoc for traceprobe_parse_event_name() and
     __get_insn_slot()
   - Sort #include alphabetically in the probes code
   - Remove the unused 'mod' field from the tprobe-event
   - Clean up the entry-arg storing code in probe-events

  Selftest update
   - Enable fprobe events before checking enable_functions in selftests"

* tag 'probes-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: trace_fprobe: Fix typo of the semicolon
  tracing: Have eprobes handle arrays
  tracing: probes: Add a kerneldoc for traceprobe_parse_event_name()
  tracing: uprobe-event: Allocate string buffers from heap
  tracing: eprobe-event: Allocate string buffers from heap
  tracing: kprobe-event: Allocate string buffers from heap
  tracing: fprobe-event: Allocate string buffers from heap
  tracing: probe: Allocate traceprobe_parse_context from heap
  tracing: probes: Sort #include alphabetically
  kprobes: Add missing kerneldoc for __get_insn_slot
  tracing: tprobe-events: Register tracepoint when enable tprobe event
  selftests: tracing: Enable fprobe events before checking enable_functions
  tracing: fprobe-events: Register fprobe-events only when it is enabled
  tracing: tprobe-events: Support multiple tprobes on the same tracepoint
  tracing: tprobe-events: Remove mod field from tprobe-event
  tracing: probe-events: Cleanup entry-arg storing code
2025-07-30 15:38:01 -07:00
Linus Torvalds 2db4df0c09 RCU pull request for v6.17
This pull request contains the following branches:
 
 rcu-exp.23.07.2025
 
   - Protect against early RCU exp quiescent state reporting during exp
     grace period initialization.
 
   - Remove superfluous barrier in task unblock path.
 
   - Remove the CPU online quiescent state report optimization, which is
     error prone for certain scenarios.
 
   - Add warning for unexpected pending requested expedited quiescent
     state on dying CPU.
 
 rcu.22.07.2025
 
   - Robustify rcu_is_cpu_rrupt_from_idle() by using more accurate
     indicators of the actual context tracking state of a CPU.
 
   - Handle ->defer_qs_iw_pending field data race.
 
   - Enable rcu_normal_wake_from_gp by default on systems with <= 16 CPUs.
 
   - Fix lockup in rcu_read_unlock() due to recursive irq_exit() calls.
 
   - Refactor expedited handling condition in rcu_read_unlock_special().
 
   - Documentation updates for hotplug and GP init scan ordering,
     separation of rcu_state and rnp's gp_seq states, quiescent state
     reporting for offline CPUs.
 
 torture-scripts.16.07.2025
 
   - Cleanup and improve scripts : remove superfluous warnings for disabled
     tests; better handling of kvm.sh --kconfig arg; suppress some confusing
     diagnostics; tolerate bad kvm.sh args; add new diagnostic for build
     output; fail allmodconfig testing on warnings.
 
   - Include RCU_TORTURE_TEST_CHK_RDR_STATE config for KCSAN kernels.
 
   - Disable default RCU-tasks and clocksource-wdog testing on arm64.
 
   - Add EXPERT Kconfig option for arm64 KCSAN runs.
 
   - Remove SRCU-lite testing.
 
 rcutorture.16.07.2025
 
   - Start torture writer threads creation after reader threads to handle
     race in SRCU-P scenario.
 
   - Add SRCU down_read()/up_read() test.
 
   - Add diagnostics for delayed SRCU up_read(), unmatched up_read(), print
     number of up/down readers and the number of such readers which
     migrated to other CPU.
 
   - Ignore certain unsupported configurations for trivial RCU test.
 
   - Fix splats in RT kernels due to inaccurate checks for BH-disabled
     context.
 
   - Enable checks and logs to capture intentionally exercised unexpected
     scenarios (too short readers) for BUSTED test.
 
   - Remove SRCU-lite testing.
 
 srcu.19.07.2025
 
   - Expedite SRCU-fast grace periods.
 
   - Remove SRCU-lite implementation.
 
   - Add guards for SRCU-fast readers.
 
 rcu.nocb.18.07.2025
 
   - Dump NOCB group leader state on stall detection.
 
   - Robustify nocb_cb_kthread pointer accesses.
 
   - Fix delayed execution of hurry callbacks when LAZY_RCU is enabled.
 
 refscale.07.07.2025
 
   - Fix multiplication overflow in "loops" and "nreaders" calculations.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCaINnRwAKCRAAHS7/6Z0w
 pRYJAQC97ZDW2wBegDbQPsg5ECLX9Lyd6+IC65sdi38IENl+TQEA4/oMzUUceIH+
 CDCnxv3fAMhPncJfvIukOLzMJpKw0go=
 =8t4O
 -----END PGP SIGNATURE-----

Merge tag 'rcu.release.v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Neeraj Upadhyay:
 "Expedited grace period updates:

   - Protect against early RCU exp quiescent state reporting during exp
     grace period initialization

   - Remove superfluous barrier in task unblock path

   - Remove the CPU online quiescent state report optimization, which is
     error prone for certain scenarios

   - Add warning for unexpected pending requested expedited quiescent
     state on dying CPU

  Core:

   - Robustify rcu_is_cpu_rrupt_from_idle() by using more accurate
     indicators of the actual context tracking state of a CPU

   - Handle ->defer_qs_iw_pending field data race

   - Enable rcu_normal_wake_from_gp by default on systems with <= 16
     CPUs

   - Fix lockup in rcu_read_unlock() due to recursive irq_exit() calls

   - Refactor expedited handling condition in rcu_read_unlock_special()

   - Documentation updates for hotplug and GP init scan ordering,
     separation of rcu_state and rnp's gp_seq states, quiescent state
     reporting for offline CPUs

  torture-scripts:

   - Cleanup and improve scripts : remove superfluous warnings for
     disabled tests; better handling of kvm.sh --kconfig arg; suppress
     some confusing diagnostics; tolerate bad kvm.sh args; add new
     diagnostic for build output; fail allmodconfig testing on warnings

   - Include RCU_TORTURE_TEST_CHK_RDR_STATE config for KCSAN kernels

   - Disable default RCU-tasks and clocksource-wdog testing on arm64

   - Add EXPERT Kconfig option for arm64 KCSAN runs

   - Remove SRCU-lite testing

  rcutorture:

   - Start torture writer threads creation after reader threads to
     handle race in SRCU-P scenario

   - Add SRCU down_read()/up_read() test

   - Add diagnostics for delayed SRCU up_read(), unmatched up_read(),
     print number of up/down readers and the number of such readers
     which migrated to other CPU

   - Ignore certain unsupported configurations for trivial RCU test

   - Fix splats in RT kernels due to inaccurate checks for BH-disabled
     context

   - Enable checks and logs to capture intentionally exercised
     unexpected scenarios (too short readers) for BUSTED test

   - Remove SRCU-lite testing

  srcu:

   - Expedite SRCU-fast grace periods

   - Remove SRCU-lite implementation

   - Add guards for SRCU-fast readers

  rcu nocb:

   - Dump NOCB group leader state on stall detection

   - Robustify nocb_cb_kthread pointer accesses

   - Fix delayed execution of hurry callbacks when LAZY_RCU is enabled

  refscale:

   - Fix multiplication overflow in "loops" and "nreaders" calculations"

* tag 'rcu.release.v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (49 commits)
  rcu: Document concurrent quiescent state reporting for offline CPUs
  rcu: Document separation of rcu_state and rnp's gp_seq
  rcu: Document GP init vs hotplug-scan ordering requirements
  srcu: Add guards for SRCU-fast readers
  rcu: Fix delayed execution of hurry callbacks
  rcu: Refactor expedited handling check in rcu_read_unlock_special()
  checkpatch: Remove SRCU-lite deprecation
  srcu: Remove SRCU-lite implementation
  srcu: Expedite SRCU-fast grace periods
  rcutorture: Remove support for SRCU-lite
  rcutorture: Remove SRCU-lite scenarios
  torture: Remove support for SRCU-lite
  torture: Make torture.sh --allmodconfig testing fail on warnings
  torture: Add "ERROR" diagnostic for testing kernel-build output
  torture: Make torture.sh tolerate runs having bad kvm.sh arguments
  torture: Add textid.txt file to --do-allmodconfig and --do-rcu-rust runs
  torture: Extract testid.txt generation to separate script
  torture: Suppress "find" diagnostics from torture.sh --do-none run
  torture: Provide EXPERT Kconfig option for arm64 KCSAN torture.sh runs
  rcu: Fix rcu_read_unlock() deadloop due to IRQ work
  ...
2025-07-30 11:01:41 -07:00
Linus Torvalds d9104cec3e bpf-next-6.17
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiINnEACgkQ6rmadz2v
 bToBnA/9F+A3R6rTwGk4HK3xpfc/nm2Tanl3oRN7S2ub/mskDOtWSIyG6cVFZ0UG
 1fK6IkByyRIpAF/5qhdlw8drRXHkQtGLA0lP2L9llm4X1mHLofB18y9OeLrDE1WN
 KwNP06+IGX9W802lCGSIXOY+VmRscVfXSMokyQt2ilHplKjOnDqJcYkWupi3T2rC
 mz79FY9aEl2YrIcpj9RXz+8nwP49pZBuW2P0IM5PAIj4BJBXShrUp8T1nz94okNe
 NFsnAyRxjWpUT0McEgtA9WvpD9lZqujYD8Qp0KlGZWmI3vNpV5d9S1+dBcEb1n7q
 dyNMkTF3oRrJhhg4VqoHc6fVpzSEoZ9ZxV5Hx4cs+ganH75D4YbdGqx/7mR3DUgH
 MZh6rHF1pGnK7TAm7h5gl3ZRAOkZOaahbe1i01NKo9CEe5fSh3AqMyzJYoyGHRKi
 xDN39eQdWBNA+hm1VkbK2Bv93Rbjrka2Kj+D3sSSO9Bo/u3ntcknr7LW39idKz62
 Q8dkKHcCEtun7gjk0YXPF013y81nEohj1C+52BmJ2l5JitM57xfr6YOaQpu7DPDE
 AJbHx6ASxKdyEETecd0b+cXUPQ349zmRXy0+CDMAGKpBicC0H0mHhL14cwOY1Hfu
 EIpIjmIJGI3JNF6T5kybcQGSBOYebdV0FFgwSllzPvuYt7YsHCs=
 =/O3j
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Remove usermode driver (UMD) framework (Thomas Weißschuh)

 - Introduce Strongly Connected Component (SCC) in the verifier to
   detect loops and refine register liveness (Eduard Zingerman)

 - Allow 'void *' cast using bpf_rdonly_cast() and corresponding
   '__arg_untrusted' for global function parameters (Eduard Zingerman)

 - Improve precision for BPF_ADD and BPF_SUB operations in the verifier
   (Harishankar Vishwanathan)

 - Teach the verifier that constant pointer to a map cannot be NULL
   (Ihor Solodrai)

 - Introduce BPF streams for error reporting of various conditions
   detected by BPF runtime (Kumar Kartikeya Dwivedi)

 - Teach the verifier to insert runtime speculation barrier (lfence on
   x86) to mitigate speculative execution instead of rejecting the
   programs (Luis Gerhorst)

 - Various improvements for 'veristat' (Mykyta Yatsenko)

 - For CONFIG_DEBUG_KERNEL config warn on internal verifier errors to
   improve bug detection by syzbot (Paul Chaignon)

 - Support BPF private stack on arm64 (Puranjay Mohan)

 - Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's
   node (Song Liu)

 - Introduce kfuncs for read-only string opreations (Viktor Malik)

 - Implement show_fdinfo() for bpf_links (Tao Chen)

 - Reduce verifier's stack consumption (Yonghong Song)

 - Implement mprog API for cgroup-bpf programs (Yonghong Song)

* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
  selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
  selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
  bpf: Add log for attaching tracing programs to functions in deny list
  bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
  bpf: Fix various typos in verifier.c comments
  bpf: Add third round of bounds deduction
  selftests/bpf: Test invariants on JSLT crossing sign
  selftests/bpf: Test cross-sign 64bits range refinement
  selftests/bpf: Update reg_bound range refinement logic
  bpf: Improve bounds when s64 crosses sign boundary
  bpf: Simplify bounds refinement from s32
  selftests/bpf: Enable private stack tests for arm64
  bpf, arm64: JIT support for private stack
  bpf: Move bpf_jit_get_prog_name() to core.c
  bpf, arm64: Fix fp initialization for exception boundary
  umd: Remove usermode driver framework
  bpf/preload: Don't select USERMODE_DRIVER
  selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
  selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
  selftests/bpf: Increase xdp data size for arm64 64K page size
  ...
2025-07-30 09:58:50 -07:00
Linus Torvalds 8be4d31cb8 Networking changes for 6.17.
Core & protocols
 ----------------
 
  - Wrap datapath globals into net_aligned_data, to avoid false sharing.
 
  - Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container).
 
  - Add SO_INQ and SCM_INQ support to AF_UNIX.
 
  - Add SIOCINQ support to AF_VSOCK.
 
  - Add TCP_MAXSEG sockopt to MPTCP.
 
  - Add IPv6 force_forwarding sysctl to enable forwarding per interface.
 
  - Make TCP validation of whether packet fully fits in the receive
    window and the rcv_buf more strict. With increased use of HW
    aggregation a single "packet" can be multiple 100s of kB.
 
  - Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
    improves latency up to 33% for sockmap users.
 
  - Convert TCP send queue handling from tasklet to BH workque.
 
  - Improve BPF iteration over TCP sockets to see each socket exactly once.
 
  - Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code.
 
  - Support enabling kernel threads for NAPI processing on per-NAPI
    instance basis rather than a whole device. Fully stop the kernel NAPI
    thread when threaded NAPI gets disabled. Previously thread would stick
    around until ifdown due to tricky synchronization.
 
  - Allow multicast routing to take effect on locally-generated packets.
 
  - Add output interface argument for End.X in segment routing.
 
  - MCTP: add support for gateway routing, improve bind() handling.
 
  - Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink.
 
  - Add a new neighbor flag ("extern_valid"), which cedes refresh
    responsibilities to userspace. This is needed for EVPN multi-homing
    where a neighbor entry for a multi-homed host needs to be synced
    across all the VTEPs among which the host is multi-homed.
 
  - Support NUD_PERMANENT for proxy neighbor entries.
 
  - Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM.
 
  - Add sequence numbers to netconsole messages. Unregister netconsole's
    console when all net targets are removed. Code refactoring.
    Add a number of selftests.
 
  - Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
    should be used for an inbound SA lookup.
 
  - Support inspecting ref_tracker state via DebugFS.
 
  - Don't force bonding advertisement frames tx to ~333 ms boundaries.
    Add broadcast_neighbor option to send ARP/ND on all bonded links.
 
  - Allow providing upcall pid for the 'execute' command in openvswitch.
 
  - Remove DCCP support from Netfilter's conntrack.
 
  - Disallow multiple packet duplications in the queuing layer.
 
  - Prevent use of deprecated iptables code on PREEMPT_RT.
 
 Driver API
 ----------
 
  - Support RSS and hashing configuration over ethtool Netlink.
 
  - Add dedicated ethtool callbacks for getting and setting hashing fields.
 
  - Add support for power budget evaluation strategy in PSE /
    Power-over-Ethernet. Generate Netlink events for overcurrent etc.
 
  - Support DPLL phase offset monitoring across all device inputs.
    Support providing clock reference and SYNC over separate DPLL
    inputs.
 
  - Support traffic classes in devlink rate API for bandwidth management.
 
  - Remove rtnl_lock dependency from UDP tunnel port configuration.
 
 Device drivers
 --------------
 
  - Add a new Broadcom driver for 800G Ethernet (bnge).
 
  - Add a standalone driver for Microchip ZL3073x DPLL.
 
  - Remove IBM's NETIUCV device driver.
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
     - support zero-copy Tx of DMABUF memory
     - take page size into account for page pool recycling rings
    - Intel (100G, ice, idpf):
      - idpf: XDP and AF_XDP support preparations
      - idpf: add flow steering
      - add link_down_events statistic
      - clean up the TSPLL code
      - preparations for live VM migration
    - nVidia/Mellanox:
     - support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
     - optimize context memory usage for matchers
     - expose serial numbers in devlink info
     - support PCIe congestion metrics
    - Meta (fbnic):
      - add 25G, 50G, and 100G link modes to phylink
      - support dumping FW logs
    - Marvell/Cavium:
      - support for CN20K generation of the Octeon chips
    - Amazon:
      - add HW clock (without timestamping, just hypervisor time access)
 
  - Ethernet virtual:
    - VirtIO net:
      - support segmentation of UDP-tunnel-encapsulated packets
    - Google (gve):
      - support packet timestamping and clock synchronization
    - Microsoft vNIC:
      - add handler for device-originated servicing events
      - allow dynamic MSI-X vector allocation
      - support Tx bandwidth clamping
 
  - Ethernet NICs consumer, and embedded:
    - AMD:
      - amd-xgbe: hardware timestamping and PTP clock support
    - Broadcom integrated MACs (bcmgenet, bcmasp):
      - use napi_complete_done() return value to support NAPI polling
      - add support for re-starting auto-negotiation
    - Broadcom switches (b53):
      - support BCM5325 switches
      - add bcm63xx EPHY power control
    - Synopsys (stmmac):
      - lots of code refactoring and cleanups
    - TI:
      - icssg-prueth: read firmware-names from device tree
      - icssg: PRP offload support
    - Microchip:
      - lan78xx: convert to PHYLINK for improved PHY and MAC management
      - ksz: add KSZ8463 switch support
    - Intel:
      - support similar queue priority scheme in multi-queue and
        time-sensitive networking (taprio)
      - support packet pre-emption in both
    - RealTek (r8169):
      - enable EEE at 5Gbps on RTL8126
    - Airoha:
      - add PPPoE offload support
      - MDIO bus controller for Airoha AN7583
 
  - Ethernet PHYs:
    - support for the IPQ5018 internal GE PHY
    - micrel KSZ9477 switch-integrated PHYs:
      - add MDI/MDI-X control support
      - add RX error counters
      - add cable test support
      - add Signal Quality Indicator (SQI) reporting
    - dp83tg720: improve reset handling and reduce link recovery time
    - support bcm54811 (and its MII-Lite interface type)
    - air_en8811h: support resume/suspend
    - support PHY counters for QCA807x and QCA808x
    - support WoL for QCA807x
 
  - CAN drivers:
    - rcar_canfd: support for Transceiver Delay Compensation
    - kvaser: report FW versions via devlink dev info
 
  - WiFi:
    - extended regulatory info support (6 GHz)
    - add statistics and beacon monitor for Multi-Link Operation (MLO)
    - support S1G aggregation, improve S1G support
    - add Radio Measurement action fields
    - support per-radio RTS threshold
    - some work around how FIPS affects wifi, which was wrong (RC4 is used
      by TKIP, not only WEP)
    - improvements for unsolicited probe response handling
 
  - WiFi drivers:
    - RealTek (rtw88):
      - IBSS mode for SDIO devices
    - RealTek (rtw89):
      - BT coexistence for MLO/WiFi7
      - concurrent station + P2P support
      - support for USB devices RTL8851BU/RTL8852BU
    - Intel (iwlwifi):
      - use embedded PNVM in (to be released) FW images to fix
        compatibility issues
      - many cleanups (unused FW APIs, PCIe code, WoWLAN)
      - some FIPS interoperability
    - MediaTek (mt76):
      - firmware recovery improvements
      - more MLO work
    - Qualcomm/Atheros (ath12k):
      - fix scan on multi-radio devices
      - more EHT/Wi-Fi 7 features
      - encapsulation/decapsulation offload
    - Broadcom (brcm80211):
      - support SDIO 43751 device
 
  - Bluetooth:
    - hci_event: add support for handling LE BIG Sync Lost event
    - ISO: add socket option to report packet seqnum via CMSG
    - ISO: support SCM_TIMESTAMPING for ISO TS
 
  - Bluetooth drivers:
    - intel_pcie: support Function Level Reset
    - nxpuart: add support for 4M baudrate
    - nxpuart: implement powerup sequence, reset, FW dump, and FW loading
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiFgLgACgkQMUZtbf5S
 IrvafxAAnQRwYBoIG+piCILx6z5pRvBGHkmEQ4AQgSCFuq2eO3ubwMFIqEybfma1
 5+QFjUZAV3OgGgKRBS2KGWxtSzdiF+/JGV1VOIN67sX3Mm0a2QgjA4n5CgKL0FPr
 o6BEzjX5XwG1zvGcBNQ5BZ19xUUKjoZQgTtnea8sZ57Fsp5RtRgmYRqoewNvNk/n
 uImh0NFsDVb0UeOpSzC34VD9l1dJvLGdui4zJAjno/vpvmT1DkXjoK419J/r52SS
 X+5WgsfJ6DkjHqVN1tIhhK34yWqBOcwGFZJgEnWHMkFIl2FqRfFKMHyqtfLlVnLA
 mnIpSyz8Sq2AHtx0TlgZ3At/Ri8p5+yYJgHOXcDKyABa8y8Zf4wrycmr6cV9JLuL
 z54nLEVnJuvfDVDVJjsLYdJXyhMpZFq6+uAItdxKaw8Ugp/QqG4QtoRj+XIHz4ZW
 z6OohkCiCzTwEISFK+pSTxPS30eOxq43kCspcvuLiwCCStJBRkRb5GdZA4dm7LA+
 1Od4ADAkHjyrFtBqTyyC2scX8UJ33DlAIpAYyIeS6w9Cj9EXxtp1z33IAAAZ03MW
 jJwIaJuc8bK2fWKMmiG7ucIXjPo4t//KiWlpkwwqLhPbjZgfDAcxq1AC2TLoqHBL
 y4EOgKpHDCMAghSyiFIAn2JprGcEt8dp+11B0JRXIn4Pm/eYDH8=
 =lqbe
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Wrap datapath globals into net_aligned_data, to avoid false sharing

   - Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)

   - Add SO_INQ and SCM_INQ support to AF_UNIX

   - Add SIOCINQ support to AF_VSOCK

   - Add TCP_MAXSEG sockopt to MPTCP

   - Add IPv6 force_forwarding sysctl to enable forwarding per interface

   - Make TCP validation of whether packet fully fits in the receive
     window and the rcv_buf more strict. With increased use of HW
     aggregation a single "packet" can be multiple 100s of kB

   - Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
     improves latency up to 33% for sockmap users

   - Convert TCP send queue handling from tasklet to BH workque

   - Improve BPF iteration over TCP sockets to see each socket exactly
     once

   - Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code

   - Support enabling kernel threads for NAPI processing on per-NAPI
     instance basis rather than a whole device. Fully stop the kernel
     NAPI thread when threaded NAPI gets disabled. Previously thread
     would stick around until ifdown due to tricky synchronization

   - Allow multicast routing to take effect on locally-generated packets

   - Add output interface argument for End.X in segment routing

   - MCTP: add support for gateway routing, improve bind() handling

   - Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink

   - Add a new neighbor flag ("extern_valid"), which cedes refresh
     responsibilities to userspace. This is needed for EVPN multi-homing
     where a neighbor entry for a multi-homed host needs to be synced
     across all the VTEPs among which the host is multi-homed

   - Support NUD_PERMANENT for proxy neighbor entries

   - Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM

   - Add sequence numbers to netconsole messages. Unregister
     netconsole's console when all net targets are removed. Code
     refactoring. Add a number of selftests

   - Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
     should be used for an inbound SA lookup

   - Support inspecting ref_tracker state via DebugFS

   - Don't force bonding advertisement frames tx to ~333 ms boundaries.
     Add broadcast_neighbor option to send ARP/ND on all bonded links

   - Allow providing upcall pid for the 'execute' command in openvswitch

   - Remove DCCP support from Netfilter's conntrack

   - Disallow multiple packet duplications in the queuing layer

   - Prevent use of deprecated iptables code on PREEMPT_RT

  Driver API:

   - Support RSS and hashing configuration over ethtool Netlink

   - Add dedicated ethtool callbacks for getting and setting hashing
     fields

   - Add support for power budget evaluation strategy in PSE /
     Power-over-Ethernet. Generate Netlink events for overcurrent etc

   - Support DPLL phase offset monitoring across all device inputs.
     Support providing clock reference and SYNC over separate DPLL
     inputs

   - Support traffic classes in devlink rate API for bandwidth
     management

   - Remove rtnl_lock dependency from UDP tunnel port configuration

  Device drivers:

   - Add a new Broadcom driver for 800G Ethernet (bnge)

   - Add a standalone driver for Microchip ZL3073x DPLL

   - Remove IBM's NETIUCV device driver

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support zero-copy Tx of DMABUF memory
         - take page size into account for page pool recycling rings
      - Intel (100G, ice, idpf):
         - idpf: XDP and AF_XDP support preparations
         - idpf: add flow steering
         - add link_down_events statistic
         - clean up the TSPLL code
         - preparations for live VM migration
      - nVidia/Mellanox:
         - support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
         - optimize context memory usage for matchers
         - expose serial numbers in devlink info
         - support PCIe congestion metrics
      - Meta (fbnic):
         - add 25G, 50G, and 100G link modes to phylink
         - support dumping FW logs
      - Marvell/Cavium:
         - support for CN20K generation of the Octeon chips
      - Amazon:
         - add HW clock (without timestamping, just hypervisor time access)

   - Ethernet virtual:
      - VirtIO net:
         - support segmentation of UDP-tunnel-encapsulated packets
      - Google (gve):
         - support packet timestamping and clock synchronization
      - Microsoft vNIC:
         - add handler for device-originated servicing events
         - allow dynamic MSI-X vector allocation
         - support Tx bandwidth clamping

   - Ethernet NICs consumer, and embedded:
      - AMD:
         - amd-xgbe: hardware timestamping and PTP clock support
      - Broadcom integrated MACs (bcmgenet, bcmasp):
         - use napi_complete_done() return value to support NAPI polling
         - add support for re-starting auto-negotiation
      - Broadcom switches (b53):
         - support BCM5325 switches
         - add bcm63xx EPHY power control
      - Synopsys (stmmac):
         - lots of code refactoring and cleanups
      - TI:
         - icssg-prueth: read firmware-names from device tree
         - icssg: PRP offload support
      - Microchip:
         - lan78xx: convert to PHYLINK for improved PHY and MAC management
         - ksz: add KSZ8463 switch support
      - Intel:
         - support similar queue priority scheme in multi-queue and
           time-sensitive networking (taprio)
         - support packet pre-emption in both
      - RealTek (r8169):
         - enable EEE at 5Gbps on RTL8126
      - Airoha:
         - add PPPoE offload support
         - MDIO bus controller for Airoha AN7583

   - Ethernet PHYs:
      - support for the IPQ5018 internal GE PHY
      - micrel KSZ9477 switch-integrated PHYs:
         - add MDI/MDI-X control support
         - add RX error counters
         - add cable test support
         - add Signal Quality Indicator (SQI) reporting
      - dp83tg720: improve reset handling and reduce link recovery time
      - support bcm54811 (and its MII-Lite interface type)
      - air_en8811h: support resume/suspend
      - support PHY counters for QCA807x and QCA808x
      - support WoL for QCA807x

   - CAN drivers:
      - rcar_canfd: support for Transceiver Delay Compensation
      - kvaser: report FW versions via devlink dev info

   - WiFi:
      - extended regulatory info support (6 GHz)
      - add statistics and beacon monitor for Multi-Link Operation (MLO)
      - support S1G aggregation, improve S1G support
      - add Radio Measurement action fields
      - support per-radio RTS threshold
      - some work around how FIPS affects wifi, which was wrong (RC4 is
        used by TKIP, not only WEP)
      - improvements for unsolicited probe response handling

   - WiFi drivers:
      - RealTek (rtw88):
         - IBSS mode for SDIO devices
      - RealTek (rtw89):
         - BT coexistence for MLO/WiFi7
         - concurrent station + P2P support
         - support for USB devices RTL8851BU/RTL8852BU
      - Intel (iwlwifi):
         - use embedded PNVM in (to be released) FW images to fix
           compatibility issues
         - many cleanups (unused FW APIs, PCIe code, WoWLAN)
         - some FIPS interoperability
      - MediaTek (mt76):
         - firmware recovery improvements
         - more MLO work
      - Qualcomm/Atheros (ath12k):
         - fix scan on multi-radio devices
         - more EHT/Wi-Fi 7 features
         - encapsulation/decapsulation offload
      - Broadcom (brcm80211):
         - support SDIO 43751 device

   - Bluetooth:
      - hci_event: add support for handling LE BIG Sync Lost event
      - ISO: add socket option to report packet seqnum via CMSG
      - ISO: support SCM_TIMESTAMPING for ISO TS

   - Bluetooth drivers:
      - intel_pcie: support Function Level Reset
      - nxpuart: add support for 4M baudrate
      - nxpuart: implement powerup sequence, reset, FW dump, and FW loading"

* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
  dpll: zl3073x: Fix build failure
  selftests: bpf: fix legacy netfilter options
  ipv6: annotate data-races around rt->fib6_nsiblings
  ipv6: fix possible infinite loop in fib6_info_uses_dev()
  ipv6: prevent infinite loop in rt6_nlmsg_size()
  ipv6: add a retry logic in net6_rt_notify()
  vrf: Drop existing dst reference in vrf_ip6_input_dst
  net/sched: taprio: align entry index attr validation with mqprio
  net: fsl_pq_mdio: use dev_err_probe
  selftests: rtnetlink.sh: remove esp4_offload after test
  vsock: remove unnecessary null check in vsock_getname()
  igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
  stmmac: xsk: fix negative overflow of budget in zerocopy mode
  dt-bindings: ieee802154: Convert at86rf230.txt yaml format
  net: dsa: microchip: Disable PTP function of KSZ8463
  net: dsa: microchip: Setup fiber ports for KSZ8463
  net: dsa: microchip: Write switch MAC address differently for KSZ8463
  net: dsa: microchip: Use different registers for KSZ8463
  net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
  dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
  ...
2025-07-30 08:58:55 -07:00
Linus Torvalds 4b290aae78 Summary
* Move sysctls out of the kern_table array
 
   This is the final move of ctl_tables into their respective subsystems. Only 5
   (out of the original 50) will remain in kernel/sysctl.c file; these handle
   either sysctl or common arch variables.
 
   By decentralizing sysctl registrations, subsystem maintainers regain control
   over their sysctl interfaces, improving maintainability and reducing the
   likelihood of merge conflicts.
 
 * docs: Remove false positives from check-sysctl-docs
 
   Stopped falsely identifying sysctls as undocumented or unimplemented in the
   check-sysctl-docs script. This script can now be used to automatically
   identify if documentation is missing.
 
 * Testing
 
   All these have been in linux-next since rc3, giving them a solid 3 to 4 weeks
   worth of testing. Additionally, sysctl selftests and kunit were also run
   locally on my x86_64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmiAvd8ACgkQupfNUreW
 QU+9nAv/dtxaKoL4BXJSzsA2+49bbo9QfiK5Vjz1wSRYRQTb+jhGr9QdS5hG+NeX
 uN2ilvcNQqW7ENdiblU10lvcbPjIn2hw4lbMcpv/+QXnrudtGYlBFXlkWqW5nv7X
 AVvHU8y3uzfs6JbRIpROUA7Cn2cDOlfP2mMtwxCXR3iP+orS1ziuVEi1JRoirIyG
 iq5I/1rJMJBU3FjqqDTq6yljspLx8AlXO1yc5xUxAM67IcY4ew3ZTxqiZr6M9AhV
 DUbR2lu/88wcFNERt8DJmuQ50dSGGqOEpK3FURTmkwtMFxzNLmenFDQeBKKahz3Q
 2ntXSDfp2y+ppZNmcOP8tZZkra03Xpy1DQyoOgQ2r9uGekPxyr+wmKXwYPOeJIPO
 YWTNBm8omX9qr49zVzaZ1f2foRGfgStHL6aa6xLIf34zzScSDEPtO3og2+5Hw/30
 gnp+7v9E19uKpoE6oiGE0PtiFzAi/I6nFxzG2RRqrlMLFXyKVccTKygzY6tCnI3P
 6144s/Bt
 =R369
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

 - Move sysctls out of the kern_table array

   This is the final move of ctl_tables into their respective
   subsystems. Only 5 (out of the original 50) will remain in
   kernel/sysctl.c file; these handle either sysctl or common arch
   variables.

   By decentralizing sysctl registrations, subsystem maintainers regain
   control over their sysctl interfaces, improving maintainability and
   reducing the likelihood of merge conflicts.

 - docs: Remove false positives from check-sysctl-docs

   Stopped falsely identifying sysctls as undocumented or unimplemented
   in the check-sysctl-docs script. This script can now be used to
   automatically identify if documentation is missing.

* tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (23 commits)
  docs: Downgrade arm64 & riscv from titles to comment
  docs: Replace spaces with tabs in check-sysctl-docs
  docs: Remove colon from ctltable title in vm.rst
  docs: Add awk section for ucount sysctl entries
  docs: Use skiplist when checking sysctl admin-guide
  docs: nixify check-sysctl-docs
  sysctl: rename kern_table -> sysctl_subsys_table
  kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.c
  uevent: mv uevent_helper into kobject_uevent.c
  sysctl: Removed unused variable
  sysctl: Nixify sysctl.sh
  sysctl: Remove superfluous includes from kernel/sysctl.c
  sysctl: Remove (very) old file changelog
  sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.c
  sysctl: move cad_pid into kernel/pid.c
  sysctl: Move tainted ctl_table into kernel/panic.c
  Input: sysrq: mv sysrq into drivers/tty/sysrq.c
  fork: mv threads-max into kernel/fork.c
  parisc/power: Move soft-power into power.c
  mm: move randomize_va_space into memory.c
  ...
2025-07-29 21:43:08 -07:00
Linus Torvalds 6fb44438a5 arm64 updates for 6.17:
Perf and PMU updates:
 
  - Add support for new (v3) Hisilicon SLLC and DDRC PMUs
 
  - Add support for Arm-NI PMU integrations that share interrupts between
    clock domains within a given instance
 
  - Allow SPE to be configured with a lower sample period than the
    minimum recommendation advertised by PMSIDR_EL1.Interval
 
  - Add suppport for Arm's "Branch Record Buffer Extension" (BRBE)
 
  - Adjust the perf watchdog period according to cpu frequency changes
 
  - Minor driver fixes and cleanups
 
 Hardware features:
 
  - Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)
 
  - Support for reporting the non-address bits during a synchronous MTE
    tag check fault (FEAT_MTE_TAGGED_FAR)
 
  - Optimise the TLBI when folding/unfolding contiguous PTEs on hardware
    with FEAT_BBM (break-before-make) level 2 and no TLB conflict aborts
 
 Software features:
 
  - Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
    and using the text-poke API for late module relocations
 
  - Force VMAP_STACK always on and change arm64_efi_rt_init() to use
    arch_alloc_vmap_stack() in order to avoid KASAN false positives
 
 ACPI:
 
  - Improve SPCR handling and messaging on systems lacking an SPCR table
 
 Debug:
 
  - Simplify the debug exception entry path
 
  - Drop redundant DBG_MDSCR_* macros
 
 Kselftests:
 
  - Cleanups and improvements for SME, SVE and FPSIMD tests
 
 Miscellaneous:
 
  - Optimise loop to reduce redundant operations in contpte_ptep_get()
 
  - Remove ISB when resetting POR_EL0 during signal handling
 
  - Mark the kernel as tainted on SEA and SError panic
 
  - Remove redundant gcs_free() call
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmiDkgoACgkQa9axLQDI
 XvFucQ//bYugRP5/Sdlrq5eDKWBGi1HufYzwfDEBLc4S75Eu8mGL/tuThfu9yFn+
 qCowtt4U84HdWsZDTSVo6lym6v2vJUpGOMgXzepvJaFBRnqGv9X9NxH6RQO1LTnu
 Pm7rO+7I9tNpfuc7Zu9pHDggsJEw+WzVfmEF6WPSFlT9mUNv6NbSx4rbLQKU86Dm
 ouTqXaePEQZ5oiRXVasxyT0otGtiACD20WpgOtNjYGzsfUVwCf/C83V/2DLwwbhr
 9cW9lCtFxA/yFdQcA9ThRzWZ9Eo5LAHqjGIq00+zOjuzgDbBtcTT79gpChkhovIR
 FBIsWHd9j9i3nYxzf4V4eRKQnyqS3NQWv7g7uKFwNgARif1Zk0VJ77QIlAYk5xLI
 ENTRjLKz5WNGGnhdkeCvDlVyxX+OktgcVTp3vqRxAKCRahMMUqBrwxiM8RzVF37e
 yzkEQayL8F7uZqy9H7Sjn48UpHZux6frJ1bBQw1oEvR9QmAoAdqavPMSAYIOT3Zr
 ze4WIljq/cFr3kBPIFP5pK1e0qYMHXZpSKIm8MAv6y/7KmQuVbMjZthpuPbLSIw0
 Q7C0KalB8lToPIbO7qMni/he0dCN4K2+E1YHFTR+pzfcoLuW4rjSg7i8tqMLKMJ8
 H+SeGLyPtM5A6bdAPTTpqefcgUUe7064ENUqrGUpDEynGXA7boE=
 =5h1C
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "A quick summary: perf support for Branch Record Buffer Extensions
  (BRBE), typical PMU hardware updates, small additions to MTE for
  store-only tag checking and exposing non-address bits to signal
  handlers, HAVE_LIVEPATCH enabled on arm64, VMAP_STACK forced on.

  There is also a TLBI optimisation on hardware that does not require
  break-before-make when changing the user PTEs between contiguous and
  non-contiguous.

  More details:

  Perf and PMU updates:

   - Add support for new (v3) Hisilicon SLLC and DDRC PMUs

   - Add support for Arm-NI PMU integrations that share interrupts
     between clock domains within a given instance

   - Allow SPE to be configured with a lower sample period than the
     minimum recommendation advertised by PMSIDR_EL1.Interval

   - Add suppport for Arm's "Branch Record Buffer Extension" (BRBE)

   - Adjust the perf watchdog period according to cpu frequency changes

   - Minor driver fixes and cleanups

  Hardware features:

   - Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)

   - Support for reporting the non-address bits during a synchronous MTE
     tag check fault (FEAT_MTE_TAGGED_FAR)

   - Optimise the TLBI when folding/unfolding contiguous PTEs on
     hardware with FEAT_BBM (break-before-make) level 2 and no TLB
     conflict aborts

  Software features:

   - Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
     and using the text-poke API for late module relocations

   - Force VMAP_STACK always on and change arm64_efi_rt_init() to use
     arch_alloc_vmap_stack() in order to avoid KASAN false positives

  ACPI:

   - Improve SPCR handling and messaging on systems lacking an SPCR
     table

  Debug:

   - Simplify the debug exception entry path

   - Drop redundant DBG_MDSCR_* macros

  Kselftests:

   - Cleanups and improvements for SME, SVE and FPSIMD tests

  Miscellaneous:

   - Optimise loop to reduce redundant operations in contpte_ptep_get()

   - Remove ISB when resetting POR_EL0 during signal handling

   - Mark the kernel as tainted on SEA and SError panic

   - Remove redundant gcs_free() call"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64/gcs: task_gcs_el0_enable() should use passed task
  arm64: Kconfig: Keep selects somewhat alphabetically ordered
  arm64: signal: Remove ISB when resetting POR_EL0
  kselftest/arm64: Handle attempts to disable SM on SME only systems
  kselftest/arm64: Fix SVE write data generation for SME only systems
  kselftest/arm64: Test SME on SME only systems in fp-ptrace
  kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace
  kselftest/arm64: Allow sve-ptrace to run on SME only systems
  arm64/mm: Drop redundant addr increment in set_huge_pte_at()
  kselftest/arm4: Provide local defines for AT_HWCAP3
  arm64: Mark kernel as tainted on SAE and SError panic
  arm64/gcs: Don't call gcs_free() when releasing task_struct
  drivers/perf: hisi: Support PMUs with no interrupt
  drivers/perf: hisi: Relax the event number check of v2 PMUs
  drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
  drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
  drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
  drivers/perf: hisi: Simplify the probe process for each DDRC version
  perf/arm-ni: Support sharing IRQs within an NI instance
  perf/arm-ni: Consolidate CPU affinity handling
  ...
2025-07-29 20:21:54 -07:00
Linus Torvalds b1c21075d3 nolibc changes for v6.17
Highlights:
 
 * New supported architectures: SuperH, x32, MIPS n32/n64
 * Adopt general kernel architectures names
 * Integrate the nolibc selftests into the kselftests framework
 * Various fixes and new syscall wrappers
 
 Two non-nolibc changes:
 
 * New arm64 selftest which depends on nolibc changes
 * General tools/ cross-compilation bugfix for s390 clang
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTg4lxklFHAidmUs57B+h1jyw5bOAUCaIKVZQAKCRDB+h1jyw5b
 OFU7AP9pFk+eIO8M68GHCRVQoOjWTYa/A0lPfx31pa3HTHYiDAD+PLcTEBP4nc21
 ZQ4MFxwe+O9YXKX+Y1LkqkU7yOu5eQo=
 =xQ6f
 -----END PGP SIGNATURE-----

Merge tag 'nolibc-20250724-for-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc

Pull nolibc updates from Thomas Weißschuh:
 "Highlights:
   - New supported architectures: SuperH, x32, MIPS n32/n64
   - Adopt general kernel architectures names
   - Integrate the nolibc selftests into the kselftests framework
   - Various fixes and new syscall wrappers

  Two non-nolibc changes:
   - New arm64 selftest which depends on nolibc changes
   - General tools/ cross-compilation bugfix for s390 clang"

* tag 'nolibc-20250724-for-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (30 commits)
  selftests/nolibc: add x32 test configuration
  tools/nolibc: define time_t in terms of __kernel_old_time_t
  selftests/nolibc: show failed run if test process crashes
  tools/nolibc: drop s390 clang target override
  tools/build: Fix s390(x) cross-compilation with clang
  tools/nolibc: avoid false-positive -Wmaybe-uninitialized through waitpid()
  selftests/nolibc: correctly report errors from printf() and friends
  selftests/nolibc: create /dev/full when running as PID 1
  tools/nolibc: add support for clock_nanosleep() and nanosleep()
  kselftest/arm64: Add a test for vfork() with GCS
  selftests/nolibc: Add coverage of vfork()
  tools/nolibc: Provide vfork()
  tools/nolibc: Replace ifdef with if defined() in sys.h
  tools/nolibc: add support for SuperH
  selftests/nolibc: use file driver for QEMU serial
  selftests/nolibc: fix EXTRACONFIG variables ordering
  tools/nolibc: MIPS: add support for N64 and N32 ABIs
  tools/nolibc: MIPS: drop noreorder option
  tools/nolibc: MIPS: drop manual stack pointer alignment
  tools/nolibc: MIPS: drop $gp setup
  ...
2025-07-29 15:32:02 -07:00
Linus Torvalds 78bb43e51b Updates for the generic entry code:
- Split the code into syscall and exception/interrupt parts to ease the
     conversion of ARM[64] to the generic entry infrastructure
 
   - Extend syscall user dispatching to support a single intercepted range
     instead of the default single non-intercepted range. That allows
     monitoring/analysis of a specific executable range, e.g. a library, and
     also provides flexibility for sandboxing scenarios.
 
   - Cleanup and extend the user dispatch selftest
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiIg6UTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoWn7EACTvQpu7tGd1rN9hCjiB1W5po7nvlCd
 gKghjS9Kp0KttTDQPLVcmnH06BhDHWNNn1HXZ1ORea4bpLywiKHtVgqUAsJDsBsv
 ETeTHYNphk0sktvAqp3XusA6HF4T0s1KXJQj3W1ACrYZWRkK/VystCLYwBRGpc3r
 cj7jAFmJyNpU236R5XYJ7ooHfPYpzZ8VAHBO8ykK7muHDfyBRXEIlmkGep++ctSv
 v0uZXAy6LONljKg87YJTien0UA7ze9lFgPTuV1y/qfaLbYNekUaJSDjfuhOpZZUw
 TzSh9OYoIvKpd0ylHwB1qMLd5CaXNicaeLfTW3xbX06KaXa7WNAonS35sK0EjhtZ
 0bBA9g6bRhphyh0tzR4saF9bczNvJydNCn7/QFo9dKbQUEL/FRXtJiIeusVx/0fJ
 +ZqWRTcEdDw2Rsyv52hKgyEJi7F3nL9ovabUN9P1/0aPcTdM3WekMpSOJm1U6wVF
 e6oSyeoeNdjcdxgWbQrgRNbmq5CPEV3ig5J+G418r5DTF3ifqZX+WscijUtKTu5K
 V5GpLc0PL9eoigQ37LmGkwK/4xoB9SAPTQuzUs9qgh9NidwT0cCfoNxpeGh6GeHX
 GLHPGU61vZaefxpwuAuv+SQSgxXSKk2/H/ijPzSjrX/PkUp7MoX9XoOQAh4FxZjO
 ok5YEUGXzSJfXQ==
 =yaCQ
 -----END PGP SIGNATURE-----

Merge tag 'core-entry-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull generic entry code updates from Thomas Gleixner:

 - Split the code into syscall and exception/interrupt parts to ease the
   conversion of ARM[64] to the generic entry infrastructure

 - Extend syscall user dispatching to support a single intercepted range
   instead of the default single non-intercepted range. That allows
   monitoring/analysis of a specific executable range, e.g. a library,
   and also provides flexibility for sandboxing scenarios

 - Cleanup and extend the user dispatch selftest

* tag 'core-entry-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Split generic entry into generic exception and syscall entry
  selftests: Add tests for PR_SYS_DISPATCH_INCLUSIVE_ON
  syscall_user_dispatch: Add PR_SYS_DISPATCH_INCLUSIVE_ON
  selftests: Fix errno checking in syscall_user_dispatch test
2025-07-29 15:14:29 -07:00
Linus Torvalds a0482e3446 A set of updates for the VDSO selftests:
- Skip the chaca test when the architecture does not provide the random
     infrastructure in the VDSO
 
   - Switch back to a symlink for vdso_standalone_test_x86 to avoid code
     duplication.
 
   - Improve code quality and TAP output compliance
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiIihwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTWhD/0cI+VRRCuOqSrs/7GOeVjIjIK5IVn0
 DLXNurHokOu9aY6/BTUvLgHf4R+no+RIcnWSKc614UKjftIzkt1HVD/ekUXQbDnm
 hn6gOOEGMjUFzb7wBUToWkJ4O50/YjUmzjdEECwGafetOyre7a97cdVHOie4HrpO
 A7zy22J8kdRNcTB2FZfICZsLqvpxznNtZ9U1pHkNLddVLnC7ojMJKUTki4Ah5z18
 rb5I7K8FFQQ+QfOtMKJ0b90wFLQiYoEZjw8qp8LDq/r/fYHbm9ri3cO/rFWMtyjd
 /TLGl57H54IJUmq/M84cAI+TzZ+vZkWJqJwBF7Jt8TR01+00y3MEJG+p5Z3sG8ZQ
 sDCNypmMZPud8OsBhwZGW0vppkF1fG+64Ro5GnXZ2Q4aSScT9lyfs0lBQ6rDdb5U
 wM4wR/8UWsVY5F6Lm8nSdpKZo92U4mgaizPvWCwUl51cfRbUz71/qdXaoMW8YIOk
 Us+uDbwtHl7KFGcSehFxmrr74HJCzUezJgMPoaOjK0L/uFhXTJbvasXb7GKyS3aW
 WSYc2x62CaTJvXTEHwjZlcMFPs2OHwDSpiR2XJ+zlhiuDb++2Luhlje8cD+VieiI
 Rz7XmYiIeHsouvwD+G23ZiTOiMJH8+dVi3+oeDchITXcbGxFz+lPM1kay4w7k4zk
 q4IzgToknCuDRQ==
 =IYtA
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull VDSO selftest updates from Thomas Gleixner:

 - Skip the chacha test when the architecture does not provide the
   random infrastructure in the VDSO

 - Switch back to a symlink for vdso_standalone_test_x86 to avoid code
   duplication.

 - Improve code quality and TAP output compliance

* tag 'timers-vdso-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests: vDSO: vdso_standalone_test_x86: Replace source file with symlink
  selftests: vDSO: vdso_test_getrandom: Always print TAP header
  selftests: vDSO: vdso_test_correctness: Fix -Wstrict-prototypes
  selftests: vDSO: Enable -Wall
  selftests: vDSO: vdso_config: Avoid -Wunused-variables
  selftests: vDSO: vdso_test_getrandom: Avoid -Wunused
  selftests: vDSO: vdso_test_getrandom: Drop unused include of linux/compiler.h
  selftests: vDSO: clock_getres: Drop unused include of err.h
  selftests: vDSO: chacha: Correctly skip test if necessary
2025-07-29 15:12:29 -07:00
Linus Torvalds f38b1f243e Update for the futex subsystem:
- Switch the reference counting to a RCU based per-CPU reference to
      address a performance bottleneck vs. the single instance rcuref
      variant.
 
    - Make the futex selftest build on 32-bit architectures which only
      support 64-bit time_t, e.g. RISCV-32.
 
    - Cleanups and improvements in selftests and futex bench
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiIiDITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoblTD/0eV9w21tFVmn6ICrhgQgsrejJ0BANs
 mm5mE/0d29MZHEhnJO2CSccGXBDfykuk/gxHXHsUZ9tiVSOgjz9dDl1bcrZ8Je9V
 YNWMXiHASQrLctmrKLPSdjlcxQnPIxCm+K4lajoa+CyvReHE24sUDgCN8GC3P9pH
 VxTmQ7UjGrzvIRlfd4AL9GJBF1IGKNnpPHCeSwjn/cmlDxu4RxEdjRWTbW8Tbz9N
 1ay/T8vEE1SykI2qZOXIP16sYZw2dP9FOgARO90Ahb6hwAwbI72MvC69GpZe3lh5
 1B1ZgpEiUMa4IT5jJ43Wkm3k8BF6meW+rIUjUBt+y8yjNgaR4degvgnDx44YPZ94
 5Ek3cJgpTpVnWbfRxn2b2vRL8rZkRBIq9ezswp0/8KLgC7Gd+zPuQKPvoo2m+n3S
 UMufGGT2h5oJbx0qGry5rxZz03eGE6oWAm3H/WRl2wIw5D/kvU5ol6AYKJ5eGTyj
 JdPJVzzPBH319iCMZ1olqo/h5er148aYL16ga7w6w9pqhPuxGud30BFf8SHQ8F1R
 NIZiu6O3L2ge0RLb/8wxukFkDz3R1gZBWeTLxLEymTJG3TaA3uIByOI6UO03zgW/
 QBbNLr7ndkIcm8E31hAWamGQy+EAXj1/e5GYREvhhHOwUV+y/E1FTrrdwtT4GA0S
 tBYACfeCbOojsA==
 =WqFq
 -----END PGP SIGNATURE-----

Merge tag 'locking-futex-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull futex updates from Thomas Gleixner:

 - Switch the reference counting to a RCU based per-CPU reference to
   address a performance bottleneck vs the single instance rcuref
   variant

 - Make the futex selftest build on 32-bit architectures which only
   support 64-bit time_t, e.g. RISCV-32

 - Cleanups and improvements in selftests and futex bench

* tag 'locking-futex-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/futex: Fix spelling mistake "Succeffuly" -> "Successfully"
  selftests/futex: Define SYS_futex on 32-bit architectures with 64-bit time_t
  perf bench futex: Remove support for IMMUTABLE
  selftests/futex: Remove support for IMMUTABLE
  futex: Remove support for IMMUTABLE
  futex: Make futex_private_hash_get() static
  futex: Use RCU-based per-CPU reference counting instead of rcuref_t
  selftests/futex: Adapt the private hash test to RCU related changes
2025-07-29 14:39:42 -07:00
Johannes Nixdorf b0c9bfbab9 selftests/seccomp: Add a test for the WAIT_KILLABLE_RECV fast reply race
If WAIT_KILLABLE_RECV was specified, and an event is received, the
tracee's syscall is not supposed to be interruptible. This was not properly
ensured if the reply was sent too fast, and an interrupting signal was
received before the reply was processed on the tracee side.

Add a test for this, that consists of:

 - a tracee with a timer that keeps sending it signals while repeatedly
   running a traced syscall in a loop,
 - a tracer that repeatedly handles all syscalls from the tracee in a
   loop, and
 - a shared pipe between both, on which the tracee sends one byte per
   syscall attempted and the tracer reads one byte per syscall handled.

If the syscall for the tracee is restarted after the tracer received the
event for it due to this bug, the tracee will not have sent a second
token on the pipe, which the tracer will notice and fail the test.

The tests also uses SECCOMP_IOCTL_NOTIF_ADDFD with SECCOMP_ADDFD_FLAG_SEND
for the reply, as the fix for the bug has an additional code path
change for handling addfd, which would not be exercised by a simple
SECCOMP_IOCTL_NOTIF_SEND, and it is possible to fix the bug while leaving
the same race intact for the addfd case.

This test is not guaranteed to reproduce the bug on every run, but the
parameters (signal frequency and number of repeated syscalls) have been
chosen so that on my machine this test:

 - takes ~0.8s in the good case (+1s in the failure case), and
 - detects the bug in 999 of 1000 runs.

Signed-off-by: Johannes Nixdorf <johannes@nixdorf.dev>
Link: https://lore.kernel.org/r/20250725-seccomp-races-v2-2-cf8b9d139596@nixdorf.dev
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-29 13:33:02 -07:00
Linus Torvalds 0db240bc07 linux_kselftest-next-6.17-rc1
Fixes
 
 - false failure of subsystem event test
 - glob filter test to use mutex_unlock() instead of mutex_trylock()
 - several spelling errors in tests
 - test_kexec_jump build errors
 - pidfd test duplicate-symbol warnings for SCHED_ CPP symbols
 
 Adds a reliable check for suspend to breakpoints suspend test
 Improvements to ipc test
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmiH76IACgkQCwJExA0N
 QxyHAw/+PWj6IvX+3qQEhPL1l/huPpknPVqd/UVEozAudtkDMNwTXAzmc8TZAGU1
 BiQTMRBothDZVxLa6YyQCNgHcFuE9X4PIDPCeRrtHQmoSIJz5Db7wrgy1LtrStBN
 9bOrsoYUFyy5xWlnidreu5IHuZFrLGUdwF4BhlBL9IvNBv2FKujHfzGl0mbc2agH
 /pbJuLiDKRWk+Wqgyv8LOBu9wsQSraunyiKxYZ+ICqzhangnH8SBVNejum9B/KUT
 P2Jzyjjo9hyZxR85B7951HqubzWZjhrS2I9boDYaI0x8RyOM75pO6cdh681Gdzz9
 lluospIzQRsKR1brtP+M5vH7F6yGnrQRvjp/PvW5a410JFYeIiuhCuDhGZzYsPdj
 nFFXP/CLlTJ4U1TVAzZ0OpwapoLfSmnUzrmVHFJRDW6YpuhliFbxb4YErCRxmBLX
 MLxN7NnPG0fUxO4voy49GiqXfFuYnvgZO1FWJuczKv4NXxPUbc77Y1K9Tv4mV97n
 W5e1EO3JDLghUAkoQyDy9oC7g6ZYET1EcqhOeTujazOAVPlIKJ0YOnLqlFppc3F7
 oqlXJbRdjamn9w+UcAjWNPQxYAJeK8jyB+S2LkCDLULD3dgJEzBj7szC67Ou3yQB
 qKuh9u6xhFKc0xgW2YSm3+Cpj8BPGlKzNf+Ks4TIGF8pvurPCy0=
 =DyuD
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-next-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:

 - Fixes:
      - false failure of subsystem event test
      - glob filter test to use mutex_unlock() instead of mutex_trylock()
      - several spelling errors in tests
      - test_kexec_jump build errors
      - pidfd test duplicate-symbol warnings for SCHED_ CPP symbols

 - Add a reliable check for suspend to breakpoints suspend test

 - Improvements to ipc test

* tag 'linux_kselftest-next-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/pidfd: Fix duplicate-symbol warnings for SCHED_ CPP symbols
  selftests/tracing: Fix false failure of subsystem event test
  selftests/kexec: fix test_kexec_jump build
  selftests: breakpoints: use suspend_stats to reliably check suspend success
  selftests: tracing: Use mutex_unlock for testing glob filter
  selftests: print installation complete message
  selftests/ptrace: Fix spelling mistake "multible" -> "multiple"
  selftests: ipc: Replace fail print statements with ksft_test_result_fail
  selftests: Add version file to kselftest installation dir
  selftests/cpu-hotplug: fix typo in hotplaggable_offline_cpus function name
2025-07-29 12:48:53 -07:00
Paolo Bonzini 314b40b3b6 KVM/arm64 changes for 6.17, round #1
- Host driver for GICv5, the next generation interrupt controller for
    arm64, including support for interrupt routing, MSIs, interrupt
    translation and wired interrupts.
 
  - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
    GICv5 hardware, leveraging the legacy VGIC interface.
 
  - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
    userspace to disable support for SGIs w/o an active state on hardware
    that previously advertised it unconditionally.
 
  - Map supporting endpoints with cacheable memory attributes on systems
    with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
    maintenance on the address range.
 
  - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
    hypervisor to inject external aborts into an L2 VM and take traps of
    masked external aborts to the hypervisor.
 
  - Convert more system register sanitization to the config-driven
    implementation.
 
  - Fixes to the visibility of EL2 registers, namely making VGICv3 system
    registers accessible through the VGIC device instead of the ONE_REG
    vCPU ioctls.
 
  - Various cleanups and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCaIezbRccb2xpdmVyLnVw
 dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFr/eAQDY5NIG5cR6ZcAWnPQLmGWpz2ou
 pq4Jhn9E/mGR3n5L1AEAsJpfLLpOsmnLBdwfbjmW59gGsa8k3i5tjWEOJ6yzAwk=
 =r+sp
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.17' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 changes for 6.17, round #1

 - Host driver for GICv5, the next generation interrupt controller for
   arm64, including support for interrupt routing, MSIs, interrupt
   translation and wired interrupts.

 - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
   GICv5 hardware, leveraging the legacy VGIC interface.

 - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
   userspace to disable support for SGIs w/o an active state on hardware
   that previously advertised it unconditionally.

 - Map supporting endpoints with cacheable memory attributes on systems
   with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
   maintenance on the address range.

 - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
   hypervisor to inject external aborts into an L2 VM and take traps of
   masked external aborts to the hypervisor.

 - Convert more system register sanitization to the config-driven
   implementation.

 - Fixes to the visibility of EL2 registers, namely making VGICv3 system
   registers accessible through the VGIC device instead of the ONE_REG
   vCPU ioctls.

 - Various cleanups and minor fixes.
2025-07-29 12:27:40 -04:00
Steven Rostedt a5e71638dd ktest.pl: Add new PATCHCHECK_SKIP option to skip testing individual commits
When testing a series of commits that also includes changes to the Linux
tools directory, it is useless to test the changes in tools as they may
not affect the kernel itself. Doing tests on the kernel for changes that
do not affect the kernel is a waste of time.

Add a PATCHCHECK_SKIP that takes a series of shas that will be skipped
while doing the individual commit tests.

For example, the runtime verification may have a series of commits like:

$ git log --abbrev-commit --pretty=oneline fac5493251a6~1..HEAD
3d3800b4f7 rv: Remove rv_reactor's reference counter
3d3c376118 rv: Merge struct rv_reactor_def into struct rv_reactor
24cbfe18d5 rv: Merge struct rv_monitor_def into struct rv_monitor
b0c08dd534 rv: Remove unused field in struct rv_monitor_def
58d5f0d437 (debiantesting-x86-64/trace/rv/core) rv: Return init error when registering monitors
560473f2e2 verification/rvgen: Organise Kconfig entries for nested monitors
9efcf59082 tools/dot2c: Fix generated files going over 100 column limit
1160ccaf77 tools/rv: Stop gracefully also on SIGTERM
f60227f344 tools/rv: Do not skip idle in trace
f3735df628 verification/rvgen: Do not generate unused variables
6fb37c2a27 verification/rvgen: Generate each variable definition only once
8cfcf9b0e9 verification/rvgen: Support the 'next' operator
fac5493251 rv: Allow to configure the number of per-task monitor

Where the first commit touches the kernel followed by a series of commits
that do not, and ends with commits that do. Instead of having to add
multiple patchcheck tests to handle the gaps, just include the commits
that should not be tested:

$ git log --abbrev-commit --pretty=oneline fac5493251a6~1..HEAD |
  grep -e verification -e tools/ | cut -d' ' -f1 |
  while read a ; do echo -n "$a "; done
560473f2e2 9efcf59082 1160ccaf77 f60227f344 f3735df628 6fb37c2a27 8cfcf9b0e9

Then set PATCHCHECK_SKIP to that, and those commits will be skipped.

 PATCHCHECK_SKIP = 560473f2e2 9efcf59082 1160ccaf77 f60227f344 f3735df628 6fb37c2a27 8cfcf9b0e9

Cc: John 'Warthog9' Hawley <warthog9@kernel.org>
Cc: Dhaval Giani <dhaval.giani@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/20250725112153.1dd06b84@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2025-07-29 09:30:08 -04:00
Paolo Bonzini b4733cd5be Merge tag 'kvm-x86-selftests-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.17

 - Fix a comment typo.

 - Verify KVM is loaded when getting any KVM module param so that attempting to
   run a selftest without kvm.ko loaded results in a SKIP message about KVM not
   being loaded/enabled, versus some random parameter not existing.

 - SKIP tests that hit EACCES when attempting to access a file, with a "Root
   required?" help message.  In most cases, the test just needs to be run with
   elevated permissions.
2025-07-29 08:36:44 -04:00
Paolo Bonzini 1a14928e2e Merge tag 'kvm-x86-misc-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.17

 - Prevert the host's DEBUGCTL.FREEZE_IN_SMM (Intel only) when running the
   guest.  Failure to honor FREEZE_IN_SMM can bleed host state into the guest.

 - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter (Intel only) to
   prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF.

 - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
   vCPU's CPUID model.

 - Rework the MSR interception code so that the SVM and VMX APIs are more or
   less identical.

 - Recalculate all MSR intercepts from the "source" on MSR filter changes, and
   drop the dedicated "shadow" bitmaps (and their awful "max" size defines).

 - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
   nested SVM MSRPM offsets tracker can't handle an MSR.

 - Advertise support for LKGS (Load Kernel GS base), a new instruction that's
   loosely related to FRED, but is supported and enumerated independently.

 - Fix a user-triggerable WARN that syzkaller found by stuffing INIT_RECEIVED,
   a.k.a. WFS, and then putting the vCPU into VMX Root Mode (post-VMXON).  Use
   the same approach KVM uses for dealing with "impossible" emulation when
   running a !URG guest, and simply wait until KVM_RUN to detect that the vCPU
   has architecturally impossible state.

 - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
   APERF/MPERF reads, so that a "properly" configured VM can "virtualize"
   APERF/MPERF (with many caveats).

 - Reject KVM_SET_TSC_KHZ if vCPUs have been created, as changing the "default"
   frequency is unsupported for VMs with a "secure" TSC, and there's no known
   use case for changing the default frequency for other VM types.
2025-07-29 08:36:43 -04:00
Paolo Bonzini f02b1bcc73 Merge tag 'kvm-x86-irqs-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM IRQ changes for 6.17

 - Rework irqbypass to track/match producers and consumers via an xarray
   instead of a linked list.  Using a linked list leads to O(n^2) insertion
   times, which is hugely problematic for use cases that create large numbers
   of VMs.  Such use cases typically don't actually use irqbypass, but
   eliminating the pointless registration is a future problem to solve as it
   likely requires new uAPI.

 - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *",
   to avoid making a simple concept unnecessarily difficult to understand.

 - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC,
   and PIT emulation at compile time.

 - Drop x86's irq_comm.c, and move a pile of IRQ related code into irq.c.

 - Fix a variety of flaws and bugs in the AVIC device posted IRQ code.

 - Inhibited AVIC if a vCPU's ID is too big (relative to what hardware
   supports) instead of rejecting vCPU creation.

 - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning
   clear in the vCPU's physical ID table entry.

 - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by
   erratum #1235, to allow (safely) enabling AVIC on such CPUs.

 - Dedup x86's device posted IRQ code, as the vast majority of functionality
   can be shared verbatime between SVM and VMX.

 - Harden the device posted IRQ code against bugs and runtime errors.

 - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1)
   instead of O(n).

 - Generate GA Log interrupts if and only if the target vCPU is blocking, i.e.
   only if KVM needs a notification in order to wake the vCPU.

 - Decouple device posted IRQs from VFIO device assignment, as binding a VM to
   a VFIO group is not a requirement for enabling device posted IRQs.

 - Clean up and document/comment the irqfd assignment code.

 - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e.
   ensure an eventfd is bound to at most one irqfd through the entire host,
   and add a selftest to verify eventfd:irqfd bindings are globally unique.
2025-07-29 08:35:46 -04:00
KaFai Wan 51d3750aba selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
Delete fexit_noreturns.c files and migrate the cases into
tracing_failure.c files.

The result:

 $ tools/testing/selftests/bpf/test_progs -t tracing_failure/fexit_noreturns
 #467/4   tracing_failure/fexit_noreturns:OK
 #467     tracing_failure:OK
 Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-5-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:30 -07:00
KaFai Wan a32f6f17a7 selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
The result:

 $ tools/testing/selftests/bpf/test_progs -t tracing_failure/tracing_deny
 #468/3   tracing_failure/tracing_deny:OK
 #468     tracing_failure:OK
 Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-4-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:29 -07:00
KaFai Wan a5a6b29a70 bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
With this change, we know the precise rejected function name when
attaching fexit/fmod_ret to __noreturn functions from log.

$ ./fexit
libbpf: prog 'fexit': BPF program load failed: -EINVAL
libbpf: prog 'fexit': -- BEGIN PROG LOAD LOG --
Attaching fexit/fmod_ret to __noreturn function 'do_exit' is rejected.

Suggested-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:29 -07:00
Linus Torvalds ae388edd4a Landlock update for v6.17-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCaINQ2hAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbS8DcA/RvnXD7NcvINU94pkY6w+SxtdhsIe/w7EcGF
 LJdwxrKQAP0WpMNTfKdzfe6ub7qE+AkZYhagKAuMCloS1qE35nZgDg==
 =WKRS
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock update from Mickaël Salaün:
 "Fix test issues, improve build compatibility, and add new tests"

* tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Fix cosmetic change
  samples/landlock: Fix building on musl libc
  landlock: Fix warning from KUnit tests
  selftests/landlock: Add test to check rule tied to covered mount point
  selftests/landlock: Fix build of audit_test
  selftests/landlock: Fix readlink check
2025-07-28 19:21:32 -07:00
Linus Torvalds 13150742b0 Crypto library updates for 6.17
This is the main crypto library pull request for 6.17. The main focus
 this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
 high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
 and establishing conventions for lib/crypto/ going forward:
 
  - Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
    most of the SHA-512 code) into lib/crypto/. This includes both the
    generic and architecture-optimized code. Greatly simplify how the
    architecture-optimized code is integrated. Add an easy-to-use
    library API for each SHA variant, including HMAC support. Finally,
    reimplement the crypto_shash support on top of the library API.
 
  - Apply the same reorganization to the SHA-256 code (and also SHA-224
    which shares most of the SHA-256 code). This is a somewhat smaller
    change, due to my earlier work on SHA-256. But this brings in all
    the same additional improvements that I made for SHA-1 and SHA-512.
 
 There are also some smaller changes:
 
  - Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
    from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
    these algorithms it's just a move, not a full reorganization yet.
 
  - Fix the MIPS chacha-core.S to build with the clang assembler.
 
  - Fix the Poly1305 functions to work in all contexts.
 
  - Fix a performance regression in the x86_64 Poly1305 code.
 
  - Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
 
 Note that since the new organization of the SHA code is much simpler,
 the diffstat of this pull request is negative, despite the addition of
 new fully-documented library APIs for multiple SHA and HMAC-SHA
 variants. These APIs will allow further simplifications across the
 kernel as users start using them instead of the old-school crypto API.
 (I've already written a lot of such conversion patches, removing over
 1000 more lines of code. But most of those will target 6.18 or later.)
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ93BQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK8HCAQD3O9P0qd6wscne5XuRwaybzKHQ2AqU
 OlhlDZWQQEvYAgD/aa6KP/DS+8RKGj0TBn6bACAJyXyDygFXq5a5s9pGzAs=
 =UmMM
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.17. The main focus
  this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
  high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
  and establishing conventions for lib/crypto/ going forward:

   - Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
     most of the SHA-512 code) into lib/crypto/. This includes both the
     generic and architecture-optimized code. Greatly simplify how the
     architecture-optimized code is integrated. Add an easy-to-use
     library API for each SHA variant, including HMAC support. Finally,
     reimplement the crypto_shash support on top of the library API.

   - Apply the same reorganization to the SHA-256 code (and also SHA-224
     which shares most of the SHA-256 code). This is a somewhat smaller
     change, due to my earlier work on SHA-256. But this brings in all
     the same additional improvements that I made for SHA-1 and SHA-512.

  There are also some smaller changes:

   - Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
     from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
     these algorithms it's just a move, not a full reorganization yet.

   - Fix the MIPS chacha-core.S to build with the clang assembler.

   - Fix the Poly1305 functions to work in all contexts.

   - Fix a performance regression in the x86_64 Poly1305 code.

   - Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.

  Note that since the new organization of the SHA code is much simpler,
  the diffstat of this pull request is negative, despite the addition of
  new fully-documented library APIs for multiple SHA and HMAC-SHA
  variants.

  These APIs will allow further simplifications across the kernel as
  users start using them instead of the old-school crypto API. (I've
  already written a lot of such conversion patches, removing over 1000
  more lines of code. But most of those will target 6.18 or later)"

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
  lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
  lib/crypto: x86/sha1-ni: Convert to use rounds macros
  lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
  crypto: sha1 - Remove sha1_base.h
  lib/crypto: x86/sha1: Migrate optimized code into library
  lib/crypto: sparc/sha1: Migrate optimized code into library
  lib/crypto: s390/sha1: Migrate optimized code into library
  lib/crypto: powerpc/sha1: Migrate optimized code into library
  lib/crypto: mips/sha1: Migrate optimized code into library
  lib/crypto: arm64/sha1: Migrate optimized code into library
  lib/crypto: arm/sha1: Migrate optimized code into library
  crypto: sha1 - Use same state format as legacy drivers
  crypto: sha1 - Wrap library and add HMAC support
  lib/crypto: sha1: Add HMAC support
  lib/crypto: sha1: Add SHA-1 library functions
  lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
  crypto: x86/sha1 - Rename conflicting symbol
  lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
  lib/crypto: arm/poly1305: Remove unneeded empty weak function
  lib/crypto: x86/poly1305: Fix performance regression on short messages
  ...
2025-07-28 17:58:52 -07:00
Linus Torvalds 8e736a2eea hardening updates for v6.17-rc1
- Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)
 
 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)
 
 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
 
 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)
 
 - Refactor and rename stackleak feature to support Clang
 
 - Add KUnit test for seq_buf API
 
 - Fix KUnit fortify test under LTO
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaIfUkgAKCRA2KwveOeQk
 uypLAP92r6f47sWcOw/5B9aVffX6Bypsb7dqBJQpCNxI5U1xcAEAiCrZ98UJyOeQ
 JQgnXd4N67K4EsS2JDc+FutRn3Yi+A8=
 =+5Bq
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)

 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)

 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)

 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)

 - Refactor and rename stackleak feature to support Clang

 - Add KUnit test for seq_buf API

 - Fix KUnit fortify test under LTO

* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  sched/task_stack: Add missing const qualifier to end_of_stack()
  kstack_erase: Support Clang stack depth tracking
  kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
  init.h: Disable sanitizer coverage for __init and __head
  kstack_erase: Disable kstack_erase for all of arm compressed boot code
  x86: Handle KCOV __init vs inline mismatches
  arm64: Handle KCOV __init vs inline mismatches
  s390: Handle KCOV __init vs inline mismatches
  arm: Handle KCOV __init vs inline mismatches
  mips: Handle KCOV __init vs inline mismatch
  powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
  configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
  configs/hardening: Enable CONFIG_KSTACK_ERASE
  stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
  stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
  stackleak: Rename STACKLEAK to KSTACK_ERASE
  seq_buf: Introduce KUnit tests
  string: Group str_has_prefix() and strstarts()
  kunit/fortify: Add back "volatile" for sizeof() constants
  acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
  ...
2025-07-28 17:16:12 -07:00
Linus Torvalds 6e11664f14 for-6.17/block-20250728
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiHdZ8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptRED/9o3dQ1QHL5yNM/AyCCGox0V4zra8qGS/Vc
 cBWpAVrmPGRw0IYlLZENtN9PdwKcbMzJq3l6cxeC7dBnAZP0AxTzP4YYJYUNVsqo
 WtJ3d/k5+cVp0OyOp4uabaqNeMeLoPk9/JXe1Ml2KxtDmHtj5yee0JRh7zlPZmZj
 tsrpIUTeHgAPn6yR1EI+0ybx/mjCb05Mv2Y8gF5hkUPA2PuON+MTFixJmqoy2ySh
 n+22mz/prqlyOSYh/VVv1+9jcQ94wMjcW0JIpg9lM3Kg8BCPU4IetvO1UiX6X33v
 154zEh2aJJDBx+yORS4BM4JMXjRZI7lYea2dkHM8Cajctu1Wpja9bNwnK9ibXvEc
 WtyBwztleLbAZef25fA/W87JE23fGa/r3nwIb2cF4QqkAFslCvhjA93WkOzNJCgQ
 qsWOrlCh3IK2NUu4b1Ncs3ZHOPvc51+zzjMzC6SUr54xhrxDK+gngDPhRy7XDqWJ
 DTMpIlr366o8GdJqnib0/e/CPBrThS6Vl6u0tgLnNbwdpK1svgo/uHW5ksKvDqHX
 kGEIhyRRJJC+4wyl4dsYKXa2twcyFrlWdAE+pZguEC2nZRYqYl9uXftOtvfp1x0y
 /skDX0FIDjvyjRqCLcqF03FSGqwCGS8WuWXZjPhVhcfz47NvbHeFDh1G/jMzsbpj
 S9zrPve/DQ==
 =e86T
 -----END PGP SIGNATURE-----

Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - MD pull request via Yu:
      - call del_gendisk synchronously (Xiao)
      - cleanup unused variable (John)
      - cleanup workqueue flags (Ryo)
      - fix faulty rdev can't be removed during resync (Qixing)

 - NVMe pull request via Christoph:
      - try PCIe function level reset on init failure (Keith Busch)
      - log TLS handshake failures at error level (Maurizio Lombardi)
      - pci-epf: do not complete commands twice if nvmet_req_init()
        fails (Rick Wertenbroek)
      - misc cleanups (Alok Tiwari)

 - Removal of the pktcdvd driver

   This has been more than a decade coming at this point, and some
   recently revealed breakages that had it causing issues even for cases
   where it isn't required made me re-pull the trigger on this one. It's
   known broken and nobody has stepped up to maintain the code

 - Series for ublk supporting batch commands, enabling the use of
   multishot where appropriate

 - Speed up ublk exit handling

 - Fix for the two-stage elevator fixing which could leak data

 - Convert NVMe to use the new IOVA based API

 - Increase default max transfer size to something more reasonable

 - Series fixing write operations on zoned DM devices

 - Add tracepoints for zoned block device operations

 - Prep series working towards improving blk-mq queue management in the
   presence of isolated CPUs

 - Don't allow updating of the block size of a loop device that is
   currently under exclusively ownership/open

 - Set chunk sectors from stacked device stripe size and use it for the
   atomic write size limit

 - Switch to folios in bcache read_super()

 - Fix for CD-ROM MRW exit flush handling

 - Various tweaks, fixes, and cleanups

* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
  block: restore two stage elevator switch while running nr_hw_queue update
  cdrom: Call cdrom_mrw_exit from cdrom_release function
  sunvdc: Balance device refcount in vdc_port_mpgroup_check
  nvme-pci: try function level reset on init failure
  dm: split write BIOs on zone boundaries when zone append is not emulated
  block: use chunk_sectors when evaluating stacked atomic write limits
  dm-stripe: limit chunk_sectors to the stripe size
  md/raid10: set chunk_sectors limit
  md/raid0: set chunk_sectors limit
  block: sanitize chunk_sectors for atomic write limits
  ilog2: add max_pow_of_two_factor()
  nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
  nvme-tcp: log TLS handshake failures at error level
  docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
  nvme: fix typo in status code constant for self-test in progress
  nvmet: remove redundant assignment of error code in nvmet_ns_enable()
  nvme: fix incorrect variable in io cqes error message
  nvme: fix multiple spelling and grammar issues in host drivers
  block: fix blk_zone_append_update_request_bio() kernel-doc
  md/raid10: fix set but not used variable in sync_request_write()
  ...
2025-07-28 16:43:54 -07:00
Linus Torvalds 7e7bc8335b vfs-6.17-rc1.bpf
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCjwAKCRCRxhvAZXjc
 osnVAQCv4rM7sF4yJvGlm1myIJcJy5Sabk2q31qMdI1VHmkcOwD+Mxs7d1aByTS8
 /6djhVleq6lcT2LpP9j8YI3Rb+x30QY=
 =PF3o
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs bpf updates from Christian Brauner:
 "These changes allow bpf to read extended attributes from cgroupfs.

  This is useful in redirecting AF_UNIX socket connections based on
  cgroup membership of the socket. One use-case is the ability to
  implement log namespaces in systemd so services and containers are
  redirected to different journals"

* tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/kernfs: test xattr retrieval
  selftests/bpf: Add tests for bpf_cgroup_read_xattr
  bpf: Mark cgroup_subsys_state->cgroup RCU safe
  bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
  kernfs: remove iattr_mutex
2025-07-28 14:42:31 -07:00
Linus Torvalds 672dcda246 vfs-6.17-rc1.pidfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCiQAKCRCRxhvAZXjc
 orltAQDq3y1anYETz5/FD6P2gXY1W5hXdSm3EHHeacQ1JjTXvgEA2g1lWO7J4anf
 oOVE8aSvMow/FOjivLZBYmI65pkYJAE=
 =oDKB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfs updates from Christian Brauner:

 - persistent info

   Persist exit and coredump information independent of whether anyone
   currently holds a pidfd for the struct pid.

   The current scheme allocated pidfs dentries on-demand repeatedly.
   This scheme is reaching it's limits as it makes it impossible to pin
   information that needs to be available after the task has exited or
   coredumped and that should not be lost simply because the pidfd got
   closed temporarily. The next opener should still see the stashed
   information.

   This is also a prerequisite for supporting extended attributes on
   pidfds to allow attaching meta information to them.

   If someone opens a pidfd for a struct pid a pidfs dentry is allocated
   and stashed in pid->stashed. Once the last pidfd for the struct pid
   is closed the pidfs dentry is released and removed from pid->stashed.

   So if 10 callers create a pidfs dentry for the same struct pid
   sequentially, i.e., each closing the pidfd before the other creates a
   new one then a new pidfs dentry is allocated every time.

   Because multiple tasks acquiring and releasing a pidfd for the same
   struct pid can race with each another a task may still find a valid
   pidfs entry from the previous task in pid->stashed and reuse it. Or
   it might find a dead dentry in there and fail to reuse it and so
   stashes a new pidfs dentry. Multiple tasks may race to stash a new
   pidfs dentry but only one will succeed, the other ones will put their
   dentry.

   The current scheme aims to ensure that a pidfs dentry for a struct
   pid can only be created if the task is still alive or if a pidfs
   dentry already existed before the task was reaped and so exit
   information has been was stashed in the pidfs inode.

   That's great except that it's buggy. If a pidfs dentry is stashed in
   pid->stashed after pidfs_exit() but before __unhash_process() is
   called we will return a pidfd for a reaped task without exit
   information being available.

   The pidfds_pid_valid() check does not guard against this race as it
   doens't sync at all with pidfs_exit(). The pid_has_task() check might
   be successful simply because we're before __unhash_process() but
   after pidfs_exit().

   Introduce a new scheme where the lifetime of information associated
   with a pidfs entry (coredump and exit information) isn't bound to the
   lifetime of the pidfs inode but the struct pid itself.

   The first time a pidfs dentry is allocated for a struct pid a struct
   pidfs_attr will be allocated which will be used to store exit and
   coredump information.

   If all pidfs for the pidfs dentry are closed the dentry and inode can
   be cleaned up but the struct pidfs_attr will stick until the struct
   pid itself is freed. This will ensure minimal memory usage while
   persisting relevant information.

   The new scheme has various advantages. First, it allows to close the
   race where we end up handing out a pidfd for a reaped task for which
   no exit information is available. Second, it minimizes memory usage.
   Third, it allows to remove complex lifetime tracking via dentries
   when registering a struct pid with pidfs. There's no need to get or
   put a reference. Instead, the lifetime of exit and coredump
   information associated with a struct pid is bound to the lifetime of
   struct pid itself.

 - extended attributes

   Now that we have a way to persist information for pidfs dentries we
   can start supporting extended attributes on pidfds. This will allow
   userspace to attach meta information to tasks.

   One natural extension would be to introduce a custom pidfs.* extended
   attribute space and allow for the inheritance of extended attributes
   across fork() and exec().

   The first simple scheme will allow privileged userspace to set
   trusted extended attributes on pidfs inodes.

 - Allow autonomous pidfs file handles

   Various filesystems such as pidfs and drm support opening file
   handles without having to require a file descriptor to identify the
   filesystem. The filesystem are global single instances and can be
   trivially identified solely on the information encoded in the file
   handle.

   This makes it possible to not have to keep or acquire a sentinal file
   descriptor just to pass it to open_by_handle_at() to identify the
   filesystem. That's especially useful when such sentinel file
   descriptor cannot or should not be acquired.

   For pidfs this means a file handle can function as full replacement
   for storing a pid in a file. Instead a file handle can be stored and
   reopened purely based on the file handle.

   Such autonomous file handles can be opened with or without specifying
   a a file descriptor. If no proper file descriptor is used the
   FD_PIDFS_ROOT sentinel must be passed. This allows us to define
   further special negative fd sentinels in the future.

   Userspace can trivially test for support by trying to open the file
   handle with an invalid file descriptor.

 - Allow pidfds for reaped tasks with SCM_PIDFD messages

   This is a logical continuation of the earlier work to create pidfds
   for reaped tasks through the SO_PEERPIDFD socket option merged in
   923ea4d448 ("Merge patch series "net, pidfs: enable handing out
   pidfds for reaped sk->sk_peer_pid"").

 - Two minor fixes:

    * Fold fs_struct->{lock,seq} into a seqlock

    * Don't bother with path_{get,put}() in unix_open_file()

* tag 'vfs-6.17-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (37 commits)
  don't bother with path_get()/path_put() in unix_open_file()
  fold fs_struct->{lock,seq} into a seqlock
  selftests: net: extend SCM_PIDFD test to cover stale pidfds
  af_unix: enable handing out pidfds for reaped tasks in SCM_PIDFD
  af_unix: stash pidfs dentry when needed
  af_unix/scm: fix whitespace errors
  af_unix: introduce and use scm_replace_pid() helper
  af_unix: introduce unix_skb_to_scm helper
  af_unix: rework unix_maybe_add_creds() to allow sleep
  selftests/pidfd: decode pidfd file handles withou having to specify an fd
  fhandle, pidfs: support open_by_handle_at() purely based on file handle
  uapi/fcntl: add FD_PIDFS_ROOT
  uapi/fcntl: add FD_INVALID
  fcntl/pidfd: redefine PIDFD_SELF_THREAD_GROUP
  uapi/fcntl: mark range as reserved
  fhandle: reflow get_path_anchor()
  pidfs: add pidfs_root_path() helper
  fhandle: rename to get_path_anchor()
  fhandle: hoist copy_from_user() above get_path_from_fd()
  fhandle: raise FILEID_IS_DIR in handle_type
  ...
2025-07-28 14:10:15 -07:00
Linus Torvalds 7031769e10 vfs-6.17-rc1.mmap_prepare
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCgQAKCRCRxhvAZXjc
 os+nAP9LFHUwWO6EBzHJJGEVjJvvzsbzqeYrRFamYiMc5ulPJwD+KW4RIgJa/MWO
 pcYE40CacaekD8rFWwYUyszpgmv6ewc=
 =wCwp
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull mmap_prepare updates from Christian Brauner:
 "Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b ("mm:
  introduce new .mmap_prepare() file callback").

  This is preferred to the existing f_op->mmap() hook as it does require
  a VMA to be established yet, thus allowing the mmap logic to invoke
  this hook far, far earlier, prior to inserting a VMA into the virtual
  address space, or performing any other heavy handed operations.

  This allows for much simpler unwinding on error, and for there to be a
  single attempt at merging a VMA rather than having to possibly
  reattempt a merge based on potentially altered VMA state.

  Far more importantly, it prevents inappropriate manipulation of
  incompletely initialised VMA state, which is something that has been
  the cause of bugs and complexity in the past.

  The intent is to gradually deprecate f_op->mmap, and in that vein this
  series coverts the majority of file systems to using f_op->mmap_prepare.

  Prerequisite steps are taken - firstly ensuring all checks for mmap
  capabilities use the file_has_valid_mmap_hooks() helper rather than
  directly checking for f_op->mmap (which is now not a valid check) and
  secondly updating daxdev_mapping_supported() to not require a VMA
  parameter to allow ext4 and xfs to be converted.

  Commit bb666b7c27 ("mm: add mmap_prepare() compatibility layer for
  nested file systems") handles the nasty edge-case of nested file
  systems like overlayfs, which introduces a compatibility shim to allow
  f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.

  This allows for nested filesystems to continue to function correctly
  with all file systems regardless of which callback is used. Once we
  finally convert all file systems, this shim can be removed.

  As a result, ecryptfs, fuse, and overlayfs remain unaltered so they
  can nest all other file systems.

  We additionally do not update resctl - as this requires an update to
  remap_pfn_range() (or an alternative to it) which we defer to a later
  series, equally we do not update cramfs which needs a mixed mapping
  insertion with the same issue, nor do we update procfs, hugetlbfs,
  syfs or kernfs all of which require VMAs for internal state and hooks.
  We shall return to all of these later"

* tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  doc: update porting, vfs documentation to describe mmap_prepare()
  fs: replace mmap hook with .mmap_prepare for simple mappings
  fs: convert most other generic_file_*mmap() users to .mmap_prepare()
  fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
  mm/filemap: introduce generic_file_*_mmap_prepare() helpers
  fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
  fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
  fs/dax: make it possible to check dev dax support without a VMA
  fs: consistently use can_mmap_file() helper
  mm/nommu: use file_has_valid_mmap_hooks() helper
  mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
2025-07-28 13:43:25 -07:00
Linus Torvalds 117eab5c6e vfs-6.17-rc1.coredump
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINAYAAKCRCRxhvAZXjc
 opJiAQDXGs+gQcxJ+4BpV4QszT2OJC19oI/f5AQ4PWMJdHgr4AEA7fc6NbBrpmW7
 L/tbdAwIiWp8bL1Q8Wy7Q2qldHtcggM=
 =KbD9
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull coredump updates from Christian Brauner:
 "This contains an extension to the coredump socket and a proper rework
  of the coredump code.

   - This extends the coredump socket to allow the coredump server to
     tell the kernel how to process individual coredumps. This allows
     for fine-grained coredump management. Userspace can decide to just
     let the kernel write out the coredump, or generate the coredump
     itself, or just reject it.

     * COREDUMP_KERNEL
       The kernel will write the coredump data to the socket.

     * COREDUMP_USERSPACE
       The kernel will not write coredump data but will indicate to the
       parent that a coredump has been generated. This is used when
       userspace generates its own coredumps.

     * COREDUMP_REJECT
       The kernel will skip generating a coredump for this task.

     * COREDUMP_WAIT
       The kernel will prevent the task from exiting until the coredump
       server has shutdown the socket connection.

     The flexible coredump socket can be enabled by using the "@@"
     prefix instead of the single "@" prefix for the regular coredump
     socket:

       @@/run/systemd/coredump.socket

   - Cleanup the coredump code properly while we have to touch it
     anyway.

     Split out each coredump mode in a separate helper so it's easy to
     grasp what is going on and make the code easier to follow. The core
     coredump function should now be very trivial to follow"

* tag 'vfs-6.17-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits)
  cleanup: add a scoped version of CLASS()
  coredump: add coredump_skip() helper
  coredump: avoid pointless variable
  coredump: order auto cleanup variables at the top
  coredump: add coredump_cleanup()
  coredump: auto cleanup prepare_creds()
  cred: add auto cleanup method
  coredump: directly return
  coredump: auto cleanup argv
  coredump: add coredump_write()
  coredump: use a single helper for the socket
  coredump: move pipe specific file check into coredump_pipe()
  coredump: split pipe coredumping into coredump_pipe()
  coredump: move core_pipe_count to global variable
  coredump: prepare to simplify exit paths
  coredump: split file coredumping into coredump_file()
  coredump: rename do_coredump() to vfs_coredump()
  selftests/coredump: make sure invalid paths are rejected
  coredump: validate socket path in coredump_parse()
  coredump: don't allow ".." in coredump socket path
  ...
2025-07-28 11:50:36 -07:00
Paul Chaignon 5dbb19b16a bpf: Add third round of bounds deduction
Commit d7f0087381 ("bpf: try harder to deduce register bounds from
different numeric domains") added a second call to __reg_deduce_bounds
in reg_bounds_sync because a single call wasn't enough to converge to a
fixed point in terms of register bounds.

With patch "bpf: Improve bounds when s64 crosses sign boundary" from
this series, Eduard noticed that calling __reg_deduce_bounds twice isn't
enough anymore to converge. The first selftest added in "selftests/bpf:
Test cross-sign 64bits range refinement" highlights the need for a third
call to __reg_deduce_bounds. After instruction 7, reg_bounds_sync
performs the following bounds deduction:

  reg_bounds_sync entry:          scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
  __update_reg_bounds:            scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
  __reg_deduce_bounds:
      __reg32_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg64_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg_deduce_mixed_bounds:  scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
  __reg_deduce_bounds:
      __reg32_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
      __reg64_deduce_bounds:      scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg_deduce_mixed_bounds:  scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
  __reg_bound_offset:             scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))
  __update_reg_bounds:            scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))

In particular, notice how:
1. In the first call to __reg_deduce_bounds, __reg32_deduce_bounds
   learns new u32 bounds.
2. __reg64_deduce_bounds is unable to improve bounds at this point.
3. __reg_deduce_mixed_bounds derives new u64 bounds from the u32 bounds.
4. In the second call to __reg_deduce_bounds, __reg64_deduce_bounds
   improves the smax and umin bounds thanks to patch "bpf: Improve
   bounds when s64 crosses sign boundary" from this series.
5. Subsequent functions are unable to improve the ranges further (only
   tnums). Yet, a better smin32 bound could be learned from the smin
   bound.

__reg32_deduce_bounds is able to improve smin32 from smin, but for that
we need a third call to __reg_deduce_bounds.

As discussed in [1], there may be a better way to organize the deduction
rules to learn the same information with less calls to the same
functions. Such an optimization requires further analysis and is
orthogonal to the present patchset.

Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Co-developed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/79619d3b42e5525e0e174ed534b75879a5ba15de.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:13 -07:00
Paul Chaignon f96841bbf4 selftests/bpf: Test invariants on JSLT crossing sign
The improvement of the u64/s64 range refinement fixed the invariant
violation that was happening on this test for BPF_JSLT when crossing the
sign boundary.

After this patch, we have one test remaining with a known invariant
violation. It's the same test as fixed here but for 32 bits ranges.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/ad046fb0016428f1a33c3b81617aabf31b51183f.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:13 -07:00
Paul Chaignon 26e5e346a5 selftests/bpf: Test cross-sign 64bits range refinement
This patch adds coverage for the new cross-sign 64bits range refinement
logic. The three tests cover the cases when the u64 and s64 ranges
overlap (1) in the negative portion of s64, (2) in the positive portion
of s64, and (3) in both portions.

The first test is a simplified version of a BPF program generated by
syzkaller that caused an invariant violation [1]. It looks like
syzkaller could not extract the reproducer itself (and therefore didn't
report it to the mailing list), but I was able to extract it from the
console logs of a crash.

The principle is similar to the invariant violation described in
commit 6279846b9b ("bpf: Forget ranges when refining tnum after
JSET"): the verifier walks a dead branch, uses the condition to refine
ranges, and ends up with inconsistent ranges. In this case, the dead
branch is when we fallthrough on both jumps. The new refinement logic
improves the bounds such that the second jump is properly detected as
always-taken and the verifier doesn't end up walking a dead branch.

The second and third tests are inspired by the first, but rely on
condition jumps to prepare the bounds instead of ALU instructions. An
R10 write is used to trigger a verifier error when the bounds can't be
refined.

Link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/a0e17b00dab8dabcfa6f8384e7e151186efedfdd.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:12 -07:00
Paul Chaignon da653de268 selftests/bpf: Update reg_bound range refinement logic
This patch updates the range refinement logic in the reg_bound test to
match the new logic from the previous commit. Without this change, tests
would fail because we end with more precise ranges than the tests
expect.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/b7f6b1fbe03373cca4e1bb6a113035a6cd2b3ff7.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:12 -07:00
Oliver Upton 18ec25dd0e KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
VDISR_EL2 and VSESR_EL2 are now visible to userspace for nested VMs. Add
them to get-reg-list.

Link: https://lore.kernel.org/r/20250728152603.2823699-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:28:05 -07:00
Oliver Upton 0d46e324c0 Merge branch 'kvm-arm64/vgic-v4-ctl' into kvmarm/next
* kvm-arm64/vgic-v4-ctl:
  : Userspace control of nASSGIcap, courtesy of Raghavendra Rao Ananta
  :
  : Allow userspace to decide if support for SGIs without an active state is
  : advertised to the guest, allowing VMs from GICv3-only hardware to be
  : migrated to to GICv4.1 capable machines.
  Documentation: KVM: arm64: Describe VGICv3 registers writable pre-init
  KVM: arm64: selftests: Add test for nASSGIcap attribute
  KVM: arm64: vgic-v3: Allow userspace to write GICD_TYPER2.nASSGIcap
  KVM: arm64: vgic-v3: Allow access to GICD_IIDR prior to initialization
  KVM: arm64: vgic-v3: Consolidate MAINT_IRQ handling
  KVM: arm64: Disambiguate support for vSGIs v. vLPIs

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:11:38 -07:00
Oliver Upton a7f49a9bf4 Merge branch 'kvm-arm64/el2-reg-visibility' into kvmarm/next
* kvm-arm64/el2-reg-visibility:
  : Fixes to EL2 register visibility, courtesy of Marc Zyngier
  :
  :  - Expose EL2 VGICv3 registers via the VGIC attributes accessor, not the
  :    KVM_{GET,SET}_ONE_REG ioctls
  :
  :  - Condition visibility of FGT registers on the presence of FEAT_FGT in
  :    the VM
  KVM: arm64: selftest: vgic-v3: Add basic GICv3 sysreg userspace access test
  KVM: arm64: Enforce the sorting of the GICv3 system register table
  KVM: arm64: Clarify the check for reset callback in check_sysreg_table()
  KVM: arm64: vgic-v3: Fix ordering of ICH_HCR_EL2
  KVM: arm64: Document registers exposed via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: selftests: get-reg-list: Add base EL2 registers
  KVM: arm64: selftests: get-reg-list: Simplify feature dependency
  KVM: arm64: Advertise FGT2 registers to userspace
  KVM: arm64: Condition FGT registers on feature availability
  KVM: arm64: Expose GICv3 EL2 registers via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: Let GICv3 save/restore honor visibility attribute
  KVM: arm64: Define helper for ICH_VTR_EL2
  KVM: arm64: Define constant value for ICC_SRE_EL2
  KVM: arm64: Don't advertise ICH_*_EL2 registers through GET_ONE_REG
  KVM: arm64: Make RVBAR_EL2 accesses UNDEF

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:06:38 -07:00
Puranjay Mohan cf2a6de32c powerpc64/bpf: Add jit support for load_acquire and store_release
Add JIT support for the load_acquire and store_release instructions. The
implementation is similar to the kernel where:

        load_acquire  => plain load -> lwsync
        store_release => lwsync -> plain store

To test the correctness of the implementation, following selftests were
run:

  [fedora@linux-kernel bpf]$ sudo ./test_progs -a \
  verifier_load_acquire,verifier_store_release,atomics
  #11/1    atomics/add:OK
  #11/2    atomics/sub:OK
  #11/3    atomics/and:OK
  #11/4    atomics/or:OK
  #11/5    atomics/xor:OK
  #11/6    atomics/cmpxchg:OK
  #11/7    atomics/xchg:OK
  #11      atomics:OK
  #519/1   verifier_load_acquire/load-acquire, 8-bit:OK
  #519/2   verifier_load_acquire/load-acquire, 8-bit @unpriv:OK
  #519/3   verifier_load_acquire/load-acquire, 16-bit:OK
  #519/4   verifier_load_acquire/load-acquire, 16-bit @unpriv:OK
  #519/5   verifier_load_acquire/load-acquire, 32-bit:OK
  #519/6   verifier_load_acquire/load-acquire, 32-bit @unpriv:OK
  #519/7   verifier_load_acquire/load-acquire, 64-bit:OK
  #519/8   verifier_load_acquire/load-acquire, 64-bit @unpriv:OK
  #519/9   verifier_load_acquire/load-acquire with uninitialized
  src_reg:OK
  #519/10  verifier_load_acquire/load-acquire with uninitialized src_reg
  @unpriv:OK
  #519/11  verifier_load_acquire/load-acquire with non-pointer src_reg:OK
  #519/12  verifier_load_acquire/load-acquire with non-pointer src_reg
  @unpriv:OK
  #519/13  verifier_load_acquire/misaligned load-acquire:OK
  #519/14  verifier_load_acquire/misaligned load-acquire @unpriv:OK
  #519/15  verifier_load_acquire/load-acquire from ctx pointer:OK
  #519/16  verifier_load_acquire/load-acquire from ctx pointer @unpriv:OK
  #519/17  verifier_load_acquire/load-acquire with invalid register R15:OK
  #519/18  verifier_load_acquire/load-acquire with invalid register R15
  @unpriv:OK
  #519/19  verifier_load_acquire/load-acquire from pkt pointer:OK
  #519/20  verifier_load_acquire/load-acquire from flow_keys pointer:OK
  #519/21  verifier_load_acquire/load-acquire from sock pointer:OK
  #519     verifier_load_acquire:OK
  #556/1   verifier_store_release/store-release, 8-bit:OK
  #556/2   verifier_store_release/store-release, 8-bit @unpriv:OK
  #556/3   verifier_store_release/store-release, 16-bit:OK
  #556/4   verifier_store_release/store-release, 16-bit @unpriv:OK
  #556/5   verifier_store_release/store-release, 32-bit:OK
  #556/6   verifier_store_release/store-release, 32-bit @unpriv:OK
  #556/7   verifier_store_release/store-release, 64-bit:OK
  #556/8   verifier_store_release/store-release, 64-bit @unpriv:OK
  #556/9   verifier_store_release/store-release with uninitialized
  src_reg:OK
  #556/10  verifier_store_release/store-release with uninitialized src_reg
  @unpriv:OK
  #556/11  verifier_store_release/store-release with uninitialized
  dst_reg:OK
  #556/12  verifier_store_release/store-release with uninitialized dst_reg
  @unpriv:OK
  #556/13  verifier_store_release/store-release with non-pointer
  dst_reg:OK
  #556/14  verifier_store_release/store-release with non-pointer dst_reg
  @unpriv:OK
  #556/15  verifier_store_release/misaligned store-release:OK
  #556/16  verifier_store_release/misaligned store-release @unpriv:OK
  #556/17  verifier_store_release/store-release to ctx pointer:OK
  #556/18  verifier_store_release/store-release to ctx pointer @unpriv:OK
  #556/19  verifier_store_release/store-release, leak pointer to stack:OK
  #556/20  verifier_store_release/store-release, leak pointer to stack
  @unpriv:OK
  #556/21  verifier_store_release/store-release, leak pointer to map:OK
  #556/22  verifier_store_release/store-release, leak pointer to map
  @unpriv:OK
  #556/23  verifier_store_release/store-release with invalid register
  R15:OK
  #556/24  verifier_store_release/store-release with invalid register R15
  @unpriv:OK
  #556/25  verifier_store_release/store-release to pkt pointer:OK
  #556/26  verifier_store_release/store-release to flow_keys pointer:OK
  #556/27  verifier_store_release/store-release to sock pointer:OK
  #556     verifier_store_release:OK
  Summary: 3/55 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250717202935.29018-2-puranjay@kernel.org
2025-07-28 08:13:35 +05:30
Enze Li 511914506d selftests/damon: introduce _common.sh to host shared function
The current test scripts contain duplicated root permission checks in
multiple locations.  This patch consolidates these checks into _common.sh
to eliminate code redundancy.

Link: https://lkml.kernel.org/r/20250718064217.299300-1-lienze@kylinos.cn
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park da5973a0b8 selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
sysfs.py is testing if non-default additional parameters can be committed.
Add a test case for further reducing the parameters to the default set.

Link: https://lkml.kernel.org/r/20250720171652.92309-23-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park 62b7b1ffa2 selftests/damon/sysfs.py: test non-default parameters runtime commit
sysfs.py is testing only the default and minimum DAMON parameters.  Add
another test case for more non-default additional DAMON parameters
commitment on runtime.

Link: https://lkml.kernel.org/r/20250720171652.92309-22-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park 16797a55aa selftests/damon/sysfs.py: generalize DAMON context commit assertion
DAMON context commitment assertion is hard-coded for a specific test case.
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-21-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park a4027b5f24 selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
DAMON monitoring attributes commitment assertion is hard-coded for a
specific test case.  Split it out into a general version that can be
reused for different test cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-20-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park 771d7754ab selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
DAMOS schemes commitment assertion is hard-coded for a specific test case.
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-19-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park 53f800581f selftests/damon/sysfs.py: test DAMOS filters commitment
Current DAMOS scheme commitment assertion is not testing DAMOS filters. 
Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-18-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park f22ff7b5a5 selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
DAMOS scheme commitment assertion is hard-coded for a specific test case. 
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-17-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park bd0487a774 selftests/damon/sysfs.py: test DAMOS destinations commitment
Current DAMOS commitment assertion is not testing quota destinations
commitment.  Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-16-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park 84dc442bd5 selftests/damon/sysfs.py: test quota goal commitment
Current DAMOS quota commitment assertion is not testing quota goal
commitment.  Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-15-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park f797e709f7 selftests/damon/sysfs.py: generalize DamosQuota commit assertion
DamosQuota commitment assertion is hard-coded for a specific test case. 
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park b50c48de61 selftests/damon/sysfs.py: generalize DAMOS Watermarks commit assertion
DamosWatermarks commitment assertion is hard-coded for a specific test
case.  Split it out into a general version that can be reused for
different test cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-13-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park a1d52cd030 selftests/damon/drgn_dump_damon_status: dump DAMOS filters
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add damos filters dumping for more
tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park eb413daaf2 selftests/damon/drgn_dump_damon_status: dump ctx->ops.id
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add ctx->ops.id dumping for more
tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park c1a6958957 selftests/damon/drgn_dump_damon_status: dump damos->migrate_dests
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add damos->migrate_dests dumping for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park 80d4e38107 selftests/damon/_damon_sysfs: use 2**32 - 1 as max nr_accesses and age
nr_accesses and age are unsigned int.  Use the proper max value.

Link: https://lkml.kernel.org/r/20250720171652.92309-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park 86e541f0be selftests/damon/_damon_sysfs: support DAMOS target_nid setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS action destination target_nid setup for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00