mirror-linux

Commit Graph

Author	SHA1	Message	Date
Yi Liu	6d9500bb1f	iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO IOMMU_HW_INFO is extended to report max_pasid_log2, hence add coverage for it. Link: https://patch.msgid.link/r/20250321180143.8468-6-yi.l.liu@intel.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-28 10:07:23 -03:00
Linus Torvalds	7b667acd69	powerpc updates for 6.15 - Removal of support for IBM Cell Blades - SMP support for microwatt platform - Support for inline static calls on PPC32 - Enable pmu selftests for power11 platform - Enable hardware trace macro (HTM) hcall support - Support for limited address mode capability - Changes to RMA size from 512 MB to 768 MB to handle fadump - Misc fixes and cleanups Thanks to: Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann, Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav Jain, Venkat Rao Bagalkote. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmfjZnYACgkQpnEsdPSH ZJTmKg/+NGW7wyFY9d8Iai9ncYY7GSzsMSDTaan7qg0QWOd5gjHbsdbava7TM/DW 8p9XsC+17kSeftRNUjtc52bSN8Ei2gBdsXIagQG1alfB2X2e6wkNauifK+dz3Su6 usMEZZTO5R/jFDotFXNM1nsUj+8dvjnPgOUrji/P8k7PT5295wpza0hz1fy5SrOA hM5cliBP36UgFe5Efvgm4OUX2gQIhbc3stt9MVfymW/k0Mit5f41UIPuVGiTWowY s0cUJGkhxUlGXT3VfOVKuZfn4u9KMha7UCl9afSceJzXOdnUIKIbskui1VEv6cD/ iSIxi839uErAobFHlsLYprgYFciYLII3xe2qNZCA/ZxeIMS/Mm6xokESeWLhBnfa P7ke6l0z3GDtTvgI2eSeU9BdrVveF1NgbP9GYSKgT6gtw/kRRnxgHF8tzmLON5PT KXpQlzz8VuSBRtF2jnLFU89+FFwSA1bRUhDrp89HyYFqw1B5g4N7kFFTUJWHOuKS fwPGy+cveKehmCUBedeTRFqHvvqdwpD/WnPlQzCly3WxqdL8U/eTXYftMiAwuK28 ovLuSs3vRThKRQ8DnUa5oB0UGsjMpRV5LdvYkhw+x8mZKUR59oj4fx2ae4TtPakg dbAYuPPkCORdaSga/nV6vQgsLprFpcGX3dq6E19+BVBAY5D+1PE= =GFcj -----END PGP SIGNATURE----- Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - Remove support for IBM Cell Blades - SMP support for microwatt platform - Support for inline static calls on PPC32 - Enable pmu selftests for power11 platform - Enable hardware trace macro (HTM) hcall support - Support for limited address mode capability - Changes to RMA size from 512 MB to 768 MB to handle fadump - Misc fixes and cleanups Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann, Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav Jain, and Venkat Rao Bagalkote. * tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits) powerpc/kexec: fix physical address calculation in clear_utlb_entry() crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD powerpc: Fix 'intra_function_call not a direct call' warning powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu' KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 powerpc/microwatt: Add SMP support powerpc: Define config option for processors with broadcast TLBIE powerpc/microwatt: Define an idle power-save function powerpc/microwatt: Device-tree updates powerpc/microwatt: Select COMMON_CLK in order to get the clock framework net: toshiba: Remove reference to PPC_IBM_CELL_BLADE net: spider_net: Remove powerpc Cell driver cpufreq: ppc_cbe: Remove powerpc Cell driver genirq: Remove IRQ_EDGE_EOI_HANDLER docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR powerpc: Remove UDBG_RTAS_CONSOLE powerpc/io: Use standard barrier macros in io.c powerpc/io: Rename _insw_ns() etc. powerpc/io: Use generic raw accessors ...	2025-03-27 19:39:08 -07:00
Linus Torvalds	a7e135fe59	Probes updates for v6.15: - probe-events: Add comments about entry data storing code to clarify where and how the entry data is stored for function return events. - probe-events: Log error for exceeding the number of arguments to help user to identify error reason via tracefs/error_log file. - selftests/ftrace: Improve the ftracetest to add followngs. . Expand the tprobe event test to check if it can correctly find the wrong format tracepoint name. . Add new syntax error test to check whether error_log correctly indicates a wrong character in the tracepoint name. . Add a new dynamic events argument limitation test case which checks max number of probe arguments. -----BEGIN PGP SIGNATURE----- iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmflTlobHG1hc2FtaS5o aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8ba9UIALzzZQxzUhJYO/B/XaCz KTuSrU2594cLr3nCrmCfL3UHUCr5IcjXKCfUdrgzE9mckWF+nVRXWwbp29KpOQky fzU9Ardbr7ksGAYFk4My+P/BeYa7vh9LwofXzWlJibANVxWvq66+GfKnWnh1P8Bl /zjov61DAEQmDfXNGZ2oTmKnYYMoPkJU4voRvAEUgiP02SrF04UUa00uC7hmZJJV aI5b6TatE5FDEmAoHWh0cf04HoyevRyLVOQd5GSswH/oWpyhs90M7a9WENHamBx6 RXEZ3xYyuH9zBSvYVeRn2H1eHnGCR4RMctZiRGXF8EdLWo7RXRQ1Hp9CgvwDrU8Z IL0= =vZ3e -----END PGP SIGNATURE----- Merge tag 'probes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - probe-events: Add comments about entry data storing code to clarify where and how the entry data is stored for function return events. - probe-events: Log error for exceeding the number of arguments to help user to identify error reason via tracefs/error_log file. - Improve the ftracetest selftests: - Expand the tprobe event test to check if it can correctly find the wrong format tracepoint name. - Add new syntax error test to check whether error_log correctly indicates a wrong character in the tracepoint name. - Add a new dynamic events argument limitation test case which checks max number of probe arguments. * tag 'probes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: probe-events: Add comments about entry data storing code selftests/ftrace: Add dynamic events argument limitation test case selftests/ftrace: Add new syntax error test selftests/ftrace: Expand the tprobe event test to check wrong format tracing: probe-events: Log error for exceeding the number of arguments	2025-03-27 19:31:34 -07:00
Linus Torvalds	dcf9f31c62	Livepatching changes for 6.15 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmflMGMACgkQUqAMR0iA lPKHhg/8DjxkQ3tvUpW53pfABBdKELJWOl4hH+IMvGcdh2dZrcFf6OkEB4JeXXVn duSODaefalrdU0jV21LChYpEkG40wZqStmamH9im+VbyCc8JwMwvWckrw45RNmjO AC4oVh0K9SJcrTj7XWsFSeS38KmQKhtp2zTHpSqCW9FZkfn8si11E8Xzue2kTtEH 7CESq6N0SMahVfiNkLI1n0x0mUyWiTE43hQ3e+tmmBgtmv3gU4FpQqeTGc/MQUxy B6AaI1vChovLmfNhw4KM8GvyPCTIn4xRQ09WXZRbK2ZzAxgFfm7QXGL3E93ra929 juSoIWnNwzw81RB3apzclOHFTbctHJWL+85KEefqCjrCuEbaoLNGoLkVqNuj0U+D LmLqmm6NfEtPzDVBY232KepjFUoIb890eKRF3Wq1NnHonaVxmUSMEMRSdlKeHpSo uImBhr5XhXDJLHt4+lAa4m0JSm3RH2AK83xIrdJq/YUcKscCzJMHLqqsp8pCa7mb L7iOhHFKs2r6GJCt5kLEw1jkNEg5Lpk08fY5AekjScvNcxXWXF8E8Jl9DnBBoz+5 e15mvqsiQtcAGHyR1Ohmw8UPYnWY2fSZTmru9aClwHCvC0q+cyCxIZgcqLxFYFIr TKIgQ6zSepA+j7SzI1SaDY4Z9cNWMTU+9GqrNO3xO4ddpKrp5C8= =ggzJ -----END PGP SIGNATURE----- Merge tag 'livepatching-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Petr Mladek: - Add a selftest for tracing of a livepatched function - Skip a selftest when kprobes are not using ftrace - Some documentation clean up * tag 'livepatching-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests: livepatch: test if ftrace can trace a livepatched function selftests: livepatch: add new ftrace helpers functions selftest/livepatch: Only run test-kprobe with CONFIG_KPROBES_ON_FTRACE docs: livepatch: move text out of code block livepatch: Add comment to clarify klp_add_nops()	2025-03-27 19:26:10 -07:00
Linus Torvalds	a10c7949ad	linux_kselftest-kunit-6.15-rc1 kunit tool: - Changes to kunit tool to use qboot on QEMU x86_64, and build GDB scripts. - Fixes kunit tool bug in parsing test plan. - Adds test to kunit tool to check parsing late test plan. kunit: - Clarifies kunit_skip() argument name. - Adds Kunit check for the longest symbol length. - Changes qemu_configs for sparc to use Zilog console. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmfkvDYACgkQCwJExA0N Qxwljg//ZRoF/Jncvlb0vapnOIYywHbJEPRVTKfNurRjhb7stAX7CpLKXing4Gtq ewy3UXRaAZKg1BvugDYWoUsDDD5o7jx6y9rOMOWM+aAHPzYgxY6gbIzyUVolNZg/ 50/ANMhT0bvME8KBB2k2l6p1NAblzOpH3zH35CCDL/40eVodwMPrhq0V5AqccOaE C5Bn+tDiviS6Icw+b/mVUw8fvmoJSTSKvdjaSeRAqThJN3KtqBVyX383++A1zNqy Y6tItu9wG06FDjuQ1miOlSMwhgMEYK4TS4GwbX4PUucR8ETaZNUXVviMRou7vMEa GGOdtsBG3CBgFNtO2VK1qJLWbJesw2G9+w2oIZ2KQKtyfoF7nDMj+DBO2QD/T+GB u2g/xlSDJ5PTzZBMVKENDMy+C9Q+ux8Y2PsQ0fTCdpYgadytKYBFA23EAiZaMdKa d1AweNvFS5gi8WkpS8SyMjs0D5pZnKMgHQqOIfRFjCi0HXsGE9RJfkOjLOzRnaOc zldLAgDcrhtdG8Xin08bux5UuCoqg/e/RJiXF+xQLLJkE7cltN/CuWMrHX4kija+ 8xmJtj4Oe0p7JCwnIaXjLAQDuFfxHYHM9wM0nKm+YpVJLPSWqSXk4+xtQEOlvZhN DJW61ez+pYVCmXuIZ/bgeRzpwXJMfALmI3kn+UtCYwqdTt6Xhp8= =h8xS -----END PGP SIGNATURE----- Merge tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: "kunit tool: - Changes to kunit tool to use qboot on QEMU x86_64, and build GDB scripts - Fixes kunit tool bug in parsing test plan - Adds test to kunit tool to check parsing late test plan kunit: - Clarifies kunit_skip() argument name - Adds Kunit check for the longest symbol length - Changes qemu_configs for sparc to use Zilog console" * tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: tool: add test to check parsing late test plan kunit: tool: Fix bug in parsing test plan Kunit to check the longest symbol length kunit: Clarify kunit_skip() argument name kunit: tool: Build GDB scripts kunit: qemu_configs: sparc: use Zilog console kunit: tool: Use qboot on QEMU x86_64	2025-03-27 19:06:07 -07:00
Linus Torvalds	8e324a5c98	linux_kselftest-next-6.15-rc1 Fixes bugs and cleans up code in tracing, ftrace, and user_events tests. Adds missing executables to ftrace gitignore. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmfkreMACgkQCwJExA0N QxxgKBAAh8oO0zPvbiVDYcOqQBJ4qBpCLr3YWHM92v/nbUbj5teyrPJLGhTstYHy khWWzOQBN+Kz91w355CLqiYZPYTTDEVaZDvxt43cBnU3yc8O72wBgo6adzQA1gzP 83TGB6B4CVT/6loHGtG40mHA0FHbB0imoCH39+5d9Nb63rOPUriyA5xWvACh4Dnl AOMOmtHNAxUWAaTljMrNI5ZiYVsGSIwjwXPYbMg3WUI8qMqKsFWPNScrsCHlQoMX EZOwPINwHVYO5IDKWdzwAAuGFWhesmydmnAI+6Ji1xBkCgG/YIIqmu/p0ru8vI/q 96GqxeuE9tgKZgEcOinOVKvclCbPr2wp9oUSFVzDuxuDBW9ErgyAZqCTYwBDhsxq 5P+O0qQBgBicYtrD437KF+nEvHdj9yOuhBrAftV70bVeCsCfgUWlAgFqB0bSvrX9 Fp/apOxc5LBqsDJJmvNiGMIUMWfoL05u83vtrkVhkhA7ZhUl6JFj2Tekgu29urnM pMM03hTVOpDRMQ1enV65XPwWkwOakFTnJU1lnGCEdwsklgy73DLueRI0iAc8DyeS +m9DwJNWmELHXQ4ugvNZ1A3RAX2E1E95QfQAt/Hmv+t0eRWTpL5idu98GQ7M7Tm9 6BjpbjUljTh/wGnQ2TIxmflHqIjqkKXYLKgT1+gjjf/6ihFvElU= =5VeG -----END PGP SIGNATURE----- Merge tag 'linux_kselftest-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest updates from Shuah Khan: - Fix bugs and clean up code in tracing, ftrace, and user_events tests - Add missing executables to ftrace gitignore * tag 'linux_kselftest-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/ftrace: add 'poll' binary to gitignore selftests/ftrace: Use readelf to find entry point in uprobe test selftests/user_events: Fix failures caused by test code selftests/tracing: Allow some more tests to run in instances selftests/ftrace: Clean up triggers after setting them selftests/tracing: Test only toplevel README file not the instances	2025-03-27 18:57:58 -07:00
Linus Torvalds	68f090f09b	ktest update for v6.15: - Fix failure of directory of log file not existing If a LOG_FILE option is set for ktest to log its messages, and the directory path does not exist. Then ktest fails. Have ktest attempt to create the directory where the log file exists and if that succeeds continue on testing. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+WDNxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6quknAQDGW0MUyG3OSP1mdTm/a/kvAc4nnx8i OBfT8ZqDPMXAhgEA2jdn/pvOmh0Nm+cofUC8tzWIdaY0Ey1r/QlCOnx4TAQ= =QgSs -----END PGP SIGNATURE----- Merge tag 'ktest-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull ktest update from Steven Rostedt: - Fix failure of directory of log file not existing If a LOG_FILE option is set for ktest to log its messages, and the directory path does not exist. Then ktest fails. Have ktest attempt to create the directory where the log file exists and if that succeeds continue on testing. * tag 'ktest-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest: Fix Test Failures Due to Missing LOG_FILE Directories	2025-03-27 18:22:46 -07:00
Bjorn Helgaas	cc28c0e5e7	Merge branch 'pci/endpoint-test' - Fix endpoint BAR testing so the test can skip disabled BARs instead of reporting them as failures (Niklas Cassel) - Verify that pci_endpoint interrupt tests set the correct IRQ type (Kunihiko Hayashi) - Fix interpretation of pci_endpoint_test_bars_read_bar() error returns (Niklas Cassel) - Fix potential string truncation in pci_endpoint_test_probe() (Niklas Cassel) - Increase endpoint test BAR size variable to accommodate BARs larger than INT_MAX (Niklas Cassel) - Release IRQs to avoid leak in pci_endpoint interrupt tests (Kunihiko Hayashi) - Log the correct IRQ type when pci_endpoint IRQ request test fails (Kunihiko Hayashi) - Remove pci_endpoint_test irq_type and no_msi globals; instead use test->irq_type (Kunihiko Hayashi) - Remove unnecessary use of managed IRQ functions in pci_endpoint_test (Kunihiko Hayashi) - Add and use IRQ_TYPE_* defines in pci_endpoint_test (Niklas Cassel) - Add struct pci_epc_features.intx_capable and note that RK3568 and RK3588 can't raise INTx interrupts (Niklas Cassel) - Expose supported IRQ types in CAPS so pci_endpoint_test can set appropriate type (Niklas Cassel) - Add PCITEST_IRQ_TYPE_AUTO to pci_endpoint_test for cases where the IRQ type doesn't matter (Niklas Cassel) * pci/endpoint-test: misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts PCI: endpoint: Add intx_capable to epc_features struct selftests: pci_endpoint: Use IRQ_TYPE_* defines from UAPI header misc: pci_endpoint_test: Use IRQ_TYPE_* defines from UAPI header PCI: endpoint: pcitest: Add IRQ_TYPE_* defines to UAPI header misc: pci_endpoint_test: Do not use managed IRQ functions misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi' misc: pci_endpoint_test: Fix 'irq_type' to convey the correct type misc: pci_endpoint_test: Fix displaying 'irq_type' after 'request_irq' error misc: pci_endpoint_test: Avoid issue of interrupts remaining after request_irq error misc: pci_endpoint_test: Handle BAR sizes larger than INT_MAX misc: pci_endpoint_test: Give disabled BARs a distinct error code misc: pci_endpoint_test: Fix potential truncation in pci_endpoint_test_probe() misc: pci_endpoint_test: Fix pci_endpoint_test_bars_read_bar() error handling selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test selftests: pci_endpoint: Skip disabled BARs	2025-03-27 13:14:46 -05:00
Ayush Jain	5a1bed2327	ktest: Fix Test Failures Due to Missing LOG_FILE Directories Handle missing parent directories for LOG_FILE path to prevent test failures. If the parent directories don't exist, create them to ensure the tests proceed successfully. Cc: <warthog9@eaglescrag.net> Link: https://lore.kernel.org/20250307043854.2518539-1-Ayush.jain3@amd.com Signed-off-by: Ayush Jain <Ayush.jain3@amd.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2025-03-27 09:34:00 -04:00
Masami Hiramatsu (Google)	581a7b26ab	selftests/ftrace: Add dynamic events argument limitation test case Add argument limitation test case for dynamic events. This is a boudary check for the maximum number of the probe event arguments. Link: https://lore.kernel.org/all/174055078295.4079315.14702008939511417359.stgit@mhiramat.tok.corp.google.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2025-03-27 21:19:54 +09:00
Masami Hiramatsu (Google)	168ccc9b99	selftests/ftrace: Add new syntax error test Add BAD_TP_NAME syntax error message check. Link: https://lore.kernel.org/all/174055077485.4079315.3624012056141021755.stgit@mhiramat.tok.corp.google.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2025-03-27 21:19:54 +09:00
Masami Hiramatsu (Google)	381af2ab91	selftests/ftrace: Expand the tprobe event test to check wrong format Expand the tprobe event test case to check wrong tracepoint format. Link: https://lore.kernel.org/all/174055076681.4079315.16941322116874021804.stgit@mhiramat.tok.corp.google.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2025-03-27 21:19:54 +09:00
Petr Mladek	d11f0d172a	Merge branch 'for-6.15/ftrace-test' into for-linus	2025-03-27 12:01:51 +01:00
Linus Torvalds	1a9239bb42	Networking changes for 6.15. Core & protocols ---------------- - Continue Netlink conversions to per-namespace RTNL lock (IPv4 routing, routing rules, routing next hops, ARP ioctls). - Continue extending the use of netdev instance locks. As a driver opt-in protect queue operations and (in due course) ethtool operations with the instance lock and not RTNL lock. - Support collecting TCP timestamps (data submitted, sent, acked) in BPF, allowing for transparent (to the application) and lower overhead tracking of TCP RPC performance. - Tweak existing networking Rx zero-copy infra to support zero-copy Rx via io_uring. - Optimize MPTCP performance in single subflow mode by 29%. - Enable GRO on packets which went thru XDP CPU redirect (were queued for processing on a different CPU). Improving TCP stream performance up to 2x. - Improve performance of contended connect() by 200% by searching for an available 4-tuple under RCU rather than a spin lock. Bring an additional 229% improvement by tweaking hash distribution. - Avoid unconditionally touching sk_tsflags on RX, improving performance under UDP flood by as much as 10%. - Avoid skb_clone() dance in ping_rcv() to improve performance under ping flood. - Avoid FIB lookup in netfilter if socket is available, 20% perf win. - Rework network device creation (in-kernel) API to more clearly identify network namespaces and their roles. There are up to 4 namespace roles but we used to have just 2 netns pointer arguments, interpreted differently based on context. - Use sysfs_break_active_protection() instead of trylock to avoid deadlocks between unregistering objects and sysfs access. - Add a new sysctl and sockopt for capping max retransmit timeout in TCP. - Support masking port and DSCP in routing rule matches. - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST. - Support specifying at what time packet should be sent on AF_XDP sockets. - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users. - Add Netlink YAML spec for WiFi (nl80211) and conntrack. - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols which only need to be exported when IPv6 support is built as a module. - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to normal bridging. - Allow users to specify source port range for GENEVE tunnels. - netconsole: allow attaching kernel release, CPU ID and task name to messages as metadata Driver API ---------- - Continue rework / fixing of Energy Efficient Ethernet (EEE) across the SW layers. Delegate the responsibilities to phylink where possible. Improve its handling in phylib. - Support symmetric OR-XOR RSS hashing algorithm. - Support tracking and preserving IRQ affinity by NAPI itself. - Support loopback mode speed selection for interface selftests. Device drivers -------------- - Remove the IBM LCS driver for s390. - Remove the sb1000 cable modem driver. - Add support for SFP module access over SMBus. - Add MCTP transport driver for MCTP-over-USB. - Enable XDP metadata support in multiple drivers. - Ethernet high-speed NICs: - Broadcom (bnxt): - add PCIe TLP Processing Hints (TPH) support for new AMD platforms - support dumping RoCE queue state for debug - opt into instance locking - Intel (100G, ice, idpf): - ice: rework MSI-X IRQ management and distribution - ice: support for E830 devices - iavf: add support for Rx timestamping - iavf: opt into instance locking - nVidia/Mellanox: - mlx4: use page pool memory allocator for Rx - mlx5: support for one PTP device per hardware clock - mlx5: support for 200Gbps per-lane link modes - mlx5: move IPSec policy check after decryption - AMD/Solarflare: - support FW flashing via devlink - Cisco (enic): - use page pool memory allocator for Rx - enable 32, 64 byte CQEs - get max rx/tx ring size from the device - Meta (fbnic): - support flow steering and RSS configuration - report queue stats - support TCP segmentation - support IRQ coalescing - support ring size configuration - Marvell/Cavium: - support AF_XDP - Wangxun: - support for PTP clock and timestamping - Huawei (hibmcge): - checksum offload - add more statistics - Ethernet virtual: - VirtIO net: - aggressively suppress Tx completions, improve perf by 96% with 1 CPU and 55% with 2 CPUs - expose NAPI to IRQ mapping and persist NAPI settings - Google (gve): - support XDP in DQO RDA Queue Format - opt into instance locking - Microsoft vNIC: - support BIG TCP - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - cleanup Tx and Tx clock setting and other link-focused cleanups - enable SGMII and 2500BASEX mode switching for Intel platforms - support Sophgo SG2044 - Broadcom switches (b53): - support for BCM53101 - TI: - iep: add perout configuration support - icssg: support XDP - Cadence (macb): - implement BQL - Xilinx (axinet): - support dynamic IRQ moderation and changing coalescing at runtime - implement BQL - report standard stats - MediaTek: - support phylink managed EEE - Intel: - igc: don't restart the interface on every XDP program change - RealTek (r8169): - support reading registers of internal PHYs directly - increase max jumbo packet size on RTL8125/RTL8126 - Airoha: - support for RISC-V NPU packet processing unit - enable scatter-gather and support MTU up to 9kB - Tehuti (tn40xx): - support cards with TN4010 MAC and an Aquantia AQR105 PHY - Ethernet PHYs: - support for TJA1102S, TJA1121 - dp83tg720: add randomized polling intervals for link detection - dp83822: support changing the transmit amplitude voltage - support for LEDs on 88q2xxx - CAN: - canxl: support Remote Request Substitution bit access - flexcan: add S32G2/S32G3 SoC - WiFi: - remove cooked monitor support - strict mode for better AP testing - basic EPCS support - OMI RX bandwidth reduction support - batman-adv: add support for jumbo frames - WiFi drivers: - RealTek (rtw88): - support RTL8814AE and RTL8814AU - RealTek (rtw89): - switch using wiphy_lock and wiphy_work - add BB context to manipulate two PHY as preparation of MLO - improve BT-coexistence mechanism to play A2DP smoothly - Intel (iwlwifi): - add new iwlmld sub-driver for latest HW/FW combinations - MediaTek (mt76): - preparation for mt7996 Multi-Link Operation (MLO) support - Qualcomm/Atheros (ath12k): - continued work on MLO - Silabs (wfx): - Wake-on-WLAN support - Bluetooth: - add support for skb TX SND/COMPLETION timestamping - hci_core: enable buffer flow control for SCO/eSCO - coredump: log devcd dumps into the monitor - Bluetooth drivers: - intel: add support to configure TX power - nxp: handle bootloader error during cmd5 and cmd7 Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfkLC8ACgkQMUZtbf5S Irsb5g/+L7oKOf0ALbaV9kxFsoz8AymZfAW9i/27F07omGJGpks8oX6j6rQLgIRO OQOFcp7XEdDh1+jh82gHVuPrw2/6lchLtW8ARtzdiQKFr5DRjrsbtua6GRc8iBqA DIRCBFoV2HuMkF39Vr09HMa9AZAT7QR2RLsRGpSq8E8Z8xxKz0X7oujs10PFpMTE IVKhTrVrk+NDot/IU2hzVpnpup+0ld+T2/ZaBklJGcU8uDffImsqNepHRyCG5UC3 xz74Ju23MAj24Gct+og0yFUooF+lUltKyVm0FYCDCY3bASTwgY01NR3kEH/0NQvM cywLzd/ngHm/SMD2ggVAHkjZUieiIVHdaZ53dgjDeBOQoVP6p0dgUK7EumXX8Mx4 8ReR2UiGoYRPaq9c4o+IjG4K027MwVK2p+mF1a6MLa+20XcyMbev8FIRbbHtC/V4 z5/FsOAxcuICWkA1hU9bODrrGzIqemmdRgKG8sGuTJCt/kYGAn72/TCATGNSaCJ0 00n2jN1aepa7wtywHJ5MhVzxN9iQX7+geUHXz0BI+lK4e1Pmk+vjGksymb9ai2fk eQAUV9ekub6q68/J16scD7XeOUM37bTLiMBQeIF8UtZBOJscKiS71zn9QP9Twwxv P2pm01RDZUI+z5ZX3hc12Pm1vjRHaAh9S1JpAw/pTOVlQ+mAJEM= =XY0S -----END PGP SIGNATURE----- Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Continue Netlink conversions to per-namespace RTNL lock (IPv4 routing, routing rules, routing next hops, ARP ioctls) - Continue extending the use of netdev instance locks. As a driver opt-in protect queue operations and (in due course) ethtool operations with the instance lock and not RTNL lock. - Support collecting TCP timestamps (data submitted, sent, acked) in BPF, allowing for transparent (to the application) and lower overhead tracking of TCP RPC performance. - Tweak existing networking Rx zero-copy infra to support zero-copy Rx via io_uring. - Optimize MPTCP performance in single subflow mode by 29%. - Enable GRO on packets which went thru XDP CPU redirect (were queued for processing on a different CPU). Improving TCP stream performance up to 2x. - Improve performance of contended connect() by 200% by searching for an available 4-tuple under RCU rather than a spin lock. Bring an additional 229% improvement by tweaking hash distribution. - Avoid unconditionally touching sk_tsflags on RX, improving performance under UDP flood by as much as 10%. - Avoid skb_clone() dance in ping_rcv() to improve performance under ping flood. - Avoid FIB lookup in netfilter if socket is available, 20% perf win. - Rework network device creation (in-kernel) API to more clearly identify network namespaces and their roles. There are up to 4 namespace roles but we used to have just 2 netns pointer arguments, interpreted differently based on context. - Use sysfs_break_active_protection() instead of trylock to avoid deadlocks between unregistering objects and sysfs access. - Add a new sysctl and sockopt for capping max retransmit timeout in TCP. - Support masking port and DSCP in routing rule matches. - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST. - Support specifying at what time packet should be sent on AF_XDP sockets. - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users. - Add Netlink YAML spec for WiFi (nl80211) and conntrack. - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols which only need to be exported when IPv6 support is built as a module. - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to normal bridging. - Allow users to specify source port range for GENEVE tunnels. - netconsole: allow attaching kernel release, CPU ID and task name to messages as metadata Driver API: - Continue rework / fixing of Energy Efficient Ethernet (EEE) across the SW layers. Delegate the responsibilities to phylink where possible. Improve its handling in phylib. - Support symmetric OR-XOR RSS hashing algorithm. - Support tracking and preserving IRQ affinity by NAPI itself. - Support loopback mode speed selection for interface selftests. Device drivers: - Remove the IBM LCS driver for s390 - Remove the sb1000 cable modem driver - Add support for SFP module access over SMBus - Add MCTP transport driver for MCTP-over-USB - Enable XDP metadata support in multiple drivers - Ethernet high-speed NICs: - Broadcom (bnxt): - add PCIe TLP Processing Hints (TPH) support for new AMD platforms - support dumping RoCE queue state for debug - opt into instance locking - Intel (100G, ice, idpf): - ice: rework MSI-X IRQ management and distribution - ice: support for E830 devices - iavf: add support for Rx timestamping - iavf: opt into instance locking - nVidia/Mellanox: - mlx4: use page pool memory allocator for Rx - mlx5: support for one PTP device per hardware clock - mlx5: support for 200Gbps per-lane link modes - mlx5: move IPSec policy check after decryption - AMD/Solarflare: - support FW flashing via devlink - Cisco (enic): - use page pool memory allocator for Rx - enable 32, 64 byte CQEs - get max rx/tx ring size from the device - Meta (fbnic): - support flow steering and RSS configuration - report queue stats - support TCP segmentation - support IRQ coalescing - support ring size configuration - Marvell/Cavium: - support AF_XDP - Wangxun: - support for PTP clock and timestamping - Huawei (hibmcge): - checksum offload - add more statistics - Ethernet virtual: - VirtIO net: - aggressively suppress Tx completions, improve perf by 96% with 1 CPU and 55% with 2 CPUs - expose NAPI to IRQ mapping and persist NAPI settings - Google (gve): - support XDP in DQO RDA Queue Format - opt into instance locking - Microsoft vNIC: - support BIG TCP - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - cleanup Tx and Tx clock setting and other link-focused cleanups - enable SGMII and 2500BASEX mode switching for Intel platforms - support Sophgo SG2044 - Broadcom switches (b53): - support for BCM53101 - TI: - iep: add perout configuration support - icssg: support XDP - Cadence (macb): - implement BQL - Xilinx (axinet): - support dynamic IRQ moderation and changing coalescing at runtime - implement BQL - report standard stats - MediaTek: - support phylink managed EEE - Intel: - igc: don't restart the interface on every XDP program change - RealTek (r8169): - support reading registers of internal PHYs directly - increase max jumbo packet size on RTL8125/RTL8126 - Airoha: - support for RISC-V NPU packet processing unit - enable scatter-gather and support MTU up to 9kB - Tehuti (tn40xx): - support cards with TN4010 MAC and an Aquantia AQR105 PHY - Ethernet PHYs: - support for TJA1102S, TJA1121 - dp83tg720: add randomized polling intervals for link detection - dp83822: support changing the transmit amplitude voltage - support for LEDs on 88q2xxx - CAN: - canxl: support Remote Request Substitution bit access - flexcan: add S32G2/S32G3 SoC - WiFi: - remove cooked monitor support - strict mode for better AP testing - basic EPCS support - OMI RX bandwidth reduction support - batman-adv: add support for jumbo frames - WiFi drivers: - RealTek (rtw88): - support RTL8814AE and RTL8814AU - RealTek (rtw89): - switch using wiphy_lock and wiphy_work - add BB context to manipulate two PHY as preparation of MLO - improve BT-coexistence mechanism to play A2DP smoothly - Intel (iwlwifi): - add new iwlmld sub-driver for latest HW/FW combinations - MediaTek (mt76): - preparation for mt7996 Multi-Link Operation (MLO) support - Qualcomm/Atheros (ath12k): - continued work on MLO - Silabs (wfx): - Wake-on-WLAN support - Bluetooth: - add support for skb TX SND/COMPLETION timestamping - hci_core: enable buffer flow control for SCO/eSCO - coredump: log devcd dumps into the monitor - Bluetooth drivers: - intel: add support to configure TX power - nxp: handle bootloader error during cmd5 and cmd7" * tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits) unix: fix up for "apparmor: add fine grained af_unix mediation" mctp: Fix incorrect tx flow invalidation condition in mctp-i2c net: usb: asix: ax88772: Increase phy_name size net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string net: libwx: fix Tx L4 checksum net: libwx: fix Tx descriptor content for some tunnel packets atm: Fix NULL pointer dereference net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus net: phy: aquantia: add essential functions to aqr105 driver net: phy: aquantia: search for firmware-name in fwnode net: phy: aquantia: add probe function to aqr105 for firmware loading net: phy: Add swnode support to mdiobus_scan gve: add XDP DROP and PASS support for DQ gve: update XDP allocation path support RX buffer posting gve: merge packet buffer size fields gve: update GQ RX to use buf_size gve: introduce config-based allocation for XDP gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics ...	2025-03-26 21:48:21 -07:00
Linus Torvalds	592329e5e9	Summary * Move vm_table members out of kernel/sysctl.c All vm_table array members have moved to their respective subsystems leading to the removal of vm_table from kernel/sysctl.c. This increases modularity by placing the ctl_tables closer to where they are actually used and at the same time reducing the chances of merge conflicts in kernel/sysctl.c. * ctl_table range fixes Replace the proc_handler function that checks variable ranges in coredump_sysctls and vdso_table with the one that actually uses the extra{1,2} pointers as min/max values. This tightens the range of the values that users can pass into the kernel effectively preventing {under,over}flows. * Misc fixes Correct grammar errors and typos in test messages. Update sysctl files in MAINTAINERS. Constified and removed array size in declaration for alignment_tbl * Testing - These have all been in linux-next for at least 1 month - They have gone through 0-day - Ran all these through sysctl selftests in x86_64 -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmfhV8EACgkQupfNUreW QU/udAv/VCXGkndQsJ5biXpXYFnokX0gIEaYzzHiqrFycZqr8ys0/wWzc+ar1LjF Jvanl2uKB0mUviLKt7Gk0+Hri+PJlYIrbx+5K5eo2wsKUUxFykqLLm59y/orPODl gyPQjKNpHJb7COsnEc3Lrq/fvol4NPHlcBPXG8NwehccTeBHZ1ninfo+pSnxh3o8 kI3GSLLxD4K9AgBl5QuVWH4gU7o//u7lUkKzy03NW+2jmuRv3dRcYF7IdgMINNee AeXnygdSBxLzECBvmkfNdyg+AmL8hdsmzbsIh7UuJDvxLlQOInVLZa+sXBotCOIc TImCrr1Ws1OuGrD0kpH+21tJvc8pNFWt61QlulObQdrLndWHdZEGyGOusLpXTwbn jIWZmMvzk1foSwdgzwPFzUqPEpW3FrBVDo4Z4kenBDrCp56QTX7hGRvkNYJNKvot Ue+i8BeHR/Gm/p+UMqgsSTOaNJXTqZhFqwJQVzxU/9LN/vkS0On6fbjgBd5X6Pn+ a5dlc9gy =0bcX -----END PGP SIGNATURE----- Merge tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move vm_table members out of kernel/sysctl.c All vm_table array members have moved to their respective subsystems leading to the removal of vm_table from kernel/sysctl.c. This increases modularity by placing the ctl_tables closer to where they are actually used and at the same time reducing the chances of merge conflicts in kernel/sysctl.c. - ctl_table range fixes Replace the proc_handler function that checks variable ranges in coredump_sysctls and vdso_table with the one that actually uses the extra{1,2} pointers as min/max values. This tightens the range of the values that users can pass into the kernel effectively preventing {under,over}flows. - Misc fixes Correct grammar errors and typos in test messages. Update sysctl files in MAINTAINERS. Constified and removed array size in declaration for alignment_tbl * tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits) selftests/sysctl: fix wording of help messages selftests: fix spelling/grammar errors in sysctl/sysctl.sh MAINTAINERS: Update sysctl file list in MAINTAINERS sysctl: Fix underflow value setting risk in vm_table coredump: Fixes core_pipe_limit sysctl proc_handler sysctl: remove unneeded include sysctl: remove the vm_table sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c fs: dcache: move the sysctl to fs/dcache.c sunrpc: simplify rpcauth_cache_shrink_count() fs: drop_caches: move sysctl to fs/drop_caches.c fs: fs-writeback: move sysctl to fs/fs-writeback.c mm: nommu: move sysctl to mm/nommu.c security: min_addr: move sysctl to security/min_addr.c mm: mmap: move sysctl to mm/mmap.c mm: util: move sysctls to mm/util.c mm: vmscan: move vmscan sysctls to mm/vmscan.c mm: swap: move sysctl to mm/swap.c mm: filemap: move sysctl to mm/filemap.c ...	2025-03-26 21:02:05 -07:00
Linus Torvalds	2e3fcbcc3b	SCSI misc on 20250326 Updates to the usual drivers (scsi_debug, ufs, lpfc, st, fnic, mpi3mr, mpt3sas) and the removal of cxlflash. The only non-trivial core change is an addition to unit attention handling to recognize UAs for power on/reset and new media so the tape driver can use it. Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZ+RQ2yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishe6DAQCdW/21 S1Y6BDlJLQfpWChGv6GIzanC+5sMfylw4d6ULgEA8upOE5L3fC29IY958jXig0o1 uLjxylwYEfVLDf8gwJ0= =mkM+ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (scsi_debug, ufs, lpfc, st, fnic, mpi3mr, mpt3sas) and the removal of cxlflash. The only non-trivial core change is an addition to unit attention handling to recognize UAs for power on/reset and new media so the tape driver can use it" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (107 commits) scsi: st: Tighten the page format heuristics with MODE SELECT scsi: st: ERASE does not change tape location scsi: st: Fix array overflow in st_setup() scsi: target: tcm_loop: Fix wrong abort tag scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flag scsi: hisi_sas: Fixed failure to issue vendor specific commands scsi: fnic: Remove unnecessary NUL-terminations scsi: fnic: Remove redundant flush_workqueue() calls scsi: core: Use a switch statement when attaching VPD pages scsi: ufs: renesas: Add initialization code for R-Car S4-8 ES1.2 scsi: ufs: renesas: Add reusable functions scsi: ufs: renesas: Refactor 0x10ad/0x10af PHY settings scsi: ufs: renesas: Remove register control helper function scsi: ufs: renesas: Add register read to remove save/set/restore scsi: ufs: renesas: Replace init data by init code scsi: ufs: dt-bindings: renesas,ufs: Add calibration data scsi: mpi3mr: Task Abort EH Support scsi: storvsc: Don't report the host packet status as the hv status scsi: isci: Make most module parameters static scsi: megaraid_sas: Make most module parameters static ...	2025-03-26 19:57:34 -07:00
Linus Torvalds	91928e0d3c	for-6.15/io_uring-20250322 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmfe8DcQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgprzNEADPOOfi05F0f4KOMLYOiRlB+D9yGtnbcu8k foPATU++L5fFDUpjAGobi38AXqHtF/Bx0CjCkPCvy4GjygcszYDY81FQmRlSFdeH CnYV/bm5myyupc8xANcWGJYpxf/ZAw7ugOAgcR3Xxk/qXdD5hYJ1jG4477o8v4IJ jyzE4vQCnWBogIoqpWTQyY3FSP5fdjeQlYdFGIFCx68rFTO9zEpuHhFr2OMfJtSu y1dLkeqbu1BZ0fhgRog9k9ZWk0uRQJI7d7EUXf9iwXOfkUyzCaV+nHvNGRdhJ/XO zj/qePw4lFyCgc+kuIejjIaPf/g84paCzshz/PIxOZTsvo6X78zj72RPM0Makhga amgtEqegWHtKSya1zew57oDwqwaIe6b+56PtNSP7K8IrwcJLheszoPKJIwiiynIL ZqvWseqYQXs58S9qZzSZlyEpwLVFcvyNt7Z/q7b56vhvXHabxOccPkUHt+UscVoQ qpG0R9+zmIiDfzY6Vfu7JyH2Uspdq3WjGYiaoVUko1rJXY+HFPd3be9pIbckyaA8 ZQHIxB+h8IlxhpQ8HJKQpZhtYiluxEu6wDdvtpwC0KiGp5g4Q5eg9SkVyvETIEN8 yW2tHqMAJJtJjDAPwJOV+GIyWQMIexzLmMP7Fq8R5VIoAox+Xn3flgNNEnMZdl/r kTc/hRjHvQ== =VEQr -----END PGP SIGNATURE----- Merge tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux Pull io_uring updates from Jens Axboe: "This is the first of the io_uring pull requests for the 6.15 merge window, there will be others once the net tree has gone in. This contains: - Cleanup and unification of cancelation handling across various request types. - Improvement for bundles, supporting them both for incrementally consumed buffers, and for non-multishot requests. - Enable toggling of using iowait while waiting on io_uring events or not. Unfortunately this is still tied with CPU frequency boosting on short waits, as the scheduler side has not been very receptive to splitting the (useless) iowait stat from the cpufreq implied boost. - Add support for kbuf nodes, enabling zero-copy support for the ublk block driver. - Various cleanups for resource node handling. - Series greatly cleaning up the legacy provided (non-ring based) buffers. For years, we've been pushing the ring provided buffers as the way to go, and that is what people have been using. Reduce the complexity and code associated with legacy provided buffers. - Series cleaning up the compat handling. - Series improving and cleaning up the recvmsg/sendmsg iovec and msg handling. - Series of cleanups for io-wq. - Start adding a bunch of selftests. The liburing repository generally carries feature and regression tests for everything, but at least for ublk initially, we'll try and go the route of having it in selftests as well. We'll see how this goes, might decide to migrate more tests this way in the future. - Various little cleanups and fixes" * tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux: (108 commits) selftests: ublk: add stripe target selftests: ublk: simplify loop io completion selftests: ublk: enable zero copy for null target selftests: ublk: prepare for supporting stripe target selftests: ublk: move common code into common.c selftests: ublk: increase max buffer size to 1MB selftests: ublk: add single sqe allocator helper selftests: ublk: add generic_01 for verifying sequential IO order selftests: ublk: fix starting ublk device io_uring: enable toggle of iowait usage when waiting on CQEs selftests: ublk: fix write cache implementation selftests: ublk: add variable for user to not show test result selftests: ublk: don't show `modprobe` failure selftests: ublk: add one dependency header io_uring/kbuf: enable bundles for incrementally consumed buffers Revert "io_uring/rsrc: simplify the bvec iter count calculation" selftests: ublk: improve test usability selftests: ublk: add stress test for covering IO vs. killing ublk server selftests: ublk: add one stress test for covering IO vs. removing device selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests ...	2025-03-26 17:56:00 -07:00
Mickaël Salaün	a5c369e45b	selftests/landlock: Add audit tests for network Test all network blockers: - net.bind_tcp - net.connect_tcp Test coverage for security/landlock is 94.0% of 1525 lines according to gcc/gcov-14. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-28-mic@digikod.net [mic: Update test coverage] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:48 +01:00
Mickaël Salaün	316d06b011	selftests/landlock: Add audit tests for filesystem Test all filesystem blockers, including events with several records, and record with several blockers: - fs.execute - fs.write_file - fs.read_file - fs_read_dir - fs.remove_dir - fs.remove_file - fs.make_char - fs.make_dir - fs.make_reg - fs.make_sock - fs.make_fifo - fs.make_block - fs.make_sym - fs.refer - fs.truncate - fs.ioctl_dev - fs.change_topology Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-27-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:48 +01:00
Mickaël Salaün	e1156872ef	selftests/landlock: Add audit tests for abstract UNIX socket scoping Add a new scoped_audit.connect_to_child test to check the abstract UNIX socket blocker. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-26-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:47 +01:00
Mickaël Salaün	e2893c0a69	selftests/landlock: Add audit tests for ptrace Add tests for all ptrace actions checking "blockers=ptrace" records. This also improves PTRACE_TRACEME and PTRACE_ATTACH tests by making sure that the restrictions comes from Landlock, and with the expected process. These extended tests are like enhanced errno checks that make sure Landlock enforcement is consistent. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-25-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:47 +01:00
Mickaël Salaün	960ed6ca4c	selftests/landlock: Test audit with restrict flags Add audit_exec tests to filter Landlock denials according to cross-execution or muted subdomains. Add a wait-pipe-sandbox.c test program to sandbox itself and send a (denied) signals to its parent. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-24-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:46 +01:00
Mickaël Salaün	6a500b2297	selftests/landlock: Add tests for audit flags and domain IDs Add audit_test.c to check with and without LANDLOCK_RESTRICT_SELF_* flags against the two Landlock audit record types: AUDIT_LANDLOCK_ACCESS and AUDIT_LANDLOCK_DOMAIN. Check consistency of domain IDs per layer in AUDIT_LANDLOCK_ACCESS and AUDIT_LANDLOCK_DOMAIN messages: denied access, domain allocation, and domain deallocation. These tests use signal scoping to make it simple. They are not in the scoped_signal_test.c file but in the new dedicated audit_test.c file. Tests are run with audit filters to ensure the audit records come from the test program. Moreover, because there can only be one audit process, tests would failed if run in parallel. Because of audit limitations, tests can only be run in the initial namespace. The audit test helpers were inspired by libaudit and tools/testing/selftests/net/netfilter/audit_logread.c Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Phil Sutter <phil@nwl.cc> Link: https://lore.kernel.org/r/20250320190717.2287696-23-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:45 +01:00
Mickaël Salaün	e178b404ea	selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags Add the base_test's restrict_self_fd_flags tests to align with previous restrict_self_fd tests but with the new LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF flag. Add the restrict_self_flags tests to check that LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF, LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON, and LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF are valid but not the next bit. Some checks are similar to restrict_self_checks_ordering's ones. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-22-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:45 +01:00
Mickaël Salaün	ec12a8d4c1	selftests/landlock: Add test for invalid ruleset file descriptor To align with fs_test's layout1.inval and layout0.proc_nsfs which test EBADFD for landlock_add_rule(2), create a new base_test's restrict_self_fd which test EBADFD for landlock_restrict_self(2). Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-21-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:44 +01:00
Mickaël Salaün	12bfcda73a	landlock: Add LANDLOCK_RESTRICT_SELF_LOG__EXEC_ flags Most of the time we want to log denied access because they should not happen and such information helps diagnose issues. However, when sandboxing processes that we know will try to access denied resources (e.g. unknown, bogus, or malicious binary), we might want to not log related access requests that might fill up logs. By default, denied requests are logged until the task call execve(2). If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied requests will not be logged for the same executed file. If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied requests from after an execve(2) call will be logged. The rationale is that a program should know its own behavior, but not necessarily the behavior of other programs. Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific Landlock domain, it makes it possible to selectively mask some access requests that would be logged by a parent domain, which might be handy for unprivileged processes to limit logs. However, system administrators should still use the audit filtering mechanism. There is intentionally no audit nor sysctl configuration to re-enable these logs. This is delegated to the user space program. Increment the Landlock ABI version to reflect this interface change. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-18-mic@digikod.net [mic: Rename variables and fix __maybe_unused] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:42 +01:00
Mickaël Salaün	d9d2a68ed4	landlock: Add unique ID generator Landlock IDs can be generated to uniquely identify Landlock objects. For now, only Landlock domains get an ID at creation time. These IDs map to immutable domain hierarchies. Landlock IDs have important properties: - They are unique during the lifetime of the running system thanks to the 64-bit values: at worse, 2^60 - 2*2^32 useful IDs. - They are always greater than 2^32 and must then be stored in 64-bit integer types. - The initial ID (at boot time) is randomly picked between 2^32 and 2^33, which limits collisions in logs across different boots. - IDs are sequential, which enables users to order them. - IDs may not be consecutive but increase with a random 2^4 step, which limits side channels. Such IDs can be exposed to unprivileged processes, even if it is not the case with this audit patch series. The domain IDs will be useful for user space to identify sandboxes and get their properties. These Landlock IDs are more secure that other absolute kernel IDs such as pipe's inodes which rely on a shared global counter. For checkpoint/restore features (i.e. CRIU), we could easily implement a privileged interface (e.g. sysfs) to set the next ID counter. IDR/IDA are not used because we only need a bijection from Landlock objects to Landlock IDs, and we must not recycle IDs. This enables us to identify all Landlock objects during the lifetime of the system (e.g. in logs), but not to access an object from an ID nor know if an ID is assigned. Using a counter is simpler, it scales (i.e. avoids growing memory footprint), and it does not require locking. We'll use proper file descriptors (with IDs used as inode numbers) to access Landlock objects. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:34 +01:00
Mickaël Salaün	c5efa393d8	selftests/landlock: Add a new test for setuid() The new signal_scoping_thread_setuid tests check that the libc's setuid() function works as expected even when a thread is sandboxed with scoped signal restrictions. Before the signal scoping fix, this test would have failed with the setuid() call: [pid 65] getpid() = 65 [pid 65] tgkill(65, 66, SIGRT_1) = -1 EPERM (Operation not permitted) [pid 65] futex(0x40a66cdc, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 65] setuid(1001) = 0 After the fix, tgkill(2) is successfully leveraged to synchronize credentials update across threads: [pid 65] getpid() = 65 [pid 65] tgkill(65, 66, SIGRT_1) = 0 [pid 66] <... read resumed>0x40a65eb7, 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [pid 66] --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL, si_pid=65, si_uid=1000} --- [pid 66] getpid() = 65 [pid 66] setuid(1001) = 0 [pid 66] futex(0x40a66cdc, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 66] rt_sigreturn({mask=[]}) = 0 [pid 66] read(3, <unfinished ...> [pid 65] setuid(1001) = 0 Test coverage for security/landlock is 92.9% of 1137 lines according to gcc/gcov-14. Fixes: `c899496501` ("selftests/landlock: Test signal scoping for threads") Cc: Günther Noack <gnoack@google.com> Cc: Tahera Fahimi <fahimitahera@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-8-mic@digikod.net [mic: Update test coverage] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:32 +01:00
Mickaël Salaün	bbe7227403	selftests/landlock: Split signal_scoping_threads tests Split signal_scoping_threads tests into signal_scoping_thread_before and signal_scoping_thread_after. Use local variables for thread synchronization. Fix exported function. Replace some asserts with expects. Fixes: `c899496501` ("selftests/landlock: Test signal scoping for threads") Cc: Günther Noack <gnoack@google.com> Cc: Tahera Fahimi <fahimitahera@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-7-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:32 +01:00
Mickaël Salaün	18eb75f3af	landlock: Always allow signals between threads of the same process Because Linux credentials are managed per thread, user space relies on some hack to synchronize credential update across threads from the same process. This is required by the Native POSIX Threads Library and implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to synchronize threads. See nptl(7) and libpsx(3). Furthermore, some runtimes like Go do not enable developers to have control over threads [1]. To avoid potential issues, and because threads are not security boundaries, let's relax the Landlock (optional) signal scoping to always allow signals sent between threads of the same process. This exception is similar to the __ptrace_may_access() one. hook_file_set_fowner() now checks if the target task is part of the same process as the caller. If this is the case, then the related signal triggered by the socket will always be allowed. Scoping of abstract UNIX sockets is not changed because kernel objects (e.g. sockets) should be tied to their creator's domain at creation time. Note that creating one Landlock domain per thread puts each of these threads (and their future children) in their own scope, which is probably not what users expect, especially in Go where we do not control threads. However, being able to drop permissions on all threads should not be restricted by signal scoping. We are working on a way to make it possible to atomically restrict all threads of a process with the same domain [2]. Add erratum for signal scoping. Closes: https://github.com/landlock-lsm/go-landlock/issues/36 Fixes: `54a6e6bbf3` ("landlock: Add signal scoping") Fixes: `c899496501` ("selftests/landlock: Test signal scoping for threads") Depends-on: `26f204380a` ("fs: Fix file_set_fowner LSM hook inconsistencies") Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1] Link: https://github.com/landlock-lsm/linux/issues/2 [2] Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Tahera Fahimi <fahimitahera@gmail.com> Cc: stable@vger.kernel.org Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net [mic: Add extra pointer check and RCU guard, and ease backport] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:29 +01:00
Niklas Cassel	08818c6d7f	misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO For PCITEST_MSI we really want to set PCITEST_SET_IRQTYPE explicitly to PCITEST_IRQ_TYPE_MSI, since we want to test if MSI works. For PCITEST_MSIX we really want to set PCITEST_SET_IRQTYPE explicitly to PCITEST_IRQ_TYPE_MSIX, since we want to test if MSI works. For PCITEST_LEGACY_IRQ we really want to set PCITEST_SET_IRQTYPE explicitly to PCITEST_IRQ_TYPE_INTX, since we want to test if INTx works. However, for PCITEST_WRITE, PCITEST_READ, PCITEST_COPY, we really don't care which IRQ type that is used, we just want to use a IRQ type that is supported by the EPC. The old behavior was to always use MSI for PCITEST_WRITE, PCITEST_READ, PCITEST_COPY, was to always set IRQ type to MSI before doing the actual test, however, there are EPC drivers that do not support MSI. Add a new PCITEST_IRQ_TYPE_AUTO, that will use the CAPS register to see which IRQ types the endpoint supports, and use one of the supported IRQ types. For backwards compatibility, if the endpoint does not expose any supported IRQ type in the CAPS register, simply fallback to using MSI, as it was unconditionally done before. Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Link: https://lore.kernel.org/r/20250310111016.859445-16-cassel@kernel.org	2025-03-26 06:11:54 +00:00
Linus Torvalds	ee6740fd34	CRC updates for 6.15 Another set of improvements to the kernel's CRC (cyclic redundancy check) code: - Rework the CRC64 library functions to be directly optimized, like what I did last cycle for the CRC32 and CRC-T10DIF library functions. - Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ support and acceleration for crc64_be and crc64_nvme. - Rewrite the riscv Zbc-optimized CRC code, and add acceleration for crc_t10dif, crc64_be, and crc64_nvme. - Remove crc_t10dif and crc64_rocksoft from the crypto API, since they are no longer needed there. - Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect. - Add kunit test cases for crc64_nvme and crc7. - Eliminate redundant functions for calculating the Castagnoli CRC32, settling on just crc32c(). - Remove unnecessary prompts from some of the CRC kconfig options. - Further optimize the x86 crc32c code. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ+CGGhQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3wRAP4tbnzawUmlIHIF0hleoADXehUgAhMt NZn15mGvyiuwIQEA8W9qvnLdFXZkdxhxAEvDDFjyrRauL6eGtr/GvCx4AQY= =wmKG -----END PGP SIGNATURE----- Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: "Another set of improvements to the kernel's CRC (cyclic redundancy check) code: - Rework the CRC64 library functions to be directly optimized, like what I did last cycle for the CRC32 and CRC-T10DIF library functions - Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ support and acceleration for crc64_be and crc64_nvme - Rewrite the riscv Zbc-optimized CRC code, and add acceleration for crc_t10dif, crc64_be, and crc64_nvme - Remove crc_t10dif and crc64_rocksoft from the crypto API, since they are no longer needed there - Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect - Add kunit test cases for crc64_nvme and crc7 - Eliminate redundant functions for calculating the Castagnoli CRC32, settling on just crc32c() - Remove unnecessary prompts from some of the CRC kconfig options - Further optimize the x86 crc32c code" * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (36 commits) x86/crc: drop the avx10_256 functions and rename avx10_512 to avx512 lib/crc: remove unnecessary prompt for CONFIG_CRC64 lib/crc: remove unnecessary prompt for CONFIG_LIBCRC32C lib/crc: remove unnecessary prompt for CONFIG_CRC8 lib/crc: remove unnecessary prompt for CONFIG_CRC7 lib/crc: remove unnecessary prompt for CONFIG_CRC4 lib/crc7: unexport crc7_be_syndrome_table lib/crc_kunit.c: update comment in crc_benchmark() lib/crc_kunit.c: add test and benchmark for crc7_be() x86/crc32: optimize tail handling for crc32c short inputs riscv/crc64: add Zbc optimized CRC64 functions riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function riscv/crc32: reimplement the CRC32 functions using new template riscv/crc: add "template" for Zbc optimized CRC functions x86/crc: add ANNOTATE_NOENDBR to suppress objtool warnings x86/crc32: improve crc32c_arch() code generation with clang x86/crc64: implement crc64_be and crc64_nvme using new template x86/crc-t10dif: implement crc_t10dif using new template x86/crc32: implement crc32_le using new template x86/crc: add "template" for [V]PCLMULQDQ based CRC functions ...	2025-03-25 18:33:04 -07:00
Linus Torvalds	edb0e8f6e2	ARM: * Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM * Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace * Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers * Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration * Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation * pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat * Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events LoongArch: * Remove unnecessary header include path * Assume constant PGD during VM context switch * Add perf events support for guest VM RISC-V: * Disable the kernel perf counter during configure * KVM selftests improvements for PMU * Fix warning at the time of KVM module removal x86: * Add support for aging of SPTEs without holding mmu_lock. Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs. This includes an implementation of per-rmap-entry locking; aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas locking an rmap for write requires taking both the per-rmap spinlock and the mmu_lock. Note that this decreases slightly the accuracy of accessed-page information, because changes to the SPTE outside aging might not use atomic operations even if they could race against a clear of the Accessed bit. This is deliberate because KVM and mm/ tolerate false positives/negatives for accessed information, and testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information. * Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition. * Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2). * Drop "support" for async page faults for protected guests that do not set SEND_ALWAYS (i.e. that only want async page faults at CPL3) * Bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. Particularly, destroy vCPUs before the MMU, despite the latter being a VM-wide operation. * Add common secure TSC infrastructure for use within SNP and in the future TDX * Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing. * Don't take kvm->lock when iterating over vCPUs in the suspend notifier to fix a largely theoretical deadlock. * Use the vCPU's actual Xen PV clock information when starting the Xen timer, as the cached state in arch.hv_clock can be stale/bogus. * Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend notifier only accounts for kvmclock, and there's no evidence that the flag is actually supported by Xen guests. * Clean up the per-vCPU "cache" of its reference pvclock, and instead only track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately expensive to compute, and rarely changes for modern setups). * Don't write to the Xen hypercall page on MSR writes that are initiated by the host (userspace or KVM) to fix a class of bugs where KVM can write to guest memory at unexpected times, e.g. during vCPU creation if userspace has set the Xen hypercall MSR index to collide with an MSR that KVM emulates. * Restrict the Xen hypercall MSR index to the unofficial synthetic range to reduce the set of possible collisions with MSRs that are emulated by KVM (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside in the synthetic range). * Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config. * Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID entries when updating PV clocks; there is no guarantee PV clocks will be updated between TSC frequency changes and CPUID emulation, and guest reads of the TSC leaves should be rare, i.e. are not a hot path. x86 (Intel): * Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1. * Pass XFD_ERR as the payload when injecting #NM, as a preparatory step for upcoming FRED virtualization support. * Decouple the EPT entry RWX protection bit macros from the EPT Violation bits, both as a general cleanup and in anticipation of adding support for emulating Mode-Based Execution Control (MBEC). * Reject KVM_RUN if userspace manages to gain control and stuff invalid guest state while KVM is in the middle of emulating nested VM-Enter. * Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs in anticipation of adding sanity checks for secondary exit controls (the primary field is out of bits). x86 (AMD): * Ensure the PSP driver is initialized when both the PSP and KVM modules are built-in (the initcall framework doesn't handle dependencies). * Use long-term pins when registering encrypted memory regions, so that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to excessive fragmentation. * Add macros and helpers for setting GHCB return/error codes. * Add support for Idle HLT interception, which elides interception if the vCPU has a pending, unmasked virtual IRQ when HLT is executed. * Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical address. * Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g. because the vCPU was "destroyed" via SNP's AP Creation hypercall. * Reject SNP AP Creation if the requested SEV features for the vCPU don't match the VM's configured set of features. Selftests: * Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data instead of executing code. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters. * Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware. * Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration. * Fix a few minor bugs related to handling of stats FDs. * Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation). * Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked). -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfcTkEUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMnQAf/cPx72hJOdNy4Qrm8M33YLXVRVV00 yEZ8eN8TWdOclr0ltE/w/ELGh/qS4CU8pjURAk0A6lPioU+mdcTn3dPEqMDMVYom uOQ2lusEHw0UuSnGZSEjvZJsE/Ro2NSAsHIB6PWRqig1ZBPJzyu0frce34pMpeQH diwriJL9lKPAhBWXnUQ9BKoi1R0P5OLW9ahX4SOWk7cAFg4DLlDE66Nqf6nKqViw DwEucTiUEg5+a3d93gihdD4JNl+fb3vI2erxrMxjFjkacl0qgqRu3ei3DG0MfdHU wNcFSG5B1n0OECKxr80lr1Ip1KTVNNij0Ks+w6Gc6lSg9c4PptnNkfLK3A== =nnCN -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events LoongArch: - Remove unnecessary header include path - Assume constant PGD during VM context switch - Add perf events support for guest VM RISC-V: - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal x86: - Add support for aging of SPTEs without holding mmu_lock. Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs. This includes an implementation of per-rmap-entry locking; aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas locking an rmap for write requires taking both the per-rmap spinlock and the mmu_lock. Note that this decreases slightly the accuracy of accessed-page information, because changes to the SPTE outside aging might not use atomic operations even if they could race against a clear of the Accessed bit. This is deliberate because KVM and mm/ tolerate false positives/negatives for accessed information, and testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2) - Drop "support" for async page faults for protected guests that do not set SEND_ALWAYS (i.e. that only want async page faults at CPL3) - Bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. Particularly, destroy vCPUs before the MMU, despite the latter being a VM-wide operation - Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing - Don't take kvm->lock when iterating over vCPUs in the suspend notifier to fix a largely theoretical deadlock - Use the vCPU's actual Xen PV clock information when starting the Xen timer, as the cached state in arch.hv_clock can be stale/bogus - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend notifier only accounts for kvmclock, and there's no evidence that the flag is actually supported by Xen guests - Clean up the per-vCPU "cache" of its reference pvclock, and instead only track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately expensive to compute, and rarely changes for modern setups) - Don't write to the Xen hypercall page on MSR writes that are initiated by the host (userspace or KVM) to fix a class of bugs where KVM can write to guest memory at unexpected times, e.g. during vCPU creation if userspace has set the Xen hypercall MSR index to collide with an MSR that KVM emulates - Restrict the Xen hypercall MSR index to the unofficial synthetic range to reduce the set of possible collisions with MSRs that are emulated by KVM (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside in the synthetic range) - Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config - Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID entries when updating PV clocks; there is no guarantee PV clocks will be updated between TSC frequency changes and CPUID emulation, and guest reads of the TSC leaves should be rare, i.e. are not a hot path x86 (Intel): - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1 - Pass XFD_ERR as the payload when injecting #NM, as a preparatory step for upcoming FRED virtualization support - Decouple the EPT entry RWX protection bit macros from the EPT Violation bits, both as a general cleanup and in anticipation of adding support for emulating Mode-Based Execution Control (MBEC) - Reject KVM_RUN if userspace manages to gain control and stuff invalid guest state while KVM is in the middle of emulating nested VM-Enter - Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs in anticipation of adding sanity checks for secondary exit controls (the primary field is out of bits) x86 (AMD): - Ensure the PSP driver is initialized when both the PSP and KVM modules are built-in (the initcall framework doesn't handle dependencies) - Use long-term pins when registering encrypted memory regions, so that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to excessive fragmentation - Add macros and helpers for setting GHCB return/error codes - Add support for Idle HLT interception, which elides interception if the vCPU has a pending, unmasked virtual IRQ when HLT is executed - Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical address - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g. because the vCPU was "destroyed" via SNP's AP Creation hypercall - Reject SNP AP Creation if the requested SEV features for the vCPU don't match the VM's configured set of features Selftests: - Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data instead of executing code. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware - Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration - Fix a few minor bugs related to handling of stats FDs - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation) - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits) RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed RISC-V: KVM: Teardown riscv specific bits after kvm_exit LoongArch: KVM: Register perf callbacks for guest LoongArch: KVM: Implement arch-specific functions for guest perf LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel() LoongArch: KVM: Remove PGD saving during VM context switch LoongArch: KVM: Remove unnecessary header include path KVM: arm64: Tear down vGIC on failed vCPU creation KVM: arm64: PMU: Reload when resetting KVM: arm64: PMU: Reload when user modifies registers KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs KVM: arm64: PMU: Assume PMU presence in pmu-emul.c KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected KVM: x86: Add infrastructure for secure TSC KVM: x86: Push down setting vcpu.arch.user_set_tsc ...	2025-03-25 14:22:07 -07:00
Linus Torvalds	2d09a9449e	arm64 updates for 6.15: Perf and PMUs: - Support for the "Rainier" CPU PMU from Arm - Preparatory driver changes and cleanups that pave the way for BRBE support - Support for partial virtualisation of the Apple-M1 PMU - Support for the second event filter in Arm CSPMU designs - Minor fixes and cleanups (CMN and DWC PMUs) - Enable EL2 requirements for FEAT_PMUv3p9 Power, CPU topology: - Support for AMUv1-based average CPU frequency - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds a generic topology_is_primary_thread() function overridden by x86 and powerpc New(ish) features: - MOPS (memcpy/memset) support for the uaccess routines Security/confidential compute: - Fix the DMA address for devices used in Realms with Arm CCA. The CCA architecture uses the address bit to differentiate between shared and private addresses - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by default Memory management clean-ups: - Drop the PD_TABLE_BIT definition in preparation for 128-bit PTEs - Some minor page table accessor clean-ups - PIE/POE (permission indirection/overlay) helpers clean-up Kselftests: - MTE: skip hugetlb tests if MTE is not supported on such mappings and user correct naming for sync/async tag checking modes Miscellaneous: - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people request) - Sysreg updates for new register fields - CPU type info for some Qualcomm Kryo cores -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmfjB2QACgkQa9axLQDI XvGrfg//W3Bx9+jw1G/XHHEQqGEVFmvltvxZUkvgV0Qki0rPSMnappJhZRL9n0Nm V6PvGd2KoKHZuL3g5ViZb3cs2R9BiD2JB6PncwBKuxumHGh3vz3kk1JMkDVfWdHv qAceOckFJD9rXjPZn+PDsfYiEi2i3RRWIP5VglZ14ue8j3prHQ6DJXLUQF2GYvzE /bgLSq44wp5N59ddy23+qH9rxrHzz3bgpbVv/F56W/LErvE873mRmyFwiuGJm+M0 Pn8ra572rI6a4sgSwrMTeNPBU+F9o5AbqwauVhkz428RdMvgfEuW6qHUBnGWJDmt HotXmu+4Eb2KJks/iQkDo4OTJ38yUqvvZZJtP171ms3E4yqESSJngWP6O2A6LF+y xhe0sESF/Ew6jLhM6/hvOmBcE2AyB14JE3ymqLkXbWub4NXddBn2AF1WXFjF4CBw F8KSUhNLekrCYKv1k9M3nhvkcpoS9FkTF/TI+zEg546alI/GLPih6uDRkgMAODh1 RDJYixHsf2NDDRQbfwvt9Xua/KKpDF6qNkHLA4OiqqVUwh1hkas24Lrnp8vmce4o wIpWCLqYWey8Rl3XWuWgWz2Xu58fHH4Dl2k72Z8I0pwp3abCDa9xEj79G0Svk7Si Q+FCYrNlpKee1RXBC+1MUD/Gl5r/28dEUFkAzPD80F7AgafXPd0= =Kc9c -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Nothing major this time around. Apart from the usual perf/PMU updates, some page table cleanups, the notable features are average CPU frequency based on the AMUv1 counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in the uaccess routines. Perf and PMUs: - Support for the 'Rainier' CPU PMU from Arm - Preparatory driver changes and cleanups that pave the way for BRBE support - Support for partial virtualisation of the Apple-M1 PMU - Support for the second event filter in Arm CSPMU designs - Minor fixes and cleanups (CMN and DWC PMUs) - Enable EL2 requirements for FEAT_PMUv3p9 Power, CPU topology: - Support for AMUv1-based average CPU frequency - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds a generic topology_is_primary_thread() function overridden by x86 and powerpc New(ish) features: - MOPS (memcpy/memset) support for the uaccess routines Security/confidential compute: - Fix the DMA address for devices used in Realms with Arm CCA. The CCA architecture uses the address bit to differentiate between shared and private addresses - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by default Memory management clean-ups: - Drop the PD_TABLE_BIT definition in preparation for 128-bit PTEs - Some minor page table accessor clean-ups - PIE/POE (permission indirection/overlay) helpers clean-up Kselftests: - MTE: skip hugetlb tests if MTE is not supported on such mappings and user correct naming for sync/async tag checking modes Miscellaneous: - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people request) - Sysreg updates for new register fields - CPU type info for some Qualcomm Kryo cores" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits) arm64: mm: Don't use %pK through printk perf/arm_cspmu: Fix missing io.h include arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists arm64: cputype: Add MIDR_CORTEX_A76AE arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list arm64/sysreg: Enforce whole word match for open/close tokens arm64/sysreg: Fix unbalanced closing block arm64: Kconfig: Enable HOTPLUG_SMT arm64: topology: Support SMT control on ACPI based system arch_topology: Support SMT control for OF based system cpu/SMT: Provide a default topology_is_primary_thread() arm64/mm: Define PTDESC_ORDER perf/arm_cspmu: Add PMEVFILT2R support perf/arm_cspmu: Generalise event filtering perf/arm_cspmu: Move register definitons to header arm64/kernel: Always use level 2 or higher for early mappings arm64/mm: Drop PXD_TABLE_BIT arm64/mm: Check pmd_table() in pmd_trans_huge() ...	2025-03-25 13:16:16 -07:00
Catalin Marinas	8cc14fdcc1	Merge branches 'for-next/amuv1-avg-freq', 'for-next/pkey_unrestricted', 'for-next/sysreg', 'for-next/misc', 'for-next/pgtable-cleanups', 'for-next/kselftest', 'for-next/uaccess-mops', 'for-next/pie-poe-cleanup', 'for-next/cputype-kryo', 'for-next/cca-dma-address', 'for-next/drop-pxd_table_bit' and 'for-next/spectre-bhb-assume-vulnerable', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf/arm_cspmu: Fix missing io.h include perf/arm_cspmu: Add PMEVFILT2R support perf/arm_cspmu: Generalise event filtering perf/arm_cspmu: Move register definitons to header drivers/perf: apple_m1: Support host/guest event filtering drivers/perf: apple_m1: Refactor event select/filter configuration perf/dwc_pcie: fix duplicate pci_dev devices perf/dwc_pcie: fix some unreleased resources perf/arm-cmn: Minor event type housekeeping perf: arm_pmu: Move PMUv3-specific data perf: apple_m1: Don't disable counter in m1_pmu_enable_event() perf: arm_v7_pmu: Don't disable counter in (armv7\|krait_\|scorpion_)pmu_enable_event() perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() perf: arm_pmu: Don't disable counter in armpmu_add() perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters perf: arm_pmuv3: Add support for ARM Rainier PMU * for-next/amuv1-avg-freq: : Add support for AArch64 AMUv1-based average freq arm64: Utilize for_each_cpu_wrap for reference lookup arm64: Update AMU-based freq scale factor on entering idle arm64: Provide an AMU-based version of arch_freq_get_on_cpu cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry cpufreq: Allow arch_freq_get_on_cpu to return an error arch_topology: init capacity_freq_ref to 0 * for-next/pkey_unrestricted: : mm/pkey: Add PKEY_UNRESTRICTED macro selftest/powerpc/mm/pkey: fix build-break introduced by commit `00894c3fc9` selftests/powerpc: Use PKEY_UNRESTRICTED macro selftests/mm: Use PKEY_UNRESTRICTED macro mm/pkey: Add PKEY_UNRESTRICTED macro * for-next/sysreg: : arm64 sysreg updates arm64/sysreg: Enforce whole word match for open/close tokens arm64/sysreg: Fix unbalanced closing block arm64/sysreg: Add register fields for HFGWTR2_EL2 arm64/sysreg: Add register fields for HFGRTR2_EL2 arm64/sysreg: Add register fields for HFGITR2_EL2 arm64/sysreg: Add register fields for HDFGWTR2_EL2 arm64/sysreg: Add register fields for HDFGRTR2_EL2 arm64/sysreg: Update register fields for ID_AA64MMFR0_EL1 * for-next/misc: : Miscellaneous arm64 patches arm64: mm: Don't use %pK through printk arm64/fpsimd: Remove unused declaration fpsimd_kvm_prepare() * for-next/pgtable-cleanups: : arm64 pgtable accessors cleanup arm64/mm: Define PTDESC_ORDER arm64/kernel: Always use level 2 or higher for early mappings arm64/hugetlb: Consistently use pud_sect_supported() arm64/mm: Convert __pte_to_phys() and __phys_to_pte_val() as functions * for-next/kselftest: : arm64 kselftest updates kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c * for-next/uaccess-mops: : Implement the uaccess memory copy/set using MOPS instructions arm64: lib: Use MOPS for usercopy routines arm64: mm: Handle PAN faults on uaccess CPY* instructions arm64: extable: Add fixup handling for uaccess CPY* instructions * for-next/pie-poe-cleanup: : PIE/POE helpers cleanup arm64/sysreg: Move POR_EL0_INIT to asm/por.h arm64/sysreg: Rename POE_RXW to POE_RWX arm64/sysreg: Improve PIR/POR helpers * for-next/cputype-kryo: : Add cputype info for some Qualcomm Kryo cores arm64: cputype: Add comments about Qualcomm Kryo 5XX and 6XX cores arm64: cputype: Add QCOM_CPU_PART_KRYO_3XX_GOLD * for-next/cca-dma-address: : Fix DMA address for devices used in realms with Arm CCA arm64: realm: Use aliased addresses for device DMA to shared buffers dma: Introduce generic dma_addr_crypted helpers dma: Fix encryption bit clearing for dma_to_phys for-next/drop-pxd_table_bit: : Drop the arm64 PXD_TABLE_BIT (clean-up in preparation for 128-bit PTEs) arm64/mm: Drop PXD_TABLE_BIT arm64/mm: Check pmd_table() in pmd_trans_huge() arm64/mm: Check PUD_TYPE_TABLE in pud_bad() arm64/mm: Check PXD_TYPE_TABLE in [p4d\|pgd]_bad() arm64/mm: Clear PXX_TYPE_MASK and set PXD_TYPE_SECT in [pmd\|pud]_mkhuge() arm64/mm: Clear PXX_TYPE_MASK in mk_[pmd\|pud]_sect_prot() arm64/ptdump: Test PMD_TYPE_MASK for block mapping KVM: arm64: ptdump: Test PMD_TYPE_MASK for block mapping * for-next/spectre-bhb-assume-vulnerable: : Rework Spectre BHB mitigations to not assume "safe" arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists arm64: cputype: Add MIDR_CORTEX_A76AE arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list	2025-03-25 19:32:03 +00:00
Linus Torvalds	317a76a996	Updates for the VDSO infrastructure: - Consolidate the VDSO storage The VDSO data storage and data layout has been largely architecture specific for historical reasons. That increases the maintenance effort and causes inconsistencies over and over. There is no real technical reason for architecture specific layouts and implementations. The architecture specific details can easily be integrated into a generic layout, which also reduces the amount of duplicated code for managing the mappings. Convert all architectures over to a unified layout and common mapping infrastructure. This splits the VDSO data layout into subsystem specific blocks, timekeeping, random and architecture parts, which provides a better structure and allows to improve and update the functionalities without conflict and interaction. - Rework the timekeeping data storage The current implementation is designed for exposing system timekeeping accessors, which was good enough at the time when it was designed. PTP and Time Sensitive Networking (TSN) change that as there are requirements to expose independent PTP clocks, which are not related to system timekeeping. Replace the monolithic data storage by a structured layout, which allows to add support for independent PTP clocks on top while reusing both the data structures and the time accessor implementations. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfgSWUTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoYGED/0f/M8YyacAyErDYW4ufW+zh2sUidSf GVlK0Jn5BMljOoye+y2XfTxuvvXxEDjJNYiJm2uKGPdV29tjNXreGK39XyNqXPu5 jwR4f/IN/QVSM2nCO6jyydMz8ympJ2k6M4RewwmxXBL2KsUzzJWSKTgRNqM5Tdjs 1RhJMjkQVTiiSYerBpHXYCeZLM7/VEfZ120uuzVAYPXo0/R6zuyF7IBgIao9hbfO IQeCMLLfpDQHQhwquTA8ZbWqQusiEoSYHT+kTDa3eXDDbE/2UklAUs9gaatI979x 73zs0Yqxyx2iIGaghACWOAbKdcBWBeCYDw5fFwYVKn4VMQi1+wcxbtOYL767jp9o vfkLXGilXcVkvDjv4fH+e1NoJXXBxq1Ug1silKdOeJzenQF8Q1i3tavkWUVCNfwH qyOIM72NiCEWbYBDcz0lwBxEAyO4o0E6NP1bDc4y50VedEYIbXwSh0QGrdev1abn rjY9vsuUR9oznmZ6BRPPxMTY87gOSHoKvqydgSZUACEgLV9346f5qZf341OReYai MXUmXOM4+LdyaM1+Mec8ppvjMbLw+736NZyZtT2InusEBE+Ddp25L3hYiWnklJu8 2uwv0AoyrwaJ8y6ADOX4thcLZq0gND0Z/Ayz/XvpeI30eftsGUCt5KOVlqwfwOkI 4EQKvk2fAixPxg== =rwei -----END PGP SIGNATURE----- Merge tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO infrastructure updates from Thomas Gleixner: - Consolidate the VDSO storage The VDSO data storage and data layout has been largely architecture specific for historical reasons. That increases the maintenance effort and causes inconsistencies over and over. There is no real technical reason for architecture specific layouts and implementations. The architecture specific details can easily be integrated into a generic layout, which also reduces the amount of duplicated code for managing the mappings. Convert all architectures over to a unified layout and common mapping infrastructure. This splits the VDSO data layout into subsystem specific blocks, timekeeping, random and architecture parts, which provides a better structure and allows to improve and update the functionalities without conflict and interaction. - Rework the timekeeping data storage The current implementation is designed for exposing system timekeeping accessors, which was good enough at the time when it was designed. PTP and Time Sensitive Networking (TSN) change that as there are requirements to expose independent PTP clocks, which are not related to system timekeeping. Replace the monolithic data storage by a structured layout, which allows to add support for independent PTP clocks on top while reusing both the data structures and the time accessor implementations. * tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) sparc/vdso: Always reject undefined references during linking x86/vdso: Always reject undefined references during linking vdso: Rework struct vdso_time_data and introduce struct vdso_clock vdso: Move architecture related data before basetime data powerpc/vdso: Prepare introduction of struct vdso_clock arm64/vdso: Prepare introduction of struct vdso_clock x86/vdso: Prepare introduction of struct vdso_clock time/namespace: Prepare introduction of struct vdso_clock vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct vdso/vsyscall: Prepare introduction of struct vdso_clock vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock vdso/gettimeofday: Prepare introduction of struct vdso_clock vdso/helpers: Prepare introduction of struct vdso_clock vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks vdso: Make vdso_time_data cacheline aligned arm64: Make asm/cache.h compatible with vDSO ...	2025-03-25 11:30:42 -07:00
Eric Dumazet	a7c428ee8f	tcp/dccp: remove icsk->icsk_timeout icsk->icsk_timeout can be replaced by icsk->icsk_retransmit_timer.expires This saves 8 bytes in TCP/DCCP sockets and helps for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250324203607.703850-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 10:34:33 -07:00
Linus Torvalds	d5048d1176	Updates for the core time/timer subsystem: - Fix a memory ordering issue in posix-timers Posix-timer lookup is lockless and reevaluates the timer validity under the timer lock, but the update which validates the timer is not protected by the timer lock. That allows the store to be reordered against the initialization stores, so that the lookup side can observe a partially initialized timer. That's mostly a theoretical problem, but incorrect nevertheless. - Fix a long standing inconsistency of the coarse time getters The coarse time getters read the base time of the current update cycle without reading the actual hardware clock. NTP frequency adjustment can set the base time backwards. The fine grained interfaces compensate this by reading the clock and applying the new conversion factor, but the coarse grained time getters use the base time directly. That allows the user to observe time going backwards. Cure it by always forwarding base time, when NTP changes the frequency with an immediate step. - Rework of posix-timer hashing The posix-timer hash is not scalable and due to the CRIU timer restore mechanism prone to massive contention on the global hash bucket lock. Replace the global hash lock with a fine grained per bucket locking scheme to address that. - Rework the proc/$PID/timers interface. /proc/$PID/timers is provided for CRIU to be able to restore a timer. The printout happens with sighand lock held and interrupts disabled. That's not required as this can be done with RCU protection as well. - Provide a sane mechanism for CRIU to restore a timer ID CRIU restores timers by creating and deleting them until the kernel internal per process ID counter reached the requested ID. That's horribly slow for sparse timer IDs. Provide a prctl() which allows CRIU to restore a timer with a given ID. When enabled the ID pointer is used as input pointer to read the requested ID from user space. When disabled, the normal allocation scheme (next ID) is active as before. This is backwards compatible for both kernel and user space. - Make hrtimer_update_function() less expensive. The sanity checks are valuable, but expensive for high frequency usage in io/uring. Make the debug checks conditional and enable them only when lockdep is enabled. - Small updates, cleanups and improvements -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfgQ6wTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoeQzD/9p+EuUGrMbSNaLVMCYFULBbR0lersJ hrGGoKUsNt5T+f6hEEbSLBnkjZcMIj0J+mdIEUiRa73ryw1KmwLk/8MBu0c6u6q3 musDvJqt3dLTG98yN0YeWK3tJDxhSjxIpwcAXusPQ04j16I2fVXFzDQ/kGPq6MTI tdMYzsS3wjuWpi+CbgRSP2HEwu08fIDVsQ7Grynh4Kmd31apne4ZgF2UVp6UiZyp 8yJHZgVzJcFs7Y3MS6XTgezHnuADxMY1irzbXmok19941X8mZz2QRIpGQX+oMh6o g7SG2lj9i8YbLqU9/5RbC5ppjRcWfogDpW0Lk+OmdOpr0RiXTmx5Lz8Egxex9wG5 pUJszeTY+bLw7mmYmkGZyBz+PNoGgVM5KFZRe5ENvYM8Gy8LUW5DA9zvxeHqDDz1 FiMmKdYrwr8VCKqx+8hJQdzlzRbepxq9sNzDdMKVOUcFdGUVWekfG6ZFkfLKxwzA XDTKJilzXbAAj4r57vEvOCYLUZH/ZsFK4yyg0O53fEg6fj87EbTDb5+YUGazb3+C yNTEOQIT8LtutzLR9+xeLi92k+6zlJ4c1PfqBx5Kv/TwBrIfV1P8N2c6TCOWDoRM AOvo2SXEA/jEPix2GjT5jalSV1mROEXo2T9/G7kz4H7K+DkI/dGgS9mXyUDO2mMd ouOxYN0GohVqTQ== =XUGH -----END PGP SIGNATURE----- Merge tag 'timers-core-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Fix a memory ordering issue in posix-timers Posix-timer lookup is lockless and reevaluates the timer validity under the timer lock, but the update which validates the timer is not protected by the timer lock. That allows the store to be reordered against the initialization stores, so that the lookup side can observe a partially initialized timer. That's mostly a theoretical problem, but incorrect nevertheless. - Fix a long standing inconsistency of the coarse time getters The coarse time getters read the base time of the current update cycle without reading the actual hardware clock. NTP frequency adjustment can set the base time backwards. The fine grained interfaces compensate this by reading the clock and applying the new conversion factor, but the coarse grained time getters use the base time directly. That allows the user to observe time going backwards. Cure it by always forwarding base time, when NTP changes the frequency with an immediate step. - Rework of posix-timer hashing The posix-timer hash is not scalable and due to the CRIU timer restore mechanism prone to massive contention on the global hash bucket lock. Replace the global hash lock with a fine grained per bucket locking scheme to address that. - Rework the proc/$PID/timers interface. /proc/$PID/timers is provided for CRIU to be able to restore a timer. The printout happens with sighand lock held and interrupts disabled. That's not required as this can be done with RCU protection as well. - Provide a sane mechanism for CRIU to restore a timer ID CRIU restores timers by creating and deleting them until the kernel internal per process ID counter reached the requested ID. That's horribly slow for sparse timer IDs. Provide a prctl() which allows CRIU to restore a timer with a given ID. When enabled the ID pointer is used as input pointer to read the requested ID from user space. When disabled, the normal allocation scheme (next ID) is active as before. This is backwards compatible for both kernel and user space. - Make hrtimer_update_function() less expensive. The sanity checks are valuable, but expensive for high frequency usage in io/uring. Make the debug checks conditional and enable them only when lockdep is enabled. - Small updates, cleanups and improvements * tag 'timers-core-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) selftests/timers: Improve skew_consistency by testing with other clockids timekeeping: Fix possible inconsistencies in _COARSE clockids posix-timers: Drop redundant memset() invocation selftests/timers/posix-timers: Add a test for exact allocation mode posix-timers: Provide a mechanism to allocate a given timer ID posix-timers: Dont iterate /proc/$PID/timers with sighand:: Siglock held posix-timers: Make per process list RCU safe posix-timers: Avoid false cacheline sharing posix-timers: Switch to jhash32() posix-timers: Improve hash table performance posix-timers: Make signal_struct:: Next_posix_timer_id an atomic_t posix-timers: Make lock_timer() use guard() posix-timers: Rework timer removal posix-timers: Simplify lock/unlock_timer() posix-timers: Use guards in a few places posix-timers: Remove SLAB_PANIC from kmem cache posix-timers: Remove a few paranoid warnings posix-timers: Cleanup includes posix-timers: Add cond_resched() to posix_timer_add() search loop posix-timers: Initialise timer before adding it to the hash table ...	2025-03-25 10:33:23 -07:00
Oleg Nesterov	8661bb9c71	selftests/pidfd: fixes syscall number defines I had to spend some (a lot;) time to understand why pidfd_info_test (and more) fails with my patch under qemu on my machine ;) Until I applied the patch below. I think it is a bad idea to do the things like #ifndef __NR_clone3 #define __NR_clone3 -1 #endif because this can hide a problem. My working laptop runs Fedora-23 which doesn't have __NR_clone3/etc in /usr/include/. So "make" happily succeeds, but everything fails and it is not clear why. Link: https://lore.kernel.org/r/20250323174518.GB834@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-25 14:59:05 +01:00
Filipe Xavier	474eecc882	selftests: livepatch: test if ftrace can trace a livepatched function This new test makes sure that ftrace can trace a function that was introduced by a livepatch. Signed-off-by: Filipe Xavier <felipeaggger@gmail.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250324-ftrace-sftest-livepatch-v3-2-d9d7cc386c75@gmail.com Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-03-25 14:52:32 +01:00
Filipe Xavier	2ca7cd8020	selftests: livepatch: add new ftrace helpers functions Add new ftrace helpers functions cleanup_tracing, trace_function and check_traced_functions. Signed-off-by: Filipe Xavier <felipeaggger@gmail.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250324-ftrace-sftest-livepatch-v3-1-d9d7cc386c75@gmail.com Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-03-25 14:46:02 +01:00
Yi Liu	d57a1fb342	iommufd/selftest: Add coverage for iommufd pasid attach/detach This tests iommufd pasid attach/replace/detach. Link: https://patch.msgid.link/r/20250321171940.7213-19-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-25 10:18:31 -03:00
Dmitry Safonov	edbac739e4	selftests/net: Drop timeout argument from test_client_verify() It's always TEST_TIMEOUT_SEC, with an unjustified exception in rst test, that is more paranoia-long timeout rather than based on requirements. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-7-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:30 -07:00
Dmitry Safonov	1e1738faa2	selftests/net: Delete timeout from test_connect_socket() Unused: it's always either the default timeout or asynchronous connect(). Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-6-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:30 -07:00
Dmitry Safonov	266ed1ace8	selftests/net: Print the testing side in unsigned-md5 As both client and server print the same test name on failure or pass, add "[server]" so that it's more obvious from a log which side printed "ok" or "not ok". Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-5-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:30 -07:00
Dmitry Safonov	3f36781e57	selftests/net: Add mixed select()+polling mode to TCP-AO tests Currently, tcp_ao tests have two timeouts: TEST_RETRANSMIT_SEC and TEST_TIMEOUT_SEC [by default 1 and 5 seconds]. The first one, TEST_RETRANSMIT_SEC is used for operations that are expected to succeed in order for a test to pass. It is usually not consumed and exists only to avoid indefinite test run if the operation didn't complete. The second one, TEST_RETRANSMIT_SEC exists for the tests that checking operations, that are expected to fail/timeout. It is shorter as it is fully consumed, with an expectation that if operation didn't succeed during that period, it will timeout. And the related test that expects the timeout is passing. The actual operation failure is then cross-verified by other means like counters checks. The issue with TEST_RETRANSMIT_SEC timeout is that 1 second is the exact initial TCP timeout. So, in case the initial segment gets lost (quite unlikely on local veth interface between two net namespaces, yet happens in slow VMs), the retransmission never happens and as a result, the test is not actually testing the functionality. Which in the end fails counters checks. As I want tcp_ao selftests to be fast and finishing in a reasonable amount of time on manual run, I didn't consider increasing TEST_RETRANSMIT_SEC. Rather, initially, BPF_SOCK_OPS_TIMEOUT_INIT looked promising as a lever to make the initial TCP timeout shorter. But as it's not a socket bpf attached thing, but sock_ops (attaches to cgroups), the selftests would have to use libbpf, which I wanted to avoid if not absolutely required. Instead, use a mixed select() and counters polling mode with the longer TEST_TIMEOUT_SEC timeout to detect running-away failed tests. It actually not only allows losing segments and succeeding after the previous TEST_RETRANSMIT_SEC timeout was consumed, but makes the tests expecting timeout/failure pass faster. The only test case taking longer (TEST_TIMEOUT_SEC) now is connect-deny "wrong snd id", which checks for no key on SYN-ACK for which there is no counter in the kernel (see tcp_make_synack()). Yet it can be speed up by poking skpair from the trace event (see trace_tcp_ao_synack_no_key). Fixes: `ed9d09b309` ("selftests/net: Add a test for TCP-AO keys matching") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20241205070656.6ef344d7@kernel.org/ Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-4-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:30 -07:00
Dmitry Safonov	5a0a3193f6	selftests/net: Fetch and check TCP-MD5 counters There are related TCP-MD5 <=> TCP and TCP-MD5 <=> TCP-AO tests that can benefit from checking the related counters, not only from validating operations timeouts. It also prepares the code for introduction of mixed select()+poll mode, see the follow-up patches. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-3-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:30 -07:00
Dmitry Safonov	1fe4221093	selftests/net: Provide tcp-ao counters comparison helper Rename __test_tcp_ao_counters_cmp() into test_assert_counters_ao() and test_tcp_ao_key_counters_cmp() into test_assert_counters_key() as they are asserts, rather than just compare functions. Provide test_cmp_counters() helper, that's going to be used to compare ao_info and netns counters as a stop condition for polling the sockets. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-2-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:29 -07:00
Dmitry Safonov	65ffdf31be	selftests/net: Print TCP flags in more common format Before: ># 13145[lib/ftrace-tcp.c:427] trace event filter tcp_ao_key_not_found [2001:db8:1::1:-1 => 2001:db8:254::1:7010, L3index 0, flags: !FS!R!P!., keyid: 100, rnext: 100, maclen: -1, sne: -1] = 1 After: ># 13487[lib/ftrace-tcp.c:427] trace event filter tcp_ao_key_not_found [2001:db8:1::1:-1 => 2001:db8:254::1:7010, L3index 0, flags: S, keyid: 100, rnext: 100, maclen: -1, sne: -1] = 1 For the history, I think the initial format was to emphasize the absence of flags as well as their presence (!R meant no RST flag). But looking again, it's just unreadable and hard to understand. Make it the standard/expected one. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20250319-tcp-ao-selftests-polling-v2-1-da48040153d1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-25 06:10:29 -07:00
Song Liu	59481b8bd0	selftest/livepatch: Only run test-kprobe with CONFIG_KPROBES_ON_FTRACE CONFIG_KPROBES_ON_FTRACE is required for test-kprobe. Skip test-kprobe when CONFIG_KPROBES_ON_FTRACE is not set. Since some kernel may not have /proc/config.gz, grep for kprobe_ftrace_ops from /proc/kallsyms to check whether CONFIG_KPROBES_ON_FTRACE is enabled. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/r/20250318181518.1055532-1-song@kernel.org [pmladek@suse.com: Call grep with -q option.] Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-03-25 13:36:40 +01:00
Linus Torvalds	a49a879f0a	Miscellaneous x86 cleanups by Arnd Bergmann, Charles Han, Mirsad Todorovac, Randy Dunlap, Thorsten Blum and Zhang Kunbo. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfeo6ERHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gl9w//ZRxGhf3eXYVyPW35aoqaRe7W3fBJhaPn eLP0JuuglZO3JSTJfmbTmSXrNN4kkzI3nnuTnchK0iYZ+2Tn4E0iaz4tm5sjL5T1 KW9ajv7QQxMOwXS69hrklY/eGxVimfs4mIyhxEhxLjPvzbGJOVS6WFgvaYH5klQQ 7sPYXzrNGZJhGCmRqCelHSPhl7b6YVS/yWtMe+BpPBn1AqIQN+O+S/Kbu7dQokmq 7MNy4eUe7GYgnSB7Qhq/jNSqjmGWsVwhMiuoDxx7GvShM73+kYatQZT0H+qzTxzp 90rIPcPTNCOlKJO3xSWqXStcMH00MbplOhKdEU6SzeM+xC2aD2t4GoIG0EFxIQdS X0wh40Psm5GehThhpmMBJdmM4Le4TTSkHBhedpQ+sp+6BIX4A9hoeCoybIlcngaJ W89YKHqC5ruPSRDOzyaTMSypaEGH5VHP6AMj0CmkuyHRbxADsMazifpOVOw9AvVk IR0USCzyenjLMiugRrJgRpy8R2Q1jp35bP0e7Y4QvmydEGC1Wl5mW5Y8B9Zdlk0C iuxoqwntem7PAkEA2JejW6rzuRICxXloKv+xjtQjKzXpK+pwhhSp830zT8gaWZ1Z hdQRz7gSlJg3JzWinX8+XNusdHw9TxCUqwgLw8486nzNp01GU62gby5DWgECEdae 2IyRRSd0FyA= =GsK4 -----END PGP SIGNATURE----- Merge tag 'x86-cleanups-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Miscellaneous x86 cleanups by Arnd Bergmann, Charles Han, Mirsad Todorovac, Randy Dunlap, Thorsten Blum and Zhang Kunbo" * tag 'x86-cleanups-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/coco: Replace 'static const cc_mask' with the newly introduced cc_get_mask() function x86/delay: Fix inconsistent whitespace selftests/x86/syscall: Fix coccinelle WARNING recommending the use of ARRAY_SIZE() x86/platform: Fix missing declaration of 'x86_apple_machine' x86/irq: Fix missing declaration of 'io_apic_irqs' x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()	2025-03-24 22:39:53 -07:00
Linus Torvalds	71b639af06	x86/fpu updates for v6.15: - Improve crypto performance by making kernel-mode FPU reliably usable in softirqs ((Eric Biggers) - Fully optimize out WARN_ON_FPU() (Eric Biggers) - Initial steps to support Support Intel APX (Advanced Performance Extensions) (Chang S. Bae) - Fix KASAN for arch_dup_task_struct() (Benjamin Berg) - Refine and simplify the FPU magic number check during signal return (Chang S. Bae) - Fix inconsistencies in guest FPU xfeatures (Chao Gao, Stanislav Spassov) - selftests/x86/xstate: Introduce common code for testing extended states (Chang S. Bae) - Misc fixes and cleanups (Borislav Petkov, Colin Ian King, Uros Bizjak) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfepZQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1grLw/+Lt71PQWu/uVt3/dU0VkpppTqRmUTujuR XVwFlBVxMFw9A8wi56bLD3WT28mKWr0SepynBt1Wr7/pZCnaW4SH/91ERChRqTzU ifawmqQqhrVhFlXziDOKp9lcDy9f7NjJVKBfEBa1VBQGC0Q8FzeasOhCwTJw8qXa Lj2z8zenhDq6NBkk22VlScc1CvsUBiXppvm8uk3md7HKaDqdzmCakm/a0FgaEppt K6P/u1iaVvGXme2CHSskCshkYoEFIF6LRNYnkVloXsVV4AeWeLaJ54xW+syPd/4H EX6oMLifdXCmxmsmi3LPRUjBgdSMsDaAkLsPNoX1w4uWjsihqyUl4wTEpobMIuu1 PWOrCxQKxtaGoECJ0nsE4uR7MZEQ0sYCGKv4JSiiXeUVf/EDRuUBfvBCoGlDWXYA GZcoMmH+BtcYuQdzbrc/vWkfo3bpTxL3x5zsgtFkQL3xyzhugRPNgeOn9yuSZ9x6 AD8QS0G6uRcc9a5ZeeTEcYxoIE/+vvfnh1wfMjisehkVn179ixfZdgcSGrZJUNyH a7LmUmwLvLYNO5DUUGs1upN8YHP7jiohNc/r2ZC1IX5LHbuK1gyXA4xo6hAZ8r/7 XZ2FfRp0dr3PvuWIwq2v6T5JHvALADsKCAjqvNdwHLkxF87ygixT+B87wTA3H6ov LSY10A/eRx4= =U7PR -----END PGP SIGNATURE----- Merge tag 'x86-fpu-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu updates from Ingo Molnar: - Improve crypto performance by making kernel-mode FPU reliably usable in softirqs ((Eric Biggers) - Fully optimize out WARN_ON_FPU() (Eric Biggers) - Initial steps to support Support Intel APX (Advanced Performance Extensions) (Chang S. Bae) - Fix KASAN for arch_dup_task_struct() (Benjamin Berg) - Refine and simplify the FPU magic number check during signal return (Chang S. Bae) - Fix inconsistencies in guest FPU xfeatures (Chao Gao, Stanislav Spassov) - selftests/x86/xstate: Introduce common code for testing extended states (Chang S. Bae) - Misc fixes and cleanups (Borislav Petkov, Colin Ian King, Uros Bizjak) * tag 'x86-fpu-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu/xstate: Fix inconsistencies in guest FPU xfeatures x86/fpu: Clarify the "xa" symbolic name used in the XSTATE* macros x86/fpu: Use XSAVE{,OPT,C,S} and XRSTOR{,S} mnemonics in xstate.h x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs x86/fpu/xstate: Simplify print_xstate_features() x86/fpu: Refine and simplify the magic number check during signal return selftests/x86/xstate: Fix spelling mistake "hader" -> "header" x86/fpu: Avoid copying dynamic FP state from init_task in arch_dup_task_struct() vmlinux.lds.h: Remove entry to place init_task onto init_stack selftests/x86/avx: Add AVX tests selftests/x86/xstate: Clarify supported xstates selftests/x86/xstate: Consolidate test invocations into a single entry selftests/x86/xstate: Introduce signal ABI test selftests/x86/xstate: Refactor ptrace ABI test selftests/x86/xstate: Refactor context switching test selftests/x86/xstate: Enumerate and name xstate components selftests/x86/xstate: Refactor XSAVE helpers for general use selftests/x86: Consolidate redundant signal helper functions x86/fpu: Fix guest FPU state buffer allocation size x86/fpu: Fully optimize out WARN_ON_FPU()	2025-03-24 22:27:18 -07:00
Linus Torvalds	e34c38057a	[ Merge note: this pull request depends on you having merged two locking commits in the locking tree, part of the locking-core-2025-03-22 pull request. ] x86 CPU features support: - Generate the <asm/cpufeaturemasks.h> header based on build config (H. Peter Anvin, Xin Li) - x86 CPUID parsing updates and fixes (Ahmed S. Darwish) - Introduce the 'setcpuid=' boot parameter (Brendan Jackman) - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan Jackman) - Utilize CPU-type for CPU matching (Pawan Gupta) - Warn about unmet CPU feature dependencies (Sohil Mehta) - Prepare for new Intel Family numbers (Sohil Mehta) Percpu code: - Standardize & reorganize the x86 percpu layout and related cleanups (Brian Gerst) - Convert the stackprotector canary to a regular percpu variable (Brian Gerst) - Add a percpu subsection for cache hot data (Brian Gerst) - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak) - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak) MM: - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction (Rik van Riel) - Rework ROX cache to avoid writable copy (Mike Rapoport) - PAT: restore large ROX pages after fragmentation (Kirill A. Shutemov, Mike Rapoport) - Make memremap(MEMREMAP_WB) map memory as encrypted by default (Kirill A. Shutemov) - Robustify page table initialization (Kirill A. Shutemov) - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn) - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW (Matthew Wilcox) KASLR: - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir Singh) CPU bugs: - Implement FineIBT-BHI mitigation (Peter Zijlstra) - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta) - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta) - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta) System calls: - Break up entry/common.c (Brian Gerst) - Move sysctls into arch/x86 (Joel Granados) Intel LAM support updates: (Maciej Wieczor-Retman) - selftests/lam: Move cpu_has_la57() to use cpuinfo flag - selftests/lam: Skip test if LAM is disabled - selftests/lam: Test get_user() LAM pointer handling AMD SMN access updates: - Add SMN offsets to exclusive region access (Mario Limonciello) - Add support for debugfs access to SMN registers (Mario Limonciello) - Have HSMP use SMN through AMD_NODE (Yazen Ghannam) Power management updates: (Patryk Wlazlyn) - Allow calling mwait_play_dead with an arbitrary hint - ACPI/processor_idle: Add FFH state handling - intel_idle: Provide the default enter_dead() handler - Eliminate mwait_play_dead_cpuid_hint() Bootup: Build system: - Raise the minimum GCC version to 8.1 (Brian Gerst) - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor) Kconfig: (Arnd Bergmann) - Add cmpxchg8b support back to Geode CPUs - Drop 32-bit "bigsmp" machine support - Rework CONFIG_GENERIC_CPU compiler flags - Drop configuration options for early 64-bit CPUs - Remove CONFIG_HIGHMEM64G support - Drop CONFIG_SWIOTLB for PAE - Drop support for CONFIG_HIGHPTE - Document CONFIG_X86_INTEL_MID as 64-bit-only - Remove old STA2x11 support - Only allow CONFIG_EISA for 32-bit Headers: - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers (Thomas Huth) Assembly code & machine code patching: - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf) - x86/alternatives: Simplify callthunk patching (Peter Zijlstra) - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf) - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf) - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra) - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> (Uros Bizjak) - Use named operands in inline asm (Uros Bizjak) - Improve performance by using asm_inline() for atomic locking instructions (Uros Bizjak) Earlyprintk: - Harden early_serial (Peter Zijlstra) NMI handler: - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() (Waiman Long) Miscellaneous fixes and cleanups: - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner, Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly Kuznetsov, Xin Li, liuye. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfenkQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g1FRAAi6OFTSn/5aeLMI0IMNBxJ6ddQiFc3imd 7+C/vU5nul4CyDs8mKyj/+f/DDrbkG9lKz3VG631Yl237lXHjD8XWcVMeC/1z/q0 3zInDIloE9/nBHRPkF6F7fARBLBZ0LFgaBsGrCo7mwpGybiQdqGcqcxllvTbtXaw OHta4q6ok+lBDNlfc0v6H4cRnzhmmlKu6Ng0j6UI3V7uFhi3vtxas32ltDQtzorq 2+jbV6/+kbrrv+xPC+jlzOFhTEKRupNPQXmvyQteoQg6G3kqAKMDvBthGXd1rHuX Qa+BoDIifE/2NiVeRwNrhoqYH/pHCzUzDREW5IW8+ca+4XNKuzAC6EuC8CeCzyK1 q8ZjZjooQW4zEeVFeJYllHONzJYfxfSH5CLsnbcuhq99yfGlrQhF1qL72/Omn1w/ DfPJM8Zt5zyKvLqUg3Md+fkVCO2wyDNhB61QPzRgHF+yD+rvuDpoqvUWir+w7cSn fwEDVZGXlFx6dumtSrqRaTd1nvFt80s8yP2ll09DMvGQ8D/yruS7hndGAmmJVCSW NAfd8pSjq5v2+ux2UR92/Cc3VF3SjaUqHBOp/Nq9rESya18ZVa3cJpHhVYYtPIVf THW0h07RIkGVKs1uq+5ekLCr/8uAZg58UPIqmhTuW0ttymRHCNfohR45FQZzy+0M tJj1oc2TIZw= =Dcb3 -----END PGP SIGNATURE----- Merge tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Ingo Molnar: "x86 CPU features support: - Generate the <asm/cpufeaturemasks.h> header based on build config (H. Peter Anvin, Xin Li) - x86 CPUID parsing updates and fixes (Ahmed S. Darwish) - Introduce the 'setcpuid=' boot parameter (Brendan Jackman) - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan Jackman) - Utilize CPU-type for CPU matching (Pawan Gupta) - Warn about unmet CPU feature dependencies (Sohil Mehta) - Prepare for new Intel Family numbers (Sohil Mehta) Percpu code: - Standardize & reorganize the x86 percpu layout and related cleanups (Brian Gerst) - Convert the stackprotector canary to a regular percpu variable (Brian Gerst) - Add a percpu subsection for cache hot data (Brian Gerst) - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak) - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak) MM: - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction (Rik van Riel) - Rework ROX cache to avoid writable copy (Mike Rapoport) - PAT: restore large ROX pages after fragmentation (Kirill A. Shutemov, Mike Rapoport) - Make memremap(MEMREMAP_WB) map memory as encrypted by default (Kirill A. Shutemov) - Robustify page table initialization (Kirill A. Shutemov) - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn) - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW (Matthew Wilcox) KASLR: - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir Singh) CPU bugs: - Implement FineIBT-BHI mitigation (Peter Zijlstra) - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta) - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta) - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta) System calls: - Break up entry/common.c (Brian Gerst) - Move sysctls into arch/x86 (Joel Granados) Intel LAM support updates: (Maciej Wieczor-Retman) - selftests/lam: Move cpu_has_la57() to use cpuinfo flag - selftests/lam: Skip test if LAM is disabled - selftests/lam: Test get_user() LAM pointer handling AMD SMN access updates: - Add SMN offsets to exclusive region access (Mario Limonciello) - Add support for debugfs access to SMN registers (Mario Limonciello) - Have HSMP use SMN through AMD_NODE (Yazen Ghannam) Power management updates: (Patryk Wlazlyn) - Allow calling mwait_play_dead with an arbitrary hint - ACPI/processor_idle: Add FFH state handling - intel_idle: Provide the default enter_dead() handler - Eliminate mwait_play_dead_cpuid_hint() Build system: - Raise the minimum GCC version to 8.1 (Brian Gerst) - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor) Kconfig: (Arnd Bergmann) - Add cmpxchg8b support back to Geode CPUs - Drop 32-bit "bigsmp" machine support - Rework CONFIG_GENERIC_CPU compiler flags - Drop configuration options for early 64-bit CPUs - Remove CONFIG_HIGHMEM64G support - Drop CONFIG_SWIOTLB for PAE - Drop support for CONFIG_HIGHPTE - Document CONFIG_X86_INTEL_MID as 64-bit-only - Remove old STA2x11 support - Only allow CONFIG_EISA for 32-bit Headers: - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers (Thomas Huth) Assembly code & machine code patching: - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf) - x86/alternatives: Simplify callthunk patching (Peter Zijlstra) - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf) - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf) - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra) - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> (Uros Bizjak) - Use named operands in inline asm (Uros Bizjak) - Improve performance by using asm_inline() for atomic locking instructions (Uros Bizjak) Earlyprintk: - Harden early_serial (Peter Zijlstra) NMI handler: - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() (Waiman Long) Miscellaneous fixes and cleanups: - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner, Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly Kuznetsov, Xin Li, liuye" * tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits) zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault x86/asm: Make asm export of __ref_stack_chk_guard unconditional x86/mm: Only do broadcast flush from reclaim if pages were unmapped perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones perf/x86/intel, x86/cpu: Simplify Intel PMU initialization x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions x86/asm: Use asm_inline() instead of asm() in clwb() x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h> x86/hweight: Use asm_inline() instead of asm() x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm() x86/hweight: Use named operands in inline asm() x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP x86/head/64: Avoid Clang < 17 stack protector in startup code x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro x86/cpu/intel: Limit the non-architectural constant_tsc model checks x86/mm/pat: Replace Intel x86_model checks with VFM ones x86/cpu/intel: Fix fast string initialization for extended Families ...	2025-03-24 22:06:11 -07:00
Linus Torvalds	32b22538be	Scheduler updates for v6.15: [ Merge note, these two commits are identical: - `f3fa0e40df` ("sched/clock: Don't define sched_clock_irqtime as static key") - `b9f2b29b94` ("sched: Don't define sched_clock_irqtime as static key") The first one is a cherry-picked version of the second, and the first one is already upstream. ] Core & fair scheduler changes: - Cancel the slice protection of the idle entity (Zihan Zhou) - Reduce the default slice to avoid tasks getting an extra tick (Zihan Zhou) - Force propagating min_slice of cfs_rq when {en,de}queue tasks (Tianchen Ding) - Refactor can_migrate_task() to elimate looping (I Hsin Cheng) - Add unlikey branch hints to several system calls (Colin Ian King) - Optimize current_clr_polling() on certain architectures (Yujun Dong) Deadline scheduler: (Juri Lelli) - Remove redundant dl_clear_root_domain call - Move dl_rebuild_rd_accounting to cpuset.h Uclamp: - Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan) - Optimize sched_uclamp_used static key enabling (Xuewen Yan) Scheduler topology support: (Juri Lelli) - Ignore special tasks when rebuilding domains - Add wrappers for sched_domains_mutex - Generalize unique visiting of root domains - Rebuild root domain accounting after every update - Remove partition_and_rebuild_sched_domains - Stop exposing partition_sched_domains_locked RSEQ: (Michael Jeanson) - Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y - Fix segfault on registration when rseq_cs is non-zero - selftests: Add rseq syscall errors test - selftests: Ensure the rseq ABI TLS is actually 1024 bytes Membarriers: - Fix redundant load of membarrier_state (Nysal Jan K.A.) Scheduler debugging: - Introduce and use preempt_model_str() (Sebastian Andrzej Siewior) - Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar) Fixes and cleanups: - Always save/restore x86 TSC sched_clock() on suspend/resume (Guilherme G. Piccoli) - Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian Andrzej Siewior) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfejsoRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1ivkhAAwBF2tYRBS1oIHcC/OKK3JJoHVDp2LFbU 9sm5S3ZlGD/Ns2fbpY+9A8UFgUFfjYiTSV7hvf2B9Vge0XSxTmMNFu/MdxLBbo9r w6GSeNcNDQKpjEGLkrmPFsa2fiYI4dmH0IzDbS9V2cNPk470QBKjAKXNPaSER691 n2wLnQq+m5o4gXnPjnSz6RrrzisRnm2GOWnDV/iqR47pZFNlX2wWlo3s5r7//Hw0 a+QfEfpgKehhy/VSDXmSAgpqnNffjc78yBV6LNoVUddwahnOWiQMS3XViOqgy5VO jUGBrzW+sKkdRMBppxwJ/0XWgHGC27amIgnU0ZE5u+eiUEu8H9qWl1cRCFxyeB0O 8+WNfwmkH+FPWUdsn84kdePhSsZy6HfM6h44Xe0hx1V7tQXEXfbPzK3TnQg8Ktt1 Ky6ctbZt4cGpqGQuIqvba21A/racrD/DgvB7mHeZksnqZoKTDwxhT/nlQGpuwPoy SJYd1ynFVJvfC69SwMdwnaglimvEZx1GfT0o5XtCMslY5NkWCou5u+e65WX7ccU5 94wBCwI1/+KiFMJZp6TlPw07Q/Hsj9dDryxLc3OunU3zMVt3++1GZnBS/eb5FN1A 5TlAEpxgH9c5Q4/XJFCvx21DrTVHSuIrR6naK91bgqHo0cEfaHGtO/ejPtJGSfxe YIFnnu1dhRw= =SuQE -----END PGP SIGNATURE----- Merge tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core & fair scheduler changes: - Cancel the slice protection of the idle entity (Zihan Zhou) - Reduce the default slice to avoid tasks getting an extra tick (Zihan Zhou) - Force propagating min_slice of cfs_rq when {en,de}queue tasks (Tianchen Ding) - Refactor can_migrate_task() to elimate looping (I Hsin Cheng) - Add unlikey branch hints to several system calls (Colin Ian King) - Optimize current_clr_polling() on certain architectures (Yujun Dong) Deadline scheduler: (Juri Lelli) - Remove redundant dl_clear_root_domain call - Move dl_rebuild_rd_accounting to cpuset.h Uclamp: - Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan) - Optimize sched_uclamp_used static key enabling (Xuewen Yan) Scheduler topology support: (Juri Lelli) - Ignore special tasks when rebuilding domains - Add wrappers for sched_domains_mutex - Generalize unique visiting of root domains - Rebuild root domain accounting after every update - Remove partition_and_rebuild_sched_domains - Stop exposing partition_sched_domains_locked RSEQ: (Michael Jeanson) - Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y - Fix segfault on registration when rseq_cs is non-zero - selftests: Add rseq syscall errors test - selftests: Ensure the rseq ABI TLS is actually 1024 bytes Membarriers: - Fix redundant load of membarrier_state (Nysal Jan K.A.) Scheduler debugging: - Introduce and use preempt_model_str() (Sebastian Andrzej Siewior) - Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar) Fixes and cleanups: - Always save/restore x86 TSC sched_clock() on suspend/resume (Guilherme G. Piccoli) - Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian Andrzej Siewior)" * tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling() sched/debug: Remove CONFIG_SCHED_DEBUG sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional sched/debug: Make 'const_debug' tunables unconditional __read_mostly sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE() rseq/selftests: Fix namespace collision with rseq UAPI header include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h sched/topology: Stop exposing partition_sched_domains_locked cgroup/cpuset: Remove partition_and_rebuild_sched_domains sched/topology: Remove redundant dl_clear_root_domain call sched/deadline: Rebuild root domain accounting after every update sched/deadline: Generalize unique visiting of root domains sched/topology: Wrappers for sched_domains_mutex sched/deadline: Ignore special tasks when rebuilding domains tracing: Use preempt_model_str() xtensa: Rely on generic printing of preemption model x86: Rely on generic printing of preemption model s390: Rely on generic printing of preemption model ...	2025-03-24 21:28:12 -07:00
Linus Torvalds	3ba7dfb8da	RCU pull request for v6.15 This pull request contains the following branches: docs.2025.02.04a: - Add broken-timing possibility to stallwarn.rst. - Improve discussion of this_cpu_ptr(), add raw_cpu_ptr(). - Document self-propagating callbacks. - Point call_srcu() to call_rcu() for detailed memory ordering. - Add CONFIG_RCU_LAZY delays to call_rcu() kernel-doc header. - Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text. - Remove references to old grace-period-wait primitives. srcu.2025.02.05a: - Introduce srcu_read_{un,}lock_fast(), which is similar to srcu_read_{un,}lock_lite(): avoid smp_mb()s in lock and unlock at the cost of calling synchronize_rcu() in synchronize_srcu(). Moreover, by returning the percpu offset of the counter at srcu_read_lock_fast() time, srcu_read_unlock_fast() can save extra pointer dereferencing, which makes it faster than srcu_read_{un,}lock_lite(). srcu_read_{un,}lock_fast() are intended to replace rcu_read_{un,}lock_trace() if possible. torture.2025.02.05a: - Add get_torture_init_jiffies() to return the start time of the test. - Add a test_boost_holdoff module parameter to allow delaying boosting tests when building rcutorture as built-in. - Add grace period sequence number logging at the beginning and end of failure/close-call results. - Switch to hexadecimal for the expedited grace period sequence number in the rcu_exp_grace_period trace point. - Make cur_ops->format_gp_seqs take buffer length. - Move RCU_TORTURE_TEST_{CHK_RDR_STATE,LOG_CPU} to bool. - Complain when invalid SRCU reader_flavor is specified. - Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing, which forces SRCU uses atomics even when percpu ops are NMI safe, and use the Kconfig for SRCU lockdep testing. misc.2025.03.04a: - Split rcu_report_exp_cpu_mult() mask parameter and use for tracing. - Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes(). - Fix get_state_synchronize_rcu_full() GP-start detection. - Move RCU Tasks self-tests to core_initcall(). - Print segment lengths in show_rcu_nocb_gp_state(). - Make RCU watch ct_kernel_exit_state() warning. - Flush console log from kernel_power_off(). - rcutorture: Allow a negative value for nfakewriters. - rcu: Update TREE05.boot to test normal synchronize_rcu(). - rcu: Use _full() API to debug synchronize_rcu(). lazypreempt.2025.03.04a: Make RCU handle PREEMPT_LAZY better: - Fix header guard for rcu_all_qs(). - rcu: Rename PREEMPT_AUTO to PREEMPT_LAZY. - Update __cond_resched comment about RCU quiescent states. - Handle unstable rdp in rcu_read_unlock_strict(). - Handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y. - osnoise: Provide quiescent states. - Adjust rcutorture with possible PREEMPT_RCU=n && PREEMPT_COUNT=y combination. - Limit PREEMPT_RCU configurations. - Make rcutorture senario TREE07 and senario TREE10 use PREEMPT_LAZY=y. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEj5IosQTPz8XU1wRHSXnow7UH+rgFAmfeBLQACgkQSXnow7UH +rh11Qf/Rt6IZJ/YT/V9Sd+8hMx4O0BMh779pr9cD6mbAG+FDk2Yeva1m8vIdFOb qId6oc8K/ef2JfFjSn0oHMzQP2D3XUyiJWPNbBDHv/D8Os8GZgjzu8dkxVkSbdbY OxtvIflbcqFN1JDJfGKZnTEW0/YxGqfnS9b6R7iyyA7SOGQ/WubGOE5qNCqPufc9 zJiP+qTUFYQzCIiPlEJul39o9KboPogbt3QAAQjWmi3utd77ehJnm/15FvAjyau4 uhC2cnGfMY535rQaiaQeBQ/IHIowKripCq0JQFvcUNdyArZM3HOI2x79+2II6ft7 mjHskNODOIJHfW2o1RzQ0yRYAywFIg== =J+mH -----END PGP SIGNATURE----- Merge tag 'rcu-next-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Boqun Feng: "Documentation: - Add broken-timing possibility to stallwarn.rst - Improve discussion of this_cpu_ptr(), add raw_cpu_ptr() - Document self-propagating callbacks - Point call_srcu() to call_rcu() for detailed memory ordering - Add CONFIG_RCU_LAZY delays to call_rcu() kernel-doc header - Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text - Remove references to old grace-period-wait primitives srcu: - Introduce srcu_read_{un,}lock_fast(), which is similar to srcu_read_{un,}lock_lite(): avoid smp_mb()s in lock and unlock at the cost of calling synchronize_rcu() in synchronize_srcu() Moreover, by returning the percpu offset of the counter at srcu_read_lock_fast() time, srcu_read_unlock_fast() can avoid extra pointer dereferencing, which makes it faster than srcu_read_{un,}lock_lite() srcu_read_{un,}lock_fast() are intended to replace rcu_read_{un,}lock_trace() if possible RCU torture: - Add get_torture_init_jiffies() to return the start time of the test - Add a test_boost_holdoff module parameter to allow delaying boosting tests when building rcutorture as built-in - Add grace period sequence number logging at the beginning and end of failure/close-call results - Switch to hexadecimal for the expedited grace period sequence number in the rcu_exp_grace_period trace point - Make cur_ops->format_gp_seqs take buffer length - Move RCU_TORTURE_TEST_{CHK_RDR_STATE,LOG_CPU} to bool - Complain when invalid SRCU reader_flavor is specified - Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing, which forces SRCU uses atomics even when percpu ops are NMI safe, and use the Kconfig for SRCU lockdep testing Misc: - Split rcu_report_exp_cpu_mult() mask parameter and use for tracing - Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes() - Fix get_state_synchronize_rcu_full() GP-start detection - Move RCU Tasks self-tests to core_initcall() - Print segment lengths in show_rcu_nocb_gp_state() - Make RCU watch ct_kernel_exit_state() warning - Flush console log from kernel_power_off() - rcutorture: Allow a negative value for nfakewriters - rcu: Update TREE05.boot to test normal synchronize_rcu() - rcu: Use _full() API to debug synchronize_rcu() Make RCU handle PREEMPT_LAZY better: - Fix header guard for rcu_all_qs() - rcu: Rename PREEMPT_AUTO to PREEMPT_LAZY - Update __cond_resched comment about RCU quiescent states - Handle unstable rdp in rcu_read_unlock_strict() - Handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y - osnoise: Provide quiescent states - Adjust rcutorture with possible PREEMPT_RCU=n && PREEMPT_COUNT=y combination - Limit PREEMPT_RCU configurations - Make rcutorture senario TREE07 and senario TREE10 use PREEMPT_LAZY=y" * tag 'rcu-next-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (59 commits) rcutorture: Make scenario TREE07 build CONFIG_PREEMPT_LAZY=y rcutorture: Make scenario TREE10 build CONFIG_PREEMPT_LAZY=y rcu: limit PREEMPT_RCU configurations rcutorture: Update ->extendables check for lazy preemption rcutorture: Update rcutorture_one_extend_check() for lazy preemption osnoise: provide quiescent states rcu: Use _full() API to debug synchronize_rcu() rcu: Update TREE05.boot to test normal synchronize_rcu() rcutorture: Allow a negative value for nfakewriters Flush console log from kernel_power_off() context_tracking: Make RCU watch ct_kernel_exit_state() warning rcu/nocb: Print segment lengths in show_rcu_nocb_gp_state() rcu-tasks: Move RCU Tasks self-tests to core_initcall() rcu: Fix get_state_synchronize_rcu_full() GP-start detection torture: Make SRCU lockdep testing use srcu_read_lock_nmisafe() srcu: Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing rcutorture: Complain when invalid SRCU reader_flavor is specified rcutorture: Move RCU_TORTURE_TEST_{CHK_RDR_STATE,LOG_CPU} to bool rcutorture: Make cur_ops->format_gp_seqs take buffer length rcutorture: Add ftrace-compatible timestamp to GP# failure/close-call output ...	2025-03-24 19:41:37 -07:00
Linus Torvalds	418becac37	nolibc changes for 6.15 Changes ------- * 32bit s390 support * opendir() and friends * openat() support * sscanf() support * various cleanups -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTg4lxklFHAidmUs57B+h1jyw5bOAUCZ8w24QAKCRDB+h1jyw5b OIRTAQD1KvUV0qYOKU3xqOiUxDe84DyFv4mgu1iZoqmd0po/CQEAs8HU4XWp4Ls0 tPkcLqlhiVPgwhtipjklW8uy6JcFzAI= =+Zma -----END PGP SIGNATURE----- Merge tag 'nolibc-20250308-for-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc updates from Paul McKenney: - 32bit s390 support - opendir() and friends - openat() support - sscanf() support - various cleanups [ Paul has just forwarded the pull request from Thomas Weißschuh, so the tag signature is from Thomas, not Paul - Linus ] * tag 'nolibc-20250308-for-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (26 commits) tools/nolibc: don't use asm/ UAPI headers selftests/nolibc: stop testing constructor order selftests/nolibc: use O_RDONLY flag instead of 0 tools/nolibc: drop outdated example from overview comment tools/nolibc: process open() vararg as mode_t tools/nolibc: always use openat(2) instead of open(2) tools/nolibc: add support for openat(2) selftests/nolibc: add armthumb configuration selftests/nolibc: explicitly enable ARM mode Revert "selftests: kselftest: Fix build failure with NOLIBC" tools/nolibc: add support for [v]sscanf() tools/nolibc: add support for 32-bit s390 selftests/nolibc: rename s390 to s390x selftests/nolibc: only run constructor tests on nolibc selftests/nolibc: split up architecture list in run-tests.sh tools/nolibc: add support for directory access tools/nolibc: add support for sys_llseek() selftests/nolibc: always keep test kernel configuration up to date selftests/nolibc: execute defconfig before other targets selftests/nolibc: drop call to mrproper target ...	2025-03-24 17:59:29 -07:00
Linus Torvalds	bcb044256d	sched_ext: Changes for v6.15 - Add mechanism to count and report internal events. This significantly improves visibility on subtle corner conditions. - The default idle CPU selection logic is revamped and improved in multiple ways including being made topology aware. - sched_ext was disabling ttwu_queue for simplicity, which can be costly when hardware topology is more complex. Implement SCX_OPS_ALLOWED_QUEUED_WAKEUP so that BPF schedulers can selectively enable ttwu_queue. - tools/sched_ext updates to improve compatibility among others. - Other misc updates and fixes. - sched_ext/for-6.14-fixes were pulled a few times to receive prerequisite fixes and resolve conflicts. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ999Sg4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGf/KAQCoMTVOBpQT9gCaCKDOmrVJTwi6boEoV5WnGZzw PDr0vwEAq36iz4no6Y5THcN/DCx+52IiS0zuhPy3rBZVo11TMgU= =iQ+A -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Add mechanism to count and report internal events. This significantly improves visibility on subtle corner conditions. - The default idle CPU selection logic is revamped and improved in multiple ways including being made topology aware. - sched_ext was disabling ttwu_queue for simplicity, which can be costly when hardware topology is more complex. Implement SCX_OPS_ALLOWED_QUEUED_WAKEUP so that BPF schedulers can selectively enable ttwu_queue. - tools/sched_ext updates to improve compatibility among others. - Other misc updates and fixes. - sched_ext/for-6.14-fixes were pulled a few times to receive prerequisite fixes and resolve conflicts. * tag 'sched_ext-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (42 commits) sched_ext: idle: Refactor scx_select_cpu_dfl() sched_ext: idle: Honor idle flags in the built-in idle selection policy sched_ext: Skip per-CPU tasks in scx_bpf_reenqueue_local() sched_ext: Add trace point to track sched_ext core events sched_ext: Change the event type from u64 to s64 sched_ext: Documentation: add task lifecycle summary tools/sched_ext: Provide a compatible helper for scx_bpf_events() selftests/sched_ext: Add NUMA-aware scheduler test tools/sched_ext: Provide consistent access to scx flags sched_ext: idle: Fix scx_bpf_pick_any_cpu_node() behavior sched_ext: idle: Introduce scx_bpf_nr_node_ids() sched_ext: idle: Introduce node-aware idle cpu kfunc helpers sched_ext: idle: Per-node idle cpumasks sched_ext: idle: Introduce SCX_OPS_BUILTIN_IDLE_PER_NODE sched_ext: idle: Make idle static keys private sched/topology: Introduce for_each_node_numadist() iterator mm/numa: Introduce nearest_node_nodemask() nodemask: numa: reorganize inclusion path nodemask: add nodes_copy() tools/sched_ext: Sync with scx repo ...	2025-03-24 17:23:48 -07:00
Linus Torvalds	11c2b2e332	seccomp updates for v6.15-rc1 - avoid the lock trip seccomp_filter_release in common case (Mateusz Guzik) - remove unused 'sd' argument through-out (Oleg Nesterov) - selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hJKgAKCRA2KwveOeQk u3UDAQCUPSJ+iNGEsh9ErHJZWmg+LQJ999fOVrWWz+onjXFoQgD/cEHyLtSnz1Oa RbdGEqkBkRTXgOdkcIW5pJ4lBtfIpQQ= =z/8H -----END PGP SIGNATURE----- Merge tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: - avoid the lock trip seccomp_filter_release in common case (Mateusz Guzik) - remove unused 'sd' argument through-out (Oleg Nesterov) - selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64 * tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: avoid the lock trip seccomp_filter_release in common case seccomp: remove the 'sd' argument from __seccomp_filter() seccomp: remove the 'sd' argument from __secure_computing() seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER seccomp/mips: change syscall_trace_enter() to use secure_computing() selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64	2025-03-24 15:34:38 -07:00
Linus Torvalds	06961fbbbd	move-lib-kunit for v6.15-rc1 - move lib/ selftests into lib/tests/ (Kees Cook, Gabriela Bittencourt, Luis Felipe Hernandez, Lukas Bulwahn, Tamir Duberstein) - lib/math: Add int_log test suite (Bruno Sobreira França) - lib/math: Add Kunit test suite for gcd() (Yu-Chun Lin) - lib/tests/kfifo_kunit.c: add tests for the kfifo structure (Diego Vieira) - unicode: refactor selftests into KUnit (Gabriela Bittencourt) - lib/prime_numbers: convert self-test to KUnit (Tamir Duberstein) - printf: convert self-test to KUnit (Tamir Duberstein) - scanf: convert self-test to KUnit (Tamir Duberstein) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hCvgAKCRA2KwveOeQk u1nzAP9F/vQZUPkU9ADuqdcbyyEXlTzNk8R5rC2e1w+uKzJx+QD9EAbeCv9ZLdC0 KQF0pWVYCJtiSMEhkiMS/bMmpRCgwQ8= =VYZG -----END PGP SIGNATURE----- Merge tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull lib kunit selftest move from Kees Cook: "This is a one-off tree to coordinate the move of selftests out of lib/ and into lib/tests/. A separate tree was used for this to keep the paths sane with all the work in the same place. - move lib/ selftests into lib/tests/ (Kees Cook, Gabriela Bittencourt, Luis Felipe Hernandez, Lukas Bulwahn, Tamir Duberstein) - lib/math: Add int_log test suite (Bruno Sobreira França) - lib/math: Add Kunit test suite for gcd() (Yu-Chun Lin) - lib/tests/kfifo_kunit.c: add tests for the kfifo structure (Diego Vieira) - unicode: refactor selftests into KUnit (Gabriela Bittencourt) - lib/prime_numbers: convert self-test to KUnit (Tamir Duberstein) - printf: convert self-test to KUnit (Tamir Duberstein) - scanf: convert self-test to KUnit (Tamir Duberstein)" * tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (21 commits) scanf: break kunit into test cases scanf: convert self-test to KUnit scanf: remove redundant debug logs scanf: implicate test line in failure messages printf: implicate test line in failure messages printf: break kunit into test cases printf: convert self-test to KUnit kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR() kunit/fortify: Expand testing of __compiletime_strlen() kunit/stackinit: Use fill byte different from Clang i386 pattern kunit/overflow: Fix DEFINE_FLEX tests for counted_by selftests: remove reference to prime_numbers.sh MAINTAINERS: adjust entries in FORTIFY_SOURCE and KERNEL HARDENING lib/prime_numbers: convert self-test to KUnit lib/math: Add Kunit test suite for gcd() unicode: kunit: change tests filename and path unicode: kunit: refactor selftest to kunit tests lib/tests/kfifo_kunit.c: add tests for the kfifo structure lib: Move KUnit tests into tests/ subdirectory lib/math: Add int_log test suite ...	2025-03-24 15:15:11 -07:00
Amit Cohen	36ed81bcad	selftests: vxlan_bridge: Test flood with unresolved FDB entry Extend flood test to configure FDB entry with unresolved destination IP, check that packets are not sent twice. Without the previous patch which handles such scenario in mlxsw, the tests fail: $ TESTS='test_flood' ./vxlan_bridge_1d.sh Running tests with UDP port 4789 TEST: VXLAN: flood [ OK ] TEST: VXLAN: flood, unresolved FDB entry [FAIL] vx2 ns2: Expected to capture 10 packets, got 20. $ TESTS='test_flood' ./vxlan_bridge_1q.sh INFO: Running tests with UDP port 4789 TEST: VXLAN: flood vlan 10 [ OK ] TEST: VXLAN: flood vlan 20 [ OK ] TEST: VXLAN: flood vlan 10, unresolved FDB entry [FAIL] vx10 ns2: Expected to capture 10 packets, got 20. TEST: VXLAN: flood vlan 20, unresolved FDB entry [FAIL] vx20 ns2: Expected to capture 10 packets, got 20. With the previous patch, the tests pass. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/7bc96e317531f3bf06319fb2ea447bd8666f29fa.1742224300.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-24 15:09:31 -07:00
Gal Pressman	c61209eeb0	selftests: drv-net: rss_ctx: Don't assume indirection table is present The test_rss_context_dump() test assumes the indirection table is always supported, which is not true for all drivers, e.g., virtio_net when VIRTIO_NET_F_RSS is disabled. Skip the check if 'indir' is not present. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250318112426.386651-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-24 12:22:37 -07:00
Peter Seiderer	3099f9e156	selftest: net: update proc_net_pktgen (add more imix_weights test cases) Add more imix_weights test cases (for incomplete input). Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250317090401.1240704-2-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-24 12:01:38 -07:00
Linus Torvalds	130e696aa6	vfs-6.15-rc1.mount.namespace -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90r2wAKCRCRxhvAZXjc ouC6AQCk3MoqskN0WeNcaZT23dB7dHbEhf/7YXOFC9MFRMKXqQD9Fbn95+GuIe3U nBVPbVyQfDtfXE08ml6gbDJrCsbkkQI= =Xm1C -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount namespace updates from Christian Brauner: "This expands the ability of anonymous mount namespaces: - Creating detached mounts from detached mounts Currently, detached mounts can only be created from attached mounts. This limitaton prevents various use-cases. For example, the ability to mount a subdirectory without ever having to make the whole filesystem visible first. The current permission modelis: (1) Check that the caller is privileged over the owning user namespace of it's current mount namespace. (2) Check that the caller is located in the mount namespace of the mount it wants to create a detached copy of. While it is not strictly necessary to do it this way it is consistently applied in the new mount api. This model will also be used when allowing the creation of detached mount from another detached mount. The (1) requirement can simply be met by performing the same check as for the non-detached case, i.e., verify that the caller is privileged over its current mount namespace. To meet the (2) requirement it must be possible to infer the origin mount namespace that the anonymous mount namespace of the detached mount was created from. The origin mount namespace of an anonymous mount is the mount namespace that the mounts that were copied into the anonymous mount namespace originate from. In order to check the origin mount namespace of an anonymous mount namespace the sequence number of the original mount namespace is recorded in the anonymous mount namespace. With this in place it is possible to perform an equivalent check (2') to (2). The origin mount namespace of the anonymous mount namespace must be the same as the caller's mount namespace. To establish this the sequence number of the caller's mount namespace and the origin sequence number of the anonymous mount namespace are compared. The caller is always located in a non-anonymous mount namespace since anonymous mount namespaces cannot be setns()ed into. The caller's mount namespace will thus always have a valid sequence number. The owning namespace of any mount namespace, anonymous or non-anonymous, can never change. A mount attached to a non-anonymous mount namespace can never change mount namespace. If the sequence number of the non-anonymous mount namespace and the origin sequence number of the anonymous mount namespace match, the owning namespaces must match as well. Hence, the capability check on the owning namespace of the caller's mount namespace ensures that the caller has the ability to copy the mount tree. - Allow mount detached mounts on detached mounts Currently, detached mounts can only be mounted onto attached mounts. This limitation makes it impossible to assemble a new private rootfs and move it into place. Instead, a detached tree must be created, attached, then mounted open and then either moved or detached again. Lift this restriction. In order to allow mounting detached mounts onto other detached mounts the same permission model used for creating detached mounts from detached mounts can be used (cf. above). Allowing to mount detached mounts onto detached mounts leaves three cases to consider: (1) The source mount is an attached mount and the target mount is a detached mount. This would be equivalent to moving a mount between different mount namespaces. A caller could move an attached mount to a detached mount. The detached mount can now be freely attached to any mount namespace. This changes the current delegatioh model significantly for no good reason. So this will fail. (2) Anonymous mount namespaces are always attached fully, i.e., it is not possible to only attach a subtree of an anoymous mount namespace. This simplifies the implementation and reasoning. Consequently, if the anonymous mount namespace of the source detached mount and the target detached mount are the identical the mount request will fail. (3) The source mount's anonymous mount namespace is different from the target mount's anonymous mount namespace. In this case the source anonymous mount namespace of the source mount tree must be freed after its mounts have been moved to the target anonymous mount namespace. The source anonymous mount namespace must be empty afterwards. By allowing to mount detached mounts onto detached mounts a caller may do the following: fd_tree1 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE) fd_tree2 = open_tree(-EBADF, "/tmp", OPEN_TREE_CLONE) fd_tree1 and fd_tree2 refer to two different detached mount trees that belong to two different anonymous mount namespace. It is important to note that fd_tree1 and fd_tree2 both refer to the root of their respective anonymous mount namespaces. By allowing to mount detached mounts onto detached mounts the caller may now do: move_mount(fd_tree1, "", fd_tree2, "", MOVE_MOUNT_F_EMPTY_PATH \| MOVE_MOUNT_T_EMPTY_PATH) This will cause the detached mount referred to by fd_tree1 to be mounted on top of the detached mount referred to by fd_tree2. Thus, the detached mount fd_tree1 is moved from its separate anonymous mount namespace into fd_tree2's anonymous mount namespace. It also means that while fd_tree2 continues to refer to the root of its respective anonymous mount namespace fd_tree1 doesn't anymore. This has the consequence that only fd_tree2 can be moved to another anonymous or non-anonymous mount namespace. Moving fd_tree1 will now fail as fd_tree1 doesn't refer to the root of an anoymous mount namespace anymore. Now fd_tree1 and fd_tree2 refer to separate detached mount trees referring to the same anonymous mount namespace. This is conceptually fine. The new mount api does allow for this to happen already via: mount -t tmpfs tmpfs /mnt mkdir -p /mnt/A mount -t tmpfs tmpfs /mnt/A fd_tree3 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE \| AT_RECURSIVE) fd_tree4 = open_tree(-EBADF, "/mnt/A", 0) Both fd_tree3 and fd_tree4 refer to two different detached mount trees but both detached mount trees refer to the same anonymous mount namespace. An as with fd_tree1 and fd_tree2, only fd_tree3 may be moved another mount namespace as fd_tree3 refers to the root of the anonymous mount namespace just while fd_tree4 doesn't. However, there's an important difference between the fd_tree3/fd_tree4 and the fd_tree1/fd_tree2 example. Closing fd_tree4 and releasing the respective struct file will have no further effect on fd_tree3's detached mount tree. However, closing fd_tree3 will cause the mount tree and the respective anonymous mount namespace to be destroyed causing the detached mount tree of fd_tree4 to be invalid for further mounting. By allowing to mount detached mounts on detached mounts as in the fd_tree1/fd_tree2 example both struct files will affect each other. Both fd_tree1 and fd_tree2 refer to struct files that have FMODE_NEED_UNMOUNT set. To handle this we use the fact that @fd_tree1 will have a parent mount once it has been attached to @fd_tree2. When dissolve_on_fput() is called the mount that has been passed in will refer to the root of the anonymous mount namespace. If it doesn't it would mean that mounts are leaked. So before allowing to mount detached mounts onto detached mounts this would be a bug. Now that detached mounts can be mounted onto detached mounts it just means that the mount has been attached to another anonymous mount namespace and thus dissolve_on_fput() must not unmount the mount tree or free the anonymous mount namespace as the file referring to the root of the namespace hasn't been closed yet. If it had been closed yet it would be obvious because the mount namespace would be NULL, i.e., the @fd_tree1 would have already been unmounted. If @fd_tree1 hasn't been unmounted yet and has a parent mount it is safe to skip any cleanup as closing @fd_tree2 will take care of all cleanup operations. - Allow mount propagation for detached mount trees In commit `ee2e3f5062` ("mount: fix mounting of detached mounts onto targets that reside on shared mounts") I fixed a bug where propagating the source mount tree of an anonymous mount namespace into a target mount tree of a non-anonymous mount namespace could be used to trigger an integer overflow in the non-anonymous mount namespace causing any new mounts to fail. The cause of this was that the propagation algorithm was unable to recognize mounts from the source mount tree that were already propagated into the target mount tree and then reappeared as propagation targets when walking the destination propagation mount tree. When fixing this I disabled mount propagation into anonymous mount namespaces. Make it possible for anonymous mount namespace to receive mount propagation events correctly. This is now also a correctness issue now that we allow mounting detached mount trees onto detached mount trees. Mark the source anonymous mount namespace with MNTNS_PROPAGATING indicating that all mounts belonging to this mount namespace are currently in the process of being propagated and make the propagation algorithm discard those if they appear as propagation targets" * tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits) selftests: test subdirectory mounting selftests: add test for detached mount tree propagation fs: namespace: fix uninitialized variable use mount: handle mount propagation for detached mount trees fs: allow creating detached mounts from fsmount() file descriptors selftests: seventh test for mounting detached mounts onto detached mounts selftests: sixth test for mounting detached mounts onto detached mounts selftests: fifth test for mounting detached mounts onto detached mounts selftests: fourth test for mounting detached mounts onto detached mounts selftests: third test for mounting detached mounts onto detached mounts selftests: second test for mounting detached mounts onto detached mounts selftests: first test for mounting detached mounts onto detached mounts fs: mount detached mounts onto detached mounts fs: support getname_maybe_null() in move_mount() selftests: create detached mounts from detached mounts fs: create detached mounts from detached mounts fs: add may_copy_tree() fs: add fastpath for dissolve_on_fput() fs: add assert for move_mount() fs: add mnt_ns_empty() helper ...	2025-03-24 11:41:41 -07:00
Linus Torvalds	74adf9e353	vfs-6.15-rc1.nsfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90rXwAKCRCRxhvAZXjc ogrYAP4kWLzxD2IbBGSs5kBkKdc9qNGMtjrOn5InHm263vTpPwD/VYcOmyc3gScO e8hTBES3mYlzBpselh99HnGx5geMtAE= =+I5+ -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs nsfs updates from Christian Brauner: "This contains non-urgent fixes for nsfs to validate ioctls before performing any relevant operations. We alredy did this for a few other filesystems last cycle" * tag 'vfs-6.15-rc1.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/nsfs: add ioctl validation tests nsfs: validate ioctls	2025-03-24 11:38:12 -07:00
Linus Torvalds	804382d59b	vfs-6.15-rc1.overlayfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90rUAAKCRCRxhvAZXjc opI3AP9ws4S/JXOjxNKoTYmNM2nZ8+r1v8tUxbLIiqdvzx9PygD/V1ZjXtn6lwZr OK8d5Y8UnlPZTlBF8D61op3AjnXYzws= =KV4p -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.overlayfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs overlayfs updates from Christian Brauner: "Currently overlayfs uses the mounter's credentials for its override_creds() calls. That provides a consistent permission model. This patches allows a caller to instruct overlayfs to use its credentials instead. The caller must be located in the same user namespace hierarchy as the user namespace the overlayfs instance will be mounted in. This provides a consistent and simple security model. With this it is possible to e.g., mount an overlayfs instance where the mounter must have CAP_SYS_ADMIN but the credentials used for override_creds() have dropped CAP_SYS_ADMIN. It also allows the usage of custom fs{g,u}id different from the callers and other tweaks" * tag 'vfs-6.15-rc1.overlayfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/ovl: add third selftest for "override_creds" selftests/ovl: add second selftest for "override_creds" selftests/filesystems: add utils.{c,h} selftests/ovl: add first selftest for "override_creds" ovl: allow to specify override credentials	2025-03-24 10:37:40 -07:00
Linus Torvalds	df00ded23a	vfs-6.15-rc1.pidfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90pqgAKCRCRxhvAZXjc oqVsAP9Aq/fMCI14HeXehPCezKQZPu1HTrPPo2clLHXoSnafawEAsA3YfWTT4Heb iexzqvAEUOMYOVN66QEc+6AAwtMLrwc= =0eYo -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs pidfs updates from Christian Brauner: - Allow retrieving exit information after a process has been reaped through pidfds via the new PIDFD_INTO_EXIT extension for the PIDFD_GET_INFO ioctl. Various tools need access to information about a process/task even after it has already been reaped. Pidfd polling allows waiting on either task exit or for a task to have been reaped. The contract for PIDFD_INFO_EXIT is simply that EPOLLHUP must be observed before exit information can be retrieved, i.e., exit information is only provided once the task has been reaped and then can be retrieved as long as the pidfd is open. - Add PIDFD_SELF_{THREAD,THREAD_GROUP} sentinels allowing userspace to forgo allocating a file descriptor for their own process. This is useful in scenarios where users want to act on their own process through pidfds and is akin to AT_FDCWD. - Improve premature thread-group leader and subthread exec behavior when polling on pidfds: (1) During a multi-threaded exec by a subthread, i.e., non-thread-group leader thread, all other threads in the thread-group including the thread-group leader are killed and the struct pid of the thread-group leader will be taken over by the subthread that called exec. IOW, two tasks change their TIDs. (2) A premature thread-group leader exit means that the thread-group leader exited before all of the other subthreads in the thread-group have exited. Both cases lead to inconsistencies for pidfd polling with PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the current thread-group leader may or may not see an exit notification on the file descriptor depending on when poll is performed. If the poll is performed before the exec of the subthread has concluded an exit notification is generated for the old thread-group leader. If the poll is performed after the exec of the subthread has concluded no exit notification is generated for the old thread-group leader. The correct behavior is to simply not generate an exit notification on the struct pid of a subhthread exec because the struct pid is taken over by the subthread and thus remains alive. But this is difficult to handle because a thread-group may exit premature as mentioned in (2). In that case an exit notification is reliably generated but the subthreads may continue to run for an indeterminate amount of time and thus also may exec at some point. After this pull no exit notifications will be generated for a PIDFD_THREAD pidfd for a thread-group leader until all subthreads have been reaped. If a subthread should exec before no exit notification will be generated until that task exits or it creates subthreads and repeates the cycle. This means an exit notification indicates the ability for the father to reap the child. * tag 'vfs-6.15-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits) selftests/pidfd: third test for multi-threaded exec polling selftests/pidfd: second test for multi-threaded exec polling selftests/pidfd: first test for multi-threaded exec polling pidfs: improve multi-threaded exec and premature thread-group leader exit polling pidfs: ensure that PIDFS_INFO_EXIT is available selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest selftests/pidfd: add third PIDFD_INFO_EXIT selftest selftests/pidfd: add second PIDFD_INFO_EXIT selftest selftests/pidfd: add first PIDFD_INFO_EXIT selftest selftests/pidfd: expand common pidfd header pidfs/selftests: ensure correct headers for ioctl handling selftests/pidfd: fix header inclusion pidfs: allow to retrieve exit information pidfs: record exit code and cgroupid at exit pidfs: use private inode slab cache pidfs: move setting flags into pidfs_alloc_file() pidfd: rely on automatic cleanup in __pidfd_prepare() ...	2025-03-24 10:16:37 -07:00
Linus Torvalds	fd101da676	vfs-6.15-rc1.mount -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90qAwAKCRCRxhvAZXjc on7lAP0akpIsJMWREg9tLwTNTySI1b82uKec0EAgM6T7n/PYhAD/T4zoY8UYU0Pr qCxwTXHUVT6bkNhjREBkfqq9OkPP8w8= =GxeN -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - Mount notifications The day has come where we finally provide a new api to listen for mount topology changes outside of /proc/<pid>/mountinfo. A mount namespace file descriptor can be supplied and registered with fanotify to listen for mount topology changes. Currently notifications for mount, umount and moving mounts are generated. The generated notification record contains the unique mount id of the mount. The listmount() and statmount() api can be used to query detailed information about the mount using the received unique mount id. This allows userspace to figure out exactly how the mount topology changed without having to generating diffs of /proc/<pid>/mountinfo in userspace. - Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount api - Support detached mounts in overlayfs Since last cycle we support specifying overlayfs layers via file descriptors. However, we don't allow detached mounts which means userspace cannot user file descriptors received via open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to attach them to a mount namespace via move_mount() first. This is cumbersome and means they have to undo mounts via umount(). Allow them to directly use detached mounts. - Allow to retrieve idmappings with statmount Currently it isn't possible to figure out what idmapping has been attached to an idmapped mount. Add an extension to statmount() which allows to read the idmapping from the mount. - Allow creating idmapped mounts from mounts that are already idmapped So far it isn't possible to allow the creation of idmapped mounts from already idmapped mounts as this has significant lifetime implications. Make the creation of idmapped mounts atomic by allow to pass struct mount_attr together with the open_tree_attr() system call allowing to solve these issues without complicating VFS lookup in any way. The system call has in general the benefit that creating a detached mount and applying mount attributes to it becomes an atomic operation for userspace. - Add a way to query statmount() for supported options Allow userspace to query which mount information can be retrieved through statmount(). - Allow superblock owners to force unmount * tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits) umount: Allow superblock owners to force umount selftests: add tests for mount notification selinux: add FILE__WATCH_MOUNTNS samples/vfs: fix printf format string for size_t fs: allow changing idmappings fs: add kflags member to struct mount_kattr fs: add open_tree_attr() fs: add copy_mount_setattr() helper fs: add vfs_open_tree() helper statmount: add a new supported_mask field samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP selftests: add tests for using detached mount with overlayfs samples/vfs: check whether flag was raised statmount: allow to retrieve idmappings uidgid: add map_id_range_up() fs: allow detached mounts in clone_private_mount() selftests/overlayfs: test specifying layers as O_PATH file descriptors fs: support O_PATH fds with FSCONFIG_SET_FD vfs: add notifications for mount attach and detach fanotify: notify on mount attach and detach ...	2025-03-24 09:34:10 -07:00
Ming Lei	0f3ebf2d4b	selftests: ublk: add stripe target Add ublk stripe target which can take 1~4 underlying backing files or block device, with stripe size 4k ~ 512K. Add two basic tests(write verify & mkfs/mount/umount) over ublk/stripe. This target is helpful to cover multiple IOs aiming at same fixed/registered IO kernel buffer. It is also capable of verifying vectored registered (kernel)buffers in future for zero copy, so far it isn't supported yet. Todo: support vectored registered kernel buffer for ublk/zc. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	263846eb43	selftests: ublk: simplify loop io completion Use the added target io handling helpers for simplifying loop io completion. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-8-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	8cb9b971e2	selftests: ublk: enable zero copy for null target Enable zero copy for null target so that we can evaluate performance from zero copy or not. Also this should be the simplest ublk zero copy implementation, which can be served as zc example. Add test for covering 'add -t null -z'. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	8842b72a82	selftests: ublk: prepare for supporting stripe target - pass 'truct dev_ctx *ctx' to target init function - add 'private_data' to 'struct ublk_dev' for storing target specific data - add 'private_data' to 'struct ublk_io' for storing per-IO data - add 'tgt_ios' to 'struct ublk_io' for counting how many io_uring ios for handling the current io command - add helper ublk_get_io() for supporting stripe target - add two helpers for simplifying target io handling Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	10d962dae2	selftests: ublk: move common code into common.c Move two functions for initializing & de-initializing backing file into common.c. Also move one common helper into kublk.h. Prepare for supporting ublk-stripe. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	9413c0ca8e	selftests: ublk: increase max buffer size to 1MB Increase max buffer size to 1MB, and 64KB is too small to evaluate performance with builtin ublk server implementation. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	f2639ed11e	selftests: ublk: add single sqe allocator helper Unify the sqe allocator helper, and we will use it for supporting more cases, such as ublk stripe, in which variable sqe allocation is required. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Ming Lei	723977cab4	selftests: ublk: add generic_01 for verifying sequential IO order block layer, ublk and io_uring might re-order IO in the past - plug - queue ublk io command via task work Add one test for verifying if sequential WRITE IO is dispatched in order. - null target is taken, so we can just observe io order from `tracepoint:block:block_rq_complete` which represents the dispatch order - WRITE IO is taken because READ may come from system-wide utility Cc: Uday Shankar <ushankar@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250322093218.431419-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-22 08:35:08 -06:00
Kohei Enju	5f3077d7fc	selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid syzbot reported out-of-bounds read in check_atomic_load/store() when the register number is invalid in this context: https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c To avoid the issue from now on, let's add tests where the register number is invalid for load-acquire/store-release. After discussion with Eduard, I decided to use R15 as invalid register because the actual slab-out-of-bounds read issue occurs when the register number is R12 or larger. Signed-off-by: Kohei Enju <enjuk@amazon.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250322045340.18010-6-enjuk@amazon.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-22 06:19:09 -07:00
Kohei Enju	c03bb2fa32	bpf: Fix out-of-bounds read in check_atomic_load/store() syzbot reported the following splat [0]. In check_atomic_load/store(), register validity is not checked before atomic_ptr_type_ok(). This causes the out-of-bounds read in is_ctx_reg() called from atomic_ptr_type_ok() when the register number is MAX_BPF_REG or greater. Call check_load_mem()/check_store_reg() before atomic_ptr_type_ok() to avoid the OOB read. However, some tests introduced by commit `ff3afe5da9` ("selftests/bpf: Add selftests for load-acquire and store-release instructions") assume calling atomic_ptr_type_ok() before checking register validity. Therefore the swapping of order unintentionally changes verifier messages of these tests. For example in the test load_acquire_from_pkt_pointer(), expected message is 'BPF_ATOMIC loads from R2 pkt is not allowed' although actual messages are different. validate_msgs:FAIL:754 expect_msg VERIFIER LOG: ============= Global function load_acquire_from_pkt_pointer() doesn't return scalar. Only those are supported. 0: R1=ctx() R10=fp0 ; asm volatile ( @ verifier_load_acquire.c:140 0: (61) r2 = (u32 )(r1 +0) ; R1=ctx() R2_w=pkt(r=0) 1: (d3) r0 = load_acquire((u8 *)(r2 +0)) invalid access to packet, off=0 size=1, R2(id=0,off=0,r=0) R2 offset is outside of the packet processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 ============= EXPECTED SUBSTR: 'BPF_ATOMIC loads from R2 pkt is not allowed' #505/19 verifier_load_acquire/load-acquire from pkt pointer:FAIL This is because instructions in the test don't pass check_load_mem() and therefore don't enter the atomic_ptr_type_ok() path. In this case, we have to modify instructions so that they pass the check_load_mem() and trigger atomic_ptr_type_ok(). Similarly for store-release tests, we need to modify instructions so that they pass check_store_reg(). Like load_acquire_from_pkt_pointer(), modify instructions in: load_acquire_from_sock_pointer() store_release_to_ctx_pointer() store_release_to_pkt_pointer() Also in store_release_to_sock_pointer(), check_store_reg() returns error early and atomic_ptr_type_ok() is not triggered, since write to sock pointer is not possible in general. We might be able to remove the test, but for now let's leave it and just change the expected message. [0] BUG: KASAN: slab-out-of-bounds in is_ctx_reg kernel/bpf/verifier.c:6185 [inline] BUG: KASAN: slab-out-of-bounds in atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223 Read of size 4 at addr ffff888141b0d690 by task syz-executor143/5842 CPU: 1 UID: 0 PID: 5842 Comm: syz-executor143 Not tainted 6.14.0-rc3-syzkaller-gf28214603dc6 #0 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0x16e/0x5b0 mm/kasan/report.c:521 kasan_report+0x143/0x180 mm/kasan/report.c:634 is_ctx_reg kernel/bpf/verifier.c:6185 [inline] atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223 check_atomic_store kernel/bpf/verifier.c:7804 [inline] check_atomic kernel/bpf/verifier.c:7841 [inline] do_check+0x89dd/0xedd0 kernel/bpf/verifier.c:19334 do_check_common+0x1678/0x2080 kernel/bpf/verifier.c:22600 do_check_main kernel/bpf/verifier.c:22691 [inline] bpf_check+0x165c8/0x1cca0 kernel/bpf/verifier.c:23821 Reported-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c Tested-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com Fixes: e24bbad29a8d ("bpf: Introduce load-acquire and store-release instructions") Fixes: `ff3afe5da9` ("selftests/bpf: Add selftests for load-acquire and store-release instructions") Signed-off-by: Kohei Enju <enjuk@amazon.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250322045340.18010-5-enjuk@amazon.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-22 06:15:57 -07:00
Ryan Roberts	a2c6f9c3ca	selftests/mm: speed up split_huge_page_test create_pagecache_thp_and_fd() was previously writing a file sized at twice the PMD size by making a per-byte write syscall. This was quite slow when the PMD size is 4M, but completely intolerable for 32M (PMD size for arm64's 16K page size), and 512M (PMD size for arm64's 64K page size). The byte pattern has a 256 byte period, so let's create a 1K buffer and fill it with exactly 4 periods. Then we can write the buffer as many times as is required to fill the file. This makes things much more tolerable. The test now passes for 16K page size. It still fails for 64K page size because MAX_PAGECACHE_ORDER is too small for 512M folio size (I think). Link: https://lkml.kernel.org/r/20250318174343.243631-3-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Rafael Aquini <raquini@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-21 22:03:16 -07:00
Ryan Roberts	735b3f7e77	selftests/mm: uffd-unit-tests support for hugepages > 2M uffd-unit-tests uses a memory area with a fixed 32M size. Then it calculates the number of pages by dividing by page_size, which itself is either the base page size or the PMD huge page size depending on the test config. For the latter, we end up with nr_pages=1 for arm64 16K base pages, and nr_pages=0 for 64K base pages. This doesn't end well. So let's make the 32M size a floor and also ensure that we have at least 2 pages given the PMD size. With this change, the tests pass on arm64 64K base page size configuration. Link: https://lkml.kernel.org/r/20250318174343.243631-2-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Rafael Aquini <raquini@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-21 22:03:15 -07:00
Brendan Jackman	d8a866c766	selftests/mm: add commentary about 9pfs bugs As discussed here: https://lore.kernel.org/lkml/Z9RRkL1hom48z3Tt@google.com/ This code could benefit from some more commentary. To avoid needing to comment the same thing in multiple places (I guess more of these SKIPs will need to be added over time, for now I am only like 20% of the way through Project Run run_vmtests.sh Successfully), add a dummy "skip tests for this specific reason" function that basically just serves as a hook to hang comments on. Link: https://lkml.kernel.org/r/20250317-9pfs-comments-v1-1-9ac96043e146@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-21 22:03:14 -07:00
Ming Lei	ffde32a49a	selftests: ublk: fix starting ublk device Firstly ublk char device node may not be created by udev yet, so wait a while until it can be opened or timeout. Secondly delete created ublk device in case of start failure, otherwise the device becomes zombie. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250321135324.259677-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-21 14:09:24 -06:00
John Stultz	e40d3709c0	selftests/timers: Improve skew_consistency by testing with other clockids Lei Chen reported a bug with CLOCK_MONOTONIC_COARSE having inconsistencies when NTP is adjusting the clock frequency. This has gone seemingly undetected for ~15 years, illustrating a clear gap in our testing. The skew_consistency test is intended to catch this sort of problem, but was focused on only evaluating CLOCK_MONOTONIC, and thus missed the problem on CLOCK_MONOTONIC_COARSE. So adjust the test to run with all clockids for 60 seconds each instead of 10 minutes with just CLOCK_MONOTONIC. Reported-by: Lei Chen <lei.chen@smartx.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250320200306.1712599-2-jstultz@google.com Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/	2025-03-21 19:16:18 +01:00
Breno Leitao	4b73dc83ed	selftests: netconsole: Add tests for 'release' feature in sysdata Expands the self-tests to include the 'release' feature in sysdata. Verifies that enabling the 'release' feature appends the correct data and ensures that disabling it functions as expected. When enabled, the message should have an item similar to in the userdata: `release=$(uname -r)` Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-5-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-21 18:59:25 +01:00
Mickaël Salaün	15383a0d63	landlock: Add the errata interface Some fixes may require user space to check if they are applied on the running kernel before using a specific feature. For instance, this applies when a restriction was previously too restrictive and is now getting relaxed (e.g. for compatibility reasons). However, non-visible changes for legitimate use (e.g. security fixes) do not require an erratum. Because fixes are backported down to a specific Landlock ABI, we need a way to avoid cherry-pick conflicts. The solution is to only update a file related to the lower ABI impacted by this issue. All the ABI files are then used to create a bitmask of fixes. The new errata interface is similar to the one used to get the supported Landlock ABI version, but it returns a bitmask instead because the order of fixes may not match the order of versions, and not all fixes may apply to all versions. The actual errata will come with dedicated commits. The description is not actually used in the code but serves as documentation. Create the landlock_abi_version symbol and use its value to check errata consistency. Update test_base's create_ruleset_checks_ordering tests and add errata tests. This commit is backportable down to the first version of Landlock. Fixes: `3532b0b435` ("landlock: Enable user space to infer supported features") Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-21 12:12:19 +01:00
Eric Biggers	ca17aa6640	crypto: lib/chacha - remove unused arch-specific init support All implementations of chacha_init_arch() just call chacha_init_generic(), so it is pointless. Just delete it, and replace chacha_init() with what was previously chacha_init_generic(). Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-03-21 17:39:06 +08:00
Ming Lei	96af5af47b	selftests: ublk: fix write cache implementation For loop target, write cache isn't enabled, and each write isn't be marked as DSYNC too. Fix it by enabling write cache, meantime fix FLUSH implementation by not taking LBA range into account, and there isn't such info for FLUSH command. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250321004758.152572-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-20 20:01:03 -06:00
Ming Lei	beb31982ad	selftests: ublk: add variable for user to not show test result Some user decides test result by exit code only, and wouldn't like to be bothered by the test result. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-20 17:18:55 -06:00
Ming Lei	fe2230d921	selftests: ublk: don't show `modprobe` failure ublk_drv may be built-in, so don't show modprobe failure, and we do check `/dev/ublk-control` for skipping test if ublk_drv isn't enabled. Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-20 17:18:55 -06:00
Ming Lei	8764c1a72b	selftests: ublk: add one dependency header Add one dependency helper which can include new uapi definition which isn't synced from kernel. This way also helps a lot for downstream test deployment. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-20 17:18:55 -06:00
Paolo Abeni	6f13bec53a	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ9NWrwAKCRAraS+Z+3y5 E2VIAP4iIdMobFDEm04JuXyrOK6HasnkOFkuEgLNNhGRbCCTTwD/X/2A8baSXnjg eqYN2DJfrs7OZ1ZUku/ic1VGtJg7Sgk= =jTyW -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-13 The following pull-request contains BPF updates for your net-next tree. We've added 4 non-merge commits during the last 3 day(s) which contain a total of 2 files changed, 35 insertions(+), 12 deletions(-). The main changes are: 1) bpf_getsockopt support for TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX, from Jason Xing bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN tcp: bpf: Support bpf_getsockopt for TCP_BPF_DELACK_MAX tcp: bpf: Support bpf_getsockopt for TCP_BPF_RTO_MIN tcp: bpf: Introduce bpf_sol_tcp_getsockopt to support TCP_BPF flags ==================== Link: https://patch.msgid.link/20250313221620.2512684-1-martin.lau@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-20 21:48:14 +01:00
Paolo Abeni	f491593394	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc8). Conflict: tools/testing/selftests/net/Makefile `03544faad7` ("selftest: net: add proc_net_pktgen") `3ed61b8938` ("selftests: net: test for lwtunnel dst ref loops") tools/testing/selftests/net/config: `85cb3711ac` ("selftests: net: Add test cases for link and peer netns") `3ed61b8938` ("selftests: net: test for lwtunnel dst ref loops") Adjacent commits: tools/testing/selftests/net/Makefile `c935af429e` ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY") `355d940f4d` ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."") Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-20 21:38:01 +01:00
Björn Töpel	e16e64f9e0	selftests/bpf: Sanitize pointer prior fclose() There are scenarios where env.{sub,}test_state->stdout_saved, can be NULL, e.g. sometimes when the watchdog timeout kicks in, or if the open_memstream syscall is not available. Avoid crashing test_progs by adding an explicit NULL check prior the fclose() call. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20250318081648.122523-1-bjorn@kernel.org	2025-03-20 10:35:07 -07:00
Paolo Bonzini	0afd104fb3	KVM/arm64 updates for 6.15 - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events -----BEGIN PGP SIGNATURE----- iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZ9s9gBccb2xpdmVyLnVw dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFp6LAQCOQ1Fidp8RT1NdhLLAhW5D4gLe MNT619R4qfqu64ZpeQEAidHMAYaGRk5KDNBq6Jn+awcJnwCcMnh2ok0vTOjz3gY= =RC6A -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.15 - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events	2025-03-20 12:54:12 -04:00
Paolo Bonzini	c0f99fb4e5	KVM/riscv changes for 6.15 - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmfbsiQACgkQrUjsVaLH LAfGAxAAlODgZMDt+f/YlOOmQPsLohGrWsZjvheRfrf3X5PFNuXd1frpZSbF1PCs lk4WZrekJo2VT2tWMEBjFrH6A9slHUzjZs+xIxig1N/Gmxy1clKp/svJVG20WwQI a5/5jPH3TNDL3KUphwBF0QIi/Tlp6+SZ5RAag6QsBXKnFy5+o9lioR2T6tFzMoxy ry8VD8e395bQXn5tYkk4vsr9vBRzpgq7ZBOMg6zyNTL86UBOVq3QXPNzSE5kenXW 4zqDmuxbk+GgpUKLXA9aXwLvasGiOr+59uToZd2lHf5ynSdSprQyW11I5ZOFs2DX 800mLeYtyHc4nGTi6OO3VN9toMhmgijtFLNpBDg4LRvwMfB+sp62Ixsmjs3CBCd+ GqcbJaMtdgM1SpZ3tlzEQoWpIBlNYs7BuG/6Jen9xKHHvOg7Ty9UimJeMikV+FUm srq0+GlgjaniZ/d8arIIm7LVqZLAc6OSzn+A8IcE0KQ9fKBgB0CZrANW+N+b+7ub Lkq9hA8tM5Ti0bNszqPyJu3bmhWAOEOwF0CjYL79vEhIYsegCKKtruKOppdM+sl2 dzk+DVV9Aar+bDu/v/bYV5TfIq39U86Tqxe+OU7EPzch5NdyDvKcpZ9/JIZnf/j5 tviM3awZfFmgTeBUTAYvaR34KExBOLQLkGb9BqAydYjg+pl9jVE= =HP3N -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-6.15-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.15 - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal	2025-03-20 12:53:34 -04:00
Linus Torvalds	5fc3193608	Including fixes from can, bluetooth and ipsec. This contains a last minute revert of a recent GRE patch, mostly to allow me stating there are no known regressions outstanding. Current release - regressions: - revert "gre: Fix IPv6 link-local address generation." - eth: ti: am65-cpsw: fix NAPI registration sequence Previous releases - regressions: - ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw(). - mptcp: fix data stream corruption in the address announcement - bluetooth: fix connection regression between LE and non-LE adapters - can: - flexcan: only change CAN state when link up in system PM - ucan: fix out of bound read in strscpy() source Previous releases - always broken: - lwtunnel: fix reentry loops - ipv6: fix TCP GSO segmentation with NAT - xfrm: force software GSO only in tunnel mode - eth: ti: icssg-prueth: add lock to stats Misc: - add Andrea Mayer as a maintainer of SRv6 Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmfcOhESHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkJUQP/2jpOcA+ogkCfRhcJHHaeX9gCX+buSY4 RZnMUkSUhW5dcL9mTybx3tzH/0oNlGfLtA016SR+Grq0mIBIYsRWhTW6C8mOgvsZ PrHvE9VyATBSxUM9o9bL5WMg/M9TvX7foR63+zGPN2lEk6mmLK/hIBFNjkv9R8Vk /VR0ZRKHMsJARsyRve1Sf8DZXndAfROnWhCCmWWxKpnCd4biBL/6n6p00vfxpAls /Xnm1PC0NMuRz0hlIr8UN9DkkF1v+LEhp1EFbg5e7i8cJkAJyXcvZBb4rJ8f2Ty7 4qXK53i+kp+NryNAZ6cNu6OtkD+DtdDUNJ28ElkdnKOc8H787YGFvCTBDPiqe5yu CVA6tT1hsuKjEnXdX3545+tTW48XS+6J60ZeVnslfC3fakG03ckOZboofT+LQV1y wcv4+73dDrIbSo3X7DBdksFzQi/ICb1VG/GOdGlg8vlokKnH0di/veoBtbyAR7dJ 2Fv3rpE6e/JQQmXYffto0qrNXOYrEx7Zqmo2QbDS6dTkQ1FiDsxRYegRmVxHLHX9 0fd48FNFIqEJSM5RwLOQW9X83aObT5p2OiHiA+WrnmkiSiMwK4EdusBm4WtFZ7qc bqaUHbnAmg1g7UiPn7cGE3navoasH8xmkS77ZsbyW618ej/hRFGDwWMQSuEMyeLz PKrT/OMXIHu9 =aYLc -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, bluetooth and ipsec. This contains a last minute revert of a recent GRE patch, mostly to allow me stating there are no known regressions outstanding. Current release - regressions: - revert "gre: Fix IPv6 link-local address generation." - eth: ti: am65-cpsw: fix NAPI registration sequence Previous releases - regressions: - ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw(). - mptcp: fix data stream corruption in the address announcement - bluetooth: fix connection regression between LE and non-LE adapters - can: - flexcan: only change CAN state when link up in system PM - ucan: fix out of bound read in strscpy() source Previous releases - always broken: - lwtunnel: fix reentry loops - ipv6: fix TCP GSO segmentation with NAT - xfrm: force software GSO only in tunnel mode - eth: ti: icssg-prueth: add lock to stats Misc: - add Andrea Mayer as a maintainer of SRv6" * tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits) MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6 Revert "gre: Fix IPv6 link-local address generation." Revert "selftests: Add IPv6 link-local address generation tests for GRE devices." net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES tools headers: Sync uapi/asm-generic/socket.h with the kernel sources mptcp: Fix data stream corruption in the address announcement selftests: net: test for lwtunnel dst ref loops net: ipv6: ioam6: fix lwtunnel_output() loop net: lwtunnel: fix recursion loops net: ti: icssg-prueth: Add lock to stats net: atm: fix use after free in lec_send() xsk: fix an integer overflow in xp_create_and_assign_umem() net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data selftests: drv-net: use defer in the ping test phy: fix xa_alloc_cyclic() error handling dpll: fix xa_alloc_cyclic() error handling devlink: fix xa_alloc_cyclic() error handling ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create(). ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw(). net: ipv6: fix TCP GSO segmentation with NAT ...	2025-03-20 09:39:15 -07:00
Guillaume Nault	355d940f4d	Revert "selftests: Add IPv6 link-local address generation tests for GRE devices." This reverts commit `6f50175cca`. Commit `183185a18f` ("gre: Fix IPv6 link-local address generation.") is going to be reverted. So let's revert the corresponding kselftest first. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/259a9e98f7f1be7ce02b53d0b4afb7c18a8ff747.1742418408.git.gnault@redhat.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-20 15:46:16 +01:00
Christian Brauner	4d5483a42c	selftests/pidfd: third test for multi-threaded exec polling Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-4-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-20 15:32:43 +01:00
Christian Brauner	9b6f723db5	selftests/pidfd: second test for multi-threaded exec polling Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-3-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-20 15:32:43 +01:00
Christian Brauner	db7ce91e22	selftests/pidfd: first test for multi-threaded exec polling Add first test for premature thread-group leader exit. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-2-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-20 15:32:43 +01:00
Justin Iurman	3ed61b8938	selftests: net: test for lwtunnel dst ref loops As recently specified by commit `0ea09cbf83` ("docs: netdev: add a note on selftest posting") in net-next, the selftest is therefore shipped in this series. However, this selftest does not really test this series. It needs this series to avoid crashing the kernel. What it really tests, thanks to kmemleak, is what was fixed by the following commits: - commit `c71a192976` ("net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels") - commit `92191dd107` ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") - commit `c64a0727f9` ("net: ipv6: fix dst ref loop on input in seg6 lwt") - commit `13e55fbaec` ("net: ipv6: fix dst ref loop on input in rpl lwt") - commit `0e7633d7b9` ("net: ipv6: fix dst ref loop in ila lwtunnel") - commit `5da15a9c11` ("net: ipv6: fix missing dst ref drop in ila lwtunnel") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-20 11:25:52 +01:00
Geliang Tang	9cf0128e64	selftests: mptcp: add pm sysctl mapping tests This patch checks if the newly added net.mptcp.path_manager is mapped successfully from or to the old net.mptcp.pm_type in userspace_pm.sh. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-12-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-20 10:14:49 +01:00
Bastien Curutchet (eBPF Foundation)	f8df95e84c	selftests/bpf: Migrate test_xdp_vlan.sh into test_progs test_xdp_vlan.sh isn't used by the BPF CI. Migrate test_xdp_vlan.sh in prog_tests/xdp_vlan.c. It uses the same BPF programs located in progs/test_xdp_vlan.c and the same network topology. Remove test_xdp_vlan*.sh and their Makefile entries. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250221-xdp_vlan-v1-2-7d29847169af@bootlin.com/	2025-03-19 16:01:33 -07:00
Bastien Curutchet (eBPF Foundation)	0f9ff4cb68	selftests/bpf: test_xdp_vlan: Rename BPF sections The __load() helper expects BPF sections to be names 'xdp' or 'tc' Rename BPF sections so they can be loaded with the __load() helper in upcoming patch. Rename the BPF functions with their previous section's name. Update the 'ip link' commands in the script to use the program name instead of the section name to load the BPF program. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250221-xdp_vlan-v1-1-7d29847169af@bootlin.com	2025-03-19 15:59:27 -07:00
Oliver Upton	4f2774c57a	Merge branch 'kvm-arm64/writable-midr' into kvmarm/next * kvm-arm64/writable-midr: : Writable implementation ID registers, courtesy of Sebastian Ott : : Introduce a new capability that allows userspace to set the : ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1, : and AIDR_EL1. Also plug a hole in KVM's trap configuration where : SMIDR_EL1 was readable at EL1, despite the fact that KVM does not : support SME. KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable KVM: arm64: Copy guest CTR_EL0 into hyp VM KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR KVM: arm64: Allow userspace to change the implementation ID registers KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value KVM: arm64: Maintain per-VM copy of implementation ID regs KVM: arm64: Set HCR_EL2.TID1 unconditionally Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-19 14:54:32 -07:00
Oliver Upton	d300b0168e	Merge branch 'kvm-arm64/pv-cpuid' into kvmarm/next * kvm-arm64/pv-cpuid: : Paravirtualized implementation ID, courtesy of Shameer Kolothum : : Big-little has historically been a pain in the ass to virtualize. The : implementation ID (MIDR, REVIDR, AIDR) of a vCPU can change at the whim : of vCPU scheduling. This can be particularly annoying when the guest : needs to know the underlying implementation to mitigate errata. : : "Hyperscalers" face a similar scheduling problem, where VMs may freely : migrate between hosts in a pool of heterogenous hardware. And yes, our : server-class friends are equally riddled with errata too. : : In absence of an architected solution to this wart on the ecosystem, : introduce support for paravirtualizing the implementation exposed : to a VM, allowing the VMM to describe the pool of implementations that a : VM may be exposed to due to scheduling/migration. : : Userspace is expected to intercept and handle these hypercalls using the : SMCCC filter UAPI, should it choose to do so. smccc: kvm_guest: Fix kernel builds for 32 bit arm KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2 smccc/kvm_guest: Enable errata based on implementation CPUs arm64: Make _midr_in_range_list() an exported function KVM: arm64: Introduce KVM_REG_ARM_VENDOR_HYP_BMAP_2 KVM: arm64: Specify hypercall ABI for retrieving target implementations arm64: Modify _midr_range() functions to read MIDR/REVIDR internally Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-19 14:53:16 -07:00
Oliver Upton	13f64f6d21	Merge branch 'kvm-arm64/nv-idregs' into kvmarm/next * kvm-arm64/nv-idregs: : Changes to exposure of NV features, courtesy of Marc Zyngier : : Apply NV-specific feature restrictions at reset rather than at the point : of KVM_RUN. This makes the true feature set visible to userspace, a : necessary step towards save/restore support or NV VMs. : : Add an additional vCPU feature flag for selecting the E2H0 flavor of NV, : such that the VHE-ness of the VM can be applied to the feature set. KVM: arm64: selftests: Test that TGRAN_2 fields are writable KVM: arm64: Allow userspace to write ID_AA64MMFR0_EL1.TGRAN_2 KVM: arm64: Advertise FEAT_ECV when possible KVM: arm64: Make ID_AA64MMFR4_EL1.NV_frac writable KVM: arm64: Allow userspace to limit NV support to nVHE KVM: arm64: Move NV-specific capping to idreg sanitisation KVM: arm64: Enforce NV limits on a per-idregs basis KVM: arm64: Make ID_REG_LIMIT_FIELD_ENUM() more widely available KVM: arm64: Consolidate idreg callbacks KVM: arm64: Advertise NV2 in the boot messages KVM: arm64: Mark HCR.EL2.{NV*,AT} RES0 when ID_AA64MMFR4_EL1.NV_frac is 0 KVM: arm64: Mark HCR.EL2.E2H RES0 when ID_AA64MMFR1_EL1.VH is zero KVM: arm64: Hide ID_AA64MMFR2_EL1.NV from guest and userspace arm64: cpufeature: Handle NV_frac as a synonym of NV2 Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-19 14:52:26 -07:00
Ingo Molnar	14d281db78	sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files We leave most of the defconfigs alone (there's over 70 of them), but let's remove CONFIG_SCHED_DEBUG from the scheduler self-test Kconfig files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z9szt3MpQmQ56TRd@gmail.com	2025-03-19 22:23:24 +01:00
Michael Jeanson	d047e32b8d	rseq/selftests: Fix namespace collision with rseq UAPI header When the rseq UAPI header is included, 'union rseq' clashes with 'struct rseq'. It's not the case in the rseq selftests but it does break the KVM selftests that also include this file. Rename 'union rseq' to 'union rseq_tls' to fix this. Fixes: `e6644c967d` ("rseq/selftests: Ensure the rseq ABI TLS is actually 1024 bytes") Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/20250319202144.1141542-1-mjeanson@efficios.com	2025-03-19 21:26:24 +01:00
Jonathan Lennox	756f88ff9c	tc-tests: Update tc police action tests for tc buffer size rounding fixes. Before tc's recent change to fix rounding errors, several tests which specified a burst size of "1m" would translate back to being 1048574 bytes (2b less than 1Mb). sprint_size prints this as "1024Kb". With the tc fix, the burst size is instead correctly reported as 1048576 bytes (precisely 1Mb), which sprint_size prints as "1Mb". This updates the expected output in the tests' matchPattern values to accept either the old or the new output. Signed-off-by: Jonathan Lennox <jonathan.lennox@8x8.com> Link: https://patch.msgid.link/20250312174804.313107-1-jonathan.lennox@8x8.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-19 18:40:24 +01:00
Jakub Kicinski	acf10a8c0b	selftests: drv-net: use defer in the ping test Make sure the test cleans up after itself. The XDP off statements at the end of the test may not be reached. Fixes: `75cc19c8ff` ("selftests: drv-net: add xdp cases for ping.py") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250312131040.660386-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-19 18:21:13 +01:00
Kumar Kartikeya Dwivedi	60ba5b3ed7	selftests/bpf: Add tests for rqspinlock Introduce selftests that trigger AA, ABBA deadlocks, and test the edge case where the held locks table runs out of entries, since we then fallback to the timeout as the final line of defense. Also exercise verifier's AA detection where applicable. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-26-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-19 08:03:06 -07:00
Paolo Bonzini	783e9cd05c	KVM selftests changes for 6.15, part 2 - Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration. - Fix a few minor bugs related to handling of stats FDs. - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation). - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked). - Misc cleanups and fixes. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmfZqZYACgkQOlYIJqCj N/2G0Q/5AZAnOPBU/xmIZrntaWSkPX0wYCBzTCFAbyuXTYdWmCMaVJUjj0W4h15U rAPxq76moGSHT06EQhw8clrNHbRSr4OZvsJIQMgGK5GBXlxfD+S1bNb/NKijpKm1 yjI4VIKA1ugpPLbb4KM522UQdAwkBI8KuPbhzWCsz2kzFf3Iy/B6ZgfevTjldZ9f KbocBQhCaEOmk0i1UZWMnuJU+rh06y/BNR/7a/vhaZGy4pPm/WZpSezKbti3j7F5 +/VXJ+fh2T1oPzlycdQ4i9phYAOMJONcNxlzhs0Hp71zkxNykVKyv1zxvCI3ioEn nnHFYQyremDfDJhxabOJRYB4ETlhLpX9nOlz2euIy7YeurDs73Qtjg2rixgAyq80 /FITPTFVlcHxjPr9YC+wtWzZB2qnGNMfA7QujnoG3XlGQJJsvnFTygXfBMj+W0G/ K8V93lvGTcwtT7x+9MCDhD0wZXfPBXorze8QrDfCWF96G1Y+5gZmYGs38Cu6IWrc 6q3tHcDmUaOGI1Mag97J1smSWw+ZC1+UuHJSW7DMZldOm/2fXkqrRpbHRXEzRQIa vZjBZKoYHpqGMKayQ+GIRql9R6o+ZGQ6OQXgtLl0fxPPA01/hKX1ITZayBnHDl3C wm3svDZy6hPWs17kwtLj+cRSMVYItbjOgpOj58gBo16w1NPNYTI= =t77Q -----END PGP SIGNATURE----- Merge tag 'kvm-x86-selftests-6.15' of https://github.com/kvm-x86/linux into HEAD KVM selftests changes for 6.15, part 2 - Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration. - Fix a few minor bugs related to handling of stats FDs. - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation). - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked). - Misc cleanups and fixes.	2025-03-19 09:05:34 -04:00
Paolo Bonzini	9b47f288eb	KVM selftests changes for 6.15, part 1 - Misc cleanups and prep work. - Annotate _no_printf() with "printf" so that pr_debug() statements are checked by the compiler for default builds (and pr_info() when QUIET). - Attempt to whack the last LLC references/misses mole in the Intel PMU counters test by adding a data load and doing CLFLUSH{OPT} on the data instead of the code being executed. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters. - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmfZpvcACgkQOlYIJqCj N/3a+BAAmv0iagaYSiigAWZ9EM/2+CZDGp0+g7sB95graJFKnrKCXSkpuUr1UIYr aDPLHAPBZ8szWNqrJARdSui0QP8cEd2erJkMNRDEQqx/OYKQJ+0W+gmlbleQ6GL6 7C/zUn0JdC0AhpdfePw6nIPpwIYBxDwHMQSuPUOxw5ZJmVCzQxn43MmPWTfL11Eo ARfh6VWmXSvp+5emyjnlGmcT1/a3UlfAGDRekK0gaPYmEtaf6LfKCXc+nX84q0fr T2N8nVcmEDg9VUO3JZfexoLcKpFMsTdwg4GbQ4beytYwPx3YELvCP9eFl5yo4lxk 9YL6uPdXyk0/Cgpv/QyB4YtnaQ4tktPmFLYjEtf8NIMLubouTS6dtYErKq9MOgsP 7UaOBVY6O7+cG+iOU2yGn1Mct0XxhKKLu3n5ZlGLJ/+8v1PUIegL3wUwpQaycp7O xgRd/narmKEmMYFr7D0y7epcUVBlkP17YsFH4w6z9moDjJNVz4Ziqft6ju0N3P1s a/8CA5gd2lplp/Yae62f2mgJNjw5kbEYtGO9ybINbhacznsl+3JsU85k1h+7dZrW 8lmfNJ8+B95UrZ9Wo0mWsv+UKYYFHRFoHjKlevuUTPQ/QxoghvpQ5iU3P+rT8E5n 858OhTi2vlAhg8a338XdfUOy799F73ecDaIKyPMdO74v7wq1swo= =RjiY -----END PGP SIGNATURE----- Merge tag 'kvm-x86-selftests_6.15-1' of https://github.com/kvm-x86/linux into HEAD KVM selftests changes for 6.15, part 1 - Misc cleanups and prep work. - Annotate _no_printf() with "printf" so that pr_debug() statements are checked by the compiler for default builds (and pr_info() when QUIET). - Attempt to whack the last LLC references/misses mole in the Intel PMU counters test by adding a data load and doing CLFLUSH{OPT} on the data instead of the code being executed. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters. - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware.	2025-03-19 09:05:22 -04:00
Paolo Bonzini	4d9a677596	KVM x86 misc changes for 6.15: - Fix a bug in PIC emulation that caused KVM to emit a spurious KVM_REQ_EVENT. - Add a helper to consolidate handling of mp_state transitions, and use it to clear pv_unhalted whenever a vCPU is made RUNNABLE. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition. - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2). - Drop "support" for PV Async #PF with proctected guests without SEND_ALWAYS, as KVM can't get the current CPL. - Misc cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmfZhoYACgkQOlYIJqCj N/1oSBAAlhsZzv4m6rHtWACjGsSBlAE5WM7HrpnwjOyMW+Desc0WLL4L6qAlxgT7 4ZkBvQ4zsiyUIdLv1XARvkBYHTUkcgG8fQSSJQ6grZit5OcFcMgWafvQrJYI6428 TyzF/x6t0gj8CjDluA6jM/eL5HhWZZDOIg0Wma+pyYpW+7y2tkphXyYyOQPBGwv3 geRUetbDHHXjf042k/8f1j1vjzrNNvAg3YyNyx1YbdU9XKsn5D+SeUW2eVfYk8G7 5QsCOGvUYcbbjrR8kbCZKexvoH6Np9J6YKDe4R9R2yDzgs/96qz6xkYTGCVkHA1y uursKqRHgbXBxzxa+ban073laT7Qt3S01Gd9bJW3IO7hzG89gl4qfX7fap8T9Yc2 yeBTYIgInpyx+NCdZ2Z/++BzPagBGfa77gFX/eIkmsVA9LWYi9CI3FSjtr/czvWm a4tfMPvTVBjsBQQ7t/lNksrq0O51lbb3iqqv3ToQpDOOqCWuMEU5xcihhPRr5NSZ dX4o/jIDhCV8EyXdtASyqMlYBXcuC45ojEZn1elh0QogzYAdSGQ2bIDyxuBtA//k kSbi+E4GB64jVfBWUyK2QeLOBnBkH7mh6Cg5UYr1Ln9Sm6l8vrcxhcbnchiWxXMI WCK7BJwI2HojBVpEZ04jMkjHvg36uSfjOzmMLT5yPXfFNebsGmA= =8SGF -----END PGP SIGNATURE----- Merge tag 'kvm-x86-misc-6.15' of https://github.com/kvm-x86/linux into HEAD KVM x86 misc changes for 6.15: - Fix a bug in PIC emulation that caused KVM to emit a spurious KVM_REQ_EVENT. - Add a helper to consolidate handling of mp_state transitions, and use it to clear pv_unhalted whenever a vCPU is made RUNNABLE. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition. - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2). - Drop "support" for PV Async #PF with proctected guests without SEND_ALWAYS, as KVM can't get the current CPL. - Misc cleanups	2025-03-19 09:04:48 -04:00
Clément Léger	7a9827e777	KVM: riscv: selftests: Add Zaamo/Zalrsc extensions to get-reg-list test The KVM RISC-V allows Zaamo/Zalrsc extensions for Guest/VM so add these extensions to get-reg-list test. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240619153913.867263-6-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>	2025-03-19 12:03:46 +00:00
Ingo Molnar	89771319e0	Linux 6.14-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfXVtUeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGN/sH/i5423Gt/z51gDjA s4v5Z7GaBJ9zOGBahn2RWFe72zytTqKrEJmMnGfguirs0atD1DtQj4WAP7iFKP+e WyO663X6HF7i5y37ja0Yd4PZc31hwtqzKH8LjBf8f8tTy8UsEVqumdi5A4sS9KTM qm4kTyyVEY9D/s7oRY8ywjDlRJtO6nT0aKMp4kAqNEbrNUYbilT/a0hgXcgSmPyB uIjmjL2fZfutxGI5LgfbaSHCa1ElmhvTvivOMpaAmZSGCRVHCKGgT0CTNnHyn/7C dB145JkRO4ZOUqirCdO4PE/23id3ajq9fcixJGBzAv7c45y+B3JZ1r2kAfKalE8/ qrOKLys= =8r7a -----END PGP SIGNATURE----- Merge tag 'v6.14-rc7' into x86/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-19 11:03:06 +01:00
Yafang Shao	be16ddeaae	selftests/bpf: Add selftest for attaching fexit to __noreturn functions The reuslt: $ tools/testing/selftests/bpf/test_progs --name=fexit_noreturns #99/1 fexit_noreturns/noreturns:OK #99 fexit_noreturns:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20250318114447.75484-3-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-18 19:07:18 -07:00
Nicolin Chen	97717a1f28	iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage Trigger vEVENTs by feeding an idev ID and validating the returned output virt_ids whether they equal to the value that was set to the vDEVICE. Link: https://patch.msgid.link/r/e829532ec0a3927d61161b7674b20e731ecd495b.1741719725.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-18 14:17:48 -03:00
Nicolin Chen	941d0719aa	iommufd/selftest: Require vdev_id when attaching to a nested domain When attaching a device to a vIOMMU-based nested domain, vdev_id must be present. Add a piece of code hard-requesting it, preparing for a vEVENTQ support in the following patch. Then, update the TEST_F. A HWPT-based nested domain will return a NULL new_viommu, thus no such a vDEVICE requirement. Link: https://patch.msgid.link/r/4051ca8a819e51cb30de6b4fe9e4d94d956afe3d.1741719725.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-18 14:17:48 -03:00
Yunhui Cui	36dec9e448	RISC-V: selftests: Add TEST_ZICBOM into CBO tests Add test for Zicbom and its block size into CBO tests, when Zicbom is present, test that cbo.clean/flush may be issued and works. As the software can't verify the clean/flush functions, we just judged that cbo.clean/flush isn't executed illegally. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20250226063206.71216-4-cuiyunhui@bytedance.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>	2025-03-18 12:44:17 +00:00
Linus Torvalds	76b6905c11	15 hotfixes. 7 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 13 are for MM and the other two are for squashfs and procfs. All are singletons. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ9jkDgAKCRDdBJ7gKXxA jtgjAP0ZIVQ8wEL/txkrNyMT5JrWCK1XfjsfPNJ0afkcl4Oo/QEA8rnJHT/hX8aY Wr+aY2CoCsFEOsW4oDRUPrUYpkUBqA8= =AwLJ -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes. 7 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 13 are for MM and the other two are for squashfs and procfs. All are singletons. Please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/page_alloc: fix memory accept before watermarks gets initialized mm: decline to manipulate the refcount on a slab page memcg: drain obj stock on cpu hotplug teardown mm/huge_memory: drop beyond-EOF folios with the right number of refs selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT mm: memcontrol: fix swap counter leak from offline cgroup mm/vma: do not register private-anon mappings with khugepaged during mmap squashfs: fix invalid pointer dereference in squashfs_cache_delete mm/migrate: fix shmem xarray update during migration mm/hugetlb: fix surplus pages in dissolve_free_huge_page() mm/damon/core: initialize damos->walk_completed in damon_new_scheme() mm/damon: respect core layer filters' allowance decision on ops layer filemap: move prefaulting out of hot write path proc: fix UAF in proc_get_inode()	2025-03-17 22:27:27 -07:00
Cyan Yang	f841ad9ca5	selftests/mm/cow: fix the incorrect error handling Error handling doesn't check the correct return value. This patch will fix it. Link: https://lkml.kernel.org/r/20250312043840.71799-1-cyan.yang@sifive.com Fixes: `f4b5fd6946` ("selftests/vm: anon_cow: THP tests") Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 22:07:06 -07:00
Zi Yan	80a5c494c8	selftests/mm: add tests for folio_split(), buddy allocator like split It splits page cache folios to orders from 0 to 8 at different in-folio offset. Link: https://lkml.kernel.org/r/20250307174001.242794-9-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Kairui Song <kasong@tencent.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 22:07:00 -07:00
Zi Yan	3fec86f8aa	xarray: add xas_try_split() to split a multi-index entry Patch series "Buddy allocator like (or non-uniform) folio split", v10. This patchset adds a new buddy allocator like (or non-uniform) large folio split from a order-n folio to order-m with m < n. It reduces 1. the total number of after-split folios from 2^(n-m) to n-m+1; 2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6; 3. keep more large folios after a split from all order-m folios to order-(n-1) to order-m folios. For example, to split an order-9 to order-0, folio split generates 10 (or 11 for anonymous memory) folios instead of 512, allocates 1 xa_node instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2 order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0 folios. Instead of duplicating existing split_huge_page() code, __folio_split() is introduced as the shared backend code for both split_huge_page_to_list_to_order() and folio_split(). __folio_split() can support both uniform split and buddy allocator like (or non-uniform) split. All existing split_huge_page() users can be gradually converted to use folio_split() if possible. In this patchset, I converted truncate_inode_partial_folio() to use folio_split(). xfstests quick group passed for both tmpfs and xfs. I also semi-replicated Hugh's test[12] and ran it without any issue for almost 24 hours. This patch (of 8): A preparation patch for non-uniform folio split, which always split a folio into half iteratively, and minimal xarray entry split. Currently, xas_split_alloc() and xas_split() always split all slots from a multi-index entry. They cost the same number of xa_node as the to-be-split slots. For example, to split an order-9 entry, which takes 2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL), 8 xa_node are needed. Instead xas_try_split() is intended to be used iteratively to split the order-9 entry into 2 order-8 entries, then split one order-8 entry, based on the given index, to 2 order-7 entries, ..., and split one order-1 entry to 2 order-0 entries. When splitting the order-6 entry and a new xa_node is needed, xas_try_split() will try to allocate one if possible. As a result, xas_try_split() would only need 1 xa_node instead of 8. When a new xa_node is needed during the split, xas_try_split() can try to allocate one but no more. -ENOMEM will be return if a node cannot be allocated. -EINVAL will be return if a sibling node is split or cascade split happens, where two or more new nodes are needed, and these are not supported by xas_try_split(). xas_split_alloc() and xas_split() split an order-9 to order-0: --------------------------------- \| \| \| \| \| \| \| \| \| \| 0 \| 1 \| 2 \| 3 \| 4 \| 5 \| 6 \| 7 \| \| \| \| \| \| \| \| \| \| --------------------------------- \| \| \| \| ------- --- --- ------- \| \| ... \| \| V V V V ----------- ----------- ----------- ----------- \| xa_node \| \| xa_node \| ... \| xa_node \| \| xa_node \| ----------- ----------- ----------- ----------- xas_try_split() splits an order-9 to order-0: --------------------------------- \| \| \| \| \| \| \| \| \| \| 0 \| 1 \| 2 \| 3 \| 4 \| 5 \| 6 \| 7 \| \| \| \| \| \| \| \| \| \| --------------------------------- \| \| V ----------- \| xa_node \| ----------- Link: https://lkml.kernel.org/r/20250307174001.242794-1-ziy@nvidia.com Link: https://lkml.kernel.org/r/20250307174001.242794-2-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Kairui Song <kasong@tencent.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 22:06:59 -07:00
Mykyta Yatsenko	a024843d92	selftests/bpf: Test freplace from user namespace Add selftests to verify that it is possible to load freplace program from user namespace if BPF token is initialized by bpf_object__prepare before calling bpf_program__set_attach_target. Negative test is added as well. Modified type of the priv_prog to xdp, as kprobe did not work on aarch64 and s390x. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20250317174039.161275-5-mykyta.yatsenko5@gmail.com	2025-03-17 13:45:12 -07:00
Wei Yang	ccaf3efcee	lib/interval_tree: add test case for span iteration Verify interval_tree_span_iter_xxx() helpers works as expected. Link: https://lkml.kernel.org/r/20250310074938.26756-6-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 12:17:01 -07:00
Wei Yang	16b1936ae6	lib/rbtree: add random seed Current test use pseudo rand function with fixed seed, which means the test data is the same pattern each time. Add random seed parameter to randomize the test. Link: https://lkml.kernel.org/r/20250310074938.26756-4-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 12:17:00 -07:00
Wei Yang	4164e1525d	lib/rbtree: enable userland test suite for rbtree related data structure Patch series "lib/interval_tree: add some test cases and cleanup", v2. Since rbtree/augmented tree/interval tree share similar data structure, besides new cases for interval tree, this patch set also does cleanup for others. This patch (of 7): Currently we have some tests for rbtree related data structure, e.g. rbtree, augmented rbtree, interval tree, in lib/ as kernel module. To facilitate the test and debug for those fundamental data structure, this patch enable those tests in userland. Link: https://lkml.kernel.org/r/20250310074938.26756-1-richard.weiyang@gmail.com Link: https://lkml.kernel.org/r/20250310074938.26756-2-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-17 12:17:00 -07:00
Dave Jiang	1729808c54	cxl/test: Add Set Feature support to cxl_test Add emulation to support Set Feature mailbox command to cxl_test. The only feature supported is the device patrol scrub feature. The set feature allows activation of patrol scrub for the cxl_test emulated device. The command does not support partial data transfer even though the spec allows it. This restriction is to reduce complexity of the emulation given the patrol scrub feature is very minimal. Link: https://patch.msgid.link/r/20250307205648.1021626-8-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:37 -03:00
Dave Jiang	e77e9c1079	cxl/test: Add Get Feature support to cxl_test Add emulation of Get Feature command to cxl_test. The feature for device patrol scrub is returned by the emulation code. This is the only feature currently supported by cxl_test. It returns the information for the device patrol scrub feature. Link: https://patch.msgid.link/r/20250307205648.1021626-7-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:37 -03:00
Dave Jiang	858ce2f56b	cxl: Add FWCTL support to CXL Add fwctl support code to allow sending of CXL feature commands from userspace through as ioctls via FWCTL. Provide initial setup bits. The CXL PCI probe function will call devm_cxl_setup_fwctl() after the cxl_memdev has been enumerated in order to setup FWCTL char device under the cxl_memdev like the existing memdev char device for issuing CXL raw mailbox commands from userspace via ioctls. Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:36 -03:00
Dave Jiang	3b5d43245f	Merge branch 'for-6.15/features' into cxl-for-next Add CXL Features support. Setup code for enabling in kernel usage of CXL Features. Expecting EDAC/RAS to utilize CXL Features in kernel for things such as memory sparing. Also prepartion for enabling of CXL FWCTL support to issue allowed Features from user space.	2025-03-17 09:22:59 -07:00
Lorenzo Stoakes	f3b92176f4	tools/selftests: add guard region test for /proc/$pid/pagemap Add a test to the guard region self tests to assert that the /proc/$pid/pagemap information now made availabile to the user correctly identifies and reports guard regions. As a part of this change, update vm_util.h to add the new bit (note there is no header file in the kernel where this is exposed, the user is expected to provide their own mask) and utilise the helper functions there for pagemap functionality. [lorenzo.stoakes@oracle.com: fixup define name] Link: https://lkml.kernel.org/r/32e83941-e6f5-42ee-9292-a44c16463cf1@lucifer.local Link: https://lkml.kernel.org/r/164feb0a43ae72650e6b20c3910213f469566311.1740139449.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:41 -07:00
Brendan Jackman	1ddae9d67e	selftests/mm/mlock: print error on failure It's not really possible to start diagnosing this without knowing the actual error. Also update the mlock2 helper to behave like libc would by setting errno and returning -1. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-12-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:40 -07:00
Brendan Jackman	5d2146a335	selftests/mm: skip mlock tests if nobody user can't read it If running from a directory that can't be read by unprivileged users, executing on-fault-test via the nobody user will fail. The kselftest build does give the file the correct permissions, but after being installed it might be in a directory without global execute permissions. Since the script can't safely fix that, just skip if it happens. Note that the stderr of the `ls` command is unfiltered meaning the user sees a "permission denied" error that can help inform them why the test was skipped. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-11-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:40 -07:00
Brendan Jackman	f896c6de83	selftests/mm: ensure uffd-wp-mremap gets pages of each size This test allocates a page of every available size and doesn't have any SKIP logic if the allocation fails. So, ensure it's available and skip the test if we can't do so. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-10-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:39 -07:00
Brendan Jackman	e9269b2cc4	selftests/mm: drop unnecessary sudo usage This script must be run as root anyway (see all the writing to privileged files in /proc etc). Remove the unnecessary use of sudo to avoid breaking on single-user systems that don't have sudo. This also avoids confusing readers. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-9-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:39 -07:00
Brendan Jackman	32b42970e8	selftests/mm: skip gup_longterm tests on weird filesystems Some filesystems don't support ftruncate()ing unlinked files. They return ENOENT. In that case, skip the test. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-8-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:39 -07:00
Brendan Jackman	571a4b62ed	selftests/mm: skip map_populate on weird filesystems It seems that 9pfs does not allow truncating unlinked files, Mark Brown has noted that NFS may also behave this way. It doesn't seem quite right to call this a "bug" but it's probably a special enough case that it makes sense for the test to just SKIP if it happens. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-7-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:39 -07:00
Brendan Jackman	bf6d575e24	selftests/mm: don't fail uffd-stress if too many CPUs This calculation divides a fixed parameter by an environment-dependent parameter i.e. the number of CPUs. The simple way to avoid machine-specific failures here is to just put a cap on the max value of the latter. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-6-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Suggested-by: Mateusz Guzik <mjguzik@gmail.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	db0f1c138f	selftests/mm: print some details when uffd-stress gets bad params So this can be debugged more easily. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-5-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	f3b5535abc	selftests/mm/uffd: rename nr_cpus -> nr_parallel A later commit will bound this variable so it no longer necessarily matches the number of CPUs. Rename it appropriately. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-4-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	f4b3e6c7f1	selftests/mm: skip uffd-wp-mremap if userfaultfd not available It's obvious that this should fail in that case, but still, save the reader the effort of figuring out that they've run into this by just SKIPping Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-3-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	0046dbed80	selftests/mm: skip uffd-stress if userfaultfd not available It's pretty obvious that the test wouldn't work if you don't have the feature enabled. But, it's still useful to SKIP instead of failing so the reader can immediately tell that this is the reason why. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-2-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:37 -07:00
Brendan Jackman	800ddf3cd7	selftests/mm: report errno when things fail in gup_longterm Patch series "selftests/mm: Some cleanups from trying to run them", v4. I never had much luck running mm selftests so I spent a few hours digging into why. Looks like most of the reason is missing SKIP checks, so this series is just adding a bunch of those that I found. I did not do anything like all of them, just the ones I spotted in gup_longterm, gup_test, mmap, userfaultfd and memfd_secret. It's a bit unfortunate to have to skip those tests when ftruncate() fails, but I don't have time to dig deep enough into it to actually make them pass. I have observed the issue on 9pfs and heard rumours that NFS has a similar problem. I'm now able to run these test groups successfully: - mmap - gup_test - compaction - migration - page_frag - userfaultfd - mlock I've never gone past "Waiting for hugetlb memory to get depleted", in the hugetlb tests. I don't know if they are stuck or if they would eventually work if I was patient enough (testing on a 1G machine). I have not investigated further. I had some issues with mlock tests failing due to -ENOSRCH from mlock2(), I can no longer reproduce that though, things work OK now. Of the remaining tests there may be others that work fine, but there's no convenient way to survey the whole output of run_vmtests.sh so I'm just going test by test here. In my spare moments I am slowly chipping away at a setup to run these tests continuously in a reasonably hermetic QEMU environment via virtme-ng: `5fad4b9c59/README.md` Hopefully that will eventually offer a way to provide a "canned" environment where the tests are known to work, which can be fairly easily reproduced by any developer. This patch (of 12): Just reporting failure doesn't tell you what went wrong. This can fail in different ways so report errno to help the reader get started debugging. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-0-dec210a658f5@google.com Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-1-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:37 -07:00
Ujwal Kundur	af3b45aac5	selftests/mm: fix spelling Fix misspelling flagged by codespell. Link: https://lkml.kernel.org/r/20250215081803.1793-1-ujwal.kundur@gmail.com Signed-off-by: Ujwal Kundur <ujwal.kundur@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:23 -07:00
Suren Baghdasaryan	3104138517	mm: make vma cache SLAB_TYPESAFE_BY_RCU To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that object reuse before RCU grace period is over will be detected by lock_vma_under_rcu(). Current checks are sufficient as long as vma is detached before it is freed. The only place this is not currently happening is in exit_mmap(). Add the missing vma_mark_detached() in exit_mmap(). Another issue which might trick lock_vma_under_rcu() during vma reuse is vm_area_dup(), which copies the entire content of the vma into a new one, overriding new vma's vm_refcnt and temporarily making it appear as attached. This might trick a racing lock_vma_under_rcu() to operate on a reused vma if it found the vma before it got reused. To prevent this situation, we should ensure that vm_refcnt stays at detached state (0) when it is copied and advances to attached state only after it is added into the vma tree. Introduce vm_area_init_from() which preserves new vma's vm_refcnt and use it in vm_area_dup(). Since all vmas are in detached state with no current readers when they are freed, lock_vma_under_rcu() will not be able to take vm_refcnt after vma got detached even if vma is reused. vma_mark_attached() in modified to include a release fence to ensure all stores to the vma happen before vm_refcnt gets initialized. Finally, make vm_area_cachep SLAB_TYPESAFE_BY_RCU. This will facilitate vm_area_struct reuse and will minimize the number of call_rcu() calls. [surenb@google.com: remove atomic_set_release() usage in tools/] Link: https://lkml.kernel.org/r/20250217054351.2973666-1-surenb@google.com Link: https://lkml.kernel.org/r/20250213224655.1680278-18-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:21 -07:00
Suren Baghdasaryan	6bef4c2f97	mm: move lesser used vma_area_struct members into the last cacheline Move several vma_area_struct members which are rarely or never used during page fault handling into the last cacheline to better pack vm_area_struct. As a result vm_area_struct will fit into 3 as opposed to 4 cachelines. New typical vm_area_struct layout: struct vm_area_struct { union { struct { long unsigned int vm_start; /* 0 8 / long unsigned int vm_end; / 8 8 / }; / 0 16 / freeptr_t vm_freeptr; / 0 8 / }; / 0 16 / struct mm_struct vm_mm; /* 16 8 / pgprot_t vm_page_prot; / 24 8 / union { const vm_flags_t vm_flags; / 32 8 / vm_flags_t __vm_flags; / 32 8 / }; / 32 8 / unsigned int vm_lock_seq; / 40 4 / / XXX 4 bytes hole, try to pack / struct list_head anon_vma_chain; / 48 16 / / --- cacheline 1 boundary (64 bytes) --- / struct anon_vma anon_vma; /* 64 8 / const struct vm_operations_struct vm_ops; /* 72 8 / long unsigned int vm_pgoff; / 80 8 / struct file vm_file; /* 88 8 / void vm_private_data; /* 96 8 / atomic_long_t swap_readahead_info; / 104 8 / struct mempolicy vm_policy; /* 112 8 / struct vma_numab_state numab_state; /* 120 8 / / --- cacheline 2 boundary (128 bytes) --- / refcount_t vm_refcnt (__aligned__(64)); / 128 4 / / XXX 4 bytes hole, try to pack / struct { struct rb_node rb (__aligned__(8)); / 136 24 / long unsigned int rb_subtree_last; / 160 8 / } __attribute__((__aligned__(8))) shared; / 136 32 / struct anon_vma_name anon_name; /* 168 8 / struct vm_userfaultfd_ctx vm_userfaultfd_ctx; / 176 8 / / size: 192, cachelines: 3, members: 18 / / sum members: 176, holes: 2, sum holes: 8 / / padding: 8 / / forced alignments: 2, forced holes: 1, sum forced holes: 4 */ } __attribute__((__aligned__(64))); Memory consumption per 1000 VMAs becomes 48 pages: slabinfo after vm_area_struct changes: <name> ... <objsize> <objperslab> <pagesperslab> : ... vm_area_struct ... 192 42 2 : ... Link: https://lkml.kernel.org/r/20250213224655.1680278-14-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:20 -07:00
Suren Baghdasaryan	f35ab95ca0	mm: replace vm_lock and detached flag with a reference count rw_semaphore is a sizable structure of 40 bytes and consumes considerable space for each vm_area_struct. However vma_lock has two important specifics which can be used to replace rw_semaphore with a simpler structure: 1. Readers never wait. They try to take the vma_lock and fall back to mmap_lock if that fails. 2. Only one writer at a time will ever try to write-lock a vma_lock because writers first take mmap_lock in write mode. Because of these requirements, full rw_semaphore functionality is not needed and we can replace rw_semaphore and the vma->detached flag with a refcount (vm_refcnt). When vma is in detached state, vm_refcnt is 0 and only a call to vma_mark_attached() can take it out of this state. Note that unlike before, now we enforce both vma_mark_attached() and vma_mark_detached() to be done only after vma has been write-locked. vma_mark_attached() changes vm_refcnt to 1 to indicate that it has been attached to the vma tree. When a reader takes read lock, it increments vm_refcnt, unless the top usable bit of vm_refcnt (0x40000000) is set, indicating presence of a writer. When writer takes write lock, it sets the top usable bit to indicate its presence. If there are readers, writer will wait using newly introduced mm->vma_writer_wait. Since all writers take mmap_lock in write mode first, there can be only one writer at a time. The last reader to release the lock will signal the writer to wake up. refcount might overflow if there are many competing readers, in which case read-locking will fail. Readers are expected to handle such failures. In summary: 1. all readers increment the vm_refcnt; 2. writer sets top usable (writer) bit of vm_refcnt; 3. readers cannot increment the vm_refcnt if the writer bit is set; 4. in the presence of readers, writer must wait for the vm_refcnt to drop to 1 (plus the VMA_LOCK_OFFSET writer bit), indicating an attached vma with no readers; 5. vm_refcnt overflow is handled by the readers. While this vm_lock replacement does not yet result in a smaller vm_area_struct (it stays at 256 bytes due to cacheline alignment), it allows for further size optimization by structure member regrouping to bring the size of vm_area_struct below 192 bytes. [surenb@google.com: fix a crash due to vma_end_read() that should have been removed] Link: https://lkml.kernel.org/r/20250220200208.323769-1-surenb@google.com Link: https://lkml.kernel.org/r/20250213224655.1680278-13-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Matthew Wilcox <willy@infradead.org> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:20 -07:00
Suren Baghdasaryan	55e50223bf	mm: introduce vma_iter_store_attached() to use with attached vmas vma_iter_store() functions can be used both when adding a new vma and when updating an existing one. However for existing ones we do not need to mark them attached as they are already marked that way. With vma->detached being a separate flag, double-marking a vmas as attached or detached is not an issue because the flag will simply be overwritten with the same value. However once we fold this flag into the refcount later in this series, re-attaching or re-detaching a vma becomes an issue since these operations will be incrementing/decrementing a refcount. Introduce vma_iter_store_new() and vma_iter_store_overwrite() to replace vma_iter_store() and avoid re-attaching a vma during vma update. Add assertions in vma_mark_attached()/vma_mark_detached() to catch invalid usage. Update vma tests to check for vma detached state correctness. Link: https://lkml.kernel.org/r/20250213224655.1680278-5-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:18 -07:00
Suren Baghdasaryan	8ef95d8f15	mm: mark vma as detached until it's added into vma tree Current implementation does not set detached flag when a VMA is first allocated. This does not represent the real state of the VMA, which is detached until it is added into mm's VMA tree. Fix this by marking new VMAs as detached and resetting detached flag only after VMA is added into a tree. Introduce vma_mark_attached() to make the API more readable and to simplify possible future cleanup when vma->vm_mm might be used to indicate detached vma and vma_mark_attached() will need an additional mm parameter. Link: https://lkml.kernel.org/r/20250213224655.1680278-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:17 -07:00
Suren Baghdasaryan	7b6218ae12	mm: move per-vma lock into vm_area_struct Back when per-vma locks were introduces, vm_lock was moved out of vm_area_struct in [1] because of the performance regression caused by false cacheline sharing. Recent investigation [2] revealed that the regressions is limited to a rather old Broadwell microarchitecture and even there it can be mitigated by disabling adjacent cacheline prefetching, see [3]. Splitting single logical structure into multiple ones leads to more complicated management, extra pointer dereferences and overall less maintainable code. When that split-away part is a lock, it complicates things even further. With no performance benefits, there are no reasons for this split. Merging the vm_lock back into vm_area_struct also allows vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset. Move vm_lock back into vm_area_struct, aligning it at the cacheline boundary and changing the cache to be cacheline-aligned as well. With kernel compiled using defconfig, this causes VMA memory consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes: slabinfo before: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 160 51 2 : ... slabinfo after moving vm_lock: <name> ... <objsize> <objperslab> <pagesperslab> : ... vm_area_struct ... 256 32 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages, which is 5.5MB per 100000 VMAs. Note that the size of this structure is dependent on the kernel configuration and typically the original size is higher than 160 bytes. Therefore these calculations are close to the worst case scenario. A more realistic vm_area_struct usage before this change is: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 176 46 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages, which is 3.9MB per 100000 VMAs. This memory consumption growth can be addressed later by optimizing the vm_lock. [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/ [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/ [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20250213224655.1680278-3-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:17 -07:00
Lorenzo Stoakes	0b6d4853d1	tools/selftests: add file/shmem-backed mapping guard region tests Extend the guard region self tests to explicitly assert that guard regions work correctly for functionality specific to file-backed and shmem mappings. In addition to testing all of the existing guard region functionality that is currently tested against anonymous mappings against file-backed and shmem mappings (except those which are exclusive to anonymous mapping), we now also: * Test that MADV_SEQUENTIAL does not cause unexpected readahead behaviour. * Test that MAP_PRIVATE behaves as expected with guard regions installed in both a shared and private mapping of an fd. * Test that a read-only file can correctly establish guard regions. * Test a probable fault-around case does not interfere with guard regions (or vice-versa). * Test that truncation does not eliminate guard regions. * Test that hole punching functions as expected in the presence of guard regions. * Test that a read-only mapping of a memfd write sealed mapping can have guard regions established within it and function correctly without violation of the seal. * Test that guard regions installed into a mapping of the anonymous zero page function correctly. Link: https://lkml.kernel.org/r/90c16bec5fcaafcd1700dfa3e9988c3e1aa9ac1d.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Lorenzo Stoakes	272f37d3e9	tools/selftests: expand all guard region tests to file-backed Extend the guard region tests to allow for test fixture variants for anon, shmem, and local file files. This allows us to assert that each of the expected behaviours of anonymous memory also applies correctly to file-backed (both shmem and an a file created locally in the current working directory) and thus asserts the same correctness guarantees as all the remaining tests do. The fixture teardown is now performed in the parent process rather than child forked ones, meaning cleanup is always performed, including unlinking any generated temporary files. Additionally the variant fixture data type now contains an enum value indicating the type of backing store and the mmap() invocation is abstracted to allow for the mapping of whichever backing store the variant is testing. We adjust tests as necessary to account for the fact they may now reference files rather than anonymous memory. Link: https://lkml.kernel.org/r/ab42228d2bd9b8aa18e9faebcd5c88732a7e5820.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Lorenzo Stoakes	ce1c0824fc	selftests/mm: rename guard-pages to guard-regions The feature formerly referred to as guard pages is more correctly referred to as 'guard regions', as in fact no pages are ever allocated in the process of installing the regions. To avoid confusion, rename the tests accordingly. [lorenzo.stoakes@oracle.com: fix guard regions invocation] Link: https://lkml.kernel.org/r/13426c71-d069-4407-9340-b227ff8b8736@lucifer.local Link: https://lkml.kernel.org/r/1c3cd04a3f69b5756b94bda701ac88325a9be18b.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Mark Brown	85968b6a20	selftests/mm: allow tests to run with no huge pages support Currently the mm selftests refuse to run if huge pages are not available in the current system but this is an optional feature and not all the tests actually require them. Change the test during startup to be non-fatal and skip or omit tests which actually rely on having huge pages, allowing the other tests to be run. The gup_test does support using madvise() to configure huge pages but it ignores the error code so we just let it run. Link: https://lkml.kernel.org/r/20250212-kselftest-mm-no-hugepages-v1-2-44702f538522@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nico Pache <npache@redhat.com> Cc: Mariano Pache <npache@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:14 -07:00
Eric Salem	3fae696393	selftests: mm: fix typo Fix misspelling. Link: https://lkml.kernel.org/r/77e0e915-36c3-4c95-84b8-0b73aaa17951@gmail.com Signed-off-by: Eric Salem <ericsalem@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:11 -07:00
Mark Brown	023fff71d8	selftests/mm: fix thuge-gen test name uniqueness The thuge-gen test_mmap() and test_shmget() tests are repeatedly run for a variety of sizes but always report the result of their test with the same name, meaning that automated sysetms running the tests are unable to distinguish between the various tests. Add the supplied sizes to the logged test names to distinguish between runs. My test automation was getting pretty confused about what was going on - the test names are a pretty important external interface. Link: https://lkml.kernel.org/r/20250204-kselftest-mm-fix-dups-v1-1-6afe417ef4bb@kernel.org Fixes: `b38bd9b2c4` ("selftests/mm: thuge-gen: conform to TAP format output") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:03 -07:00
Lorenzo Stoakes	c372473a54	mm: completely abstract unnecessary adj_start calculation The adj_start calculation has been a constant source of confusion in the VMA merge code. There are two cases to consider, one where we adjust the start of the vmg->middle VMA (i.e. the vmg->__adjust_middle_start merge flag is set), in which case adj_start is calculated as: (1) adj_start = vmg->end - vmg->middle->vm_start And the case where we adjust the start of the vmg->next VMA (i.e. the vmg->__adjust_next_start merge flag is set), in which case adj_start is calculated as: (2) adj_start = -(vmg->middle->vm_end - vmg->end) We apply (1) thusly: vmg->middle->vm_start = vmg->middle->vm_start + vmg->end - vmg->middle->vm_start Which simplifies to: vmg->middle->vm_start = vmg->end Similarly, we apply (2) as: vmg->next->vm_start = vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end) Noting that for these VMAs to be mergeable vmg->middle->vm_end == vmg->next->vm_start and so this simplifies to: vmg->next->vm_start = vmg->next->vm_start + -(vmg->next->vm_start - vmg->end) Which simplifies to: vmg->next->vm_start = vmg->end Therefore in each case, we simply need to adjust the start of the VMA to vmg->end (!) and can do away with this adj_start calculation. The only caveat is that we must ensure we update the vm_pgoff field correctly. We therefore abstract this entire calculation to a new function vmg_adjust_set_range() which performs this calculation and sets the adjusted VMA's new range using the general vma_set_range() function. We also must update vma_adjust_trans_huge() which expects the now-abstracted adj_start parameter. It turns out this is wholly unnecessary. In vma_adjust_trans_huge() the relevant code is: if (adjust_next > 0) { struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end); unsigned long nstart = next->vm_start; nstart += adjust_next; split_huge_pmd_if_needed(next, nstart); } The only case where this is relevant is when vmg->__adjust_middle_start is specified (in which case adj_next would have been positive), i.e. the one in which the vma specified is vmg->prev and this the sought 'next' VMA would be vmg->middle. We can therefore eliminate the find_vma() invocation altogether and simply provide the vmg->middle VMA in this instance, or NULL otherwise. Again we have an adj_next offset calculation: next->vm_start + vmg->end - vmg->middle->vm_start Where next == vmg->middle this simplifies to vmg->end as previously demonstrated. Therefore nstart is equal to vmg->end, which is already passed to vma_adjust_trans_huge() via the 'end' parameter and so this code (rather delightfully) simplifies to: if (next) split_huge_pmd_if_needed(next, end); With these changes in place, it becomes silly for commit_merge() to return vmg->target, as it is always the same and threaded through vmg, so we finally change commit_merge() to return an error value once again. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/7bce2cd4b5afb56211822835d145471280c3dccc.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	fe3e9cf0d7	mm: eliminate adj_start parameter from commit_merge() Introduce internal vmg->__adjust_middle_start and vmg->__adjust_next_start merge flags, enabling us to indicate to commit_merge() that we are performing a merge which either spans only part of vmg->middle, or part of vmg->next respectively. In the former instance, we change the start of vmg->middle to match the attributes of vmg->prev, without spanning all of vmg->middle. This implies that vmg->prev->vm_end and vmg->middle->vm_start are both increased to form the new merged VMA (vmg->prev) and the new subsequent VMA (vmg->middle). In the latter case, we change the end of vmg->middle to match the attributes of vmg->next, without spanning all of vmg->next. This implies that vmg->middle->vm_end and vmg->next->vm_start are both decreased to form the new merged VMA (vmg->next) and the new prior VMA (vmg->middle). Since we now have a stable set of prev, middle, next VMAs threaded through vmg and with these flags set know what is happening, we can perform the calculation in commit_merge() instead. This allows us to drop the confusing adj_start parameter and instead pass semantic information to commit_merge(). In the latter case the -(middle->vm_end - start) calculation becomes -(middle->vm-end - vmg->end), however this is correct as vmg->end is set to the start parameter. This is because in this case (rather confusingly), we manipulate vmg->middle, but ultimately return vmg->next, whose range will be correctly specified. At this point vmg->start, end is the new range for the prior VMA rather than the merged one. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/bcec0cd980b373a5eb02236cb033034ce1effe42.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	6ab2d9c7c6	mm: further refactor commit_merge() The current VMA merge mechanism contains a number of confusing mechanisms around removal of VMAs on merge and the shrinking of the VMA adjacent to vma->target in the case of merges which result in a partial merge with that adjacent VMA. Since we now have a STABLE set of VMAs - prev, middle, next - we are now able to have the caller of commit_merge() explicitly tell us which VMAs need deleting, using newly introduced internal VMA merge flags. Doing so allows us to embed this state within the VMG and remove the confusing remove, remove2 parameters from commit_merge(). We additionally are able to eliminate the highly confusing and misleading 'expanded' parameter - a parameter that in reality refers to whether or not the return VMA is the target one or the one immediately adjacent. We can infer which is the case from whether or not the adj_start parameter is negative. This also allows us to simplify further logic around iterator configuration and VMA iterator stores. Doing so means we can also eliminate the adjust parameter, as we are able to infer which VMA ought to be adjusted from adj_start - a positive value implies we adjust the start of 'middle', a negative one implies we adjust the start of 'next'. We are then able to have commit_merge() explicitly return the target VMA, or NULL on inability to pre-allocate memory. Errors were previously filtered so behaviour does not change. We additionally move from the slightly odd use of a bitwise-flag enum vmg->merge_flags field to vmg bitfields. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/7bf2ed24af68aac18672b7acebbd9102f48c5b03.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	3a75ccba04	mm: simplify vma merge structure and expand comments Patch series "mm: further simplify VMA merge operation", v3. While significant efforts have been made to improve the VMA merge operation, there remains remnants of the bad (or rather confusing) old days, which make the code difficult to understand, more bug prone and thus harder to modify. This series attempts to significantly improve matters in a number of respects - with a focus on simplifying the commit_merge() function which actually actions the merge operation - and importantly, adjusting the two most confusing merge cases - those in which we 'adjust' the VMA immediately adjacent to the one being merged. One source of confusion are the VMAs being threaded through the operation themselves - vmg->prev, vmg->vma and vmg->next. At the start of the operation, vmg->vma is either NULL if a new VMA is propose to be added, or if not then a pointer to an existing VMA being modified, and prev/next are (perhaps not present) VMAs sat immediately before and after the range specified in vmg->start, end, respectively. However, during the VMA merge operation, we change vmg->start, end and pgoff to span the newly merged range and vmg->vma to either be: a. The ultimately returned VMA (in most cases) or b. A VMA which we will manipulate, but ultimately instead return vmg->next. Case b. especially here is confusing for somebody reading this code, but the fact we update this state, along with vmg->start, end, pgoff only makes matters worse. We simplify things by replacing vmg->vma with vmg->middle and never changing it - this is always either NULL (for a new VMA) or the VMA being modified between vmg->prev and vmg->next. We further simplify by placing the merged VMA in a new vmg->target field - whether case b. above is the case or not. The reader of the code can now simply rely on vmg->middle being the middle VMA and vmg->target being the ultimately merged VMA. We additionally tackle the confusing cases where we 'adjust' VMAs other than the one we ultimately return as the merged VMA (this includes case b. above). These are: (1) merge <-----------> \|------\|\|--------\| \|------------\|---\| \| prev \|\| middle \| -> \| target \| m \| \|------\|\|--------\| \|------------\|---\| In which case middle must be adjusted so middle->vm_start is increased as well as performing the merge. (2) (equivalent to case b. above) <-------------> \|---------\|\|------\| \|---\|-------------\| \| middle \|\| next \| -> \| m \| target \| \|---------\|\|------\| \|---\|-------------\| In which case next must be adjusted so next->vm_start is decreased as well as performing the merge. This cases have previously been performed by calculating and passing around a dubious and confusing 'adj_start' parameter along side a pointer to an 'adjust' VMA indicating which VMA requires additional adjustment (middle in case 1 and next in case 2). With the VMG structure in place we are able to avoid this by simply setting a merge flag to describe each case: (1) Sets the vmg->__adjust_middle_start flag (2) Sets the vmg->__adjust_next_start flag By doing so it turns out we can vastly simplify the logic and calculate what is required to perform the operation. Taken together the refactorings make it far easier to understand what is being done even in these more confusing cases, make the code far more maintainable, debuggable, and testable, providing more internal state indicating what is happening in the merge operation. The changes have no functional net impact on the merge operation and everything should still behave as it did before. This patch (of 5): The merge code, while much improved, still has a number of points of confusion. As part of a broader series cleaning this up to make this more maintainable, we start by addressing some confusion around vma_merge_struct fields. So far, the caller either provides no vmg->vma (a new VMA) or supplies the existing VMA which is being altered, setting vmg->start,end,pgoff to the proposed VMA dimensions. vmg->vma is then updated, as are vmg->start,end,pgoff as the merge process proceeds and the appropriate merge strategy is determined. This is rather confusing, as vmg->vma starts off as the 'middle' VMA between vmg->prev,next, but becomes the 'target' VMA, except in one specific edge case (merge next, shrink middle). Int his patch we introduce vmg->middle to describe the VMA that is between vmg->prev and vmg->next, and does NOT change during the merge operation. We replace vmg->vma with vmg->target, and use this only during the merge operation itself. Aside from the merge right, shrink middle case, this becomes the VMA that forms the basis of the VMA that is returned. This edge case can be addressed in a future commit. We also add a number of comments to explain what is going on. Finally, we adjust the ASCII diagrams showing each merge case in vma_merge_existing_range() to be clearer - the arrow range previously showed the vmg->start, end spanned area, but it is clearer to change this to show the final merged VMA. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/cover.1738326519.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/4dfe60f1419d55e5d0516f56349695d73a57184c.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:01 -07:00
Zi Yan	ad4c9bb541	selftests/mm: test splitting file-backed THP to any lower order Now split_huge_page*() supports shmem THP split to any lower order. Test it. The test now reads file content out after split to check if the split corrupts the file data. Link: https://lkml.kernel.org/r/20250122161928.1240637-3-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:05:56 -07:00
Zi Yan	035a112e5f	selftests/mm: make file-backed THP split work by writing PMD size data Commit `acd7ccb284` ("mm: shmem: add large folio support for tmpfs") changes huge=always to allocate THP/mTHP based on write size and split_huge_page_test does not write PMD size data, so file-back THP is not created during the test. Fix it by writing PMD size data. Link: https://lkml.kernel.org/r/20250122161928.1240637-1-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:05:56 -07:00
Rafael Aquini	67a2f86846	selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation We noticed that uffd-stress test was always failing to run when invoked for the hugetlb profiles on x86_64 systems with a processor count of 64 or bigger: ... # ------------------------------------ # running ./uffd-stress hugetlb 128 32 # ------------------------------------ # ERROR: invalid MiB (errno=9, @uffd-stress.c:459) ... # [FAIL] not ok 3 uffd-stress hugetlb 128 32 # exit=1 ... The problem boils down to how run_vmtests.sh (mis)calculates the size of the region it feeds to uffd-stress. The latter expects to see an amount of MiB while the former is just giving out the number of free hugepages halved down. This measurement discrepancy ends up violating uffd-stress' assertion on number of hugetlb pages allocated per CPU, causing it to bail out with the error above. This commit fixes that issue by adjusting run_vmtests.sh's half_ufd_size_MB calculation so it properly renders the region size in MiB, as expected, while maintaining all of its original constraints in place. Link: https://lkml.kernel.org/r/20250218192251.53243-1-aquini@redhat.com Fixes: `2e47a445d7` ("selftests/mm: run_vmtests.sh: fix hugetlb mem size calculation") Signed-off-by: Rafael Aquini <raquini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 17:40:25 -07:00
Rae Moar	2e0cf2b32f	kunit: tool: add test to check parsing late test plan Add test to check for the infinite loop caused by the inability to parse a late test plan. The test parses the following output: TAP version 13 ok 4 test4 1..4 Link: https://lore.kernel.org/r/20250313192714.1380005-1-rmoar@google.com Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-15 18:13:43 -06:00
Rae Moar	1d4c06d519	kunit: tool: Fix bug in parsing test plan A bug was identified where the KTAP below caused an infinite loop: TAP version 13 ok 4 test_case 1..4 The infinite loop was caused by the parser not parsing a test plan if following a test result line. Fix this bug by parsing test plan line to avoid the infinite loop. Link: https://lore.kernel.org/r/20250313192714.1380005-1-rmoar@google.com Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-15 18:13:38 -06:00
Saket Kumar Bhaskar	8c10109a97	selftests/bpf: Fix sockopt selftest failure on powerpc The SO_RCVLOWAT option is defined as 18 in the selftest header, which matches the generic definition. However, on powerpc, SO_RCVLOWAT is defined as 16. This discrepancy causes sol_socket_sockopt() to fail with the default switch case on powerpc. This commit fixes by defining SO_RCVLOWAT as 16 for powerpc. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311084647.3686544-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:49:24 -07:00
Viktor Malik	de07b18289	selftests/bpf: Fix string read in strncmp benchmark The strncmp benchmark uses the bpf_strncmp helper and a hand-written loop to compare two strings. The values of the strings are filled from userspace. One of the strings is non-const (in .bss) while the other is const (in .rodata) since that is the requirement of bpf_strncmp. The problem is that in the hand-written loop, Clang optimizes the reads from the const string to always return 0 which breaks the benchmark. Use barrier_var to prevent the optimization. The effect can be seen on the strncmp-no-helper variant. Before this change: # ./bench strncmp-no-helper Setting up benchmark 'strncmp-no-helper'... Benchmark 'strncmp-no-helper' started. Iter 0 (112.309us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 1 (-23.238us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 2 ( 58.994us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 3 (-30.466us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 4 ( 29.996us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 5 ( 16.949us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 6 (-60.035us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Summary: hits 0.000 ± 0.000M/s ( 0.000M/prod), drops 0.000 ± 0.000M/s, total operations 0.000 ± 0.000M/s After this change: # ./bench strncmp-no-helper Setting up benchmark 'strncmp-no-helper'... Benchmark 'strncmp-no-helper' started. Iter 0 ( 77.711us): hits 5.534M/s ( 5.534M/prod), drops 0.000M/s, total operations 5.534M/s Iter 1 ( 11.215us): hits 6.006M/s ( 6.006M/prod), drops 0.000M/s, total operations 6.006M/s Iter 2 (-14.253us): hits 5.931M/s ( 5.931M/prod), drops 0.000M/s, total operations 5.931M/s Iter 3 ( 59.087us): hits 6.005M/s ( 6.005M/prod), drops 0.000M/s, total operations 6.005M/s Iter 4 (-21.379us): hits 6.010M/s ( 6.010M/prod), drops 0.000M/s, total operations 6.010M/s Iter 5 (-20.310us): hits 5.861M/s ( 5.861M/prod), drops 0.000M/s, total operations 5.861M/s Iter 6 ( 53.937us): hits 6.004M/s ( 6.004M/prod), drops 0.000M/s, total operations 6.004M/s Summary: hits 5.969 ± 0.061M/s ( 5.969M/prod), drops 0.000 ± 0.000M/s, total operations 5.969 ± 0.061M/s Fixes: `9c42652f8b` ("selftests/bpf: Add benchmark for bpf_strncmp() helper") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20250313122852.1365202-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:49:24 -07:00
Kumar Kartikeya Dwivedi	1f375aef6c	selftests/bpf: Fix arena_spin_lock compilation on PowerPC Venkat reported a compilation error for BPF selftests on PowerPC [0]. The crux of the error is the following message: In file included from progs/arena_spin_lock.c:7: /root/bpf-next/tools/testing/selftests/bpf/bpf_arena_spin_lock.h:122:8: error: member reference base type '__attribute__((address_space(1))) u32' (aka '__attribute__((address_space(1))) unsigned int') is not a structure or union 122 \| old = atomic_read(&lock->val); This is because PowerPC overrides the qspinlock type changing the lock->val member's type from atomic_t to u32. To remedy this, import the asm-generic version in the arena spin lock header, name it __qspinlock (since it's aliased to arena_spinlock_t, the actual name hardly matters), and adjust the selftest to not depend on the type in vmlinux.h. [0]: https://lore.kernel.org/bpf/7bc80a3b-d708-4735-aa3b-6a8c21720f9d@linux.ibm.com Fixes: `88d706ba7c` ("selftests/bpf: Introduce arena spin lock") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311154244.3775505-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:59 -07:00
Blaise Boscaccy	7987f1627e	selftests/bpf: Add a kernel flag test for LSM bpf hook This test exercises the kernel flag added to security_bpf by effectively blocking light-skeletons from loading while allowing normal skeletons to function as-is. Since this should work with any arbitrary BPF program, an existing program from LSKELS_EXTRA was used as a test payload. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250310221737.821889-3-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Blaise Boscaccy	082f1db02c	security: Propagate caller information in bpf hooks Certain bpf syscall subcommands are available for usage from both userspace and the kernel. LSM modules or eBPF gatekeeper programs may need to take a different course of action depending on whether or not a BPF syscall originated from the kernel or userspace. Additionally, some of the bpf_attr struct fields contain pointers to arbitrary memory. Currently the functionality to determine whether or not a pointer refers to kernel memory or userspace memory is exposed to the bpf verifier, but that information is missing from various LSM hooks. Here we augment the LSM hooks to provide this data, by simply passing a boolean flag indicating whether or not the call originated in the kernel, in any hook that contains a bpf_attr struct that corresponds to a subcommand that may be called from the kernel. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250310221737.821889-2-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Chen Ni	a03d375330	selftests/bpf: Convert comma to semicolon Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Anton Protopopov <aspsk@isovalent.com> Link: https://lore.kernel.org/bpf/20250310032045.651068-1-nichen@iscas.ac.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Anton Protopopov	caa4237a79	selftests/bpf: Fix selection of static vs. dynamic LLVM The Makefile uses the exit code of the `llvm-config --link-static --libs` command to choose between statically-linked and dynamically-linked LLVMs. The stdout and stderr of that command are redirected to /dev/null. To redirect the output the "&>" construction is used, which might not be supported by /bin/sh, which is executed by make for $(shell ...) commands. On such systems the test will fail even if static LLVM is actually supported. Replace "&>" by ">/dev/null 2>&1" to fix this. Fixes: `2a9d30fac8` ("selftests/bpf: Support dynamically linking LLVM if static is not available") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/20250310145112.1261241-1-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Emil Tsalapatis	c06707ff07	selftests: bpf: fix duplicate selftests in cpumask_success. The BPF cpumask selftests are currently run twice in test_progs/cpumask.c, once by traversing cpumask_success_testcases, and once by invoking RUN_TESTS(cpumask_success). Remove the invocation of RUN_TESTS to properly run the selftests only once. Now that the tests are run only through cpumask_success_testscases, add to it the missing test_refcount_null_tracking testcase. Also remove the __success annotation from it, since it is now loaded and invoked by the runner. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Emil Tsalapatis	918ba2636d	selftests: bpf: add bpf_cpumask_populate selftests Add selftests for the bpf_cpumask_populate helper that sets a bpf_cpumask to a bit pattern provided by a BPF program. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	1041b8bc9f	selftests/bpf: lwt_seg6local: Move test to test_progs test_lwt_seg6local.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_seg6local.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_seg6local.c. Use the network helpers instead of `nc` to exchange the final packet. Remove test_lwt_seg6local.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-2-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	b1d85ff517	selftests/bpf: lwt_seg6local: Remove unused routes Some routes in fb00:: are initialized during setup, even though they aren't needed by the test as the UDP packets will travel through the lightweight tunnels. Remove these unnecessary routes. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-1-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Feng Yang	339c1f8ea1	selftests/bpf: Fix cap_enable_effective() return code The caller of cap_enable_effective() expects negative error code. Fix it. Before: failed to restore CAP_SYS_ADMIN: -1, Unknown error -1 After: failed to restore CAP_SYS_ADMIN: -3, No such process failed to restore CAP_SYS_ADMIN: -22, Invalid argument Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305022234.44932-1-yangfeng59949@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	8bda5b787d	selftests/bpf: Fix dangling stdout seen by traffic monitor thread Traffic monitor thread may see dangling stdout as the main thread closes and reassigns stdout without protection. This happens when the main thread finishes one subtest and moves to another one in the same netns_new() scope. The issue can be reproduced by running test_progs repeatedly with traffic monitor enabled: for ((i=1;i<=100;i++)); do ./test_progs -a flow_dissector_skb* -m '*' done For restoring stdout in crash_handler(), since it does not really care about closing stdout, simlpy flush stdout and restore it to the original one. Then, Fix the issue by consolidating stdio_restore_cleanup() and stdio_restore(), and protecting the use/close/assignment of stdout with a lock. The locking in the main thread is always performed regradless of whether traffic monitor is running or not for simplicity. It won't have any side-effect. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-3-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	34a25aabcd	selftests/bpf: Allow assigning traffic monitor print function Allow users to change traffic monitor's print function. If not provided, traffic monitor will print to stdout by default. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305182057.2802606-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	aeb5bbb025	selftests/bpf: Clean up call sites of stdio_restore() reset_affinity() and save_ns() are only called in run_one_test(). There is no need to call stdio_restore() in reset_affinity() and save_ns() if stdio_restore() is moved right after a test finishes in run_one_test(). Also remove an unnecessary check of env.stdout_saved in crash_handler() by moving env.stdout_saved assignment to the beginning of main(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	f5e288943e	selftests/bpf: Move test_lwt_ip_encap to test_progs test_lwt_ip_encap.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_ip_encap.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_ip_encap.c. Rework the GSO part to avoid using nc and dd. Remove test_lwt_ip_encap.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250304-lwt_ip-v1-1-8fdeb9e79a56@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Kumar Kartikeya Dwivedi	2dfc8186d6	selftests/bpf: Add tests for arena spin lock Add some basic selftests for qspinlock built over BPF arena using cond_break_label macro. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Kumar Kartikeya Dwivedi	88d706ba7c	selftests/bpf: Introduce arena spin lock Implement queued spin lock algorithm as BPF program for lock words living in BPF arena. The algorithm is copied from kernel/locking/qspinlock.c and adapted for BPF use. We first implement abstract helpers for portable atomics and acquire/release load instructions, by relying on X86_64 presence to elide expensive barriers and rely on implementation details of the JIT, and fall back to slow but correct implementations elsewhere. When support for acquire/release load/stores lands, we can improve this state. Then, the qspinlock algorithm is adapted to remove dependence on multi-word atomics due to lack of support in BPF ISA. For instance, xchg_tail cannot use 16-bit xchg, and needs to be a implemented as a 32-bit try_cmpxchg loop. Loops which are seemingly infinite from verifier PoV are annotated with cond_break_label macro to return an error. Only 1024 NR_CPUs are supported. Note that the slow path is a global function, hence the verifier doesn't know the return value's precision. The recommended way of usage is to always test against zero for success, and not ret < 0 for error, as the verifier would assume ret > 0 has not been accounted for. Add comments in the function documentation about this quirk. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Kumar Kartikeya Dwivedi	4b7ede0be3	selftests/bpf: Introduce cond_break_label Add a new cond_break_label macro that jumps to the specified label when the cond_break termination check fires, and allows us to better handle the uncontrolled termination of the loop. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Eduard Zingerman	871ef8d50e	bpf: correct use/def for may_goto instruction may_goto instruction does not use any registers, but in compute_insn_live_regs() it was treated as a regular conditional jump of kind BPF_K with r0 as source register. Thus unnecessarily marking r0 as used. Fixes: `14c8552db6` ("bpf: simple DFA-based live registers analysis") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305085436.2731464-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:30 -07:00
Eduard Zingerman	2ea8f6a1cd	selftests/bpf: test cases for compute_live_registers() Cover instructions from each kind: - assignment - arithmetic - store/load - endian conversion - atomics - branches, conditional branches, may_goto, calls - LD_ABS/LD_IND - address_space_cast Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-6-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:30 -07:00
Eduard Zingerman	14c8552db6	bpf: simple DFA-based live registers analysis Compute may-live registers before each instruction in the program. The register is live before the instruction I if it is read by I or some instruction S following I during program execution and is not overwritten between I and S. This information would be used in the next patch as a hint in func_states_equal(). Use a simple algorithm described in [1] to compute this information: - define the following: - I.use : a set of all registers read by instruction I; - I.def : a set of all registers written by instruction I; - I.in : a set of all registers that may be alive before I execution; - I.out : a set of all registers that may be alive after I execution; - I.successors : a set of instructions S that might immediately follow I for some program execution; - associate separate empty sets 'I.in' and 'I.out' with each instruction; - visit each instruction in a postorder and update corresponding 'I.in' and 'I.out' sets as follows: I.out = U [S.in for S in I.successors] I.in = (I.out / I.def) U I.use (where U stands for set union, / stands for set difference) - repeat the computation while I.{in,out} changes for any instruction. On implementation side keep things as simple, as possible: - check_cfg() already marks instructions EXPLORED in post-order, modify it to save the index of each EXPLORED instruction in a vector; - represent I.{in,out,use,def} as bitmasks; - don't split the program into basic blocks and don't maintain the work queue, instead: - do fixed-point computation by visiting each instruction; - maintain a simple 'changed' flag if I.{in,out} for any instruction change; Measurements show that even such simplistic implementation does not add measurable verification time overhead (for selftests, at-least). Note on check_cfg() ex_insn_beg/ex_done change: To avoid out of bounds access to env->cfg.insn_postorder array, it should be guaranteed that instruction transitions to EXPLORED state only once. Previously this was not the fact for incorrect programs with direct calls to exception callbacks. The 'align' selftest needs adjustment to skip computed insn/live registers printout. Otherwise it matches lines from the live registers printout. [1] https://en.wikipedia.org/wiki/Live-variable_analysis Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:29 -07:00
Peilin Ye	ff3afe5da9	selftests/bpf: Add selftests for load-acquire and store-release instructions Add several ./test_progs tests: - arena_atomics/load_acquire - arena_atomics/store_release - verifier_load_acquire/* - verifier_store_release/* - verifier_precision/bpf_load_acquire - verifier_precision/bpf_store_release The last two tests are added to check if backtrack_insn() handles the new instructions correctly. Additionally, the last test also makes sure that the verifier "remembers" the value (in src_reg) we store-release into e.g. a stack slot. For example, if we take a look at the test program: #0: r1 = 8; /* store_release((u64 )(r10 - 8), r1); / #1: .8byte %[store_release]; #2: r1 = (u64 )(r10 - 8); #3: r2 = r10; #4: r2 += r1; #5: r0 = 0; #6: exit; At #1, if the verifier doesn't remember that we wrote 8 to the stack, then later at #4 we would be adding an unbounded scalar value to the stack pointer, which would cause the program to be rejected: VERIFIER LOG: ============= ... math between fp pointer and register with unbounded min value is not allowed For easier CI integration, instead of using built-ins like __atomic_{load,store}_n() which depend on the new __BPF_FEATURE_LOAD_ACQ_STORE_REL pre-defined macro, manually craft load-acquire/store-release instructions using __imm_insn(), as suggested by Eduard. All new tests depend on: (1) Clang major version >= 18, and (2) ENABLE_ATOMICS_TESTS is defined (currently implies -mcpu=v3 or v4), and (3) JIT supports load-acquire/store-release (currently arm64 and x86-64) In .../progs/arena_atomics.c: /* 8-byte-aligned / __u8 __arena_global load_acquire8_value = 0x12; / 1-byte hole */ __u16 __arena_global load_acquire16_value = 0x1234; That 1-byte hole in the .addr_space.1 ELF section caused clang-17 to crash: fatal error: error in backend: unable to write nop sequence of 1 bytes To work around such llvm-17 CI job failures, conditionally define __arena_global variables as 64-bit if __clang_major__ < 18, to make sure .addr_space.1 has no holes. Ideally we should avoid compiling this file using clang-17 at all (arena tests depend on __BPF_FEATURE_ADDR_SPACE_CAST, and are skipped for llvm-17 anyway), but that is a separate topic. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/1b46c6feaf0f1b6984d9ec80e500cc7383e9da1a.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:29 -07:00
Kumar Kartikeya Dwivedi	2fb761823e	bpf, x86: Add x86 JIT support for timed may_goto Implement the arch_bpf_timed_may_goto function using inline assembly to have control over which registers are spilled, and use our special protocol of using BPF_REG_AX as an argument into the function, and as the return value when going back. Emit call depth accounting for the call made from this stub, and ensure we don't have naked returns (when rethunk mitigations are enabled) by falling back to the RET macro (instead of retq). After popping all saved registers, the return address into the BPF program should be on top of the stack. Since the JIT support is now enabled, ensure selftests which are checking the produced may_goto sequences do not break by adjusting them. Make sure we still test the old may_goto sequence on other architectures, while testing the new sequence on x86_64. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250304003239.2390751-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:28 -07:00
Mykyta Yatsenko	6419d08b6c	selftests/bpf: Add tests for bpf_object__prepare Add selftests, checking that running bpf_object__prepare successfully creates maps before load step. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-5-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:28 -07:00
Bastien Curutchet (eBPF Foundation)	a54e700696	selftests/bpf: test_tunnel: Remove test_tunnel.sh All tests from test_tunnel.sh have been migrated into test test_progs. The last test remaining in the script is the test_ipip() that is already covered in the test_prog framework by the NONE case of test_ipip_tunnel(). Remove the test_tunnel.sh script and its Makefile entry Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-10-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	05cd60ab57	selftests/bpf: test_tunnel: Move ip6tnl tunnel tests to test_progs ip6tnl tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6tnl tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ipip6() and test_ip6ip6() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-9-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	260f2da62d	selftests/bpf: test_tunnel: Move ip6geneve tunnel test to test_progs ip6geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-8-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	bd477738e6	selftests/bpf: test_tunnel: Move geneve tunnel test to test_progs geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-7-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	ea60b6a524	selftests/bpf: test_tunnel: Move ip6erspan tunnel test to test_progs ip6erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-6-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	cadb08a4d3	selftests/bpf: test_tunnel: Move erspan tunnel tests to test_progs erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-5-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	856818b28f	selftests/bpf: test_tunnel: Move ip6gre tunnel test to test_progs ip6gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6gre tunnels. It uses the same network topology and the same BPF programs than the script. Disable the IPv6 DAD feature because it can take lot of time and cause some tests to fail depending on the environment they're run on. Remove test_ip6gre() and test_ip6gretap() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-4-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	257dfd1c6b	selftests/bpf: test_tunnel: Move gre tunnel test to test_progs gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test gre tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_gre() and test_gre_no_tunnel_key() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-3-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	fcb39996a2	selftests/bpf: test_tunnel: Add ping helpers All tests use more or less the same ping commands as final validation. Also test_ping()'s return value is checked with ASSERT_OK() while this check is already done by the SYS() macro inside test_ping(). Create helpers around test_ping() and use them in the tests to avoid code duplication. Remove the unnecessary ASSERT_OK() from the tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-2-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Bastien Curutchet (eBPF Foundation)	5d6aa606c1	selftests/bpf: test_tunnel: Add generic_attach* helpers A fair amount of code duplication is present among tests to attach BPF programs. Create generic_attach* helpers that attach BPF programs to a given interface. Use ASSERT_OK_FD() instead of ASSERT_GE() to check fd's validity. Use these helpers in all the available tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-1-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	346e4ca462	veristat: Report program type guess results to sdterr In order not to pollute CSV output, e.g.: $ ./veristat -o csv exceptions_ext.bpf.o > test.csv Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/extension... Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/throwing_extension... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	2d95b3f582	veristat: Strerror expects positive number (errno) Before: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': Unknown error -2 ... After: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': No such file or directory ... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	c0d078da7a	veristat: @files-list.txt notation for object files list Allow reading object file list from file. E.g. the following command: ./veristat @list.txt Is equivalent to the following invocation: ./veristat line-1 line-2 ... line-N Where line-i corresponds to lines from list.txt. Lines starting with '#' are ignored. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Kumar Kartikeya Dwivedi	72ed076abf	selftests/bpf: Add tests for extending sleepable global subprogs Add tests for freplace behavior with the combination of sleepable and non-sleepable global subprogs. The changes_pkt_data selftest did all the hardwork, so simply rename it and include new support for more summarization tests for might_sleep bit. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Kumar Kartikeya Dwivedi	b2bb703434	selftests/bpf: Test sleepable global subprogs in atomic contexts Add tests for rejecting sleepable and accepting non-sleepable global function calls in atomic contexts. For spin locks, we still reject all global function calls. Once resilient spin locks land, we will carefully lift in cases where we deem it safe. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Alexis Lothoré (eBPF Foundation)	93cf4e537e	bpf/selftests: test_select_reuseport_kern: Remove unused header test_select_reuseport_kern.c is currently including <stdlib.h>, but it does not use any definition from there. Remove stdlib.h inclusion from test_select_reuseport_kern.c Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250227-remove_wrong_header-v1-1-bc94eb4e2f73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Yonghong Song	2222aa1c89	selftests/bpf: Add selftests allowing cgroup prog pre-ordering Add a few selftests with cgroup prog pre-ordering. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250224230121.283601-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Jiayuan Chen	7f260af1f2	selftests/bpf: Fixes for test_maps test BPF CI has failed 3 times in the last 24 hours. Add retry for ENOMEM. It's similar to the optimization plan: commit `2f553b032c` ("selftsets/bpf: Retry map update for non-preallocated per-cpu map") Failed CI: https://github.com/kernel-patches/bpf/actions/runs/13549227497/job/37868926343 https://github.com/kernel-patches/bpf/actions/runs/13548089029/job/37865812030 https://github.com/kernel-patches/bpf/actions/runs/13553536268/job/37883329296 selftests/bpf: Fixes for test_maps test Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' ...... test_task_storage_map_stress_lookup:PASS test_maps: OK, 0 SKIPPED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-4-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Jiayuan Chen	93a279b65a	selftests/bpf: Allow auto port binding for bpf nf Allow auto port binding for bpf nf test to avoid binding conflict. ./test_progs -a bpf_nf 24/1 bpf_nf/xdp-ct:OK 24/2 bpf_nf/tc-bpf-ct:OK 24/3 bpf_nf/alloc_release:OK 24/4 bpf_nf/insert_insert:OK 24/5 bpf_nf/lookup_insert:OK 24/6 bpf_nf/set_timeout_after_insert:OK 24/7 bpf_nf/set_status_after_insert:OK 24/8 bpf_nf/change_timeout_after_alloc:OK 24/9 bpf_nf/change_status_after_alloc:OK 24/10 bpf_nf/write_not_allowlisted_field:OK 24 bpf_nf:OK Summary: 1/10 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-3-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:24 -07:00
Jiayuan Chen	acf0d6f681	selftests/bpf: Allow auto port binding for cgroup connect Allow auto port binding for cgroup connect test to avoid binding conflict. Result: ./test_progs -a cgroup_v1v2 59 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-2-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:24 -07:00
Mykyta Yatsenko	064e9aacfd	selftests/bpf: Add tests for bpf_dynptr_copy Add XDP setup type for dynptr tests, enabling testing for non-contiguous buffer. Add 2 tests: - test_dynptr_copy - verify correctness for the fast (contiguous buffer) code path. - test_dynptr_copy_xdp - verifies code paths that handle non-contiguous buffer. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250226183201.332713-4-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:20 -07:00
Alison Schofield	114a89b433	cxl/test: Define a CFMWS capable of a 3 way HB interleave The CXL unit test cxl-xor-region.sh is skipping a 1+1+1 region interleave test case because the window is not defined. Additionally, upcoming expansion of 3 way HB interleave test cases (like 2+2+2) require the same window. Replace an unused CFMWS with a 3-way capable CFMWS in the set of CFMWS's loaded when interleave_arithmetic=1. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://patch.msgid.link/20250226221931.2352061-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 16:27:40 -07:00
Dave Jiang	763e15d047	Merge branch 'for-6.15/extended-linear-cache' into cxl-for-next2 Add support for Extended Linear Cache for CXL. Add enumeration support of the cache. Add MCE notification of the aliased memory address.	2025-03-14 16:22:34 -07:00
Dave Jiang	d781a45270	Merge branch 'for-6.15/dirty-shutdown' into cxl-for-next2 Add support for Global Persistent Flush (GPF) and dirty shutdown accounting.	2025-03-14 16:11:42 -07:00
Davidlohr Bueso	6eb52f63ea	tools/testing/cxl: Set Shutdown State support Add support to emulate the CXL Set Shutdown State operation. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-5-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:27 -07:00
Yuquan Wang	2da9ad027e	cxl/pmem: debug invalid serial number data In a nvdimm interleave-set each device with an invalid or zero serial number may cause pmem region initialization to fail, but in cxl case such device could still set cookies of nd_interleave_set and create a nvdimm pmem region. This adds the validation of serial number in cxl pmem region creation. The event of no serial number would cause to fail to set the cookie and pmem region. For cxl-test to work properly, always +1 on mock device's serial number. Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250219040029.515451-2-wangyuquan1236@phytium.com.cn Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:01:04 -07:00
Dave Jiang	9387c6aec0	Merge branch 'for-6.15/fw-first-error-logging' into cxl-for-next2 Add logging support for CXL CPER endpoint and port protocol errors. Including the 2 patches that was completed later. Link: https://lore.kernel.org/linux-cxl/20250123084421.127697-1-Smita.KoralahalliChannabasappa@amd.com/ Link: https://lore.kernel.org/linux-cxl/20250310223839.31342-1-Smita.KoralahalliChannabasappa@amd.com/	2025-03-14 14:27:17 -07:00
Smita Koralahalli	36f257e3b0	acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol errors. The defined trace events cxl_aer_uncorrectable_error and cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them to trace FW-First Protocol errors. Since the CXL code is required to be called from process context and GHES is in interrupt context, use workqueues for processing. Similar to CXL CPER event handling, use kfifo to handle errors as it simplifies queue processing by providing lock free fifo operations. Add the ability for the CXL sub-system to register a workqueue to process CXL CPER protocol errors. [DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:21:45 -07:00
Tamir Duberstein	97c1f302f2	scanf: convert self-test to KUnit Convert the scanf() self-test to a KUnit test. In the interest of keeping the patch reasonably-sized this doesn't refactor the tests into proper parameterized tests - it's all one big test case. Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-3-b98820fa39ff@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-14 13:55:37 -07:00
Niklas Cassel	24a42582b0	selftests: pci_endpoint: Use IRQ_TYPE_* defines from UAPI header In order to improve readability, use the IRQ_TYPE_* defines from the UAPI header rather than using raw values. No functional change. Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Link: https://lore.kernel.org/r/20250310111016.859445-12-cassel@kernel.org	2025-03-14 16:12:04 +00:00
Matthieu Baerts (NGI0)	a1e36ec363	selftests: drv-net: fix merge conflicts resolution After the recent merge between net-next and net, I got some conflicts on my side because the merge resolution was different from Stephen's one [1] I applied on my side in the MPTCP tree. It looks like the code that is now in net-next is using the old way to retrieve the local and remote addresses. This patch is now using the new way, like what was in Stephen's email [1]. Also, in get_interface_info(), there were no conflicts in this area, because that was new code from 'net', but a small adaptation was needed there as well to get the remote address. Fixes: `941defcea7` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Link: https://lore.kernel.org/20250311115758.17a1d414@canb.auug.org.au [1] Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250314-net-next-drv-net-ping-fix-merge-v1-1-0d5c19daf707@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-14 10:24:14 +01:00
Paolo Abeni	941defcea7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: tools/testing/selftests/drivers/net/ping.py `75cc19c8ff` ("selftests: drv-net: add xdp cases for ping.py") `de94e86974` ("selftests: drv-net: store addresses in dict indexed by ipver") https://lore.kernel.org/netdev/20250311115758.17a1d414@canb.auug.org.au/ net/core/devmem.c `a70f891e0f` ("net: devmem: do not WARN conditionally after netdev_rx_queue_restart()") `1d22d3060b` ("net: drop rtnl_lock for queue_mgmt operations") https://lore.kernel.org/netdev/20250313114929.43744df1@canb.auug.org.au/ Adjacent changes: tools/testing/selftests/net/Makefile `6f50175cca` ("selftests: Add IPv6 link-local address generation tests for GRE devices.") `2e5584e0f9` ("selftests/net: expand cmsg_ipv6.sh with ipv4") drivers/net/ethernet/broadcom/bnxt/bnxt.c `661958552e` ("eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic") `fe96d717d3` ("bnxt_en: Extend queue stop/start for TX rings") Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 23:08:11 +01:00
Jason Xing	a1e0783e10	selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN Add selftests for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN BPF socket cases. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250312153523.9860-5-kerneljasonxing@gmail.com	2025-03-13 14:42:53 -07:00
Linus Torvalds	4003c9e787	Including fixes from netfilter, bluetooth and wireless. No known regressions outstanding. Current release - regressions: - wifi: nl80211: fix assoc link handling - eth: lan78xx: sanitize return values of register read/write functions Current release - new code bugs: - ethtool: tsinfo: fix dump command - bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL - eth: mlx5: DR, use the right action structs for STEv3 Previous releases - regressions: - netfilter: nf_tables: make destruction work queue pernet - gre: fix IPv6 link-local address generation. - wifi: iwlwifi: fix TSO preparation - bluetooth: revert "bluetooth: hci_core: fix sleeping function called from invalid context" - ovs: revert "openvswitch: switch to per-action label counting in conntrack" - eth: ice: fix switchdev slow-path in LAG - eth: bonding: fix incorrect MAC address setting to receive NS messages Previous releases - always broken: - core: prevent TX of unreadable skbs - sched: prevent creation of classes with TC_H_ROOT - netfilter: nft_exthdr: fix offset with ipv4_find_option() - wifi: cfg80211: cancel wiphy_work before freeing wiphy - mctp: copy headers if cloned - phy: nxp-c45-tja11xx: add errata for TJA112XA/B - eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx \| tx} - eth: mlx5: bridge, fix the crash caused by LAG state check Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmfS+yMSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOknd8P/3XdIaDXrwVl9BSi37qToRs40i+6qPep 5MZN+wakJkuHQqooADF/UarqHWEtQwFOTJalgsPbrUpsespA0lhmzSRHsk+2Xy// 8o8RuZ2VngN1/UVBaoXb0QOY28CgtDs0SdW6/jST/dEp3bHvW+ZTDR2aaivKAWtJ gdWIWR+Ctw1lpetKJer3loPXjPd2doDedSlf64CiHw7dRFTwiE2AcrTRkYdd18Ml HJjiBkqKMnUTxqWxOU3GJxdl7Qru7CvAfn3XQ0PjCO9hADPJb/uWZDRENauk3/jE 6lJc+EP5AoG3Ce8MLzHr/aayZR3YtyhJ2TaWmyPbe6BJq7tPH84ydKMZnF7UnOd5 Jp4T6rykkJOYnUdiYGiOPHcTxQxUugKUrL3rNvGJep+x3JAi/DJJkHNWEx+EBB55 3YbDTOUiY1bCGOoGJIZRwNzw52O0+7qINvCtAN21sMu4pKr/nYMzO8L9ldFExllj EuOcxr813V09NyoJs+wWcy9FdlVmcPwCsT4Pylrdb+mA1XohPzMwibH3yHk4XCZ6 Aijz5Bb78i4vg+kZzNZmVY11J5yL0/eC8DefMgUe3+JKwyM2TiSn5hHK6tYbTLtM yAjbTkeu/FD6alSRAnIAL+YBHg4ri3fXXjU4vLZA5n3wWIjfqpL7JAFgnK57Bd8Z 1X5k+52VjQzj =lIrc -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, bluetooth and wireless. No known regressions outstanding. Current release - regressions: - wifi: nl80211: fix assoc link handling - eth: lan78xx: sanitize return values of register read/write functions Current release - new code bugs: - ethtool: tsinfo: fix dump command - bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL - eth: mlx5: DR, use the right action structs for STEv3 Previous releases - regressions: - netfilter: nf_tables: make destruction work queue pernet - gre: fix IPv6 link-local address generation. - wifi: iwlwifi: fix TSO preparation - bluetooth: revert "bluetooth: hci_core: fix sleeping function called from invalid context" - ovs: revert "openvswitch: switch to per-action label counting in conntrack" - eth: - ice: fix switchdev slow-path in LAG - bonding: fix incorrect MAC address setting to receive NS messages Previous releases - always broken: - core: prevent TX of unreadable skbs - sched: prevent creation of classes with TC_H_ROOT - netfilter: nft_exthdr: fix offset with ipv4_find_option() - wifi: cfg80211: cancel wiphy_work before freeing wiphy - mctp: copy headers if cloned - phy: nxp-c45-tja11xx: add errata for TJA112XA/B - eth: - bnxt: fix kernel panic in the bnxt_get_queue_stats{rx \| tx} - mlx5: bridge, fix the crash caused by LAG state check" * tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) net: mana: cleanup mana struct after debugfs_remove() net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices net/mlx5: Bridge, fix the crash caused by LAG state check net/mlx5: Lag, Check shared fdb before creating MultiPort E-Switch net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs net/mlx5: HWS, Rightsize bwc matcher priority net/mlx5: DR, use the right action structs for STEv3 Revert "openvswitch: switch to per-action label counting in conntrack" net: openvswitch: remove misbehaving actions length check selftests: Add IPv6 link-local address generation tests for GRE devices. gre: Fix IPv6 link-local address generation. netfilter: nft_exthdr: fix offset with ipv4_find_option() selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT net_sched: Prevent creation of classes with TC_H_ROOT ipvs: prevent integer overflow in do_ip_vs_get_ctl() selftests: netfilter: skip br_netfilter queue tests if kernel is tainted netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree() wifi: mac80211: fix MPDU length parsing for EHT 5/6 GHz qlcnic: fix memory leak issues in qlcnic_sriov_common.c rtase: Fix improper release of ring list entries in rtase_sw_reset ...	2025-03-13 07:58:48 -10:00
Tamir Duberstein	7a79e7daa8	printf: convert self-test to KUnit Convert the printf() self-test to a KUnit test. In the interest of keeping the patch reasonably-sized this doesn't refactor the tests into proper parameterized tests - it's all one big test case. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-1-4d85c361c241@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-13 10:26:33 -07:00
Paolo Abeni	2409fa66e2	netfilter pull request 25-03-13 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmfSqyUACgkQ1w0aZmrP KyGocBAAu0+rdAcW1wwSYk53tRN663WpbbYg/AchJ9ilLH2IXdCzzbeuekRzvSmS /gS2WNANmckL314300RV5vTd9+PCbANjh1248O4S8ZQnhmomoiB3xD4QM9fu8HKZ RE+koHmnHESl+TiiUzgw31OtZEdSpjn7zUzm+qAznBeDXfyBE1YWQ9q8LALi0+3m RW/waYykwRktzHCANBYNF0oqt6siBrHnI9KHgXIfQCEK0mc3A/s8MfO1IJqC177Y It/+4a6qsaGZZsSwGCrQ/40CO3rItzF7jefpeAjv/di224EQ+p2sWZgEiLPCMtvk 6WSfYhKNiDnKGAe1u9hCG2RdaSQ8Og7BsBWFeB56gZbK7gPoFY2VCM/hATsBTYG8 bn8xQKFoA9y7dYp8AP6a33OlzV+/qOfJa2kXQ90yjS4yWnMY+q59j1D564z/WhZ9 gVZjrcUDSsv3ETtp7vk7s+0SoI3EvSLb/umcjHZR6oeLkNFWpQ4r9NqY0GX6+zDC 89YR2w/RbO9WMRb52OAqY3HdKmOflwEIN5mDWCAB4sIPUgCpoDzvIjC2xLdr6lUW ydoisH2B87luG0p7zKP7iGE1KIp8wm+zO+XciOrcjyie5YNMHyCFRbzGAsd47CLB UjNrrlAS0tU65Nb+JzsrZOYye2cghE+KC8fKHvpwjk54lsaPHl0= =OTtx -----END PGP SIGNATURE----- Merge tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Missing initialization of cpu and jiffies32 fields in conncount, from Kohei Enju. 2) Skip several tests in case kernel is tainted, otherwise tests bogusly report failure too as they also check for tainted kernel, from Florian Westphal. 3) Fix a hyphothetical integer overflow in do_ip_vs_get_ctl() leading to bogus error logs, from Dan Carpenter. 4) Fix incorrect offset in ipv4 option match in nft_exthdr, from Alexey Kashavkin. netfilter pull request 25-03-13 * tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_exthdr: fix offset with ipv4_find_option() ipvs: prevent integer overflow in do_ip_vs_get_ctl() selftests: netfilter: skip br_netfilter queue tests if kernel is tainted netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree() ==================== Link: https://patch.msgid.link/20250313095636.2186-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 15:07:39 +01:00
Thomas Gleixner	8e63360d86	selftests/timers/posix-timers: Add a test for exact allocation mode The exact timer ID allocation mode is used by CRIU to restore timers with a given ID. Add a test case for it. It's skipped on older kernels when the prctl() fails. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/8734fl2tkx.ffs@tglx	2025-03-13 12:07:18 +01:00
Guillaume Nault	6f50175cca	selftests: Add IPv6 link-local address generation tests for GRE devices. GRE devices have their special code for IPv6 link-local address generation that has been the source of several regressions in the past. Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an IPv6 link-link local address in accordance with the net.ipv6.conf.<dev>.addr_gen_mode sysctl. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/2d6772af8e1da9016b2180ec3f8d9ee99f470c77.1741375285.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 10:17:42 +01:00
Sebastian Ott	edfd826b8b	KVM: arm64: selftests: Test that TGRAN*_2 fields are writable Userspace can write to these fields for non-NV guests; add test that do just that. Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/kvmarm/20250306184013.30008-1-sebott@redhat.com/ Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-12 13:40:53 -07:00
Jakub Kicinski	188107b2c4	selftests: net: bump GRO timeout for gro/setup_veth Commit `51bef03e1a` ("selftests/net: deflake GRO tests") recently switched to NAPI suspension, and lowered the timeout from 1ms to 100us. This started causing flakes in netdev-run CI. Let's bump it to 200us. In a quick test of a debug kernel I see failures with 100us, with 200us in 5 runs I see 2 completely clean runs and 3 with a single retry (GRO test will retry up to 5 times). Reviewed-by: Kevin Krakauer <krakauer@google.com> Link: https://patch.msgid.link/20250310110821.385621-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-12 13:20:04 -07:00
Cong Wang	bb7737de5f	selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT Integrate the reproduer from Mingi to TDC. All test results: 1..4 ok 1 0385 - Create DRR with default setting ok 2 2375 - Delete DRR with handle ok 3 3092 - Show DRR class ok 4 4009 - Reject creation of DRR class with classid TC_H_ROOT Cc: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306232355.93864-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-12 12:52:14 -07:00
Florian Westphal	c21b02fd9c	selftests: netfilter: skip br_netfilter queue tests if kernel is tainted These scripts fail if the kernel is tainted which leads to wrong test failure reports in CI environments when an unrelated test triggers some splat. Check taint state at start of script and SKIP if its already dodgy. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2025-03-12 15:28:34 +01:00
Michael Jeanson	e6644c967d	rseq/selftests: Ensure the rseq ABI TLS is actually 1024 bytes Adding the aligned(1024) attribute to the definition of __rseq_abi did not increase its size to 1024, for this attribute to impact the size of __rseq_abi it would need to be added to the declaration of 'struct rseq_abi'. We only want to increase the size of the TLS allocation to ensure registration will succeed with future extended ABI. Use a union with a dummy member to ensure we allocate 1024 bytes. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/20250311192222.323453-1-mjeanson@efficios.com	2025-03-12 13:19:47 +01:00
Madhavan Srinivasan	73276cee1a	selftest/powerpc/mm/pkey: fix build-break introduced by commit `00894c3fc9` Build break was reported in the powerpc mailing list for next-20250218 with below errors make[1]: Nothing to be done for 'all'. BUILD_TARGET=/root/venkat/linux-next/tools/testing/selftests/powerpc/mm; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C mm all CC pkey_exec_prot In file included from pkey_exec_prot.c:18: /root/venkat/linux-next/tools/testing/selftests/powerpc/include/pkeys.h: In function ‘pkeys_unsupported’: /root/venkat/linux-next/tools/testing/selftests/powerpc/include/pkeys.h:96:34: error: ‘PKEY_UNRESTRICTED’ undeclared (first use in this function) 96 \| pkey = sys_pkey_alloc(0, PKEY_UNRESTRICTED); \| ^~~~~~~~~~~~~~~~~ https://lore.kernel.org/all/20250113170619.484698-2-yury.khrustalev@arm.com/ patchset has been queued to arm64/for-next/pkey_unrestricted which is causing a build break in the selftest/powerpc builds. Commit `6d61527d93` ("mm/pkey: Add PKEY_UNRESTRICTED macro") added a macro PKEY_UNRESTRICTED to handle implicit literal value of 0x0 (which is "unrestricted"). Add the same to selftest/powerpc/pkeys.h to fix the reported build break. Fixes: `00894c3fc9` ("selftests/powerpc: Use PKEY_UNRESTRICTED macro") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/lkml/3267ea6e-5a1a-4752-96ef-8351c912d386@linux.ibm.com/T/ Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://lore.kernel.org/r/20250311084129.39308-1-maddy@linux.ibm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-11 15:16:10 +00:00
Hangbin Liu	9318dc2357	selftests: bonding: fix incorrect mac address The correct mac address for NS target 2001:db8::254 is 33:33:ff:00:02:54, not 33:33:00:00:02:54. The same with client maddress. Fixes: `86fb6173d1` ("selftests: bonding: add ns multicast group testing") Acked-by: Jay Vosburgh <jv@jvosburgh.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306023923.38777-3-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-11 13:19:27 +01:00
Miklos Szeredi	e1c24b52ad	selftests: add tests for mount notification Provide coverage for all mnt_notify_add() instances. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250307204046.322691-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-11 12:04:02 +01:00
Ming Lei	390174c91d	selftests: ublk: improve test usability Add UBLK_TEST_QUIET, so we can print test result(PASS/SKIP/FAIL) only. Also always run from test script's current directory, then the same test script can be started from other work directory. This way helps a lot to reuse this test source code and scripts for other projects(liburing, blktests, ...) Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-12-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	af83ccc7db	selftests: ublk: add stress test for covering IO vs. killing ublk server Add stress_test_01 for running IO vs. killing ublk server, so io_uring exit & cancel code path can be covered, same with ublk's cancel code path. Especially IO buffer lifetime is one big thing for ublk zero copy, the added test can verify if this area works as expected. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-11-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c60ac48eab	selftests: ublk: add one stress test for covering IO vs. removing device Add stress_test_01 for running IO vs. removing device for verifying that ublk device removal can work as expected when heavy IO workloads are in progress. null, loop and loop/zc are covered in this tests. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-10-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	87a9265213	selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests Load ublk_drv module in _prep_test(), and unload it in _cleanup_test(), so that test can always be done in consistent state. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c2cb669a86	selftests: ublk: move zero copy feature check into _add_ublk_dev() Move zero copy feature check into _add_ublk_dev() since we will have more tests which requires to cover zero copy. Then one check function of _check_add_dev() has to be added for dealing with cleanup since '_add_ublk_dev()' is run in sub-shell, and we can't exit from it to terminal shell. Meantime always return error code from _add_ublk_dev(). Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-8-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c83b089a70	selftests: ublk: don't pass ${dev_id} to _cleanup_test() More devices can be created in single tests, so simply remove all ublk devices in _cleanup_test(), meantime remove the ${dev_id} argument of _cleanup_test(). Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	632051ffbd	selftests: ublk: support shellcheck and fix all warning Add shellcheck, meantime fixes all warnings. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	5b2db7a8c7	selftests: ublk: fix parsing '-a' argument The argument of '-a' doesn't follow any value, so fix it by putting it with '-z' together. Fixes: `bedc9cbc5f` ("selftests: ublk: add ublk zero copy test") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:18 -06:00
Taehee Yoo	75cc19c8ff	selftests: drv-net: add xdp cases for ping.py ping.py has 3 cases, test_v4, test_v6 and test_tcp. But these cases are not executed on the XDP environment. So, it adds XDP environment, existing tests(test_v4, test_v6, and test_tcp) are executed too on the below XDP environment. So, it adds XDP cases. 1. xdp-generic + single-buffer 2. xdp-generic + multi-buffer 3. xdp-native + single-buffer 4. xdp-native + multi-buffer 5. xdp-offload It also makes test_{v4 \| v6 \| tcp} sending large size packets. this may help to check whether multi-buffer is working or not. Note that the physical interface may be down and then up when xdp is attached or detached. This takes some period to activate traffic. So sleep(10) is added if the test interface is the physical interface. netdevsim and veth type interfaces skip sleep. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250309134219.91670-9-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-10 13:31:12 -07:00
Willem de Bruijn	0922cb68ed	selftests/net: expand cmsg_ip with MSG_MORE UDP send with MSG_MORE takes a slightly different path than the lockless fast path. For completeness, add coverage to this case too. Pass MSG_MORE on the initial sendmsg, then follow up with a zero byte write to unplug the cork. Unrelated: also add two missing endlines in usage(). Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250307033620.411611-4-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-10 13:13:04 -07:00
Greg Kroah-Hartman	993a47bd7b	Linux 6.14-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfOKBUeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1aQH/iC+Oyij4VxAjBek BOXIT/p6CwlIXb8ObiWWcRjDPizlcxb3RaV8J2RO+IqaQ2wltxpFANq2G7Re2FPm SNcEpIURAOVcxHGedcfFA91srO5F4FzNTO8LVp7MIbcgMYy3pdk+dbZmi6A691R+ t9pb74m+MAnF1o/MUx7pUlhAT/4ymuuR0F7WCSg4h0Xwe5m0nlJY89kJBC7PCjyd n3mdhsz3rDSLmt/z/T7HGD89r8sYSvm9cOKtL3ELgGTrm7boQV8ii9Y9w04DI8PQ JmIernugcCxmhH36mVUAHgJf2+/T388xFUh/D5+skeUOUZpaJZG866rnb32WpsHc eWLFUeg= =Wypt -----END PGP SIGNATURE----- Merge 6.14-rc6 into driver-core-next We need the driver core fix in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-03-10 17:37:25 +01:00
Ming Lei	2ecdcdfee5	selftests: ublk: add --foreground command line Add --foreground command for helping to debug. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Ming Lei	9d80f48c5e	selftests: ublk: fix build failure Fixes the following build failure: ublk//file_backed.c: In function ‘backing_file_tgt_init’: ublk//file_backed.c:28:42: error: ‘O_DIRECT’ undeclared (first use in this function); did you mean ‘O_DIRECTORY’? 28 \| fd = open(file, O_RDWR \| O_DIRECT); \| ^~~~~~~~ \| O_DIRECTORY when trying to reuse this same utility for liburing test. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Ming Lei	9894e0eaae	selftests: ublk: make ublk_stop_io_daemon() more reliable Improve ublk_stop_io_daemon() in the following ways: - don't wait if ->ublksrv_pid becomes -1, which means that the disk has been stopped - don't wait if ublk char device doesn't exist any more, so we can avoid to rely on inoitfy for wait until the char device is closed And this way may reduce time of delete command a lot. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Linus Torvalds	a382b06d29	KVM/arm64 fixes for 6.14, take #4 * Fix a couple of bugs affecting pKVM's PSCI relay implementation when running in the hVHE mode, resulting in the host being entered with the MMU in an unknown state, and EL2 being in the wrong mode. x86: * Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. * Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. * Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. * Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. * Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. * Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. * Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfNSeUUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNKngf/cLgQAT9AF4nFqcwh5b5uucKHVJ8W uTiGlWqLAf2UN53L63eZ/7vKQWGQYkOTFvormR14Jam6IYtytsZw1xLBH4fGtUyB qVjk0EPzaKGqn3LrgyneQNCXdyxJv7EBVBgoOKH0pvOksoW2E5ZizhhtRFtL7nCE Yk8FQKpP0mIBk04RMsvzJVEFKIb4OZgJadWo0gryg1oF2aAv7mxQjyqUWsBDsb3q 99c0ElSBfV39FeT8xeok4k7S5jbBWii2KiaH72ZsNiBu0rYmEuLwIoygCNNWL9Wu FPdQ+r//YrzfCJSXwGPfdUaRaF4p2642S6oiXQuusNNUmhK6/MRo3mZo8A== =XQHm -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "arm64: - Fix a couple of bugs affecting pKVM's PSCI relay implementation when running in the hVHE mode, resulting in the host being entered with the MMU in an unknown state, and EL2 being in the wrong mode x86: - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM KVM: selftests: Fix printf() format goof in SEV smoke test KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3 KVM: SVM: Save host DR masks on CPUs with DebugSwap KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() KVM: arm64: Initialize HCR_EL2.E2H early KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled KVM: x86: Snapshot the host's DEBUGCTL in common x86 KVM: SVM: Suppress DEBUGCTL.BTF on AMD KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value KVM: selftests: Assert that STI blocking isn't set after event injection KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow	2025-03-09 09:04:08 -10:00
Paolo Bonzini	ea9bd29a9c	KVM x86 fixes for 6.14-rcN #2 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmfLlhUACgkQOlYIJqCj N/0x7w/+MhqJdHbshL7Gzw+rcXwCROiCkqsxFP+YoTXte8uaHS5CEfcMYjE8SuGp KBpgLo4Lj1dVTXiCjemlY5sn6CDiuSs74X8A88ksuu5hVsFByJUgyWU9iw8J/crZ B2vj8huhqa8OCEPe5JujWfnfyAkKE5tUA4GFi73vhHMcftTNj+ftxT33/Pfg7y7M xOvWFWS6ZshKrouRKzI7ZFEYLwp0lr4U3dzO5rCRAd5J4MSBWRx6Dx2um5dyEYKJ xgwl4ylM4S/+78u1+0nQnToM0UWHJ3e7x8nze6UXYTZIrBr/lSeKlbhOPnEWJcJB Eemnur9ORI2BRPUReqBKluCZsSK+E5B/HPCVt5cxtuRIuUOD+kW17LPgnPyE4Sso eVt+XAvQc7EjrpWDSHr3ZQZZM89l9zHhuSAQ0npO6y71s0FzEVZQoDamNmOLAPjH Qg+qhBV2l6pyfqhqiLzADasYLOl57cJsfiMjM331ALLqAn57jzd+B8c4hdB2Xg4s KPuy8w8uBaY9zpd9YDBLLr7JJVs35KexNZMjT2vqBYXcScyLgmAuSQXy3hub6Mzn gI5ZXIKG8eO9v2jejfClI6/OEdtEwgSGEVwuBKB16pMrIxqpguMTMTWLVRn5G+oo qA8anmKaac62GaB66JE/Wjy069OPIGYnHSU2nal0Tej6kG0xv6E= =as6u -----END PGP SIGNATURE----- Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux into HEAD KVM x86 fixes for 6.14-rcN #2 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM.	2025-03-09 03:44:06 -04:00
Linus Torvalds	1110ce6a1e	33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ8zgnAAKCRDdBJ7gKXxA jmTzAP9LsTIpkPkRXDpwxPR/Si5KPwOkE6sGj4ETEqbX3vUvcAEA/Lp7oafc7Vqr XxlC1VFush1ZK29Tecxzvnapl2/VSAs= =zBvY -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits) mm/page_alloc: fix uninitialized variable rapidio: add check for rio_add_net() in rio_scan_alloc_net() rapidio: fix an API misues when rio_add_net() fails MAINTAINERS: .mailmap: update Sumit Garg's email address Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone" mm: fix finish_fault() handling for large folios mm: don't skip arch_sync_kernel_mappings() in error paths mm: shmem: remove unnecessary warning in shmem_writepage() userfaultfd: fix PTE unmapping stack-allocated PTE copies userfaultfd: do not block on locking a large folio with raised refcount mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages mm: shmem: fix potential data corruption during shmem swapin mm: fix kernel BUG when userfaultfd_move encounters swapcache selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms selftests/damon/damos_quota: make real expectation of quota exceeds include/linux/log2.h: mark is_power_of_2() with __always_inline NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback mm, swap: avoid BUG_ON in relocate_cluster() mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() ...	2025-03-08 14:34:06 -10:00
Kunihiko Hayashi	a28d2f2398	selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test Add GET_IRQTYPE API checks to each interrupt test. While at it, change pci_ep_ioctl() to get the appropriate return value from ioctl(). Suggested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20250225110252.28866-2-hayashi.kunihiko@socionext.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>	2025-03-08 14:36:00 +00:00
Niklas Cassel	af1451b673	selftests: pci_endpoint: Skip disabled BARs Currently BARs that have been disabled by the endpoint controller driver will result in a test FAIL. Returning FAIL for a BAR that is disabled seems overly pessimistic. There are EPC that disables one or more BARs intentionally. One reason for this is that there are certain EPCs that are hardwired to expose internal PCIe controller registers over a certain BAR, so the EPC driver disables such a BAR, such that the host will not overwrite random registers during testing. Such a BAR will be disabled by the EPC driver's init function, and the BAR will be marked as BAR_RESERVED, such that it will be unavailable to endpoint function drivers. Let's return FAIL only for BARs that are actually enabled and failed the test, and let's return skip for BARs that are not even enabled. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250123120147.3603409-4-cassel@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>	2025-03-08 14:35:57 +00:00
Jakub Kicinski	93b1e05517	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ8qHFwAKCRAraS+Z+3y5 E6QRAQCI/irYxpT6nBgT3ejyO6CIHjbHjdNP5KRkE8JT61zmJgD8Diet4dwwOxLK c1Pq5Jnl9KLpEPG99TcjtH3c++N3fww= =ltml -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-06 We've added 6 non-merge commits during the last 13 day(s) which contain a total of 6 files changed, 230 insertions(+), 56 deletions(-). The main changes are: 1) Add XDP metadata support for tun driver, from Marcus Wichelmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Fix file descriptor assertion in open_tuntap helper selftests/bpf: Add test for XDP metadata support in tun driver selftests/bpf: Refactor xdp_context_functional test and bpf program selftests/bpf: Move open_tuntap to network helpers net: tun: Enable transfer of XDP metadata to skb net: tun: Enable XDP metadata support ==================== Link: https://patch.msgid.link/20250307055335.441298-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-07 19:08:49 -08:00
Willem de Bruijn	7ae495a537	selftests/net: add proc_net_pktgen to .gitignore Ensure git doesn't pick up this new target. Fixes: `03544faad7` ("selftest: net: add proc_net_pktgen") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250307031356.368350-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-07 19:08:14 -08:00
Linus Torvalds	2a520073e7	s390 updates for 6.14-rc6 - Fix return address recovery of traced function in ftrace to ensure reliable stack unwinding - Fix compiler warnings and runtime crashes of vDSO selftests on s390 by introducing a dedicated GNU hash bucket pointer with correct 32-bit entry size - Fix test_monitor_call() inline asm, which misses CC clobber, by switching to an instruction that doesn't modify CC -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmfLedEACgkQjYWKoQLX FBhXMQf/bIaJEOV/Ww5PWY+AqtHPpQjhP2cvOfSRoh2py38Gp1sy/6ddpn8lcl3R skyeW4XHLpARYGg6NNOAOgFr43c7wQYrr+7uSW0zska+hOlSOi5TtycEOgu7o+VE Nig5XRuBPc01CBokFc+gbvMTjp2fTVlzTQ2/F/B7py62j2c21Y2/n7yZ4hhfQolM oiWlRDXzCRHM/sfa3FlML6sSgfIcHhhMSZLpdmRDhir9vBm1VpeGHa+FM5V8Pzsk KsiuoH7+ylbw+swVn2V/p3fqfS7JoopYdng8h40eLRkeHaTi052+NmMwv8gYAnCk BxfC1re5KlZ6kgWh0CDLUoYQHdwHxw== =KCjH -----END PGP SIGNATURE----- Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix return address recovery of traced function in ftrace to ensure reliable stack unwinding - Fix compiler warnings and runtime crashes of vDSO selftests on s390 by introducing a dedicated GNU hash bucket pointer with correct 32-bit entry size - Fix test_monitor_call() inline asm, which misses CC clobber, by switching to an instruction that doesn't modify CC * tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ftrace: Fix return address recovery of traced function selftests/vDSO: Fix GNU hash table entry size for s390x s390/traps: Fix test_monitor_call() inline assembly	2025-03-07 16:21:02 -10:00
Thomas Weißschuh	a782d45c86	selftests/nolibc: stop testing constructor order The execution order of constructors in undefined and depends on the toolchain. While recent toolchains seems to have a stable order, it doesn't work for older ones and may also change at any time. Stop validating the order and instead only validate that all constructors are executed. Reported-by: Willy Tarreau <w@1wt.eu> Closes: https://lore.kernel.org/lkml/20250301110735.GA18621@1wt.eu/ Link: https://lore.kernel.org/r/20250306-nolibc-constructor-order-v1-1-68fd161cc5ec@weissschuh.net Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-07 07:34:12 +01:00
Jakub Kicinski	1cc3462159	selftests: openvswitch: don't hardcode the drop reason subsys WiFi removed one of their subsys entries from drop reasons, in commit `286e696770` ("wifi: mac80211: Drop cooked monitor support") SKB_DROP_REASON_SUBSYS_OPENVSWITCH is now 2 not 3. The drop reasons are not uAPI, read the correct value from debug info. We need to enable vmlinux BTF, otherwise pahole needs a few GB of memory to decode the enum name. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20250304180615.945945-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 16:45:43 -08:00
Jakub Kicinski	56a586961b	selftests: net: bpf_offload: add 'libbpf_global' to ignored maps After installing pahole on the CI image we have a new map created by libbpf. Ignore it otherwise we see: Exception: Time out waiting for map counts to stabilize want 2, have 3 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250304233204.1139251-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 15:41:37 -08:00
Jakub Kicinski	f9d2f5ddd4	selftests: net: fix error message in bpf_offload We hit a following exception on timeout, nmaps is never set: Test bpftool bound info reporting (own ns)... Traceback (most recent call last): File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 1128, in <module> check_dev_info(False, "") File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 583, in check_dev_info maps = bpftool_map_list_wait(expected=2, ns=ns) File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 215, in bpftool_map_list_wait raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps)) NameError: name 'nmaps' is not defined Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250304233204.1139251-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 15:41:37 -08:00
Louis Taylor	6e406202a4	selftests/nolibc: use O_RDONLY flag instead of 0 This doesn't matter much, but is what the standard says. Signed-off-by: Louis Taylor <louis@kragniz.eu> Link: https://lore.kernel.org/r/20250306184147.208723-5-louis@kragniz.eu Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-06 22:30:21 +01:00
Louis Taylor	b2edaad7f5	tools/nolibc: add support for openat(2) openat is useful to avoid needing to construct relative paths, so expose a wrapper for using it directly. Signed-off-by: Louis Taylor <louis@kragniz.eu> Link: https://lore.kernel.org/r/20250306184147.208723-1-louis@kragniz.eu Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-06 22:30:20 +01:00
Ingo Molnar	82354fce16	Merge branch 'sched/urgent' into sched/core, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-06 22:26:36 +01:00
Jakub Kicinski	2525e16a2b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c `2bcf4772e4` ("net: ethtool: try to protect all callback with netdev instance lock") `637399bf7e` ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 13:03:35 -08:00
Marcus Wichelmann	49306d5bfc	selftests/bpf: Fix file descriptor assertion in open_tuntap helper The open_tuntap helper function uses open() to get a file descriptor for /dev/net/tun. The open(2) manpage writes this about its return value: On success, open(), openat(), and creat() return the new file descriptor (a nonnegative integer). On error, -1 is returned and errno is set to indicate the error. This means that the fd > 0 assertion in the open_tuntap helper is incorrect and should rather check for fd >= 0. When running the BPF selftests locally, this incorrect assertion was not an issue, but the BPF kernel-patches CI failed because of this: open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun): actual 0 <= expected 0 Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250305213438.3863922-7-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	73eeecc3cd	selftests/bpf: Add test for XDP metadata support in tun driver Add a selftest that creates a tap device, attaches XDP and TC programs, writes a packet with a test payload into the tap device and checks the test result. This test ensures that the XDP metadata support in the tun driver is enabled and that the metadata size is correctly passed to the skb. See the previous commit ("selftests/bpf: refactor xdp_context_functional test and bpf program") for details about the test design. The test runs in its own network namespace. This provides some extra safety against conflicting interface names. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-6-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	b46aa22b66	selftests/bpf: Refactor xdp_context_functional test and bpf program The existing XDP metadata test works by creating a veth pair and attaching XDP & TC programs that drop the packet when the condition of the test isn't fulfilled. The test then pings through the veth pair and succeeds when the ping comes through. While this test works great for a veth pair, it is hard to replicate for tap devices to test the XDP metadata support of them. A similar test for the tun driver would either involve logic to reply to the ping request, or would have to capture the packet to check if it was dropped or not. To make the testing of other drivers easier while still maximizing code reuse, this commit refactors the existing xdp_context_functional test to use a test_result map. Instead of conditionally passing or dropping the packet, the TC program is changed to copy the received metadata into the value of that single-entry array map. Tests can then verify that the map value matches the expectation. This testing logic is easy to adapt to other network drivers as the only remaining requirement is that there is some way to send a custom Ethernet packet through it that triggers the XDP & TC programs. The Ethernet header of that custom packet is all-zero, because it is not required to be valid for the test to work. The zero ethertype also helps to filter out packets that are not related to the test and would otherwise interfere with it. The payload of the Ethernet packet is used as the test data that is expected to be passed as metadata from the XDP to the TC program and written to the map. It has a fixed size of 32 bytes which is a reasonable size that should be supported by both drivers. Additional packet headers are not necessary for the test and were therefore skipped to keep the testing code short. This new testing methodology no longer requires the veth interfaces to have IP addresses assigned, therefore these were removed. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250305213438.3863922-5-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	d5ca409c86	selftests/bpf: Move open_tuntap to network helpers To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-4-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
SeongJae Park	582ccf78f6	selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries damon_nr_regions.py starts DAMON, periodically collect number of regions in snapshots, and see if it is in the requested range. The check code assumes the numbers are sorted on the collection list, but there is no such guarantee. Hence this can result in false positive test success. Sort the list before doing the check. Link: https://lkml.kernel.org/r/20250225222333.505646-4-sj@kernel.org Fixes: `781497347d` ("selftests/damon: implement test for min/max_nr_regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	695469c07a	selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms damon_nr_regions.py updates max_nr_regions to a number smaller than expected number of real regions and confirms DAMON respect the harsh limit. To give time for DAMON to make changes for the regions, 3 aggregation intervals (300 milliseconds) are given. The internal mechanism works with not only the max_nr_regions, but also sz_limit, though. It avoids merging region if that casn make region of size larger than sz_limit. In the test, sz_limit is set too small to achive the new max_nr_regions, unless it is updated for the new min_nr_regions. But the update is done only once per operations set update interval, which is one second by default. Hence, the test randomly incurs false positive failures. Fix it by setting the ops interval same to aggregation interval, to make sure sz_limit is updated by the time of the check. Link: https://lkml.kernel.org/r/20250225222333.505646-3-sj@kernel.org Fixes: `8bf890c816` ("selftests/damon/damon_nr_regions: test online-tuned max_nr_regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	1c684d77df	selftests/damon/damos_quota: make real expectation of quota exceeds Patch series "selftests/damon: three fixes for false results". Fix three DAMON selftest bugs that cause two and one false positive failures and successes. This patch (of 3): damos_quota.py assumes the quota will always exceeded. But whether quota will be exceeded or not depend on the monitoring results. Actually the monitored workload has chaning access pattern and hence sometimes the quota may not really be exceeded. As a result, false positive test failures happen. Expect how much time the quota will be exceeded by checking the monitoring results, and use it instead of the naive assumption. Link: https://lkml.kernel.org/r/20250225222333.505646-1-sj@kernel.org Link: https://lkml.kernel.org/r/20250225222333.505646-2-sj@kernel.org Fixes: `51f58c9da1` ("selftests/damon: add a test for DAMOS quota") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	349db086a6	selftests/damon/damos_quota_goal: handle minimum quota that cannot be further reduced damos_quota_goal.py selftest see if DAMOS quota goals tuning feature increases or reduces the effective size quota for given score as expected. The tuning feature sets the minimum quota size as one byte, so if the effective size quota is already one, we cannot expect it further be reduced. However the test is not aware of the edge case, and fails since it shown no expected change of the effective quota. Handle the case by updating the failure logic for no change to see if it was the case, and simply skips to next test input. Link: https://lkml.kernel.org/r/20250217182304.45215-1-sj@kernel.org Fixes: `f1c07c0a16` ("selftests/damon: add a test for DAMOS quota goal") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202502171423.b28a918d-lkp@intel.com Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> [6.10.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:12 -08:00
John Hubbard	0a7565ee6e	Revert "selftests/mm: remove local __NR_* definitions" This reverts commit `a5c6bc5900`. The general approach described in commit `e076eaca59` ("selftests: break the dependency upon local header files") was taken one step too far here: it should not have been extended to include the syscall numbers. This is because doing so would require per-arch support in tools/include/uapi, and no such support exists. This revert fixes two separate reports of test failures, from Dave Hansen[1], and Li Wang[2]. An excerpt of Dave's report: Before this commit (`a5c6bc5900`) things are fine. But after, I get: running PKEY tests for unsupported CPU/OS An excerpt of Li's report: I just found that mlock2_() return a wrong value in mlock2-test [1] https://lore.kernel.org/dc585017-6740-4cab-a536-b12b37a7582d@intel.com [2] https://lore.kernel.org/CAEemH2eW=UMu9+turT2jRie7+6ewUazXmA6kL+VBo3cGDGU6RA@mail.gmail.com Link: https://lkml.kernel.org/r/20250214033850.235171-1-jhubbard@nvidia.com Fixes: `a5c6bc5900` ("selftests/mm: remove local __NR_* definitions") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Li Wang <liwang@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Andrei Vagin <avagin@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:12 -08:00
Atish Patra	ee4e778c58	KVM: riscv: selftests: Allow number of interrupts to be configurable It is helpful to vary the number of the LCOFI interrupts generated by the overflow test. Allow additional argument for overflow test to accommodate that. It can be easily cross-validated with /proc/interrupts output in the host. Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-4-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Atish Patra	4b506adfea	KVM: riscv: selftests: Change command line option The PMU test commandline option takes an argument to disable a certain test. The initial assumption behind this was a common use case is just to run all the test most of the time. However, running a single test seems more useful instead. Especially, the overflow test has been helpful to validate PMU virtualizaiton interrupt changes. Switching the command line option to run a single test instead of disabling a single test also allows to provide additional test specific arguments to the test. The default without any options remains unchanged which continues to run all the tests. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-3-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Atish Patra	1f6bbe1255	KVM: riscv: selftests: Do not start the counter in the overflow handler There is no need to start the counter in the overflow handler as we intend to trigger precise number of LCOFI interrupts through these tests. The overflow irq handler has already stopped the counter. As a result, the stop call from the test function may return already stopped error which is fine as well. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-2-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Catalin Marinas	306219d59b	kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings While the kselftest was added at the same time with the kernel support for MTE on hugetlb mappings, the tests may be run on older kernels. Skip the tests if PROT_MTE is not supported on MAP_HUGETLB mappings. Fixes: `27879e8cb6` ("selftests: arm64: add hugetlb mte tests") Cc: Yang Shi <yang@os.amperecomputing.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20250221093331.2184245-3-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-05 18:54:02 +00:00
Catalin Marinas	7ae95109c6	kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c The architecture doesn't define precise/imprecise MTE tag check modes, only synchronous and asynchronous. Use the correct naming and also ensure they match the MTE_{ASYNC,SYNC}_ERR type. Fixes: `27879e8cb6` ("selftests: arm64: add hugetlb mte tests") Cc: Yang Shi <yang@os.amperecomputing.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20250221093331.2184245-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-05 18:54:02 +00:00
Wojtek Wasko	76868642e4	testptp: Add option to open PHC in readonly mode PTP Hardware Clocks no longer require WRITE permission to perform readonly operations, such as listing device capabilities or listening to EXTTS events once they have been enabled by a process with WRITE permissions. Add '-r' option to testptp to open the PHC in readonly mode instead of the default read-write mode. Skip enabling EXTTS if readonly mode is requested. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-03-05 12:43:54 +00:00
Christian Brauner	56f235da15	selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-16-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	1c4f2dbe42	selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-15-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	2adf6ca638	selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-14-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	2e94e4c649	selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-13-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	a79975f05e	selftests/pidfd: add third PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-12-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	86c1dfdd52	selftests/pidfd: add second PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-11-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	853ab1ff2c	selftests/pidfd: add first PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-10-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	ddf5315526	selftests/pidfd: expand common pidfd header Move more infrastructure to the pidfd header. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-9-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Christian Brauner	18938f7180	pidfs/selftests: ensure correct headers for ioctl handling Ensure that necessary ioctl infrastructure is available. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-8-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Christian Brauner	17457a47d2	selftests/pidfd: fix header inclusion Ensure that necessary defines are present. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-7-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Boqun Feng	467c890f2d	Merge branches 'docs.2025.02.04a', 'lazypreempt.2025.03.04a', 'misc.2025.03.04a', 'srcu.2025.02.05a' and 'torture.2025.02.05a'	2025-03-04 18:47:32 -08:00
Paul E. McKenney	8d608f0801	rcutorture: Make scenario TREE07 build CONFIG_PREEMPT_LAZY=y This commit tests lazy preemption by causing the TREE07 rcutorture scenario to build its kernel with CONFIG_PREEMPT_LAZY=y. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:46:47 -08:00
Paul E. McKenney	118559a994	rcutorture: Make scenario TREE10 build CONFIG_PREEMPT_LAZY=y This commit tests lazy preemption by causing the TREE10 rcutorture scenario to build its kernel with CONFIG_PREEMPT_LAZY=y. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:46:47 -08:00
Uladzislau Rezki (Sony)	a6cea3954e	rcu: Update TREE05.boot to test normal synchronize_rcu() Add extra parameters for rcutorture module. One is the "nfakewriters" which is set -1. There will be created number of test-kthreads which correspond to number of CPUs in a test system. Those threads randomly invoke synchronize_rcu() call. Apart of that "rcu_normal" is set to 1, because it is specifically for a normal synchronize_rcu() testing, also a newly added parameter which is "rcu_normal_wake_from_gp" is set to 1 also. That prevents interaction with other callbacks in a system. Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Link: https://lore.kernel.org/r/20250227131613.52683-2-urezki@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:44:29 -08:00
Jakub Kicinski	ea4342739d	selftests: drv-net: use env.rpath in the HDS test Commit `29b036be1b` ("selftests: drv-net: test XDP, HDS auto and the ioctl path") added a new test case in the net tree, now that this code has made its way to net-next convert it to use the env.rpath() helper instead of manually computing the relative path. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250228212956.25399-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 17:14:23 -08:00
Gang Yan	ba24001665	selftests: mptcp: add a test for mptcp_diag_dump_one This patch introduces a new 'chk_diag' test in diag.sh. It retrieves the token for a specified MPTCP socket (msk) using the 'ss' command and then accesses the 'mptcp_diag_dump_one' in kernel via ./mptcp_diag to verify if the correct token is returned. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/524 Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-2-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 16:57:38 -08:00
Gang Yan	00f5e338cf	selftests: mptcp: Add a tool to get specific msk_info This patch enables the retrieval of the mptcp_info structure corresponding to a specified MPTCP socket (msk). When multiple MPTCP connections are present, specific information can be obtained for a given connection through the 'mptcp_diag_dump_one' by using the 'token' associated with the msk. Signed-off-by: Gang Yan <yangang@kylinos.cn> Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-1-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 16:57:37 -08:00
Yi Lai	df6f8c4d72	selftests/pcie_bwctrl: Add 'set_pcie_speed.sh' to TEST_PROGS The test shell script "set_pcie_speed.sh" is not installed in INSTALL_PATH. Attempting to execute set_pcie_cooling_state.sh shows warning: ./set_pcie_cooling_state.sh: line 119: ./set_pcie_speed.sh: No such file or directory Add "set_pcie_speed.sh" to TEST_PROGS. Link: https://lore.kernel.org/r/Z8FfK8rN30lKzvVV@ly-workstation Fixes: `838f12c3d5` ("selftests/pcie_bwctrl: Create selftests") Signed-off-by: Yi Lai <yi1.lai@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2025-03-04 16:49:19 -06:00
Thomas Weißschuh	a22ee38d2e	selftests/vDSO: Fix GNU hash table entry size for s390x Commit `14be4e6f35` ("selftests: vDSO: fix ELF hash table entry size for s390x") changed the type of the ELF hash table entries to 64bit on s390x. However the GNU hash tables entries are always 32bit. The "bucket" pointer is shared between both hash algorithms. On s390, this caused the GNU hash algorithm to access its 32-bit entries as if they were 64-bit, triggering compiler warnings (assignment between "Elf64_Xword " and "Elf64_Word ") and runtime crashes. Introduce a new dedicated "gnu_bucket" pointer which is used by the GNU hash. Fixes: `e0746bde6f` ("selftests/vDSO: support DT_GNU_HASH") Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250217-selftests-vdso-s390-gnu-hash-v2-1-f6c2532ffe2a@linutronix.de Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2025-03-04 17:15:18 +01:00
Bharadwaj Raju	82ef781f24	selftests/ftrace: add 'poll' binary to gitignore When building this test, a binary file 'poll' is generated and should be gitignore'd. Link: https://lore.kernel.org/r/20250210160138.4745-1-bharadwaj.raju777@gmail.com Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-04 08:51:17 -07:00
Breno Leitao	d7a2522426	netconsole: selftest: add task name append testing Add test coverage for the netconsole task name feature to the existing sysdata selftest script. This extends the test infrastructure to verify that task names are correctly appended when enabled and absent when disabled. The test validates that: - Task names appear in the expected format "taskname=<name>" - Task names are included when the feature is enabled - Task names are excluded when the feature is disabled - The feature works correctly alongside other sysdata fields like CPU Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 15:28:28 +01:00
Yi Liu	1062d81086	iommufd: Disallow allocating nested parent domain with fault ID Allocating a domain with a fault ID indicates that the domain is faultable. However, there is a gap for the nested parent domain to support PRI. Some hardware lacks the capability to distinguish whether PRI occurs at stage 1 or stage 2. This limitation may require software-based page table walking to resolve. Since no in-tree IOMMU driver currently supports this functionality, it is disallowed. For more details, refer to the related discussion at [1]. [1] https://lore.kernel.org/linux-iommu/bd1655c6-8b2f-4cfa-adb1-badc00d01811@intel.com/ Link: https://patch.msgid.link/r/20250226104012.82079-1-yi.l.liu@intel.com Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-04 09:34:54 -04:00
Nicolas Dichtel	0c493da863	net: rename netns_local to netns_immutable The name 'netns_local' is confusing. A following commit will export it via netlink, so let's use a more explicit name. Reported-by: Eric Dumazet <edumazet@google.com> Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 12:44:48 +01:00
Ingo Molnar	cfdaa618de	Merge branch 'x86/cpu' into x86/asm, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-04 11:19:21 +01:00
Ingo Molnar	1b4c36f9b1	Merge branch 'x86/urgent' into x86/cpu, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-04 11:15:26 +01:00
Peter Seiderer	03544faad7	selftest: net: add proc_net_pktgen Add some test for /proc/net/pktgen/... interface. - enable 'CONFIG_NET_PKTGEN=m' in tools/testing/selftests/net/config Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 10:57:58 +01:00
Christian Brauner	e5c10b19b3	selftests: test subdirectory mounting This tests mounting a subdirectory without ever having to expose the filesystem to a non-anonymous mount namespace. Link: https://lore.kernel.org/r/20250225-work-mount-propagation-v1-3-e6e3724500eb@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:55 +01:00
Christian Brauner	12e40cbb0e	selftests: add test for detached mount tree propagation Test that detached mount trees receive propagation events. Link: https://lore.kernel.org/r/20250225-work-mount-propagation-v1-2-e6e3724500eb@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:55 +01:00
Christian Brauner	7bece690a9	selftests: seventh test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-16-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:54 +01:00
Christian Brauner	0e6eef95df	selftests: sixth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-15-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:54 +01:00
Christian Brauner	d346ae7a64	selftests: fifth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-14-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	1768f9bd53	selftests: fourth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-13-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	4fd0aea1ea	selftests: third test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-12-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	89ddfd4af1	selftests: second test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-11-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	d83a55b826	selftests: first test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-10-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	3f1724dd56	selftests: create detached mounts from detached mounts Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-7-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Jakub Kicinski	d110dbf149	selftests: net: report output format as TAP 13 in Python tests The Python lib based tests report that they are producing "KTAP version 1", but really we aren't making use of any KTAP features, like subtests. Our output is plain TAP. Report TAP 13 instead of KTAP 1, this is what mptcp tests do, and what NIPA knows how to parse best. For HW testing we need precise subtest result tracking. Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228180007.83325-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-03 15:03:19 -08:00
Thomas Weißschuh	8770a9183f	selftests: vDSO: vdso_standalone_test_x86: Switch to nolibc vdso_standalone_test_x86 provides its own ASM syscall wrappers and _start() implementation. The in-tree nolibc library already provides this functionality for multiple architectures. By making use of nolibc, the standalone testcase can be built from the exact same codebase as the non-standalone version. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-16-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	4f65df6a58	selftests: vDSO: vdso_test_gettimeofday: Make compatible with nolibc nolibc does not provide sys/time.h and sys/auxv.h, instead their definitions are available unconditionally. Guard the includes so they are not attempted on nolibc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-15-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	97a8814124	selftests: vDSO: vdso_test_gettimeofday: Clean up includes Some unnecessary headers are included, remove them. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-14-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	032e871686	selftests: vDSO: parse_vdso: Test __SIZEOF_LONG__ instead of ULONG_MAX According to limits.h(2) ULONG_MAX is only guaranteed to expand to an expression, not a symbolic constant which can be evaluated by the preprocessor. Specifically the definition of ULONG_MAX from nolibc can not be evaluated by the preprocessor. To provide compatibility with nolibc, check with __SIZEOF_LONG__ instead, with is provided directly by the preprocessor and therefore always a symbolic constant. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-13-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	c9fbaa8795	selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers To allow the usage of parse_vdso.c together with a limited libc like nolibc, use the kernels own elf.h and auxvec.h headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-12-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	09dcec6470	selftests: vDSO: parse_vdso: Drop vdso_init_from_auxv() There are no users left. This also removes the usage of ElfXX_auxv_t, which is not formally standardized. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-11-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	05c204acf5	selftests: vDSO: vdso_standalone_test_x86: Use vdso_init_form_sysinfo_ehdr vdso_standalone_test_x86 is the only user of vdso_init_from_auxv(). Instead of combining the parsing the aux vector with the parsing of the vDSO, split them apart into getauxval() and the regular vdso_init_from_sysinfo_ehdr(). The implementation of getauxval() is taken from tools/include/nolibc/stdlib.h. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-10-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	1a59f5d315	selftests: Add headers target Some selftests need access to a full UAPI headers tree, for example when building with nolibc which heavily relies on UAPI headers. A reference to such a tree is available in the KHDR_INCLUDES variable, but there is currently no way to populate such a tree automatically. Provide a target that the tests can depend on to get access to usable UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-8-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Tejun Heo	8a9b1585e2	sched_ext: Merge branch 'for-6.14-fixes' into for-6.15 Pull for-6.14-fixes to receive: `9360dfe4cb` ("sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()") which conflicts with: `337d1b354a` ("sched_ext: Move built-in idle CPU selection policy to a separate file") Signed-off-by: Tejun Heo <tj@kernel.org>	2025-03-03 08:02:22 -10:00
Sean Christopherson	3b2d3db368	KVM: selftests: Fix printf() format goof in SEV smoke test Print out the index of mismatching XSAVE bytes using unsigned decimal format. Some versions of clang complain about trying to print an integer as an unsigned char. x86/sev_smoke_test.c:55:51: error: format specifies type 'unsigned char' but the argument has type 'int' [-Werror,-Wformat] Fixes: `8c53183dba` ("selftests: kvm: add test for transferring FPU state into VMSA") Link: https://lore.kernel.org/r/20250228233852.3855676-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-03-03 07:45:34 -08:00
Sean Christopherson	d88ed5fb7c	KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage During the initial mprotect(RO) stage of mmu_stress_test, keep vCPUs spinning until all vCPUs have hit -EFAULT, i.e. until all vCPUs have tried to write to a read-only page. If a vCPU manages to complete an entire iteration of the loop without hitting a read-only page, and the vCPU observes mprotect_ro_done before starting a second iteration, then the vCPU will prematurely fall through to GUEST_SYNC(3) (on x86 and arm64) and get out of sequence. Replace the "do-while (!r)" loop around the associated _vcpu_run() with a single invocation, as barring a KVM bug, the vCPU is guaranteed to hit -EFAULT, and retrying on success is super confusion, hides KVM bugs, and complicates this fix. The do-while loop was semi-unintentionally added specifically to fudge around a KVM x86 bug, and said bug is unhittable without modifying the test to force x86 down the !(x86\|\|arm64) path. On x86, if forced emulation is enabled, vcpu_arch_put_guest() may trigger emulation of the store to memory. Due a (very, very) longstanding bug in KVM x86's emulator, emulate writes to guest memory that fail during __kvm_write_guest_page() unconditionally return KVM_EXIT_MMIO. While that is desirable in the !memslot case, it's wrong in this case as the failure happens due to __copy_to_user() hitting a read-only page, not an emulated MMIO region. But as above, x86 only uses vcpu_arch_put_guest() if the __x86_64__ guards are clobbered to force x86 down the common path, and of course the unexpected MMIO is a KVM bug, i.e. should cause a test failure. Fixes: `b6c304aec6` ("KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)") Reported-by: Yan Zhao <yan.y.zhao@intel.com> Closes: https://lore.kernel.org/all/20250208105318.16861-1-yan.y.zhao@intel.com Debugged-by: Yan Zhao <yan.y.zhao@intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20250228230804.3845860-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-03-03 07:37:28 -08:00
Mirsad Todorovac	40fc756101	selftests/x86/syscall: Fix coccinelle WARNING recommending the use of ARRAY_SIZE() Coccinelle gives WARNING recommending the use of ARRAY_SIZE() macro definition to improve the code readability: ./tools/testing/selftests/x86/syscall_numbering.c:316:35-36: WARNING: Use ARRAY_SIZE Fixes: `15c82d98a0` ("selftests/x86/syscall: Update and extend syscall_numbering_64") Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20241101111523.1293193-2-mtodorovac69@gmail.com	2025-03-03 12:38:49 +01:00
Thomas Weißschuh	cb839e0cc8	selftests/nolibc: add armthumb configuration While nolibc does support ARM Thumb instructions, that support was not tested specifically. Add a new test configuration for it. Tested-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250301-nolibc-armthumb-v1-2-d1f04abb5f6d@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-02 13:25:20 +01:00
Thomas Weißschuh	f8bedb30d6	selftests/nolibc: explicitly enable ARM mode The default could also be -mthumb. Explicitly use -marm to keep everything predictable. Tested-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250301-nolibc-armthumb-v1-1-d1f04abb5f6d@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-02 13:25:20 +01:00
Linus Torvalds	5c44ddaf7d	Tracing fixes for v6.14: - Fix crash from bad histogram entry An error path in the histogram creation could leave an entry in a link list that gets freed. Then when a new entry is added it can cause a u-a-f bug. This is fixed by restructuring the code so that the histogram is consistent on failure and everything is cleaned up appropriately. - Fix fprobe self test The fprobe self test relies on no function being attached by ftrace. BPF programs can attach to functions via ftrace and systemd now does so. This causes those functions to appear in the enabled_functions list which holds all functions attached by ftrace. The selftest also uses that file to see if functions are being connected correctly. It counts the functions in the file, but if there's already functions in the file, it fails. Instead, add the number of functions in the file at the start of the test to all the calculations during the test. - Fix potential division by zero of the function profiler stddev The calculated divisor that calculates the standard deviation of the function times can overflow. If the overflow happens to land on zero, that can cause a division by zero. Check for zero from the calculation before doing the division. TODO: Catch when it ever overflows and report it accordingly. For now, just prevent the system from crashing. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ8HqYBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qpoXAP90gvO2LfjItjZVjBYudr4GOzcsjAAK cZ2vL2LJp3hT4QD+Kud2YaZqzrV8tvFFBikO7FvEV3zZpnw48895pIgcoww= =NLe0 -----END PGP SIGNATURE----- Merge tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix crash from bad histogram entry An error path in the histogram creation could leave an entry in a link list that gets freed. Then when a new entry is added it can cause a u-a-f bug. This is fixed by restructuring the code so that the histogram is consistent on failure and everything is cleaned up appropriately. - Fix fprobe self test The fprobe self test relies on no function being attached by ftrace. BPF programs can attach to functions via ftrace and systemd now does so. This causes those functions to appear in the enabled_functions list which holds all functions attached by ftrace. The selftest also uses that file to see if functions are being connected correctly. It counts the functions in the file, but if there's already functions in the file, it fails. Instead, add the number of functions in the file at the start of the test to all the calculations during the test. - Fix potential division by zero of the function profiler stddev The calculated divisor that calculates the standard deviation of the function times can overflow. If the overflow happens to land on zero, that can cause a division by zero. Check for zero from the calculation before doing the division. TODO: Catch when it ever overflows and report it accordingly. For now, just prevent the system from crashing. * tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Avoid potential division by zero in function_stat_show() selftests/ftrace: Let fprobe test consider already enabled functions tracing: Fix bad hist from corrupting named_triggers list	2025-02-28 15:43:32 -08:00
Sean Christopherson	62838fa5ea	KVM: selftests: Relax assertion on HLT exits if CPU supports Idle HLT If the CPU supports Idle HLT, which elides HLT VM-Exits if the vCPU has an unmasked pending IRQ or NMI, relax the xAPIC IPI test's assertion on the number of HLT exits to only require that the number of exits is less than or equal to the number of HLT instructions that were executed. I.e. don't fail the test if Idle HLT does what it's supposed to do. Note, unfortunately there's no way to determine if KVM supports Idle HLT, as this_cpu_has() checks raw CPU support, and kvm_cpu_has() checks what can be exposed to L1, i.e. the latter would check if KVM supports nested Idle HLT. But, since the assert is purely bonus coverage, checking for CPU support is good enough. Cc: Manali Shukla <Manali.Shukla@amd.com> Tested-by: Manali Shukla <Manali.Shukla@amd.com> Link: https://lore.kernel.org/r/20250226231809.3183093-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 15:42:28 -08:00
Sean Christopherson	f3513a335e	KVM: selftests: Assert that STI blocking isn't set after event injection Add an L1 (guest) assert to the nested exceptions test to verify that KVM doesn't put VMRUN in an STI shadow (AMD CPUs bleed the shadow into the guest's int_state if a #VMEXIT occurs before VMRUN fully completes). Add a similar assert to the VMX side as well, because why not. Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20250224165442.2338294-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 09:15:23 -08:00
Colin Ian King	75418e222e	KVM: selftests: Fix spelling mistake "UFFDIO_CONINUE" -> "UFFDIO_CONTINUE" There is a spelling mistake in a PER_PAGE_DEBUG debug message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20250227220819.656780-1-colin.i.king@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 09:12:52 -08:00
Ming Lei	bedc9cbc5f	selftests: ublk: add ublk zero copy test Enable zero copy on file backed target, meantime add one fio test for covering write verify, another test for mkfs/mount/umount. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Ming Lei	5d95bfb535	selftests: ublk: add file backed ublk Add file backed ublk target code, meantime add one fio test for covering write verify, another test for mkfs/mount/umount. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Ming Lei	6aecda00b7	selftests: ublk: add kernel selftests for ublk Both ublk driver and userspace heavily depends on io_uring subsystem, and tools/testing/selftests/ should be the best place for holding this cross-subsystem tests. Add basic read/write IO test over this ublk null disk, and make sure ublk working. More tests will be added. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Kevin Krakauer	51bef03e1a	selftests/net: deflake GRO tests GRO tests are timing dependent and can easily flake. This is partially mitigated in gro.sh by giving each subtest 3 chances to pass. However, this still flakes on some machines. Reduce the flakiness by: - Bumping retries to 6. - Setting napi_defer_hard_irqs to 1 to reduce the chance that GRO is flushed prematurely. This also lets us reduce the gro_flush_timeout from 1ms to 100us. Tested: Ran `gro.sh -t large` 1000 times. There were no failures with this change. Ran inside strace to increase flakiness. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-4-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Kevin Krakauer	41cda57284	selftests/net: only print passing message in GRO tests when tests pass gro.c:main no longer erroneously claims a test passes when running as a sender. Tested: Ran `gro.sh -t large` to verify the sender no longer prints a status. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-3-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Kevin Krakauer	784e6abd99	selftests/net: have `gro.sh -t` return a correct exit code Modify gro.sh to return a useful exit code when the -t flag is used. It formerly returned 0 no matter what. Tested: Ran `gro.sh -t large` and verified that test failures return 1. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-2-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Heiko Carstens	3908b6baf2	selftests/ftrace: Let fprobe test consider already enabled functions The fprobe test fails on Fedora 41 since the fprobe test assumption that the number of enabled_functions is zero before the test starts is not necessarily true. Some user space tools, like systemd, add BPF programs that attach to functions. Those will show up in the enabled_functions table and must be taken into account by the fprobe test. Therefore count the number of lines of enabled_functions before tests start, and use that as base when comparing expected results. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250226142703.910860-1-hca@linux.ibm.com Fixes: `e85c5e9792` ("selftests/ftrace: Update fprobe test to check enabled_functions file") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-02-27 21:02:09 -05:00
Jakub Kicinski	357660d759	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c `fa52f15c74` ("net: cadence: macb: Synchronize stats calculations") `75696dd0fd` ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c `79990cf5e7` ("ice: Fix deinitializing VF in error path") `a203163274` ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c `18912c5206` ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") `297d389e9e` ("net: prefix devmem specific helpers") net/mptcp/subflow.c `8668860b0a` ("mptcp: reset when MPTCP opts are dropped after join") `c3349a22c2` ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 10:20:58 -08:00
Linus Torvalds	1e15510b71	Including fixes from bluetooth. We didn't get netfilter or wireless PRs this week, so next week's PR is probably going to be bigger. A healthy dose of fixes for bugs introduced in the current release nonetheless. Current release - regressions: - Bluetooth: always allow SCO packets for user channel - af_unix: fix memory leak in unix_dgram_sendmsg() - rxrpc: - remove redundant peer->mtu_lock causing lockdep splats - fix spinlock flavor issues with the peer record hash - eth: iavf: fix circular lock dependency with netdev_lock - net: use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net() RDMA driver register notifier after the device Current release - new code bugs: - ethtool: fix ioctl confusing drivers about desired HDS user config - eth: ixgbe: fix media cage present detection for E610 device Previous releases - regressions: - loopback: avoid sending IP packets without an Ethernet header - mptcp: reset connection when MPTCP opts are dropped after join Previous releases - always broken: - net: better track kernel sockets lifetime - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT - eth: enetc: number of error handling fixes - dsa: rtl8366rb: reshuffle the code to fix config / build issue with LED support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfAj8MACgkQMUZtbf5S IrtoTRAAj0XNWXGWZdOuVub0xhtjsPLoZktux4AzsELqaynextkJW6w9pG5qVrWu UZt3a3bC7u6+JoTgb+GQVhyjuuVjv6NOSuLK3FS+NePW8ijhLP5oTg6eD0MQS60Z wa9yQx3yL1Kvb6b80Go/3WgRX9V6Rx8zlROAl/gOlZ9NKB0rSVqnueZGPjGZJf1a ayyXsmzRykshbr5Ic0e+b74hFP3DGxVgHjIob1C4kk/Q+WOfQKnm3C3fnZ/R2QcS 7B7kSk9WokvNwk3hJc7ZtFxJbrQKSSuRI8nCD93hBjTn76yJjlPicJ9b6HJoGhE/ Pwt7fBnDCCA00x6ejD3OrurR+/80PbPtyvNtgMMTD49wSwxQpQ6YpTMInnodCzAV NvIhkkXBprI0kiTT4dDpNoeFMKD3i07etKpvMfEoDzZR7vgUsj6aClSmuxILeU9a crFC4Vp5SgyU1/lUPDiG4dfbd8s4hfM4bZ+d0zAtth3/rQA7/EA6dLqbRXXWX7h5 Gl6egKWPsSl+WUgFjpBjYfhqrQsc06hxaCh0SQYH6SnS3i+PlMU2uRJYZMLQ66rX QsSQOyqCEHwd1qnrLedg9rCniv+DzOJf+qh+H0eY9WhuOay+8T52OHLxpRjSHxBo SCP+qQxSX0qhH5DtUiOV50Fwg19UhJJyWd0COfv5SIGm/I1dUOY= =+Ci7 -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth. We didn't get netfilter or wireless PRs this week, so next week's PR is probably going to be bigger. A healthy dose of fixes for bugs introduced in the current release nonetheless. Current release - regressions: - Bluetooth: always allow SCO packets for user channel - af_unix: fix memory leak in unix_dgram_sendmsg() - rxrpc: - remove redundant peer->mtu_lock causing lockdep splats - fix spinlock flavor issues with the peer record hash - eth: iavf: fix circular lock dependency with netdev_lock - net: use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net() RDMA driver register notifier after the device Current release - new code bugs: - ethtool: fix ioctl confusing drivers about desired HDS user config - eth: ixgbe: fix media cage present detection for E610 device Previous releases - regressions: - loopback: avoid sending IP packets without an Ethernet header - mptcp: reset connection when MPTCP opts are dropped after join Previous releases - always broken: - net: better track kernel sockets lifetime - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT - eth: enetc: number of error handling fixes - dsa: rtl8366rb: reshuffle the code to fix config / build issue with LED support" * tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits) net: ti: icss-iep: Reject perout generation request idpf: fix checksums set in idpf_rx_rsc() selftests: drv-net: Check if combined-count exists net: ipv6: fix dst ref loop on input in rpl lwt net: ipv6: fix dst ref loop on input in seg6 lwt usbnet: gl620a: fix endpoint checking in genelink_bind() net/mlx5: IRQ, Fix null string in debug print net/mlx5: Restore missing trace event when enabling vport QoS net/mlx5: Fix vport QoS cleanup on error net: mvpp2: cls: Fixed Non IP flow, with vlan tag flow defination. af_unix: Fix memory leak in unix_dgram_sendmsg() net: Handle napi_schedule() calls from non-interrupt net: Clear old fragment checksum value in napi_reuse_skb gve: unlink old napi when stopping a queue using queue API net: Use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net(). tcp: Defer ts_recent changes until req is owned net: enetc: fix the off-by-one issue in enetc_map_tx_tso_buffs() net: enetc: remove the mm_lock from the ENETC v4 driver net: enetc: add missing enetc4_link_deinit() net: enetc: update UDP checksum when updating originTimestamp field ...	2025-02-27 09:32:42 -08:00
Joe Damato	1cbddbddee	selftests: drv-net: Check if combined-count exists Some drivers, like tg3, do not set combined-count: $ ethtool -l enp4s0f1 Channel parameters for enp4s0f1: Pre-set maximums: RX: 4 TX: 4 Other: n/a Combined: n/a Current hardware settings: RX: 4 TX: 1 Other: n/a Combined: n/a In the case where combined-count is not set, the ethtool netlink code in the kernel elides the value and the code in the test: netnl.channels_get(...) With a tg3 device, the returned dictionary looks like: {'header': {'dev-index': 3, 'dev-name': 'enp4s0f1'}, 'rx-max': 4, 'rx-count': 4, 'tx-max': 4, 'tx-count': 1} Note that the key 'combined-count' is missing. As a result of this missing key the test raises an exception: # Exception\| if channels['combined-count'] == 0: # Exception\| ~~~~~~~~^^^^^^^^^^^^^^^^^^ # Exception\| KeyError: 'combined-count' Change the test to check if 'combined-count' is a key in the dictionary first and if not assume that this means the driver has separate RX and TX queues. With this change, the test now passes successfully on tg3 and mlx5 (which does have a 'combined-count'). Fixes: `1cf2704242` ("net: selftest: add test for netdev netlink queue-get API") Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250226181957.212189-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 07:30:32 -08:00
Colin Ian King	bd64e9d6aa	selftests/x86/xstate: Fix spelling mistake "hader" -> "header" There is a spelling mistake in a sig_print() message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250227091533.599213-1-colin.i.king@gmail.com	2025-02-27 10:49:31 +01:00
Bharadwaj Raju	29fa7d7934	selftests/sysctl: fix wording of help messages Change wording of test number recommendation to "the recommended number of times". Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-02-27 10:02:12 +01:00
Jakub Kicinski	185646a8a0	selftests: drv-net: add tests for napi IRQ affinity notifiers Add tests to check that the napi retained the IRQ after down/up, multiple changes in the number of rx queues and after attaching/releasing XDP program. Tested on ice and idpf: # NETIF=<iface> tools/testing/selftests/drivers/net/hw/irq.py KTAP version 1 1..4 ok 1 irq.check_irqs_reported ok 2 irq.check_reconfig_queues ok 3 irq.check_reconfig_xdp ok 4 irq.check_down # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Tested-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://patch.msgid.link/20250224232228.990783-7-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 19:51:38 -08:00
Willem de Bruijn	2e5584e0f9	selftests/net: expand cmsg_ipv6.sh with ipv4 Expand IPV6_TCLASS to also cover IP_TOS. Expand IPV6_HOPLIMIT to also cover IP_TTL. Expand csmg_sender.c to allow setting IPv4 setsockopts. Also rename struct v6 to cmsg to match its expanded scope. Don't bother updating all occurrences of tclass and hoplimit. Rename cmsg_ipv6.sh to cmsg_ip.sh to match the expanded scope. Be careful around the subtle API difference between TCLASS and TOS. IP_TOS includes ECN bits. Add a test to verify that these are masked when making routing decisions. Diff is more concise with --word-diff Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250225022431.2083926-3-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 18:59:00 -08:00
Willem de Bruijn	c1d6d629ab	selftests/net: prepare cmsg_ipv6.sh for ipv4 Move IPV6_TCLASS and IPV6_HOPLIMIT into loops, to be able to use them for IP_TOS and IP_TTL in a follow-on patch. Indentation in this file is a mix of four spaces and tabs for double indents. To minimize code churn, maintain that pattern. Very small diff if viewing with -w. Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250225022431.2083926-2-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 18:59:00 -08:00
Shameer Kolothum	f69656656f	KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2 One difference here with other pseudo-firmware bitmap registers is that the default/reset value for the supported hypercall function-ids is 0 at present. Hence, modify the test accordingly. Reviewed-by: Sebastian Ott <sebott@redhat.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/r/20250221140229.12588-7-shameerali.kolothum.thodi@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-02-26 13:30:37 -08:00
Dave Jiang	516e5bd0b6	cxl: Add mce notifier to emit aliased address for extended linear cache Below is a setup with extended linear cache configuration with an example layout of memory region shown below presented as a single memory region consists of 256G memory where there's 128G of DRAM and 128G of CXL memory. The kernel sees a region of total 256G of system memory. 128G DRAM 128G CXL memory \|-----------------------------------\|-------------------------------------\| Data resides in either DRAM or far memory (FM) with no replication. Hot data is swapped into DRAM by the hardware behind the scenes. When error is detected in one location, it is possible that error also resides in the aliased location. Therefore when a memory location that is flagged by MCE is part of the special region, the aliased memory location needs to be offlined as well. Add an mce notify callback to identify if the MCE address location is part of an extended linear cache region and handle accordingly. Added symbol export to set_mce_nospec() in x86 code in order to call set_mce_nospec() from the CXL MCE notify callback. Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 14:13:49 -07:00
Thomas Weißschuh	3bd53b2fa5	Revert "selftests: kselftest: Fix build failure with NOLIBC" This reverts commit `16767502aa`. Nolibc gained support for uname(2) and sscanf(3) which are the dependencies of ksft_min_kernel_version(). So re-enable support for ksft_min_kernel_version() under nolibc. Acked-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250209-nolibc-scanf-v2-2-c29dea32f1cd@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-26 22:13:48 +01:00
Thomas Weißschuh	22edf1f8d4	tools/nolibc: add support for [v]sscanf() These functions are used often, also in selftests. sscanf() itself is also used by kselftest.h itself. The implementation is limited and only supports numeric arguments. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250209-nolibc-scanf-v2-1-c29dea32f1cd@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-26 22:13:48 +01:00
Heiko Carstens	dc4b165855	selftests/ftrace: Use readelf to find entry point in uprobe test The uprobe events test fails on s390, but also on x86 (Fedora 41). The problem appears to be that there is an assumption that adding a uprobe to the beginning of the executable mapping of /bin/sh is sufficient to trigger a uprobe event when /bin/sh is executed. This assumption is not necessarily true. Therefore use "readelf -h" to find the entry point address of /bin/sh and use this address when adding the uprobe event. This adds a dependency to readelf which is not always installed. Therefore add a check and exit with exit_unresolved if it is not installed. Link: https://lore.kernel.org/r/20250220130102.2079179-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-26 13:53:58 -07:00
Dave Jiang	0ec9849b63	acpi/hmat / cxl: Add extended linear cache support for CXL The current cxl region size only indicates the size of the CXL memory region without accounting for the extended linear cache size. Retrieve the cache size from HMAT and append that to the cxl region size for the cxl region range that matches the SRAT range that has extended linear cache enabled. The SRAT defines the whole memory range that includes the extended linear cache and the CXL memory region. The new HMAT ECN/ECR to the Memory Side Cache Information Structure defines the size of the extended linear cache size and matches to the SRAT Memory Affinity Structure by the memory proxmity domain. Add a helper to match the cxl range to the SRAT memory range in order to retrieve the cache size. There are several places that checks the cxl region range against the decoder range. Use new helper to check between the two ranges and address the new cache size. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 13:45:22 -07:00
Linus Torvalds	c0d35086a2	Landlock fix for v6.14-rc5 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ785FhAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSILQBAMFwpFClzjVeWyLFNd/gaTlPWeedvnag+yZu CK9q39jOAP9tj1unQBFpsI7jsTk6ZxPVxb4DymPIirZMb/5FuxUnDQ== =QrSM -----END PGP SIGNATURE----- Merge tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fixes to TCP socket identification, documentation, and tests" * tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add binaries to .gitignore selftests/landlock: Test that MPTCP actions are not restricted selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP landlock: Fix non-TCP sockets restriction landlock: Minor typo and grammar fixes in IPC scoping documentation landlock: Fix grammar error selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB	2025-02-26 11:55:44 -08:00
Andrea Righi	5ae5161820	selftests/sched_ext: Add NUMA-aware scheduler test Add a selftest to validate the behavior of the NUMA-aware scheduler functionalities, including idle CPU selection within nodes, per-node DSQs and CPU to node mapping. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-02-26 08:49:02 -10:00
Mykyta Yatsenko	3d1033caf0	selftests/bpf: Introduce veristat test Introducing test for veristat, part of test_progs. Test cases cover functionality of setting global variables in BPF program. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250225163101.121043-3-mykyta.yatsenko5@gmail.com	2025-02-26 10:45:01 -08:00
Mykyta Yatsenko	e3c9abd0d1	selftests/bpf: Implement setting global variables in veristat To better verify some complex BPF programs we'd like to preset global variables. This patch introduces CLI argument `--set-global-vars` or `-G` to veristat, that allows presetting values to global variables defined in BPF program. For example: prog.c: ``` enum Enum { ELEMENT1 = 0, ELEMENT2 = 5 }; const volatile __s64 a = 5; const volatile __u8 b = 5; const volatile enum Enum c = ELEMENT2; const volatile bool d = false; char arr[4] = {0}; SEC("tp_btf/sched_switch") int BPF_PROG(...) { bpf_printk("%c\n", arr[a]); bpf_printk("%c\n", arr[b]); bpf_printk("%c\n", arr[c]); bpf_printk("%c\n", arr[d]); return 0; } ``` By default verification of the program fails: ``` ./veristat prog.bpf.o ``` By presetting global variables, we can make verification pass: ``` ./veristat wq.bpf.o -G "a = 0" -G "b = 1" -G "c = 2" -G "d = 3" ``` Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250225163101.121043-2-mykyta.yatsenko5@gmail.com	2025-02-26 10:45:00 -08:00
Ihor Solodrai	0ba0ef012e	selftests/bpf: Test bpf_usdt_arg_size() function Update usdt tests to also check for correct behavior of bpf_usdt_arg_size(). Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20250224235756.2612606-2-ihor.solodrai@linux.dev	2025-02-26 08:59:44 -08:00
Dave Jiang	44818d387e	cxl/test: Add Get Supported Features mailbox command support Add cxl-test emulation of Get Supported Features mailbox command. Currently only adding a test feature with feature identifier of all f's for testing. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250220194438.2281088-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:31 -07:00
Dave Jiang	f0e6a2329b	cxl: Add Get Supported Features command for kernel usage CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h) The command retrieve the list of supported device-specific features (identified by UUID) and general information about each Feature. The driver will retrieve the Feature entries in order to make checks and provide information for the Get Feature and Set Feature command. One of the main piece of information retrieved are the effects a Set Feature command would have for a particular feature. The retrieved Feature entries are stored in the cxl_mailbox context. The setup of Features is initiated via devm_cxl_setup_features() during the pci probe function before the cxl_memdev is enumerated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:27 -07:00
Mahe Tardy	9138048bb5	selftests/bpf: add cgroup_skb netns cookie tests Add netns cookie test that verifies the helper is now supported and work in the context of cgroup_skb programs. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com> Link: https://lore.kernel.org/r/20250225125031.258740-2-mahe.tardy@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-26 07:35:51 -08:00
Chang S. Bae	bfc98dbcb3	selftests/x86/avx: Add AVX tests Add xstate testing specifically for those vector register states, validating kernel's context switching and ensuring ABI compliance. Use the established xstate testing framework. Alternatively, this invocation could be placed directly in xstate.c::main(). However, the current test file naming convention, which clearly specifies the tested area, seems reasonable. Adding avx.c considerably aligns with that convention. The test output should be like this for ZMM_Hi256 as an example: $ avx_64 ... [RUN] AVX-512 ZMM_Hi256: check context switches, 10 iterations, 5 threads. [OK] No incorrect case was found. [RUN] AVX-512 ZMM_Hi256: inject xstate via ptrace(). [OK] 'xfeatures' in SW reserved area was correctly written [OK] xstate was correctly updated. [RUN] AVX-512 ZMM_Hi256: load xstate and raise SIGUSR1 [OK] 'magic1' is valid [OK] 'xfeatures' in SW reserved area is valid [OK] 'xfeatures' in XSAVE header is valid [OK] xstate delivery was successful [OK] 'magic2' is valid [RUN] AVX-512 ZMM_Hi256: load new xstate from sighandler and check it after sigreturn [OK] xstate was restored correctly But systems without AVX-512 will look like: ... The kernel does not support feature number: 5 The kernel does not support feature number: 6 The kernel does not support feature number: 7 Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-10-chang.seok.bae@intel.com	2025-02-26 13:05:30 +01:00
Chang S. Bae	fa826c1f2c	selftests/x86/xstate: Clarify supported xstates The established xstate test code is designed to be generic, but certain xstates require special handling and cannot be tested without additional adjustments. Clarify which xstates are currently supported, and enforce testing only for them. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-9-chang.seok.bae@intel.com	2025-02-26 13:05:30 +01:00
Chang S. Bae	10d8a204c5	selftests/x86/xstate: Consolidate test invocations into a single entry Currently, each of the three xstate tests runs as a separate invocation, requiring the xstate number to be passed and state information to be reconstructed repeatedly. This approach arose from their individual and isolated development, but now it makes sense to unify them. Introduce a wrapper function that first verifies feature availability from the kernel and constructs the necessary state information once. The wrapper then sequentially invokes all tests to ensure consistent execution. Update the AMX test to use this unified invocation. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-8-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	e075d9fa16	selftests/x86/xstate: Introduce signal ABI test With the refactored test cases, another xstate exposure to userspace is through signal delivery. While amx.c includes signal-related scenarios, its primary focus is on xstate permission management, which is largely specific to dynamic states. The remaining gap is testing xstate preservation and restoration across signal delivery. The kernel defines an ABI for presenting xstate in the signal frame, closely resembling the hardware XSAVE format, where xstate modification is also possible. Introduce a new test case to verify xstate preservation across signal delivery and return, that is ensuring ABI compatibility by: - Loading xstate before raising a signal. - Verifying correct exposure in the signal frame - Modifying xstate in the signal frame before returning. - Checking the state restoration upon signal return. Integrate this test into the AMX test suite as an initial usage site. Expected output: $ amx_64 ... [RUN] AMX Tile data: load xstate and raise SIGUSR1 [OK] 'magic1' is valid [OK] 'xfeatures' in SW reserved area is valid [OK] 'xfeatures' in XSAVE header is valid [OK] xstate delivery was successful [OK] 'magic2' is valid [RUN] AMX Tile data: load new xstate from sighandler and check it after sigreturn [OK] xstate was restored correctly Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-7-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	7cb2fbe419	selftests/x86/xstate: Refactor ptrace ABI test Following the refactoring of the context switching test, the ptrace test is another component reusable for other xstate features. As part of this restructuring, add a missing check to validate the user_xstateregs->xstate_fx_sw field in the ABI. Also, replace err() and fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: inject xstate via ptrace(). [OK] 'xfeatures' in SW reserved area was correctly written [OK] xstate was correctly updated. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-6-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	40f6852ef2	selftests/x86/xstate: Refactor context switching test The existing context switching and ptrace tests in amx.c are not specific to dynamic states, making them reusable for general xstate testing. As a first step, move the context switching test to xstate.c. Refactor the test code to allow specifying which xstate component being tested. To decouple the test from dynamic states, remove the permission request code. In fact, The permission request inside the test wrapper was redundant. Additionally, replace fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: check context switches, 10 iterations, 5 threads. [OK] No incorrect case was found. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-5-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	3fcb4d6146	selftests/x86/xstate: Enumerate and name xstate components After moving essential helpers from amx.c, the code remains neutral regarding which xstate components it handles. However, explicitly listing known components helps users identify which features are ready for testing. Enumerate xstate components to facilitate identification. Extend struct xstate_info to include a name field, providing a human-readable identifier. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-4-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Chang S. Bae	0f6d91a327	selftests/x86/xstate: Refactor XSAVE helpers for general use The AMX test introduced several XSAVE-related helper functions, but so far, it has been the only user of them. These helpers can be generalized for broader test of multiple xstate features. Move most XSAVE-related code into xsave.h, making it shareable. The restructuring includes: * Establishing low-level XSAVE helpers for saving and restoring register states, as well as handling XSAVE buffers. * Generalizing state data manipuldations: set_rand_data() * Introducing a generic feature query helper: get_xstate_info() While doing so, remove unused defines in amx.c. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-3-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Chang S. Bae	dbd6b649e7	selftests/x86: Consolidate redundant signal helper functions The x86 selftests frequently register and clean up signal handlers, but the sethandler() and clearhandler() functions have been redundantly copied across multiple .c files. Move these functions to helpers.h to enable reuse across tests, eliminating around 250 lines of duplicate code. Converge the error handling by using ksft_exit_fail_msg(), which is functionally equivalent with err() within the selftest framework. This change is a prerequisite for the upcoming xstate selftest, which requires signal handling for registering and cleaning up handlers. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-2-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Sebastian Ott	a88c7c2244	KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR Assert that MIDR_EL1, REVIDR_EL1, AIDR_EL1 are writable from userspace, that the changed values are visible to guests, and that they are preserved across a vCPU reset. Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20250225005401.679536-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-02-26 01:32:23 -08:00
Amery Hung	4e4136c644	selftests/bpf: Test gen_pro/epilogue that generate kfuncs Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program. The main bpf program and return value checks are identical to pro_epilogue.c introduced in commit `47e69431b5` ("selftests/bpf: Test gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops detects a program name with prefix "test_kfunc_", it generates slightly different prologue and epilogue: They still add 1000 to args->a in prologue, add 10000 to args->a and set r0 to 2 * args->a in epilogue, but involve kfuncs. At high level, the alternative version of prologue and epilogue look like this: cgrp = bpf_cgroup_from_id(0); if (cgrp) bpf_cgroup_release(cgrp); else /* Perform what original bpf_testmod_st_ops prologue or * epilogue does */ Since 0 is never a valid cgroup id, the original prologue or epilogue logic will be performed. As a result, the __retval check should expect the exact same return value. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250225233545.285481-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-25 19:04:43 -08:00
Gal Pressman	da87cabaf8	selftests: drv-net-hw: Add a test for symmetric RSS hash Add a selftest that verifies symmetric RSS hash is working as intended. The test runs iterations of traffic, swapping the src/dst UDP ports, and verifies that the same RX queue is receiving the traffic in both cases. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-5-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:31:05 -08:00
Gal Pressman	0163250039	selftests: drv-net: Make rand_port() get a port more reliably Instead of guessing a port and checking whether it's available, get an available port from the OS. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:31:05 -08:00
Hangbin Liu	0f58804080	selftests/net: ensure mptcp is enabled in netns Some distributions may not enable MPTCP by default. All other MPTCP tests source mptcp_lib.sh to ensure MPTCP is enabled before testing. However, the ip_local_port_range test is the only one that does not include this step. Let's also ensure MPTCP is enabled in netns for ip_local_port_range so that it passes on all distributions. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224094013.13159-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:13:28 -08:00
Linus Torvalds	9f5270d758	perf tools fixes for v6.14: 2nd batch - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZ74OJwAKCRCyPKLppCJ+ J30mAPsHCA8A+CNq/5yW2VhFLV1GgCSL5oWqxXRn7QjhSrCQBQEAot2u4O5zXs7M sg+mPlYiS1oT+zmvTLlXrN+bVyWP9A4= =jH1N -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure	2025-02-25 13:32:32 -08:00
liuye	c4f23a9d6e	selftests/x86/lam: Fix minor memory in do_uring() Exception branch returns without freeing 'fi'. Signed-off-by: liuye <liuye@kylinos.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250114082650.113105-1-liuye@kylinos.cn	2025-02-25 15:03:06 +01:00
Linus Torvalds	2a1944bff5	RISC-V Fixes for 6.14-rc5 * A fix for cacheinfo DT probing to avoid reading non-boolean properties as booleans. * A fix for cpufeature to use bitmap_equal() instead of memcmp(), so unused bits are ignored. * Fixes for cmpxchg and futex cmpxchg that properly encode the sign extension requirements on inline asm, which results in spurious successes. This manifests in at least inode_set_ctime_current, but is likely just a disaster waiting to happen. * A fix for the rseq selftests, which was using an invalid constraint. * A pair of fixes for signal frame size handling: * We were reserving space for an extra empty extension context header on systems with extended signal context, thus resulting in unnecessarily large allocations. * We weren't properly checking for available extensions before calculating the signal stack size, which resulted in undersized stack allocations on some systems (at least those with T-Head custom vectors). Also, we've added Alex as a reviewer. He's been helping out a ton lately, thanks! -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAme8rsQTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQVumD/9EZz6+ejftG1u7M/YzFfIa6ZOqYtoi 7aaecvsKNhVu0zvLU2vDAj+lTeokutQKAI9hByoVDVBzCllW/AYnpentQXTbHbl4 3pyIeOPKMXEDyru8heQ/u7h2qhq1AN54btPHx9UdIc5/vyrQIF2Ec4jo1GojJmyj qNHSyeTOB8nhBHYzM2RYAw6r0s27UX5wfL09Cm97b7KJigJ6GGPsI+KjruOqVfST i7wqbH1ufQoXrU8DhICA5wT/Q7f2WqJ3W2IyP186EJOAXhEG9xzOMQzVwCm2KI/I Rf5mhE5MoS9+ZmyhpATRc8Dwr1iZjIh7nKIFJh4/IGcSCuwsnWEFRvihmqCnphl3 /fFLstBflFAEKadmc7LgRJDTZYAjVAe/kSrl3v9ArVBU7d07/dAUOXcgjJy6fygx 7yrMnT3Ew4J160iDvFIalUdjEgYsKbwNygT/D6XlcXuR/KoWA+HYQ7S4eTgbqpiz 366JAhZQ9xyQxTlca13sdkz+2eBjU/SAX8O9/WYpDp6J5uqOFtyQ52a2qEzQFznF MRtFe2YERlM8L0vqJhNFEDefo7FUuZIyOU8evhJA+L7X8b3FpsEx5xUzamXdeGsQ 0sEbsBPhNToVF0DWn8HtdpMYYX95xujj83QneMawaGjpS7cpY93OVUNFAJ/a/uuJ bGLv+1HIdW0LoQ== =+DfW -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for cacheinfo DT probing to avoid reading non-boolean properties as booleans. - A fix for cpufeature to use bitmap_equal() instead of memcmp(), so unused bits are ignored. - Fixes for cmpxchg and futex cmpxchg that properly encode the sign extension requirements on inline asm, which results in spurious successes. This manifests in at least inode_set_ctime_current, but is likely just a disaster waiting to happen. - A fix for the rseq selftests, which was using an invalid constraint. - A pair of fixes for signal frame size handling: - We were reserving space for an extra empty extension context header on systems with extended signal context, thus resulting in unnecessarily large allocations. - We weren't properly checking for available extensions before calculating the signal stack size, which resulted in undersized stack allocations on some systems (at least those with T-Head custom vectors). Also, we've added Alex as a reviewer. He's been helping out a ton lately, thanks! * tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: MAINTAINERS: Add myself as a riscv reviewer riscv: signal: fix signal_minsigstksz riscv: signal: fix signal frame size rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm riscv/futex: sign extend compare value in atomic cmpxchg riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg riscv: cpufeature: use bitmap_equal() instead of memcmp() riscv: cacheinfo: Use of_property_present() for non-boolean properties	2025-02-24 16:40:32 -08:00
Martin K. Petersen	adc4fb9c81	Merge patch series "Initial support for RK3576 UFS controller" Shawn Lin <shawn.lin@rock-chips.com> says: This patchset adds initial UFS controller supprt for RK3576 SoC. Patch 1 is the dt-bindings. Patch 2-4 deal with rpm and spm support in advanced suggested by Ulf. Patch 5 exports two new APIs for host driver. Patch 6 and 7 are the host driver and dtsi support. Link: https://lore.kernel.org/r/1738736156-119203-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-02-24 19:24:14 -05:00
Yiqian Xun	e402c70856	selftests/user_events: Fix failures caused by test code In parse_abi function,the dyn_test fails because the enable_file isn’t closed after successfully registering an event. By adding wait_for_delete(), the dyn_test now passes as expected. Link: https://lore.kernel.org/r/20250221033555.326716-1-realxxyq@163.com Signed-off-by: Yiqian Xun <xunyiqian@kylinos.cn> Acked-by: Beau Belgrave <beaub@linux.microsoft.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-24 16:37:17 -07:00
Jakub Kicinski	29b036be1b	selftests: drv-net: test XDP, HDS auto and the ioctl path Test XDP and HDS interaction. While at it add a test for using the IOCTL, as that turned out to be the real culprit. Testing bnxt: # NETIF=eth0 ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by the device ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0 and netdevsim: # ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0 Netdevsim needs a sane default for tx/rx ring size. ethtool 6.11 is needed for the --disable-netlink option. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250221025141.1132944-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-24 14:16:37 -08:00
David Wei	89baa22d75	io_uring/zcrx: add selftest case for recvzc with read limit Add a selftest case to iou-zcrx where the sender sends 4x4K = 16K and the receiver does 4x4K recvzc requests. Validate that the requests complete successfully and that the data is not corrupted. Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20250224041319.2389785-3-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-24 12:56:13 -07:00
Sean Christopherson	2428865bf0	KVM: selftests: Add a nested (forced) emulation intercept test for x86 Add a rudimentary test for validating KVM's handling of L1 hypervisor intercepts during instruction emulation on behalf of L2. To minimize complexity and avoid overlap with other tests, only validate KVM's handling of instructions that L1 wants to intercept, i.e. that generate a nested VM-Exit. Full testing of emulation on behalf of L2 is better achieved by running existing (forced) emulation tests in a VM, (although on VMX, getting L0 to emulate on #UD requires modifying either L1 KVM to not intercept #UD, or modifying L0 KVM to prioritize L0's exception intercepts over L1's intercepts, as is done by KVM for SVM). Since emulation should never be successful, i.e. L2 always exits to L1, dynamically generate the L2 code stream instead of adding a helper for each instruction. Doing so requires hand coding instruction opcodes, but makes it significantly easier for the test to compute the expected "next RIP" and instruction length. Link: https://lore.kernel.org/r/20250201015518.689704-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-24 09:01:07 -08:00
Linus Torvalds	b8c8c1414f	tracing fixes for v6.14: Function graph accounting fixes: - Fix the manage ops hashes The function graph registers a "manager ops" and "sub-ops" to ftrace. The manager ops does not have any callback but calls the sub-ops callbacks. The manage ops hashes (what is used to tell ftrace what functions to attach to) is built on the sub-ops it manages. There was an error in the way it built the hash. An empty hash means to attach to all functions. When the manager ops had one sub-ops it properly copied its hash. But when the manager ops had more than one sub-ops, it went into a loop to make a set of all functions it needed to add to the hash. If any of the subops hashes was empty, that would mean to attach to all functions. The error was that the first iteration of the loop passed in an empty hash to start with in order to add the other hashes. That starting hash was mistaken as to attach to all functions. This made the manage ops attach to all functions whenever it had two or more sub-ops, even if each sub-op was attached to only a single function. - Do not add duplicate entries to the manager ops hash If two or more subops hashes trace the same function, an entry for that function will be added to the manager ops for each subops. This causes waste and extra overhead. Fprobe accounting fixes: - Remove last function from fprobe hash Fprobes has a ftrace hash to manage which functions an fprobe is attached to. It also has a counter of how many fprobes are attached. When the last fprobe is removed, it unregisters the fprobe from ftrace but does not remove the functions the last fprobe was attached to from the hash. This leaves the old functions attached. When a new fprobe is added, the fprobe infrastructure attaches to not only the functions of the new fprobe, but also to the functions of the last fprobe. - Fix accounting of the fprobe counter When a fprobe is added, it updates a counter. If the counter goes from zero to one, it attaches its ops to ftrace. When an fprobe is removed, the counter is decremented. If the counter goes from 1 to zero, it removes the fprobes ops from ftrace. There was an issue where if two fprobes trace the same function, the addition of each fprobe would increment the counter. But when removing the first of the fprobes, it would notice that another fprobe is still attached to one of its functions no it does not remove the functions from the ftrace ops. But it also did not decrement the counter. When the last fprobe is removed, the counter is still one. This leaves the fprobes callback still registered with ftrace and it being called by the functions defined by the fprobes ops hash. Worse yet, because all the functions from the fprobe ops hash have been removed, that tells ftrace that it wants to trace all functions. Thus, this puts the state of the system where every function is calling the fprobe callback handler (which does nothing as there are no registered fprobes), but this causes a good 13% slow down of the entire system. Other updates: - Add a selftest to test the above issues to prevent regressions. - Fix preempt count accounting in function tracing Better recursion protection was added to function tracing which added another layer of preempt disable. As the preempt_count gets traced in the event, it needs to subtract the amount of preempt disabling the tracer does to record what the preempt_count was when the trace was triggered. - Fix memory leak in output of set_event A variable is passed by the seq_file functions in the location that is set by the return of the next() function. The start() function allocates it and the stop() function frees it. But when the last item is found, the next() returns NULL which leaks the data that was allocated in start(). The m->private is used for something else, so have next() free the data when it returns NULL, as stop() will then just receive NULL in that case. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ7j6ARQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6quFxAQDrO8tjYbhLqg/LMOQyzwn/EF3Jx9ub 87961mA0rKTkYwEAhPNzTZ6GwKyKc4ny/R338KgNY69wWnOK6k/BTxCRmwk= =TOah -----END PGP SIGNATURE----- Merge tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Function graph accounting fixes: - Fix the manage ops hashes The function graph registers a "manager ops" and "sub-ops" to ftrace. The manager ops does not have any callback but calls the sub-ops callbacks. The manage ops hashes (what is used to tell ftrace what functions to attach to) is built on the sub-ops it manages. There was an error in the way it built the hash. An empty hash means to attach to all functions. When the manager ops had one sub-ops it properly copied its hash. But when the manager ops had more than one sub-ops, it went into a loop to make a set of all functions it needed to add to the hash. If any of the subops hashes was empty, that would mean to attach to all functions. The error was that the first iteration of the loop passed in an empty hash to start with in order to add the other hashes. That starting hash was mistaken as to attach to all functions. This made the manage ops attach to all functions whenever it had two or more sub-ops, even if each sub-op was attached to only a single function. - Do not add duplicate entries to the manager ops hash If two or more subops hashes trace the same function, an entry for that function will be added to the manager ops for each subops. This causes waste and extra overhead. Fprobe accounting fixes: - Remove last function from fprobe hash Fprobes has a ftrace hash to manage which functions an fprobe is attached to. It also has a counter of how many fprobes are attached. When the last fprobe is removed, it unregisters the fprobe from ftrace but does not remove the functions the last fprobe was attached to from the hash. This leaves the old functions attached. When a new fprobe is added, the fprobe infrastructure attaches to not only the functions of the new fprobe, but also to the functions of the last fprobe. - Fix accounting of the fprobe counter When a fprobe is added, it updates a counter. If the counter goes from zero to one, it attaches its ops to ftrace. When an fprobe is removed, the counter is decremented. If the counter goes from 1 to zero, it removes the fprobes ops from ftrace. There was an issue where if two fprobes trace the same function, the addition of each fprobe would increment the counter. But when removing the first of the fprobes, it would notice that another fprobe is still attached to one of its functions no it does not remove the functions from the ftrace ops. But it also did not decrement the counter, so when the last fprobe is removed, the counter is still one. This leaves the fprobes callback still registered with ftrace and it being called by the functions defined by the fprobes ops hash. Worse yet, because all the functions from the fprobe ops hash have been removed, that tells ftrace that it wants to trace all functions. Thus, this puts the state of the system where every function is calling the fprobe callback handler (which does nothing as there are no registered fprobes), but this causes a good 13% slow down of the entire system. Other updates: - Add a selftest to test the above issues to prevent regressions. - Fix preempt count accounting in function tracing Better recursion protection was added to function tracing which added another layer of preempt disable. As the preempt_count gets traced in the event, it needs to subtract the amount of preempt disabling the tracer does to record what the preempt_count was when the trace was triggered. - Fix memory leak in output of set_event A variable is passed by the seq_file functions in the location that is set by the return of the next() function. The start() function allocates it and the stop() function frees it. But when the last item is found, the next() returns NULL which leaks the data that was allocated in start(). The m->private is used for something else, so have next() free the data when it returns NULL, as stop() will then just receive NULL in that case" * tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak when reading set_event file ftrace: Correct preemption accounting for function tracing. selftests/ftrace: Update fprobe test to check enabled_functions file fprobe: Fix accounting of when to unregister from function graph fprobe: Always unregister fgraph function from ops ftrace: Do not add duplicate entries in subops manager ops ftrace: Fix accounting of adding subops to a manager ops	2025-02-22 09:03:54 -08:00
Tamir Duberstein	ba6be7ba2d	selftests: remove reference to prime_numbers.sh Remove a leftover shell script reference from commit `313b38a6ec` ("lib/prime_numbers: convert self-test to KUnit"). Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202502171110.708d965a-lkp@intel.com Fixes: `313b38a6ec` ("lib/prime_numbers: convert self-test to KUnit") Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250217-fix-prime-numbers-v1-1-eb0ca7235e60@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-22 06:44:45 -08:00
Michael Jeanson	3c27b40830	selftests/rseq: Add rseq syscall errors test This test adds coverage of expected errors during rseq registration and unregistration, it disables glibc integration and will thus always exercise the rseq syscall explictly. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250121213402.1754762-1-mjeanson@efficios.com	2025-02-22 14:13:45 +01:00
Maciej Wieczor-Retman	782b819827	selftests/lam: Test get_user() LAM pointer handling Recent change in how get_user() handles pointers: https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundation.org/ has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/1624d9d1b9502517053a056652d50dc5d26884ac.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Maciej Wieczor-Retman	51f909dcd1	selftests/lam: Skip test if LAM is disabled Until LASS is merged into the kernel: https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com/ LAM is left disabled in the config file. Running the LAM selftest with disabled LAM only results in unhelpful output. Use one of LAM syscalls() to determine whether the kernel was compiled with LAM support (CONFIG_ADDRESS_MASKING) or not. Skip running the tests in the latter case. Merge CPUID checking function with the one mentioned above to achieve a single function that shows LAM's availability from both CPU and the kernel. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/251d0f45f6a768030115e8d04bc85458910cb0dc.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Maciej Wieczor-Retman	ec8f5b4659	selftests/lam: Move cpu_has_la57() to use cpuinfo flag In current form cpu_has_la57() reports platform's support for LA57 through reading the output of cpuid. A much more useful information is whether 5-level paging is actually enabled on the running system. Check whether 5-level paging is enabled by trying to map a page in the high linear address space. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/8b1ca51b13e6d94b5a42b6930d81b692cbb0bcbb.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Hangbin Liu	465b210fdc	selftests: fib_nexthops: do not mark skipped tests as failed The current test marks all unexpected return values as failed and sets ret to 1. If a test is skipped, the entire test also returns 1, incorrectly indicating failure. To fix this, add a skipped variable and set ret to 4 if it was previously 0. Otherwise, keep ret set to 1. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:23:29 -08:00
Ido Schimmel	e818d1d1a6	selftests: fib_rule_tests: Add DSCP mask match tests Add tests for FIB rules that match on DSCP with a mask. Test both good and bad flows and both the input and output paths. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: dscp redirect to table [ OK ] TEST: rule6 check: dscp no redirect to table [ OK ] TEST: rule6 del by pref: dscp redirect to table [ OK ] TEST: rule6 check: iif dscp redirect to table [ OK ] TEST: rule6 check: iif dscp no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp redirect to table [ OK ] TEST: rule6 check: dscp masked redirect to table [ OK ] TEST: rule6 check: dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp masked redirect to table [ OK ] [...] Tests passed: 316 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:49 -08:00
Jakub Kicinski	e87700965a	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ7ffOQAKCRAraS+Z+3y5 EzVHAP9h/QkeYoOZW9gul08I8vFiZsFe/lbOSLJWxeVfxb9JhgD/cMqby3qAxQK6 lsdNQ9jYG2232Wym89ag7fvTBK15Wg4= =gkN2 -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:59:47 -08:00
Xiao Liang	85cb3711ac	selftests: net: Add test cases for link and peer netns - Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns. - Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	0303294162	selftests: net: Add python context manager for netns entering Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command be executed in netns "ns1". Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-13-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Steven Rostedt	e85c5e9792	selftests/ftrace: Update fprobe test to check enabled_functions file A few bugs were found in the fprobe accounting logic along with it using the function graph infrastructure. Update the fprobe selftest to catch those bugs in case they or something similar shows up in the future. The test now checks the enabled_functions file which shows all the functions attached to ftrace or fgraph. When enabling a fprobe, make sure that its corresponding function is also added to that file. Also add two more fprobes to enable to make sure that the fprobe logic works properly with multiple probes. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250220202055.733001756@goodmis.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-02-21 09:36:12 -05:00
Chandra Pratap	ae9ebda1bc	selftests: fix spelling/grammar errors in sysctl/sysctl.sh Fix the grammatical/spelling errors in sysctl/sysctl.sh. This fixes all errors pointed out by codespell in the file. Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-02-21 09:27:04 +01:00
Amery Hung	63817c7711	selftests/bpf: Test struct_ops program with __ref arg calling bpf_tail_call Test if the verifier rejects struct_ops program with __ref argument calling bpf_tail_call(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250220221532.1079331-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-20 18:44:35 -08:00
Alexei Starovoitov	bd4319b6c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 Cross-merge bpf fixes after downstream PR (bpf-6.14-rc4). Minor conflict: kernel/bpf/btf.c Adjacent changes: kernel/bpf/arena.c kernel/bpf/btf.c kernel/bpf/syscall.c kernel/bpf/verifier.c mm/memory.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-20 18:13:57 -08:00
Jakub Kicinski	932a9249f7	selftests: drv-net: rename queues check_xdp to check_xsk The test is for AF_XDP, we refer to AF_XDP as XSK. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	4fde839846	selftests: drv-net: improve the use of ksft helpers in XSK queue test Avoid exceptions when xsk attr is not present, and add a proper ksft helper for "not in" condition. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	7147713799	selftests: drv-net: add a way to wait for a local process We use wait_port_listen() extensively to wait for a process we spawned to be ready. Not all processes will open listening sockets. Add a method of explicitly waiting for a child to be ready. Pass a FD to the spawned process and wait for it to write a message to us. FD number is passed via KSFT_READY_FD env variable. Similarly use KSFT_WAIT_FD to let the child process for a sign that we are done and child should exit. Sending a signal to a child with shell=True can get tricky. Make use of this method in the queues test to make it less flaky. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	d3726ab45c	selftests: drv-net: probe for AF_XDP sockets more explicitly Separate the support check from socket binding for easier refactoring. Use: ./helper - - just to probe if we can open the socket. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	bab59dcf71	selftests: drv-net: add missing new line in xdp_helper Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	dabd31baa3	selftests: drv-net: use cfg.rpath() in netlink xsk attr test The cfg.rpath() helper was been recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	846742f7e3	selftests: drv-net: add a warning for bkg + shell + terminate Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, termininate=True) most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:57:29 -08:00
Linus Torvalds	319fc77f8f	BPF fixes: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> -----BEGIN PGP SIGNATURE----- iIsEABYKADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZ7evlRUcZGFuaWVsQGlv Z2VhcmJveC5uZXQACgkQ2yufC7HISIPwHgD/dTvM00Ck4Q73fPivyT7tcqxeXJlD D6ggzWl/SG9LAbwA/2/cSgAM9Jm1g7ddvn/S9QaDYOs5GmFl6urq6krs+tYD =FCs9 -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull BPF fixes from Daniel Borkmann: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests: bpf: test batch lookup on array of maps with holes bpf: skip non exist keys in generic_map_lookup_batch bpf: Handle allocation failure in acquire_lock_state bpf: verifier: Disambiguate get_constant_map_key() errors bpf: selftests: Test constant key extraction on irrelevant maps bpf: verifier: Do not extract constant map keys for irrelevant maps bpf: Fix softlockup in arena_map_free on 64k page kernel net: Add rx_skb of kfree_skb to raw_tp_null_args[]. bpf: Fix deadlock when freeing cgroup storage selftests/bpf: Add strparser test for bpf selftests/bpf: Fix invalid flag of recv() bpf: Disable non stream socket for strparser bpf: Fix wrong copied_seq calculation strparser: Add read_sock callback bpf: avoid holding freeze_mutex during mmap operation bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic selftests/bpf: Adjust data size to have ETH_HLEN bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed	2025-02-20 15:37:17 -08:00
Song Yoong Siang	6164847e54	selftests/bpf: Add launch time request to xdp_hw_metadata Add launch time hardware offload request to xdp_hw_metadata. Users can configure the delta of launch time relative to HW RX-time using the "-l" argument. By default, the delta is set to 0 ns, which means the launch time is disabled. By setting the delta to a non-zero value, the launch time hardware offload feature will be enabled and requested. Additionally, users can configure the Tx Queue to be enabled with the launch time hardware offload using the "-L" argument. By default, Tx Queue 0 will be used. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250216093430.957880-3-yoong.siang.song@intel.com	2025-02-20 15:13:45 -08:00
Jason Xing	f4924aec58	selftests/bpf: Add simple bpf tests in the tx path for timestamping feature BPF program calculates a couple of latency deltas between each tx timestamping callbacks. It can be used in the real world to diagnose the kernel behaviour in the tx path. Check the safety issues by accessing a few bpf calls in bpf_test_access_bpf_calls() which are implemented in the patch 3 and 4. Check if the bpf timestamping can co-exist with socket timestamping. There remains a few realistic things[1][2] to highlight: 1. in general a packet may pass through multiple qdiscs. For instance with bonding or tunnel virtual devices in the egress path. 2. packets may be resent, in which case an ACK might precede a repeat SCHED and SND. 3. erroneous or malicious peers may also just never send an ACK. [1]: https://lore.kernel.org/all/67a389af981b0_14e0832949d@willemb.c.googlers.com.notmuch/ [2]: https://lore.kernel.org/all/c329a0c1-239b-4ca1-91f2-cb30b8dd2f6a@linux.dev/ Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-13-kerneljasonxing@gmail.com	2025-02-20 14:30:07 -08:00
Thomas Weißschuh	9c812b01f1	tools/nolibc: add support for 32-bit s390 32-bit s390 is very close to the existing 64-bit implementation. Some special handling is necessary as there is neither LLVM nor QEMU support. Also the kernel itself can not build natively for 32-bit s390, so instead the test program is executed with a 64-bit kernel. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-2-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:06:32 +01:00
Thomas Weißschuh	3d1e67c615	selftests/nolibc: rename s390 to s390x Support for 32-bit s390 is about to be added. As "s39032" would look horrible, use the another naming scheme. 32-bit s390 is "s390" and 64-bit s390 is "s390x", similar to how it is handled in various toolchain components. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-1-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:06:18 +01:00
Thomas Weißschuh	00ddf4cc97	selftests/nolibc: only run constructor tests on nolibc The nolibc testsuite can be run against other libcs to test for interoperability. Some aspects of the constructor execution are not standardized and musl does not provide all tested feature, for one it does not provide arguments to the constructors, anymore? Skip the constructor tests on non-nolibc configurations. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250212-nolibc-test-constructor-v1-1-c963875b3da4@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:04:12 +01:00
Steven Rostedt	e35896f236	selftests/tracing: Allow some more tests to run in instances The tests: trigger-action-hist-xfail.tc trigger-onchange-action-hist.tc trigger-snapshot-action-hist.tc trigger-hist-expressions.tc can all run in an instance. Test them in an instance as well. Link: https://lore.kernel.org/r/20250220185846.451234966@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:14 -07:00
Steven Rostedt	a58cc70af2	selftests/ftrace: Clean up triggers after setting them The triggers set in trigger-onchange-action-hist.tc and trigger-snapshot-action-hist.tc are not cleaned up at the end. These tests can also be done in instances and without cleaning up the triggers, the instances can not be removed as they are still "busy". Link: https://lore.kernel.org/r/20250220185846.291817731@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:07 -07:00
Steven Rostedt	4a3134b114	selftests/tracing: Test only toplevel README file not the instances For the tests that have both a README attribute as well as the instance flag to run the tests as an instance, the instance version will always exit with UNSUPPORTED. That's because the instance directory does not contain a README file. Currently, the tests check for a README file in the directory that the test runs in and if there's a requirement for something to be present in the README file, it will not find it, as the instance directory doesn't have it. Have the tests check if the current directory is an instance directory, and if it is, check two directories above the current directory for the README file: /sys/kernel/tracing/README /sys/kernel/tracing/instances/foo/../../README Link: https://lore.kernel.org/r/20250220185846.130216270@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:01 -07:00
Jakub Kicinski	5d6ba5ab85	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 10:37:30 -08:00
Linus Torvalds	27eddbf344	Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)). Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: geneve: fix use-after-free in geneve_find_dev(). - eth: ibmvnic: don't reference skb after sending to VIOS Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAme3Dc0SHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkQToQAKKTNLnFkYGFevxOukoZtU1l4ZOiaavg FlSsCVQrXfOvq5tIbkT7X86F3jYBNElNHvSDAFNQu5ZayEjqJHljSQE17rVVhepX Gvo9Jfk/ERwwd9vD78+KSvGMzSYAiIFoZmgLjaNyiFH53gMku8K41NRQAI2EeUzG G7lK/XYHvp/N7Ikt6ZMTFFW5NH7Wt2oeKZ8DbNBhK9+9lCet4s6PWzmGYrYkGUyy 0iuYh09tKCGrBH0BpHMIGokqttVawjdg4tCJUMXurVoOWus0dKXuIdpKQuCdm8jO lV89xotE7Icw9CJTFQQZNn2C94J3RL4xtqo4fOuO1ztll+ma/nbPp9lJgy2P1WRI DqWJart1YuzC4Ef1NQiNeqRjxKoLDHbuu2A1zdAHMyHxZsKaQWowhhjylQLYsU3N d0WHlo43VQf95wx9cqdsGmEopk5YPLqBS1Wz/uL/m7y/+dmK4nKCcmqhIQ1QG23L SOhuOmHDs3mDkXusxr7NULCwGzjy3rPSExGrM68kiDEUX7cWFvVGmrRHf2bBiv7A gJ6UZ3J4au477adHjj2lBn3n1iIQDSN04gjPYKjunSEaMI2nqr55dN/aAnKFTjtm fMPXMWOHDfkjtX0iV69/345dXyzL5r4JHNwtTwoLx1QRMm6WoznJveHI6qXQ1RGb HdG8y9A9ZfLn =pEjU -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)) Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: - geneve: fix use-after-free in geneve_find_dev() - ibmvnic: don't reference skb after sending to VIOS" * tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: skb: introduce and use a single page frag cache" net: allow small head cache usage with large MAX_SKB_FRAGS values nfp: bpf: Add check for nfp_app_ctrl_msg_alloc() tcp: drop secpath at the same time as we currently drop dst net: axienet: Set mac_managed_pm arp: switch to dev_getbyhwaddr() in arp_req_set_public() net: Add non-RCU dev_getbyhwaddr() helper sctp: Fix undefined behavior in left shift operation selftests/bpf: Add a specific dst port matching flow_dissector: Fix port range key handling in BPF conversion selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range flow_dissector: Fix handling of mixed port and port-range keys geneve: Suppress list corruption splat in geneve_destroy_tunnels(). gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). dev: Use rtnl_net_dev_lock() in unregister_netdev(). net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). net: Add net_passive_inc() and net_passive_dec(). net: pse-pd: pd692x0: Fix power limit retrieval MAINTAINERS: trim the GVE entry gve: set xdp redirect target only when it is available ...	2025-02-20 10:19:54 -08:00
Christian Brauner	540dcf0f44	selftests/nsfs: add ioctl validation tests Add simple tests to validate that non-nsfs ioctls are rejected. Link: https://lore.kernel.org/r/20250219-work-nsfs-v1-2-21128d73c5e8@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-20 09:13:52 +01:00
Jakub Kicinski	0d0f4174f6	selftests: drv-net: add a simple TSO test Add a simple test for TSO. Send a few MB of data and check device stats to verify that the device was performing segmentation. Do the same thing over a few tunnel types. Injecting GSO packets directly would give us more ability to test corner cases, but perhaps starting simple is good enough? # ./ksft-net-drv/drivers/net/hw/tso.py # Detected qstat for LSO wire-packets KTAP version 1 1..14 ok 1 tso.ipv4 # SKIP Test requires IPv4 connectivity ok 2 tso.vxlan4_ipv4 # SKIP Test requires IPv4 connectivity ok 3 tso.vxlan6_ipv4 # SKIP Test requires IPv4 connectivity ok 4 tso.vxlan_csum4_ipv4 # SKIP Test requires IPv4 connectivity ok 5 tso.vxlan_csum6_ipv4 # SKIP Test requires IPv4 connectivity ok 6 tso.gre4_ipv4 # SKIP Test requires IPv4 connectivity ok 7 tso.gre6_ipv4 # SKIP Test requires IPv4 connectivity ok 8 tso.ipv6 ok 9 tso.vxlan4_ipv6 ok 10 tso.vxlan6_ipv6 ok 11 tso.vxlan_csum4_ipv6 ok 12 tso.vxlan_csum6_ipv6 # Testing with mangleid enabled ok 13 tso.gre4_ipv6 ok 14 tso.gre6_ipv6 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:7 error:0 Note that the test currently depends on the driver reporting the LSO count via qstat, which appears to be relatively rare (virtio, cisco/enic, sfc/efc; but virtio needs host support). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	de94e86974	selftests: drv-net: store addresses in dict indexed by ipver Looks like more and more tests want to iterate over IP version, run the same test over ipv4 and ipv6. The current naming of members in the env class makes it a bit awkward, we have separate members for ipv4 and ipv6 parameters. Store the parameters inside dicts, so that tests can easily index them with ip version. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	2aefca8e1f	selftests: drv-net: get detailed interface info We already record output of ip link for NETIF in env for easy access. Record the detailed version. TSO test will want to know the max tso size. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	2217bcb491	selftests: drv-net: resolve remote interface name Find out and record in env the name of the interface which remote host will use for the IP address provided via config. Interface name is useful for mausezahn and for setting up tunnels. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:49 -08:00
Suchit	23dcacff2d	selftests: net: Fix minor typos in MPTCP and psock tests Fixes minor spelling errors: - `simult_flows.sh`: "al testcases" -> "all testcases" - `psock_tpacket.c`: "accross" -> "across" Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250218165923.20740-1-suchitkarunakaran@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:02:48 -08:00
Cong Wang	15de6ba95d	selftests/bpf: Add a specific dst port matching After this patch: #102/1 flow_dissector_classification/ipv4:OK #102/2 flow_dissector_classification/ipv4_continue_dissect:OK #102/3 flow_dissector_classification/ipip:OK #102/4 flow_dissector_classification/gre:OK #102/5 flow_dissector_classification/port_range:OK #102/6 flow_dissector_classification/ipv6:OK #102 flow_dissector_classification:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250218043210.732959-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:54:59 -08:00
Cong Wang	dfc1580f96	selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range After this patch: # ./tc_flower_port_range.sh TEST: Port range matching - IPv4 UDP [ OK ] TEST: Port range matching - IPv4 TCP [ OK ] TEST: Port range matching - IPv6 UDP [ OK ] TEST: Port range matching - IPv6 TCP [ OK ] TEST: Port range matching - IPv4 UDP Drop [ OK ] Cc: Qiang Zhang <dtzq01@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250218043210.732959-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:54:59 -08:00
Ido Schimmel	f5d783c088	selftests: fib_rule_tests: Add port mask match tests Add tests for FIB rules that match on source and destination ports with a mask. Test both good and bad flows. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport range no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport masked redirect to table [ OK ] TEST: rule6 check: sport and dport masked no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport masked redirect to table [ OK ] [...] Tests passed: 292 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:43:38 -08:00
Ido Schimmel	94694aa641	selftests: fib_rule_tests: Add port range match tests Currently, only matching on specific ports is tested. Add port range testing to make sure this use case does not regress. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:43:38 -08:00
Jordan Rome	7042882abc	selftests/bpf: Add tests for bpf_copy_from_user_task_str This adds tests for both the happy path and the error path (with and without the BPF_F_PAD_ZEROS flag). Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250213152125.1837400-3-linux@jordanrome.com	2025-02-19 17:01:36 -08:00
Alexis Lothoré (eBPF Foundation)	ac13c50872	selftests/bpf: Enable kprobe_multi tests for ARM64 The kprobe_multi feature was disabled on ARM64 due to the lack of fprobe support. The fprobe rewrite on function_graph has been recently merged and thus brought support for fprobes on arm64. This then enables kprobe_multi support on arm64, and so the corresponding tests can now be run on this architecture. Remove the tests depending on kprobe_multi from DENYLIST.aarch64 to allow those to run in CI. CONFIG_FPROBE is already correctly set in tools/testing/selftests/bpf/config Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250219-enable_kprobe_multi_tests-v1-1-faeec99240c8@bootlin.com	2025-02-19 14:57:05 -08:00
Jason Xing	7a93ba8048	selftests/bpf: Add rto max for bpf_setsockopt test Test the TCP_RTO_MAX_MS optname in the existing setget_sockopt test. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250219081333.56378-3-kerneljasonxing@gmail.com	2025-02-19 12:30:52 -08:00
Bastien Curutchet (eBPF Foundation)	157feaaf18	selftests/bpf: ns_current_pid_tgid: Use test_progs's ns_ feature Two subtests use the test_in_netns() function to run the test in a dedicated network namespace. This can now be done directly through the test_progs framework with a test name starting with 'ns_'. Replace the use of test_in_netns() by test_ns_* calls. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-4-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	207cd7578a	selftests/bpf: tc_links/tc_opts: Unserialize tests Tests are serialized because they all use the loopback interface. Replace the 'serial_test_' prefixes with 'test_ns_' to benefit from the new test_prog feature which creates a dedicated namespace for each test, allowing them to run in parallel. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-3-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	c047e0e0e4	selftests/bpf: Optionally open a dedicated namespace to run test in it Some tests are serialized to prevent interference with others. Open a dedicated network namespace when a test name starts with 'ns_' to allow more test parallelization. Use the test name as namespace name to avoid conflict between namespaces. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-2-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	4a06c5251a	selftests/bpf: ns_current_pid_tgid: Rename the test function Next patch will add a new feature to test_prog to run tests in a dedicated namespace if the test name starts with 'ns_'. Here the test name already starts with 'ns_' and creates some namespaces which would conflict with the new feature. Rename the test to avoid this conflict. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-1-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Christian Brauner	a1579f6bf6	selftests/ovl: add third selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-4-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	6e5ed6587e	selftests/ovl: add second selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-3-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	c68946ee7e	selftests/filesystems: add utils.{c,h} Add a new set of helpers that will be used in follow-up patches. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	96f0943259	selftests/ovl: add first selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-2-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Eduard Zingerman	6361cd26e4	selftests/bpf: check states pruning for deeply nested iterator A test case with ridiculously deep bpf_for() nesting and a conditional update of a stack location. Consider the innermost loop structure: 1: bpf_for(o, 0, 10) 2: if (unlikely(bpf_get_prandom_u32())) 3: buf[0] = 42; 4: <exit> Assuming that verifier.c:clean_live_states() operates w/o change from the previous patch (e.g. as on current master) verification would proceed as follows: - at (1) state {buf[0]=?,o=drained}: - checkpoint - push visit to (2) for later - at (4) {buf[0]=?,o=drained} - pop (2) {buf[0]=?,o=active}, push visit to (3) for later - at (1) {buf[0]=?,o=active} - checkpoint - push visit to (2) for later - at (4) {buf[0]=?,o=drained} - pop (2) {buf[0]=?,o=active}, push visit to (3) for later - at (1) {buf[0]=?,o=active}: - checkpoint reached, checkpoint's branch count becomes 0 - checkpoint is processed by clean_live_states() and becomes {o=active} - pop (3) {buf[0]=42,o=active} - at (1), {buf[0]=42,o=active} - checkpoint - push visit to (2) for later - at (4) {buf[0]=42,o=drained} - pop (2) {buf[0]=42,o=active}, push visit to (3) for later - at (1) {buf[0]=42,o=active}, checkpoint reached - pop (3) {buf[0]=42,o=active} - at (1) {buf[0]=42,o=active}: - checkpoint reached, checkpoint's branch count becomes 0 - checkpoint is processed by clean_live_states() and becomes {o=active} - ... Note how clean_live_states() converted the checkpoint {buf[0]=42,o=active} to {o=active} and it can no longer be matched against {buf[0]=<any>,o=active}, because iterator based states are compared using stacksafe(... RANGE_WITHIN), that requires stack slots to have same types. At the same time there are still states {buf[0]=42,o=active} pushed to DFS stack. This behaviour becomes exacerbated with multiple nesting levels, here are veristat results: - nesting level 1: 69 insns - nesting level 2: 258 insns - nesting level 3: 900 insns - nesting level 4: 4754 insns - nesting level 5: 35944 insns - nesting level 6: 312558 insns - nesting level 7: 1M limit Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250215110411.3236773-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 19:22:59 -08:00
Eduard Zingerman	6da35da1a3	selftests/bpf: test correct loop_entry update in copy_verifier_state A somewhat cumbersome test case sensitive to correct copying of bpf_verifier_state->loop_entry fields in verifier.c:copy_verifier_state(). W/o the fix from a previous commit the program is accepted as safe. 1: /* poison block / 2: if (random() != 24) { // assume false branch is placed first 3: i = iter_new(); 4: while (iter_next(i)); 5: iter_destroy(i); 6: return; 7: } 8: 9: / dfs_depth block / 10: for (i = 10; i > 0; i--); 11: 12: / main block / 13: i = iter_new(); // fp[-16] 14: b = -24; // r8 15: for (;;) { 16: if (iter_next(i)) 17: break; 18: if (random() == 77) { // assume false branch is placed first 19: (u64 )(r10 + b) = 7; // this is not safe when b == -25 20: iter_destroy(i); 21: return; 22: } 23: if (random() == 42) { // assume false branch is placed first 24: b = -25; 25: } 26: } 27: iter_destroy(i); The goal of this example is to: (a) poison env->cur_state->loop_entry with a state S, such that S->branches == 0; (b) set state S as a loop_entry for all checkpoints in / main block /, thus forcing NOT_EXACT states comparisons; (c) exploit incorrect loop_entry set for checkpoint at line 18 by first creating a checkpoint with b == -24 and then pruning the state with b == -25 using that checkpoint. The / poison block / is responsible for goal (a). It forces verifier to first validate some unrelated iterator based loop, which leads to an update_loop_entry() call in is_state_visited(), which places checkpoint created at line 4 as env->cur_state->loop_entry. Starting from line 8, the branch count for that checkpoint is 0. The / dfs_depth block / is responsible for goal (b). It abuses the fact that update_loop_entry(cur, hdr) only updates cur->loop_entry when hdr->dfs_depth <= cur->dfs_depth. After line 12 every state has dfs_depth bigger then dfs_depth of poisoned env->cur_state->loop_entry. Thus the above condition is never true for lines 12-27. The / main block */ is responsible for goal (c). Verification proceeds as follows: - checkpoint {b=-24,i=active} created at line 16; - jump 18->23 is verified first, jump to 19 pushed to stack; - jump 23->26 is verified first, jump to 24 pushed to stack; - checkpoint {b=-24,i=active} created at line 15; - current state is pruned by checkpoint created at line 16, this sets branches count for checkpoint at line 15 to 0; - jump to 24 is popped from stack; - line 16 is reached in state {b=-25,i=active}; - this is pruned by a previous checkpoint {b=-24,i=active}: - checkpoint's loop_entry is poisoned and has branch count of 0, hence states are compared using NOT_EXACT rules; - b is not marked precise yet. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250215110411.3236773-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 19:22:58 -08:00
Chandra Mohan Sundar	f29e41454b	selftests: net: Fix few spelling mistakes Fix few spelling mistakes in net selftests Signed-off-by: Chandra Mohan Sundar <chandru.dav@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250217141520.81033-1-chandru.dav@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-18 18:10:31 -08:00
Yan Zhai	d66b773917	selftests: bpf: test batch lookup on array of maps with holes Iterating through array of maps may encounter non existing keys. The batch operation should not fail on when this happens. Signed-off-by: Yan Zhai <yan@cloudflare.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/9007237b9606dc2ee44465a4447fe46e13f3bea6.1739171594.git.yan@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 17:27:37 -08:00
Bastien Curutchet (eBPF Foundation)	e06f5bfd93	selftests/bpf: Remove test_xdp_redirect_multi.sh The tests done by test_xdp_redirect_multi.sh are now fully covered by the CI through test_xdp_veth.c. Remove test_xdp_redirect_multi.sh Remove xdp_redirect_multi.c that was used by the script to load and attach the BPF programs. Remove their entries in the Makefile Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-6-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	a93bfd824d	selftests/bpf: test_xdp_veth: Add XDP program on egress test XDP programs loaded on egress is tested by test_xdp_redirect_multi.sh but not by the test_progs framework. Add a test case in test_xdp_veth.c to test the XDP program on egress. Use the same BPF program than test_xdp_redirect_multi.sh that replaces the source MAC address by one provided through a BPF map. Use a BPF program that stores the source MAC of received packets in a map to check the test results. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-5-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	1e7e634542	selftests/bpf: test_xdp_veth: Add XDP broadcast redirection tests XDP redirections with BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS flags are tested by test_xdp_redirect_multi.sh but not within the test_progs framework. Add a broadcast test case in test_xdp_veth.c to test them. Use the same BPF programs than the one used by test_xdp_redirect_multi.sh. Use a BPF map to select the broadcast flags. Use a BPF map with an entry per veth to check whether packets are received or not Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-4-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	09c8bb1fae	selftests/bpf: Optionally select broadcasting flags Broadcasting flags are hardcoded for each kind for protocol. Create a redirect_flags map that allows to select the broadcasting flags to use in the bpf_redirect_map(). The protocol ID is used as a key. Set the old hardcoded values as default if the map isn't filled by the BPF caller. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-3-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	19a9484c1b	selftests/bpf: test_xdp_veth: Use a dedicated namespace Tests use the root network namespace, so they aren't fully independent of each other. For instance, the index of the created veth interfaces is incremented every time a new test is launched. Wrap the network topology in a network namespace to ensure full isolation. Use the append_tid() helper to ensure the uniqueness of this namespace's name during parallel runs. Remove the use of the append_tid() on the veth names as they now belong to an already unique namespace. Simplify cleanup_network() by directly deleting the namespaces Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250212-redirect-multi-v5-2-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	6bdac0e317	selftests/bpf: test_xdp_veth: Create struct net_configuration The network configuration is defined by a table of struct veth_configuration. This isn't convenient if we want to add a network configuration that isn't linked to a veth pair. Create a struct net_configuration that holds the veth_configuration table to ease adding new configuration attributes in upcoming patch. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250212-redirect-multi-v5-1-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Brendan Jackman	43ebec94e1	kunit: tool: Build GDB scripts Following a similar rationale as commit `e4835f1da4` ("kunit: tool: Build compile_commands.json"), make a common developer tool available by default for KUnit users. Compared to compile_commands.json, there is a little more work to be done to build the GDB scripts. Is it enough to affect development cycle duration? Unscientific evaluation: rm -rf .kunit; time tools/testing/kunit/kunit.py build --kunitconfig ./lib/kunit/.kunitconfig --jobs 96 Without this patch it took 14.77s, with this patch it took 14.83. So, although `make scripts_gdb` is pretty slow, presumably most of that is just the overhead of running Kbuild at all, actually building the scripts is approximately free. Note also, to actually get the GDB scripts the user needs to enable CONFIG_SCRIPTS_GDB, but building the scripts_gdb target without that is still harmless. Link: https://lore.kernel.org/r/20250121-kunit-gdb-v1-1-faedfd0653ef@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-18 14:28:01 -07:00
Charlie Jenkins	42367eca76	tools: Remove redundant quiet setup Q is exported from Makefile.include so it is not necessary to manually set it. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Benjamin Tissoires <bentiss@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Lukasz Luba <lukasz.luba@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-2-07de4482a581@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-02-18 16:27:43 -03:00
Petr Machata	eae1e92a1d	selftests: test_vxlan_fdb_changelink: Add a test for MC remote change Changes to MC remote need to be reflected in actual group memberships. Add a test to verify that it is the case. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:44 +01:00
Petr Machata	24adf47ea9	selftests: test_vxlan_fdb_changelink: Convert to lib.sh Instead of inlining equivalents, use lib.sh-provided primitives. Use defer to manage vx lifetime. This will make it easier to extend the test in the next patch. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:44 +01:00
Petr Machata	f802f172d7	selftests: forwarding: lib: Move require_command to net, generalize This helper could be useful to more than just forwarding tests. Move it upstairs and port over to log_test_skip(). Split the function into two parts: the bit that actually checks and reports skip, which is in a new function check_command(). And a bit that exits the test script if the check fails. This allows users consistent checking behavior while giving an option to bail out from a single test without bailing out of the whole script. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:43 +01:00
Michal Luczaj	85928e9c43	selftest/bpf: Add vsock test for sockmap rejecting unconnected Verify that for a connectible AF_VSOCK socket, merely having a transport assigned is insufficient; socket must be connected for the sockmap to accept. This does not test datagram vsocks. Even though it hardly matters. VMCI is the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an unimplemented vsock_transport::readskb() callback, making it unsupported by BPF/sockmap. Signed-off-by: Michal Luczaj <mhal@rbox.co> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 12:00:01 +01:00
Michal Luczaj	8350695bfb	selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected Commit `515745445e` ("selftest/bpf: Add test for vsock removal from sockmap on close()") added test that checked if proto::close() callback was invoked on AF_VSOCK socket release. I.e. it verified that a close()d vsock does indeed get removed from the sockmap. It was done simply by creating a socket pair and attempting to replace a close()d one with its peer. Since, due to a recent change, sockmap does not allow updating index with a non-established connectible vsock, redo it with a freshly established one. Signed-off-by: Michal Luczaj <mhal@rbox.co> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 12:00:01 +01:00
Mark Brown	5dcf52e2ce	selftests/mm: fix check for running THP tests When testing if we should try to compact memory or drop caches before we run the THP or HugeTLB tests we use \| as an or operator. This doesn't work since run_vmtests.sh is written in shell where this is used to pipe the output of the first argument into the second. Instead use the shell's -o operator. Link: https://lkml.kernel.org/r/20250212-kselftest-mm-no-hugepages-v1-1-44702f538522@kernel.org Fixes: `b433ffa8db` ("selftests: mm: perform some system cleanup before using hugepages") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nico Pache <npache@redhat.com> Cc: Mariano Pache <npache@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-02-17 22:40:04 -08:00
Amery Hung	af17bad9fb	selftests/bpf: Test returning referenced kptr from struct_ops programs Test struct_ops programs returning referenced kptr. When the return type of a struct_ops operator is pointer to struct, the verifier should only allow programs that return a scalar NULL or a non-local kptr with the correct type in its unmodified form. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250217190640.1748177-6-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-17 18:47:27 -08:00
Amery Hung	6991ec6beb	selftests/bpf: Test referenced kptr arguments of struct_ops programs Test referenced kptr acquired through struct_ops argument tagged with "__ref". The success case checks whether 1) a reference to the correct type is acquired, and 2) the referenced kptr argument can be accessed in multiple paths as long as it hasn't been released. In the fail cases, we first confirm that a referenced kptr acquried through a struct_ops argument is not allowed to be leaked. Then, we make sure this new referenced kptr acquiring mechanism does not accidentally allow referenced kptrs to flow into global subprograms through their arguments. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250217190640.1748177-4-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-17 18:47:27 -08:00
Joe Damato	788e52e2b6	selftests: drv-net: Test queue xsk attribute Test that queues which are used for AF_XDP have the xsk nest attribute. The attribute is currently empty, but its existence means the AF_XDP is being used for the queue. Enable CONFIG_XDP_SOCKETS for selftests/drivers/net tests, as well. Signed-off-by: Joe Damato <jdamato@fastly.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250214211255.14194-4-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:46:03 -08:00
Anna Emese Nyiri	c935af429e	selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY Introduce tests to verify the correct functionality of the SO_RCVMARK and SO_RCVPRIORITY socket options. Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250214205828.48503-1-annaemesenyiri@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:45:19 -08:00
Pranav Tyagi	dbcbec81c9	selftests: net: fix grammar in reuseaddr_ports_exhausted.c log message This patch fixes a grammatical error in a test log message in reuseaddr_ports_exhausted.c for better clarity. Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250213152612.4434-1-pranav.tyagi03@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:34:36 -08:00
Yury Khrustalev	00894c3fc9	selftests/powerpc: Use PKEY_UNRESTRICTED macro Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions are used in mm selftests for memory protection keys for ppc target. Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com> Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20250113170619.484698-4-yury.khrustalev@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-02-17 18:16:36 +00:00
Yury Khrustalev	3809cefe93	selftests/mm: Use PKEY_UNRESTRICTED macro Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions are used in mm selftests for memory protection keys. Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com> Suggested-by: Joey Gouly <joey.gouly@arm.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250113170619.484698-3-yury.khrustalev@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-02-17 18:16:36 +00:00
David Wei	71082faa2c	io_uring/zcrx: add selftest Add a selftest for io_uring zero copy Rx. This test cannot run locally and requires a remote host to be configured in net.config. The remote host must have hardware support for zero copy Rx as listed in the documentation page. The test will restore the NIC config back to before the test and is idempotent. liburing is required to compile the test and be installed on the remote host running the test. Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-17 05:41:09 -07:00
Jens Axboe	5c496ff11d	Merge commit '71f0dd5a3293d75d26d405ffbaedfdda4836af32' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc Merge networking zerocopy receive tree, to get the prep patches for the io_uring rx zc support. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (63 commits) net: add helpers for setting a memory provider on an rx queue net: page_pool: add memory provider helpers net: prepare for non devmem TCP memory providers net: page_pool: add a mp hook to unregister_netdevice* net: page_pool: add callback for mp info printing netdev: add io_uring memory provider info net: page_pool: create hooks for custom memory providers net: generalise net_iov chunk owners net: prefix devmem specific helpers net: page_pool: don't cast mp param to devmem tools: ynl: add all headers to makefile deps eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs eth: fbnic: add MAC address TCAM to debugfs tools: ynl-gen: support limits using definitions tools: ynl-gen: don't output external constants net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled net/mlx5e: Remove unused mlx5e_tc_flow_action struct net/mlx5: Remove stray semicolon in LAG port selection table creation net/mlx5e: Support FEC settings for 200G per lane link modes net/mlx5: Add support for 200Gbps per lane link modes ...	2025-02-17 05:38:28 -07:00
Greg Kroah-Hartman	2ce177e9b3	Merge 6.14-rc3 into driver-core-next We need the faux_device changes in here for future work. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 07:24:33 +01:00
Linus Torvalds	82ff316456	ARM: - Large set of fixes for vector handling, specially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours. - Fix use of kernel VAs at EL2 when emulating timers with nVHE. - Small set of pKVM improvements and cleanups. x86: - Fix broken SNP support with KVM module built-in, ensuring the PSP module is initialized before KVM even when the module infrastructure cannot be used to order initcalls - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference. - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1. - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmev2ykUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMvxwf/bw2u08moAYWAjJLROFvfiKXnznLS iqJ2+jcw0lJ7wDqm4Zw8M5t74Rd+y5yzkLkZOyjav9yBB09zRkItiTHljCNMOQnt 2QptBa3CUN8N+rNnvVRt6dMkhw7z6n7eoFRSIDY2Y9PgiTapbFXPV1gFkMPO6+0f SyF4LCr0iuDkJdvGAZJAH/Mp8nG6dv/A6a+Q+R1RkbKn9c2OdWw4VMfhIzimFGN6 0RFjbfXXvyO0aU/W/VHwvvuhcjGkAZWfHDdaTXqbvSMhayW562UPVMVBwXdVBmDj Dk1gCKcbm4WyktbXYW6iOYj3MgdK96eI24ozps4R0aDexsrTRY4IfH4KEg== =20Ql -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM: - Large set of fixes for vector handling, especially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours - Fix use of kernel VAs at EL2 when emulating timers with nVHE - Small set of pKVM improvements and cleanups x86: - Fix broken SNP support with KVM module built-in, ensuring the PSP module is initialized before KVM even when the module infrastructure cannot be used to order initcalls - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1 - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) x86/sev: Fix broken SNP support with KVM module built-in KVM: SVM: Ensure PSP module is initialized if KVM module is built-in crypto: ccp: Add external API interface for PSP module initialization KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic() KVM: arm64: timer: Drop warning on failed interrupt signalling KVM: arm64: Fix alignment of kvm_hyp_memcache allocations KVM: arm64: Convert timer offset VA when accessed in HYP code KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp() KVM: arm64: Eagerly switch ZCR_EL{1,2} KVM: arm64: Mark some header functions as inline KVM: arm64: Refactor exit handlers KVM: arm64: Refactor CPTR trap deactivation KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN KVM: arm64: Remove host FPSIMD saving for non-protected KVM KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop KVM: nSVM: Enter guest mode before initializing nested NPT MMU KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper ...	2025-02-16 10:25:12 -08:00
Thomas Weißschuh	e275f44e0a	kunit: qemu_configs: sparc: use Zilog console The driver for the 8250 console is not used, as no port is found. Instead the prom0 bootconsole is used the whole time. The prom driver translates '\n' to '\r\n' before handing of the message off to the firmware. The firmware performs the same translation again. In the final output produced by QEMU each line ends with '\r\r\n'. This breaks the kunit parser, which can only handle '\r\n' and '\n'. Use the Zilog console instead. It works correctly, is the one documented by the QEMU manual and also saves a bit of codesize: Before=4051011, After=4023326, chg -0.68% Observed on QEMU 9.2.0. Link: https://lore.kernel.org/r/20250214-kunit-qemu-sparc-console-v1-1-ba1dfdf8f0b1@linutronix.de Fixes: `87c9c16317` ("kunit: tool: add support for QEMU") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-15 19:06:56 -07:00
Brendan Jackman	08fafac4c9	kunit: tool: Use qboot on QEMU x86_64 As noted in [0], SeaBIOS (QEMU default) makes a mess of the terminal, qboot does not. It turns out this is actually useful with kunit.py, since the user is exposed to this issue if they set --raw_output=all. qboot is also faster than SeaBIOS, but it's is marginal for this usecase. [0] https://lore.kernel.org/all/CA+i-1C0wYb-gZ8Mwh3WSVpbk-LF-Uo+njVbASJPe1WXDURoV7A@mail.gmail.com/ Both SeaBIOS and qboot are x86-specific. Link: https://lore.kernel.org/r/20250124-kunit-qboot-v1-1-815e4d4c6f7c@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-15 19:06:41 -07:00
Sebastian Andrzej Siewior	633488947e	kernfs: Use RCU to access kernfs_node::parent. kernfs_rename_lock is used to obtain stable kernfs_node::{name\|parent} pointer. This is a preparation to access kernfs_node::parent under RCU and ensure that the pointer remains stable under the RCU lifetime guarantees. For a complete path, as it is done in kernfs_path_from_node(), the kernfs_rename_lock is still required in order to obtain a stable parent relationship while computing the relevant node depth. This must not change while the nodes are inspected in order to build the path. If the kernfs user never moves the nodes (changes the parent) then the kernfs_rename_lock is not required and the RCU guarantees are sufficient. This "restriction" can be set with KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required. Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU access and use RCU accessor while accessing the node. Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can not change. Acked-by: Tejun Heo <tj@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-15 17:46:32 +01:00
Andrii Nakryiko	4eb93fea59	selftests/bpf: add test for LDX/STX/ST relocations over array field Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE relocation size adjustment when the CO-RE relocation target type is an ARRAY. We need to make sure that compiler generates LDX/STX/ST instruction with CO-RE relocation against entire ARRAY type, not ARRAY's element. With the code pattern in selftest, we get this: 59: 61 71 00 00 00 00 00 00 w1 = (u32 )(r7 + 0x0) 00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0) Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory load instruction itself. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250207014809.1573841-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:58:14 -08:00
Jiayuan Chen	72266ee83f	selftests/bpf: Add selftest for may_goto Added test cases to ensure that programs with stack sizes exceeding 512 bytes are restricted in non-JITed mode, and can be executed normally in JITed mode, even with stack sizes exceeding 512 bytes due to the presence of may_goto instructions. Test result: echo "0" > /proc/sys/net/core/bpf_jit_enable ./test_progs -t verifier_stack_ptr ... stack size 512 with may_goto with jit:SKIP stack size 512 with may_goto without jit:OK ... Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED echo "1" > /proc/sys/net/core/bpf_jit_enable ./test_progs -t verifier_stack_ptr ... stack size 512 with may_goto with jit:OK stack size 512 with may_goto without jit:SKIP ... Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20250214091823.46042-4-mrpre@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:55:15 -08:00
Jiayuan Chen	b38c72ab80	selftests/bpf: Introduce __load_if_JITed annotation for tests In some cases, the verification logic under the interpreter and JIT differs, such as may_goto, and the test program behaves differently under different runtime modes, requiring separate verification logic for each result. Introduce __load_if_JITed and __load_if_no_JITed annotation for tests. Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20250214091823.46042-3-mrpre@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:55:15 -08:00
Stafford Horne	713e788c0e	rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm When working on OpenRISC support for restartable sequences I noticed and fixed these two issues with the riscv support bits. 1 The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly passed to the macro. Fix this by adding 'inc' to the list of macro arguments. 2 The inline asm input constraints for 'inc' and 'off' use "er", The riscv gcc port does not have an "e" constraint, this looks to be copied from the x86 port. Fix this by just using an "r" constraint. I have compile tested this only for riscv. However, the same fixes I use in the OpenRISC rseq selftests and everything passes with no issues. Fixes: `171586a6ab` ("selftests/rseq: riscv: Template memory ordering and percpu access mode") Signed-off-by: Stafford Horne <shorne@gmail.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250114170721.3613280-1-shorne@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-02-14 13:06:36 -08:00
Linus Torvalds	04f41cbf03	sched_ext: Fixes for v6.14-rc2 - Fix lock imbalance in a corner case of dispatch_to_local_dsq(). - Migration disabled tasks were confusing some BPF schedulers and its handling had a bug. Fix it and simplify the default behavior by dispatching them automatically. - ops.tick(), ops.disable() and ops.exit_task() were incorrectly disallowing kfuncs that require the task argument to be the rq operation is currently operating on and thus is rq-locked. Allow them. - Fix autogroup migration handling bug which was occasionally triggering a warning in the cgroup migration path. - tools/sched_ext, selftest and other misc updates. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ695uA4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGeCfAQDmUixMNJCIrRphYsWcYUzlGLZyyRpQEEYFtRMO UC266gD+PUV2UvuO5sAVH8HVnGdOqkXaE/IRG+TC7fQH3ruPlgI= =LFd1 -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix lock imbalance in a corner case of dispatch_to_local_dsq() - Migration disabled tasks were confusing some BPF schedulers and its handling had a bug. Fix it and simplify the default behavior by dispatching them automatically - ops.tick(), ops.disable() and ops.exit_task() were incorrectly disallowing kfuncs that require the task argument to be the rq operation is currently operating on and thus is rq-locked. Allow them. - Fix autogroup migration handling bug which was occasionally triggering a warning in the cgroup migration path - tools/sched_ext, selftest and other misc updates * tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h. sched_ext: selftests: Fix grammar in tests description sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq() sched_ext: Fix migration disabled handling in targeted dispatches sched_ext: Implement auto local dispatching of migration disabled tasks sched_ext: Fix incorrect time delta calculation in time_delta() sched_ext: Fix lock imbalance in dispatch_to_local_dsq() sched_ext: selftests/dsp_local_on: Fix selftest on UP systems tools/sched_ext: Add helper to check task migration state sched_ext: Fix incorrect autogroup migration detection sched_ext: selftests/dsp_local_on: Fix sporadic failures selftests/sched_ext: Fix enum resolution sched_ext: Include task weight in the error state dump sched_ext: Fixes typos in comments	2025-02-14 11:14:24 -08:00
Linus Torvalds	80868f5d3d	cgroup: Fixes for v6.14-rc2 - Fix a race window where a newly forked task could escape cgroup.kill. - Remove incorrectly included steal time from cpu.stat::usage_usec. - Minor update in selftest. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ6928A4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGYrbAPsEtoH5GFw7VtKIy4fS23QbtxUuwW0fERrPWGyt JtQv3gD5AboBUrGWdgiM5c2bIXT3f+Bn9w3HhLiPaB8ieN/0kA8= =3BBH -----END PGP SIGNATURE----- Merge tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix a race window where a newly forked task could escape cgroup.kill - Remove incorrectly included steal time from cpu.stat::usage_usec - Minor update in selftest * tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Remove steal time from usage_usec selftests/cgroup: use bash in test_cpuset_v1_hp.sh cgroup: fix race between fork and cgroup.kill	2025-02-14 11:00:42 -08:00
Sean Christopherson	16fc7cb406	KVM: selftests: Add infrastructure for getting vCPU binary stats Now that the binary stats cache infrastructure is largely scope agnostic, add support for vCPU-scoped stats. Like VM stats, open and cache the stats FD when the vCPU is created so that it's guaranteed to be valid when vcpu_get_stats() is invoked. Account for the extra per-vCPU file descriptor in kvm_set_files_rlimit(), so that tests that create large VMs don't run afoul of resource limits. To sanity check that the infrastructure actually works, and to get a bit of bonus coverage, add an assert in x86's xapic_ipi_test to verify that the number of HLTs executed by the test matches the number of HLT exits observed by KVM. Tested-by: Manali Shukla <Manali.Shukla@amd.com> Link: https://lore.kernel.org/r/20250111005049.1247555-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:13 -08:00
Sean Christopherson	9b56532b8a	KVM: selftests: Adjust number of files rlimit for all "standard" VMs Move the max vCPUs test's RLIMIT_NOFILE adjustments to common code, and use the new helper to adjust the resource limit for non-barebones VMs by default. x86's recalc_apic_map_test creates 512 vCPUs, and a future change will open the binary stats fd for all vCPUs, which will put the recalc APIC test above some distros' default limit of 1024. Link: https://lore.kernel.org/r/20250111005049.1247555-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:12 -08:00
Sean Christopherson	ea7179f995	KVM: selftests: Get VM's binary stats FD when opening VM Get and cache a VM's binary stats FD when the VM is opened, as opposed to waiting until the stats are first used. Opening the stats FD outside of __vm_get_stat() will allow converting it to a scope-agnostic helper. Note, this doesn't interfere with kvm_binary_stats_test's testcase that verifies a stats FD can be used after its own VM's FD is closed, as the cached FD is also closed during kvm_vm_free(). Link: https://lore.kernel.org/r/20250111005049.1247555-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:11 -08:00
Sean Christopherson	e65faf71bd	KVM: selftests: Add struct and helpers to wrap binary stats cache Add a struct and helpers to manage the binary stats cache, which is currently used only for VM-scoped stats. This will allow expanding the selftests infrastructure to provide support for vCPU-scoped binary stats, which, except for the ioctl to get the stats FD are identical to VM-scoped stats. Defer converting __vm_get_stat() to a scope-agnostic helper to a future patch, as getting the stats FD from KVM needs to be moved elsewhere before it can be made completely scope-agnostic. Link: https://lore.kernel.org/r/20250111005049.1247555-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:09 -08:00
Sean Christopherson	b0c3f5df92	KVM: selftests: Macrofy vm_get_stat() to auto-generate stat name string Turn vm_get_stat() into a macro that generates a string for the stat name, as opposed to taking a string. This will allow hardening stat usage in the future to generate errors on unknown stats at compile time. No functional change intended. Link: https://lore.kernel.org/r/20250111005049.1247555-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:01:55 -08:00
Sean Christopherson	eead13d493	KVM: selftests: Assert that __vm_get_stat() actually finds a stat Fail the test if it attempts to read a stat that doesn't exist, e.g. due to a typo (hooray, strings), or because the test tried to get a stat for the wrong scope. As is, there's no indiciation of failure and @data is left untouched, e.g. holds '0' or random stack data in most cases. Fixes: `8448ec5993` ("KVM: selftests: Add NX huge pages test") Link: https://lore.kernel.org/r/20250111005049.1247555-4-seanjc@google.com [sean: fixup spelling mistake, courtesy of Colin Ian King] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:01:27 -08:00
Bharadwaj Raju	78332fdb95	selftests/landlock: Add binaries to .gitignore Building the test creates binaries 'wait-pipe' and 'sandbox-and-launch' which need to be gitignore'd. Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Link: https://lore.kernel.org/r/20250210161101.6024-1-bharadwaj.raju777@gmail.com [mic: Sort entries] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:11 +01:00
Mikhail Ivanov	3d4033985f	selftests/landlock: Test that MPTCP actions are not restricted Extend protocol fixture with test suits for MPTCP protocol. Add CONFIG_MPTCP and CONFIG_MPTCP_IPV6 options in config. Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Link: https://lore.kernel.org/r/20250205093651.1424339-4-ivanov.mikhail1@huawei-partners.com Cc: <stable@vger.kernel.org> # 6.7.x Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:10 +01:00
Mikhail Ivanov	f5534d511b	selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP Extend protocol_variant structure with protocol field (Cf. socket(2)). Extend protocol fixture with TCP test suits with protocol=IPPROTO_TCP which can be used as an alias for IPPROTO_IP (=0) in socket(2). Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Link: https://lore.kernel.org/r/20250205093651.1424339-3-ivanov.mikhail1@huawei-partners.com Cc: <stable@vger.kernel.org> # 6.7.x Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:09 +01:00
Mickaël Salaün	89cb121e94	selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB Since commit `5155cbcdbf` ("af_unix: Add a prompt to CONFIG_AF_UNIX_OOB"), the Landlock selftests's configuration is not enough to build a minimal kernel. Because scoped_signal_test checks with the MSG_OOB flag, we need to enable CONFIG_AF_UNIX_OOB for tests: # RUN fown.no_sandbox.sigurg_socket ... # scoped_signal_test.c:420:sigurg_socket:Expected 1 (1) == send(client_socket, ".", 1, MSG_OOB) (-1) # sigurg_socket: Test terminated by assertion # FAIL fown.no_sandbox.sigurg_socket ... Cc: Günther Noack <gnoack@google.com> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/r/20250211132531.1625566-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:06 +01:00
Song Liu	60c2e1fa91	selftests/bpf: Test kfuncs that set and remove xattr from BPF programs Two sets of tests are added to exercise the not _locked and _locked version of the kfuncs. For both tests, user space accesses xattr security.bpf.foo on a testfile. The BPF program is triggered by user space access (on LSM hook inode_[set\|get]_xattr) and sets or removes xattr security.bpf.bar. Then user space then validates that xattr security.bpf.bar is set or removed as expected. Note that, in both tests, the BPF programs use the not _locked kfuncs. The verifier picks the proper kfuncs based on the calling context. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250130213549.3353349-6-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-13 19:35:32 -08:00
Song Liu	ab39ad6796	selftests/bpf: Extend test fs_kfuncs to cover security.bpf. xattr names Extend test_progs fs_kfuncs to cover different xattr names. Specifically: xattr name "user.kfuncs" and "security.bpf.xxx" can be read from BPF program with kfuncs bpf_get_[file\|dentry]_xattr(); while "security.bpf" and "security.selinux" cannot be read. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250130213549.3353349-3-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-13 19:35:31 -08:00
Amery Hung	b99f27e902	selftests/bpf: Fix stdout race condition in traffic monitor Fix a race condition between the main test_progs thread and the traffic monitoring thread. The traffic monitor thread tries to print a line using multiple printf and use flockfile() to prevent the line from being torn apart. Meanwhile, the main thread doing io redirection can reassign or close stdout when going through tests. A deadlock as shown below can happen. main traffic_monitor_thread ==== ====================== show_transport() -> flockfile(stdout) stdio_hijack_init() -> stdout = open_memstream(log_buf, log_cnt); ... env.subtest_state->stdout_saved = stdout; ... funlockfile(stdout) stdio_restore_cleanup() -> fclose(env.subtest_state->stdout_saved); After the traffic monitor thread lock stdout, A new memstream can be assigned to stdout by the main thread. Therefore, the traffic monitor thread later will not be able to unlock the original stdout. As the main thread tries to access the old stdout, it will hang indefinitely as it is still locked by the traffic monitor thread. The deadlock can be reproduced by running test_progs repeatedly with traffic monitor enabled: for ((i=1;i<=100;i++)); do ./test_progs -a flow_dissector_skb* -m '*' done Fix this by only calling printf once and remove flockfile()/funlockfile(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250213233217.553258-1-ameryhung@gmail.com	2025-02-13 17:06:25 -08:00
Jakub Kicinski	7a7e019713	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-13 12:43:30 -08:00
Linus Torvalds	348f968b89	Including fixes from netfilter, wireless and bluetooth. Kalle Valo steps down after serving as the WiFi driver maintainer for over a decade. Current release - fix to a fix: - vsock: orphan socket after transport release, avoid null-deref - Bluetooth: L2CAP: fix corrupted list in hci_chan_del Current release - regressions: - eth: stmmac: correct Rx buffer layout when SPH is enabled - rxrpc: fix alteration of headers whilst zerocopy pending - eth: iavf: fix a locking bug in an error path - s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH - Revert "netfilter: flowtable: teardown flow if cached mtu is stale" Current release - new code bugs: - rxrpc: fix ipv6 path MTU discovery, only ipv4 worked - pse-pd: fix deadlock in current limit functions Previous releases - regressions: - rtnetlink: fix netns refleak with rtnl_setlink() - wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364 firmware Previous releases - always broken: - add missing RCU protection of struct net throughout the stack - can: rockchip: bail out if skb cannot be allocated - eth: ti: am65-cpsw: base XDP support fixes Misc: - ethtool: tsconfig: update the format of hwtstamp flags, changes the uAPI but this uAPI was not in any release yet Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmeuOoMACgkQMUZtbf5S Irs43w/+M9ZRNqU7O1CK6kykgAkuWCZvfVvvKFCTKcxg+ibrelzn78m1CtFWfpci aZ6meM65XQ/9a8y4VQgjMs8dDPSQphLXjXTLRtMLCEbL6Wakg6pobj/Rb/Ya4p9T 4Ao7VrCRzAbGDv/M4NXSuqVxc1YYBXA3i3FSKR913cMsYeTMOYLRvjsB8ZvSJOgK Qs4iGYz3D/oO0KRHVWpzn1DUxQhwoqSjU1lFeuLc+1yxDmicqqkKnnP0A6DbN4zd /JMVkM1ysAqKh4HNDdurMhy5D42Gdc/W/QfLJiNtpohNO2wItR9cs+Nn4TMnCvpF DK4tS1z5V60S/t0G8isVAjtZYGcBL2hlC94H/8m/FztUdoVew2vaAlD3Fa2i0ED0 7Q9vzNHUfUSYfI2QLYJC+QHXBkzSiU18uK3zIFq/sIOZJco1Wz8Aevg78c2JV+Qi nZLyyeH73Yt1lINBK5/td+KBFSVlMI5gAAGMVuH4+djLOOQohiufe5uKVlU/vjZ0 o2YTiIXRwhvAe2QIbjbZV1y4xhWLrH+NXhKYzRZZfTgx2wjLFVovjCfodQjskjNk fVp5wg3m9qKO3MkEN+NTFcnZytUmpNgC1LqXLEMCOfccU+vDqQ+0xylNhEipSuWd PW5c/dcoVnDARvxGVp1wWtEy83kF1RefgIp++wqSzSqisr2l2l4= =7lE+ -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wireless and bluetooth. Kalle Valo steps down after serving as the WiFi driver maintainer for over a decade. Current release - fix to a fix: - vsock: orphan socket after transport release, avoid null-deref - Bluetooth: L2CAP: fix corrupted list in hci_chan_del Current release - regressions: - eth: - stmmac: correct Rx buffer layout when SPH is enabled - iavf: fix a locking bug in an error path - rxrpc: fix alteration of headers whilst zerocopy pending - s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH - Revert "netfilter: flowtable: teardown flow if cached mtu is stale" Current release - new code bugs: - rxrpc: fix ipv6 path MTU discovery, only ipv4 worked - pse-pd: fix deadlock in current limit functions Previous releases - regressions: - rtnetlink: fix netns refleak with rtnl_setlink() - wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364 firmware Previous releases - always broken: - add missing RCU protection of struct net throughout the stack - can: rockchip: bail out if skb cannot be allocated - eth: ti: am65-cpsw: base XDP support fixes Misc: - ethtool: tsconfig: update the format of hwtstamp flags, changes the uAPI but this uAPI was not in any release yet" * tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) net: pse-pd: Fix deadlock in current limit functions rxrpc: Fix ipv6 path MTU discovery Reapply "net: skb: introduce and use a single page frag cache" s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw() ipv6: mcast: add RCU protection to mld_newpack() team: better TEAM_OPTION_TYPE_STRING validation Bluetooth: L2CAP: Fix corrupted list in hci_chan_del Bluetooth: btintel_pcie: Fix a potential race condition Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases vsock/test: Add test for SO_LINGER null ptr deref vsock: Orphan socket after transport release MAINTAINERS: Add sctp headers to the general netdev entry Revert "netfilter: flowtable: teardown flow if cached mtu is stale" iavf: Fix a locking bug in an error path rxrpc: Fix alteration of headers whilst zerocopy pending net: phylink: make configuring clock-stop dependent on MAC support ...	2025-02-13 12:17:04 -08:00
Devaansh Kumar	0760d62dad	sched_ext: selftests: Fix grammar in tests description Fixed grammar for a few tests of sched_ext. Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-02-13 06:46:22 -10:00
Michal Luczaj	440c9d4887	vsock/test: Add test for SO_LINGER null ptr deref Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER enabled. As for now, test does not verify if close() actually lingers. On an unpatched machine, may trigger a null pointer dereference. Tested-by: Luigi Leonardi <leonardi@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-2-ef6244d02b54@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-12 20:01:29 -08:00
Tamir Duberstein	313b38a6ec	lib/prime_numbers: convert self-test to KUnit Extract a private header and convert the prime_numbers self-test to a KUnit test. I considered parameterizing the test using `KUNIT_CASE_PARAM` but didn't see how it was possible since the test logic is entangled with the test parameter generation logic. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250208-prime_numbers-kunit-convert-v5-2-b0cb82ae7c7d@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-12 14:00:11 -08:00
Thomas Weißschuh	16681bea9a	selftests/nolibc: split up architecture list in run-tests.sh The list is getting overly long and any modifications introduce a lot of noise and are prone to conflicts. Split the string into a bash array and break that into multiple lines. Link: https://lore.kernel.org/r/20250211-nolibc-test-archs-v1-1-8e55aa3369cf@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-12 18:57:04 +01:00
Sean Christopherson	f7f232a01f	KVM: selftests: Close VM's binary stats FD when releasing VM Close/free a VM's binary stats cache when the VM is released, not when the VM is fully freed. When a VM is re-created, e.g. for state save/restore tests, the stats FD and descriptor points at the old, defunct VM. The FD is still valid, in that the underlying stats file won't be freed until the FD is closed, but reading stats will always pull information from the old VM. Note, this is a benign bug in the current code base as none of the tests that recreate VMs use binary stats. Fixes: `83f6e109f5` ("KVM: selftests: Cache binary stats metadata for duration of test") Link: https://lore.kernel.org/r/20250111005049.1247555-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:02:42 -08:00
Sean Christopherson	fd546aba19	KVM: selftests: Fix mostly theoretical leak of VM's binary stats FD When allocating and freeing a VM's cached binary stats info, check for a NULL descriptor, not a '0' file descriptor, as '0' is a legal FD. E.g. in the unlikely scenario the kernel installs the stats FD at entry '0', selftests would reallocate on the next __vm_get_stat() and/or fail to free the stats in kvm_vm_free(). Fixes: `83f6e109f5` ("KVM: selftests: Cache binary stats metadata for duration of test") Link: https://lore.kernel.org/r/20250111005049.1247555-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:02:42 -08:00
Sean Christopherson	dae7d81e8d	KVM: selftests: Allow running a single iteration of dirty_log_test Now that dirty_log_test doesn't require running multiple iterations to verify dirty pages, and actually runs the requested number of iterations, drop the requirement that the test run at least "3" (which was really "2" at the time the test was written) iterations. Link: https://lore.kernel.org/r/20250111003004.1235645-21-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	7f225650e0	KVM: selftests: Fix an off-by-one in the number of dirty_log_test iterations Actually run all requested iterations, instead of iterations-1 (the count starts at '1' due to the need to avoid '0' as an in-memory value for a dirty page). Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-20-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	2680dcfb34	KVM: selftests: Set per-iteration variables at the start of each iteration Set the per-iteration variables at the start of each iteration instead of setting them before the loop, and at the end of each iteration. To ensure the vCPU doesn't race ahead before the first iteration, simply have the vCPU worker want for sem_vcpu_cont, which conveniently avoids the need to special case posting sem_vcpu_cont from the loop. Link: https://lore.kernel.org/r/20250111003004.1235645-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	2020d3b77a	KVM: selftests: Tighten checks around prev iter's last dirty page in ring Now that each iteration collects all dirty entries and ensures the guest completes at least one write, tighten the exemptions for the last dirty page of the previous iteration. Specifically, the only legal value (other than the current iteration) is N-1. Unlike the last page for the current iteration, the in-progress write from the previous iteration is guaranteed to have completed, otherwise the test would have hung. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	73eaa2aa14	KVM: selftests: Ensure guest writes min number of pages in dirty_log_test Ensure the vCPU fully completes at least one write in each dirty_log_test iteration, as failure to dirty any pages complicates verification and forces the test to be overly conservative about possible values. E.g. verification needs to allow the last dirty page from a previous iteration to have any value, because the vCPU could get stuck for multiple iterations, which is unlikely but can happen in heavily overloaded and/or nested virtualization setups. Somewhat arbitrarily set the minimum to 0x100/256; high enough to be interesting, but not so high as to lead to pointlessly long runtimes. Opportunistically report the number of writes per iteration for debug purposes, and so that a human can sanity check the test. Due to each write targeting a random page, the number of dirty pages will likely be lower than the number of total writes, but it shouldn't be absurdly lower (which would suggest the pRNG is broken) Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	485e27ed20	KVM: sefltests: Verify value of dirty_log_test last page isn't bogus Add a sanity check that a completely garbage value wasn't written to the last dirty page in the ring, e.g. that it doesn't contain the next iteration's value. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	d0bd72cb91	KVM: selftests: Collect all dirty entries in each dirty_log_test iteration Collect all dirty entries during each iteration of dirty_log_test by doing a final collection after the vCPU has been stopped. To deal with KVM's destructive approach to getting the dirty bitmaps, use a second bitmap for the post-stop collection. Collecting all entries that were dirtied during an iteration simplifies the verification logic and improves test coverage. - If a page is written during iteration X, but not seen as dirty until X+1, the test can get a false pass if the page is also written during X+1. - If a dirty page used a stale value from a previous iteration, the test would grant a false pass. - If a missed dirty log occurs in the last iteration, the test would fail to detect the issue. E.g. modifying mark_page_dirty_in_slot() to dirty an unwritten gfn: if (memslot && kvm_slot_dirty_track_enabled(memslot)) { unsigned long rel_gfn = gfn - memslot->base_gfn; u32 slot = (memslot->as_id << 16) \| memslot->id; if (!vcpu->extra_dirty && gfn_to_memslot(kvm, gfn + 1) == memslot) { vcpu->extra_dirty = true; mark_page_dirty_in_slot(kvm, memslot, gfn + 1); } if (kvm->dirty_ring_size && vcpu) kvm_dirty_ring_push(vcpu, slot, rel_gfn); else if (memslot->dirty_bitmap) set_bit_le(rel_gfn, memslot->dirty_bitmap); } isn't detected with the current approach, even with an interval of 1ms (when running nested in a VM; bare metal would be even less likely to detect the bug due to the vCPU being able to dirty more memory). Whereas collecting all dirty entries consistently detects failures with an interval of 700ms or more (the longer interval means a higher probability of an actual write to the prematurely-dirtied page). Link: https://lore.kernel.org/r/20250111003004.1235645-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	24b9a2a613	KVM: selftests: Print (previous) last_page on dirty page value mismatch Print out the last dirty pages from the current and previous iteration on verification failures. In many cases, bugs (especially test bugs) occur on the edges, i.e. on or near the last pages, and being able to correlate failures with the last pages can aid in debug. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:56 -08:00
Sean Christopherson	c616f36a10	KVM: selftests: Use continue to handle all "pass" scenarios in dirty_log_test When verifying pages in dirty_log_test, immediately continue on all "pass" scenarios to make the logic consistent in how it handles pass vs. fail. No functional change intended. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	9a91f65424	KVM: selftests: Post to sem_vcpu_stop if and only if vcpu_stop is true When running dirty_log_test using the dirty ring, post to sem_vcpu_stop only when the main thread has explicitly requested that the vCPU stop. Synchronizing the vCPU and main thread whenever the dirty ring happens to be full is unnecessary, as KVM's ABI is to actively prevent the vCPU from running until the ring is no longer full. I.e. attempting to run the vCPU will simply result in KVM_EXIT_DIRTY_RING_FULL without ever entering the guest. And if KVM doesn't exit, e.g. let's the vCPU dirty more pages, then that's a KVM bug worth finding. Posting to sem_vcpu_stop on ring full also makes it difficult to get the test logic right, e.g. it's easy to let the vCPU keep running when it shouldn't, as a ring full can essentially happen at any given time. Opportunistically rework the handling of dirty_ring_vcpu_ring_full to leave it set for the remainder of the iteration in order to simplify the surrounding logic. Link: https://lore.kernel.org/r/20250111003004.1235645-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	0a818b3541	KVM: selftests: Keep dirty_log_test vCPU in guest until it needs to stop In the dirty_log_test guest code, exit to userspace only when the vCPU is explicitly told to stop. Periodically exiting just to check if a flag has been set is unnecessary, weirdly complex, and wastes time handling exits that could be used to dirty memory. Opportunistically convert 'i' to a uint64_t to guard against the unlikely scenario that guest_num_pages exceeds the storage of an int. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	f3629c0ef1	KVM: selftests: Honor "stop" request in dirty ring test Now that the vCPU doesn't dirty every page on the first iteration for architectures that support the dirty ring, honor vcpu_stop in the dirty ring's vCPU worker, i.e. stop when the main thread says "stop". This will allow plumbing vcpu_stop into the guest so that the vCPU doesn't need to periodically exit to userspace just to see if it should stop. Add a comment explaining that marking all pages as dirty is problematic for the dirty ring, as it results in the guest getting stuck on "ring full". This could be addressed by adding a GUEST_SYNC() in that initial loop, but it's not clear how that would interact with s390's behavior. Link: https://lore.kernel.org/r/20250111003004.1235645-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Maxim Levitsky	deb8b8400e	KVM: selftests: Limit dirty_log_test's s390x workaround to s390x s390 specific workaround causes the dirty-log mode of the test to dirty all guest memory on the first iteration, which is very slow when the test is run in a nested VM. Limit this workaround to s390x. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	9b1feec83e	KVM: selftests: Continuously reap dirty ring while vCPU is running Continue collecting entries from the dirty ring for the entire time the vCPU is running. Collecting exactly once all but guarantees the vCPU will encounter a "ring full" event and stop. While testing ring full is interesting, stopping and doing nothing is not, especially for larger intervals as the test effectively does nothing for a much longer time. To balance continuous collection with letting the guest make forward progress, chunk the interval waiting into 1ms loops (which also makes the math dead simple). To maintain coverage for "ring full", collect entries on subsequent iterations if and only if the ring has been filled at least once. I.e. let the ring fill up (if the interval allows), but after that contiuously empty it so that the vCPU can keep running. Opportunistically drop unnecessary zero-initialization of "count". Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	f2228aa083	KVM: selftests: Read per-page value into local var when verifying dirty_log_test Cache the page's value during verification in a local variable, re-reading from the pointer is ugly and error prone, e.g. allows for bugs like checking the pointer itself instead of the value. No functional change intended. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	af2d85d34d	KVM: selftests: Precisely track number of dirty/clear pages for each iteration Track and print the number of dirty and clear pages for each iteration. This provides parity between all log modes, and will allow collecting the dirty ring multiple times per iteration without spamming the console. Opportunistically drop the "Dirtied N pages" print, which is redundant and wrong. For the dirty ring testcase, the vCPU isn't guaranteed to complete a loop. And when the vCPU does complete a loot, there are no guarantees that it has dirtied that many pages; because the writes are to random address, the vCPU may have written the same page over and over, i.e. only dirtied one page. While the number of writes performed by the vCPU is also interesting, e.g. the pr_info() could be tweaked to use different verbiage, pages_count doesn't correctly track the number of writes either (because loops aren't guaranteed to a complete). Delete the print for now, as a future patch will precisely track the number of writes, at which point the verification phase can report the number of writes performed by each iteration. Link: https://lore.kernel.org/r/20250111003004.1235645-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	1230907864	KVM: selftests: Drop stale srandom() initialization from dirty_log_test Drop an srandom() initialization that was leftover from the conversion to use selftests' guest_random_xxx() APIs. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	ff0efc77bc	KVM: selftests: Drop signal/kick from dirty ring testcase Drop the signal/kick from dirty_log_test's dirty ring handling, as kicking the vCPU adds marginal value, at the cost of adding significant complexity to the test. Asynchronously interrupting the vCPU isn't novel; unless the kernel is fully tickless, the vCPU will be interrupted by IRQs for any decently large interval. And exiting to userspace mode in the middle of a sequence isn't novel either, as the vCPU will do so every time the ring becomes full. Link: https://lore.kernel.org/r/20250111003004.1235645-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:55 -08:00
Sean Christopherson	67428ee7b7	KVM: selftests: Sync dirty_log_test iteration to guest before resuming Sync the new iteration to the guest prior to restarting the vCPU, otherwise it's possible for the vCPU to dirty memory for the next iteration using the current iteration's value. Note, because the guest can be interrupted between the vCPU's load of the iteration and its write to memory, it's still possible for the guest to store the previous iteration to memory as the previous iteration may be cached in a CPU register (which the test accounts for). Note #2, the test's current approach of collecting dirty entries before stopping the vCPU also results dirty memory having the previous iteration. E.g. if page is dirtied in the previous iteration, but not the current iteration, the verification phase will observe the previous iteration's value in memory. That wart will be remedied in the near future, at which point synchronizing the iteration before restarting the vCPU will guarantee the only way for verification to observe stale iterations is due to the CPU register caching case, or due to a dirty entry being collected before the store retires. Link: https://lore.kernel.org/r/20250111003004.1235645-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:54 -08:00
Maxim Levitsky	fe49f80052	KVM: selftests: Support multiple write retires in dirty_log_test If dirty_log_test is run nested, it is possible for entries in the emulated PML log to appear before the actual memory write is committed to the RAM, due to the way KVM retries memory writes as a response to a MMU fault. In addition to that in some very rare cases retry can happen more than once, which will lead to the test failure because once the write is finally committed it may have a very outdated iteration value. Detect and avoid this case. Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20250111003004.1235645-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:54 -08:00
Sean Christopherson	89ea56a404	KVM: selftests: Actually emit forced emulation prefix for kvm_asm_safe_fep() Use KVM_ASM_SAFE_FEP, not simply KVM_ASM_SAFE, for kvm_asm_safe_fep(), as the non-FEP version doesn't force emulation (stating the obvious). Note, there are currently no users of kvm_asm_safe_fep(). Fixes: `ab3b6a7de8` ("KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()") Link: https://lore.kernel.org/r/20250130163135.270770-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 09:00:39 -08:00
Sean Christopherson	e36454461c	KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC Add testcases to x86's Hyper-V CPUID test to verify that KVM advertises support for features that require an in-kernel local APIC appropriately, i.e. that KVM hides support from the vCPU-scoped ioctl if the VM doesn't have an in-kernel local APIC. Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20250118003454.2619573-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:55:11 -08:00
Sean Christopherson	cd5a0c2f0f	KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper Allocate, get, and free the CPUID array in the Hyper-V CPUID test in the test's core helper, instead of copy+pasting code at each call site. In addition to deduplicating a small amount of code, restricting visibility of the array to a single invocation of the core test prevents "leaking" an array across test cases. Passing in @vcpu to the helper will also allow pivoting on VM-scoped information without needing to pass more booleans, e.g. to conditionally assert on features that require an in-kernel APIC. To avoid use-after-free bugs due to overzealous and careless developers, opportunstically add a comment to explain that the system-scoped helper caches the Hyper-V CPUID entries, i.e. that the caller is not responsible for freeing the memory. Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20250118003454.2619573-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:53:59 -08:00
Sean Christopherson	0b6db0dc43	KVM: selftests: Mark test_hv_cpuid_e2big() static in Hyper-V CPUID test Make the Hyper-V CPUID test's local helper test_hv_cpuid_e2big() static, it's not used outside of the test (and isn't intended to be). Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20250118003454.2619573-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:53:57 -08:00
Sean Christopherson	54108e7334	KVM: selftests: Print out the actual Top-Down Slots count on failure Print out the expected vs. actual count of the Top-Down Slots event on failure in the Intel PMU counters test. GUEST_ASSERT() only expands constants/macros, i.e. only prints the value of the expected count, which makes it difficult to debug and triage failures. Link: https://lore.kernel.org/r/20250117234204.2600624-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:34:56 -08:00
Sean Christopherson	0e6714735c	KVM: selftests: Drop the "feature event" param from guest test helpers Now that validation of event count is tied to hardware support for event, and not to guest support for an event, drop the unused "event" parameter from the various helpers. No functional change intended. Link: https://lore.kernel.org/r/20250117234204.2600624-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:34:56 -08:00
Sean Christopherson	e327630e2a	KVM: selftests: Remove dead code in Intel PMU counters test Drop the local "nr_arch_events" in the Intel PMU counters test as the test asserts that "nr_arch_events <= NR_INTEL_ARCH_EVENTS", and then sets nr_arch_events to the max of the two. I.e. nr_arch_events is guaranteed to be NR_INTEL_ARCH_EVENTS for the meat of the test, just use NR_INTEL_ARCH_EVENTS directly. No functional change intended. Link: https://lore.kernel.org/r/20250117234204.2600624-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:34:56 -08:00
Sean Christopherson	8752e2b4a2	KVM: selftests: Only validate counts for hardware-supported arch events In the Intel PMU counters test, only validate the counts for architectural events that are supported in hardware. If an arch event isn't supported, the event selector may enable a completely different event, and thus the logic for the expected count is bogus. This fixes test failures on pre-Icelake systems due to the encoding for the architectural Top-Down Slots event corresponding to something else (at least on the Skylake family of CPUs). Note, validation relies on hardware support, not KVM support and not guest support. Architectural events are all about enumerating the event selector encoding; lack of enumeration for an architectural event doesn't mean the event itself is unsupported, i.e. the event should still count as expected even if KVM and/or guest CPUID doesn't enumerate the event as being "architectural". Note #2, it's desirable to _program_ the architectural event encoding even if hardware doesn't support the event. The count can't be validated when the event is fully enabled, but KVM should still let the guest program the event selector, and the PMC shouldn't count if the event is disabled. Fixes: `4f1bd6b160` ("KVM: selftests: Test Intel PMU architectural events on gp counters") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202501141009.30c629b4-lkp@intel.com Debugged-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250117234204.2600624-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:34:56 -08:00
Sean Christopherson	933178ddf7	KVM: selftests: Make Intel arch events globally available in PMU counters test Wrap PMU counter test's array of Intel architectrual in a helper function so that the events can be queried in multiple locations. Add a comment to explain the need for a wrapper. No functional change intended. Link: https://lore.kernel.org/r/20250117234204.2600624-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-12 08:34:56 -08:00
Christian Brauner	ccc829b15d	selftests: add tests for using detached mount with overlayfs Test that it is possible to use detached mounts as overlayfs layers. Link: https://lore.kernel.org/r/20250123-erstbesteigung-angeeignet-1d30e64b7df2@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-12 12:12:27 +01:00
Christian Brauner	85c8700cb6	selftests/overlayfs: test specifying layers as O_PATH file descriptors Verify that userspace can specify layers via O_PATH file descriptors. Link: https://lore.kernel.org/r/20250210-work-overlayfs-v2-2-ed2a949b674b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-12 10:02:11 +01:00
Saket Kumar Bhaskar	4107a1aeb2	selftests/bpf: Select NUMA_NO_NODE to create map On powerpc, a CPU does not necessarily originate from NUMA node 0. This contrasts with architectures like x86, where CPU 0 is not hot-pluggable, making NUMA node 0 a consistently valid node. This discrepancy can lead to failures when creating a map on NUMA node 0, which is initialized by default, if no CPUs are allocated from NUMA node 0. This patch fixes the issue by setting NUMA_NO_NODE (-1) for map creation for this selftest. Fixes: `96eabe7a40` ("bpf: Allow selecting numa node during map creation") Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/cf1f61468b47425ecf3728689bc9636ddd1d910e.1738302337.git.skb99@linux.ibm.com	2025-02-11 16:41:25 -08:00
Saket Kumar Bhaskar	650f20bbd9	selftests/bpf: Define SYS_PREFIX for powerpc Since commit `7e92e01b72` ("powerpc: Provide syscall wrapper") landed in v6.1, syscall wrapper is enabled on powerpc. Commit `9474689020` ("powerpc: Don't add __powerpc_ prefix to syscall entry points") , that drops the prefix to syscall entry points, also landed in the same release. So, add the missing empty SYS_PREFIX prefix definition for powerpc, to fix some fentry and kprobe selftests. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/7192d6aa9501115dc242435970df82b3d190f257.1738302337.git.skb99@linux.ibm.com	2025-02-11 16:41:25 -08:00
Tamir Duberstein	b341f6fd45	blackhole_dev: convert self-test to KUnit Convert this very simple smoke test to a KUnit test. Add a missing `htons` call that was spotted[0] by kernel test robot <lkp@intel.com> after initial conversion to KUnit. Link: https://lore.kernel.org/oe-kbuild-all/202502090223.qCYMBjWT-lkp@intel.com/ [0] Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://patch.msgid.link/20250208-blackholedev-kunit-convert-v2-1-182db9bd56ec@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-11 16:21:42 -08:00
Yuyang Huang	4f280376e5	selftests/net: Add selftest for IPv4 RTM_GETMULTICAST support This change introduces a new selftest case to verify the functionality of dumping IPv4 multicast addresses using the RTM_GETMULTICAST netlink message. The test utilizes the ynl library to interact with the netlink interface and validate that the kernel correctly reports the joined IPv4 multicast addresses. To run the test, execute the following command: $ vng -v --user root --cpus 16 -- \ make -C tools/testing/selftests TARGETS=net \ TEST_PROGS=rtnetlink.py TEST_GEN_PROGS="" run_tests Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Yuyang Huang <yuyanghuang@google.com> Link: https://patch.msgid.link/20250207110836.2407224-2-yuyanghuang@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-11 11:26:53 +01:00
Athira Rajeev	c96b1402cc	selftests/powerpc/pmu: Update comment with details to understand auxv_generic_compat_pmu() utility function auxv_generic_compat_pmu() utility function is to detect whether the system is having generic compat PMU. The check is based on base platform value from /proc/self/auxv. Update the comment with details on how auxv is used to detect the platform. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-5-atrajeev@linux.vnet.ibm.com	2025-02-11 11:33:52 +05:30
Kajol Jain	9785def259	selftests/powerpc/pmu: Add interface test for extended reg support The testcase uses check_extended_regs_support and perf_get_platform_reg_mask function to check if the platform has extended reg support. This will help to check if sampling pmu selftest is enabled or not for a given platform. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-4-atrajeev@linux.vnet.ibm.com	2025-02-11 11:33:51 +05:30
Athira Rajeev	43751c3ce2	tools/testing/selftests/powerpc/pmu: Update comment description to mention ISA v3.1 for power10 and above Updated the comments in the pmu selftests to include power11/ISA v3.1 where ever required. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-3-atrajeev@linux.vnet.ibm.com	2025-02-11 11:33:51 +05:30
Athira Rajeev	520ee327c5	tools/testing/selftests/powerpc: Add check for power11 pvr for pmu selfests Some of the tests depends on pvr value to choose the event. Example: - event_alternatives_tests_p10: alternative event depends on registered PMU driver which is based on pvr - generic_events_valid_test varies based on platform - bhrb_filter_map_test: again its dependent on pmu to decide which bhrb filter to use - reserved_bits_mmcra_sample_elig_mode: randome sampling mode reserved bits is also varies based on platform Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-2-atrajeev@linux.vnet.ibm.com	2025-02-11 11:33:51 +05:30
Athira Rajeev	fd4d2f3251	tools/testing/selftests/powerpc: Enable pmu selftests for power11 Add check for power11 pvr in the selftest utility functions. Selftests uses pvr value to check for platform support inorder to run the tests. pvr is also used to send the extended mask value to capture sampling registers. Update some of the utility functions to use hwcap2 inorder to return platform specific bits from sampling registers. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-1-atrajeev@linux.vnet.ibm.com	2025-02-11 11:33:51 +05:30
Jakub Kicinski	3337064f42	selftests: drv-net: add helper for path resolution Refering to C binaries from Python code is going to be a common need. Add a helper to convert from path in relation to the test. Meaning, if the test is in the same directory as the binary, the call would be simply: cfg.rpath("binary"). The helper name "rpath" is not great. I can't think of a better name that would be accurate yet concise. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250207184140.1730466-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-10 19:10:10 -08:00
Jakub Kicinski	29604bc2aa	selftests: drv-net: factor out a DrvEnv base class We have separate Env classes for local tests and tests with a remote endpoint. Make it easier to share the code by creating a base class. Make env loading a method of this class. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250207184140.1730466-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-10 19:10:10 -08:00
Jakub Kicinski	a980da54b6	selftests: drv-net: remove an unnecessary libmnl include ncdevmem doesn't need libmnl, remove the unnecessary include. Since YNL doesn't depend on libmnl either, any more, it's actually possible to build selftests without having libmnl installed. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250207183119.1721424-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-10 19:09:58 -08:00
Kees Cook	18f7686a1c	selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64 Since headers don't always follow the selftests around correct, explicitly include the __NR_uretprobe syscall for better test coverage. Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-10 09:26:19 -08:00
Jakub Kicinski	d2348b4bf7	selftests: drv-net: rss_ctx: skip tests which need multiple contexts cleanly There's no good API to check how many contexts device supports. But initial tests sense the context count already, so just store that number and skip tests which we know need more. Link: https://patch.msgid.link/20250206235334.1425329-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-10 08:26:51 -08:00
Jakub Kicinski	23bac39910	selftests: net-drv: test adding flow rule to invalid RSS context Check that adding Rx flow steering rules pointing to an RSS context which does not exist is prevented. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250206235334.1425329-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-10 08:26:50 -08:00
Breno Leitao	12fd83ca44	netconsole: selftest: test for sysdata CPU Add a new selftest to verify that the netconsole module correctly handles CPU runtime data in sysdata. The test validates three scenarios: 1. Basic CPU sysdata functionality - verifies that cpu=X is appended to messages 2. CPU sysdata with userdata - ensures CPU data works alongside userdata 3. Disabled CPU sysdata - confirms no CPU data is included when disabled The test uses taskset to control which CPU sends messages and verifies the reported CPU matches the one used. This helps ensure that netconsole accurately tracks and reports the originating CPU of messages. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-02-10 15:04:18 +00:00
Linus Torvalds	954a209f43	ARM: * Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations * Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 * Stop pretending that we can run the protected mode without a GICv3 being present on the host * Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts * Always evaluate the need to arm a background timer for fully emulated guest timers * Fix the emulation of EL1 timers in the absence of FEAT_ECV * Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0 s390: * move some of the guest page table (gmap) logic into KVM itself, inching towards the final goal of completely removing gmap from the non-kvm memory management code. As an initial set of cleanups, move some code from mm/gmap into kvm and start using __kvm_faultin_pfn() to fault-in pages as needed; but especially stop abusing page->index and page->lru to aid in the pgdesc conversion. x86: * Add missing check in the fix to defer starting the huge page recovery vhost_task * SRSO_USER_KERNEL_NO does not need SYNTHESIZED_F -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmemTnEUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroP97gf/Rew+yEsRrHVk/j0R2XwFx51raYZy eaicv07jYmwsnaALq6BEj4xxDW8dJEJgj05Czm1o9+9x8nvY2+UjT2q4J8fY9xdD Yr+5GvEEz4x2GNL3ZYE3iHTNFQckNxOgMilLW3br1E+wjusShKmgGxPYTRyClQ34 gDBZQWzOG22UNC6PbW9dgTK54b57+NJdIZYuHz4LkMsTvzf6jXo5VumsgbbZqC4e VGh5EUEPL7+cNzGY/+WURXI6OojdPzbneH1NP82uT3lo2WaHK9+B3N6H+W71N/T4 u1P7+g0WmdNj3FITvDpTJ7jNhke2atEjI9rvtHz6gwtf9SIujyuNl55uRA== =r0h9 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations - Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 - Stop pretending that we can run the protected mode without a GICv3 being present on the host - Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts - Always evaluate the need to arm a background timer for fully emulated guest timers - Fix the emulation of EL1 timers in the absence of FEAT_ECV - Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0 s390: - move some of the guest page table (gmap) logic into KVM itself, inching towards the final goal of completely removing gmap from the non-kvm memory management code. As an initial set of cleanups, move some code from mm/gmap into kvm and start using __kvm_faultin_pfn() to fault-in pages as needed; but especially stop abusing page->index and page->lru to aid in the pgdesc conversion. x86: - Add missing check in the fix to defer starting the huge page recovery vhost_task - SRSO_USER_KERNEL_NO does not need SYNTHESIZED_F" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (31 commits) KVM: x86/mmu: Ensure NX huge page recovery thread is alive before waking KVM: remove kvm_arch_post_init_vm KVM: selftests: Fix spelling mistake "initally" -> "initially" kvm: x86: SRSO_USER_KERNEL_NO is not synthesized KVM: arm64: timer: Don't adjust the EL2 virtual timer offset KVM: arm64: timer: Correctly handle EL1 timer emulation when !FEAT_ECV KVM: arm64: timer: Always evaluate the need for a soft timer KVM: arm64: Fix nested S2 MMU structures reallocation KVM: arm64: Fail protected mode init if no vgic hardware is present KVM: arm64: Flush/sync debug state in protected mode KVM: s390: selftests: Streamline uc_skey test to issue iske after sske KVM: s390: remove the last user of page->index KVM: s390: move PGSTE softbits KVM: s390: remove useless page->index usage KVM: s390: move gmap_shadow_pgt_lookup() into kvm KVM: s390: stop using lists to keep track of used dat tables KVM: s390: stop using page->index for non-shadow gmaps KVM: s390: move some gmap shadowing functions away from mm/gmap.c KVM: s390: get rid of gmap_translate() KVM: s390: get rid of gmap_fault() ...	2025-02-09 09:41:38 -08:00
Thomas Weißschuh	665fa8dea9	tools/nolibc: add support for directory access Add an implementation for directory access operations. To keep nolibc itself allocation-free, a "DIR " does not point to any data, but directly encodes a filedescriptor number, equivalent to "FILE ". Without any per-directory storage it is not possible to implement readdir() POSIX confirming. Instead only readdir_r() is provided. While readdir_r() is deprecated in glibc, the reasons for that are not applicable to nolibc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20250209-nolibc-dir-v2-2-57cc1da8558b@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-09 16:46:50 +01:00
Eric Biggers	8522104f75	crypto: crct10dif - remove from crypto API Remove the "crct10dif" shash algorithm from the crypto API. It has no known user now that the lib is no longer built on top of it. It has no remaining references in kernel code. The only other potential users would be the usual components that allow specifying arbitrary hash algorithms by name, namely AF_ALG and dm-integrity. However there are no indications that "crct10dif" is being used with these components. Debian Code Search and web searches don't find anything relevant, and explicitly grepping the source code of the usual suspects (cryptsetup, libell, iwd) finds no matches either. "crc32" and "crc32c" are used in a few more places, but that doesn't seem to be the case for "crct10dif". crc_t10dif_update() is also tested by crc_kunit now, so the test coverage provided via the crypto self-tests is no longer needed. Also note that the "crct10dif" shash algorithm was inconsistent with the rest of the shash API in that it wrote the digest in CPU endianness, making the resulting byte array differ on little endian vs. big endian platforms. This means it was effectively just built for use by the lib functions, and it was not actually correct to treat it as "just another hash function" that could be dropped in via the shash API. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250206173857.39794-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>	2025-02-08 20:06:30 -08:00
Linus Torvalds	f4a45f14cf	seccomp fix for v6.14-rc2 - Allow uretprobe on x86_64 to avoid behavioral complications (Eyal Birger) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ6e/FAAKCRA2KwveOeQk u/xCAP9gd2nRw9jXgpg/CCfxkX0Yj3/pnzQoCDlS2lWy43BWNgEAtcJFDvz2Lg09 omRld7QHjbMhNqihYgXRyD0nzX42uwY= =0JOg -----END PGP SIGNATURE----- Merge tag 'seccomp-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fix from Kees Cook: "This is really a work-around for x86_64 having grown a syscall to implement uretprobe, which has caused problems since v6.11. This may change in the future, but for now, this fixes the unintended seccomp filtering when uretprobe switched away from traps, and does so with something that should be easy to backport. - Allow uretprobe on x86_64 to avoid behavioral complications (Eyal Birger)" * tag 'seccomp-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: validate uretprobe syscall passes through seccomp seccomp: passthrough uretprobe systemcall without filtering	2025-02-08 14:04:21 -08:00
Bastien Curutchet (eBPF Foundation)	9b6cdaf2ac	selftests/bpf: Remove with_addr.sh and with_tunnels.sh Those two scripts were used by test_flow_dissector.sh to setup/cleanup the network topology before/after the tests. test_flow_dissector.sh have been deleted by commit `63b37657c5` ("selftests/bpf: remove test_flow_dissector.sh") so they aren't used anywhere now. Remove the two unused scripts and their Makefile entries. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250204-with-v1-1-387a42118cd4@bootlin.com	2025-02-07 18:30:41 -08:00
Jakub Kicinski	285b3f78ea	netdevsim: allow normal queue reset while down Resetting queues while the device is down should be legal. Allow it, test it. Ideally we'd test this with a real device supporting devmem but I don't have access to such devices. Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250206225638.1387810-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-07 17:21:03 -08:00
Daniel Xu	973cb1382e	bpf: selftests: Test constant key extraction on irrelevant maps Test that very high constant map keys are not interpreted as an error value by the verifier. This would previously fail. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/c0590b62eb9303f389b2f52c0c7e9cf22a358a30.1738689872.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-07 15:45:44 -08:00
Linus Torvalds	8c67da5bc1	vfs-6.14-rc2.fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ6XhrQAKCRCRxhvAZXjc oujrAQCpGmhvh2jGIKcSmEigNHOGCUXDG+1QsVpnCeP9OaUrkAEA+dMo4Ai4hz4J nYeeAgpjGuu+XLMmi7EiGxpI0fQL3gc= =oN/E -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix fsnotify FMODE_NONOTIFY* handling. This also disables fsnotify on all pseudo files by default apart from very select exceptions. This carries a regression risk so we need to watch out and adapt accordingly. However, it is overall a significant improvement over the current status quo where every rando file can get fsnotify enabled. - Cleanup and simplify lockref_init() after recent lockref changes. - Fix vboxfs build with gcc-15. - Add an assert into inode_set_cached_link() to catch corrupt links. - Allow users to also use an empty string check to detect whether a given mount option string was empty or not. - Fix how security options were appended to statmount()'s ->mnt_opt field. - Fix statmount() selftests to always check the returned mask. - Fix uninitialized value in vfs_statx_path(). - Fix pidfs_ioctl() sanity checks to guard against ioctl() overloading and preserve extensibility. * tag 'vfs-6.14-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: sanity check the length passed to inode_set_cached_link() pidfs: improve ioctl handling fsnotify: disable pre-content and permission events by default selftests: always check mask returned by statmount(2) fsnotify: disable notification by default for all pseudo files fs: fix adding security options to statmount.mnt_opt fsnotify: use accessor to set FMODE_NONOTIFY_* lockref: remove count argument of lockref_init gfs2: switch to lockref_init(..., 1) gfs2: use lockref_init for gl_lockref statmount: let unset strings be empty vboxsf: fix building with GCC 15 fs/stat.c: avoid harmless garbage value problem in vfs_statx_path()	2025-02-07 09:22:31 -08:00
Miklos Szeredi	2cc02059fb	selftests: always check mask returned by statmount(2) STATMOUNT_MNT_OPTS can actually be missing if there are no options. This is a change of behavior since `75ead69a71` ("fs: don't let statmount return empty strings"). The other checks shouldn't actually trigger, but add them for correctness and for easier debugging if the test fails. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250129160641.35485-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-07 10:27:26 +01:00
Jason Xing	003be25ab9	selftests/bpf: Correct the check of join cgroup Use ASSERT_OK_FD to check the return value of join cgroup, or else this test will pass even if the fd < 0. ASSERT_OK_FD can print the error message to the console. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/all/6d62bd77-6733-40c7-b240-a1aeff55566c@linux.dev/ Link: https://patch.msgid.link/20250204051154.57655-1-kerneljasonxing@gmail.com	2025-02-06 21:18:04 -08:00
Jakub Kicinski	ba6ec09911	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc2). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-06 15:19:00 -08:00
Eyal Birger	c2debdb854	selftests/seccomp: validate uretprobe syscall passes through seccomp The uretprobe syscall is implemented as a performance enhancement on x86_64 by having the kernel inject a call to it on function exit; User programs cannot call this system call explicitly. As such, this syscall is considered a kernel implementation detail and should not be filtered by seccomp. Enhance the seccomp bpf test suite to check that uretprobes can be attached to processes without the killing the process regardless of seccomp policy. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20250202162921.335813-3-eyal.birger@gmail.com [kees: Skip archs without __NR_uretprobe] Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-06 13:19:14 -08:00
Linus Torvalds	3cf0a98fea	Current release - regressions: - core: harmonize tstats and dstats - ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels - eth: tun: revert fix group permission check - eth: stmmac: revert "specify hardware capability value when FIFO size isn't specified" Previous releases - regressions: - udp: gso: do not drop small packets when PMTU reduces - rxrpc: fix race in call state changing vs recvmsg() - eth: ice: fix Rx data path for heavy 9k MTU traffic - eth: vmxnet3: fix tx queue race condition with XDP Previous releases - always broken: - sched: pfifo_tail_enqueue: drop new packet when sch->limit == 0 - ethtool: ntuple: fix rss + ring_cookie check - rxrpc: fix the rxrpc_connection attend queue handling Misc: - recognize Kuniyuki Iwashima as a maintainer Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmekqVQSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkLgcP/3sewa114G/vERakXPptWLnCRtXLMaMk E8ihlBj9qI0hD51Qi+NRYKI/IJqA5ZB/p4I5cBz7xI8d5VOQYhSuCdCnwMEZPAfG IQS8VpcFjcdj1e7cjlFu/s5PzF6cRRvinhWQpE5YpuE2TFCl+SJ8XAupLvi8zCe4 wo1Pet4dhKOKtrrhqiCeSNUer0/MSeLrCIB7mHha279l0D8Wx+m+h8OJMQahXI0t IJWNkxPyrIqx0ksR8Uy9SVAIrNlR5iEeqcwztLUtqxzs+b/s7OJcWmgqXgSMTOb5 RUr8Bthm3DQyXuFoR5bFA/oaoJb+iZvOjcEqQaYU2dF5zcRfdY/omhQmNq6s/7/w 7KV7C/RIrnhwLjkEd2IX5pzvgM9VuO9ewk65+sZ82o0wukhKn6RIyJYHzvwbc0rH rQMGAoz/uF4eiuqfS+uh86+VJfjBkG/sgdw0JSAt7dN14Fadg14+Ocz6apjwNuzJ 0QUWABxKSpvktufycyoo5s/bPqLMBxI/W3XZwIzbJ+dupRavPxkWpT/AwI//ZTDe rUTl2SEd4Ruliy1w41TXgubRxuW07M7bm4AxUrLzAzTx+aX7AdBeIyZp83x3oW1p nNUSdJ0m7A0Z1k28tdAP/gjD/vjcQ0J5YQ6MvZ1RBQl7MI+GLYD2irWbbrlI+g+6 HKQcXhv/7pG4 =t5WM -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Interestingly the recent kmemleak improvements allowed our CI to catch a couple of percpu leaks addressed here. We (mostly Jakub, to be accurate) are working to increase review coverage over the net code-base tweaking the MAINTAINER entries. Current release - regressions: - core: harmonize tstats and dstats - ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels - eth: tun: revert fix group permission check - eth: stmmac: revert "specify hardware capability value when FIFO size isn't specified" Previous releases - regressions: - udp: gso: do not drop small packets when PMTU reduces - rxrpc: fix race in call state changing vs recvmsg() - eth: ice: fix Rx data path for heavy 9k MTU traffic - eth: vmxnet3: fix tx queue race condition with XDP Previous releases - always broken: - sched: pfifo_tail_enqueue: drop new packet when sch->limit == 0 - ethtool: ntuple: fix rss + ring_cookie check - rxrpc: fix the rxrpc_connection attend queue handling Misc: - recognize Kuniyuki Iwashima as a maintainer" * tag 'net-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: stmmac: Specify hardware capability value when FIFO size isn't specified" MAINTAINERS: add a sample ethtool section entry MAINTAINERS: add entry for ethtool rxrpc: Fix race in call state changing vs recvmsg() rxrpc: Fix call state set to not include the SERVER_SECURING state net: sched: Fix truncation of offloaded action statistics tun: revert fix group permission check selftests/tc-testing: Add a test case for qdisc_tree_reduce_backlog() netem: Update sch->q.qlen before qdisc_tree_reduce_backlog() selftests/tc-testing: Add a test case for pfifo_head_drop qdisc when limit==0 pfifo_tail_enqueue: Drop new packet when sch->limit == 0 selftests: mptcp: connect: -f: no reconnect net: rose: lock the socket in rose_bind() net: atlantic: fix warning during hot unplug rxrpc: Fix the rxrpc_connection attend queue handling net: harmonize tstats and dstats selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not supported selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigure ethtool: ntuple: fix rss + ring_cookie check ethtool: rss: fix hiding unsupported fields in dumps ...	2025-02-06 09:14:54 -08:00
Ido Schimmel	c467a98e1d	selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding Extend the VXLAN FDB aging test case to verify that FDB entries are aged out when they only forward traffic and not refreshed by received traffic. The test fails before "vxlan: Age out FDB entries based on 'updated' time": # ./vxlan_bridge_1d.sh [...] TEST: VXLAN: Ageing of learned FDB entry [FAIL] [...] # echo $? 1 And passes after it: # ./vxlan_bridge_1d.sh [...] TEST: VXLAN: Ageing of learned FDB entry [ OK ] [...] # echo $? 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250204145549.1216254-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 18:53:57 -08:00
Cong Wang	91aadc16ee	selftests/tc-testing: Add a test case for qdisc_tree_reduce_backlog() Integrate the test case provided by Mingi Cho into TDC. All test results: 1..4 ok 1 ca5e - Check class delete notification for ffff: ok 2 e4b7 - Check class delete notification for root ffff: ok 3 33a9 - Check ingress is not searchable on backlog update ok 4 a4b9 - Test class qlen notification Cc: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://patch.msgid.link/20250204005841.223511-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 18:15:00 -08:00
Quang Le	3fe5648d1d	selftests/tc-testing: Add a test case for pfifo_head_drop qdisc when limit==0 When limit == 0, pfifo_tail_enqueue() must drop new packet and increase dropped packets count of the qdisc. All test results: 1..16 ok 1 a519 - Add bfifo qdisc with system default parameters on egress ok 2 585c - Add pfifo qdisc with system default parameters on egress ok 3 a86e - Add bfifo qdisc with system default parameters on egress with handle of maximum value ok 4 9ac8 - Add bfifo qdisc on egress with queue size of 3000 bytes ok 5 f4e6 - Add pfifo qdisc on egress with queue size of 3000 packets ok 6 b1b1 - Add bfifo qdisc with system default parameters on egress with invalid handle exceeding maximum value ok 7 8d5e - Add bfifo qdisc on egress with unsupported argument ok 8 7787 - Add pfifo qdisc on egress with unsupported argument ok 9 c4b6 - Replace bfifo qdisc on egress with new queue size ok 10 3df6 - Replace pfifo qdisc on egress with new queue size ok 11 7a67 - Add bfifo qdisc on egress with queue size in invalid format ok 12 1298 - Add duplicate bfifo qdisc on egress ok 13 45a0 - Delete nonexistent bfifo qdisc ok 14 972b - Add prio qdisc on egress with invalid format for handles ok 15 4d39 - Delete bfifo qdisc twice ok 16 d774 - Check pfifo_head_drop qdisc enqueue behaviour when limit == 0 Signed-off-by: Quang Le <quanglex97@gmail.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://patch.msgid.link/20250204005841.223511-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 18:14:46 -08:00
Matthieu Baerts (NGI0)	5368a67307	selftests: mptcp: connect: -f: no reconnect The '-f' parameter is there to force the kernel to emit MPTCP FASTCLOSE by closing the connection with unread bytes in the receive queue. The xdisconnect() helper was used to stop the connection, but it does more than that: it will shut it down, then wait before reconnecting to the same address. This causes the mptcp_join's "fastclose test" to fail all the time. This failure is due to a recent change, with commit `218cc16632` ("selftests: mptcp: avoid spurious errors on disconnect"), but that went unnoticed because the test is currently ignored. The recent modification only shown an existing issue: xdisconnect() doesn't need to be used here, only the shutdown() part is needed. Fixes: `6bf41020b7` ("selftests: mptcp: update and extend fastclose test-cases") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250204-net-mptcp-sft-conn-f-v1-1-6b470c72fffa@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 17:54:32 -08:00
Petr Machata	d9e9f6d7b7	bridge: mdb: Allow replace of a host-joined group Attempts to replace an MDB group membership of the host itself are currently bounced: # ip link add name br up type bridge vlan_filtering 1 # bridge mdb replace dev br port br grp 239.0.0.1 vid 2 # bridge mdb replace dev br port br grp 239.0.0.1 vid 2 Error: bridge: Group is already joined by host. A similar operation done on a member port would succeed. Ignore the check for replacement of host group memberships as well. The bit of code that this enables is br_multicast_host_join(), which, for already-joined groups only refreshes the MC group expiration timer, which is desirable; and a userspace notification, also desirable. Change a selftest that exercises this code path from expecting a rejection to expecting a pass. The rest of MDB selftests pass without modification. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/e5c5188b9787ae806609e7ca3aa2a0a501b9b5c4.1738685648.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 17:50:03 -08:00
Jakub Kicinski	cbecd06a22	selftests: net: suppress ReST file generation when building selftests Some selftests need libynl.a. When building it try to skip generating the ReST documentation, libynl.a does not depend on them. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250203214850.1282291-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-05 17:49:40 -08:00
Daniel Xu	2a9d30fac8	selftests/bpf: Support dynamically linking LLVM if static is not available Since `67ab80a018` ("selftests/bpf: Prefer static linking for LLVM libraries"), only statically linking test_progs is supported. However, some distros only provide a dynamically linkable LLVM. This commit adds a fallback for dynamically linking LLVM if static linking is not available. If both options are available, static linking is chosen. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/872b64e93de9a6cd6a7a10e6a5c5e7893704f743.1738276344.git.dxu@dxuuu.xyz	2025-02-05 16:46:14 -08:00
Ihor Solodrai	770cdcf4a5	selftests/bpf: Add a BTF verification test for kflagged type_tag Add a BTF verification test case for a type_tag with a kflag set. Type tags with a kflag are now valid. Add BTF_DECL_ATTR_ENC and BTF_TYPE_ATTR_ENC test helper macros, corresponding to *_TAG_ENC. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250130201239.1429648-7-ihor.solodrai@linux.dev	2025-02-05 16:18:00 -08:00
Ihor Solodrai	53ee0d66d7	bpf: Allow kind_flag for BTF type and decl tags BTF type tags and decl tags now may have info->kflag set to 1, changing the semantics of the tag. Change BTF verification to permit BTF that makes use of this feature: * remove kflag check in btf_decl_tag_check_meta(), as both values are valid * allow kflag to be set for BTF_KIND_TYPE_TAG type in btf_ref_type_check_meta() Make sure kind_flag is NOT set when checking for specific BTF tags, such as "kptr", "user" etc. Modify a selftest checking for kflag in decl_tag accordingly. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250130201239.1429648-6-ihor.solodrai@linux.dev	2025-02-05 16:17:59 -08:00
Ihor Solodrai	6c2d2a05a7	selftests/bpf: Add a btf_dump test for type_tags Factor out common routines handling custom BTF from test_btf_dump_incremental. Then use them in the test_btf_dump_type_tags. test_btf_dump_type_tags verifies that a type tag is dumped correctly with respect to its kflag. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250130201239.1429648-5-ihor.solodrai@linux.dev	2025-02-05 16:17:59 -08:00
Paul E. McKenney	6be43acb2a	torture: Make SRCU lockdep testing use srcu_read_lock_nmisafe() Recent experience shows that the srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions are not sufficiently tested. This commit therefore causes the torture.sh script's SRCU lockdep testing to use these two functions. This will cause these two functions to be regularly tested by several developers (myself included) who use torture.sh as an RCU acceptance test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-02-05 07:14:40 -08:00
Paul E. McKenney	c143bac019	rcutorture: Make scenario SRCU-P use srcu_read_lock_fast() This commit causes the rcutorture SRCU-P scenario use the srcu_read_lock_fast() and srcu_read_unlock_fast() functions. This will cause these two functions to be regularly tested by several developers (myself included), for example, those who use torture.sh as an RCU acceptance test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-02-05 07:12:05 -08:00
Lorenzo Stoakes	62d648c442	selftests/mm: use PIDFD_SELF in guard pages test Now we have PIDFD_SELF available for process_madvise(), make use of it in the guard pages test. This is both more convenient and asserts that PIDFD_SELF works as expected. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/r/69fbbe088d3424de9983e145228459cb05a8f13d.1738268370.git.lorenzo.stoakes@oracle.com Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-05 15:14:55 +01:00
Lorenzo Stoakes	2bbf47f2d3	selftests/pidfd: add tests for PIDFD_SELF_* Add tests to assert that PIDFD_SELF* correctly refers to the current thread and process. We explicitly test pidfd_send_signal(), however We defer testing of mm-specific functionality which uses pidfd, namely process_madvise() and process_mrelease() to mm testing (though note the latter can not be sensibly tested as it would require the testing process to be dying). We also correct the pidfd_open_test.c fields which refer to .request_mask whereas the UAPI header refers to .mask, which otherwise break the import of the UAPI header. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/r/7ab0e48b26ba53abf7b703df2dd11a2e99b8efb2.1738268370.git.lorenzo.stoakes@oracle.com Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-05 15:14:55 +01:00
Christian Brauner	e943271f79	selftests/pidfd: add new PIDFD_SELF* defines They will be needed in selftests in follow-up patches. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-05 15:14:55 +01:00
Breno Leitao	d5fdfe480c	netconsole: selftest: Add test for fragmented messages Add a new selftest to verify netconsole's handling of messages that exceed the packet size limit and require fragmentation. The test sends messages with varying sizes and userdata, validating that: 1. Large messages are correctly fragmented and reassembled 2. Userdata fields are properly preserved across fragments 3. Messages work correctly with and without kernel release version appending The test creates a networking environment using netdevsim, sends messages through /dev/kmsg, and verifies the received fragments maintain message integrity. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250203-netcons_frag_msgs-v1-1-5bc6bedf2ac0@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-04 18:09:21 -08:00
Dan Williams	8e4c411c53	cxl: Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' The pending efforts to add CXL Accelerator (type-2) device [1], and Dynamic Capacity (DCD) support [2], tripped on the no-longer-fit-for-purpose design in the CXL subsystem for tracking device-physical-address (DPA) metadata. Trip hazards include: - CXL Memory Devices need to consider a PMEM partition, but Accelerator devices with CXL.mem likely do not in the common case. - CXL Memory Devices enumerate DPA through Memory Device mailbox commands like Partition Info, Accelerators devices do not. - CXL Memory Devices that support DCD support more than 2 partitions. Some of the driver algorithms are awkward to expand to > 2 partition cases. - DPA performance data is a general capability that can be shared with accelerators, so tracking it in 'struct cxl_memdev_state' is no longer suitable. - Hardcoded assumptions around the PMEM partition always being index-1 if RAM is zero-sized or PMEM is zero sized. - 'enum cxl_decoder_mode' is sometimes a partition id and sometimes a memory property, it should be phased in favor of a partition id and the memory property comes from the partition info. Towards cleaning up those issues and allowing a smoother landing for the aforementioned pending efforts, introduce a 'struct cxl_dpa_partition' array to 'struct cxl_dev_state', and 'struct cxl_range_info' as a shared way for Memory Devices and Accelerators to initialize the DPA information in 'struct cxl_dev_state'. For now, split a new cxl_dpa_setup() from cxl_mem_create_range_info() to get the new data structure initialized, and cleanup some qos_class init. Follow on patches will go further to use the new data structure to cleanup algorithms that are better suited to loop over all possible partitions. cxl_dpa_setup() follows the locking expectations of mutating the device DPA map, and is suitable for Accelerator drivers to use. Accelerators likely only have one hardcoded 'ram' partition to convey to the cxl_core. Link: http://lore.kernel.org/20241230214445.27602-1-alejandro.lucero-palau@amd.com [1] Link: http://lore.kernel.org/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com [2] Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305827.668823.13978794102080021276.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:19 -07:00
Dan Williams	d77ca6c2b5	cxl: Introduce to_{ram,pmem}_{res,perf}() helpers In preparation for consolidating all DPA partition information into an array of DPA metadata, introduce helpers that hide the layout of the current data. I.e. make the eventual replacement of ->ram_res, ->pmem_res, ->ram_perf, and ->pmem_perf with a new DPA metadata array a no-op for code paths that consume that information, and reduce the noise of follow-on patches. The end goal is to consolidate all DPA information in 'struct cxl_dev_state', but for now the helpers just make it appear that all DPA metadata is relative to @cxlds. As the conversion to generic partition metadata walking is completed, these helpers will naturally be eliminated, or reduced in scope. Cc: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305238.668823.16553986866633608541.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:18 -07:00
Bharadwaj Raju	fd07912411	selftests/cgroup: use bash in test_cpuset_v1_hp.sh The script uses non-POSIX features like `[[` for conditionals and hence does not work when run with a POSIX /bin/sh. Change the shebang to /bin/bash instead, like the other tests in cgroup. Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-02-04 09:36:54 -10:00
Linus Torvalds	d009de7d54	Livepatching fixup for 6.14-rc2 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmeh4BEACgkQUqAMR0iA lPJLtxAAixRnu2x7piazO5vWcoJLub1xrxsYwDzUY+2yyuHeqdNAmQbhbWHaMeZ6 ZsJ/jUBmIcrkwxVN6maQagPqYct5e82JpGLL67fE4qfaqaFlnL3lnBttA6XNXTAD 2QH2ovAHWn/Szlr3WcbmSYKZyttbfKVxgvp4IwmAk5hlRgVVGoSgxMGsZZHiV2PX IJCoeV9UGYbQEhVDsU9dtPWGgAHgiH2eAQdR0gWHpXrRVgGTubI3GhiUyPkW6yTy sZ6aIxVKb7atQEeAfiPvJV+3ubCUc2w7YAv5qNYqkmz2PUpvgvVnkr0RdR/Vmeq1 6PtCG0POpnPYVTzhvjyhOmQQlGY3/RePh+krUGvIeOZnceqShjIuRdfMlM99AqSj cUMlnhUhfN+Fo4p3QL7QIc/rnbYBqpoG7gKd2b1iI265gRQ/PabGp2NeZt3vF0dI BvAJetrnga/kZIWHxjQ1XUd9QenSSX9jK9916kRYBY/bDBi3o4xXjSVN0uKfoIY/ iYml6iuekmpKp4ZxcsZAJ5iaP7Su5dYepef6RcEpoRvr+1XJ8Bgf6evc/UDgBPXM cXK1V1q0lZ/b8tDfIvftiWuq6ViWc7u8rWnjAW5MHfYUxeVzQXWR8l7eBM13tnVZ bb9zQGxT6xECGj28xEd6wJU4dzZIldZA8Y74RgKUOYut6/9gs04= =P6Tm -----END PGP SIGNATURE----- Merge tag 'livepatching-for-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching fix from Petr Mladek: - Fix livepatching selftests for util-linux-2.40.x * tag 'livepatching-for-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests: livepatch: handle PRINTK_CALLER in check_result()	2025-02-04 09:52:01 -08:00
Colin Ian King	203a53029a	KVM: selftests: Fix spelling mistake "initally" -> "initially" There is a spelling mistake in a literal string and in the function test_get_inital_dirty. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-ID: <20250204105647.367743-1-colin.i.king@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-02-04 11:15:24 -05:00
Paolo Bonzini	35441cdd50	- some selftest fixes - move some kvm-related functions from mm into kvm - remove all usage of page->index and page->lru from kvm - fixes and cleanups for vsie -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEoWuZBM6M3lCBSfTnuARItAMU6BMFAmecrk4ACgkQuARItAMU 6BOnEw/+IKRVjfEdn2dbHu9PpEqphnnV7DtmfYn8+XRM9RU8hKIRzRFN7wfmVcSt Uz1tMCEQORSI6kYim+M5lbsIwtyeAEtrfA7hj7iZrAVbV9PXvYYPdIZBeeKbNvhc ScHYqfhbECaSrHJNi8+0ZG3vEO8yFhIgdst11UsxczR97S5GSt5d+7E4pbXpV3Ij wDbjIwSv9cUQzVV80Gt39VxxLZZti+z5Kch4RYgM2QhPFXag7Ja7F1XCDxiMHYj8 YdQu2ayhEd5erdgmrwB+OFfFk280j13Ml/QYH+u0lBl11Yzc5spM/x+pb5dzuQwy I1tloWPHB4mgEbpINHHDJqVqQNGXlA5YSosESMNSS8kY7xGGbbZKR36kv61TiaRF kOA/KwffKOvMJHOIiE6ZPp5+vkYGrAeUK9c1s6dl3sug87dZ9hijqWCfa/1TkdnX VJT03JOuRAAzYEv1yp0bcPdA2Ex0ArDRA9jowqum2xKbwjYAL4EKozmYPG4kLarW tjXoC+Yy0z/U3qMsqRwcGhvGVP/Pau0/LeRsZK0udT0yz1a2na5snCHrPyXaxRrp qHZObCVPB+0IGxe1Rj8utnB3nUqHZT4RZFvU62J/tUZoGY7KLsXNL4MVIh/QAwVC nBp/Qjpcx7Khm92kaR55Z2PcgLyd0N7lI/C7pE6YEk+7qVz0Suo= =zCOv -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-6.14-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - some selftest fixes - move some kvm-related functions from mm into kvm - remove all usage of page->index and page->lru from kvm - fixes and cleanups for vsie	2025-02-04 11:14:21 -05:00
Jakub Kicinski	c3da585509	selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not supported Vast majority of drivers does not support queue offset. Simply return if the rss context + queue ntuple fails. Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-03 18:39:23 -08:00
Jakub Kicinski	de379dfd9a	selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigure Commit under Fixes adds ntuple rules but never deletes them. Fixes: `29a4bc1fe9` ("selftest: extend test_rss_context_queue_reconfigure for action addition") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-03 18:39:23 -08:00
Andrew Donnellan	772ba9b5bd	scsi: cxlflash: Remove driver Remove the cxlflash driver for IBM CAPI Flash devices. The cxlflash driver has received minimal maintenance for some time, and the CAPI Flash hardware that uses it is no longer commercially available. Thanks to Uma Krishnan, Matthew Ochs and Manoj Kumar for their work on this driver over the years. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Link: https://lore.kernel.org/r/20250203072801.365551-2-ajd@linux.ibm.com Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-02-03 18:04:55 -05:00
Thomas Weißschuh	c1f4a7a840	selftests/nolibc: always keep test kernel configuration up to date Avoid using a stale test kernel configuration by always synchronizing it to the current source tree. kbuild is smart enough to avoid spurious rebuilds. Shuffle the code around a bit to keep all the commands with side-effects together. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-config-v2-5-5701c35995d6@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 21:02:16 +01:00
Thomas Weißschuh	d7d271ec30	selftests/nolibc: execute defconfig before other targets Some targets use the test kernel configuration. Executing defconfig in the same make invocation as those targets results in errors as the configuration may be in an inconsistent state during reconfiguration. Avoid this by introducing ordering dependencies between the defconfig and some other targets. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-config-v2-4-5701c35995d6@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 21:02:15 +01:00
Thomas Weißschuh	25d5ef9e7c	selftests/nolibc: drop call to mrproper target "mrproper" unnecessarily cleans a lot of files. kbuild is smart enough to handle changed configurations, so the cleanup is not necessary and only leads to excessive rebuilds. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-config-v2-3-5701c35995d6@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 21:02:13 +01:00
Thomas Weißschuh	a75b763b51	selftests/nolibc: drop call to prepare target The "prepare" target does not need to be run manually. kbuild knows when to use it on its own and the target is not even documented. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-config-v2-2-5701c35995d6@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 21:02:02 +01:00
Thomas Weißschuh	e16214dc1f	selftests/nolibc: drop mips32be EXTRACONFIG kbuild already contains logic to merge predefines snippets into a defconfig file. For MIPS a snippet for big-endian is already provided. Link: https://lore.kernel.org/r/20250123-nolibc-config-v2-1-5701c35995d6@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 21:01:32 +01:00
Thomas Weißschuh	4da4e35e9d	selftests/nolibc: enable -Wmissing-prototypes User code may want to use this compiler flag. Make sure it is supported by nolibc. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-prototype-v1-3-e1afc5c1999a@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 20:57:39 +01:00
Thomas Weißschuh	69ccba67d7	selftests/nolibc: ignore -Wmissing-prototypes To make sure nolibc itself is compatible with -Wmissing-prototypes the compiler flag should be enabled when building nolibc-test. However some of its functions are non-static to ease debugging [0], triggering the compiler warning. Disable the warning inside nolibc-test while still enabling it for nolibc itself. [0] https://lore.kernel.org/lkml/ZMjM0UPRAqoC+goY@1wt.eu/ Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250123-nolibc-prototype-v1-2-e1afc5c1999a@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-03 20:57:38 +01:00
Bastien Curutchet (eBPF Foundation)	0c4ea7e347	selftests/bpf: test_xdp_veth: Add new test cases for XDP flags The XDP redirection is tested without any flag provided to the xdp_attach() function. Add two subtests that check the correct behaviour with XDP_FLAGS_{DRV/SKB}_MODE flags Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250131-redirect-multi-v4-10-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:52 -08:00
Bastien Curutchet (eBPF Foundation)	29c7bb7d0f	selftests/bpf: test_xdp_veth: Use unique names The network namespaces and the veth used by the tests have hardcoded names that can conflict with other tests during parallel runs. Use the append_tid() helper to ensure the uniqueness of these names. Use the static network configuration table as a template on which thread IDs are appended in each test. Set a fixed size to remote_addr field so the struct veth_configuration can also have a fixed size. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250131-redirect-multi-v4-9-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:52 -08:00
Bastien Curutchet (eBPF Foundation)	450effe2da	selftests/bpf: test_xdp_veth: Add XDP flags to prog_configuration XDP flags are hardcoded to 0 at attachment. Add flags attributes to the struct prog_configuration to allow flag modifications for each test case. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250131-redirect-multi-v4-8-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:52 -08:00
Bastien Curutchet (eBPF Foundation)	edb996fae2	selftests/bpf: test_xdp_veth: Add prog_config[] table The BPF program attached to each veth is hardcoded through the use of the struct skeletons. It prevents from re-using the initialization code in new test cases. Replace the struct skeletons by a bpf_object table. Add a struct prog_configuration that holds the name of BPF program to load on a given veth pair. Use bpf_object__find_program_by_name() / bpf_xdp_attach() API instead of bpf_program__attach_xdp() to retrieve the BPF programs from their names. Detach BPF progs in the cleanup() as it's not automatically done by this API. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250131-redirect-multi-v4-7-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:52 -08:00
Bastien Curutchet (eBPF Foundation)	7e9f3c875d	selftests/bpf: test_xdp_veth: Rename config[] The network topology is held by the config[] table. This 'config' name is a bit too generic if we want to add other configuration variables. Rename config[] to net_config[]. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250131-redirect-multi-v4-6-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Bastien Curutchet (eBPF Foundation)	3c32cbbbcd	selftests/bpf: test_xdp_veth: Split network configuration configure_network() does two things : it first creates the network topology and then configures the BPF maps to fit the test needs. This isn't convenient if we want to re-use the same network topology for different test cases. Rename configure_network() create_network(). Move the BPF configuration to the test itself. Split the test description in two parts, first the description of the network topology, then the description of the test case. Remove the veth indexes from the ASCII art as dynamic ones are used Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250131-redirect-multi-v4-5-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Bastien Curutchet (eBPF Foundation)	71e0b1cc72	selftests/bpf: test_xdp_veth: Use int to describe next veth In the struct veth_configuration, the next_veth string is used to tell the next virtual interface to which packets must be redirected to. So it has to match the local_veth string of an other veth_configuration. Change next_veth type to int to avoid handling two identical strings. This integer is used as an offset in the network configuration table. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250131-redirect-multi-v4-4-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Bastien Curutchet (eBPF Foundation)	0f5bab8dff	selftests/bpf: test_xdp_veth: Remove unecessarry check_ping() check_ping() directly returns a SYS_NOFAIL without any previous treatment. It's called only once in the file and hardcodes the used namespace and ip address. Replace check_ping() with a direct call of SYS_NOFAIL in the test. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250131-redirect-multi-v4-3-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Bastien Curutchet (eBPF Foundation)	6d34f5b728	selftests/bpf: test_xdp_veth: Remove unused defines IP_CMD_MAX_LEN and NS_SUFFIX_LEN aren't used anywhere. Remove these unused defines Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250131-redirect-multi-v4-2-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Bastien Curutchet (eBPF Foundation)	723f1b9ce3	selftests/bpf: helpers: Add append_tid() Some tests can't be run in parallel because they use same namespace names or veth names. Create an helper that appends the thread ID to a given string. 8 characters are used for it (7 digits + '\0') Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250131-redirect-multi-v4-1-970b33678512@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Tony Ambardar	cb3ade5678	selftests/bpf: Fix runqslower cross-endian build The runqslower binary from a cross-endian build currently fails to run because the included skeleton has host endianness. Fix this by passing the target BPF endianness to the runqslower sub-make. Fixes: `5a63c33d6f` ("selftests/bpf: Support cross-endian building") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250125071423.2603588-1-itugrok@yahoo.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Tengda Wu	a63a631c9b	selftests/bpf: Fix freplace_link segfault in tailcalls prog test There are two bpf_link__destroy(freplace_link) calls in test_tailcall_bpf2bpf_freplace(). After the first bpf_link__destroy() is called, if the following bpf_map_{update,delete}_elem() throws an exception, it will jump to the "out" label and call bpf_link__destroy() again, causing double free and eventually leading to a segfault. Fix it by directly resetting freplace_link to NULL after the first bpf_link__destroy() call. Fixes: `021611d33e` ("selftests/bpf: Add test to verify tailcall and freplace restrictions") Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/bpf/20250122022838.1079157-1-wutengda@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-03 03:33:51 -08:00
Yan Zhai	235174b2be	udp: gso: do not drop small packets when PMTU reduces Commit `4094871db1` ("udp: only do GSO if # of segs > 1") avoided GSO for small packets. But the kernel currently dismisses GSO requests only after checking MTU/PMTU on gso_size. This means any packets, regardless of their payload sizes, could be dropped when PMTU becomes smaller than requested gso_size. We encountered this issue in production and it caused a reliability problem that new QUIC connection cannot be established before PMTU cache expired, while non GSO sockets still worked fine at the same time. Ideally, do not check any GSO related constraints when payload size is smaller than requested gso_size, and return EMSGSIZE instead of EINVAL on MTU/PMTU check failure to be more specific on the error cause. Fixes: `4094871db1` ("udp: only do GSO if # of segs > 1") Signed-off-by: Yan Zhai <yan@cloudflare.com> Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-02-03 10:13:27 +00:00
Linus Torvalds	bdd4f86c97	AT_EXECVE_CHECK update for v6.14-rc1 (fix1) - selftests: Handle old glibc without execveat(2) (Mickaël Salaün) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ51DjgAKCRA2KwveOeQk u5ahAP9m2RJdQm/oW/SPdhZ3nJynrD0UXKpZPYe733E9D2mccQEAvh0LIAUJGJoK FbpLRWSGXOkWxAJ1oabQo8GB5v+8EQs= =WVRq -----END PGP SIGNATURE----- Merge tag 'AT_EXECVE_CHECK-v6.14-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull AT_EXECVE_CHECK selftest fix from Kees Cook: "Fixes the AT_EXECVE_CHECK selftests which didn't run on old versions of glibc" * tag 'AT_EXECVE_CHECK-v6.14-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests: Handle old glibc without execveat(2)	2025-01-31 17:12:31 -08:00
Linus Torvalds	1b5f3c51fb	RISC-V Patches for the 6.14 Merge Window, Part 1 * The PH1520 pinctrl and dwmac drivers are enabeled in defconfig. * A redundant AQRL barrier has been removed from the futex cmpxchg implementation. * Support for the T-Head vector extensions, which includes exposing these extensions to userspace on systems that implement them. * Some more page table information is now printed on die() and systems that cause PA overflows. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmedHIoTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYievXD/4hdt8h+fMM0I9mmJS096YevRJONdfe Wk7D5q4PBwSHISHahuzfphieBhqPVnYkkEd7Vw6xRrLbUnhA41Fe0uvR52dx5UZd 3LwrDV/kjGTD59x6A2Zo9bSs/qPKJ2WHmHwHM21jY5tvcIB2Lo4dF8HT63OrwVNW DxsujLO0jUw+HEwXPsfmUAZJWOPZuUnatl/9CaLMLwQv5N7yiMuz5oYDzJXTLnNh m3Hv3CCtj1EeQPqDoWzz9nZvmAKOwcblSzz6OAy+xrRk1N0N3QFQPbIaRvkI9OVz +wPHQiyx4KZNeAe0csV0uLQRIiXZV8rkCz5UT65s3Bfy3vukvzz+1VBdNnCqiP8Q RpCTcYw62Cr6BWnvyTh+s9bhHb1ijG043nXd/Ty7ZRPCNLKHY6oL1CZ0pgqbTwPs D2U2ZTZFTc35mPrU6QMfbTiUVWCU2XagFhI27Dgj3xh9mkBOQCHwk2Mrzn7uS4iz xGNnrjRnKtuwBrvD68JzxCkEi8INFn2ifbVr44VZrOdTM7XtODGAYrBohQtV62kU 2L+q8DoHYis+0xFbR1wdrY1mRZoe45boUFgwnOpmoBr9ULe584sL+526y7IkkEHu /9hmLPtLg7nyoR/rO1j1Sfg4Eqdwg5HY1TKNfagJZAdu23EDRwrcW1PD0P6vtDv8 j4og8MmL7dTt3A== =HbAQ -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - The PH1520 pinctrl and dwmac drivers are enabeled in defconfig - A redundant AQRL barrier has been removed from the futex cmpxchg implementation - Support for the T-Head vector extensions, which includes exposing these extensions to userspace on systems that implement them - Some more page table information is now printed on die() and systems that cause PA overflows * tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: add a warning when physical memory address overflows riscv/mm/fault: add show_pte() before die() riscv: Add ghostwrite vulnerability selftests: riscv: Support xtheadvector in vector tests selftests: riscv: Fix vector tests riscv: hwprobe: Document thead vendor extensions and xtheadvector extension riscv: hwprobe: Add thead vendor extension probing riscv: vector: Support xtheadvector save/restore riscv: Add xtheadvector instruction definitions riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT RISC-V: define the elements of the VCSR vector CSR riscv: vector: Use vlenb from DT for thead riscv: Add thead and xtheadvector as a vendor extension riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree dt-bindings: cpus: add a thead vlen register length property dt-bindings: riscv: Add xtheadvector ISA extension description RISC-V: Mark riscv_v_init() as __init riscv: defconfig: drop RT_GROUP_SCHED=y riscv/futex: Optimize atomic cmpxchg riscv: defconfig: enable pinctrl and dwmac support for TH1520	2025-01-31 15:13:25 -08:00
Linus Torvalds	c545cd3276	x86/mm changes for v6.14: - The biggest changes are the TLB flushing scalability optimizations, to update the mm_cpumask lazily and related changes. This feature has both a track record and a continued risk of performance regressions, so it was already delayed by a cycle - but it's all 100% perfect now™. (Rik van Riel) - Also miscellaneous fixes and cleanups. (Gautam Somani, Kirill A. Shutemov, Sebastian Andrzej Siewior) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmeclXoRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iDixAAjmTv/3KBuXaW/EoqGkyr/dgJld/Cww5a 4yyM6pbVOkiP+pmSTiHChhn07A4eB1TMCP0RJHXUgsCr6VLY8+68MdafCMIn9hWK mZYbCFF2yWy2EP4a26ifTi/3P355x5WILxJH5K4fHxcsXjRy5LgCLaq0tObEqnZ8 OAGIBw+g3t7CYurqlKfYiVSUiUG8PbXbS9Bh/0SjRe5FRbJDre3XJy9ks2c83wHU anPe5qpkw3mg8hPiFQfv3EYyGe1NhAs9hBMYLKqUyyxZEixymZDsvjYnOe154OMI 9xk3XpeFFejwvBJ1pfSS3V5svm5sqtnRpZSivUl/gsT7LM65N8RqKMrTvcpT+fm7 cQs8JK3LP+S2ih3S4wTZRdVGnIQGzqHkp9R6e8T4r9FQ2688mk/OvqJOCZEAcPgx VRHiMXtgZ3e8OsMiY+82TGt9wyujCR/kk+hzgXtNC1Lr++jCz848n3UcUe+wvzzw Lo8LGGdAzBRviwiwwrRxCYKtlUtkIwbIKtfswv5pfapji2cTHckhvuKAcujpvaXd +qgnX8XNVZWoG57tN02jZ8ZgAFgZlV2A03WG5e0c1wb4/3AnGQDGpCEWX2/lMj1J U/FFwNA6+jzcVMYyN/LQAETv0Go7sJOVTTie7mAHEhyHvxvb2YfV9VJ60V2WBKn5 znIuU0l2qyQ= =g00u -----END PGP SIGNATURE----- Merge tag 'x86-mm-2025-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - The biggest changes are the TLB flushing scalability optimizations, to update the mm_cpumask lazily and related changes. This feature has both a track record and a continued risk of performance regressions, so it was already delayed by a cycle - but it's all 100% perfect now™ (Rik van Riel) - Also miscellaneous fixes and cleanups. (Gautam Somani, Kirill Shutemov, Sebastian Andrzej Siewior) * tag 'x86-mm-2025-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Remove unnecessary include of <linux/extable.h> x86/mtrr: Rename mtrr_overwrite_state() to guest_force_mtrr_state() x86/mm/selftests: Fix typo in lam.c x86/mm/tlb: Only trim the mm_cpumask once a second x86/mm/tlb: Also remove local CPU from mm_cpumask if stale x86/mm/tlb: Add tracepoint for TLB flush IPI to stale CPU x86/mm/tlb: Update mm_cpumask lazily	2025-01-31 10:39:07 -08:00
Christoph Schlameuss	3223906677	KVM: s390: selftests: Streamline uc_skey test to issue iske after sske In some rare situations a non default storage key is already set on the memory used by the test. Within normal VMs the key is reset / zapped when the memory is added to the VM. This is not the case for ucontrol VMs. With the initial iske check removed this test case can work in all situations. The function of the iske instruction is still validated by the remaining code. Fixes: `0185fbc6a2` ("KVM: s390: selftests: Add uc_skey VM test case") Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250128131803.1047388-1-schlameuss@linux.ibm.com Message-ID: <20250128131803.1047388-1-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-31 12:03:53 +01:00
Claudio Imbrenda	63e7151989	KVM: s390: selftests: fix ucontrol memory region test With the latest patch, attempting to create a memslot from userspace will result in an EEXIST error for UCONTROL VMs, instead of EINVAL, since the new memslot will collide with the internal memslot. There is no simple way to bring back the previous behaviour. This is not a problem, but the test needs to be fixed accordingly. Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-5-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-5-imbrenda@linux.ibm.com>	2025-01-31 12:03:52 +01:00
Linus Torvalds	c2933b2bef	First batch of fixes for 6.14. Nothing really stands out, but as usual there's a slight concentration of fixes for issues added in the last two weeks before the MW, and driver bugs from 6.13 which tend to get discovered upon wider distribution. Including fixes from IPSec, netfilter and Bluetooth. Current release - regressions: - net: revert RTNL changes in unregister_netdevice_many_notify() - Bluetooth: fix possible infinite recursion of btusb_reset - eth: adjust locking in some old drivers which protect their state with spinlocks to avoid sleeping in atomic; core protects netdev state with a mutex now Previous releases - regressions: - eth: mlx5e: make sure we pass node ID, not CPU ID to kvzalloc_node() - eth: bgmac: reduce max frame size to support just 1500 bytes; the jumbo frame support would previously cause OOB writes, but now fails outright - mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted, avoid false detection of MPTCP blackholing Previous releases - always broken: - mptcp: handle fastopen disconnect correctly - xfrm: make sure skb->sk is a full sock before accessing its fields - xfrm: fix taking a lock with preempt disabled for RT kernels - usb: ipheth: improve safety of packet metadata parsing; prevent potential OOB accesses - eth: renesas: fix missing rtnl lock in suspend/resume path Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmebzXsACgkQMUZtbf5S IrvGBQ//auOF2yY1sg40fBvc6Hr1jpZBcr+uqTL6Qka1uVOvTFY51hAN54lBt32+ ixmcHsD0xdcHrr7VrqSXqurQLiGsdwUpnxZFCj/FymQuMunVysEqudvPeKDVHpsw JW5c4nJOexEA2viByK9iB23Qq0P3uBoPEnKrbSTVSDvYaXUj6y8Cvt3/vXc+H/tc T7GaxHH55NNNPkRz34YU3OWcaZsgkQEcdVpZf4tODPmg7J5VQj8SQeMhk/HI0sdO WKjWB0woZkiQECtamqAOXnv47PXd6igv8NALRPlJcKjs0EszUvuYhD/9MEOeghjI sjcQn9JnPpG+ca/qFVCSpEEOo2zGVn5dkJT5x26udH+5XHf7Pq+zpJwB6LHo98yF bGMpIrF6gi2EnBtS/tRjMyBU9Ut9KiUtjXMvn9EsD1U1FGIbz6wyQLlT0pG0hwJb rEdKfegrcyhWKHOD4vH9ciEg/7lgGfsGyfJDktIMdyailZc6tBcxwbdlc+5jRDA9 0RqGASIXaiX7AC3WOSeQzMgbV+WXdhaX/yrMJL5KBfBTzxG2audnJt1tPN3mbh3z NM6M2cnMsoX4QLSiaukJaCL7LWHSTlVttZVg8FGXHj1PejMQQBVjGVvzk2UF55UR gV7X9/VkXhmIDAZgThWtOdPLz+ItksfSKiruhUsXust6JgqRuqc= =GjBE -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from IPSec, netfilter and Bluetooth. Nothing really stands out, but as usual there's a slight concentration of fixes for issues added in the last two weeks before the merge window, and driver bugs from 6.13 which tend to get discovered upon wider distribution. Current release - regressions: - net: revert RTNL changes in unregister_netdevice_many_notify() - Bluetooth: fix possible infinite recursion of btusb_reset - eth: adjust locking in some old drivers which protect their state with spinlocks to avoid sleeping in atomic; core protects netdev state with a mutex now Previous releases - regressions: - eth: - mlx5e: make sure we pass node ID, not CPU ID to kvzalloc_node() - bgmac: reduce max frame size to support just 1500 bytes; the jumbo frame support would previously cause OOB writes, but now fails outright - mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted, avoid false detection of MPTCP blackholing Previous releases - always broken: - mptcp: handle fastopen disconnect correctly - xfrm: - make sure skb->sk is a full sock before accessing its fields - fix taking a lock with preempt disabled for RT kernels - usb: ipheth: improve safety of packet metadata parsing; prevent potential OOB accesses - eth: renesas: fix missing rtnl lock in suspend/resume path" * tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits) MAINTAINERS: add Neal to TCP maintainers net: revert RTNL changes in unregister_netdevice_many_notify() net: hsr: fix fill_frame_info() regression vs VLAN packets doc: mptcp: sysctl: blackhole_timeout is per-netns mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted netfilter: nf_tables: reject mismatching sum of field_len with set key length net: sh_eth: Fix missing rtnl lock in suspend/resume path net: ravb: Fix missing rtnl lock in suspend/resume path selftests/net: Add test for loading devbound XDP program in generic mode net: xdp: Disallow attaching device-bound programs in generic mode tcp: correct handling of extreme memory squeeze bgmac: reduce max frame size to support just MTU 1500 vsock/test: Add test for connect() retries vsock/test: Add test for UAF due to socket unbinding vsock/test: Introduce vsock_connect_fd() vsock/test: Introduce vsock_bind() vsock: Allow retrying on connect() failure vsock: Keep the binding until socket destruction Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming ...	2025-01-30 12:24:20 -08:00
Linus Torvalds	90cb220062	gpio fixes for v6.14-rc1 - update gpio-sim selftests to not fail now that we no longer allow rmdir() on configfs entries of active devices - remove leftover code from gpio-mxc -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmebRzwACgkQEacuoBRx 13KUEQ/+KAAVHAr3l3bFG9POniRk6fZB2kyWeOUvGGmW1qBHjRF50wKsV0+EjhEi 0jm5HX92tvqTfci/jCkZbw0eIZ72P9QXqgBKo+Hare+7Q0bX/eddBgjNTyaAyjLW VHwOoP3M7jdsb5vBmgMOIbsJHQ8Q3TscSN5ndFfQ04qZ0ahAczV86rQQRiFaxhle rr8XrICLlOnLS9YV4yYgJhNulIMvhLhcz70gyYNU6UOfEHNm3ZblBgoOWrPAvd+d 3lZ+4ZDNaAPyBhW2a64FDs1e+/9VpFfrp3CNGecM8tV8swd33atpeNsClPqFPJYR pLir57iVbH3FiE76PdifR128toaH3+pSrPNuL7Gi2oVsdA5aN2l0y+YuXIFgpP+7 /FCLKEfcY1G/Kp9nai0FG3ltAAkbaULi06dYdgXAC+xv391j1p9EI9iz9ZifpAA8 I1Apa36E+MDgku52eiL16isNsLESE3ky/URsf8zLe8bRUEKCF6YsVoOLe7RDl5CQ fDgJx7xALIm8vk2jizmTmMIDVgY5leLgXS4pVHvEVBAg8zZe0OH5SRyYan82pv6t asw4RfEu7J7l5k5fgqWf2p6H4cpU2SuI35YdoTjYReGbExabnkylZTyoRjkGLX7o NqIktzN4nJzyGM0VAkOVnj5RKTlvCKSVLg0Qk3srXeG42qJ3Cqw= =Ql6L -----END PGP SIGNATURE----- Merge tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - update gpio-sim selftests to not fail now that we no longer allow rmdir() on configfs entries of active devices - remove leftover code from gpio-mxc * tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: selftests: gpio: gpio-sim: Fix missing chip disablements gpio: mxc: remove dead code after switch to DT-only	2025-01-30 10:19:30 -08:00
Linus Torvalds	d3d90cc289	Provide stable parent and name to ->d_revalidate() instances Most of the filesystem methods where we care about dentry name and parent have their stability guaranteed by the callers; ->d_revalidate() is the major exception. It's easy enough for callers to supply stable values for expected name and expected parent of the dentry being validated. That kills quite a bit of boilerplate in ->d_revalidate() instances, along with a bunch of races where they used to access ->d_name without sufficient precautions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZ5gkoQAKCRBZ7Krx/gZQ 6w9FAP4nyxNNWMjE1TwuWR/DNDMYYuw/qn/miZ88B5BUM8hzqgD/W2SjRvcbSaIm xSIYpbtKgtqNU34P1PU+dBvL8Utz2AE= =TWY8 -----END PGP SIGNATURE----- Merge tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs d_revalidate updates from Al Viro: "Provide stable parent and name to ->d_revalidate() instances Most of the filesystem methods where we care about dentry name and parent have their stability guaranteed by the callers; ->d_revalidate() is the major exception. It's easy enough for callers to supply stable values for expected name and expected parent of the dentry being validated. That kills quite a bit of boilerplate in ->d_revalidate() instances, along with a bunch of races where they used to access ->d_name without sufficient precautions" * tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: 9p: fix ->rename_sem exclusion orangefs_d_revalidate(): use stable parent inode and name passed by caller ocfs2_dentry_revalidate(): use stable parent inode and name passed by caller nfs: fix ->d_revalidate() UAF on ->d_name accesses nfs{,4}_lookup_validate(): use stable parent inode passed by caller gfs2_drevalidate(): use stable parent inode and name passed by caller fuse_dentry_revalidate(): use stable parent inode and name passed by caller vfat_revalidate{,_ci}(): use stable parent inode passed by caller exfat_d_revalidate(): use stable parent inode passed by caller fscrypt_d_revalidate(): use stable parent inode passed by caller ceph_d_revalidate(): propagate stable name down into request encoding ceph_d_revalidate(): use stable parent inode passed by caller afs_d_revalidate(): use stable name and parent inode passed by caller Pass parent directory inode and expected name to ->d_revalidate() generic_ci_d_compare(): use shortname_storage ext4 fast_commit: make use of name_snapshot primitives dissolve external_name.u into separate members make take_dentry_name_snapshot() lockless dcache: back inline names with a struct-wrapped array of unsigned long make sure that DNAME_INLINE_LEN is a multiple of word size	2025-01-30 09:13:35 -08:00
Toke Høiland-Jørgensen	f7bf624b1f	selftests/net: Add test for loading devbound XDP program in generic mode Add a test to bpf_offload.py for loading a devbound XDP program in generic mode, checking that it fails correctly. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250127131344.238147-2-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-29 19:04:23 -08:00
Michal Luczaj	4695f64e02	vsock/test: Add test for connect() retries Deliberately fail a connect() attempt; expect error. Then verify that subsequent attempt (using the same socket) can still succeed, rather than fail outright. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-6-1cf57065b770@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-29 18:50:37 -08:00
Michal Luczaj	301a62dfb0	vsock/test: Add test for UAF due to socket unbinding Fail the autobind, then trigger a transport reassign. Socket might get unbound from unbound_sockets, which then leads to a reference count underflow. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-5-1cf57065b770@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-29 18:50:37 -08:00
Michal Luczaj	ac12b7e291	vsock/test: Introduce vsock_connect_fd() Distill timeout-guarded vsock_connect_fd(). Adapt callers. Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-4-1cf57065b770@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-29 18:50:37 -08:00
Michal Luczaj	852a00c428	vsock/test: Introduce vsock_bind() Add a helper for socket()+bind(). Adapt callers. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-3-1cf57065b770@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-29 18:50:37 -08:00
Jiayuan Chen	6fcfe96e0f	selftests/bpf: Add strparser test for bpf Add test cases for bpf + strparser and separated them from sockmap_basic, as strparser has more encapsulation and parsing capabilities compared to standard sockmap. Signed-off-by: Jiayuan Chen <mrpre@163.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://patch.msgid.link/20250122100917.49845-6-mrpre@163.com	2025-01-29 13:32:48 -08:00
Jiayuan Chen	a0c1114950	selftests/bpf: Fix invalid flag of recv() SOCK_NONBLOCK flag is only effective during socket creation, not during recv. Use MSG_DONTWAIT instead. Signed-off-by: Jiayuan Chen <mrpre@163.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://patch.msgid.link/20250122100917.49845-5-mrpre@163.com	2025-01-29 13:32:40 -08:00
Linus Torvalds	9071080d1e	cxl changes for v6.14 - Move HMAT printouts to pr_debug() - Add CXL type2 support to cxl_dvsec_rr_decode() in preparation for type2 support - A series that updates CXL event records to spec r3.1 and related changes - Refactoring of cxl_find_regblock_instance() to count regblocks -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmeaUgcACgkQYGjFFmlT OEqheg/+Iz7d1uBmyEcgf9JX0XAoPuG7uAPQcgSDK1u4gG0WxuBn+rc4Vprukml5 0wn1n6LVQg3lkQp9oNQKwEMt1es7Y2RDOaPd17eiBxFoBPkmHIF8fl2nhoLzCSco j23kdKsySW5wcT7FdppVT0cRdMQM/9ecJhR7mribdIxNwmPP2D7knnOYjytQjRFp C65EGZ97cn8CAJt7zdBRLNcgzJc4iVfT7jyRbtFvA/Bx4WU7qSCiwh8c2KZ4U3wE piCoMnsv/RwMjbGZNmEt7dTC3RIawWajGchWJ9aolXYidWm6sxpx6JOgzaGWJK5q 19fsb978SnqA6tr4qNmTgohl20gkRiqQZnYWZoKlyIZeBA+IqMdpl0YosSVNvS85 kBlau9eMmZVIxJlHevQ3+5NBCzzXz2rLnORAJUUCgHnuWeKU/+t31xK2lfbMJet9 Q6wkJC4kz1u6xLBP1aEnkVtQE0RY1UmUw4aU0L3jiLF6IGRF9wBRW/bFBKTEYLul 7cL7R5rwFxIwvR1uNIQVVnSTUzHUeA1brQS1I6QJP2SNzVl2uew1Xlx6KrLt3kQw zOuQFAvMa3L/HWjfDJYvcxfzhYN2ng3ec3XjyGN75S5jrb7a0PZ5vOmxZ6PFvEZE 8HFHfQqvpdO0e1SLWgLv7OBENM3GgLovbeSbEhQOlxzjx7QEsFc= =Tavn -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: "A tweak to the HMAT output that was acked by Rafael, a prep patch for CXL type2 devices support that's coming soon, refactoring of the CXL regblock enumeration code, and a series of patches to update the event records to CXL spec r3.1: - Move HMAT printouts to pr_debug() - Add CXL type2 support to cxl_dvsec_rr_decode() in preparation for type2 support - A series that updates CXL event records to spec r3.1 and related changes - Refactoring of cxl_find_regblock_instance() to count regblocks" * tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/regs: Refactor out functions to count regblocks of given type cxl/test: Update test code for event records to CXL spec rev 3.1 cxl/events: Update Memory Module Event Record to CXL spec rev 3.1 cxl/events: Update DRAM Event Record to CXL spec rev 3.1 cxl/events: Update General Media Event Record to CXL spec rev 3.1 cxl/events: Add Component Identifier formatting for CXL spec rev 3.1 cxl/events: Update Common Event Record to CXL spec rev 3.1 cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode() ACPI/HMAT: Move HMAT messages to pr_debug()	2025-01-29 11:23:22 -08:00
Shigeru Yoshida	c7f2188d68	selftests/bpf: Adjust data size to have ETH_HLEN The function bpf_test_init() now returns an error if user_size (.data_size_in) is less than ETH_HLEN, causing the tests to fail. Adjust the data size to ensure it meets the requirement of ETH_HLEN. Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250121150643.671650-2-syoshida@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-29 08:51:51 -08:00
Linus Torvalds	2ab002c755	Driver core and debugfs updates Here is the big set of driver core and debugfs updates for 6.14-rc1. It's coming late in the merge cycle as there are a number of merge conflicts with your tree now, and I wanted to make sure they were working properly. To resolve them, look in linux-next, and I will send the "fixup" patch as a response to the pull request. Included in here is a bunch of driver core, PCI, OF, and platform rust bindings (all acked by the different subsystem maintainers), hence the merge conflict with the rust tree, and some driver core api updates to mark things as const, which will also require some fixups due to new stuff coming in through other trees in this merge window. There are also a bunch of debugfs updates from Al, and there is at least one user that does have a regression with these, but Al is working on tracking down the fix for it. In my use (and everyone else's linux-next use), it does not seem like a big issue at the moment. Here's a short list of the things in here: - driver core bindings for PCI, platform, OF, and some i/o functions. We are almost at the "write a real driver in rust" stage now, depending on what you want to do. - misc device rust bindings and a sample driver to show how to use them - debugfs cleanups in the fs as well as the users of the fs api for places where drivers got it wrong or were unnecessarily doing things in complex ways. - driver core const work, making more of the api take const * for different parameters to make the rust bindings easier overall. - other small fixes and updates All of these have been in linux-next with all of the aforementioned merge conflicts, and the one debugfs issue, which looks to be resolved "soon". Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ5koPA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymFHACfT5acDKf2Bov2Lc/5u3vBW/R6ChsAnj+LmgVI hcDSPodj4szR40RRnzBd =u5Ey -----END PGP SIGNATURE----- Merge tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the big set of driver core and debugfs updates for 6.14-rc1. Included in here is a bunch of driver core, PCI, OF, and platform rust bindings (all acked by the different subsystem maintainers), hence the merge conflict with the rust tree, and some driver core api updates to mark things as const, which will also require some fixups due to new stuff coming in through other trees in this merge window. There are also a bunch of debugfs updates from Al, and there is at least one user that does have a regression with these, but Al is working on tracking down the fix for it. In my use (and everyone else's linux-next use), it does not seem like a big issue at the moment. Here's a short list of the things in here: - driver core rust bindings for PCI, platform, OF, and some i/o functions. We are almost at the "write a real driver in rust" stage now, depending on what you want to do. - misc device rust bindings and a sample driver to show how to use them - debugfs cleanups in the fs as well as the users of the fs api for places where drivers got it wrong or were unnecessarily doing things in complex ways. - driver core const work, making more of the api take const * for different parameters to make the rust bindings easier overall. - other small fixes and updates All of these have been in linux-next with all of the aforementioned merge conflicts, and the one debugfs issue, which looks to be resolved "soon"" * tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) rust: device: Use as_char_ptr() to avoid explicit cast rust: device: Replace CString with CStr in property_present() devcoredump: Constify 'struct bin_attribute' devcoredump: Define 'struct bin_attribute' through macro rust: device: Add property_present() saner replacement for debugfs_rename() orangefs-debugfs: don't mess with ->d_name octeontx2: don't mess with ->d_parent or ->d_parent->d_name arm_scmi: don't mess with ->d_parent->d_name slub: don't mess with ->d_name sof-client-ipc-flood-test: don't mess with ->d_name qat: don't mess with ->d_name xhci: don't mess with ->d_iname mtu3: don't mess wiht ->d_iname greybus/camera - stop messing with ->d_iname mediatek: stop messing with ->d_iname netdevsim: don't embed file_operations into your structs b43legacy: make use of debugfs_get_aux() b43: stop embedding struct file_operations into their objects carl9170: stop embedding file_operations into their objects ...	2025-01-28 12:25:12 -08:00
Linus Torvalds	e2ee2e9b15	KVM/arm64 updates for 6.14 * New features: - Support for non-protected guest in protected mode, achieving near feature parity with the non-protected mode - Support for the EL2 timers as part of the ongoing NV support - Allow control of hardware tracing for nVHE/hVHE * Improvements, fixes and cleanups: - Massive cleanup of the debug infrastructure, making it a bit less awkward and definitely easier to maintain. This should pave the way for further optimisations - Complete rewrite of pKVM's fixed-feature infrastructure, aligning it with the rest of KVM and making the code easier to follow - Large simplification of pKVM's memory protection infrastructure - Better handling of RES0/RES1 fields for memory-backed system registers - Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer from a pretty nasty timer bug - Small collection of cleanups and low-impact fixes -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmeYqJcQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNLUhCACxUTMVQXhfW3qbh0UQxPd7XXvjI+Hm7SPS wDuVTle4jrFVGHxuZqtgWLmx8hD7bqO965qmFgbevKlwsRY33onH2nbH4i4AcwbA jcdM4yMHZI4+Qmnb4G5ZJ89IwjAhHPZTBOV5KRhyHQ/qtRciHHtOgJde7II9fd68 uIESg4SSSyUzI47YSEHmGVmiBIhdQhq2qust0m6NPFalEGYstPbpluPQ6R1CsDqK v14TIAW7t0vSPucBeODxhA5gEa2JsvNi+sqA+DF/ELH2ZqpkuR7rofgMGblaXCSD JXa5xamRB9dI5zi8vatwfOzYlog+/gzmPqMh/9JXpiDGHxJe0vlz =tQ8F -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull KVM/arm64 updates from Will Deacon: "New features: - Support for non-protected guest in protected mode, achieving near feature parity with the non-protected mode - Support for the EL2 timers as part of the ongoing NV support - Allow control of hardware tracing for nVHE/hVHE Improvements, fixes and cleanups: - Massive cleanup of the debug infrastructure, making it a bit less awkward and definitely easier to maintain. This should pave the way for further optimisations - Complete rewrite of pKVM's fixed-feature infrastructure, aligning it with the rest of KVM and making the code easier to follow - Large simplification of pKVM's memory protection infrastructure - Better handling of RES0/RES1 fields for memory-backed system registers - Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer from a pretty nasty timer bug - Small collection of cleanups and low-impact fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits) arm64/sysreg: Get rid of TRFCR_ELx SysregFields KVM: arm64: nv: Fix doc header layout for timers KVM: arm64: nv: Apply RESx settings to sysreg reset values KVM: arm64: nv: Always evaluate HCR_EL2 using sanitising accessors KVM: arm64: Fix selftests after sysreg field name update coresight: Pass guest TRFCR value to KVM KVM: arm64: Support trace filtering for guests KVM: arm64: coresight: Give TRBE enabled state to KVM coresight: trbe: Remove redundant disable call arm64/sysreg/tools: Move TRFCR definitions to sysreg tools: arm64: Update sysreg.h header files KVM: arm64: Drop pkvm_mem_transition for host/hyp donations KVM: arm64: Drop pkvm_mem_transition for host/hyp sharing KVM: arm64: Drop pkvm_mem_transition for FF-A KVM: arm64: Explicitly handle BRBE traps as UNDEFINED KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe() arm64: kvm: Introduce nvhe stack size constants KVM: arm64: Fix nVHE stacktrace VA bits mask KVM: arm64: Fix FEAT_MTE in pKVM Documentation: Update the behaviour of "kvm-arm.mode" ...	2025-01-28 09:01:36 -08:00
Linus Torvalds	13845bdc86	Char/Misc/IIO driver updates for 6.14-rc1 Here is the "big" set of char/misc/iio and other smaller driver subsystem updates for 6.14-rc1. Loads of different things in here this development cycle, highlights are: - ntsync "driver" to handle Windows locking types enabling Wine to work much better on many workloads (i.e. games). The driver framework was in 6.13, but now it's enabled and fully working properly. Should make many SteamOS users happy. Even comes with tests! - Large IIO driver updates and bugfixes - FPGA driver updates - Coresight driver updates - MHI driver updates - PPS driver updatesa - const bin_attribute reworking for many drivers - binder driver updates - smaller driver updates and fixes All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ5fGOQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynatACeLlbkhUT544Va1eOL2TkjfcGxrZUAoJ3ymGC0 y0N7/+fWL6aS+b4sEilv =TU0D -----END PGP SIGNATURE----- Merge tag 'char-misc-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull Char/Misc/IIO driver updates from Greg KH: "Here is the "big" set of char/misc/iio and other smaller driver subsystem updates for 6.14-rc1. Loads of different things in here this development cycle, highlights are: - ntsync "driver" to handle Windows locking types enabling Wine to work much better on many workloads (i.e. games). The driver framework was in 6.13, but now it's enabled and fully working properly. Should make many SteamOS users happy. Even comes with tests! - Large IIO driver updates and bugfixes - FPGA driver updates - Coresight driver updates - MHI driver updates - PPS driver updatesa - const bin_attribute reworking for many drivers - binder driver updates - smaller driver updates and fixes All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits) ntsync: Fix reference leaks in the remaining create ioctls. spmi: hisi-spmi-controller: Drop duplicated OF node assignment in spmi_controller_probe() spmi: Set fwnode for spmi devices ntsync: fix a file reference leak in drivers/misc/ntsync.c scripts/tags.sh: Don't tag usages of DECLARE_BITMAP dt-bindings: interconnect: qcom,msm8998-bwmon: Add SM8750 CPU BWMONs dt-bindings: interconnect: OSM L3: Document sm8650 OSM L3 compatible dt-bindings: interconnect: qcom-bwmon: Document QCS615 bwmon compatibles interconnect: sm8750: Add missing const to static qcom_icc_desc memstick: core: fix kernel-doc notation intel_th: core: fix kernel-doc warnings binder: log transaction code on failure iio: dac: ad3552r-hs: clear reset status flag iio: dac: ad3552r-common: fix ad3541/2r ranges iio: chemical: bme680: Fix uninitialized variable in __bme680_read_raw() misc: fastrpc: Fix copy buffer page size misc: fastrpc: Fix registered buffer page address misc: fastrpc: Deregister device nodes properly in error scenarios nvmem: core: improve range check for nvmem_cell_write() nvmem: qcom-spmi-sdam: Set size in struct nvmem_config ...	2025-01-27 16:51:51 -08:00
Jan Stancek	9b06d5b956	selftests: net/{lib,openvswitch}: extend CFLAGS to keep options from environment Package build environments like Fedora rpmbuild introduced hardening options (e.g. -pie -Wl,-z,now) by passing a -spec option to CFLAGS and LDFLAGS. Some Makefiles currently override CFLAGS but not LDFLAGS, which leads to a mismatch and build failure, for example: /usr/bin/ld: /tmp/ccd2apay.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIE /usr/bin/ld: failed to set dynamic section sizes: bad value collect2: error: ld returned 1 exit status make[1]: *** [../../lib.mk:222: tools/testing/selftests/net/lib/csum] Error 1 openvswitch/Makefile CFLAGS currently do not appear to be used, but fix it anyway for the case when new tests are introduced in future. Fixes: `1d0dc857b5` ("selftests: drv-net: add checksum tests") Signed-off-by: Jan Stancek <jstancek@redhat.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/3d173603ee258f419d0403363765c9f9494ff79a.1737635092.git.jstancek@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-27 14:45:27 -08:00
Jan Stancek	23b3a7c4a7	selftests: mptcp: extend CFLAGS to keep options from environment Package build environments like Fedora rpmbuild introduced hardening options (e.g. -pie -Wl,-z,now) by passing a -spec option to CFLAGS and LDFLAGS. mptcp Makefile currently overrides CFLAGS but not LDFLAGS, which leads to a mismatch and build failure, for example: make[1]: *** [../../lib.mk:222: tools/testing/selftests/net/mptcp/mptcp_sockopt] Error 1 /usr/bin/ld: /tmp/ccqyMVdb.o: relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a PIE object; recompile with -fPIE /usr/bin/ld: failed to set dynamic section sizes: bad value collect2: error: ld returned 1 exit status Fixes: `cc937dad85` ("selftests: centralize -D_GNU_SOURCE= to CFLAGS in lib.mk") Signed-off-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/7abc701da9df39c2d6cd15bc3cf9e6cee445cb96.1737621162.git.jstancek@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-27 14:45:19 -08:00
Jakub Kicinski	50bf398e1c	net: netdevsim: try to close UDP port harness races syzbot discovered that we remove the debugfs files after we free the netdev. Try to clean up the relevant dir while the device is still around. Reported-by: syzbot+2e5de9e3ab986b71d2bf@syzkaller.appspotmail.com Fixes: `424be63ad8` ("netdevsim: add UDP tunnel port offload support") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250122224503.762705-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-27 14:24:34 -08:00
Mickaël Salaün	38567b972a	selftests: Handle old glibc without execveat(2) Add an execveat(2) wrapper because glibc < 2.34 does not have one. This fixes the check-exec tests and samples. Cc: Günther Noack <gnoack@google.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Kees Cook <kees@kernel.org> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Stefan Berger <stefanb@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/r/20250114205645.GA2825031@ax162 Signed-off-by: Mickaël Salaün <mic@digikod.net> Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250115144753.311152-1-mic@digikod.net Signed-off-by: Kees Cook <kees@kernel.org>	2025-01-27 11:37:18 -08:00
Andrea Righi	3c7d51b0d2	sched_ext: selftests/dsp_local_on: Fix selftest on UP systems In UP systems p->migration_disabled is not available. Fix this by using the portable helper is_migration_disabled(p). Fixes: `e9fe182772` ("sched_ext: selftests/dsp_local_on: Fix sporadic failures") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-27 09:00:09 -10:00
Madhavan Srinivasan	28aecef5b1	selftests: livepatch: handle PRINTK_CALLER in check_result() Some arch configs (like ppc64) enable CONFIG_PRINTK_CALLER, which adds the caller id as part of the dmesg. With recent util-linux's update 467a5b3192f16 ('dmesg: add caller_id support') the standard "dmesg" has been enhanced to print PRINTK_CALLER fields. Due to this, even though the expected vs observed are same, end testcase results are failed. -% insmod test_modules/test_klp_livepatch.ko -livepatch: enabling patch 'test_klp_livepatch' -livepatch: 'test_klp_livepatch': initializing patching transition -livepatch: 'test_klp_livepatch': starting patching transition -livepatch: 'test_klp_livepatch': completing patching transition -livepatch: 'test_klp_livepatch': patching complete -% echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled -livepatch: 'test_klp_livepatch': initializing unpatching transition -livepatch: 'test_klp_livepatch': starting unpatching transition -livepatch: 'test_klp_livepatch': completing unpatching transition -livepatch: 'test_klp_livepatch': unpatching complete -% rmmod test_klp_livepatch +[ T3659] % insmod test_modules/test_klp_livepatch.ko +[ T3682] livepatch: enabling patch 'test_klp_livepatch' +[ T3682] livepatch: 'test_klp_livepatch': initializing patching transition +[ T3682] livepatch: 'test_klp_livepatch': starting patching transition +[ T826] livepatch: 'test_klp_livepatch': completing patching transition +[ T826] livepatch: 'test_klp_livepatch': patching complete +[ T3659] % echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled +[ T3659] livepatch: 'test_klp_livepatch': initializing unpatching transition +[ T3659] livepatch: 'test_klp_livepatch': starting unpatching transition +[ T789] livepatch: 'test_klp_livepatch': completing unpatching transition +[ T789] livepatch: 'test_klp_livepatch': unpatching complete +[ T3659] % rmmod test_klp_livepatch ERROR: livepatch kselftest(s) failed not ok 1 selftests: livepatch: test-livepatch.sh # exit=1 Currently the check_result() handles the "[time]" removal from the dmesg. Enhance the check to also handle removal of "[Thread Id]" or "[CPU Id]". Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250119163238.749847-1-maddy@linux.ibm.com Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-01-27 11:41:28 +01:00
Linus Torvalds	9c5968db9e	The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec. - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones. - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest. - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code. - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups. - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code. - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c. - "mm/page_alloc: gfp flags cleanups for alloc_contig_()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator. - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading. - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated (https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/). Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED). - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled. - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL. - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests. - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size. - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic. - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated. - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated. - "move pagetable__dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed. - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic. - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy. - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic. - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions. - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed. - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting. - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface. - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior. - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi "introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors." - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram. - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal. - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance. - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation. - "Cleanup for memfd_create()" from Isaac Manjarres does that thing. - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration. - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices. - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5a+cwAKCRDdBJ7gKXxA jtoyAP9R58oaOKPJuTizEKKXvh/RpMyD6sYcz/uPpnf+cKTZxQEAqfVznfWlw/Lz uC3KRZYhmd5YrxU4o+qjbzp9XWX/xAE= =Ib2s -----END PGP SIGNATURE----- Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c - "mm/page_alloc: gfp flags cleanups for alloc_contig_()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated: https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/ Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED) - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated - "move pagetable__dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation - "Cleanup for memfd_create()" from Isaac Manjarres does that thing - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests" * tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm/compaction: fix UBSAN shift-out-of-bounds warning s390/mm: add missing ctor/dtor on page table upgrade kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags() tools: add VM_WARN_ON_VMG definition mm/damon/core: use str_high_low() helper in damos_wmark_wait_us() seqlock: add missing parameter documentation for raw_seqcount_try_begin() mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh mm/page_alloc: remove the incorrect and misleading comment zram: remove zcomp_stream_put() from write_incompressible_page() mm: separate move/undo parts from migrate_pages_batch() mm/kfence: use str_write_read() helper in get_access_type() selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags() selftests/mm: virtual_address_range: avoid reading from VM_IO mappings selftests/mm: vm_util: split up /proc/self/smaps parsing selftests/mm: virtual_address_range: unmap chunks after validation selftests/mm: virtual_address_range: mmap() without PROT_WRITE selftests/memfd/memfd_test: fix possible NULL pointer dereference mm: add FGP_DONTCACHE folio creation flag mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue ...	2025-01-26 18:36:23 -08:00
Linus Torvalds	c159dfbdd4	Mainly individually changelogged singleton patches. The patch series in this pull are: - "lib min_heap: Improve min_heap safety, testing, and documentation" from Kuan-Wei Chiu provides various tightenings to the min_heap library code. - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some cleanup and Rust preparation in the xarray library code. - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes pathnames in some code comments. - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the new secs_to_jiffies() in various places where that is appropriate. - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen switches two filesystems to the new mount API. - "Convert ocfs2 to use folios" from Matthew Wilcox does that. - "Remove get_task_comm() and print task comm directly" from Yafang Shao removes now-unneeded calls to get_task_comm() in various places. - "squashfs: reduce memory usage and update docs" from Phillip Lougher implements some memory savings in squashfs and performs some maintainability work. - "lib: clarify comparison function requirements" from Kuan-Wei Chiu tightens the sort code's behaviour and adds some maintenance work. - "nilfs2: protect busy buffer heads from being force-cleared" from Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a corrupted image. - "nilfs2: fix kernel-doc comments for function return values" from Ryusuke Konishi fixes some nilfs kerneldoc. - "nilfs2: fix issues with rename operations" from Ryusuke Konishi addresses some nilfs BUG_ONs which syzbot was able to trigger. - "minmax.h: Cleanups and minor optimisations" from David Laight does some maintenance work on the min/max library code. - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work on the xarray library code. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5SP5QAKCRDdBJ7gKXxA jqN7AQChvwXGG43n4d5SDiA/rH7ddvowQcDqhC9cAMJ1ReR7qwEA8/LIWDE4PdMX mJnaZ1/ibpEpearrChCViApQtcyEGQI= =ti4E -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Mainly individually changelogged singleton patches. The patch series in this pull are: - "lib min_heap: Improve min_heap safety, testing, and documentation" from Kuan-Wei Chiu provides various tightenings to the min_heap library code - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some cleanup and Rust preparation in the xarray library code - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes pathnames in some code comments - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the new secs_to_jiffies() in various places where that is appropriate - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen switches two filesystems to the new mount API - "Convert ocfs2 to use folios" from Matthew Wilcox does that - "Remove get_task_comm() and print task comm directly" from Yafang Shao removes now-unneeded calls to get_task_comm() in various places - "squashfs: reduce memory usage and update docs" from Phillip Lougher implements some memory savings in squashfs and performs some maintainability work - "lib: clarify comparison function requirements" from Kuan-Wei Chiu tightens the sort code's behaviour and adds some maintenance work - "nilfs2: protect busy buffer heads from being force-cleared" from Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a corrupted image - "nilfs2: fix kernel-doc comments for function return values" from Ryusuke Konishi fixes some nilfs kerneldoc - "nilfs2: fix issues with rename operations" from Ryusuke Konishi addresses some nilfs BUG_ONs which syzbot was able to trigger - "minmax.h: Cleanups and minor optimisations" from David Laight does some maintenance work on the min/max library code - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work on the xarray library code" * tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits) ocfs2: use str_yes_no() and str_no_yes() helper functions include/linux/lz4.h: add some missing macros Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Xarray: remove repeat check in xas_squash_marks() Xarray: distinguish large entries correctly in xas_split_alloc() Xarray: move forward index correctly in xas_pause() Xarray: do not return sibling entries from xas_find_marked() ipc/util.c: complete the kernel-doc function descriptions gcov: clang: use correct function param names latencytop: use correct kernel-doc format for func params minmax.h: remove some #defines that are only expanded once minmax.h: simplify the variants of clamp() minmax.h: move all the clamp() definitions after the min/max() ones minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp() minmax.h: reduce the #define expansion of min(), max() and clamp() minmax.h: update some comments minmax.h: add whitespace around operators and after commas nilfs2: do not update mtime of renamed directory that is not moved nilfs2: handle errors that nilfs_prepare_chunk() may return CREDITS: fix spelling mistake ...	2025-01-26 17:50:53 -08:00
Suren Baghdasaryan	cf929a2863	tools: add VM_WARN_ON_VMG definition vma tests compilation yields the following error: vma.c:732:9: error: implicit declaration of function ‘VM_WARN_ON_VMG’ Fix it by adding missing VM_WARN_ON_VMG() definition. Link: https://lkml.kernel.org/r/20250116181538.759469-1-surenb@google.com Fixes: e3a7ae85f87c ("mm/debug: prefer VM_WARN_ON_VMG() to report VMG debug warnings") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:46 -08:00
liuye	7882d8fc8f	selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() Release memory before exception branch returns to prevent memory leaks Checking tools/testing/selftests/mm/mkdirty.c ... tools/testing/selftests/mm/mkdirty.c:283:3: error: Memory leak: src [memleak] return; ^ Link: https://lkml.kernel.org/r/20250114023838.48589-1-liuye@kylinos.cn Signed-off-by: liuye <liuye@kylinos.cn> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:45 -08:00
Thomas Weißschuh	3bd6137220	selftests/mm: virtual_address_range: avoid reading from VM_IO mappings The virtual_address_range selftest reads from the start of each mapping listed in /proc/self/maps. However not all mappings are valid to be arbitrarily accessed. For example the vvar data used for virtual clocks on x86 [vvar_vclock] can only be accessed if 1) the kernel configuration enables virtual clocks and 2) the hypervisor provided the data for it. Only the VDSO itself has the necessary information to know this. Since commit `e93d2521b2` ("x86/vdso: Split virtual clock pages into dedicated mapping") the virtual clock data was split out into its own mapping, leading to EFAULT from read() during the validation. Check for the VM_IO flag as a proxy. It is present for the VVAR mappings and MMIO ranges can be dangerous to access arbitrarily. Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-4-6fd7269934a5@linutronix.de Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202412271148.2656e485-lkp@intel.com Fixes: `e93d2521b2` ("x86/vdso: Split virtual clock pages into dedicated mapping") Fixes: `0104096498` ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Suggested-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/lkml/e97c2a5d-c815-4936-a767-ac42a3220a90@redhat.com/ Acked-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:44 -08:00
Thomas Weißschuh	3c479b5dc6	selftests/mm: vm_util: split up /proc/self/smaps parsing Upcoming changes want to reuse the /proc/self/smaps parsing logic to parse the VmFlags field. As that works differently from the currently parsed HugePage counters, split up the logic so common functionality can be shared. While reworking this code, also use the correct sscanf placeholder for the "uint64_t thp" variable. Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-3-6fd7269934a5@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: kernel test robot <oliver.sang@intel.com> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:44 -08:00
Thomas Weißschuh	b2a79f6213	selftests/mm: virtual_address_range: unmap chunks after validation For each accessed chunk a PTE is created. More than 1GiB of PTEs is used in this way. Remove each PTE after validating a chunk to reduce peak memory usage. It is important to only unmap memory that previously mmap()ed, as unmapping other mappings like the stack, heap or executable mappings will crash the process. The mappings read from /proc/self/maps and the return values from mmap() don't allow a simple correlation due to merging and no guaranteed order. To correlate the pointers and mappings use prctl(PR_SET_VMA_ANON_NAME). While it introduces a test dependency, other alternatives would introduce runtime or development overhead. Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-2-6fd7269934a5@linutronix.de Fixes: `0104096498` ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: kernel test robot <oliver.sang@intel.com> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:44 -08:00
Thomas Weißschuh	a005145b9c	selftests/mm: virtual_address_range: mmap() without PROT_WRITE Patch series "selftests/mm: virtual_address_range: Reduce memory", v4. The selftest started failing since commit `e93d2521b2` ("x86/vdso: Split virtual clock pages into dedicated mapping") was merged. While debugging I stumbled upon some memory usage optimizations. With these test now runs on a VM with only 60MiB of memory. This patch (of 4): When mapping a larger chunk than physical memory is available with PROT_WRITE and overcommit is disabled, the mapping will fail. This will prevent the test from running on systems with less then ~1GiB of memory and triggering an inscrutinable test failure. As the mappings are never written to anyways, the flag can be removed. Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-0-6fd7269934a5@linutronix.de Link: https://lkml.kernel.org/r/20250114-virtual_address_range-tests-v4-1-6fd7269934a5@linutronix.de Fixes: `4e5ce33ceb` ("selftests/vm: add a test for virtual address range mapping") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Dev Jain <dev.jain@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: kernel test robot <oliver.sang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:44 -08:00
liuye	73519ded99	selftests/memfd/memfd_test: fix possible NULL pointer dereference If `name' is NULL, a NULL pointer may be accessed in printf. Link: https://lkml.kernel.org/r/20250114032115.58638-1-liuye@kylinos.cn Signed-off-by: liuye <liuye@kylinos.cn> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Greg Thelen <gthelen@google.com> Cc: "Isaac J. Manjarres" <isaacmanjarres@google.com> Cc: Jeff Xu <jeffxu@google.com> Cc: Saurav Shah <sauravshah.31@gmail.com> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:44 -08:00
Hao Ge	bf069012df	selftests/mm/cow: modify the incorrect checking parameters In run_with_memfd_hugetlb(), some error handle have passed incorrect parameters. It should be "smem", but it was mistakenly written as "mem". Let's fix it. [gehao@kylinos.cn: fix other errant sites, per Anshuman] Link: https://lkml.kernel.org/r/20250113050908.93638-1-hao.ge@linux.dev Link: https://lkml.kernel.org/r/20250113032858.63670-1-hao.ge@linux.dev Fixes: `f8664f3c4a` ("selftests/vm: cow: basic COW tests for non-anonymous pages") Signed-off-by: Hao Ge <gehao@kylinos.cn> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:41 -08:00
Zi Yan	dfe61db4a1	selftests/mm: add tests for splitting pmd THPs to all lower orders Kernel already supports splitting a folio to any lower order. Test it. [ziy@nvidia.com: no need to test splitting to order-1] Link: https://lkml.kernel.org/r/DDA202EA-4664-4F50-A7FD-B00CBB7A624B@nvidia.com Link: https://lkml.kernel.org/r/20250110235028.96824-2-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Alexander Zhu <alexlzhu@fb.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Usama Arif <usamaarif642@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:40 -08:00
Zi Yan	136c5b40e0	selftests/mm: use selftests framework to print test result Otherwise the number of tests does not match the reality. Link: https://lkml.kernel.org/r/20250110235028.96824-1-ziy@nvidia.com Fixes: `391e869711` ("mm: selftest to verify zero-filled pages are mapped to zeropage") Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Alexander Zhu <alexlzhu@fb.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Usama Arif <usamaarif642@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:40 -08:00
Lorenzo Stoakes	f8d4a6cabb	mm: make mmap_region() internal Now that we have removed the one user of mmap_region() outside of mm, make it internal and add it to vma.c so it can be userland tested. This ensures that all external memory mappings are performed using the appropriate interfaces and allows us to modify memory mapping logic as we see fit. Additionally expand test stubs to allow for the mmap_region() code to compile and be userland testable. Link: https://lkml.kernel.org/r/de5a3c574d35c26237edf20a1d8652d7305709c9.1735819274.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:38 -08:00
Ryan Roberts	b2466bb3b4	selftests/mm: introduce uffd-wp-mremap regression test Introduce a test that registers a range of memory for UFFDIO_WRITEPROTECT_MODE_WP without UFFD_FEATURE_EVENT_REMAP. First check that the uffd-wp bit is set for every PTE in the range. Then mremap() the range to a new location and check that the uffd-wp bit is clear for every PTE in the range. Run the test for small folios, all supported THP sizes and all supported hugetlb sizes, and for swapped out memory, shared and private. There was previously a bug in the kernel where the uffd-wp bits remained set in all PTEs for this case, after fixing the kernel, the tests all pass. Link: https://lkml.kernel.org/r/20250107144755.1871363-3-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:31 -08:00
SeongJae Park	d8a142058f	kunit: configs: remove configs for DAMON debugfs interface tests It's time to remove DAMON debugfs interface, which has deprecated long before in February 2023. Read the cover letter of this patch series for more details. Remove kernel configs for running DAMON debugfs interface kunit tests from the kunit all_tests configuration, to prevent unnecessary noises from tests. Link: https://lkml.kernel.org/r/20250106191941.107070-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Alex Shi <alexs@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Hu Haowen <2023002089@link.tyut.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:29 -08:00
SeongJae Park	859de14931	selftests/damon: remove tests for DAMON debugfs interface It's time to remove DAMON debugfs interface, which has deprecated long before in February 2023. Read the cover letter of this patch series for more details. Remove selftests for the interface, to prevent causing unnecessary test failures. Link: https://lkml.kernel.org/r/20250106191941.107070-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Alex Shi <alexs@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Hu Haowen <2023002089@link.tyut.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:29 -08:00
SeongJae Park	ce1d750282	selftests/damon/config: remove configs for DAMON debugfs interface selftests It's time to remove DAMON debugfs interface, which has deprecated long before in February 2023. Read the cover letter of this patch series for more details. Remove configs for selftests of it from DAMON selftests config file, to prevent unnecessary noises from the tests. [1] https://lore.kernel.org/20230209192009.7885-1-sj@kernel.org Link: https://lkml.kernel.org/r/20250106191941.107070-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Alex Shi <alexs@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Hu Haowen <2023002089@link.tyut.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:29 -08:00
Donet Tom	901083d8f5	selftests/mm: add new test cases to the migration test Added three new test cases to the migration tests: 1. Shared anon THP migration test This test will mmap shared anon memory, madvise it to MADV_HUGEPAGE, then do migration entry testing. One thread will move pages back and forth between nodes whilst other threads try and access them. 2. Private anon hugetlb migration test This test will mmap private anon hugetlb memory and then do the migration entry testing. 3. Shared anon hugetlb migration test This test will mmap shared anon hugetlb memory and then do the migration entry testing. Test results ============ # ./tools/testing/selftests/mm/migration TAP version 13 1..6 # Starting 6 tests from 1 test cases. # RUN migration.private_anon ... # OK migration.private_anon ok 1 migration.private_anon # RUN migration.shared_anon ... # OK migration.shared_anon ok 2 migration.shared_anon # RUN migration.private_anon_thp ... # OK migration.private_anon_thp ok 3 migration.private_anon_thp # RUN migration.shared_anon_thp ... # OK migration.shared_anon_thp ok 4 migration.shared_anon_thp # RUN migration.private_anon_htlb ... # OK migration.private_anon_htlb ok 5 migration.private_anon_htlb # RUN migration.shared_anon_htlb ... # OK migration.shared_anon_htlb ok 6 migration.shared_anon_htlb # PASSED: 6 / 6 tests passed. # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0 # Link: https://lkml.kernel.org/r/20241219102720.4487-1-donettom@linux.ibm.com Signed-off-by: Donet Tom <donettom@linux.ibm.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:21 -08:00
Lorenzo Stoakes	7e8c8fd348	tools: testing: add simple __mmap_region() userland test Introduce demonstrative, basic, __mmap_region() test upon which we can base further work upon moving forwards. This simply asserts that mappings can be made and merges occur as expected. As part of this change, fix the security_vm_enough_memory_mm() stub which was previously incorrectly implemented. Link: https://lkml.kernel.org/r/20241213162409.41498-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:18 -08:00
Linus Torvalds	647d69605c	pci-v6.14-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmeTr8wUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vxrMw//TJXH+U6x5LhYvBPD/KZ20ecGHqaA eGXrbHAasYbU1CfW7HM0onR8NffOIGoYvQrtefjQAln0w6rTvyFO0xJKLP15vMfN hnj+y1WWtKwAkSpu10Cl9nTj8uYRNNSQeoy5kS+1diwuXdby/DlgQONO2APSe9zd KMPXJcqSfDJlM5zHrcqqtlxauO9KHInLCc/iutd85AKjvcjOoNHNeZE0pTC0C3gE sXYHDqJiS3zdEG6X6mWFo3OzI/Q/7NGlHJ2j0CQaObsgQ9yA7eWkez25ifwZcugc TPtjm8DhaDo9/zx0NV9c2dPauHRC6NYUjAflMPK7Aye/41BE1Ag5Ka+tMDgC2i/N TbfBxSeArhjnjY+eZwRhrJNNC58TtHTUs69TO7Dbmuwr7cp99MIEDAYI5V6LFAdk plKqn1h8FztW5QKRPCgmzy6KTE+WPytiGAGAQFxzIGYkV/QqyvFaVs8FIyOJUIFM aDSa6Xy5WLGxmPZ9hPapzEm4ws/HTRpFjNgi/d4rRG5RWMwAxZZa44s9eldhN1D/ ZwEmF2rJ+U8S7Q+mXPHDlwcsHe5APCbiaTEp4X+e3LNe0i9oxhhaUWG6LrDDmTlQ tU5j5daHiBa0nTDL1lfaayJlYX/oJ+IYQrIYzGbnivZv4ZVdPnuWSsOsMOEiLhEt 4QqCoanqf0mCn2A= =O62O -----END PGP SIGNATURE----- Merge tag 'pci-v6.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Batch sizing of multiple BARs while memory decoding is disabled instead of disabling/enabling decoding for each BAR individually; this optimizes virtualized environments where toggling decoding enable is expensive (Alex Williamson) - Add host bridge .enable_device() and .disable_device() hooks for bridges that need to configure things like Requester ID to StreamID mapping when enabling devices (Frank Li) - Extend struct pci_ecam_ops with .enable_device() and .disable_device() hooks so drivers that use pci_host_common_probe() instead of their own .probe() have a way to set the .enable_device() callbacks (Marc Zyngier) - Drop 'No bus range found' message so we don't complain when DTs don't specify the default 'bus-range = <0x00 0xff>' (Bjorn Helgaas) - Rename the drivers/pci/of_property.c struct of_pci_range to of_pci_range_entry to avoid confusion with the global of_pci_range in include/linux/of_address.h (Bjorn Helgaas) Driver binding: - Update resource request API documentation to encourage callers to supply a driver name when requesting resources (Philipp Stanner) - Export pci_intx_unmanaged() and pcim_intx() (always managed) so callers of pci_intx() (which is sometimes managed) can explicitly choose the one they need (Philipp Stanner) - Convert drivers from pci_intx() to always-managed pcim_intx() or never-managed pci_intx_unmanaged(): amd_sfh, ata (ahci, ata_piix, pata_rdc, sata_sil24, sata_sis, sata_uli, sata_vsc), bnx2x, bna, ntb, qtnfmac, rtsx, tifm_7xx1, vfio, xen-pciback (Philipp Stanner) - Remove pci_intx_unmanaged() since pci_intx() is now always unmanaged and pcim_intx() is always managed (Philipp Stanner) Error handling: - Unexport pcie_read_tlp_log() to encourage drivers to use PCI core logging rather than building their own (Ilpo Järvinen) - Move TLP Log handling to its own file (Ilpo Järvinen) - Store number of supported End-End TLP Prefixes always so we can read the correct number of DWORDs from the TLP Prefix Log (Ilpo Järvinen) - Read TLP Prefixes in addition to the Header Log in pcie_read_tlp_log() (Ilpo Järvinen) - Add pcie_print_tlp_log() to consolidate printing of TLP Header and Prefix Log (Ilpo Järvinen) - Quirk the Intel Raptor Lake-P PIO log size to accommodate vendor BIOSes that don't configure it correctly (Takashi Iwai) ASPM: - Save parent L1 PM Substates config so when we restore it along with an endpoint's config, the parent info isn't junk (Jian-Hong Pan) Power management: - Avoid D3 for Root Ports on TUXEDO Sirius Gen1 with old BIOS because the system can't wake up from suspend (Werner Sembach) Endpoint framework: - Destroy the EPC device in devm_pci_epc_destroy(), which previously didn't call devres_release() (Zijun Hu) - Finish virtual EP removal in pci_epf_remove_vepf(), which previously caused a subsequent pci_epf_add_vepf() to fail with -EBUSY (Zijun Hu) - Write BAR_MASK before iATU registers in pci_epc_set_bar() so we don't depend on the BAR_MASK reset value being larger than the requested BAR size (Niklas Cassel) - Prevent changing BAR size/flags in pci_epc_set_bar() to prevent reads from bypassing the iATU if we reduced the BAR size (Niklas Cassel) - Verify address alignment when programming iATU so we don't attempt to write bits that are read-only because of the BAR size, which could lead to directing accesses to the wrong address (Niklas Cassel) - Implement artpec6 pci_epc_features so we can rely on all drivers supporting it so we can use it in EPC core code (Niklas Cassel) - Check for BARs of fixed size to prevent endpoint drivers from trying to change their size (Niklas Cassel) - Verify that requested BAR size is a power of two when endpoint driver sets the BAR (Niklas Cassel) Endpoint framework tests: - Clear pci-epf-test dma_chan_rx, not dma_chan_tx, after freeing dma_chan_rx (Mohamed Khalfella) - Correct the DMA MEMCPY test so it doesn't fail if the Endpoint supports both DMA_PRIVATE and DMA_MEMCPY (Manivannan Sadhasivam) - Add pci-epf-test and pci_endpoint_test support for capabilities (Niklas Cassel) - Add Endpoint test for consecutive BARs (Niklas Cassel) - Remove redundant comparison from Endpoint BAR test because a > 1MB BAR can always be exactly covered by iterating with a 1MB buffer (Hans Zhang) - Move and convert PCI Endpoint tests from tools/pci to Kselftests (Manivannan Sadhasivam) Apple PCIe controller driver: - Convert StreamID mapping configuration from a bus notifier to the .enable_device() and .disable_device() callbacks (Marc Zyngier) Freescale i.MX6 PCIe controller driver: - Add Requester ID to StreamID mapping configuration when enabling devices (Frank Li) - Use DWC core suspend/resume functions for imx6 (Frank Li) - Add suspend/resume support for i.MX8MQ, i.MX8Q, and i.MX95 (Richard Zhu) - Add DT compatible string 'fsl,imx8q-pcie-ep' and driver support for i.MX8Q series (i.MX8QM, i.MX8QXP, and i.MX8DXL) Endpoints (Frank Li) - Add DT binding for optional i.MX95 Refclk and driver support to enable it if the platform hasn't enabled it (Richard Zhu) - Configure PHY based on controller being in Root Complex or Endpoint mode (Frank Li) - Rely on dbi2 and iATU base addresses from DT via dw_pcie_get_resources() instead of hardcoding them (Richard Zhu) - Deassert apps_reset in imx_pcie_deassert_core_reset() since it is asserted in imx_pcie_assert_core_reset() (Richard Zhu) - Add missing reference clock enable or disable logic for IMX6SX, IMX7D, IMX8MM (Richard Zhu) - Remove redundant imx7d_pcie_init_phy() since imx7d_pcie_enable_ref_clk() does the same thing (Richard Zhu) Freescale Layerscape PCIe controller driver: - Simplify by using syscon_regmap_lookup_by_phandle_args() instead of syscon_regmap_lookup_by_phandle() followed by of_property_read_u32_array() (Krzysztof Kozlowski) Marvell MVEBU PCIe controller driver: - Add MODULE_DEVICE_TABLE() to enable module autoloading (Liao Chen) MediaTek PCIe Gen3 controller driver: - Use clk_bulk_prepare_enable() instead of separate clk_bulk_prepare() and clk_bulk_enable() (Lorenzo Bianconi) - Rearrange reset assert/deassert so they're both done in the _power_up() callbacks (Lorenzo Bianconi) - Document that Airoha EN7581 requires PHY init and power-on before PHY reset deassert, unlike other MediaTek Gen3 controllers (Lorenzo Bianconi) - Move Airoha EN7581 post-reset delay from the en7581 clock .enable() method to mtk_pcie_en7581_power_up() (Lorenzo Bianconi) - Sleep instead of delay during Airoha EN7581 power-up, since this is a non-atomic context (Lorenzo Bianconi) - Skip PERST# assertion on Airoha EN7581 during probe and suspend/resume to avoid a hardware defect (Lorenzo Bianconi) - Enable async probe to reduce system startup time (Douglas Anderson) Microchip PolarFlare PCIe controller driver: - Set up the inbound address translation based on whether the platform allows coherent or non-coherent DMA (Daire McNamara) - Update DT binding such that platforms are DMA-coherent by default and must specify 'dma-noncoherent' if needed (Conor Dooley) Mobiveil PCIe controller driver: - Convert mobiveil-pcie.txt to YAML and update 'interrupt-names' and 'reg-names' (Frank Li) Qualcomm PCIe controller driver: - Add DT SM8550 and SM8650 optional 'global' interrupt for link events (Neil Armstrong) - Add DT 'compatible' strings for IPQ5424 PCIe controller (Manikanta Mylavarapu) - If 'global' IRQ is supported for detection of Link Up events, tell DWC core not to wait for link up (Krishna chaitanya chundru) Renesas R-Car PCIe controller driver: - Avoid passing stack buffer as resource name (King Dix) Rockchip PCIe controller driver: - Simplify clock and reset handling by using bulk interfaces (Anand Moon) - Pass typed rockchip_pcie (not void) pointer to rockchip_pcie_disable_clocks() (Anand Moon) - Return -ENOMEM, not success, when pci_epc_mem_alloc_addr() fails (Dan Carpenter) Rockchip DesignWare PCIe controller driver: - Use dll_link_up IRQ to detect Link Up and enumerate devices so users don't have to manually rescan (Niklas Cassel) - Tell DWC core not to wait for link up since the 'sys' interrupt is required and detects Link Up events (Niklas Cassel) Synopsys DesignWare PCIe controller driver: - Don't wait for link up in DWC core if driver can detect Link Up event (Krishna chaitanya chundru) - Update ICC and OPP votes after Link Up events (Krishna chaitanya chundru) - Always stop link in dw_pcie_suspend_noirq(), which is required at least for i.MX8QM to re-establish link on resume (Richard Zhu) - Drop racy and unnecessary LTSSM state check before sending PME_TURN_OFF message in dw_pcie_suspend_noirq() (Richard Zhu) - Add struct of_pci_range.parent_bus_addr for devices that need their immediate parent bus address, not the CPU address, e.g., to program an internal Address Translation Unit (iATU) (Frank Li) TI DRA7xx PCIe controller driver: - Simplify by using syscon_regmap_lookup_by_phandle_args() instead of syscon_regmap_lookup_by_phandle() followed by of_parse_phandle_with_fixed_args() or of_property_read_u32_index() (Krzysztof Kozlowski) Xilinx Versal CPM PCIe controller driver: - Add DT binding and driver support for Xilinx Versal CPM5 (Thippeswamy Havalige) MicroSemi Switchtec management driver: - Add Microchip PCI100X device IDs (Rakesh Babu Saladi) Miscellaneous: - Move reset related sysfs code from pci.c to pci-sysfs.c where other similar code lives (Ilpo Järvinen) - Simplify reset_method_store() memory management by using __free() instead of explicit kfree() cleanup (Ilpo Järvinen) - Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM ACPI hotplug driver (Thomas Weißschuh) - Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong Zhang) - Correct documentation of the 'config_acs=' kernel parameter (Akihiko Odaki)" tag 'pci-v6.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (111 commits) PCI: Batch BAR sizing operations dt-bindings: PCI: microchip,pcie-host: Allow dma-noncoherent PCI: microchip: Set inbound address translation for coherent or non-coherent mode Documentation: Fix pci=config_acs= example PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT PCI: Don't include 'pm_wakeup.h' directly selftests: pci_endpoint: Migrate to Kselftest framework selftests: Move PCI Endpoint tests from tools/pci to Kselftests misc: pci_endpoint_test: Fix IOCTL return value dt-bindings: PCI: qcom: Document the IPQ5424 PCIe controller dt-bindings: PCI: qcom,pcie-sm8550: Document 'global' interrupt dt-bindings: PCI: mobiveil: Convert mobiveil-pcie.txt to YAML PCI: switchtec: Add Microchip PCI100X device IDs misc: pci_endpoint_test: Remove redundant 'remainder' test misc: pci_endpoint_test: Add consecutive BAR test misc: pci_endpoint_test: Add support for capabilities PCI: endpoint: pci-epf-test: Add support for capabilities PCI: endpoint: pci-epf-test: Fix check for DMA MEMCPY test PCI: endpoint: pci-epf-test: Set dma_chan_rx pointer to NULL on error PCI: dwc: Simplify config resource lookup ...	2025-01-25 16:03:40 -08:00
Linus Torvalds	fd56e5104a	OpenRISC updates for 6.14 A few updates from me and the community: * Added support for restartable sequences * Migration to Generic built-in DTB from Masahiro Yamada -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmeUhJEACgkQw7McLV5m J+TpeA//VSJeutnK1s6u4VQbO6P0FM2JkY7yy8oAcitlG5bJ/WrRboNCgshaIRlG etoTwEb5llcUL2jDp7217I6Z5XDWUgQwul63PTYeoqEfwvF6kC3DEbOZ8QSfan/x F38kUDQWSyPKzt5aeNQZnKBr32jIZ/MfY9+XC2RHrNYhCTSoTqUcKxk8a69sx9cN MSxrqcfOsywakVgWsM4kPkWzQD4GI5E67BJBw0cI6ux0pqThQsiUyjtwxbIOZXUe /QtaBtGdqYMNEPs4RmoD2qGS3j5GtD3Nxv3adbKs0vpDXHyZj1+SFnIsBU9QBzni meXp4qONoIovg4Mh9VK6iZUWd6z6wegxP370bbsJGErzWS11t3p6qRYXg52DRMh8 w8rB/rHH2n/Dx7T17ec/E+D67/9BiapZrfxaDRjZ1/aa17JYiQeuHFDHRzo4uap2 TZ1NyPHQYLblmj0XDAsh8Me6PKI2pDL8DWZbFZ9dY7rN3lJP9G7kCdqBsM54NmBp M47asri2N2VRhrewWF4fNk9bbf4Ai9ERqShyg211d4fRTg1FhfpGEAKHnu9x2z1D isZf6JSrWhSrgQSq1dxvSjUnW7aw+NCPNqJjuKXYpCnxZePAa1O3/7HpipY58qr5 Ki1TN6x7DuWjR2mTaiIGzT2vktdBP9qFVJYfNneEme7111nRRuA= =9RPl -----END PGP SIGNATURE----- Merge tag 'for-linus' of https://github.com/openrisc/linux Pull OpenRISC updates from Stafford Horne: - Added support for restartable sequences (me) - Migration to Generic built-in DTB (Masahiro Yamada) * tag 'for-linus' of https://github.com/openrisc/linux: rseq/selftests: Add support for OpenRISC openrisc: Add support for restartable sequences openrisc: Add HAVE_REGS_AND_STACK_ACCESS_API support openrisc: migrate to the generic rule for built-in DTB	2025-01-25 10:16:56 -08:00
Linus Torvalds	0f8e26b38d	Loongarch: * Clear LLBCTL if secondary mmu mapping changes. * Add hypercall service support for usermode VMM. x86: * Add a comment to kvm_mmu_do_page_fault() to explain why KVM performs a direct call to kvm_tdp_page_fault() when RETPOLINE is enabled. * Ensure that all SEV code is compiled out when disabled in Kconfig, even if building with less brilliant compilers. * Remove a redundant TLB flush on AMD processors when guest CR4.PGE changes. * Use str_enabled_disabled() to replace open coded strings. * Drop kvm_x86_ops.hwapic_irr_update() as KVM updates hardware's APICv cache prior to every VM-Enter. * Overhaul KVM's CPUID feature infrastructure to track all vCPU capabilities instead of just those where KVM needs to manage state and/or explicitly enable the feature in hardware. Along the way, refactor the code to make it easier to add features, and to make it more self-documenting how KVM is handling each feature. * Rework KVM's handling of VM-Exits during event vectoring; this plugs holes where KVM unintentionally puts the vCPU into infinite loops in some scenarios (e.g. if emulation is triggered by the exit), and brings parity between VMX and SVM. * Add pending request and interrupt injection information to the kvm_exit and kvm_entry tracepoints respectively. * Fix a relatively benign flaw where KVM would end up redoing RDPKRU when loading guest/host PKRU, due to a refactoring of the kernel helpers that didn't account for KVM's pre-checking of the need to do WRPKRU. * Make the completion of hypercalls go through the complete_hypercall function pointer argument, no matter if the hypercall exits to userspace or not. Previously, the code assumed that KVM_HC_MAP_GPA_RANGE specifically went to userspace, and all the others did not; the new code need not special case KVM_HC_MAP_GPA_RANGE and in fact does not care at all whether there was an exit to userspace or not. * As part of enabling TDX virtual machines, support support separation of private/shared EPT into separate roots. When TDX will be enabled, operations on private pages will need to go through the privileged TDX Module via SEAMCALLs; as a result, they are limited and relatively slow compared to reading a PTE. The patches included in 6.14 allow KVM to keep a mirror of the private EPT in host memory, and define entries in kvm_x86_ops to operate on external page tables such as the TDX private EPT. * The recently introduced conversion of the NX-page reclamation kthread to vhost_task moved the task under the main process. The task is created as soon as KVM_CREATE_VM was invoked and this, of course, broke userspace that didn't expect to see any child task of the VM process until it started creating its own userspace threads. In particular crosvm refuses to fork() if procfs shows any child task, so unbreak it by creating the task lazily. This is arguably a userspace bug, as there can be other kinds of legitimate worker tasks and they wouldn't impede fork(); but it's not like userspace has a way to distinguish kernel worker tasks right now. Should they show as "Kthread: 1" in proc/.../status? x86 - Intel: * Fix a bug where KVM updates hardware's APICv cache of the highest ISR bit while L2 is active, while ultimately results in a hardware-accelerated L1 EOI effectively being lost. * Honor event priority when emulating Posted Interrupt delivery during nested VM-Enter by queueing KVM_REQ_EVENT instead of immediately handling the interrupt. * Rework KVM's processing of the Page-Modification Logging buffer to reap entries in the same order they were created, i.e. to mark gfns dirty in the same order that hardware marked the page/PTE dirty. * Misc cleanups. Generic: * Cleanup and harden kvm_set_memory_region(); add proper lockdep assertions when setting memory regions and add a dedicated API for setting KVM-internal memory regions. The API can then explicitly disallow all flags for KVM-internal memory regions. * Explicitly verify the target vCPU is online in kvm_get_vcpu() to fix a bug where KVM would return a pointer to a vCPU prior to it being fully online, and give kvm_for_each_vcpu() similar treatment to fix a similar flaw. * Wait for a vCPU to come online prior to executing a vCPU ioctl, to fix a bug where userspace could coerce KVM into handling the ioctl on a vCPU that isn't yet onlined. * Gracefully handle xarray insertion failures; even though such failures are impossible in practice after xa_reserve(), reserving an entry is always followed by xa_store() which does not know (or differentiate) whether there was an xa_reserve() before or not. RISC-V: * Zabha, Svvptc, and Ziccrse extension support for guests. None of them require anything in KVM except for detecting them and marking them as supported; Zabha adds byte and halfword atomic operations, while the others are markers for specific operation of the TLB and of LL/SC instructions respectively. * Virtualize SBI system suspend extension for Guest/VM * Support firmware counters which can be used by the guests to collect statistics about traps that occur in the host. Selftests: * Rework vcpu_get_reg() to return a value instead of using an out-param, and update all affected arch code accordingly. * Convert the max_guest_memory_test into a more generic mmu_stress_test. The basic gist of the "conversion" is to have the test do mprotect() on guest memory while vCPUs are accessing said memory, e.g. to verify KVM and mmu_notifiers are working as intended. * Play nice with treewrite builds of unsupported architectures, e.g. arm (32-bit), as KVM selftests' Makefile doesn't do anything to ensure the target architecture is actually one KVM selftests supports. * Use the kernel's $(ARCH) definition instead of the target triple for arch specific directories, e.g. arm64 instead of aarch64, mainly so as not to be different from the rest of the kernel. * Ensure that format strings for logging statements are checked by the compiler even when the logging statement itself is disabled. * Attempt to whack the last LLC references/misses mole in the Intel PMU counters test by adding a data load and doing CLFLUSH{OPT} on the data instead of the code being executed. It seems that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters. * Fix a flaw in the Intel PMU counters test where it asserts that events are counting correctly without actually knowing what the events count given the underlying hardware; this can happen if Intel reuses a formerly microarchitecture-specific event encoding as an architectural event, as was the case for Top-Down Slots. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeTuzoUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroOkBwf8CRNExYaM3j9y2E7mmo6AiL2ug6+J Uy5Hai1poY48pPwKC6ke3EWT8WVsgj/Py5pCeHvLojQchWNjCCYNfSQluJdkRxwG DgP3QUljSxEJWBeSwyTRcKM+IySi5hZd1IFo3gePFRB829Jpnj05vjbvCyv8gIwU y3HXxSYDsViaaFoNg4OlZFsIGis7mtknsZzk++QjuCXmxNa6UCbv3qvE/UkVLhVg WH65RTRdjk+EsdwaOMHKuUvQoGa+iM4o39b6bqmw8+ZMK39+y33WeTX/y5RXsp1N tUUBRfS+MuuYgC/6LmTr66EkMzoChxk3Dp3kKUaCBcfqRC8PxQag5reZhw== =NEaO -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Loongarch: - Clear LLBCTL if secondary mmu mapping changes - Add hypercall service support for usermode VMM x86: - Add a comment to kvm_mmu_do_page_fault() to explain why KVM performs a direct call to kvm_tdp_page_fault() when RETPOLINE is enabled - Ensure that all SEV code is compiled out when disabled in Kconfig, even if building with less brilliant compilers - Remove a redundant TLB flush on AMD processors when guest CR4.PGE changes - Use str_enabled_disabled() to replace open coded strings - Drop kvm_x86_ops.hwapic_irr_update() as KVM updates hardware's APICv cache prior to every VM-Enter - Overhaul KVM's CPUID feature infrastructure to track all vCPU capabilities instead of just those where KVM needs to manage state and/or explicitly enable the feature in hardware. Along the way, refactor the code to make it easier to add features, and to make it more self-documenting how KVM is handling each feature - Rework KVM's handling of VM-Exits during event vectoring; this plugs holes where KVM unintentionally puts the vCPU into infinite loops in some scenarios (e.g. if emulation is triggered by the exit), and brings parity between VMX and SVM - Add pending request and interrupt injection information to the kvm_exit and kvm_entry tracepoints respectively - Fix a relatively benign flaw where KVM would end up redoing RDPKRU when loading guest/host PKRU, due to a refactoring of the kernel helpers that didn't account for KVM's pre-checking of the need to do WRPKRU - Make the completion of hypercalls go through the complete_hypercall function pointer argument, no matter if the hypercall exits to userspace or not. Previously, the code assumed that KVM_HC_MAP_GPA_RANGE specifically went to userspace, and all the others did not; the new code need not special case KVM_HC_MAP_GPA_RANGE and in fact does not care at all whether there was an exit to userspace or not - As part of enabling TDX virtual machines, support support separation of private/shared EPT into separate roots. When TDX will be enabled, operations on private pages will need to go through the privileged TDX Module via SEAMCALLs; as a result, they are limited and relatively slow compared to reading a PTE. The patches included in 6.14 allow KVM to keep a mirror of the private EPT in host memory, and define entries in kvm_x86_ops to operate on external page tables such as the TDX private EPT - The recently introduced conversion of the NX-page reclamation kthread to vhost_task moved the task under the main process. The task is created as soon as KVM_CREATE_VM was invoked and this, of course, broke userspace that didn't expect to see any child task of the VM process until it started creating its own userspace threads. In particular crosvm refuses to fork() if procfs shows any child task, so unbreak it by creating the task lazily. This is arguably a userspace bug, as there can be other kinds of legitimate worker tasks and they wouldn't impede fork(); but it's not like userspace has a way to distinguish kernel worker tasks right now. Should they show as "Kthread: 1" in proc/.../status? x86 - Intel: - Fix a bug where KVM updates hardware's APICv cache of the highest ISR bit while L2 is active, while ultimately results in a hardware-accelerated L1 EOI effectively being lost - Honor event priority when emulating Posted Interrupt delivery during nested VM-Enter by queueing KVM_REQ_EVENT instead of immediately handling the interrupt - Rework KVM's processing of the Page-Modification Logging buffer to reap entries in the same order they were created, i.e. to mark gfns dirty in the same order that hardware marked the page/PTE dirty - Misc cleanups Generic: - Cleanup and harden kvm_set_memory_region(); add proper lockdep assertions when setting memory regions and add a dedicated API for setting KVM-internal memory regions. The API can then explicitly disallow all flags for KVM-internal memory regions - Explicitly verify the target vCPU is online in kvm_get_vcpu() to fix a bug where KVM would return a pointer to a vCPU prior to it being fully online, and give kvm_for_each_vcpu() similar treatment to fix a similar flaw - Wait for a vCPU to come online prior to executing a vCPU ioctl, to fix a bug where userspace could coerce KVM into handling the ioctl on a vCPU that isn't yet onlined - Gracefully handle xarray insertion failures; even though such failures are impossible in practice after xa_reserve(), reserving an entry is always followed by xa_store() which does not know (or differentiate) whether there was an xa_reserve() before or not RISC-V: - Zabha, Svvptc, and Ziccrse extension support for guests. None of them require anything in KVM except for detecting them and marking them as supported; Zabha adds byte and halfword atomic operations, while the others are markers for specific operation of the TLB and of LL/SC instructions respectively - Virtualize SBI system suspend extension for Guest/VM - Support firmware counters which can be used by the guests to collect statistics about traps that occur in the host Selftests: - Rework vcpu_get_reg() to return a value instead of using an out-param, and update all affected arch code accordingly - Convert the max_guest_memory_test into a more generic mmu_stress_test. The basic gist of the "conversion" is to have the test do mprotect() on guest memory while vCPUs are accessing said memory, e.g. to verify KVM and mmu_notifiers are working as intended - Play nice with treewrite builds of unsupported architectures, e.g. arm (32-bit), as KVM selftests' Makefile doesn't do anything to ensure the target architecture is actually one KVM selftests supports - Use the kernel's $(ARCH) definition instead of the target triple for arch specific directories, e.g. arm64 instead of aarch64, mainly so as not to be different from the rest of the kernel - Ensure that format strings for logging statements are checked by the compiler even when the logging statement itself is disabled - Attempt to whack the last LLC references/misses mole in the Intel PMU counters test by adding a data load and doing CLFLUSH{OPT} on the data instead of the code being executed. It seems that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters - Fix a flaw in the Intel PMU counters test where it asserts that events are counting correctly without actually knowing what the events count given the underlying hardware; this can happen if Intel reuses a formerly microarchitecture-specific event encoding as an architectural event, as was the case for Top-Down Slots" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (151 commits) kvm: defer huge page recovery vhost task to later KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault() KVM: Disallow all flags for KVM-internal memslots KVM: x86: Drop double-underscores from __kvm_set_memory_region() KVM: Add a dedicated API for setting KVM-internal memslots KVM: Assert slots_lock is held when setting memory regions KVM: Open code kvm_set_memory_region() into its sole caller (ioctl() API) LoongArch: KVM: Add hypercall service support for usermode VMM LoongArch: KVM: Clear LLBCTL if secondary mmu mapping is changed KVM: SVM: Use str_enabled_disabled() helper in svm_hardware_setup() KVM: VMX: read the PML log in the same order as it was written KVM: VMX: refactor PML terminology KVM: VMX: Fix comment of handle_vmx_instruction() KVM: VMX: Reinstate __exit attribute for vmx_exit() KVM: SVM: Use str_enabled_disabled() helper in sev_hardware_setup() KVM: x86: Avoid double RDPKRU when loading host/guest PKRU KVM: x86: Use LVT_TIMER instead of an open coded literal RISC-V: KVM: Add new exit statstics for redirected traps RISC-V: KVM: Update firmware counters for various events RISC-V: KVM: Redirect instruction access fault trap to guest ...	2025-01-25 09:55:09 -08:00
Kemeng Shi	7e060df04f	Xarray: do not return sibling entries from xas_find_marked() Patch series "Fixes and cleanups to xarray", v5. This series contains some random fixes and cleanups to xarray. Patch 1-2 are fixes and patch 3-6 are cleanups. More details can be found in respective patches. This patch (of 5): Similar to issue fixed in commit `cbc0285433` ("XArray: Do not return sibling entries from xa_load()"), we may return sibling entries from xas_find_marked as following: Thread A: Thread B: xa_store_range(xa, entry, 6, 7, gfp); xa_set_mark(xa, 6, mark) XA_STATE(xas, xa, 6); xas_find_marked(&xas, 7, mark); offset = xas_find_chunk(xas, advance, mark); [offset is 6 which points to a valid entry] xa_store_range(xa, entry, 4, 7, gfp); entry = xa_entry(xa, node, 6); [entry is a sibling of 4] if (!xa_is_node(entry)) return entry; Skip sibling entry like xas_find() does to protect caller from seeing sibling entry from xas_find_marked() or caller may use sibling entry as a valid entry and crash the kernel. Besides, load_race() test is modified to catch mentioned issue and modified load_race() only passes after this fix is merged. Here is an example how this bug could be triggerred in tmpfs which enables large folio in mapping: Let's take a look at involved racer: 1. How pages could be created and dirtied in shmem file. write ksys_write vfs_write new_sync_write shmem_file_write_iter generic_perform_write shmem_write_begin shmem_get_folio shmem_allowable_huge_orders shmem_alloc_and_add_folios shmem_alloc_folio __folio_set_locked shmem_add_to_page_cache XA_STATE_ORDER(..., index, order) xax_store() shmem_write_end folio_mark_dirty() 2. How dirty pages could be deleted in shmem file. ioctl do_vfs_ioctl file_ioctl ioctl_preallocate vfs_fallocate shmem_fallocate shmem_truncate_range shmem_undo_range truncate_inode_folio filemap_remove_folio page_cache_delete xas_store(&xas, NULL); 3. How dirty pages could be lockless searched sync_file_range ksys_sync_file_range __filemap_fdatawrite_range filemap_fdatawrite_wbc do_writepages writeback_use_writepage writeback_iter writeback_get_folio filemap_get_folios_tag find_get_entry folio = xas_find_marked() folio_try_get(folio) Kernel will crash as following: 1.Create 2.Search 3.Delete /* write page 2,3 / write ... shmem_write_begin XA_STATE_ORDER(xas, i_pages, index = 2, order = 1) xa_store(&xas, folio) shmem_write_end folio_mark_dirty() / sync page 2 and page 3 / sync_file_range ... find_get_entry folio = xas_find_marked() / offset will be 2 / offset = xas_find_chunk() / delete page 2 and page 3 / ioctl ... xas_store(&xas, NULL); / write page 0-3 / write ... shmem_write_begin XA_STATE_ORDER(xas, i_pages, index = 0, order = 2) xa_store(&xas, folio) shmem_write_end folio_mark_dirty(folio) / get sibling entry from offset 2 / entry = xa_entry(.., 2) / use sibling entry as folio and crash kernel */ folio_try_get(folio) Link: https://lkml.kernel.org/r/20241213122523.12764-1-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20241213122523.12764-2-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> [English fixes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-24 22:47:27 -08:00
Linus Torvalds	ae8b53aac3	EFI updates for v6.14 - Increase the headroom in the EFI memory map allocation created by the EFI stub. This is needed because event callbacks called during ExitBootServices() may cause fragmentation, and reallocation is not allowed after that. - Drop obsolete UGA graphics code and switch to a more ergonomic API to traverse handle buffers. Simplify some error paths using a __free() helper while at it. - Fix some W=1 warnings when CONFIG_EFI=n - Rely on the dentry cache to keep track of the contents of the efivarfs filesystem, rather than using a separate linked list. - Improve and extend efivarfs test cases. - Synchronize efivarfs with underlying variable store on resume from hibernation - this is needed because the firmware itself or another OS running on the same machine may have modified it. - Fix x86 EFI stub build with GCC 15. - Fix kexec/x86 false positive warning in EFI memory attributes table sanity check. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZ5IH+gAKCRAwbglWLn0t XHyMAP9Mqn5dD4XT22gvTRUrJuVYFLBlN+9d8ysRMjRVCzGwCQEAvCUJMy5Kje0J h9i2InWjjPOVATx5hTrEoIEl96BGOgk= =3Hnk -----END PGP SIGNATURE----- Merge tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Increase the headroom in the EFI memory map allocation created by the EFI stub. This is needed because event callbacks called during ExitBootServices() may cause fragmentation, and reallocation is not allowed after that. - Drop obsolete UGA graphics code and switch to a more ergonomic API to traverse handle buffers. Simplify some error paths using a __free() helper while at it. - Fix some W=1 warnings when CONFIG_EFI=n - Rely on the dentry cache to keep track of the contents of the efivarfs filesystem, rather than using a separate linked list. - Improve and extend efivarfs test cases. - Synchronize efivarfs with underlying variable store on resume from hibernation - this is needed because the firmware itself or another OS running on the same machine may have modified it. - Fix x86 EFI stub build with GCC 15. - Fix kexec/x86 false positive warning in EFI memory attributes table sanity check. * tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (23 commits) x86/efi: skip memattr table on kexec boot efivarfs: add variable resync after hibernation efivarfs: abstract initial variable creation routine efi: libstub: Use '-std=gnu11' to fix build with GCC 15 selftests/efivarfs: add concurrent update tests selftests/efivarfs: fix tests for failed write removal efivarfs: fix error on write to new variable leaving remnants efivarfs: remove unused efivarfs_list efivarfs: move variable lifetime management into the inodes selftests/efivarfs: add check for disallowing file truncation efivarfs: prevent setting of zero size on the inodes in the cache efi: sysfb_efi: fix W=1 warnings when EFI is not set efi/libstub: Use __free() helper for pool deallocations efi/libstub: Use cleanup helpers for freeing copies of the memory map efi/libstub: Simplify PCI I/O handle buffer traversal efi/libstub: Refactor and clean up GOP resolution picker code efi/libstub: Simplify GOP handling code efi/libstub: Use C99-style for loop to traverse handle buffer x86/efistub: Drop long obsolete UGA support efivarfs: make variable_is_present use dcache lookup ...	2025-01-24 15:33:33 -08:00
Tejun Heo	e9fe182772	sched_ext: selftests/dsp_local_on: Fix sporadic failures dsp_local_on has several incorrect assumptions, one of which is that p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task is scheduled out while migration is disabled - p->cpus_ptr is temporarily overridden to the previous CPU while p->nr_cpus_allowed remains unchanged. This led to sporadic test faliures when dsp_local_on_dispatch() tries to put a migration disabled task to a different CPU. Fix it by keeping the previous CPU when migration is disabled. There are SCX schedulers that make use of p->nr_cpus_allowed. They should also implement explicit handling for p->migration_disabled. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Cc: Andrea Righi <arighi@nvidia.com> Cc: Changwoo Min <changwoo@igalia.com>	2025-01-24 10:48:25 -10:00
Andrea Righi	74ca334338	selftests/sched_ext: Fix enum resolution All scx enums are now automatically generated from vmlinux.h and they must be initialized using the SCX_ENUM_INIT() macro. Fix the scx selftests to use this macro to properly initialize these values. Fixes: `8da7bf2cee` ("tools/sched_ext: Receive updates from SCX repo") Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-24 08:09:52 -10:00
Linus Torvalds	bc8198dc7e	sched_ext: Changes for v6.14 - scx_bpf_now() added so that BPF scheduler can access the cached timestamp in struct rq to avoid reading TSC multiple times within a locked scheduling operation. - Minor updates to the built-in idle CPU selection logic. - tool/sched_ext updates and other misc changes. Pulling sched_ext/for-6.14 into master causes a merge conflict between the following two commits (first commit in master, second in for-6.14): `a2a3374c47` sched_ext: idle: Refresh idle masks during idle-to-idle transitions `9cf9aceed2` sched_ext: idle: use assign_cpu() to update the idle cpumask static void update_builtin_idle(int cpu, bool idle) { <<<<<<< HEAD if (idle) cpumask_set_cpu(cpu, idle_masks.cpu); else cpumask_clear_cpu(cpu, idle_masks.cpu); ======= int cpu = cpu_of(rq); if (SCX_HAS_OP(update_idle) && !scx_rq_bypassing(rq)) { SCX_CALL_OP(SCX_KF_REST, update_idle, cpu_of(rq), idle); if (!static_branch_unlikely(&scx_builtin_idle_enabled)) return; } assign_cpu(cpu, idle_masks.cpu, idle); >>>>>>> `987ce79b52` The first commit factored out update_builtin_idle() and the second replaced cpumask_set/clear_cpu() calls with assign_cpu(). The conflict can be resolved by taking the code from the first and then replacing the cpumask_set/clear_cpu() calls with assign_cpu(): static void update_builtin_idle(int cpu, bool idle) { assign_cpu(cpu, idle_masks.cpu, idle); -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ4/yjw4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGc/3AQChJ2zxBLMtQOfNVWgqhAyatCnFquMncl/IeGAs Sf4eawD/QHNWYBPx0nOFQS8RZNmUcXuEDeHMy+1MvTUaIDL5MQM= =6t0t -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - scx_bpf_now() added so that BPF scheduler can access the cached timestamp in struct rq to avoid reading TSC multiple times within a locked scheduling operation. - Minor updates to the built-in idle CPU selection logic. - tool/sched_ext updates and other misc changes. * tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: fix kernel-doc warnings sched_ext: Use time helpers in BPF schedulers sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now() sched_ext: Add time helpers for BPF schedulers sched_ext: Add scx_bpf_now() for BPF scheduler sched_ext: Implement scx_bpf_now() sched_ext: Relocate scx_enabled() related code sched_ext: Add option -l in selftest runner to list all available tests sched_ext: Include remaining task time slice in error state dump sched_ext: update scx_bpf_dsq_insert() doc for SCX_DSQ_LOCAL_ON sched_ext: idle: small CPU iteration refactoring sched_ext: idle: introduce check_builtin_idle_enabled() helper sched_ext: idle: clarify comments sched_ext: idle: use assign_cpu() to update the idle cpumask sched_ext: Use str_enabled_disabled() helper in update_selcpu_topology() sched_ext: Use sizeof_field for key_len in dsq_hash_params tools/sched_ext: Receive updates from SCX repo sched_ext: Use the NUMA scheduling domain for NUMA optimizations	2025-01-23 18:49:43 -08:00
Linus Torvalds	e8744fbc83	tracing updates for v6.14: - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Update the Rust tracepoint code to use the C code too There was some duplication of the tracepoint code for Rust that did the same logic as the C code. Add a helper that makes it possible for both algorithms to use the same logic in one place. - Add poll to trace event hist files It is useful to know when an event is triggered, or even with some filtering. Since hist files of events get updated when active and the event is triggered, allow applications to poll the hist file and wake up when an event is triggered. This will let the application know that the event it is waiting for happened. - Add :mod: command to enable events for current or future modules The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Add the command where if ':mod:<module>' is written into set_event, then either all the modules events are enabled if it is loaded, or cache it so that the module's events are enabled when it is loaded. This also works from the kernel command line, where "trace_event=:mod:<module>", when the module is loaded at boot up, its events will be enabled then. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ5EbMxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qkZsAP9Amgx9frSbR1pn1t0I3wVnQx7khgOu s/b8Ro+vjTx1/QD/RN2AA7f+HK4F27w3Aqfrs0nKXAPtXWsJ9Epp8raG5w8= =Pg+4 -----END PGP SIGNATURE----- Merge tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Update the Rust tracepoint code to use the C code too There was some duplication of the tracepoint code for Rust that did the same logic as the C code. Add a helper that makes it possible for both algorithms to use the same logic in one place. - Add poll to trace event hist files It is useful to know when an event is triggered, or even with some filtering. Since hist files of events get updated when active and the event is triggered, allow applications to poll the hist file and wake up when an event is triggered. This will let the application know that the event it is waiting for happened. - Add :mod: command to enable events for current or future modules The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Add the command where if ':mod:<module>' is written into set_event, then either all the modules events are enabled if it is loaded, or cache it so that the module's events are enabled when it is loaded. This also works from the kernel command line, where "trace_event=:mod:<module>", when the module is loaded at boot up, its events will be enabled then. * tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits) tracing: Fix output of set_event for some cached module events tracing: Fix allocation of printing set_event file content tracing: Rename update_cache() to update_mod_cache() tracing: Fix #if CONFIG_MODULES to #ifdef CONFIG_MODULES selftests/ftrace: Add test that tests event :mod: commands tracing: Cache ":mod:" events for modules not loaded yet tracing: Add :mod: command to enabled module events selftests/tracing: Add hist poll() support test tracing/hist: Support POLLPRI event for poll on histogram tracing/hist: Add poll(POLLIN) support on hist file tracing: Fix using ret variable in tracing_set_tracer() tracepoint: Reduce duplication of __DO_TRACE_CALL tracing/string: Create and use __free(argv_free) in trace_dynevent.c tracing: Switch trace_stat.c code over to use guard() tracing: Switch trace_stack.c code over to use guard() tracing: Switch trace_osnoise.c code over to use guard() and __free() tracing: Switch trace_events_synth.c code over to use guard() tracing: Switch trace_events_filter.c code over to use guard() tracing: Switch trace_events_trigger.c code over to use guard() tracing: Switch trace_events_hist.c code over to use guard() ...	2025-01-23 17:51:16 -08:00
Linus Torvalds	7f71554b4e	ktest updates for v6.14: - Fix use of KERNEL_VERSION in newly created output directory If a new output directory is created (O=/dir), and one of the options uses KERNEL_VERSION which will run a "make kernelversion" in the output directory, it will fail because there is no config file yet. In this case, have it do a "make allnoconfig" which is the minimal needed to run the "make kernelversion". - Remove unused variables - Fix some typos -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ5AsqBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qlfJAQCmRlwC8x+AVy2lij2sTqM8nfB7DfTF QMavz/I1xn0f3wD/bxVfN8SebrCBBiICW09bvnWzqzyzgFsjjxRRq+Zzvwc= =qfZn -----END PGP SIGNATURE----- Merge tag 'ktest-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull ktest updates from Steven Rostedt: - Fix use of KERNEL_VERSION in newly created output directory If a new output directory is created (O=/dir), and one of the options uses KERNEL_VERSION which will run a "make kernelversion" in the output directory, it will fail because there is no config file yet. In this case, have it do a "make allnoconfig" which is the minimal needed to run the "make kernelversion". - Remove unused variables - Fix some typos * tag 'ktest-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest.pl: Fix typo "accesing" ktest.pl: Fix typo in comment ktest.pl: Remove unused declarations in run_bisect_test function ktest.pl: Check kernelrelease return in get_version	2025-01-23 17:45:24 -08:00
Linus Torvalds	d0d106a2bd	bpf-next-6.14 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmeOu1YACgkQ6rmadz2v bTrrHxAAn6eqEsluWnDlzhI0OGsPjvgS00sf+MOeqiXYeS2eJ8yJuKifp38+nIQZ lIplsWU2ReUY20eizPqLPnQ7TXZGvLgp08E8yHUoZ0siWanqr9iDRfbZCCNrDMNm lMqeR1SLapMws2R/UX9JbvPn2ajIJ6Lb4wxenTfdlW6q+0hAGM6Dt0k/jBod+quq /oo+xwG3L0q4APBovJfiAFN2z6IYN03b+zLiOrpIJtMACGewEXnl3m4mkL8ZM/FV nZGPIxIUPXCpKTGEkNqxfkrnHN2wZQ4ZSKEJ6lhEEp4jrgCVITaGZ/E7jlx6fZoj bbd4YMonIPo9Nhim8p1dt8yYBhKKiE5IXIq0GqlMv5+MvAN8ylrlydpsouW1fu66 hZ1W1BxbxmrgyF0Bwo9JPOMhBHwMrmD6iH9LgiMpZf0ASeF+q9cJpoSOU5j5E9XB LpLIRf5jYTd4wZjhDmrQREReLo+Bng9DlCBu+jjh2+YTz6l6Qed+ETpENcd7lL5i IHZVbgD2RVPNJoUfdrd763HfYfDTk+50MF5FIMEyfKHz11if0E/LhBMzto22hm6b 2f8ruj/8yvg8s2dxEP3ySQgcnynlwEnGxLenUVv7uEOYKeWri1rq+fvTK5ne1OLK oHnTlkViwQb74c0r8cFW+nkyfUYTfhhBAql14rl/fMjGDO2KZ10= =f2CA -----END PGP SIGNATURE----- Merge tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "A smaller than usual release cycle. The main changes are: - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai) In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile only mode. Half of the tests are failing, since support for btf_decl_tag is still WIP, but this is a great milestone. - Convert various samples/bpf to selftests/bpf/test_progs format (Alexis Lothoré and Bastien Curutchet) - Teach verifier to recognize that array lookup with constant in-range index will always succeed (Daniel Xu) - Cleanup migrate disable scope in BPF maps (Hou Tao) - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao) - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin KaFai Lau) - Refactor verifier lock support (Kumar Kartikeya Dwivedi) This is a prerequisite for upcoming resilient spin lock. - Remove excessive 'may_goto +0' instructions in the verifier that LLVM leaves when unrolls the loops (Yonghong Song) - Remove unhelpful bpf_probe_write_user() warning message (Marco Elver) - Add fd_array_cnt attribute for prog_load command (Anton Protopopov) This is a prerequisite for upcoming support for static_branch" * tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits) selftests/bpf: Add some tests related to 'may_goto 0' insns bpf: Remove 'may_goto 0' instruction in opt_remove_nops() bpf: Allow 'may_goto 0' instruction in verifier selftests/bpf: Add test case for the freeing of bpf_timer bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT bpf: Free element after unlock in __htab_map_lookup_and_delete_elem() bpf: Bail out early in __htab_map_lookup_and_delete_elem() bpf: Free special fields after unlock in htab_lru_map_delete_node() tools: Sync if_xdp.h uapi tooling header libbpf: Work around kernel inconsistently stripping '.llvm.' suffix bpf: selftests: verifier: Add nullness elision tests bpf: verifier: Support eliding map lookup nullness bpf: verifier: Refactor helper access type tracking bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write bpf: verifier: Add missing newline on verbose() call selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED libbpf: Fix return zero when elf_begin failed selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test veristat: Load struct_ops programs only once ...	2025-01-23 08:04:07 -08:00
Jakub Kicinski	965adae5a3	selftests/net: packetdrill: more xfail changes (and a correction) Recent change to add more cases to XFAIL has a broken regex, the matching needs a real regex not a glob pattern. While at it add the cases Willem pointed out during review. Fixes: `3030e3d57b` ("selftests/net: packetdrill: make tcp buf limited timing tests benign") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250121143423.215261-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-23 07:07:41 -08:00
Koichiro Den	f8524ac33c	selftests: gpio: gpio-sim: Fix missing chip disablements Since upstream commit `8bd76b3d3f` ("gpio: sim: lock up configfs that an instantiated device depends on"), rmdir for an active virtual devices been prohibited. Update gpio-sim selftest to align with the change. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202501221006.a1ca5dfa-lkp@intel.com Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250122043309.304621-1-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-23 15:44:48 +01:00
Linus Torvalds	21266b8df5	AT_EXECVE_CHECK introduction for v6.14-rc1 - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün) - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits (Mickaël Salaün) - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hO7wAKCRA2KwveOeQk u4l+AP9UHO1KwMn3aOt6uFPj7omaoY0vpcB1rx/x5s4efNFHOAD/QjY0f+ND+HzF mKLYOIeacGEQi7TNhpnOkGjz6jzSiwg= =sMhZ -----END PGP SIGNATURE----- Merge tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull AT_EXECVE_CHECK from Kees Cook: - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün) - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits (Mickaël Salaün) - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün) * tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ima: instantiate the bprm_creds_for_exec() hook samples/check-exec: Add an enlighten "inc" interpreter and 28 tests selftests: ktap_helpers: Fix uninitialized variable samples/check-exec: Add set-exec selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits exec: Add a new AT_EXECVE_CHECK flag to execveat(2)	2025-01-22 20:34:42 -08:00
Linus Torvalds	de5817bbfb	Landlock updates for v6.14-rc1 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ5EMhBAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSMv0BAMOG2TFwq+UhbtxtL6pM7qzxfdWg6GR/t4t8 MFasAcCaAQDtTnW0HymHge8k7JFgWHHp0JBu7V7dhFrdJoS+718aDA== =1Hfr -----END PGP SIGNATURE----- Merge tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mostly factors out some Landlock code and prepares for upcoming audit support. Because files with invalid modes might be visible after filesystem corruption, Landlock now handles those weird files too. A few sample and test issues are also fixed" * tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add layout1.umount_sandboxer tests selftests/landlock: Add wrappers.h selftests/landlock: Fix error message landlock: Optimize file path walks and prepare for audit support selftests/landlock: Add test to check partial access in a mount tree landlock: Align partial refer access checks with final ones landlock: Simplify initially denied access rights landlock: Move access types landlock: Factor out check_access_path() selftests/landlock: Fix build with non-default pthread linking landlock: Use scoped guards for ruleset in landlock_add_rule() landlock: Use scoped guards for ruleset landlock: Constify get_mode_access() landlock: Handle weird files samples/landlock: Fix possible NULL dereference in parse_path() selftests/landlock: Remove unused macros in ptrace_test.c	2025-01-22 20:20:55 -08:00
Linus Torvalds	37b33c68b0	CRC updates for 6.14 - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be directly accessible via the library API, instead of requiring the crypto API. This is much simpler and more efficient. - Convert some users such as ext4 to use the CRC32 library API instead of the crypto API. More conversions like this will come later. - Add a KUnit test that tests and benchmarks multiple CRC variants. Remove older, less-comprehensive tests that are made redundant by this. - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm volunteering to maintain it. I have additional cleanups and optimizations planned for future cycles. These patches have been in linux-next since -rc1. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ418ZRQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyJYAP9kBlpm8W9/XY6N8SpjKaXE/vKQYHQl Nobhak06Us8uJwEAkcUTymWP4IwQj5A9jgBAPRw53FQcNVKIc+01C7gRHw0= =mqSH -----END PGP SIGNATURE----- Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be directly accessible via the library API, instead of requiring the crypto API. This is much simpler and more efficient. - Convert some users such as ext4 to use the CRC32 library API instead of the crypto API. More conversions like this will come later. - Add a KUnit test that tests and benchmarks multiple CRC variants. Remove older, less-comprehensive tests that are made redundant by this. - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm volunteering to maintain it. I have additional cleanups and optimizations planned for future cycles. * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits) MAINTAINERS: add entry for CRC library powerpc/crc: delete obsolete crc-vpmsum_test.c lib/crc32test: delete obsolete crc32test.c lib/crc16_kunit: delete obsolete crc16_kunit.c lib/crc_kunit.c: add KUnit test suite for CRC library functions powerpc/crc-t10dif: expose CRC-T10DIF function through lib arm64/crc-t10dif: expose CRC-T10DIF function through lib arm/crc-t10dif: expose CRC-T10DIF function through lib x86/crc-t10dif: expose CRC-T10DIF function through lib crypto: crct10dif - expose arch-optimized lib function lib/crc-t10dif: add support for arch overrides lib/crc-t10dif: stop wrapping the crypto API scsi: target: iscsi: switch to using the crc32c library f2fs: switch to using the crc32 library jbd2: switch to using the crc32c library ext4: switch to using the crc32c library lib/crc32: make crc32c() go directly to lib bcachefs: Explicitly select CRYPTO from BCACHEFS_FS x86/crc32: expose CRC32 functions through lib x86/crc32: update prototype for crc32_pclmul_le_16() ...	2025-01-22 19:55:08 -08:00
Linus Torvalds	7004a2e46d	linux_kselftest-nolibc-6.14-rc1 - adds support for waitid() - uses waitid() over waitpid() - uses a pipe to in vfprintf tests - skips tests for unimplemented syscalls - renames riscv to riscv64 - adds configurations for riscv32 - adds detecting missing toolchain to run-tests.sh -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmeQGFQACgkQCwJExA0N QxzStg/9H8ZSEtoGkMEtgW6jbXGTfmUZWOK/4KN6LJDiCl6Hrvyj9Tf5f4nFOhL2 8eR9hYMGxx3kawY3qoGgQaotj2thlBLUGQVY8vLMnBIrA2r2mMmFOwwK4FtxBHHf jI37bvXvfIx8DzaUllZfDMs+hLSeteS1Qcq8n6nnuVTyrG8/Zt32Dal7pOf+rGh4 C81L82n0vwOrj69vlfGIOBrhzoy0XWIvHWTxBY5EUzIpmRonAo3Us/pkMmTndU4K xFLzDldttIjtIrKI0qKPhipKrx5tmIzQbpTN3K+u8dgeFEGASsi7NdyctW5SRcYc efAvZxt91bQ9WpysSC+KWKpsO56nQ0MrpqLCnVK+fL5QrDqjAimRES3J6Aij9+Hs ojliXn6AO1aA7mXm/nLvcJIMV0i9CufVv/H2ZfVMxgCObN5mzlXmsEESI4fkROK4 ZnkvYVMQBvQTtdoILOwFqu6mDKWwNitjNNvlePMtSFxFRkzMxal+LitV6cB0LFV5 OIqh8w69MShBD1jTA1TQJOA9G1HMiB9Fm+kjT4U6H9sgSJaKzV4+n1XeFpVTWM6B TpI/kFL99V9pkPbZH17jRjIOg+0o3T2acs4FZsJ+jDt8hEk6qVy5sN1a4G6GTxyl +Oc204YJEQ9hLNiXljSiEzYrAvFIa055PzEuni3PFBxMisNKTjM= =jwYX -----END PGP SIGNATURE----- Merge tag 'linux_kselftest-nolibc-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull nolibc updates from Shuah Khan: - add support for waitid() - use waitid() over waitpid() - use a pipe in vfprintf tests - skip tests for unimplemented syscalls - rename riscv to riscv64 - add configurations for riscv32 - add detecting missing toolchain to run-tests.sh * tag 'linux_kselftest-nolibc-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/nolibc: add configurations for riscv32 selftests/nolibc: rename riscv to riscv64 selftests/nolibc: skip tests for unimplemented syscalls selftests/nolibc: use a pipe to in vfprintf tests selftests/nolibc: use waitid() over waitpid() tools/nolibc: add support for waitid() selftests/nolibc: run-tests.sh: detect missing toolchain	2025-01-22 12:36:16 -08:00
Linus Torvalds	e8f17cb6f5	linux_kselftest-kunit-6.14-rc1 - fixes struct completion warning - introduces autorun option - adds fallback for os.sched_getaffinity - enables hardware acceleration when available -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmeQDEwACgkQCwJExA0N QxzsahAAod2fq5NoQ+SY/x5oc5v0k8tV8pbrdcJBMDx5iKAf/B+EBmqsKHs5VuBi /fUkSQiTndFjXTxZbS1zTRN4XfO5H6AUVmazfHAGhIL4QEsyOocGXIEHwlhHYmLP YwOA2UTS7FilIZA0Z9slKiKnxCZga7pp6Et11rwnydDro2XvPhsnsi9FHchjYmXx lQyaO17RHf5z+LfNAH3j8wsYU910z/Vg5AE1kZ7ckcftFgPXpiK2P2XtDTAKZz4D p7qW6kntUQ9994HbhCa+fw5YIFdSy8fL9QG9uBdWb0x03dQzNkW8mOs8I6DWr4Kw cVp06829K/fpwy3P15mVFjv8cO7W8t74LBGq/EipjQ8eA2RhfkZdwNE/awH9GBDS kjjlNfIh+U4wY6++SAF58k1bZorVgpZfRtpl1anfftEOlex+JPKXaJpoZloMZ/P9 Jh8BtZ+yc16tDkNQlqT24CeSGiC4GvtqUBytXvwGjEdUFzIS+bXGPwHpKrVlHWVV lpntJiUEqIbgZ+XS4UxDHBqXbYKRv7sUlToMJNkMEO5Hz5ok57NjxuPmbfS+LJdk uc6gEH3aAlyI52uJZqotcRmmea52S1HZSUO9E80yl/cS5PHysTlivTXCm85PI7GV a6T43DgnpBqqWPHafnm93DSvlx/wl1LU2JsRYeXp59CkXonlNUk= =caDL -----END PGP SIGNATURE----- Merge tag 'linux_kselftest-kunit-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - fix struct completion warning - introduce autorun option - add fallback for os.sched_getaffinity - enable hardware acceleration when available * tag 'linux_kselftest-kunit-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Introduce autorun option kunit: enable hardware acceleration when available kunit: add fallback for os.sched_getaffinity kunit: platform: Resolve 'struct completion' warning	2025-01-22 12:32:39 -08:00
Linus Torvalds	8fb1e2eed1	linux_kselftest-next-6.14-rc1 - fixes, reporting improvements, and cleanup changes to several tests - adds support for DT_GNU_HASH to selftests/vDSO -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmeQARsACgkQCwJExA0N QxygPQ/8CACwz0b/nOEIcVIcwuShkqkjToi5H4LxDDXkt80SFgxzjYATPIvIiFas ZCZFz088VkSKFhNJnjwWz2AjWhvpuF2yhbkIoYPbvlzBtOVzP1DvaAcsmWSinOoa jjGY/EjgT3nS5S277cZ+h335FYID52YMG0dR2N3rdUYTzSGth2dLAB63IMHM7/jX +Nzlwf77FzWZi0M+2/cc4D8C5gt0HfM3IeEPsO7uSjFIGH1voirEC86E7+MV7F5E mQ57T4AMg2MZmcH7laFRkKf3i0G3rkpxFrVdNvr1nNMGxeBQeZERWQVYmp/viZXP QhKQQMaMmwSXORQb/CbZUGMHvUGPCmK2n9hJoXkePR8sfO6yjywE/uIZFaRlq6IE 9M2Qe+5/7iSziakBbhw3F4xYTI1VLe2RPj1IoB0eMUg5UNGYIQ1OaM5F/6Ol6w1z 4mf2XmoyDb5D5vxFlqDjxio56NVJO5+7588oexKm0BkQgDU3F88tQ8wIbPryl/yf lMDiLQlDF3SF5NpNtd7cwYB1m/VqtT7n/hJVwwQNTO2GylgpMYLRQdAPZcVetuxA LIuqrd7hOg7tem1OvuU8SERJfTzs2DA9UcBxgJcNDBH9HxgZ66tA++Q84L/UcFof oZn6WOrM6fLvIvkBp+fVKw7vt/BpzvVpRu66lwx6SwWlkTxILtU= =XSsc -----END PGP SIGNATURE----- Merge tag 'linux_kselftest-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - fixes, reporting improvements, and cleanup changes to several tests - add support for DT_GNU_HASH to selftests/vDSO * tag 'linux_kselftest-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/rseq: Fix handling of glibc without rseq support selftests/resctrl: Discover SNC kernel support and adjust messages selftests/resctrl: Adjust effective L3 cache size with SNC enabled selftests/ftrace: Make uprobe test more robust against binary name selftests/ftrace: Fix to use remount when testing mount GID option selftests: tmpfs: Add kselftest support to tmpfs selftests: tmpfs: Add Test-skip if not run as root selftests: harness: fix printing of mismatch values in __EXPECT() selftests/ring-buffer: Add test for out-of-bound pgoff mapping selftests/run_kselftest.sh: Fix help string for --per-test-log selftests: acct: Add ksft_exit_skip if not running as root selftests: kselftest: Fix the wrong format specifier selftests: timers: clocksource-switch: Adapt progress to kselftest framework selftests/zram: gitignore output file selftests/filesystems: Add missing gitignore file selftests: Warn about skipped tests in result summary selftests: kselftest: Add ksft_test_result_xpass selftests/vDSO: support DT_GNU_HASH selftests/ipc: Remove unused variables selftest: media_tests: fix trivial UAF typo	2025-01-22 12:30:20 -08:00
Linus Torvalds	27c0278477	hid-for-linus-2025012001 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEL65usyKPHcrRDEicpmLzj2vtYEkFAmeOFTYACgkQpmLzj2vt YEkzxw//TysEhHRBl7Fff+pTDSCag7xuJHso+U/y2iqXCJM/zpX8vp4EOV6A0Fwa 2UbhaR5IASHjytAE7rwStlaOScgG04mmgB9Rj+OXbjM+2M/VDQbeM/CRrZb7LJDZ rFgz0W84M3tCRxkqtEID42Q+1n7K3Fxv/DJYSPofMuT6t9H8qEb+n8mgVOXp/B1U nAEoLmxBvDPqiWxpfIHcVZ3fDZglp74906LcayfVbOAt7Q0xZkGfj7Etw5uWx/yE H60GgFpDGQj56OGDEB/pyQFEEeIIJfvq77wCPPk0RmITbBkTW6KPV6SJorW4pgQU ruQXQSaToxePm9TvmumJ08vfLYNpaMJ5TUkWk74ccFe/9aNpSdvO5miHIcIhbE8B ooV0ojqU/TyTYE/UbEnsoxvlZUhX8v3zzfmPw+rUcZTn4imHTVkHuw/4iDWX+ZxP 2v8n1Wwle+ofBuy5PS32MUHDwDnEQRulWSrOHDFmRXoISM7/RGKLARGUqcOwsRmF 0sZSyCrXU3GSfNF8sVMv5SYKEW+feqKLanL7Hna7LnUaw4DRarWRqXmHXc5sfcY/ PP1hpdA/iJ44su1M07dBaVkM1bqn5u7BngOgvTCQVo0Nr4NftyBzkjGz715DFIrz w2eYvZpaPXyGsRb+JiTRVaa9PI/P1E7heZrQTCnc8zqUCZMVVlM= =WYkk -----END PGP SIGNATURE----- Merge tag 'hid-for-linus-2025012001' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: - newly added support for Intel Touch Host Controller (Even Xu, Xinpeng Sun) - hid-core fix for long-standing syzbot-reported cornercase of Resolution Multiplier not being present in any of the Logical Collections in the device HID report descriptor (Alan Stern) - improvement of behavior for non-standard LED brightness values for Wacom driver (Jason Gerecke) - PCI Wacom device support (depends on Intel THC support) (Even Xu) - SteelSeries Arctis 9 support (Christian Mayer) - constification of 'struct bin_attribute' in various HID driver (Thomas Weißschuh) - other assorted code cleanups / fixes and device ID additions * tag 'hid-for-linus-2025012001' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (63 commits) HID: hid-asus: Disable OOBE mode on the ProArt P16 HID: steelseries: remove unnecessary return HID: steelseries: export model and manufacturer HID: steelseries: export charging state for the SteelSeries Arctis 9 headset HID: steelseries: add SteelSeries Arctis 9 support HID: steelseries: preparation for adding SteelSeries Arctis 9 support HID: intel-thc-hid: fix build errors in um mode HID: intel-thc-hid: intel-quicki2c: fix potential memory corruption HID: intel-thc-hid: intel-thc: Fix error code in thc_i2c_subip_init() HID: lenovo: Fix undefined platform_profile_cycle in ThinkPad X12 keyboard patch HID: uclogic: make const read-only array touch_ring_model_params_buf static HID: hid-steam: Make sure rumble work is canceled on removal HID: Wacom: Add PCI Wacom device support HID: intel-thc-hid: intel-quicki2c: Add PM implementation HID: intel-thc-hid: intel-quicki2c: Complete THC QuickI2C driver HID: intel-thc-hid: intel-quicki2c: Add HIDI2C protocol implementation HID: intel-thc-hid: intel-quicki2c: Add THC QuickI2C ACPI interfaces HID: intel-thc-hid: intel-quicki2c: Add THC QuickI2C driver hid layer HID: intel-thc-hid: intel-quicki2c: Add THC QuickI2C driver skeleton HID: intel-thc-hid: intel-quickspi: Add PM implementation ...	2025-01-22 11:56:39 -08:00
Linus Torvalds	f4b9d3bf44	Power management updates for 6.14-rc1 - Use str_enable_disable()-like helpers in cpufreq (Krzysztof Kozlowski). - Extend the Apple cpufreq driver to support more SoCs (Hector Martin, Nick Chan). - Add new cpufreq driver for Airoha SoCs (Christian Marangi). - Fix using cpufreq-dt as module (Andreas Kemnade). - Minor fixes for Sparc, SCMI, and Qcom cpufreq drivers (Ethan Carter Edwards, Sibi Sankar, Manivannan Sadhasivam). - Fix the maximum supported frequency computation in the ACPI cpufreq driver to avoid relying on unfounded assumptions (Gautham Shenoy). - Fix an amd-pstate driver regression with preferred core rankings not being used (Mario Limonciello). - Fix a precision issue with frequency calculation in the amd-pstate driver (Naresh Solanki). - Add ftrace event to the amd-pstate driver for active mode (Mario Limonciello). - Set default EPP policy on Ryzen processors in amd-pstate (Mario Limonciello). - Clean up the amd-pstate cpufreq driver and optimize it to increase code reuse (Mario Limonciello, Dhananjay Ugwekar). - Use CPPC to get scaling factors between HWP performance levels and frequency in the intel_pstate driver and make it stop using a built -in scaling factor for Arrow Lake processors (Rafael Wysocki). - Make intel_pstate initialize epp_policy to CPUFREQ_POLICY_UNKNOWN for consistency with CPU offline (Christian Loehle). - Fix superfluous updates caused by need_freq_update in the schedutil cpufreq governor (Sultan Alsawaf). - Allow configuring the system suspend-resume (DPM) watchdog to warn earlier than panic (Douglas Anderson). - Implement devm_device_init_wakeup() helper and introduce a device- managed variant of dev_pm_set_wake_irq() (Joe Hattori, Peng Fan). - Remove direct inclusions of 'pm_wakeup.h' which should be only included via 'device.h' (Wolfram Sang). - Clean up two comments in the core system-wide PM code (Rafael Wysocki, Randy Dunlap). - Add Clearwater Forest processor support to the intel_idle cpuidle driver (Artem Bityutskiy). - Clean up the Exynos devfreq driver and devfreq core (Markus Elfring, Jeongjun Park). - Minor cleanups and fixes for OPP (Dan Carpenter, Neil Armstrong, Joe Hattori). - Implement dev_pm_opp_get_bw() (Neil Armstrong). - Expose OPP reference counting helpers for Rust (Viresh Kumar). - Fix TSC MHz calculation in cpupower (He Rongguang). - Add install and uninstall options to bindings Makefile and add header changes for cpufreq.h to SWIG bindings in cpupower (John B. Wyatt IV). - Add missing residency header changes in cpuidle.h to SWIG bindings in cpupower (John B. Wyatt IV). - Add output files to .gitignore and clean them up in "make clean" in selftests/cpufreq (Li Zhijian). - Fix cross-compilation in cpupower Makefile (Peng Fan). - Revise the is_valid flag handling for idle_monitor in the cpupower utility (wangfushuai). - Extend and clean up AMD processors support in cpupower (Mario Limonciello). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmeOthsSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxqQsP/ivDt8nqDnxdKB7cKFQIsEK+tl0RnFVD o5regvYeRcGWpUXuMaqBtTmCMjsB8bUkcj2yLquM54ubjHAGF6zJuw9ZytMPHVcC b2xk3RCFlXSBFXVK8eOh3XRviA9nGhuY97ZnPsQOlvoECrxT2xyeL+mWo7s+t+q9 2NUH+yfRoi5FM+nqqDhsm0xXxJuPaNg6eAjIASuMjXap48rNk3L5kW6W/6nw7i0I xQWd/pKLHaI5e7DRF/QdMKu8+Fm4BbN0jMqLblKPOmTe9KggvBkck5q1Um20sYkJ vdKMAT02ClGavIC7DtY092Xik84NZfID4ZUchS6e2hJIQ3Uaw/eDvAo/jlT8gIzq fnXPdApRIzQGDvMxFaAsKaGlwxiVlAGHPDSTH6MVWzsp+1DSkbloSwVPAfeYIn44 Jhov+6Ydux3597sSjo+YmD58acimXl7urVuk8P6m3U5+gb8/jlgbxpIn+vbxH3Ka o44Vt7axD63gezOQY134sj5gic5JL0GuZovOlvzrF6+FsjvVqcax6FZ4n3uIXu7P C1nwai+Wdzo7wvuz7RfO0g15Y15wYLQLYsRq/osRlf+sOmGVv7nA9tSzZ0LUdD5D Pp6PxppF6anM0Kjen8Ppuu+Bcr11JfVvhnVTJqhs6u71XdAy4TnG1JjL4lPWYJ4D Gfz2hyPNjiQX =AoMC -----END PGP SIGNATURE----- Merge tag 'pm-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The majority of changes here are cpufreq updates which are dominated by amd-pstate driver changes, like in the previous cycle. Moreover, changes related to amd-pstate are also the majority of cpupower utility updates. Included are some pieces of new hardware support, like the addition of Clearwater Forest processors support to intel_idle, new cpufreq driver for Airoha SoCs, and Apple cpufreq driver extensions to support more SoCs. The intel_pstate driver is also extended to be able to support new platforms by using ACPI CPPC to compute scaling factors between HWP performance states and frequency. The rest is mostly fixes and cleanups in assorted pieces of power management code. Specifics: - Use str_enable_disable()-like helpers in cpufreq (Krzysztof Kozlowski) - Extend the Apple cpufreq driver to support more SoCs (Hector Martin, Nick Chan) - Add new cpufreq driver for Airoha SoCs (Christian Marangi) - Fix using cpufreq-dt as module (Andreas Kemnade) - Minor fixes for Sparc, SCMI, and Qcom cpufreq drivers (Ethan Carter Edwards, Sibi Sankar, Manivannan Sadhasivam) - Fix the maximum supported frequency computation in the ACPI cpufreq driver to avoid relying on unfounded assumptions (Gautham Shenoy) - Fix an amd-pstate driver regression with preferred core rankings not being used (Mario Limonciello) - Fix a precision issue with frequency calculation in the amd-pstate driver (Naresh Solanki) - Add ftrace event to the amd-pstate driver for active mode (Mario Limonciello) - Set default EPP policy on Ryzen processors in amd-pstate (Mario Limonciello) - Clean up the amd-pstate cpufreq driver and optimize it to increase code reuse (Mario Limonciello, Dhananjay Ugwekar) - Use CPPC to get scaling factors between HWP performance levels and frequency in the intel_pstate driver and make it stop using a built-in scaling factor for Arrow Lake processors (Rafael Wysocki) - Make intel_pstate initialize epp_policy to CPUFREQ_POLICY_UNKNOWN for consistency with CPU offline (Christian Loehle) - Fix superfluous updates caused by need_freq_update in the schedutil cpufreq governor (Sultan Alsawaf) - Allow configuring the system suspend-resume (DPM) watchdog to warn earlier than panic (Douglas Anderson) - Implement devm_device_init_wakeup() helper and introduce a device- managed variant of dev_pm_set_wake_irq() (Joe Hattori, Peng Fan) - Remove direct inclusions of 'pm_wakeup.h' which should be only included via 'device.h' (Wolfram Sang) - Clean up two comments in the core system-wide PM code (Rafael Wysocki, Randy Dunlap) - Add Clearwater Forest processor support to the intel_idle cpuidle driver (Artem Bityutskiy) - Clean up the Exynos devfreq driver and devfreq core (Markus Elfring, Jeongjun Park) - Minor cleanups and fixes for OPP (Dan Carpenter, Neil Armstrong, Joe Hattori) - Implement dev_pm_opp_get_bw() (Neil Armstrong) - Expose OPP reference counting helpers for Rust (Viresh Kumar) - Fix TSC MHz calculation in cpupower (He Rongguang) - Add install and uninstall options to bindings Makefile and add header changes for cpufreq.h to SWIG bindings in cpupower (John B. Wyatt IV) - Add missing residency header changes in cpuidle.h to SWIG bindings in cpupower (John B. Wyatt IV) - Add output files to .gitignore and clean them up in "make clean" in selftests/cpufreq (Li Zhijian) - Fix cross-compilation in cpupower Makefile (Peng Fan) - Revise the is_valid flag handling for idle_monitor in the cpupower utility (wangfushuai) - Extend and clean up AMD processors support in cpupower (Mario Limonciello)" * tag 'pm-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits) PM / OPP: Add reference counting helpers for Rust implementation PM: sleep: wakeirq: Introduce device-managed variant of dev_pm_set_wake_irq() cpufreq: Use str_enable_disable()-like helpers cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver PM: sleep: Allow configuring the DPM watchdog to warn earlier than panic PM: sleep: convert comment from kernel-doc to plain comment cpufreq: ACPI: Fix max-frequency computation pm: cpupower: Add missing residency header changes in cpuidle.h to SWIG PM / devfreq: exynos: remove unused function parameter OPP: OF: Fix an OF node leak in _opp_add_static_v2() cpufreq/amd-pstate: Refactor max frequency calculation cpufreq/amd-pstate: Fix prefcore rankings pm: cpupower: Add header changes for cpufreq.h to SWIG bindings cpufreq: sparc: change kzalloc to kcalloc cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocks cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is not available cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq support cpufreq: apple-soc: Set fallback transition latency to APPLE_DVFS_TRANSITION_TIMEOUT cpufreq: apple-soc: Increase cluster switch timeout to 400us cpufreq: apple-soc: Use 32-bit read for status register ...	2025-01-22 11:16:14 -08:00
Linus Torvalds	0ad9617c78	Networking changes for 6.14. Core ---- - More core refactoring to reduce the RTNL lock contention, including preparatory work for the per-network namespace RTNL lock, replacing RTNL lock with a per device-one to protect NAPI-related net device data and moving synchronize_net() calls outside such lock. - Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge and more specific TCP coverage. - Reduce network namespace tear-down time by removing per-subsystems synchronize_net() in tipc and sched. - Add flow label selector support for fib rules, allowing traffic redirection based on such header field. Netfilter --------- - Do not remove netdev basechain when last device is gone, allowing netdev basechains without devices. - Revisit the flowtable teardown strategy, dealing better with fin, reset and re-open events. - Scale-up IP-vs connection dumping by avoiding linear search on each restart. Protocols --------- - A significant XDP socket refactor, consolidating and optimizing several helpers into the core - Better scaling of ICMP rate-limiting, by removing false-sharing in inet peers handling. - Introduces netlink notifications for multicast IPv4 and IPv6 address changes. - Add ipsec support for IP-TFS/AggFrag encapsulation, allowing aggregation and fragmentation of the inner IP. - Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to avoid local port exhaustion issues when the average connection lifetime is very short. - Support updating keys (re-keying) for connections using kernel TLS (for TLS 1.3 only). - Support ipv4-mapped ipv6 address clients in smc-r v2. - Add support for jumbo data packet transmission in RxRPC sockets, gluing multiple data packets in a single UDP packet. - Support RxRPC RACK-TLP to manage packet loss and retransmission in conjunction with the congestion control algorithm. Driver API ---------- - Introduce a unified and structured interface for reporting PHY statistics, exposing consistent data across different H/W via ethtool. - Make timestamping selectable, allow the user to select the desired hwtstamp provider (PHY or MAC) administratively. - Add support for configuring a header-data-split threshold (HDS) value via ethtool, to deal with partial or buggy H/W implementation. - Consolidate DSA drivers Energy Efficiency Ethernet support. - Add EEE management to phylink, making use of the phylib implementation. - Add phylib support for in-band capabilities negotiation. - Simplify how phylib-enabled mac drivers expose the supported interfaces. Tests and tooling ----------------- - Make the YNL tool package-friendly to make it easier to deploy it separately from the kernel. - Increase TCP selftest coverage importing several packetdrill test-cases. - Regenerate the ethtool uapi header from the YNL spec, to ease maintenance and future development. - Add YNL support for decoding the link types used in net self-tests, allowing a single build to run both net and drivers/net. Drivers ------- - Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - add cross E-Switch QoS support - add SW Steering support for ConnectX-8 - implement support for HW-Managed Flow Steering, improving the rule deletion/insertion rate - support for multi-host LAG - Intel (ixgbe, ice, igb): - ice: add support for devlink health events - ixgbe: add initial support for E610 chipset variant - igb: add support for AF_XDP zero-copy - Meta: - add support for basic RSS config - allow changing the number of channels - add hardware monitoring support - Broadcom (bnxt): - implement TCP data split and HDS threshold ethtool support, enabling Device Memory TCP. - Marvell Octeon: - implement egress ipsec offload support for the cn10k family - Hisilicon (HIBMC): - implement unicast MAC filtering - Ethernet NICs embedded and virtual: - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding contented atomic operations for drop counters - Freescale: - quicc: phylink conversion - enetc: support Tx and Rx checksum offload and improve TSO performances - MediaTek: - airoha: introduce support for ETS and HTB Qdisc offload - Microchip: - lan78XX USB: preparation work for phylink conversion - Synopsys (stmmac): - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45 - refactor EEE support to leverage the new driver API - optimize DMA and cache access to increase raw RX performances by 40% - TI: - icssg-prueth: add multicast filtering support for VLAN interface - netkit: - add ability to configure head/tailroom - VXLAN: - accepts packets with user-defined reserved bit - Ethernet switches: - Microchip: - lan969x: add RGMII support - lan969x: improve TX and RX performance using the FDMA engine - nVidia/Mellanox: - move Tx header handling to PCI driver, to ease XDP support - Ethernet PHYs: - Texas Instruments DP83822: - add support for GPIO2 clock output - Realtek: - 8169: add support for RTL8125D rev.b - rtl822x: add hwmon support for the temperature sensor - Microchip: - add support for RDS PTP hardware - consolidate periodic output signal generation - CAN: - several DT-bindings to DT schema conversions - tcan4x5x: - add HW standby support - support nWKRQ voltage selection - kvaser: - allowing Bus Error Reporting runtime configuration - WiFi: - the on-going Multi-Link Operation (MLO) effort continues, affecting both the stack and in drivers - mac80211/cfg80211: - Emergency Preparedness Communication Services (EPCS) station mode support - support for adding and removing station links for MLO - add support for WiFi 7/EHT mesh over 320 MHz channels - report Tx power info for each link - RealTek (rtw88): - enable USB Rx aggregation and USB 3 to improve performance - LED support - RealTek (rtw89): - refactor power save to support Multi-Link Operations - add support for RTL8922AE-VS variant - MediaTek (mt76): - single wiphy multiband support (preparation for MLO) - p2p device support - add TP-Link TXE50UH USB adapter support - Qualcomm (ath10k): - support for the QCA6698AQ IP core - Qualcomm (ath12k): - enable MLO for QCN9274 - Bluetooth: - Allow sysfs to trigger hdev reset, to allow recovering devices not responsive from user-space - MediaTek: add support for MT7922, MT7925, MT7921e devices - Realtek: add support for RTL8851BE devices - Qualcomm: add support for WCN785x devices - ISO: allow BIG re-sync Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmePf5YSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkUcMQALblhkGTxurnfT+yK+Bsuhn2LoHl2RPN 4u2Kjkzm+2FYgcw6lS17cFXsnfAPlRIpmhnmKk1EBgsBdkuL29c+jtqnljA2bboD tIMhMgWiaLS3xgEMrLeKnseIo0G9mviQRphGeZPFTaLb4Ww/bd5LAp4ZGc5oij76 tURatC3b6MuO4Lt5U+jWKnRwviXku8udHkVHXlvPdirawHCVinmx3tvce/BI/MaD eUOp6ZeJCPCOLtk7b8WEyxxvdY0f6D9ed82qfPDHjb94SJv+Vxb38RZtNuApIjn9 S0KdlNih/4flDy17LDxGYSyFps78lUFRbpqmsUlnZkyLXpsph7/WTvAmMAFcrX0K UgQ/F/q5GAvcP5WZcCj5+tZaRmfKQraQirXMtYU/Uj50qCnSU7ssyACASt23GLZ8 OF8tCLlm9lLOU1B6Ofkul1Dbo5f0Xpaghga4dFb0kzSfbm78fTUnqBNsJ7jIkWfi fD6dO+fg+p2ZMD0CACGo3CNxQuJmaQWg6BIDeno6God8kZ6qBMxY/sFr4qozrvFH x/FgQq8dgc8WLmaPejKiNIPkdQepXrIiv3T9jgMVyEjJnWB/LBfyWKSQOdTfnLs+ rgr4YMV6XW4bx0fYqTI8B9jZ+FCWbG6sn4UtRTHITKcd3FSvd8Y+PHa5YyCUWvJM l8pePMGF0XVF =hrsp -----END PGP SIGNATURE----- Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "This is slightly smaller than usual, with the most interesting work being still around RTNL scope reduction. Core: - More core refactoring to reduce the RTNL lock contention, including preparatory work for the per-network namespace RTNL lock, replacing RTNL lock with a per device-one to protect NAPI-related net device data and moving synchronize_net() calls outside such lock. - Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge and more specific TCP coverage. - Reduce network namespace tear-down time by removing per-subsystems synchronize_net() in tipc and sched. - Add flow label selector support for fib rules, allowing traffic redirection based on such header field. Netfilter: - Do not remove netdev basechain when last device is gone, allowing netdev basechains without devices. - Revisit the flowtable teardown strategy, dealing better with fin, reset and re-open events. - Scale-up IP-vs connection dumping by avoiding linear search on each restart. Protocols: - A significant XDP socket refactor, consolidating and optimizing several helpers into the core - Better scaling of ICMP rate-limiting, by removing false-sharing in inet peers handling. - Introduces netlink notifications for multicast IPv4 and IPv6 address changes. - Add ipsec support for IP-TFS/AggFrag encapsulation, allowing aggregation and fragmentation of the inner IP. - Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to avoid local port exhaustion issues when the average connection lifetime is very short. - Support updating keys (re-keying) for connections using kernel TLS (for TLS 1.3 only). - Support ipv4-mapped ipv6 address clients in smc-r v2. - Add support for jumbo data packet transmission in RxRPC sockets, gluing multiple data packets in a single UDP packet. - Support RxRPC RACK-TLP to manage packet loss and retransmission in conjunction with the congestion control algorithm. Driver API: - Introduce a unified and structured interface for reporting PHY statistics, exposing consistent data across different H/W via ethtool. - Make timestamping selectable, allow the user to select the desired hwtstamp provider (PHY or MAC) administratively. - Add support for configuring a header-data-split threshold (HDS) value via ethtool, to deal with partial or buggy H/W implementation. - Consolidate DSA drivers Energy Efficiency Ethernet support. - Add EEE management to phylink, making use of the phylib implementation. - Add phylib support for in-band capabilities negotiation. - Simplify how phylib-enabled mac drivers expose the supported interfaces. Tests and tooling: - Make the YNL tool package-friendly to make it easier to deploy it separately from the kernel. - Increase TCP selftest coverage importing several packetdrill test-cases. - Regenerate the ethtool uapi header from the YNL spec, to ease maintenance and future development. - Add YNL support for decoding the link types used in net self-tests, allowing a single build to run both net and drivers/net. Drivers: - Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - add cross E-Switch QoS support - add SW Steering support for ConnectX-8 - implement support for HW-Managed Flow Steering, improving the rule deletion/insertion rate - support for multi-host LAG - Intel (ixgbe, ice, igb): - ice: add support for devlink health events - ixgbe: add initial support for E610 chipset variant - igb: add support for AF_XDP zero-copy - Meta: - add support for basic RSS config - allow changing the number of channels - add hardware monitoring support - Broadcom (bnxt): - implement TCP data split and HDS threshold ethtool support, enabling Device Memory TCP. - Marvell Octeon: - implement egress ipsec offload support for the cn10k family - Hisilicon (HIBMC): - implement unicast MAC filtering - Ethernet NICs embedded and virtual: - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding contented atomic operations for drop counters - Freescale: - quicc: phylink conversion - enetc: support Tx and Rx checksum offload and improve TSO performances - MediaTek: - airoha: introduce support for ETS and HTB Qdisc offload - Microchip: - lan78XX USB: preparation work for phylink conversion - Synopsys (stmmac): - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45 - refactor EEE support to leverage the new driver API - optimize DMA and cache access to increase raw RX performances by 40% - TI: - icssg-prueth: add multicast filtering support for VLAN interface - netkit: - add ability to configure head/tailroom - VXLAN: - accepts packets with user-defined reserved bit - Ethernet switches: - Microchip: - lan969x: add RGMII support - lan969x: improve TX and RX performance using the FDMA engine - nVidia/Mellanox: - move Tx header handling to PCI driver, to ease XDP support - Ethernet PHYs: - Texas Instruments DP83822: - add support for GPIO2 clock output - Realtek: - 8169: add support for RTL8125D rev.b - rtl822x: add hwmon support for the temperature sensor - Microchip: - add support for RDS PTP hardware - consolidate periodic output signal generation - CAN: - several DT-bindings to DT schema conversions - tcan4x5x: - add HW standby support - support nWKRQ voltage selection - kvaser: - allowing Bus Error Reporting runtime configuration - WiFi: - the on-going Multi-Link Operation (MLO) effort continues, affecting both the stack and in drivers - mac80211/cfg80211: - Emergency Preparedness Communication Services (EPCS) station mode support - support for adding and removing station links for MLO - add support for WiFi 7/EHT mesh over 320 MHz channels - report Tx power info for each link - RealTek (rtw88): - enable USB Rx aggregation and USB 3 to improve performance - LED support - RealTek (rtw89): - refactor power save to support Multi-Link Operations - add support for RTL8922AE-VS variant - MediaTek (mt76): - single wiphy multiband support (preparation for MLO) - p2p device support - add TP-Link TXE50UH USB adapter support - Qualcomm (ath10k): - support for the QCA6698AQ IP core - Qualcomm (ath12k): - enable MLO for QCN9274 - Bluetooth: - Allow sysfs to trigger hdev reset, to allow recovering devices not responsive from user-space - MediaTek: add support for MT7922, MT7925, MT7921e devices - Realtek: add support for RTL8851BE devices - Qualcomm: add support for WCN785x devices - ISO: allow BIG re-sync" * tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits) net/rose: prevent integer overflows in rose_setsockopt() net: phylink: fix regression when binding a PHY net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL. ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL. ipv6: Move lifetime validation to inet6_rtm_newaddr(). ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr(). ipv6: Pass dev to inet6_addr_add(). ipv6: Convert inet6_ioctl() to per-netns RTNL. ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup(). ipv6: Hold rtnl_net_lock() in addrconf_dad_work(). ipv6: Hold rtnl_net_lock() in addrconf_verify_work(). ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL. ipv6: Add __in6_dev_get_rtnl_net(). net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags net: mii: Fix the Speed display when the network cable is not connected sysctl net: Remove macro checks for CONFIG_SYSCTL eth: bnxt: update header sizing defaults ...	2025-01-22 08:28:57 -08:00
Linus Torvalds	f96a974170	lsm/stable-6.14 PR 20250121 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQFBoUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvcA//XCdwMz0bGtWKv58nuyP8vkQx08n6 //olz/O8te3uWK5O3kRiarzFLwH8qsHQ6A7GYalwwix34hatR4ndJE0Y/guVRWa1 +aBmJxJ7Jm/q3fvpAEfqiSgreuE6kBoztlDOWEq+hUQGu4qfnQGm2EnvbvfFrAmN VheOfIQSU2KCL/Scc3FGnF6uru4WrqN0JJ9RbvrEpfdQgmcyTGLnQsZLljutWSIq kDWkteIr7cj3O9J45zpxZsTftvYSgVn/y1iKeXbHI4DBA1eheK12vsHB9AADKI1J GwHxOrnLpZtv+ICUKqcfFTmWTl+NmfJJurAT5KXKdBjL3xM5MoJlBvK1A5qE9CMo LaHVG/TZR2MmBaoM3EN+gvWhDgWlvT02Q/0cYaafTlVLMez3HtfctxN6OnCvTXTB Y8dqYClhhlBm/mHQwYfMoeKw4MftUpzEqBd1Nj7Qe8dbP0f/62Ca3K2B3D6Rf8QV pj3ryMlSWYV9mdTerruLNQexTGoN7l66jPwzdWpTbFeL3WmNtfCako8OZGbXgPIu Iahm3P+jnSVx8ZQro2c9zwdKXI5xiI335pCBbDZ8aX+JAsfj0OofHsFx5Q5diber M7tAEhxDqRisbpz7Ei+/LOAEGg2Z619XKg8ks4z6Y4P5PF7zEgeWTkZJk2iLbxXe 6LLOjmF7LLw+G4M= =fgyr -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improved handling of LSM "secctx" strings through lsm_context struct The LSM secctx string interface is from an older time when only one LSM was supported, migrate over to the lsm_context struct to better support the different LSMs we now have and make it easier to support new LSMs in the future. These changes explain the Rust, VFS, and networking changes in the diffstat. - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are enabled Small tweak to be a bit smarter about when we build the LSM's common audit helpers. - Check for absurdly large policies from userspace in SafeSetID SafeSetID policies rules are fairly small, basically just "UID:UID", it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which helps quiet a number of syzbot related issues. While work is being done to address the syzbot issues through other mechanisms, this is a trivial and relatively safe fix that we can do now. - Various minor improvements and cleanups A collection of improvements to the kernel selftests, constification of some function parameters, removing redundant assignments, and local variable renames to improve readability. * tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lockdown: initialize local array before use to quiet static analysis safesetid: check size of policy writes net: corrections for security_secid_to_secctx returns lsm: rename variable to avoid shadowing lsm: constify function parameters security: remove redundant assignment to return variable lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test binder: initialize lsm_context structure rust: replace lsm context+len with lsm_context lsm: secctx provider check on release lsm: lsm_context in security_dentry_init_security lsm: use lsm_context in security_inode_getsecctx lsm: replace context+len with lsm_context lsm: ensure the correct LSM context releaser	2025-01-21 20:03:04 -08:00
Linus Torvalds	2e04247f7c	ftrace updates for v6.14: - Have fprobes built on top of function graph infrastructure The fprobe logic is an optimized kprobe that uses ftrace to attach to functions when a probe is needed at the start or end of the function. The fprobe and kretprobe logic implements a similar method as the function graph tracer to trace the end of the function. That is to hijack the return address and jump to a trampoline to do the trace when the function exits. To do this, a shadow stack needs to be created to store the original return address. Fprobes and function graph do this slightly differently. Fprobes (and kretprobes) has slots per callsite that are reserved to save the return address. This is fine when just a few points are traced. But users of fprobes, such as BPF programs, are starting to add many more locations, and this method does not scale. The function graph tracer was created to trace all functions in the kernel. In order to do this, when function graph tracing is started, every task gets its own shadow stack to hold the return address that is going to be traced. The function graph tracer has been updated to allow multiple users to use its infrastructure. Now have fprobes be one of those users. This will also allow for the fprobe and kretprobe methods to trace the return address to become obsolete. With new technologies like CFI that need to know about these methods of hijacking the return address, going toward a solution that has only one method of doing this will make the kernel less complex. - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Remove disabling of interrupts in the function graph tracer When function graph tracer was first introduced, it could race with interrupts and NMIs. To prevent that race, it would disable interrupts and not trace NMIs. But the code has changed to allow NMIs and also interrupts. This change was done a long time ago, but the disabling of interrupts was never removed. Remove the disabling of interrupts in the function graph tracer is it is not needed. This greatly improves its performance. - Allow the :mod: command to enable tracing module functions on the kernel command line. The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Because enabling function tracing can be done very early at boot up (before scheduling is enabled), the commands that can be done when function tracing is started is limited. Having the ":mod:" command to trace module functions as they are loaded is very useful. Update the kernel command line function filtering to allow it. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ42E2RQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qqXSAPwOMxuhye8tb1GYG62QD9+w7e6nOmlC 2GCPj4detnEM2QD/ciivkhespVKhHpZHRewAuSnJgHPSM45NQ3EVESzjWQ4= =snbx -----END PGP SIGNATURE----- Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Have fprobes built on top of function graph infrastructure The fprobe logic is an optimized kprobe that uses ftrace to attach to functions when a probe is needed at the start or end of the function. The fprobe and kretprobe logic implements a similar method as the function graph tracer to trace the end of the function. That is to hijack the return address and jump to a trampoline to do the trace when the function exits. To do this, a shadow stack needs to be created to store the original return address. Fprobes and function graph do this slightly differently. Fprobes (and kretprobes) has slots per callsite that are reserved to save the return address. This is fine when just a few points are traced. But users of fprobes, such as BPF programs, are starting to add many more locations, and this method does not scale. The function graph tracer was created to trace all functions in the kernel. In order to do this, when function graph tracing is started, every task gets its own shadow stack to hold the return address that is going to be traced. The function graph tracer has been updated to allow multiple users to use its infrastructure. Now have fprobes be one of those users. This will also allow for the fprobe and kretprobe methods to trace the return address to become obsolete. With new technologies like CFI that need to know about these methods of hijacking the return address, going toward a solution that has only one method of doing this will make the kernel less complex. - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Remove disabling of interrupts in the function graph tracer When function graph tracer was first introduced, it could race with interrupts and NMIs. To prevent that race, it would disable interrupts and not trace NMIs. But the code has changed to allow NMIs and also interrupts. This change was done a long time ago, but the disabling of interrupts was never removed. Remove the disabling of interrupts in the function graph tracer is it is not needed. This greatly improves its performance. - Allow the :mod: command to enable tracing module functions on the kernel command line. The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Because enabling function tracing can be done very early at boot up (before scheduling is enabled), the commands that can be done when function tracing is started is limited. Having the ":mod:" command to trace module functions as they are loaded is very useful. Update the kernel command line function filtering to allow it. * tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits) ftrace: Implement :mod: cache filtering on kernel command line tracing: Adopt __free() and guard() for trace_fprobe.c bpf: Use ftrace_get_symaddr() for kprobe_multi probes ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr Documentation: probes: Update fprobe on function-graph tracer selftests/ftrace: Add a test case for repeating register/unregister fprobe selftests: ftrace: Remove obsolate maxactive syntax check tracing/fprobe: Remove nr_maxactive from fprobe fprobe: Add fprobe_header encoding feature fprobe: Rewrite fprobe on function-graph tracer s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS tracing: Add ftrace_fill_perf_regs() for perf event tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs fprobe: Use ftrace_regs in fprobe exit handler fprobe: Use ftrace_regs in fprobe entry handler fgraph: Pass ftrace_regs to retfunc fgraph: Replace fgraph_ret_regs with ftrace_regs ...	2025-01-21 15:15:28 -08:00
Linus Torvalds	9f3ee94e70	RCU pull request for v6.14 This pull request contains the following branches: fixes.2024.12.14a: Misc fixes, check if IRQs are disabled in rcu_exp_need_qs(), instrument KCSAN exclusive-writer assertions, add extra WARN_ON_ONCE() check, set the cpu_no_qs.b.exp under lock, warn if callback enqueued on offline CPU. rcutorture.2024.12.14a: Torture-test updates, add rcutorture.preempt_duration kernel module parameter, make the TREE03 scenario do preemption, improve pooling timeouts for rcu_torture_writer(), improve output of "Failure/close-call rcutorture reader segments", add some reader-state debugging checks, update doc of polled APIs, add extra diagnostics for per-reader-segment preemption. srcu.2024.12.14a: SRCU updates, improve doc for srcu_read_lock() in terms of return value, fix typo in comments, remove redundant GP sequence checks in the srcu_funnel_gp_start. torture-test.2024.12.14a: Add an extra test for sched_clock(), improve testing on unresponsive systems. -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEu6QRe/mAUYNn5U0PBYqkjnKWLM8FAmeGprwACgkQBYqkjnKW LM+QUwv/VVwYKI3f9eH4AcjijIFufmsP3I5REfY3s7+a6BItwZrulwhGK+mWWE+9 nKjsrrjw3sv3dEvdaUfZxwiLQBfJmWdUUGTZ748qyCpidlo4wB7OW/BR+Pn4ZiB/ rWkn28fHdDxUJV+nQGbGC82EiGrLC9XYlbTbnh9VzGEjcyKIIU3Dw5tGXzEVMn5w Tc6H6jWg8+fXxIdmhdEkjpH+rS9H160Lt1bGeGadI3LMdmMj89x5u+i6gheT83WL FBBwgNVITWPTwfQFyK4wuRcKzi/UIrRdQIU+2xqJKs6NeWhwqhFDfW8FP5brBI2o f7fFQA+CbP/oRCqXCaZKmB3i/xGeJUsJ/IJ992jq61TCLNoc3LDxovYdaHfUJgph W/8KHUc8oZMQU4CjGJkj30jnpBLPwZeZuJvuTfQZl5QuBeUivIMg89cXLpE9Vnny yof3pm++Fru8wJ0rooq3ef2A5vblpoBQcnYelQV2EJisMCOkd+P5PNVA2viTrG8F QGfmDzm1 =EwPt -----END PGP SIGNATURE----- Merge tag 'rcu.release.v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Uladzislau Rezki: "Misc fixes: - check if IRQs are disabled in rcu_exp_need_qs() - instrument KCSAN exclusive-writer assertions - add extra WARN_ON_ONCE() check - set the cpu_no_qs.b.exp under lock - warn if callback enqueued on offline CPU Torture-test updates: - add rcutorture.preempt_duration kernel module parameter - make the TREE03 scenario do preemption - improve pooling timeouts for rcu_torture_writer() - improve output of "Failure/close-call rcutorture reader segments" - add some reader-state debugging checks - update doc of polled APIs - add extra diagnostics for per-reader-segment preemption - add an extra test for sched_clock() - improve testing on unresponsive systems SRCU updates: - improve doc for srcu_read_lock() in terms of return value - fix typo in comments - remove redundant GP sequence checks in the srcu_funnel_gp_start" * tag 'rcu.release.v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (31 commits) srcu: Remove redundant GP sequence checks in srcu_funnel_gp_start srcu: Fix typo s/srcu_check_read_flavor()/__srcu_check_read_flavor()/ srcu: Guarantee non-negative return value from srcu_read_lock() MAINTAINERS: Update RCU git tree rcu: Add lockdep_assert_irqs_disabled() to rcu_exp_need_qs() rcu: Add KCSAN exclusive-writer assertions for rdp->cpu_no_qs.b.exp rcu: Make preemptible rcu_exp_handler() check idempotency rcu: Replace open-coded rcu_exp_need_qs() from rcu_exp_handler() with call rcu: Move rcu_report_exp_rdp() setting of ->cpu_no_qs.b.exp under lock rcu: Make rcu_report_exp_cpu_mult() caller acquire lock rcu: Report callbacks enqueued on offline CPU blind spot rcutorture: Use symbols for SRCU reader flavors rcutorture: Add per-reader-segment preemption diagnostics rcutorture: Read CPU ID for decoration protected by both reader types rcutorture: Add preempt_count() to rcutorture_one_extend_check() diagnostics rcutorture: Add parameters to control polled/conditional wait interval rcutorture: Add documentation for recent conditional and polled APIs rcutorture: Ignore attempts to test preemption and forward progress rcutorture: Make rcutorture_one_extend() check reader state rcutorture: Pretty-print rcutorture reader segments ...	2025-01-21 14:39:21 -08:00
Linus Torvalds	336088234e	Livepatching changes for 6.14 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmeOTvAACgkQUqAMR0iA lPKpahAAm4GqvxwQWowQmFAfdFW/1H++tADl2xCsPbmCPeKs1PBXCLTPfMDHNjWr zgHihsnJKIUQ0nUfthYrlEdYx15Ku86ucpl6p2gKBcOgpv31SG5iRbL99RnhEJ1C oeIx0XR83Imy9TnRzo0/X0MPYQAUSAEiY0oHENGBFjJpepopG61op/snacbG26AS yibEvJKOYQ3r3xPkbwp8zZ+vyblYJ7X9Tdq1/DX1Iuksz2f7sRS72XJxdjJC7QFQ 8Gyh88kbtasPmiPGOO0zRc0IMzGk0VVFa1b1zwReab7/aQKzPqAX7KQhwb4Q9JPV RuzEb3HE9v+usY1JEiW2JZijM2QXt+SYOgx0/ki7/tDKGb3c5HbVoOyhVwK2bfw7 z86/Vze3w9iLz9i2dVCmwobbZGicrBGHhejahYA8NhpGH49HRR7p5O9Nw22QgCpk ADBD2nfajDBzDTOu+s8OkQk4jPQk69LtXM9BO/nq88f5BlKOIMAY+AofPwCZj+ab KHQEDC6E+Xg03xYUGVZpek4TnpF7T9tWSc7eWGg53YQPMcgj54rR7LXzNK2dO4mP ugRC1qNUCKvjzQ5bMsCEhLhJqrszP975HSuSXIFBSzw1fNS5QNmxKepxfuCxjl08 9ZARNW3Q0mzqge7R5NeIQTcKYa/60d7cxJlWjdYHTiW5HE/xNx0= =UeIx -----END PGP SIGNATURE----- Merge tag 'livepatching-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Petr Mladek: - Add a sysfs attribute showing the livepatch ordering - Some code clean up * tag 'livepatching-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests: livepatch: add test cases of stack_order sysfs interface livepatch: Add stack_order sysfs attribute selftests/livepatch: Replace hardcoded module name with variable in test-callbacks.sh	2025-01-21 13:11:26 -08:00
Manivannan Sadhasivam	392188bb0f	selftests: pci_endpoint: Migrate to Kselftest framework Migrate the PCI endpoint test to Kselftest framework. All the tests that were part of the previous pcitest.sh file were migrated. Below is the list of tests converted: 1. BAR0 Test 2. BAR1 Test 3. BAR2 Test 4. BAR3 Test 5. BAR4 Test 6. BAR5 Test 7. Consecutive BAR Tests 8. Legacy IRQ Tests 9. MSI Interrupt Tests (MSI1 to MSI32) 10. MSI-X Interrupt Tests (MSI-X1 to MSI-X2048) 11. Read Tests - MEMCPY (For 1, 1024, 1025, 1024000, 1024001 Bytes) 12. Write Tests - MEMCPY (For 1, 1024, 1025, 1024000, 1024001 Bytes) 13. Copy Tests - MEMCPY (For 1, 1024, 1025, 1024000, 1024001 Bytes) 14. Read Tests - DMA (For 1, 1024, 1025, 1024000, 1024001 Bytes) 15. Write Tests - DMA (For 1, 1024, 1025, 1024000, 1024001 Bytes) 16. Copy Tests - DMA (For 1, 1024, 1025, 1024000, 1024001 Bytes) BAR, DMA and MEMCPY tests are added as fixture variants and can be executed separately as below: $ pci_endpoint_test -v BAR0 $ pci_endpoint_test -v dma $ pci_endpoint_test -v memcpy Link: https://lore.kernel.org/r/20250116171650.33585-5-manivannan.sadhasivam@linaro.org Co-developed-by: Aman Gupta <aman1.gupta@samsung.com> Co-developed-by: Padmanabhan Rajanbabu <p.rajanbabu@samsung.com> [mani: reworked based on the IOCTL fix, cleanups, documentation, commit message] Signed-off-by: Aman Gupta <aman1.gupta@samsung.com> Signed-off-by: Padmanabhan Rajanbabu <p.rajanbabu@samsung.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org>	2025-01-21 14:17:55 -06:00
Manivannan Sadhasivam	e19bde2269	selftests: Move PCI Endpoint tests from tools/pci to Kselftests This just moves the existing tests under tools/pci to tools/testing/selftests/pci_endpoint and adjusts the paths in Makefile accordingly. Migration to Kselftest framework will be done in subsequent commits. Link: https://lore.kernel.org/r/20250116171650.33585-4-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org>	2025-01-21 14:17:55 -06:00
Linus Torvalds	6c4aa896eb	Performance events changes for v6.14: - Seqlock optimizations that arose in a perf context and were merged into the perf tree: - seqlock: Add raw_seqcount_try_begin (Suren Baghdasaryan) - mm: Convert mm_lock_seq to a proper seqcount ((Suren Baghdasaryan) - mm: Introduce mmap_lock_speculate_{try_begin\|retry} (Suren Baghdasaryan) - mm/gup: Use raw_seqcount_try_begin() (Peter Zijlstra) - Core perf enhancements: - Reduce 'struct page' footprint of perf by mapping pages in advance (Lorenzo Stoakes) - Save raw sample data conditionally based on sample type (Yabin Cui) - Reduce sampling overhead by checking sample_type in perf_sample_save_callchain() and perf_sample_save_brstack() (Yabin Cui) - Export perf_exclude_event() (Namhyung Kim) - Uprobes scalability enhancements: (Andrii Nakryiko) - Simplify find_active_uprobe_rcu() VMA checks - Add speculative lockless VMA-to-inode-to-uprobe resolution - Simplify session consumer tracking - Decouple return_instance list traversal and freeing - Ensure return_instance is detached from the list before freeing - Reuse return_instances between multiple uretprobes within task - Guard against kmemdup() failing in dup_return_instance() - AMD core PMU driver enhancements: - Relax privilege filter restriction on AMD IBS (Namhyung Kim) - AMD RAPL energy counters support: (Dhananjay Ugwekar) - Introduce topology_logical_core_id() (K Prateek Nayak) - Remove the unused get_rapl_pmu_cpumask() function - Remove the cpu_to_rapl_pmu() function - Rename rapl_pmu variables - Make rapl_model struct global - Add arguments to the init and cleanup functions - Modify the generic variable names to _pkg - Remove the global variable rapl_msrs - Move the cntr_mask to rapl_pmus struct - Add core energy counter support for AMD CPUs - Intel core PMU driver enhancements: - Support RDPMC 'metrics clear mode' feature (Kan Liang) - Clarify adaptive PEBS processing (Kan Liang) - Factor out functions for PEBS records processing (Kan Liang) - Simplify the PEBS records processing for adaptive PEBS (Kan Liang) - Intel uncore driver enhancements: (Kan Liang) - Convert buggy pmu->func_id use to pmu->registered - Support more units on Granite Rapids Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmeOJdQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1i2yQ/+MXl7yfJOgdbwjBpgGGzH4burEO7ppak+ ktzz+YjpNgjODe/xMAJGjjblouuYArCnRolc1UPvPm6M7jSY76wi42Y6c4dRtFoB 2ReSrRqnreLOcrRS9nsTjvWRHfJHqJDVSd9TfHX6ILfzbaizCZOGYk558ZxAKRqu Lw7FOvLEe/Y3tg4z8dDg083jsasalKySP9wIPc0BkSqQTOfusd3KXju/Fux/9wkn hZcUgF4ds+0bH7xtO1/G9ILqGyeq97X1McIR9bAjln5Mxykclen4hSjRaWWHHo9O mzBKmd/blIATisfuuW+QLDQow3M1k3688cz7e9QOeWHHd/dJiMb9RLV90jdND/T/ uLINC5vNemzyWEfnNiYQ31LjhG3SeuDiKWzRp36MbQcCh6EBdRXWLBgtmxq1L/3o ZCaCdtFu5+6epycdyOVZEpWDnjdx4GmLXMZi5WJfZ7fZ/IFjNkjk4OdzI1iRQ+i3 Sbi75ep59ayTUhm5AB7gCJsP3R7EsZsiPHUenQdA2n9Sj6xE+IuhlS/QDQ9g5mdY Ijs0jHeVCGmhYoOD1xWnCZSzlnkEVU3zwfypAK+MC7pgtFMwDy5/Bu1USGxXXDy+ aKsrJRSgHbtZ1gwoHstqkV+DeCTfElCLYkvigzI5Nmyib5Zp4vkwy2ZLWQjaNjm7 mqRI7PugUkU= =c8XB -----END PGP SIGNATURE----- Merge tag 'perf-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Seqlock optimizations that arose in a perf context and were merged into the perf tree: - seqlock: Add raw_seqcount_try_begin (Suren Baghdasaryan) - mm: Convert mm_lock_seq to a proper seqcount (Suren Baghdasaryan) - mm: Introduce mmap_lock_speculate_{try_begin\|retry} (Suren Baghdasaryan) - mm/gup: Use raw_seqcount_try_begin() (Peter Zijlstra) Core perf enhancements: - Reduce 'struct page' footprint of perf by mapping pages in advance (Lorenzo Stoakes) - Save raw sample data conditionally based on sample type (Yabin Cui) - Reduce sampling overhead by checking sample_type in perf_sample_save_callchain() and perf_sample_save_brstack() (Yabin Cui) - Export perf_exclude_event() (Namhyung Kim) Uprobes scalability enhancements: (Andrii Nakryiko) - Simplify find_active_uprobe_rcu() VMA checks - Add speculative lockless VMA-to-inode-to-uprobe resolution - Simplify session consumer tracking - Decouple return_instance list traversal and freeing - Ensure return_instance is detached from the list before freeing - Reuse return_instances between multiple uretprobes within task - Guard against kmemdup() failing in dup_return_instance() AMD core PMU driver enhancements: - Relax privilege filter restriction on AMD IBS (Namhyung Kim) AMD RAPL energy counters support: (Dhananjay Ugwekar) - Introduce topology_logical_core_id() (K Prateek Nayak) - Remove the unused get_rapl_pmu_cpumask() function - Remove the cpu_to_rapl_pmu() function - Rename rapl_pmu variables - Make rapl_model struct global - Add arguments to the init and cleanup functions - Modify the generic variable names to _pkg - Remove the global variable rapl_msrs - Move the cntr_mask to rapl_pmus struct - Add core energy counter support for AMD CPUs Intel core PMU driver enhancements: - Support RDPMC 'metrics clear mode' feature (Kan Liang) - Clarify adaptive PEBS processing (Kan Liang) - Factor out functions for PEBS records processing (Kan Liang) - Simplify the PEBS records processing for adaptive PEBS (Kan Liang) Intel uncore driver enhancements: (Kan Liang) - Convert buggy pmu->func_id use to pmu->registered - Support more units on Granite Rapids" * tag 'perf-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) perf: map pages in advance perf/x86/intel/uncore: Support more units on Granite Rapids perf/x86/intel/uncore: Clean up func_id perf/x86/intel: Support RDPMC metrics clear mode uprobes: Guard against kmemdup() failing in dup_return_instance() perf/x86: Relax privilege filter restriction on AMD IBS perf/core: Export perf_exclude_event() uprobes: Reuse return_instances between multiple uretprobes within task uprobes: Ensure return_instance is detached from the list before freeing uprobes: Decouple return_instance list traversal and freeing uprobes: Simplify session consumer tracking uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution uprobes: simplify find_active_uprobe_rcu() VMA checks mm: introduce mmap_lock_speculate_{try_begin\|retry} mm: convert mm_lock_seq to a proper seqcount mm/gup: Use raw_seqcount_try_begin() seqlock: add raw_seqcount_try_begin perf/x86/rapl: Add core energy counter support for AMD CPUs perf/x86/rapl: Move the cntr_mask to rapl_pmus struct perf/x86/rapl: Remove the global variable rapl_msrs ...	2025-01-21 10:52:03 -08:00
James Bottomley	87e6cd7cdb	selftests/efivarfs: add concurrent update tests The delete on last close functionality can now only be tested properly by using multiple threads to hold open the variable files and testing what happens as they complete. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2025-01-21 16:34:41 +01:00
Linus Torvalds	95ec54a420	powerpc updates for 6.14 - Add preempt lazy support - Deprecate cxl and cxl flash driver - Fix a possible IOMMU related OOPS at boot on pSeries - Optimize sched_clock() in ppc32 by replacing mulhdu() by mul_u64_u64_shr() Thanks to: Andrew Donnellan, Andy Shevchenko, Ankur Arora, Christophe Leroy, Frederic Barrat, Gaurav Batra, Luis Felipe Hernandez, Michael Ellerman, Nilay Shroff, Ricardo B. Marliere, Ritesh Harjani (IBM), Sebastian Andrzej Siewior, Shrikanth Hegde, Sourabh Jain, Thorsten Blum, Zhu Jun. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmePIeIACgkQpnEsdPSH ZJTLRxAAmtarhPItiCQxwi0uyQpuzBoypcVuX8M9qpAUr1cQJv1swPlJI0tFW2xV QDK37FlCytYib1oMJpwyhg5DA8kdg08OuWtvGRVxGu4O+vh2v0aehewAfPsBKBwq JTOhjSlAeDPgsYQQlK6baSlfjb4kYlAFr2mh/oJIfXi2BFV1MB7rQmCXq2sPnfKS 9cFFgsZ74fFhbYOn9qFsldnzb9TPxR0/UcTOETqRcGOjiExv4aYlmWtKGMY/nLkN k5go3xoB5WP7z11clmg0pp+RIoYKR41kR58CtGdcCEEXJJ6WBGPhPLQzT5cLBkMi ppZieQNKrZK7J/udrdKP0+2cTmBTbCpjxHicLf7BhzsWwVxHCnyjrJIzUPuLcDUi Ym9AXsmzBsqMudqnR0lslsY2mUvZOJPYh4ZCKTA5S0TDYWGy/HlAlL7sMs2uCzaM 4g8MVpEJLVo4GAoZM96x4RMcPi4RlHYXbYqNpENRkxiZu2fDoRz9WStPCdda59/D 3rQNaSDT1vBpue9ac6EIMeGgNh+f6q6WKh/PA48QBYDTp/IVbfShD+xiXtaa72cZ W+JmWUwBRyM4HOP0C5yhXBXwL6a5sHj+d6R4gng4UUww7VppJmkZpBhXZsN4VS55 Xos+2Q75FBSQkAZa84yK6dXvFW3v/upIdSXuWTkSgoKs+4Z7dG8= =ctY4 -----END PGP SIGNATURE----- Merge tag 'powerpc-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - Add preempt lazy support - Deprecate cxl and cxl flash driver - Fix a possible IOMMU related OOPS at boot on pSeries - Optimize sched_clock() in ppc32 by replacing mulhdu() by mul_u64_u64_shr() Thanks to Andrew Donnellan, Andy Shevchenko, Ankur Arora, Christophe Leroy, Frederic Barrat, Gaurav Batra, Luis Felipe Hernandez, Michael Ellerman, Nilay Shroff, Ricardo B. Marliere, Ritesh Harjani (IBM), Sebastian Andrzej Siewior, Shrikanth Hegde, Sourabh Jain, Thorsten Blum, and Zhu Jun. * tag 'powerpc-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix argument order to timer_sub() powerpc/prom_init: Use IS_ENABLED() powerpc/pseries/iommu: IOMMU incorrectly marks MMIO range in DDW powerpc: Use str_on_off() helper in check_cache_coherency() powerpc: Large user copy aware of full:rt:lazy preemption powerpc: Add preempt lazy support powerpc/book3s64/hugetlb: Fix disabling hugetlb when fadump is active powerpc/vdso: Mark the vDSO code read-only after init powerpc/64: Use get_user() in start_thread() macintosh: declare ctl_table as const selftest/powerpc/ptrace: Cleanup duplicate macro definitions selftest/powerpc/ptrace/ptrace-pkey: Remove duplicate macros selftest/powerpc/ptrace/core-pkey: Remove duplicate macros powerpc/8xx: Drop legacy-of-mm-gpiochip.h header scsi/cxlflash: Deprecate driver cxl: Deprecate driver selftests/powerpc: Fix typo in test-vphn.c powerpc/xmon: Use str_yes_no() helper in dump_one_paca() powerpc/32: Replace mulhdu() by mul_u64_u64_shr()	2025-01-20 21:40:19 -08:00
Linus Torvalds	9ad09c4f28	arm64 updates for 6.14 Confidential Computing: * Register a platform device when running in CCA realm mode to enable automatic loading of dependent modules. CPU Features: * Update a bunch of system register definitions to pick up new field encodings from the architectural documentation. * Add hwcaps and selftests for the new (2024) dpISA extensions. Documentation: * Update EL3 (firmware) requirements for booting Linux on modern arm64 designs. * Remove stale information about the kernel virtual memory map. Miscellaneous: * Minor cleanups and typo fixes. Memory management: * Fix vmemmap_check_pmd() to look at the PMD type bits * LPA2 (52-bit physical addressing) cleanups and minor fixes. * Adjust physical address space depending upon whether or not LPA2 is enabled. Perf and PMUs: * Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU * Extend AXI filtering support for the DDR PMU on NXP IMX SoCs * Fix Designware PCIe PMU event numbering. * Add generic branch events for the Apple M1 CPU PMU. * Add support for Marvell Odyssey DDR and LLC-TAD PMUs. * Cleanups to the Hisilicon DDRC and Uncore PMU code. * Advertise discard mode for the SPE PMU. * Add the perf users mailing list to our MAINTAINERS entry. -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmeKZLcQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNEQzB/0X2U89ZiqxIkTPQvfFrjN/uUGybkq59rEL DfeoGukTgJIwc3GHWXXtQ//wuuYKdTeCXaIz5NFK3+7/wmKSLvjkexmue8pta6EY 5rx9bAPr/D8lAUvhKIN2l3pF/ygoRwDz+nT2yVQ1xlZxYJWX7ZIsMj7W7ceb5kdx HRrTSQuhEEPREAWWO4oCMWl5SQZSrIflSE3Be/PsP0OhW6k//ZmWbcJTgUcHbKam o2WtNjITyGzxMpRCcrGEZKoe9YcwSxiut/PoD7JuoB4C/rbsf1cdJ6uLmtvGJcZj qsdRHhVfBzP1+ahONrDbiT3C2+s1UZySKdCDIxiYy6lB39wpP0dd =E7Mf -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "We've got a little less than normal thanks to the holidays in December, but there's the usual summary below. The highlight is probably the 52-bit physical addressing (LPA2) clean-up from Ard. Confidential Computing: - Register a platform device when running in CCA realm mode to enable automatic loading of dependent modules CPU Features: - Update a bunch of system register definitions to pick up new field encodings from the architectural documentation - Add hwcaps and selftests for the new (2024) dpISA extensions Documentation: - Update EL3 (firmware) requirements for booting Linux on modern arm64 designs - Remove stale information about the kernel virtual memory map Miscellaneous: - Minor cleanups and typo fixes Memory management: - Fix vmemmap_check_pmd() to look at the PMD type bits - LPA2 (52-bit physical addressing) cleanups and minor fixes - Adjust physical address space depending upon whether or not LPA2 is enabled Perf and PMUs: - Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU - Extend AXI filtering support for the DDR PMU on NXP IMX SoCs - Fix Designware PCIe PMU event numbering - Add generic branch events for the Apple M1 CPU PMU - Add support for Marvell Odyssey DDR and LLC-TAD PMUs - Cleanups to the Hisilicon DDRC and Uncore PMU code - Advertise discard mode for the SPE PMU - Add the perf users mailing list to our MAINTAINERS entry" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits) Documentation: arm64: Remove stale and redundant virtual memory diagrams perf docs: arm_spe: Document new discard mode perf: arm_spe: Add format option for discard mode MAINTAINERS: Add perf list for drivers/perf/ arm64: Remove duplicate included header drivers/perf: apple_m1: Map generic branch events arm64: rsi: Add automatic arm-cca-guest module loading kselftest/arm64: Add 2024 dpISA extensions to hwcap test KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1 arm64/hwcap: Describe 2024 dpISA extensions to userspace arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-12 arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented drivers/perf: hisi: Set correct IRQ affinity for PMUs with no association arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu() arm64: mm: Test for pmd_sect() in vmemmap_check_pmd() arm64/mm: Replace open encodings with PXD_TABLE_BIT arm64/mm: Rename pte_mkpresent() as pte_mkvalid() arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09 ...	2025-01-20 21:21:49 -08:00
Linus Torvalds	fadc3ed9ce	execve updates for v6.14-rc1 - exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case (Tycho Andersen, Kees Cook) - binfmt_misc: Fix comment typos (Christophe JAILLET) - exec: move empty argv[0] warning closer to actual logic (Nir Lichtman) - exec: remove legacy custom binfmt modules autoloading (Nir Lichtman) - binfmt_flat: Fix integer overflow bug on 32 bit systems (Dan Carpenter) - exec: Make sure set_task_comm() always NUL-terminates - coredump: Do not lock when copying "comm" - MAINTAINERS: add auxvec.h and set myself as maintainer -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hNmQAKCRA2KwveOeQk u0/nAQCTGU0zqhdO6t7ABsL3p9kJ2jVRA5njAoX7A/9jGPSWEQD/boRMqZuUpthV nMevcQ2F4u0A7kJJBMK05YdXWHkYqgk= =49Di -----END PGP SIGNATURE----- Merge tag 'execve-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case (Tycho Andersen, Kees Cook) - binfmt_misc: Fix comment typos (Christophe JAILLET) - move empty argv[0] warning closer to actual logic (Nir Lichtman) - remove legacy custom binfmt modules autoloading (Nir Lichtman) - Make sure set_task_comm() always NUL-terminates - binfmt_flat: Fix integer overflow bug on 32 bit systems (Dan Carpenter) - coredump: Do not lock when copying "comm" - MAINTAINERS: add auxvec.h and set myself as maintainer * tag 'execve-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: binfmt_flat: Fix integer overflow bug on 32 bit systems selftests/exec: add a test for execveat()'s comm exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case exec: Make sure task->comm is always NUL-terminated exec: remove legacy custom binfmt modules autoloading exec: move warning of null argv to be next to the relevant code fs: binfmt: Fix a typo MAINTAINERS: exec: Mark Kees as maintainer MAINTAINERS: exec: Add auxvec.h UAPI coredump: Do not lock during 'comm' reporting	2025-01-20 13:27:58 -08:00
Rafael J. Wysocki	1c91c99075	Merge branch 'pm-tools' Merge cpupower utility updates for 6.14: - Fix TSC MHz calculation in cpupower (He Rongguang). - Add install and uninstall options to bindings Makefile and add header changes for cpufreq.h to SWIG bindings in cpupower (John B. Wyatt IV). - Add missing residency header changes in cpuidle.h to SWIG bindings in cpupower (John B. Wyatt IV). - Add output files to .gitignore and clean them up in "make clean" in selftests/cpufreq (Li Zhijian). - Fix cross-compilation in cpupower Makefile (Peng Fan). - Revise the is_valid flag handling for idle_monitor in the cpupower utility (wangfushuai). - Extend and clean up AMD processors support in cpupower (Mario Limonciello). * pm-tools: pm: cpupower: Add missing residency header changes in cpuidle.h to SWIG pm: cpupower: Add header changes for cpufreq.h to SWIG bindings pm: cpupower: Add install and uninstall options to bindings makefile cpupower: Adjust whitespace for amd-pstate specific prints cpupower: Don't fetch maximum latency when EPP is enabled cpupower: Add support for showing energy performance preference cpupower: Don't try to read frequency from hardware when kernel uses aperfmperf cpupower: Add support for amd-pstate preferred core rankings cpupower: Add support for parsing 'enabled' or 'disabled' strings from table cpupower: Remove spurious return statement cpupower: fix TSC MHz calculation cpupower: revise is_valid flag handling for idle_monitor pm: cpupower: Makefile: Fix cross compilation selftests/cpufreq: gitignore output files and clean them in make clean	2025-01-20 21:01:50 +01:00
Liu Ye	3a0b7fa095	selftests/net/ipsec: Fix Null pointer dereference in rtattr_pack() Address Null pointer dereference / undefined behavior in rtattr_pack (note that size is 0 in the bad case). Flagged by cppcheck as: tools/testing/selftests/net/ipsec.c:230:25: warning: Possible null pointer dereference: payload [nullPointer] memcpy(RTA_DATA(attr), payload, size); ^ tools/testing/selftests/net/ipsec.c:1618:54: note: Calling function 'rtattr_pack', 4th argument 'NULL' value is 0 if (rtattr_pack(&req.nh, sizeof(req), XFRMA_IF_ID, NULL, 0)) { ^ tools/testing/selftests/net/ipsec.c:230:25: note: Null pointer dereference memcpy(RTA_DATA(attr), payload, size); ^ Signed-off-by: Liu Ye <liuye@kylinos.cn> Link: https://patch.msgid.link/20250116013037.29470-1-liuye@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-20 11:25:25 -08:00
Linus Torvalds	100ceb4817	vfs-6.14-rc1.mount.v2 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ44+LwAKCRCRxhvAZXjc orNaAQCGDqtxgqgGLsdx9dw7yTxOm9opYBaG5qN7KiThLAz2PwD+MsHNNlLVEOKU IQo9pa23UFUhTipFSeszOWza5SGlxg4= =hdst -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - Add a mountinfo program to demonstrate statmount()/listmount() Add a new "mountinfo" sample userland program that demonstrates how to use statmount() and listmount() to get at the same info that /proc/pid/mountinfo provides - Remove pointless nospec.h include - Prepend statmount.mnt_opts string with security_sb_mnt_opts() Currently these mount options aren't accessible via statmount() - Add new mount namespaces to mount namespace rbtree outside of the namespace semaphore - Lockless mount namespace lookup Currently we take the read lock when looking for a mount namespace to list mounts in. We can make this lockless. The simple search case can just use a sequence counter to detect concurrent changes to the rbtree For walking the list of mount namespaces sequentially via nsfs we keep a separate rcu list as rb_prev() and rb_next() aren't usable safely with rcu. Currently there is no primitive for retrieving the previous list member. To do this we need a new deletion primitive that doesn't poison the prev pointer and a corresponding retrieval helper Since creating mount namespaces is a relatively rare event compared with querying mounts in a foreign mount namespace this is worth it. Once libmount and systemd pick up this mechanism to list mounts in foreign mount namespaces this will be used very frequently - Add extended selftests for lockless mount namespace iteration - Add a sample program to list all mounts on the system, i.e., in all mount namespaces - Improve mount namespace iteration performance Make finding the last or first mount to start iterating the mount namespace from an O(1) operation and add selftests for iterating the mount table starting from the first and last mount - Use an xarray for the old mount id While the ida does use the xarray internally we can use it explicitly which allows us to increment the unique mount id under the xa lock. This allows us to remove the atomic as we're now allocating both ids in one go - Use a shared header for vfs sample programs - Fix build warnings for new sample program to list all mounts * tag 'vfs-6.14-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: samples/vfs: fix build warnings samples/vfs: use shared header samples/vfs/mountinfo: Use __u64 instead of uint64_t fs: remove useless lockdep assertion fs: use xarray for old mount id selftests: add listmount() iteration tests fs: cache first and last mount samples: add test-list-all-mounts selftests: remove unneeded include selftests: add tests for mntns iteration seltests: move nsfs into filesystems subfolder fs: simplify rwlock to spinlock fs: lockless mntns lookup for nsfs rculist: add list_bidir_{del,prev}_rcu() fs: lockless mntns rbtree lookup fs: add mount namespace to rbtree late fs: prepend statmount.mnt_opts string with security_sb_mnt_opts() mount: remove inlude/nospec.h include samples: add a mountinfo program to demonstrate statmount()/listmount()	2025-01-20 10:44:51 -08:00
Linus Torvalds	1a89a6924b	kernel-6.14-rc1.pid -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pR0wAKCRCRxhvAZXjc ojb2AQD5QfpTEX/ju1TkenTvoNl+JfnIjaVSY40Lm9DWYzmCMAEAuRvf5WRIV713 00/RVOrUvsLobzhmnk0yw53EQ5A+pA0= =2NDA -----END PGP SIGNATURE----- Merge tag 'kernel-6.14-rc1.pid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pid_max namespacing update from Christian Brauner: "The pid_max sysctl is a global value. For a long time the default value has been 65535 and during the pidfd dicussions Linus proposed to bump pid_max by default. Based on this discussion systemd started bumping pid_max to 2^22. So all new systems now run with a very high pid_max limit with some distros having also backported that change. The decision to bump pid_max is obviously correct. It just doesn't make a lot of sense nowadays to enforce such a low pid number. There's sufficient tooling to make selecting specific processes without typing really large pid numbers available. In any case, there are workloads that have expections about how large pid numbers they accept. Either for historical reasons or architectural reasons. One concreate example is the 32-bit version of Android's bionic libc which requires pid numbers less than 65536. There are workloads where it is run in a 32-bit container on a 64-bit kernel. If the host has a pid_max value greater than 65535 the libc will abort thread creation because of size assumptions of pthread_mutex_t. That's a fairly specific use-case however, in general specific workloads that are moved into containers running on a host with a new kernel and a new systemd can run into issues with large pid_max values. Obviously making assumptions about the size of the allocated pid is suboptimal but we have userspace that does it. Of course, giving containers the ability to restrict the number of processes in their respective pid namespace indepent of the global limit through pid_max is something desirable in itself and comes in handy in general. Independent of motivating use-cases the existence of pid namespaces makes this also a good semantical extension and there have been prior proposals pushing in a similar direction. The trick here is to minimize the risk of regressions which I think is doable. The fact that pid namespaces are hierarchical will help us here. What we mostly care about is that when the host sets a low pid_max limit, say (crazy number) 100 that no descendant pid namespace can allocate a higher pid number in its namespace. Since pid allocation is hierarchial this can be ensured by checking each pid allocation against the pid namespace's pid_max limit. This means if the allocation in the descendant pid namespace succeeds, the ancestor pid namespace can reject it. If the ancestor pid namespace has a higher limit than the descendant pid namespace the descendant pid namespace will reject the pid allocation. The ancestor pid namespace will obviously not care about this. All in all this means pid_max continues to enforce a system wide limit on the number of processes but allows pid namespaces sufficient leeway in handling workloads with assumptions about pid values and allows containers to restrict the number of processes in a pid namespace through the pid_max interface" * tag 'kernel-6.14-rc1.pid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: tests/pid_namespace: add pid_max tests pid: allow pid_max to be set per pid namespace	2025-01-20 10:29:11 -08:00
Linus Torvalds	5f85bd6aec	vfs-6.14-rc1.pidfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRdwAKCRCRxhvAZXjc otQjAP9ooUH2d/jHZ49Rw4q/3BkhX8R2fFEZgj2PMvtYlr0jQwD/d8Ji0k4jINTL AIFRfPdRwrD+X35IUK3WPO42YFZ4rAg= =5wgo -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfs updates from Christian Brauner: - Rework inode number allocation Recently we received a patchset that aims to enable file handle encoding and decoding via name_to_handle_at(2) and open_by_handle_at(2). A crucical step in the patch series is how to go from inode number to struct pid without leaking information into unprivileged contexts. The issue is that in order to find a struct pid the pid number in the initial pid namespace must be encoded into the file handle via name_to_handle_at(2). This can be used by containers using a separate pid namespace to learn what the pid number of a given process in the initial pid namespace is. While this is a weak information leak it could be used in various exploits and in general is an ugly wart in the design. To solve this problem a new way is needed to lookup a struct pid based on the inode number allocated for that struct pid. The other part is to remove the custom inode number allocation on 32bit systems that is also an ugly wart that should go away. Allocate unique identifiers for struct pid by simply incrementing a 64 bit counter and insert each struct pid into the rbtree so it can be looked up to decode file handles avoiding to leak actual pids across pid namespaces in file handles. On both 64 bit and 32 bit the same 64 bit identifier is used to lookup struct pid in the rbtree. On 64 bit the unique identifier for struct pid simply becomes the inode number. Comparing two pidfds continues to be as simple as comparing inode numbers. On 32 bit the 64 bit number assigned to struct pid is split into two 32 bit numbers. The lower 32 bits are used as the inode number and the upper 32 bits are used as the inode generation number. Whenever a wraparound happens on 32 bit the 64 bit number will be incremented by 2 so inode numbering starts at 2 again. When a wraparound happens on 32 bit multiple pidfds with the same inode number are likely to exist. This isn't a problem since before pidfs pidfds used the anonymous inode meaning all pidfds had the same inode number. On 32 bit sserspace can thus reconstruct the 64 bit identifier by retrieving both the inode number and the inode generation number to compare, or use file handles. This gives the same guarantees on both 32 bit and 64 bit. - Implement file handle support This is based on custom export operation methods which allows pidfs to implement permission checking and opening of pidfs file handles cleanly without hacking around in the core file handle code too much. - Support bind-mounts Allow bind-mounting pidfds. Similar to nsfs let's allow bind-mounts for pidfds. This allows pidfds to be safely recovered and checked for process recycling. Instead of checking d_ops for both nsfs and pidfs we could in a follow-up patch add a flag argument to struct dentry_operations that functions similar to file_operations->fop_flags. * tag 'vfs-6.14-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests: add pidfd bind-mount tests pidfs: allow bind-mounts pidfs: lookup pid through rbtree selftests/pidfd: add pidfs file handle selftests pidfs: check for valid ioctl commands pidfs: implement file handle support exportfs: add permission method fhandle: pull CAP_DAC_READ_SEARCH check into may_decode_fh() exportfs: add open method fhandle: simplify error handling pseudofs: add support for export_ops pidfs: support FS_IOC_GETVERSION pidfs: remove 32bit inode number handling pidfs: rework inode number allocation	2025-01-20 09:59:00 -08:00
Yonghong Song	14a627fe79	selftests/bpf: Add some tests related to 'may_goto 0' insns Add both asm-based and C-based tests which have 'may_goto 0' insns. For the following code in C-based test, int i, tmp[3]; for (i = 0; i < 3 && can_loop; i++) tmp[i] = 0; The clang compiler (clang 19 and 20) generates may_goto 2 may_goto 1 may_goto 0 r1 = 0 r2 = 0 r3 = 0 The above asm codes are due to llvm pass SROAPass. This ensures the successful verification since tmp[0-2] are initialized. Otherwise, the code without SROAPass like may_goto 5 r1 = 0 may_goto 3 r2 = 0 may_goto 1 r3 = 0 will have verification failure. Although from the source code C-based test should have verification failure, clang compiler optimization generates code with successful verification. If gcc generates different asm codes than clang, the following code can be used for gcc: int i, tmp[3]; for (i = 0; i < 3; i++) tmp[i] = 0; Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250118192034.2124952-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-20 09:47:14 -08:00
Linus Torvalds	4b84a4c8d4	vfs-6.14-rc1.misc -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRjQAKCRCRxhvAZXjc omUyAP9k31Qr7RY1zNtmpPfejqc+3Xx+xXD7NwHr+tONWtUQiQEA/F94qU2U3ivS AzyDABWrEQ5ZNsm+Rq2Y3zyoH7of3ww= =s3Bu -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Support caching symlink lengths in inodes The size is stored in a new union utilizing the same space as i_devices, thus avoiding growing the struct or taking up any more space When utilized it dodges strlen() in vfs_readlink(), giving about 1.5% speed up when issuing readlink on /initrd.img on ext4 - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag If a file system supports uncached buffered IO, it may set FOP_DONTCACHE and enable support for RWF_DONTCACHE. If RWF_DONTCACHE is attempted without the file system supporting it, it'll get errored with -EOPNOTSUPP - Enable VBOXGUEST and VBOXSF_FS on ARM64 Now that VirtualBox is able to run as a host on arm64 (e.g. the Apple M3 processors) we can enable VBOXSF_FS (and in turn VBOXGUEST) for this architecture. Tested with various runs of bonnie++ and dbench on an Apple MacBook Pro with the latest Virtualbox 7.1.4 r165100 installed Cleanups: - Delay sysctl_nr_open check in expand_files() - Use kernel-doc includes in fiemap docbook - Use page->private instead of page->index in watch_queue - Use a consume fence in mnt_idmap() as it's heavily used in link_path_walk() - Replace magic number 7 with ARRAY_SIZE() in fc_log - Sort out a stale comment about races between fd alloc and dup2() - Fix return type of do_mount() from long to int - Various cosmetic cleanups for the lockref code Fixes: - Annotate spinning as unlikely() in __read_seqcount_begin The annotation already used to be there, but got lost in commit `52ac39e5db` ("seqlock: seqcount_t: Implement all read APIs as statement expressions") - Fix proc_handler for sysctl_nr_open - Flush delayed work in delayed fput() - Fix grammar and spelling in propagate_umount() - Fix ESP not readable during coredump In /proc/PID/stat, there is the kstkesp field which is the stack pointer of a thread. While the thread is active, this field reads zero. But during a coredump, it should have a valid value However, at the moment, kstkesp is zero even during coredump - Don't wake up the writer if the pipe is still full - Fix unbalanced user_access_end() in select code" * tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) gfs2: use lockref_init for qd_lockref erofs: use lockref_init for pcl->lockref dcache: use lockref_init for d_lockref lockref: add a lockref_init helper lockref: drop superfluous externs lockref: use bool for false/true returns lockref: improve the lockref_get_not_zero description lockref: remove lockref_put_not_zero fs: Fix return type of do_mount() from long to int select: Fix unbalanced user_access_end() vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64 pipe_read: don't wake up the writer if the pipe is still full selftests: coredump: Add stackdump test fs/proc: do_task_stat: Fix ESP not readable during coredump fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag fs: sort out a stale comment about races between fd alloc and dup2 fs: Fix grammar and spelling in propagate_umount() fs: fc_log replace magic number 7 with ARRAY_SIZE() fs: use a consume fence in mnt_idmap() file: flush delayed work in delayed fput() ...	2025-01-20 09:40:49 -08:00
Hou Tao	0a5d2efa38	selftests/bpf: Add test case for the freeing of bpf_timer The main purpose of the test is to demonstrate the lock problem for the free of bpf_timer under PREEMPT_RT. When freeing a bpf_timer which is running on other CPU in bpf_timer_cancel_and_free(), hrtimer_cancel() will try to acquire a spin-lock (namely softirq_expiry_lock), however the freeing procedure has already held a raw-spin-lock. The test first creates two threads: one to start timers and the other to free timers. The start-timers thread will start the timer and then wake up the free-timers thread to free these timers when the starts complete. After freeing, the free-timer thread will wake up the start-timer thread to complete the current iteration. A loop of 10 iterations is used. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250117101816.2101857-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-20 09:09:02 -08:00
Petr Mladek	49dcb50d6c	Merge branch 'for-6.14/selftests-trivial' into for-linus	2025-01-20 14:25:01 +01:00
Paolo Bonzini	43f640f4b9	KVM/riscv changes for 6.14 - Svvptc, Zabha, and Ziccrse extension support for Guest/VM - Virtualize SBI system suspend extension for Guest/VM - Trap related exit statstics as SBI PMU firmware counters for Guest/VM -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmeJ3zYACgkQrUjsVaLH LAdd4hAAj7MDKBXFfkgrg6M5b3CgjFp/1FbiQQe+gJNEOrDdWJy115imk2IKbEZq MAHLF1nlavmi3ZxTCGRPamyyVej5A0+oNujl+ktltHXIKPqXVcJvz5Iu15uDJNZf Ow6Vz74HONl3UKH1H029hFA8oXBTzlxyLv8EXfUJ/HZ/bUeQAfgX52pdvTZ54o4Z gJy+dODD7LOIh9WSlqsrqRcaZkQ4uWZNzQVuXQyubon/AhyqLZyoX/Kx+eLg0QSG LwwXQA5ew4fG3heOWGSSVhiBeqKj7Kmk7l3tyWe6f/YUw7BP1EEwY6ufdH2jy6Z3 mEE/3+mYhkBvr6sWCMPgGt8IMGqe6vRPE1SjPRk3xjzQN4a5aSGXA3J2bBHZhdgt ksGKD0CNhFv/E9LyeQnIylqQqNL9IIb35hrRmVCGzeOJB5PhqDRZiuyyz4LMOgY9 1uY6c5fuSxV0IIZV2Y4oTxZ26dCBkGzR5rrVBSakSF1xWfU+0rzX91FslEUzPyDO W+RcG9ziFTdCbYkTGlt6sSiXRwkYe/TD6VpDgsxbQECgW/9Itg2BSCZojuy7Lfo9 idhjHLIouruGyQKrJmadUdOuHzLOCX8XMo1oTjlrPudNoILC6GmZs+X7xUUJ6Fzi mOgxcUBsByzZLhyhPnbwS0o0D7La7HJbuNne8VSHfEjPhr8/yl0= =EQ0o -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-6.14-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.14 - Svvptc, Zabha, and Ziccrse extension support for Guest/VM - Virtualize SBI system suspend extension for Guest/VM - Trap related exit statstics as SBI PMU firmware counters for Guest/VM	2025-01-20 07:01:17 -05:00
Paolo Bonzini	4f7ff70c05	KVM x86 misc changes for 6.14: - Overhaul KVM's CPUID feature infrastructure to replace "governed" features with per-vCPU tracking of the vCPU's capabailities for all features. Along the way, refactor the code to make it easier to add/modify features, and add a variety of self-documenting macro types to again simplify adding new features and to help readers understand KVM's handling of existing features. - Rework KVM's handling of VM-Exits during event vectoring to plug holes where KVM unintentionally puts the vCPU into infinite loops in some scenarios, e.g. if emulation is triggered by the exit, and to bring parity between VMX and SVM. - Add pending request and interrupt injection information to the kvm_exit and kvm_entry tracepoints respectively. - Fix a relatively benign flaw where KVM would end up redoing RDPKRU when loading guest/host PKRU due to a refactoring of the kernel helpers that didn't account for KVM's pre-checking of the need to do WRPKRU. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmeJngsACgkQOlYIJqCj N/1dfA/+NIZmnd8OV9Zvc6HGxrzgt4QsM9pmsUmrfkDWefxMYIAMeaW8Vn4CJfRf zY/UcqyNI7JYxSiuVTckz+Tf54HhqYaLrUwILGCQ49koirZx+aQT1OUfjLroVMlh ffX1i6GOoLNtxjb9MXM/heLVdUbvmzQMSFkd/AkOH+nrOtDNOiPlZfjHsewj9zrf BNJGhzvT4M6vc/AsScC7tc0yFD5KKFRv8tVwJ6Zf1nWKyUDOSpMTWkVnq6geKJPZ iGBZPPNg55Oy1g6uj6VYWmqYTD8Qioz5jtEJ/8pPHdAyIFo21s81bfJc548d+QLh KfrL1K7TrCOhSAGC3Cb3lTLeq2immmGHaiTBLwGABG4MhpiX4NVpMMdOyFbVLMOS HIYuwXwDckm1pfU7/w+PgPaakCyPrXQntm+3Y2pvDOoY6e2JbwodK4j8BvvQda35 8TrYKEGFvq5aij7Iw1O9TUoLAocDM/sHIHE6BCazHyzKBIv9xLRFeabiCQ+A1pwv gZk5u0+j+DPpLdeLhbMYhIXUtr3bvyMYvc+tRkG716f8ubAE3+Kn5BEDo4Ot2DcT vc+NTRYYWN6zavHiJH3Ddt153yj256JCZhLwCdfbryCQdz3Mpy16m36tgkDRd3lR QT4IkPQo1Vl/aU0yiE/dhnJgh1rTO26YQjZoHs5Oj16d0HRrKyc= =32mM -----END PGP SIGNATURE----- Merge tag 'kvm-x86-misc-6.14' of https://github.com/kvm-x86/linux into HEAD KVM x86 misc changes for 6.14: - Overhaul KVM's CPUID feature infrastructure to track all vCPU capabilities instead of just those where KVM needs to manage state and/or explicitly enable the feature in hardware. Along the way, refactor the code to make it easier to add features, and to make it more self-documenting how KVM is handling each feature. - Rework KVM's handling of VM-Exits during event vectoring; this plugs holes where KVM unintentionally puts the vCPU into infinite loops in some scenarios (e.g. if emulation is triggered by the exit), and brings parity between VMX and SVM. - Add pending request and interrupt injection information to the kvm_exit and kvm_entry tracepoints respectively. - Fix a relatively benign flaw where KVM would end up redoing RDPKRU when loading guest/host PKRU, due to a refactoring of the kernel helpers that didn't account for KVM's pre-checking of the need to do WRPKRU.	2025-01-20 06:49:39 -05:00
Jiri Kosina	670af65d2a	Merge branch 'for-6.14/constify-bin-attribute' into for-linus - constification of 'struct bin_attribute' in various HID driver (Thomas Weißschuh)	2025-01-20 09:58:12 +01:00
James Bottomley	fd3aa3d5e5	selftests/efivarfs: fix tests for failed write removal The current self tests expect the zero size remnants that failed variable creation leaves. Update the tests to verify these are now absent. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2025-01-19 17:51:16 +01:00
James Bottomley	8a32d46b20	selftests/efivarfs: add check for disallowing file truncation Now that the ability of arbitrary writes to set the inode size is fixed, verify that a variable file accepts a truncation operation but does not change the stat size because of it. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2025-01-19 17:49:10 +01:00
Palmer Dabbelt	2613c15b0c	Merge patch series "riscv: Add support for xtheadvector" Charlie Jenkins <charlie@rivosinc.com> says: xtheadvector is a custom extension that is based upon riscv vector version 0.7.1 [1]. All of the vector routines have been modified to support this alternative vector version based upon whether xtheadvector was determined to be supported at boot. vlenb is not supported on the existing xtheadvector hardware, so a devicetree property thead,vlenb is added to provide the vlenb to Linux. There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is used to request which thead vendor extensions are supported on the current platform. This allows future vendors to allocate hwprobe keys for their vendor. Support for xtheadvector is also added to the vector kselftests. [1] `95358cb2cc/xtheadvector.adoc` * b4-shazam-merge: riscv: Add ghostwrite vulnerability selftests: riscv: Support xtheadvector in vector tests selftests: riscv: Fix vector tests riscv: hwprobe: Document thead vendor extensions and xtheadvector extension riscv: hwprobe: Add thead vendor extension probing riscv: vector: Support xtheadvector save/restore riscv: Add xtheadvector instruction definitions riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT RISC-V: define the elements of the VCSR vector CSR riscv: vector: Use vlenb from DT for thead riscv: Add thead and xtheadvector as a vendor extension riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree dt-bindings: cpus: add a thead vlen register length property dt-bindings: riscv: Add xtheadvector ISA extension description Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-0-236c22791ef9@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-18 12:33:43 -08:00
Charlie Jenkins	c384c5d4a2	selftests: riscv: Support xtheadvector in vector tests Extend existing vector tests to be compatible with the xtheadvector instructions. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Yangyu Chen <cyy@cyyself.name> Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-13-236c22791ef9@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-18 12:33:38 -08:00
Charlie Jenkins	57d7713af9	selftests: riscv: Fix vector tests Overhaul the riscv vector tests to use kselftest_harness to help the test cases correctly report the results and decouple the individual test cases from each other. With this refactoring, only run the test cases if vector is reported and properly report the test case as skipped otherwise. The v_initval_nolibc test was previously not checking if vector was supported and used a function (malloc) which invalidates the state of the vector registers. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Yangyu Chen <cyy@cyyself.name> Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-12-236c22791ef9@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-18 12:33:37 -08:00
Jakub Kicinski	54ea680b75	selftests: net: give up on the cmsg_time accuracy on slow machines Commit `b9d5f5711d` ("selftests: net: increase the delay for relative cmsg_time.sh test") widened the accepted value range 8x but we still see flakes (at a rate of around 7%). Return XFAIL for the most timing sensitive test on slow machines. Before: # ./cmsg_time.sh Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK' FAIL - 1/36 cases failed After: # ./cmsg_time.sh Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL) Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL) OK Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250116020105.931338-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-17 18:36:14 -08:00
Al Viro	58cf9c383c	dcache: back inline names with a struct-wrapped array of unsigned long ... so that they can be copied with struct assignment (which generates better code) and accessed word-by-word. The type is union shortname_storage; it's a union of arrays of unsigned char and unsigned long. struct name_snapshot.inline_name turned into union shortname_storage; users (all in fs/dcache.c) adjusted. struct dentry.d_iname has some users outside of fs/dcache.c; to reduce the amount of noise in commit, it is replaced with union shortname_storage d_shortname and d_iname is turned into a macro that expands to d_shortname.string (similar to d_lock handling). That compat macro is temporary - most of the remaining instances will be taken out by debugfs series, and once that is merged and few others are taken care of this will go away. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-01-17 17:46:05 -05:00
Mickaël Salaün	2a794ee613	selftests/landlock: Add layout1.umount_sandboxer tests Check that a domain is not tied to the executable file that created it. For instance, that could happen if a Landlock domain took a reference to a struct path. Move global path names to common.h and replace copy_binary() with a more generic copy_file() helper. Test coverage for security/landlock is 92.7% of 1133 lines according to gcc/gcov-14. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-23-mic@digikod.net [mic: Update date and add test coverage] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-01-17 19:05:38 +01:00
Mickaël Salaün	5147779d5e	selftests/landlock: Add wrappers.h Extract syscall wrappers to make them usable by standalone binaries (see next commit). Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-22-mic@digikod.net [mic: Fix comments] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-01-17 19:05:38 +01:00
Mickaël Salaün	2107c35128	selftests/landlock: Fix error message The global variable errno may not be set in test_execute(). Do not use it in related error message. Cc: Günther Noack <gnoack@google.com> Fixes: `e1199815b4` ("selftests/landlock: Add user space tests") Link: https://lore.kernel.org/r/20250108154338.1129069-21-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-01-17 19:05:37 +01:00
Mickaël Salaün	12264f721f	selftests/landlock: Add test to check partial access in a mount tree Add layout1.refer_part_mount_tree_is_allowed to test the masked logical issue regarding collect_domain_accesses() calls followed by the is_access_to_paths_allowed() check in current_check_refer_path(). See previous commit. This test should work without the previous fix as well, but it enables us to make sure future changes will not have impact regarding this behavior. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-13-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-01-17 19:05:36 +01:00
Mickaël Salaün	0e4db4f843	selftests/landlock: Fix build with non-default pthread linking Old toolchains require explicit -lpthread (e.g. on Debian 11). Cc: Nathan Chancellor <nathan@kernel.org> Cc: Tahera Fahimi <fahimitahera@gmail.com> Fixes: `c899496501` ("selftests/landlock: Test signal scoping for threads") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250115145409.312226-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-01-17 19:05:31 +01:00
Marc Zyngier	946904e728	Merge branch kvm-arm64/coresight-6.14 into kvmarm-master/next * kvm-arm64/coresight-6.14: : . : Trace filtering update from James Clark. From the cover letter: : : "The guest filtering rules from the Perf session are now honored for both : nVHE and VHE modes. This is done by either writing to TRFCR_EL12 at the : start of the Perf session and doing nothing else further, or caching the : guest value and writing it at guest switch for nVHE. In pKVM, trace is : now be disabled for both protected and unprotected guests." : . KVM: arm64: Fix selftests after sysreg field name update coresight: Pass guest TRFCR value to KVM KVM: arm64: Support trace filtering for guests KVM: arm64: coresight: Give TRBE enabled state to KVM coresight: trbe: Remove redundant disable call arm64/sysreg/tools: Move TRFCR definitions to sysreg tools: arm64: Update sysreg.h header files Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-01-17 11:05:44 +00:00
Daniel Xu	f932a8e482	bpf: selftests: verifier: Add nullness elision tests Test that nullness elision works for common use cases. For example, we want to check that both constant scalar spills and STACK_ZERO functions. As well as when there's both const and non-const values of R2 leading up to a lookup. And obviously some bound checks. Particularly tricky are spills both smaller or larger than key size. For smaller, we need to ensure verifier doesn't let through a potential read into unchecked bytes. For larger, endianness comes into play, as the native endian value tracked in the verifier may not be the bytes the kernel would have read out of the key pointer. So check that we disallow both. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/f1dacaa777d4516a5476162e0ea549f7c3354d73.1736886479.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-16 17:51:10 -08:00
Daniel Xu	d2102f2f5d	bpf: verifier: Support eliding map lookup nullness This commit allows progs to elide a null check on statically known map lookup keys. In other words, if the verifier can statically prove that the lookup will be in-bounds, allow the prog to drop the null check. This is useful for two reasons: 1. Large numbers of nullness checks (especially when they cannot fail) unnecessarily pushes prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ. 2. It forms a tighter contract between programmer and verifier. For (1), bpftrace is starting to make heavier use of percpu scratch maps. As a result, for user scripts with large number of unrolled loops, we are starting to hit jump complexity verification errors. These percpu lookups cannot fail anyways, as we only use static key values. Eliding nullness probably results in less work for verifier as well. For (2), percpu scratch maps are often used as a larger stack, as the currrent stack is limited to 512 bytes. In these situations, it is desirable for the programmer to express: "this lookup should never fail, and if it does, it means I messed up the code". By omitting the null check, the programmer can "ask" the verifier to double check the logic. Tests also have to be updated in sync with these changes, as the verifier is more efficient with this change. Notable, iters.c tests had to be changed to use a map type that still requires null checks, as it's exercising verifier tracking logic w.r.t iterators. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/68f3ea96ff3809a87e502a11a4bd30177fc5823e.1736886479.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-16 17:51:10 -08:00
Daniel Xu	37cce22dbd	bpf: verifier: Refactor helper access type tracking Previously, the verifier was treating all PTR_TO_STACK registers passed to a helper call as potentially written to by the helper. However, all calls to check_stack_range_initialized() already have precise access type information available. Rather than treat ACCESS_HELPER as a proxy for BPF_WRITE, pass enum bpf_access_type to check_stack_range_initialized() to more precisely track helper arguments. One benefit from this precision is that registers tracked as valid spills and passed as a read-only helper argument remain tracked after the call. Rather than being marked STACK_MISC afterwards. An additional benefit is the verifier logs are also more precise. For this particular error, users will enjoy a slightly clearer message. See included selftest updates for examples. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/ff885c0e5859e0cd12077c3148ff0754cad4f7ed.1736886479.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-16 17:51:10 -08:00
Jakub Kicinski	3030e3d57b	selftests/net: packetdrill: make tcp buf limited timing tests benign The following tests are failing on debug kernels: tcp_tcp_info_tcp-info-rwnd-limited.pkt tcp_tcp_info_tcp-info-sndbuf-limited.pkt with reports like: assert 19000 <= tcpi_sndbuf_limited <= 21000, tcpi_sndbuf_limited; \ AssertionError: 18000 and: assert 348000 <= tcpi_busy_time <= 360000, tcpi_busy_time AssertionError: 362000 Extend commit `912d6f6697` ("selftests/net: packetdrill: report benign debug flakes as xfail") to cover them. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250115232129.845884-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-16 17:23:22 -08:00
John Daley	8d20dcda40	selftests: drv-net-hw: inject pp_alloc_fail errors in the right place The tool pp_alloc_fail.py tested error recovery by injecting errors into the function page_pool_alloc_pages(). The page pool allocation function page_pool_dev_alloc() does not end up calling page_pool_alloc_pages(). page_pool_alloc_netmems() seems to be the function that is called by all of the page pool alloc functions in the API, so move error injection to that function instead. Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250115181312.3544-2-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-16 17:18:53 -08:00
Pu Lehui	556a399406	selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED When redirecting the split BTF to the vmlinux base BTF, we need to mark the distilled base struct/union members of split BTF structs/unions in id_map with BTF_IS_EMBEDDED. This indicates that these types must match both name and size later. So if a needed composite type, which is the member of composite type in the split BTF, has a different size in the base BTF we wish to relocate with, btf__relocate() should error out. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250115100241.4171581-4-pulehui@huaweicloud.com	2025-01-16 15:34:18 -08:00
Pu Lehui	4a04cb326a	selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test Fix btf leak on new btf alloc failure in btf_distill test. Fixes: `affdeb5061` ("selftests/bpf: Extend distilled BTF tests to cover BTF relocation") Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250115100241.4171581-1-pulehui@huaweicloud.com	2025-01-16 15:34:18 -08:00
Eduard Zingerman	7c311b7cb3	veristat: Load struct_ops programs only once libbpf automatically adjusts autoload for struct_ops programs, see libbpf.c:bpf_object_adjust_struct_ops_autoload. For example, if there is a map: SEC(".struct_ops.link") struct sched_ext_ops ops = { .enqueue = foo, .tick = bar, }; Both 'foo' and 'bar' would be loaded if 'ops' autocreate is true, both 'foo' and 'bar' would be skipped if 'ops' autocreate is false. This means that when veristat processes object file with 'ops', it would load 4 programs in total: two programs per each 'process_prog' call. The adjustment occurs at object load time, and libbpf remembers association between 'ops' and 'foo'/'bar' at object open time. The only way to persuade libbpf to load one of two is to adjust map initial value, such that only one program is referenced. This patch does exactly that, significantly reducing time to process object files with big number of struct_ops programs. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250115223835.919989-1-eddyz87@gmail.com	2025-01-16 15:33:58 -08:00
Tony Ambardar	a8d1c48d07	selftests/bpf: Fix undefined UINT_MAX in veristat.c Include <limits.h> in 'veristat.c' to provide a UINT_MAX definition and avoid multiple compile errors against mips64el/musl-libc: veristat.c: In function 'max_verifier_log_size': veristat.c:1135:36: error: 'UINT_MAX' undeclared (first use in this function) 1135 \| const int SMALL_LOG_SIZE = UINT_MAX >> 8; \| ^~~~~~~~ veristat.c:24:1: note: 'UINT_MAX' is defined in header '<limits.h>'; did you forget to '#include <limits.h>'? 23 \| #include <math.h> +++ \|+#include <limits.h> 24 \| Fixes: `1f7c336307` ("selftests/bpf: Increase verifier log limit in veristat") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250116075036.3459898-1-tony.ambardar@gmail.com	2025-01-16 15:20:21 -08:00
Jakub Kicinski	2ee738e90e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.13-rc8). Conflicts: drivers/net/ethernet/realtek/r8169_main.c `1f691a1fc4` ("r8169: remove redundant hwmon support") `152d00a913` ("r8169: simplify setting hwmon attribute visibility") https://lore.kernel.org/20250115122152.760b4e8d@canb.auug.org.au Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.c `152f4da05a` ("bnxt_en: add support for rx-copybreak ethtool command") `f0aa6a37a3` ("eth: bnxt: always recalculate features after XDP clearing, fix null-deref") drivers/net/ethernet/intel/ice/ice_type.h `50327223a8` ("ice: add lock to protect low latency interface") `dc26548d72` ("ice: Fix quad registers read on E825") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-16 10:34:59 -08:00
Linus Torvalds	ce69b40190	Current release - regressions: - core: avoid CFI problems with sock priv helpers - xsk: bring back busy polling support - netpoll: ensure skb_pool list is always initialized Current release - new code bugs: - core: make page_pool_ref_netmem work with net iovs - ipv4: route: fix drop reason being overridden in ip_route_input_slow - udp: make rehash4 independent in udp_lib_rehash() Previous releases - regressions: - bpf: fix bpf_sk_select_reuseport() memory leak - openvswitch: fix lockup on tx to unregistering netdev with carrier - mptcp: be sure to send ack when mptcp-level window re-opens - eth: bnxt: always recalculate features after XDP clearing, fix null-deref - eth: mlx5: fix sub-function add port error handling - eth: fec: handle page_pool_dev_alloc_pages error Previous releases - always broken: - vsock: some fixes due to transport de-assignment - eth: ice: fix E825 initialization - eth: mlx5e: fix inversion dependency warning while enabling IPsec tunnel - eth: gtp: destroy device along with udp socket's netns dismantle. - eth: xilinx: axienet: Fix IRQ coalescing packet count overflow Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmeJGOISHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkfLwP/1XaeyEtSifDiF+bj7f3M6gd8RC2wkNq 8DvHadl+uPx1RWv0F2UH9fsVz17A3Gg3oF2Agl4tMP5p9F0e489pNjm2QXOl1zac hpJdV0VdNJHKEfWhKODRfLap6fNtPoEQP5r3scbFYuzkdMw6sYZujdQUmFNghPYe Y6GKZIrQ96vYLpSTrLCAQt/2EEt608b3ESFFhqTkvB8voB2cODNxxBoTJ5K+jMa0 +fVW46siGKc8HSaUJCWS5YkAW/Tu3AXJmYgKGQg9PaErVclwImsQFXggIki80P7W 747Gkuc3kZm3Mt91d6kK1s5Sxr/FAaaJlOOE2iHpZld6cN+Y6niJ+knFdkaX5rCE T/aLq8cdegwSdct6CIJ7YZp3v1AVv21erWf7OpbY9KGTWPV9d2yzh3fYin87tAzs YYo0H1OqqbxpnKThgGREpu+LqEkCbMzsKmwn/5wTAZZl28ySZWZin2ukzTMRqla0 Y8JJvBYvcHn/ekb4gJNaDhJF7ZBuLjrXlG1SXAyO+GS4TwToqrK/luPRf0tkbI/Z QVNBNCukdRTy/IeZQJsc1gtE1tlQmRXNlTbAILPIkWWdxgjdpd/wBbP8/qG9184l Ut4gu7AVF+LLH5nhgRVHAcfrO3i/kbRFC3ErQw06YLyqLInyss8GPULP7tWeFfE3 iM/DsTHjjr04 =EtnO -----END PGP SIGNATURE----- Merge tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Notably this includes fixes for a few regressions spotted very recently. No known outstanding ones. Current release - regressions: - core: avoid CFI problems with sock priv helpers - xsk: bring back busy polling support - netpoll: ensure skb_pool list is always initialized Current release - new code bugs: - core: make page_pool_ref_netmem work with net iovs - ipv4: route: fix drop reason being overridden in ip_route_input_slow - udp: make rehash4 independent in udp_lib_rehash() Previous releases - regressions: - bpf: fix bpf_sk_select_reuseport() memory leak - openvswitch: fix lockup on tx to unregistering netdev with carrier - mptcp: be sure to send ack when mptcp-level window re-opens - eth: - bnxt: always recalculate features after XDP clearing, fix null-deref - mlx5: fix sub-function add port error handling - fec: handle page_pool_dev_alloc_pages error Previous releases - always broken: - vsock: some fixes due to transport de-assignment - eth: - ice: fix E825 initialization - mlx5e: fix inversion dependency warning while enabling IPsec tunnel - gtp: destroy device along with udp socket's netns dismantle. - xilinx: axienet: Fix IRQ coalescing packet count overflow" * tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) netdev: avoid CFI problems with sock priv helpers net/mlx5e: Always start IPsec sequence number from 1 net/mlx5e: Rely on reqid in IPsec tunnel mode net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel net/mlx5: Clear port select structure when fail to create net/mlx5: SF, Fix add port error handling net/mlx5: Fix a lockdep warning as part of the write combining test net/mlx5: Fix RDMA TX steering prio net: make page_pool_ref_netmem work with net iovs net: ethernet: xgbe: re-add aneg to supported features in PHY quirks net: pcs: xpcs: actively unset DW_VR_MII_DIG_CTRL1_2G5_EN for 1G SGMII net: pcs: xpcs: fix DW_VR_MII_DIG_CTRL1_2G5_EN bit being set for 1G SGMII w/o inband selftests: net: Adapt ethtool mq tests to fix in qdisc graft net: fec: handle page_pool_dev_alloc_pages error net: netpoll: ensure skb_pool list is always initialized net: xilinx: axienet: Fix IRQ coalescing packet count overflow nfp: bpf: prevent integer overflow in nfp_bpf_event_output() selftests: mptcp: avoid spurious errors on disconnect mptcp: fix spurious wake-up on under memory pressure mptcp: be sure to send ack when mptcp-level window re-opens ...	2025-01-16 09:09:44 -08:00
Steven Rostedt	542079b4b1	selftests/ftrace: Add test that tests event :mod: commands Now that here's a :mod: command that can be sent into set_event, add a test that tests its use. Both setting events for a loaded module, as well as caching what events to set for a module that is not loaded yet. Cc: Shuah Khan <shuah@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/20250116143533.819228058@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-16 09:41:50 -05:00
Taehee Yoo	cfd70e3eba	selftest: net-drv: hds: add test for HDS feature HDS/HDS-thresh features were updated/implemented. so add some tests for these features. HDS tests are the same with `ethtool -G eth0 tcp-data-split <on \| off \| auto >` but `auto` depends on driver specification. So, it doesn't include `auto` case. HDS-thresh tests are same with `ethtool -G eth0 hds-thresh <0 - MAX>` It includes both 0 and MAX cases. It also includes exceed case, MAX + 1. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250114142852.3364986-11-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 14:42:12 -08:00
Alessandro Zanni	7a649f39da	selftests/net/forwarding: teamd command not found Running "make kselftest TARGETS=net/forwarding" results in multiple ccurrences of the same error: - ./lib.sh: line 787: teamd: command not found This patch adds the variable $REQUIRE_TEAMD in every test that uses the command teamd and checks the $REQUIRE_TEAMD variable in the file "lib.sh" to skip the test if the command is not installed. Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Link: https://patch.msgid.link/20250114003323.97207-1-alessandro.zanni87@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 14:14:40 -08:00
Matthieu Baerts (NGI0)	540d3f8f1d	selftests: mptcp: connect: better display the files size 'du' will print the name of the file, which was already displayed before, e.g. Created /tmp/tmp.UOyy0ghfmQ (size 4703740/tmp/tmp.UOyy0ghfmQ) containing data sent by client Created /tmp/tmp.xq3zvFinGo (size 1391724/tmp/tmp.xq3zvFinGo) containing data sent by server 'stat' can be used instead, to display this instead: Created /tmp/tmp.UOyy0ghfmQ (size 4703740 B) containing data sent by client Created /tmp/tmp.xq3zvFinGo (size 1391724 B) containing data sent by server So easier to spot the file sizes. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-6-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:15 -08:00
Matthieu Baerts (NGI0)	b265c5a174	selftests: mptcp: connect: remove unused variable 'cin_disconnect' is used in run_tests_disconnect(), but not 'cout_disconnect', so it is safe to drop it. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-5-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:15 -08:00
Matthieu Baerts (NGI0)	5fbea888f8	selftests: mptcp: add -m with ss in case of errors Recently, we had an issue where getting info about the memory would have helped better understanding what went wrong. Let add it just in case for later. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-4-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:15 -08:00
Matthieu Baerts (NGI0)	8c6bb011e1	selftests: mptcp: move stats info in case of errors to lib.sh A few MPTCP selftests are using the same code to print stats in case of error. This code can then be moved to mptcp_lib.sh. No behaviour changes intended, except to print the error in red and to stderr, like most error messages. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-3-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:15 -08:00
Geliang Tang	3257d4cb8d	selftests: mptcp: sockopt: save nstat infos Similar to the way nstat information is stored in mptcp_connect.sh and mptcp_join.sh scripts, this patch adds a similar way for mptcp_sockopt.sh and displays the nstat information when errors occur. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-2-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:14 -08:00
Matthieu Baerts (NGI0)	894dae026b	selftests: mptcp: simult_flows: unify errors msgs In order to unify what is printed in case of error, similar to what is done in mptcp_connect.sh and mptcp_join.sh, it is interesting to do the following modifications in simult_flows.sh: - Print the rc errors at the end of the line. - Print the MIB counters. - Use the same ss options: add -M (MPTCP sockets) and -e (detailed socket information). While at it, also print of the 'max' time only in case of success, because 'mptcp_connect.c' will already print this info in case of error, e.g.: transfer slower than expected! runtime 11948 ms, expected 11921 ms Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-1-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-15 13:21:14 -08:00
Mathieu Desnoyers	336d02bc4c	selftests/rseq: Fix handling of glibc without rseq support When porting librseq commit: commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection") from librseq to the kernel selftests, the following line was missed at the end of rseq_init(): rseq_size = get_rseq_kernel_feature_size(); which effectively leaves rseq_size initialized to -1U when glibc does not have rseq support. glibc supports rseq from version 2.35 onwards. In a following librseq commit commit c67d198627c2 ("Only set 'rseq_size' on first thread registration") to mimic the libc behavior, a new approach is taken: don't set the feature size in 'rseq_size' until at least one thread has successfully registered. This allows using 'rseq_size' in fast-paths to test for both registration status and available features. The caveat is that on libc either all threads are registered or none are, while with bare librseq it is the responsability of the user to register all threads using rseq. This combines the changes from the following librseq git commits: commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection") commit c67d198627c2 ("Only set 'rseq_size' on first thread registration") Fixes: `a0cc649353` ("selftests/rseq: Fix mm_cid test failure") Reported-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Raghavendra Rao Ananta <rananta@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Carlos O'Donell <carlos@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Michael Jeanson <mjeanson@efficios.com> Cc: linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-15 10:54:48 -07:00
Paolo Bonzini	5cf32aff20	LoongArch KVM changes for v6.14 1. Clear LLBCTL if secondary mmu mapping changed. 2. Add hypercall service support for usermode VMM. This is a really small changeset, because the Chinese New Year (Spring Festival) is coming. Happy New Year! -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmeFGcIWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImern1D/9AZ8M+0nBAaONZaq2qKLC+RaW6 KqvFsR1PUUFzVcQZaHh9OZcx5s4EAH12EaxBH68W0o0ejbTUJp8QXT6cmO9bNFj1 tVaczGACss34kDerrddHisOpimFdaP+ECX4Q43oTc5N7vG6zUu3ijOISnIIxkhHP RlX/+5Djw0NoaVAkhEj4v+LkY33z5QnDFNI0OjJiHpDepP2vLQ1FD573pLMeqcGs BVYwZv7DP3SnVajSjRhT/r5qjy9EMjrXRLkIIwyjOUArRaPq/Lfg4CTK85e5MZsR 2GTkdjvh/YArpluRki4FX1cVOwpBbEtC+24/NWB+MPijtnYqMyAoIraZGqJMAzhw P6W70A15GvBhlhQmvKNai1oXkdZaaT7XDcbFT706Cwhu7LvcNM8kK7VrPc59WLTR uHO+ehJh0DCpBMC2BKH/8sztGx80u7SB4Ph0ytZCK+uYznTMEiBqRup7E/QLLG+1 EotXv8U4+Bwx/inzMxwi6vR1ZXo0dIDsnvFdSZeA6PC/cSoPzdqCdrXjQT/7HUIu DNgcsRVL3LFE+A/sDVGb5/w9UPdQfCdO10bu97FkY37ftqp7LvPTlWJvDZJx+Wle KfErCOM1/ZRQ2knzE7fst58auA3ZFNn3jWRkD/0gJ4X1Fgu63VrYeuc4FL7r8ken HxKLYOLtD6dOzR5DeA== =z2VU -----END PGP SIGNATURE----- Merge tag 'loongarch-kvm-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.14 1. Clear LLBCTL if secondary mmu mapping changed. 2. Add hypercall service support for usermode VMM. This is a really small changeset, because the Chinese New Year (Spring Festival) is coming. Happy New Year!	2025-01-15 11:51:56 -05:00
Saket Kumar Bhaskar	9fe17b7466	selftests/bpf: Fix test_xdp_adjust_tail_grow2 selftest on powerpc On powerpc cache line size is 128 bytes, so skb_shared_info must be aligned accordingly. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250110103109.3670793-1-skb99@linux.ibm.com	2025-01-15 15:45:29 +01:00
Victor Nogueira	0a5b8fff01	selftests: net: Adapt ethtool mq tests to fix in qdisc graft Because of patch[1] the graft behaviour changed So the command: tcq replace parent 100:1 handle 204: Is no longer valid and will not delete 100:4 added by command: tcq replace parent 100:4 handle 204: pfifo_fast So to maintain the original behaviour, this patch manually deletes 100:4 and grafts 100:1 Note: This change will also work fine without [1] [1] https://lore.kernel.org/netdev/20250111151455.75480-1-jhs@mojatatu.com/T/#u Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-01-15 09:28:51 +00:00
Maciej Wieczor-Retman	d6d35d0b0f	selftests/resctrl: Discover SNC kernel support and adjust messages Resctrl selftest prints a message on test failure that Sub-Numa Clustering (SNC) could be enabled and points the user to check their BIOS settings. No actual check is performed before printing that message so it is not very accurate in pinpointing a problem. When there is SNC support for kernel's resctrl subsystem and SNC is enabled then sub node files are created for each node in the resctrlfs. The sub node files exist in each regular node's L3 monitoring directory. The reliable path to check for existence of sub node files is /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00. Add helper that checks for mon_sub_L3_00 existence. Correct old messages to account for kernel support of SNC in resctrl. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Maciej Wieczor-Retman	a1cd99e700	selftests/resctrl: Adjust effective L3 cache size with SNC enabled Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two, three, four or six nodes. When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3. It's possible to detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0. Detect SNC mode once and let other tests inherit that information. Update CFLAGS after including lib.mk in the Makefile so that fallthrough macro can be used. To check if SNC detection is reliable one can check the /sys/devices/system/cpu/offline file. If it's empty, it means all cores are operational and the ratio should be calculated correctly. If it has any contents, it means the detected SNC mode can't be trusted and should be disabled. Check if detection was not reliable due to offline cpus. If it was skip running tests since the results couldn't be trusted. Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Masami Hiramatsu (Google)	89ae64384e	selftests/ftrace: Make uprobe test more robust against binary name Make add_remove_uprobe test case more robust against various real binary name. Current add_remove_uprobe.tc test expects the real binary of /bin/sh is '/bin/sh', but it does not work on busybox environment. Instead of using fixed pattern, use readlink to identify real binary name. Link: https://lore.kernel.org/r/173625187633.1383744.2840679071525852811.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Masami Hiramatsu (Google)	159ca65c42	selftests/ftrace: Fix to use remount when testing mount GID option Fix mount_options.tc to use remount option to mount the tracefs. Since the current implementation does not umount the tracefs, this test always fails because of -EBUSY error. Using remount option will allow us to change the mount option. Link: https://lore.kernel.org/r/173625186741.1383744.16707876180798573039.stgit@devnote2 Fixes: `8b55572e51` ("tracing/selftests: Add tracefs mount options test") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Shivam Chaudhary	58beae2585	selftests: tmpfs: Add kselftest support to tmpfs Replace direct error handling with 'ksft_test_result_*' macros for better reporting. Test logs: Before change: - Without root error: unshare, errno 1 - With root No, output After change: - Without root TAP version 13 1..1 ok 2 # SKIP This test needs root to run! Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0 - With root TAP version 13 1..1 ok 1 Test : Success Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://lore.kernel.org/r/20250105085255.124929-3-cvam0000@gmail.com Signed-off-by: Shivam Chaudhary <cvam0000@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Shivam Chaudhary	41ca14efaf	selftests: tmpfs: Add Test-skip if not run as root Add 'ksft_exit_skip()', if not run as root, with an appropriate Warning. Add 'ksft_print_header()' and 'ksft_set_plan()' to structure test outputs more effectively. Test logs: Before Change: - Without root error: unshare, errno 1 - With root No, output After change: - Without root TAP version 13 1..1 ok 2 # SKIP This test needs root to run! Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0 - With root TAP version 13 1..1 Link: https://lore.kernel.org/r/20250105085255.124929-2-cvam0000@gmail.com Signed-off-by: Shivam Chaudhary <cvam0000@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Dmitry V. Levin	02bc220dc6	selftests: harness: fix printing of mismatch values in __EXPECT() intptr_t and uintptr_t are not big enough types on 32-bit architectures when printing 64-bit values, resulting to the following incorrect diagnostic output: # get_syscall_info.c:209:get_syscall_info:Expected exp_args[2] (3134324433) == info.entry.args[1] (3134324433) Replace intptr_t and uintptr_t with intmax_t and uintmax_t, respectively. With this fix, the same test produces more usable diagnostic output: # get_syscall_info.c:209:get_syscall_info:Expected exp_args[2] (3134324433) == info.entry.args[1] (18446744072548908753) Link: https://lore.kernel.org/r/20250108170757.GA6723@strace.io Fixes: `b5bb6d3068` ("selftests/seccomp: fix 32-bit build warnings") Signed-off-by: Dmitry V. Levin <ldv@strace.io> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Vincent Donnefort	b6f9cd83c6	selftests/ring-buffer: Add test for out-of-bound pgoff mapping Extend the ring-buffer mapping test coverage by checking an out-of-bound pgoff which has proven to be problematic in the past. Link: https://lore.kernel.org/r/20241218170318.2814991-1-vdonnefort@google.com Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: linux-kselftest@vger.kernel.org Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Brendan Jackman	103c0b5e82	selftests/run_kselftest.sh: Fix help string for --per-test-log This is documented as --per_test_log but the argument is actually --per-test-log. Link: https://lore.kernel.org/r/20241220-per-test-log-v1-1-de5afe69fdf4@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:32 -07:00
Shivam Chaudhary	9301be2ce1	selftests: acct: Add ksft_exit_skip if not running as root If the selftest is not running as root, it should skip not fail and give an appropriate warning to the user. This patch adds ksft_exit_skip() if the test is not running as root. Logs: Before change: TAP version 13 1..1 ok 1 # SKIP This test needs root to run! After change: TAP version 13 1..1 ok 2 # SKIP This test needs root to run! Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0 Link: https://lore.kernel.org/r/20241210123212.332050-1-cvam0000@gmail.com Signed-off-by: Shivam Chaudhary <cvam0000@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
zhang jiao	5a7a4e46f8	selftests: kselftest: Fix the wrong format specifier The format specifier of "unsigned int" in printf() should be "%u", not "%d". Link: https://lore.kernel.org/r/20241202043111.3888-1-zhangjiao2@cmss.chinamobile.com Signed-off-by: zhang jiao <zhangjiao2@cmss.chinamobile.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
Geert Uytterhoeven	8694e6a7f7	selftests: timers: clocksource-switch: Adapt progress to kselftest framework When adapting the test to the kselftest framework, a few printf() calls indicating test progress were not updated. Fix this by replacing these printf() calls by ksft_print_msg() calls. Link: https://lore.kernel.org/r/7dd4b9ab6e43268846e250878ebf25ae6d3d01ce.1733994134.git.geert+renesas@glider.be Fixes: `ce7d101750` ("selftests: timers: clocksource-switch: adapt to kselftest framework") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
Li Zhijian	d54d3f69b7	selftests/zram: gitignore output file After `make run_tests`, the git status complains: Untracked files: (use "git add <file>..." to include in what will be committed) zram/err.log This file will be cleaned up when execute 'make clean' Link: https://lore.kernel.org/r/20241211004625.5308-1-lizhijian@fujitsu.com Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
Li Zhijian	6d59d557e3	selftests/filesystems: Add missing gitignore file Compiled binary files should be added to .gitignore 'git status' complains: Untracked files: (use "git add <file>..." to include in what will be committed) filesystems/statmount/statmount_test_ns Link: https://lore.kernel.org/r/20241211004947.5806-1-lizhijian@fujitsu.com Cc: Shuah Khan <shuah@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
Laura Nao	74864403c5	selftests: Warn about skipped tests in result summary Update the functions that print the test totals at the end of a selftest to include a warning message when skipped tests are detected. The message advises users that skipped tests may indicate missing configuration options and suggests enabling them to improve coverage. Link: https://lore.kernel.org/r/20241126093710.13314-1-laura.nao@collabora.com Signed-off-by: Laura Nao <laura.nao@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00
Stefano Pigozzi	e8731ecdd6	selftests: kselftest: Add ksft_test_result_xpass The functions ksft_test_result_pass, ksft_test_result_fail, ksft_test_result_xfail, and ksft_test_result_skip already exist and are available for use in selftests, but no XPASS equivalent is available. This adds a new function to that family that outputs XPASS, so that it's available for future test writers. Link: https://lore.kernel.org/r/20241207012325.56611-1-me@steffo.eu Signed-off-by: Stefano Pigozzi <me@steffo.eu> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-01-14 17:06:31 -07:00

... 13 14 15 16 17 ...

19279 Commits (f9db1fc56281b96fe8748632b3894de970a8a850)