Commit Graph

1445868 Commits (18014147d3ee7831dce53fe65d7fc8d428b02552)

Author SHA1 Message Date
Fernando Fernandez Mancera 18014147d3 netfilter: nf_tables: fix dst corruption in same register operation
For lshift and rshift, the shift operations are performed in a loop over
32-bit words. The loop calculates the shifted value and write it to dst,
and then immediately reads from src to calculate the carry for the next
iteration. Because src and dst could point to the same memory location,
the carry is incorrectly calculated using the newly modified dst value
instead of the original src value.

Adding a temporary local variable to cache the original value before
writing to dst and using it for the carry calculation solves the
problem. In addition, partial overlap is rejected from control plane for
all kind of operations including byteorder. This was tested with the
following bytecode:

table test_table ip flags 0 use 1 handle 1
ip test_table test_chain use 3 type filter hook input prio 0 policy accept packets 0 bytes 0 flags 1
ip test_table test_chain 2
  [ immediate reg 1 0x44332211 0x88776655 ]
  [ bitwise reg 1 = ( reg 1 << 0x08000000 ) ]
  [ cmp eq reg 1 0x66443322 0x00887766 ]
  [ counter pkts 0 bytes 0 ]
ip test_table test_chain 4 3
  [ immediate reg 1 0x44332211 0x88776655 ]
  [ bitwise reg 1 = ( reg 1 << 0x08000000 ) ]
  [ cmp eq reg 1 0x55443322 0x00887766 ]
  [ counter pkts 21794 bytes 1917798 ]

Fixes: 567d746b55 ("netfilter: bitwise: add support for shifts.")
Acked-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Jiayuan Chen a40aaaef2f selftests: netfilter: add nft_fib_nexthop test
Functional coverage of nft_fib6_eval()'s nexthop enumeration over
three route shapes:

  1) single external nexthop (nhid)
  2) external nexthop group (nhid -> group)
  3) old-style multipath (nexthop ... nexthop ...)

Each scenario places one nexthop on the input device (veth0). For
(2) and (3) the matching nexthop is the second member, so the walk
has to traverse beyond the primary nh. Two nft counters on prerouting
verify the data path: one increments only when fib reports veth0 as
the oif, the other counts "missing" results and must stay at zero.

  ./nft_fib_nexthop.sh
  PASS: single external nexthop (nhid -> veth0)
  PASS: nexthop group (dummy0 + veth0)
  PASS: old-style multipath (sibling on veth0)

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Jiayuan Chen f81b0c2d28 netfilter: nft_fib_ipv6: handle routes via external nexthop
fib6_info has a union:

    union {
        struct list_head fib6_siblings;
        struct list_head nh_list;
    };

Old-style multipath (ip -6 route add ... nexthop ... nexthop ...) uses
fib6_siblings.  External nexthop (ip -6 route add ... nhid N) uses
nh_list, linked into &nh->f6i_list.

nft_fib6_info_nh_uses_dev() blindly walks &rt->fib6_siblings, causing
an OOB read past the struct nexthop slab when rt->nh is set:

  ==================================================================
  BUG: KASAN: slab-out-of-bounds in nft_fib6_eval+0x1362/0x16c0
  Read of size 8 at addr ffff888103a099d0 by task ping/386

  CPU: 2 UID: 0 PID: 386 Comm: ping Not tainted 7.1.0-rc3+ #251 PREEMPT
  Call Trace:
   <IRQ>
   dump_stack_lvl+0x76/0xa0
   print_report+0xd1/0x5f0
   kasan_report+0xe7/0x130
   __asan_report_load8_noabort+0x14/0x30
   nft_fib6_eval+0x1362/0x16c0
   nft_do_chain+0x279/0x18c0
   nft_do_chain_ipv6+0x1a8/0x230
   nf_hook_slow+0xad/0x200
   ipv6_rcv+0x152/0x380
   __netif_receive_skb_one_core+0x118/0x1c0
  ==================================================================

Branch by route shape: when rt->nh is set, walk via
nexthop_for_each_fib6_nh() (also covers nh groups, which the original
code missed); otherwise walk fib6_siblings, guarded by READ_ONCE() of
rt->fib6_nsiblings as required by commit 31d7d67ba1 ("ipv6: annotate
data-races around rt->fib6_nsiblings").

Fixes: 1c32b24c23 ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Jiayuan Chen 1d001b0a61 netfilter: nft_fib_ipv6: walk fib6_siblings under RCU
nft_fib6_info_nh_uses_dev() runs from nft_fib6_eval() in softirq under
rcu_read_lock().  fib6_siblings is modified by writers that hold
tb6_lock but do not wait for RCU readers, so the sibling walk should
use list_for_each_entry_rcu(): it adds READ_ONCE() on the ->next
pointer and lets CONFIG_PROVE_RCU_LIST validate the locking.

No functional change for non-debug builds.

Fixes: 1c32b24c23 ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Florian Westphal f438d1786d netfilter: ebtables: fix OOB read in compat_mtw_from_user
Luxiao Xu says:

 The function compat_mtw_from_user() converts ebtables extensions from
 32-bit user structures to kernel native structures. However, it lacks
 proper validation of the user-supplied match_size/target_size.

 When certain extensions are processed, the kernel-side translation
 logic may perform memory accesses based on the extension's expected
 size. If the user provides a size smaller than what the extension
 requires, it results in an out-of-bounds read as reported by KASAN.

 This fix introduces a check to ensure match_size is at least as large
 as the extension's required compatsize. This covers matches, watchers,
 and targets, while maintaining compatibility with standard targets.

AFAIU this is relevant for matches that need to go though
match->compat_from_user() call.  Those that use plain memcpy with the
user-provided size are ok because the caller checks that size vs the
start of the next rule entry offset (which itself is checked vs. total
size copied from userspace).

The ->compat_from_user() callbacks assume they can read compatsize bytes,
so they need this extra check.

Based on an earlier patch from Luxiao Xu.

Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Luxiao Xu <rakukuip@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Florian Westphal 968cc2c963 netfilter: disable payload mangling in userns
Several parts of network stack rely on iph->ihl validation
done by network stack before PRE_ROUTING.

Disable this feature for user namespaces for now.

tcp option handling is likely safe even for LOCAL_IN, so this
this leaves tcp option mangling via nft_exthdr.c as-is.

I don't think these are the only means to alter packets, but these
appear to be relatively prominent.

This could be relaxed later.  Example:
 - allow userns for ingress hook.
 - allow userns if base is transport header.

 Also, we should revalidate or restrict generally:
 - Don't allow linklayer writes to spill into network header
 - restrict ipv4 and ipv6 to 'known safe' writes, e.g.
   saddr/daddr/check/tos

Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Tested-by: Qi Tang <tpluszz77@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20260515100411.3141-1-fw@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Florian Westphal c376f07e16 netfilter: xt_cpu: prefer raw_smp_processor_id
With PREEMPT_RCU we get splat:

BUG: using smp_processor_id() in preemptible [..]
caller is cpu_mt+0x53/0xd0 net/netfilter/xt_cpu.c:37
CPU: 1 .. Comm: syz.3.1377 #0 PREEMPT(full)
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 check_preemption_disabled+0xd3/0xe0 lib/smp_processor_id.c:47
 cpu_mt+0x53/0xd0 net/netfilter/xt_cpu.c:37
 [..]

Just use raw version instead.
This is similar to 14d14a5d29 ("netfilter: nft_meta: use raw_smp_processor_id()").

Fixes: 0ca743a559 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: syzbot+690d3e3ffa7335ac10eb@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Florian Westphal 47980b6dbf netfilter: nf_conntrack_gre: fix gre keymap list corruption
Quoting reporter:
  A race between GRE keymap insertion and destruction can corrupt the
  kernel list or use a freed object. `nf_ct_gre_keymap_add()` publishes a
  new keymap pointer before the embedded `list_head` is linked, while
  `nf_ct_gre_keymap_destroy()` can concurrently delete and free that
  same object. An unprivileged user can reach this through the PPTP
  conntrack helper by racing PPTP control messages or helper teardown,
  leading to KASAN-detectable list corruption/UAF in kernel context.

 ## Root Cause Analysis
 `exp_gre()` installs GRE expectations for a PPTP control flow and then
  adds two GRE keymap entries [..]

 The add path publishes `ct_pptp_info->keymap[dir]` before linking the
 embedded list node [..]
 Concurrent teardown deletes that partially initialized object.

Make add/destroy symmetric: install both, destroy both while under lock.

Furthermore, we should refuse to publish a new mapping in case ct is going
away, else we may leak the allocation.

The "retrans" detection is strange:  existing mapping is checked for key
equality with the new mapping, then for "is on the list" via list walk.

But I can't see how an existing keymap entry can be NOT on list.

Change this to only check if we're asked to map same tuple again -- if so,
   skip re-install, else signal failure.

Last, add a bug trap for the keymap list; it has to be empty when namespace
is going away.

Reported-by: Leo Lin <leo@depthfirst.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:46 +02:00
Chris Mason 92170e6afe netfilter: synproxy: refresh tcphdr after skb_ensure_writable
synproxy_tstamp_adjust() rewrites the TCP timestamp option in place
and then patches the TCP checksum via inet_proto_csum_replace4() on
the caller-supplied tcphdr pointer.  Both ipv4_synproxy_hook() and
ipv6_synproxy_hook() obtain that pointer with skb_header_pointer()
before calling in, so it may either alias skb->head directly or
point at the caller's on-stack _tcph buffer.

Between obtaining the pointer and using it, the function calls
skb_ensure_writable(skb, optend), which on a cloned or non-linear
skb invokes pskb_expand_head() and frees the old skb->head.  After
that point the cached th is stale:

    caller (ipv[46]_synproxy_hook)
      th = skb_header_pointer(skb, ..., &_tcph)
      synproxy_tstamp_adjust(skb, protoff, th, ...)
        skb_ensure_writable(skb, optend)
          pskb_expand_head()        /* kfree(old skb->head) */
        ...
        inet_proto_csum_replace4(&th->check, ...)
                                    /* writes into freed head, or
                                       into the caller's stack copy
                                       leaving the on-wire checksum
                                       stale */

The option bytes are written through skb->data and are fine; only
the checksum update goes through th and so lands in the wrong
place.  The result is either a write into freed slab memory or a
packet leaving with a checksum that does not match its payload.

Fix by re-deriving th from skb->data + protoff immediately after
skb_ensure_writable() succeeds, so the subsequent checksum update
targets the linear, writable header.

Fixes: 48b1de4c11 ("netfilter: add SYNPROXY core/target")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:28:40 +02:00
Hamza Mahfooz bed6e04be8 netfilter: conntrack: tcp: do not force CLOSE on invalid-seq RST without direction check
An unintended behavior in the TCP conntrack state machine allows a
connection to be forced into the CLOSE state using an RST packet with an
invalid sequence number.

Specifically, after a SYN packet is observed, an RST with an invalid SEQ
can transition the conntrack entry to TCP_CONNTRACK_CLOSE, regardless of
whether the RST corresponds to the expected reply direction. The relevant
code path assumes the RST is a response to an outgoing SYN, but does not
validate packet direction or ensure that a matching SYN was actually sent
in the opposite direction.

As a result, a crafted packet sequence consisting of a SYN followed by an
invalid-sequence RST can prematurely terminate an active NAT entry. This
makes connection teardown easier than intended.

So, tighten the state transition logic to ensure that RST-triggered
CLOSE transitions only occur when the RST is a valid response to a
previously observed SYN in the correct direction.

Cc: stable@vger.kernel.org
Fixes: 9fb9cbb108 ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-05-22 12:27:55 +02:00
Linus Torvalds 68993ced0f Including fixes from Bluetooth, wireless and netfilter.
Current release - fix to a fix:
 
  - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events
 
  - vsock/virtio: relax the recently added memory limit a little
 
 Current release - regressions:
 
  - IB/IPoIB: make sure IB drivers always use async set_rx_mode since
    some (mlx5) are now required to use it due to locking changes
 
 Previous releases - regressions:
 
  - udp: fix UDP length on last GSO_PARTIAL segment
 
  - af_unix: fix UAF read of tail->len in unix_stream_data_wait()
 
  - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction
 
  - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP
 
 Previous releases - always broken:
 
  - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR
 
  - ipv4: raw: reject IP_HDRINCL packets with ihl < 5
 
  - Bluetooth: a lot of locking and concurrency fixes (as always)
 
  - batman-adv (mesh wireless networking): a lot of random fixes
    for issues reported by security researchers and Sashiko
 
  - netfilter: same thing, a lot of small security-ish fixes
    all over the place, nothing really stands out
 
 Misc:
 
  - bring back the old 3c509 driver, Maciej wants to maintain it
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmoPULUACgkQMUZtbf5S
 IrtAvA//bfjxxazZKkGqL8mp6uMYS5Su81Oh/pBcyEWC7q2xv3ftNp5pt8oCTWYP
 eryKi7XrxfNHrkFcmnH+aWQ431UekZLfAjrSd+5V0YvE1nQDnKrgbat5qx2SYSsr
 ZA7EYnJjvAtPMb0KqUJlYPMSfVdFA0H3gEOdnawkGRnizkKNO5NsNRkC4rHzpCil
 hzW5SCTZWQ0r1Cm3IxcTnSCJEOYRqH0BUBbiSRFCWNMZZpq0xKi3UiJFOdgRvqgc
 VoPz6sMRPxZyL8gW8i2jJVz6vj2yuWifJwbl8y3ZkqJJy4HvNXfcPIBH5+vBIWlB
 hWMuYlUv5F0w+h4+UKeDr789Tdpv12edUIDX+prbsJ8c4bXmBflt069HlFjG9Pto
 /k2e5owR0NYSaLt4WvAM6Tr5j1ralzQjHKVDg8JbPaAD+0dtb+e3dXE8J3MBPrw6
 EWtdg9jX+vqsbVoHwMQO9Xp2waNY9+97L07w+I0nVf7NLJvrvz0lkSjMKfNPNyV1
 C5W7McAbSOx3nJ+XzYwMoVK0wP9OunKA73EhAoEdvQSyOGLqQT+iZzDoTMnwKJFs
 2L3fbc8LQ10WBG2B2rCPB/gaGQ1ZZD8uSlZoS9N31dvUPFDaCnCYgKIze/pdcE/R
 KOQskME2xd61KzpYlJszkrjJIbnppkNt/mBvvfNUP+zJZPFRyuA=
 =ei7U
 -----END PGP SIGNATURE-----

Merge tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth, wireless and netfilter.

  Craziness continues with no end in sight. Even discounting the driver
  revert this is a pretty huge PR for standards of the previous era. I'd
  speculate - we haven't seen the worst of it, yet. Good news, I guess,
  is that so far we haven't seen many (any?) cases of "AI reported a
  bug, we fixed it and a real user regressed".

  Current release - fix to a fix:

   - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events

   - vsock/virtio: relax the recently added memory limit a little

  Current release - regressions:

   - IB/IPoIB: make sure IB drivers always use async set_rx_mode since
     some (mlx5) are now required to use it due to locking changes

  Previous releases - regressions:

   - udp: fix UDP length on last GSO_PARTIAL segment

   - af_unix: fix UAF read of tail->len in unix_stream_data_wait()

   - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction

   - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP

  Previous releases - always broken:

   - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR

   - ipv4: raw: reject IP_HDRINCL packets with ihl < 5

   - Bluetooth: a lot of locking and concurrency fixes (as always)

   - batman-adv (mesh wireless networking): a lot of random fixes for
     issues reported by security researchers and Sashiko

   - netfilter: same thing, a lot of small security-ish fixes all over
     the place, nothing really stands out

  Misc:

   - bring back the old 3c509 driver, Maciej wants to maintain it"

* tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (187 commits)
  net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown
  net: enetc: fix init and teardown order to prevent use of unsafe resources
  net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging
  net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()
  net: enetc: fix race condition in VF MAC address configuration
  net: enetc: fix TOCTOU race and validate VF MAC address
  net: enetc: add ratelimiting to VF mailbox error messages
  net: enetc: fix missing error code when pf->vf_state allocation fails
  net: enetc: fix incorrect mailbox message status returned to VFs
  net: bridge: prevent too big nested attributes in br_fill_linkxstats()
  l2tp: use list_del_rcu in l2tp_session_unhash
  net: bcmgenet: keep RBUF EEE/PM disabled
  ethernet: 3c509: Fix most coding style issues
  ethernet: 3c509: Update documentation to match MAINTAINERS
  ethernet: 3c509: Add GPL 2.0 SPDX license identifier
  ethernet: 3c509: Fix AUI transceiver type selection
  Revert "drivers: net: 3com: 3c509: Remove this driver"
  tools: ynl: support listening on all nsids
  net: gro: don't merge zcopy skbs
  pds_core: ensure null-termination for firmware version strings
  ...
2026-05-21 14:39:12 -07:00
Linus Torvalds 6d3b2673e1 A fix for an "rbd unmap" race condition which popped up on a production
setup where many RBD devices are frequently mapped and unmapped, marked
 for stable.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmoPOcUTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi6yvB/9+hm+IHQK1gib4gCHOPgq9cD1gj7j8
 oegjSmO0tIn3nP1K6wqQfTFnYkE6F37+dRqZTXOUjzi3NlefNJDnxwt9fymp2Y6S
 DEobTnRgXSKMxJQ4+cu7jphgbJC/OKYZ+fxJRrKI4hlgZOfQwpvUfQczZZgjWJRz
 WqJnSKLoF2k6JDgvokcmt1nyyQ7TrpPRB+6Jz2ATRtG0oxEzp1EN++FtiaRzn1mx
 e0WgCBZV/AEFey2YG+Zyhrrg+1nUDWrZI3NgIZz5Q5WU4q/8zF7S9qbNeBQVdzKJ
 aZFLxOdQvuRGbLs+g8f6dOMMlDSLnKl2rN2AAzcfwXcmmSQILqLH+ycp
 =+cb7
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A fix for an 'rbd unmap' race condition which popped up on a
  production setup where many RBD devices are frequently mapped and
  unmapped, marked for stable"

* tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client:
  rbd: eliminate a race in lock_dwork draining on unmap
2026-05-21 14:17:28 -07:00
Linus Torvalds 7acfa2c5f4 ring-buffer fixes for 7.1:
- Fix reporting MISSED EVENTS in trace iterator
 
   When the "trace" file is read with tracing enabled, if the writer
   were to pass the iterator reader, it resets, sets a "missed_events"
   flag and continues. The tracing output checks for missed events and
   if there are some, it prints out "[LOST EVENTS]" to let the user
   know events were dropped.
 
   But the clearing of the missed_events happened when the tracing system
   queried the ring buffer iterator about missed events. This was premature
   as the ring buffer is per CPU, and the tracing code reads all the
   CPU buffers and checks for missed events when it is read. If the
   CPU iterator that had missed events isn't printed next, the output
   for the LOST EVENTS is lost.
 
   Clear the missed_events flag when the iterator moves to the next event
   and not when the missed_events flag is queried. Also clear it on reset.
 
 - Flush and stop the persistent ring buffer on panic
 
   On panic the persistent ring buffer is used to debug what caused the
   panic. But on some architectures, it requires flushing the memory
   from cache, otherwise, the ring buffer persistent memory may not have
   the last events and this could also cause the ring buffer to be
   corrupted on the next boot.
 
 - Fix nr_subbufs initialization in simple_ring_buffer_init_mm
 
   The remote simple ring buffer  meta data nr_subbufs is initialized
   too early and gets cleared later on, making it zero and not reflect
   the actual number of sub-buffers.
 
 - Fix unload_page for simple_ring_buffer init rollback
 
   On error, the pages loaded need to be unloaded. To unload a page
   it is expected that: page = load_page(va); -> unload_page(page).
   But the code was doing: unload_page(va) and not unload_page(page).
 
 - Create output file from cmd_check_undefined
 
   The check for undefined symbols checks if the file *.o.checked exists
   and if so it skips doing the work. But the *.o.checked file never
   was created making every build do the work even when it was already
   done previously.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCag8l7BQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qga3AQDkyh7V4T+fxY5gc5jSKVx5U9bRAMpJ
 3GWGNCY9TGUyewEApUNO5MVGvXttyc1ONPHuBcShynj3resJk90sk491kw0=
 =aY8d
 -----END PGP SIGNATURE-----

Merge tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer fixes from Steven Rostedt:

 - Fix reporting MISSED EVENTS in trace iterator

   When the "trace" file is read with tracing enabled, if the writer
   were to pass the iterator reader, it resets, sets a "missed_events"
   flag and continues. The tracing output checks for missed events and
   if there are some, it prints out "[LOST EVENTS]" to let the user know
   events were dropped.

   But the clearing of the missed_events happened when the tracing
   system queried the ring buffer iterator about missed events. This was
   premature as the ring buffer is per CPU, and the tracing code reads
   all the CPU buffers and checks for missed events when it is read. If
   the CPU iterator that had missed events isn't printed next, the
   output for the LOST EVENTS is lost.

   Clear the missed_events flag when the iterator moves to the next
   event and not when the missed_events flag is queried. Also clear it
   on reset.

 - Flush and stop the persistent ring buffer on panic

   On panic the persistent ring buffer is used to debug what caused the
   panic. But on some architectures, it requires flushing the memory
   from cache, otherwise, the ring buffer persistent memory may not have
   the last events and this could also cause the ring buffer to be
   corrupted on the next boot.

 - Fix nr_subbufs initialization in simple_ring_buffer_init_mm

   The remote simple ring buffer meta data nr_subbufs is initialized too
   early and gets cleared later on, making it zero and not reflect the
   actual number of sub-buffers.

 - Fix unload_page for simple_ring_buffer init rollback

   On error, the pages loaded need to be unloaded. To unload a page it
   is expected that: page = load_page(va); -> unload_page(page). But the
   code was doing: unload_page(va) and not unload_page(page).

 - Create output file from cmd_check_undefined

   The check for undefined symbols checks if the file *.o.checked exists
   and if so it skips doing the work. But the *.o.checked file never was
   created making every build do the work even when it was already done
   previously.

* tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Create output file from cmd_check_undefined
  tracing: Fix unload_page for simple_ring_buffer init rollback
  tracing: Fix nr_subbufs initialization in simple_ring_buffer_init_mm()
  ring-buffer: Flush and stop persistent ring buffer on panic
  ring-buffer: Fix reporting of missed events in iterator
2026-05-21 14:05:09 -07:00
Jakub Kicinski 0e3c08f1b7 Quite a few more updates:
- cfg80211/mac80211:
    - various security(-ish) fixes
    - fix A-MSDU subframe handling
    - fix multi-link element parsing
  - ath10: avoid sending commands to dead device
  - ath11k:
    - fix WMI buffer leaks on error conditions
    - fix UAF in RX MSDU coalesce path
    - allow peer ID 0 on RX path (legal for mobile devices)
    - reinitialize shared SRNG pointers on restart
  - ath12k:
    - fix 20 MHz-only parsing of EHT-MCS map
  - iwlwifi:
    - fix TSO segmentation explosion
    - don't TX to dead device
    - fix warning in WoWLAN
    - fix TX rates on old devices
    - disconnect on beacon loss only if also no other traffic
    - fill NULL-ptr deref
    - fix STEP_URM hardware access
 -----BEGIN PGP SIGNATURE-----
 
 iQIyBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmoPJEkACgkQ10qiO8sP
 aACQrw/4vU+lbZNW19OyaJMd4h+44gUW+UGJixOzputQCBc6JGUlRsxgceWq5Ws5
 5x2LTOX7S1wcvKm0VuSvkIRP3e9YHcgB60iBtsJ3ozz4RCoCFiSu8Bb2RdkGtRTp
 7CKMK9NNuovOJncBzfyANq4ujsGs/58BmGbhXbaZ0ACfLUauesCCUtM7iQZE1k7t
 lBqtk8ezkz1L8006w5vR7VR8g4KCCofQTEAOASmx450ZeGAiHMlWVKdMFFHV3zWj
 ZDXopvLaMtduLNq9xqGYCRhAIZqOv1axgL7w9RRxsi2gWHv71kLqyz0IzgbFmh1m
 ZxUSQ45+MHVYCHxs7HHCcTR5gqQlx47j5Wi3tuLUH8yoSZ8dPeWjmQMvIEswfZql
 WNq18o6mcK+L3Yg87+oxiiJ7V/euaM//0+ZGtqhbiB+2FyHZhO42BqALTy7e4swS
 kmEl8gCj2lgCbD2AHJQ9VpOJwoNdNuLYoJqg9IiIu/CYqQF80FGO8e6HZXBXsJkL
 3KAPQIXkMMMkSjtpTg/GdHDiFqv/7lF8u3FgED7w7M1ZVQYNUc13KiDwPALFV0pu
 bBbRktB6lvF6ShW9XrTrmn9lT0iiWHxr5YctWoys4+Ofr5V7PUzxNVDxDuzSVCZ0
 apLYZ7uwSXO37q99p/azs47dzYp7tpnwKd4rpQuUTl/9bYErUA==
 =vzwG
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Quite a few more updates:
 - cfg80211/mac80211:
   - various security(-ish) fixes
   - fix A-MSDU subframe handling
   - fix multi-link element parsing
 - ath10: avoid sending commands to dead device
 - ath11k:
   - fix WMI buffer leaks on error conditions
   - fix UAF in RX MSDU coalesce path
   - allow peer ID 0 on RX path (legal for mobile devices)
   - reinitialize shared SRNG pointers on restart
 - ath12k:
   - fix 20 MHz-only parsing of EHT-MCS map
 - iwlwifi:
   - fix TSO segmentation explosion
   - don't TX to dead device
   - fix warning in WoWLAN
   - fix TX rates on old devices
   - disconnect on beacon loss only if also no other traffic
   - fill NULL-ptr deref
   - fix STEP_URM hardware access

* tag 'wireless-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (24 commits)
  wifi: cfg80211: wext: validate chandef in monitor mode
  wifi: mac80211: consume only present negotiated TTLM maps
  wifi: wilc1000: fix dma_buffer leak on bus acquire failure
  wifi: mac80211: capture fast-RX rate before mesh reuses skb->cb
  wifi: mac80211: fix multi-link element inheritance
  wifi: mac80211: fix MLE defragmentation
  wifi: mac80211: don't override max_amsdu_subframes
  wifi: mac80211: bounds-check link_id in ieee80211_ml_epcs
  wifi: ath12k: fix EHT TX MCS limitation due to wrong 20 MHz-only parsing
  wifi: ath11k: clear shared SRNG pointer state on restart
  wifi: ath11k: fix use after free in ath11k_dp_rx_msdu_coalesce()
  wifi: ath11k: fix peer resolution on rx path when peer_id=0
  wifi: iwlwifi: mld: disconnect only after 6 beacons without Rx
  wifi: iwlwifi: mld: don't WARN on WoWLAN suspend w/o BSS vif
  wifi: iwlwifi: use correct function to read STEP_URM register
  wifi: iwlwifi: mvm: fix driver-set TX rates on old devices
  wifi: iwlwifi: mld: don't dereference a pointer before NULL checking it
  wifi: iwlwifi: mld: stop TX during firmware restart
  wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabled
  wifi: ath10k: skip WMI and beacon transmission when device is wedged
  ...
====================

Link: https://patch.msgid.link/20260521152903.374070-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 11:03:58 -07:00
Linus Torvalds 758c807bb9 EFI fixes for v7.1 #2
- Permit ACPI PRM runtime firmware calls when acpi_init() runs
 
 - Add another Lenovo Ideapad framebuffer quirk
 
 - Cosmetic tweak
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCag8IUwAKCRAwbglWLn0t
 XPsPAQDBYVNBZTQ+6X5m/G6VbMqZgm2p9TXcqN05UbkCIu6SFQD/cedFAI+MNSbM
 7fog1OIWWr2VLiv425+79QRMcqNQNAg=
 =sZPl
 -----END PGP SIGNATURE-----

Merge tag 'efi-fixes-for-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - Permit ACPI PRM runtime firmware calls when acpi_init() runs

 - Add another Lenovo Ideapad framebuffer quirk

 - Cosmetic tweak

* tag 'efi-fixes-for-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: sysfb_efi: Extend quirk to cover IdeaPad Duet 3 10IGL5-LTE
  efi: efi.h: Remove extra semicolon
  efi: Allocate runtime workqueue before ACPI init
2026-05-21 08:59:52 -07:00
Jakub Kicinski c33f944a33 Merge branch 'net-enetc-sr-iov-robustness-and-security-fixes'
Wei Fang says:

====================
net: enetc: SR-IOV robustness and security fixes

This patch series addresses a number of robustness, security, and
correctness issues in the ENETC driver's SR-IOV subsystem, focusing
primarily on the VF-to-PF mailbox communication path.

The series can be grouped into the following categories:

1. DoS and security fixes:
   - Prevent an unbounded loop DoS in the VF-to-PF message handler,
     which could be triggered by a malicious or misbehaving VF.
   - Fix a TOCTOU (Time-of-Check-Time-of-Use) race and add proper
     validation of VF MAC addresses to prevent spoofing or invalid
     configuration from being applied.

2. Race condition fixes:
   - Fix a race condition in VF MAC address configuration that could
     lead to inconsistent state between the VF request and PF
     application.
   - Fix a race condition during SR-IOV teardown that could cause
     VF->PF mailbox operations to time out, resulting in unnecessary
     errors during shutdown.

3. Memory safety fixes:
   - Fix a DMA write to freed memory in enetc_msg_free_mbx(), which
     could cause silent memory corruption or system instability.

4. Error handling and initialization fixes:
   - Fix missing error code propagation when pf->vf_state allocation
     fails, ensuring callers receive a proper errno instead of
     succeeding silently.
   - Fix incorrect mailbox message status values returned to VFs,
     which could cause VFs to misinterpret PF responses.
   - Fix initialization order to prevent the use of uninitialized
     resources during driver probe, which could cause undefined
     behavior on certain configurations.

5. Diagnostics improvement:
   - Add rate limiting to VF mailbox error messages to prevent log
     flooding in the presence of a misbehaving VF.

These fixes improve the overall stability and security of the ENETC
SR-IOV implementation, particularly in multi-tenant environments where
VFs may be assigned to untrusted guests.
====================

Link: https://patch.msgid.link/20260520064421.91569-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:49:02 -07:00
Wei Fang 9e68817f12 net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown
During SR-IOV teardown, enetc_msg_psi_free() disables the MR interrupt
before pci_disable_sriov() removes the VFs. If a VF sends a mailbox
message during this window, the PF cannot receive it, causing the VF to
timeout waiting for a reply.

Since the timeout occurs during SR-IOV teardown when the VF is about to
be removed anyway, it has no functional impact on operation. However,
more messages will be added in the future, some visible error logs may
confuse users. So fix it by calling pci_disable_sriov() first to remove
all VFs, then safely clean up the mailbox resources. This eliminates the
race window where VFs could send messages to an unresponsive PF.

Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-10-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:49:00 -07:00
Wei Fang 54362b0176 net: enetc: fix init and teardown order to prevent use of unsafe resources
Sashiko reported a potential issue in enetc_msg_psi_init() where the IRQ
handler is registered before DMA resources are fully initialized [1].

The current initialization sequence is:

  1. request_irq(enetc_msg_psi_msix)    <- IRQ handler registered
  2. INIT_WORK(&pf->msg_task, ...)      <- work_struct initialized
  3. enetc_msg_alloc_mbx()              <- mailbox DMA allocated

This ordering is unsafe because if a spurious interrupt or pending
interrupt from a previous device state fires immediately after
request_irq() returns, the registered ISR enetc_msg_psi_msix() will
execute and unconditionally call:

  schedule_work(&pf->msg_task)

At this point, pf->msg_task has not been initialized by INIT_WORK(), so
the work_struct contains garbage values in its internal linked list
pointers (work_struct->entry). Passing an uninitialized work_struct to
schedule_work() could corrupt the kernel's workqueue linked lists,
potentially leading to:

  - Kernel panic in __queue_work()
  - Memory corruption in workqueue data structures
  - System deadlock or undefined behavior

Additionally, even if the work_struct was initialized, the mailbox DMA
buffers (pf->rxmsg[]) may not yet be allocated when the work handler
enetc_msg_task() runs, resulting in NULL pointer dereference.

Fix by reordering the initialization sequence to ensure all resources are
properly initialized before the interrupt handler can execute:

  1. enetc_msg_alloc_mbx()              <- Allocate all mailboxes
  2. INIT_WORK(&pf->msg_task, ...)      <- Initialize work first
  3. request_irq(enetc_msg_psi_msix)    <- Register IRQ last
  4. Configure hardware & enable MR interrupts

This guarantees that when enetc_msg_psi_msix() runs:
  - pf->msg_task is properly initialized (safe for schedule_work)
  - pf->rxmsg[] buffers are allocated (safe for work handler access)
  - Hardware is configured appropriately

As the inverse of enetc_msg_psi_init(), enetc_msg_psi_free() also has
similar problems. For example, if a pending interrupt fires between
enetc_msg_free_mbx() and free_irq(), the ISR enetc_msg_psi_msix() may
schedule the work handler again via schedule_work(), which could then
access already-freed DMA buffers (pf->rxmsg[]), leading to use-after-free
and potential memory corruption.

Therefore, the order of enetc_msg_psi_free() is adjusted:
  1. enetc_msg_disable_mr_int()       <- Stop new interrupts first
  2. free_irq()                       <- Ensure no IRQ handler can run
  3. cancel_work_sync()               <- Wait for any pending work
  4. enetc_msg_disable_mr_int()       <- Re-disable in case work
					 re-enabled it
  5. enetc_msg_free_mbx()             <- Safe to free DMA buffers now

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1
Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-9-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang f8ae63de2a net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging
The enetc_msg_task() function has several issues that need to be addressed:

1. Unbounded loop causing potential DoS:

enetc_msg_task() processes VF-to-PF mailbox messages in an unbounded
for(;;) loop that keeps polling ENETC_PSIMSGRR until no MR bits are set.
A malicious guest VM can exploit this by continuously sending messages at
a high rate - immediately sending a new message as soon as the PF
acknowledges the previous one. Since the worker thread never yields or
enforces a processing budget, the mr_mask check frequently evaluates to
non-zero, causing the PF to spin indefinitely and starving other tasks.

Fix this by replacing the unbounded loop with a single snapshot read at
task entry. The task processes only the VFs whose MR bits were set at
that point, then re-enables message interrupts before returning. This
bounds work per invocation to at most num_vfs iterations. No messages are
lost because the message interrupt is disabled in enetc_msg_psi_msix()
before scheduling enetc_msg_task(), so any new messages arriving during
processing will trigger a fresh interrupt once re-enabled, scheduling
another task invocation.

2. Write order of ENETC_PSIIDR and ENETC_PSIMSGRR:

Both ENETC_PSIIDR and ENETC_PSIMSGRR contain MR bits indicating messages
have been received from VSIs, but only ENETC_PSIIDR trigger the CPU
interrupt. Previously, ENETC_PSIMSGRR was written before ENETC_PSIIDR.
Writing ENETC_PSIMSGRR returns the message code to the VSI in its upper
16 bits, signaling to the VF that message processing is complete and it
may send the next message. If the VF sends a new message before
ENETC_PSIIDR is written, the subsequent w1c write to ENETC_PSIIDR would
inadvertently clear the MR bit set by the new message, causing the
interrupt to be lost and the new message to go unprocessed.

Therefore, write ENETC_PSIIDR first to clear the interrupt source, then
write ENETC_PSIMSGRR to acknowledge the message to the VSI.

3. Check both ENETC_PSIMSGRR and ENETC_PSIIDR for mr_status:

The write order change above introduces a potential race: if a VF sends
a new message in the window between the ENETC_PSIIDR w1c and the
ENETC_PSIMSGRR w1c, the ENETC_PSIMSGRR MR bit for the new message may
not be set. If mr_status was derived solely from ENETC_PSIMSGRR, this
message would never be detected despite ENETC_PSIIDR retaining its MR
bit, leading to an unacknowledged interrupt storm.

Fix this by computing mr_status as the union of both ENETC_PSIMSGRR and
ENETC_PSIIDR MR bits, ensuring all pending messages are detected
regardless of which register reflects the new message state.

Additionally, rename the per-register MR macros (ENETC_PSI*_MR_MASK,
ENETC_PSI*_MR) to register-agnostic names (ENETC_PSIMR_MASK,
ENETC_PSIMR_BIT) since the MR bit layout is shared across ENETC_PSIMSGRR,
ENETC_PSIIER, and ENETC_PSIIDR. Make the mask macro dynamic based on
the actual number of active VFs rather than hardcoded.

Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260520064421.91569-8-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang adb4599979 net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()
The teardown sequence in enetc_msg_psi_free() frees the DMA buffer before
clearing the device's DMA address registers. If a VF sends a message or a
pending DMA transfer completes within this window, the hardware will
perform a DMA write into the kernel memory that has already been returned
to the allocator.

The result is silent memory corruption that can affect arbitrary kernel
data structures. Therefore, clear the DMA address registers before the
DMA buffer is freed.

Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-7-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang f262f5d893 net: enetc: fix race condition in VF MAC address configuration
Sashiko reported a potential race condition between the VF message
handler and administrative VF MAC configuration from the host [1].

The VF message handler (enetc_msg_pf_set_vf_primary_mac_addr) runs
asynchronously in a workqueue context and accesses vf_state->flags
without any locking. Concurrently, the host can administratively
change the VF MAC address via enetc_pf_set_vf_mac(), which executes
under RTNL lock and modifies both vf_state->flags and hardware
registers.

This creates two race windows:

1) TOCTOU race on vf_state->flags: The check of ENETC_VF_FLAG_PF_SET_MAC
   and subsequent MAC programming are not atomic, allowing the flag state
   to change between check and use.

2) Torn MAC address writes: Hardware MAC programming requires multiple
   non-atomic register writes (__raw_writel for lower 32 bits and
   __raw_writew for upper 16 bits). Concurrent updates from VF mailbox
   and PF admin paths can interleave these operations, resulting in a
   corrupted MAC address being programmed into the hardware.

Fix by introducing a per-VF mutex to serialize access to vf_state and
hardware MAC register updates. Both enetc_pf_set_vf_mac() and
enetc_msg_pf_set_vf_primary_mac_addr() now acquire this lock before
accessing vf_state->flags or programming the MAC address, ensuring
atomic read-modify-write sequences and preventing register write
interleaving.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1
Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-6-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang c666fa632f net: enetc: fix TOCTOU race and validate VF MAC address
Sashiko reported that the PF driver accepts arbitrary MAC address from
from VF mailbox messages without proper validation, creating a security
vulnerability [1].

In enetc_msg_pf_set_vf_primary_mac_addr(), the MAC address is extracted
directly from the message buffer (cmd->mac.sa_data) and programmed into
hardware via pf->ops->set_si_primary_mac() without any validity checks.
A malicious VF can configure a multicast, broadcast, or all-zero MAC
address. Therefore, a validation to check the MAC address provided by VF
is required.

However, simply checking the MAC address is not enough, because it also
has the potential TOCTOU race [2]: The code reads the MAC address from
the DMA buffer to validate it via is_valid_ether_addr(), if validation
passes, reads the same DMA buffer a second time when calling
enetc_pf_set_primary_mac_addr() to program the hardware. A malicious VF
can exploit this window by overwriting the MAC address in the DMA buffer
between the validation check and the hardware programming, bypassing the
validation entirely.

Therefore, allocate a local buffer in enetc_msg_handle_rxmsg() and copy
the message content from the DMA buffer via memcpy() before processing.
This ensures the PF operates on a stable snapshot that the VF cannot
modify.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1
Link: https://sashiko.dev/#/patchset/20260513103021.2190593-1-wei.fang%40nxp.com #2
Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-5-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang 4a995d37b5 net: enetc: add ratelimiting to VF mailbox error messages
Sashiko reported that a buggy or malicious guest VM can flood the host
kernel log by repeatedly sending VF-to-PF messages at a high rate,
degrading host performance and hiding important system logs [1].

Fix by replacing dev_err()/dev_warn() with dev_err_ratelimited(),
limiting output to the default kernel ratelimit. This ensures errors are
still logged for debugging while preventing log flooding attacks.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1
Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-4-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:59 -07:00
Wei Fang 5027266dea net: enetc: fix missing error code when pf->vf_state allocation fails
In enetc_pf_probe(), when the memory allocation for pf->vf_state fails,
the code jumps to the error handling label but the variable 'err' is not
assigned an appropriate error code beforehand. This causes the function
to return 0 (success) on an allocation failure path, misleading the
caller into thinking the probe succeeded. So set err to -ENOMEM before
jumping to the error handling label when the allocation for pf->vf_state
returns NULL.

Fixes: e15c5506dd ("net: enetc: allocate vf_state during PF probes")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-3-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:58 -07:00
Wei Fang 8c84c5ec4a net: enetc: fix incorrect mailbox message status returned to VFs
There are two cases where VFs receive an incorrect success status from
the PF mailbox message handler, misleading them into believing their
requests have been fulfilled:

In enetc_msg_handle_rxmsg(), *status is pre-initialized to
ENETC_MSG_CMD_STATUS_OK. When an unsupported command type is received,
the default case only logs an error without updating *status, so it
remains as ENETC_MSG_CMD_STATUS_OK.

In enetc_msg_pf_set_vf_primary_mac_addr(), when the PF has already
assigned a MAC address for the VF (ENETC_VF_FLAG_PF_SET_MAC is set),
the function rejects the request but returns ENETC_MSG_CMD_STATUS_OK
instead of ENETC_MSG_CMD_STATUS_FAIL.

Therefore, correct the status value for the two cases mentioned above.

Fixes: beb74ac878 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-2-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:48:58 -07:00
Eric Dumazet bdd39576bf net: bridge: prevent too big nested attributes in br_fill_linkxstats()
After commit ff205bf8c554 ("netlink: add one debug check in nla_nest_end()")
syzbot found that br_fill_linkxstats() can send corrupted netlink packets.

Make sure the nested attribute size is bounded.

Fixes: a60c090361 ("bridge: netlink: export per-vlan stats")
Reported-by: syzbot+a35f9259d08f907c06e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a0b0da3.050a0220.175f0c.0000.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260520114207.1394241-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:47:36 -07:00
Michael Bommarito 979c017803 l2tp: use list_del_rcu in l2tp_session_unhash
An unprivileged local user can pin a host CPU indefinitely in
l2tp_session_get_by_ifname() by issuing L2TP_CMD_SESSION_GET on
L2TP_ATTR_IFNAME concurrently with L2TP_CMD_SESSION_CREATE and
L2TP_CMD_SESSION_DELETE on the same tunnel. All three commands take
GENL_UNS_ADMIN_PERM, so CAP_NET_ADMIN in the netns user namespace
suffices; on any host that has l2tp_core loaded the trigger is
reachable from a standard `unshare -Urn` sandbox.

l2tp_session_unhash() removes a session from tunnel->session_list
with list_del_init(), but that list is walked by
l2tp_session_get_by_ifname() with list_for_each_entry_rcu() under
rcu_read_lock_bh(). list_del_init() leaves the deleted entry's
next/prev self-pointing; a reader that has loaded the entry and
then advances pos->list.next reads &session->list, container_of()s
back to the same session, and list_for_each_entry_rcu() never
reaches the list head. The CPU stays in strcmp() inside the
walker, with BH and preemption disabled, so RCU grace periods on
the host stall behind it and the wedged thread cannot be killed
(SIGKILL is delivered on syscall return).

Use list_del_rcu() to match the existing list_add_rcu() in
l2tp_session_register(); the deleted session remains visible to
in-flight walkers with consistent next/prev pointers until
kfree_rcu() in l2tp_session_free() releases it. tunnel->session_list
has exactly one list_del_init() call site; the list_del_init
(&session->clist) at l2tp_core.c:533 operates on the per-collision
list, which is not walked under RCU. list_empty(&session->list) is
not used anywhere in net/l2tp/ after the unhash point, so dropping
the post-delete self-init is safe; the fix has no userspace-visible
behavior change.

Fixes: 89b768ec2d ("l2tp: use rcu list add/del when updating lists")
Cc: stable@vger.kernel.org # 6.11+
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260518183447.64078-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:47:20 -07:00
Linus Torvalds dd3802fc4f soc: fixes for 7.1
The ff-a firmware driver gets 11 individual bugfixes for a number
 of issues with robustness to buggy firmware or client implementations.
 Another firmware fix address suspend to RAM via PSCI firmware.
 
 The final code change is for the old Arm Integrator reference
 platform that recently started exposing an old NULL pointer
 dereference bug.
 
 The MAINTAINERS file gets two updates, notably James Tai and Yu-Chun
 Lin are stepping up as co-maintainers for the Realtek platform.
 
 The remaining patches are all for devicetree files. Two of these
 are for riscv  boards, the rest are all for enesas Arm platforms,
 addressing build time checking issues as well as minor configuration
 problems.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmoOI3UACgkQmmx57+YA
 GNmfZRAAgTy6C1902MT6aysKlN1rZLAPN2izKjAUXTaH0tXzTs4N1Q9UqUSQrOtM
 INjWwfHuDw0oJOJozkN8jdmTW9s+4ebo1KM6vUoW21yLGprxiVmm/uG62Ycyt2cb
 tKA/SzoK1sg7ZnV2agmBNyXvEtQyx4mbfNCfv1l1DH+f4q26PZMR8s6kzfK8Kb4O
 e8oIRF/q773Ht7dRqT+NKb+qPMfG0IeLrcnRWY5qoHF/RIUDShO6W8uZEYisiXnY
 Z9e/eNbTyG/zIsqhr8qgwbdW9GXHkj0ztvwbwzC8PeXqsETl4LhXZkUaO/jvA2MH
 VihikFOFYhYZcna/6OQqPEyTKrCxGfuK4be9bYPrD3weEou3YR6+aHV8rpUPEsNq
 3C8iSYDflWwhsj571qy8sMwkYkvIcrIIdDZltKU20Q6p5pv6EeyUwN3RlUJDsXr/
 j6wnHm6DFF8WL0V8/Vv1lB/PjySkzOIFfjihq6VVPeo2EYhJjLmdxg/Z9MNbCY59
 E8Fl9xBqg6YJyZ+Why6v4vkFvNKJX1T35AhgHR58X5DsrGz3v9fvC//m9EENrKCz
 GbNFXC7i93+6/MnQh7SYFUEYaVcAM+3Z8N91qYVJxvgJ4BVURnelJKe5WVwZjjY4
 Ws981yptKHgHWaRHZbCCd9VXQKtnb+jU1Zjjx1uTVEOZtecM3Dw=
 =MpIJ
 -----END PGP SIGNATURE-----

Merge tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:

 - The ff-a firmware driver gets 11 individual bugfixes for a number of
   issues with robustness to buggy firmware or client implementations.
   Another firmware fix address suspend to RAM via PSCI firmware.

 - The final code change is for the old Arm Integrator reference
   platform that recently started exposing an old NULL pointer
   dereference bug.

 - The MAINTAINERS file gets two updates, notably James Tai and Yu-Chun
   Lin are stepping up as co-maintainers for the Realtek platform.

 - The remaining patches are all for devicetree files. Two of these are
   for riscv boards, the rest are all for enesas Arm platforms,
   addressing build time checking issues as well as minor configuration
   problems.

* tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
  firmware: psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
  ARM: realtek: MAINTAINERS: Include pin controller drivers
  MAINTAINERS: Add maintainers for ARM/REALTEK ARCHITECTURE
  ARM: integrator: Fix early initialization
  firmware: arm_ffa: Fix sched-recv callback partition lookup
  firmware: arm_ffa: Snapshot notifier callbacks under lock
  firmware: arm_ffa: Align RxTx buffer size before mapping
  firmware: arm_ffa: Validate framework notification message layout
  firmware: arm_ffa: Keep framework RX release under lock
  firmware: arm_ffa: Bound PARTITION_INFO_GET_REGS copies
  firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0
  firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue
  firmware: arm_ffa: Avoid collapsing NPI work from different CPUs
  firmware: arm_ffa: Skip free_pages on RX buffer alloc failure
  firmware: arm_ffa: Check for NULL FF-A ID table while driver registration
  riscv: dts: microchip: fix icicle i2c pinctrl configuration
  riscv: dts: starfive: jh7110: Drop CAMSS node
  arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst
  arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst
  ARM: dts: renesas: rskrza1: Drop superfluous cells
  ...
2026-05-21 08:43:26 -07:00
Nicolai Buchwitz 9a1730245e net: bcmgenet: keep RBUF EEE/PM disabled
Setting RBUF_EEE_EN | RBUF_PM_EN in RBUF_ENERGY_CTRL breaks the RX
path on GENET hardware once MAC EEE becomes active. RX traffic stops
flowing while the link stays up and the usual descriptor/RX error
counters remain quiet. In that state the MAC still accepts frames
(rbuf_ovflow_cnt keeps climbing) but RBUF no longer forwards them to
DMA, so rx_packets is no longer incremented at the netdev level. On
some boards the corruption ends up as a paging fault in
skb_release_data via bcmgenet_rx_poll on an LPI exit.

Reproduced on Pi 4B (BCM2711 + BCM54213PE) and confirmed by Florian
Fainelli on an internal Broadcom 4908-family board with the same crash
signature. RBUF_PM_EN is not publicly documented.

This shows up more often now that phy_support_eee() enables EEE by
default, but it also affects older kernels as soon as TX LPI is
turned on via ethtool, so it is not specific to recent changes.

Always clear RBUF_EEE_EN | RBUF_PM_EN in bcmgenet_eee_enable_set so
the bits stay off across resets. UMAC and TBUF setup is left alone so
TX-side EEE keeps working.

Link: https://github.com/raspberrypi/linux/issues/7304
Fixes: 6ef398ea60 ("net: bcmgenet: add EEE support")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260520184320.652053-1-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:37:56 -07:00
Jakub Kicinski c5fcca7f66 Merge branch 'ethernet-3c509-bring-driver-back-and-make-some-fixes'
Maciej W. Rozycki says:

====================
ethernet: 3c509: Bring driver back and make some fixes

 As per the previous discussions[1][2] this patch series brings the 3c509
driver back.  Picking up net rather than net-next as I consider it a fix
to accidental removal and so that any downstream users do not suffer from
disruption when using released kernels.

 In the course of making the coding style changes requested I have come
across an actual bug in transceiver type selection code, where the old
setting is not masked out before ORing in the new one, causing no change
to be actually made in a requested transition from BNC to AUI.  I guess
this code must have been executed exceedingly rarely, as it's always been
wrong ever since it was added in 2.5.42 back in 2002.

 Therefore I find it not worth backporting to stable branches, however for
the sake of appropriateness, in case someone downstream does want to have
the fix, I chose to apply it second in the series, right after the actual
revert and before code clean-ups.

 The remaining patches of the series should be obvious; see the respective
commit descriptions for details.

[1] "drivers: net: 3com: 3c509: Remove this driver",
    <https://lore.kernel.org/r/alpine.DEB.2.21.2604240004280.28583@angie.orcam.me.uk/>.

[2] "MAINTAINERS: Add self for the 3c509 network driver",
    <https://lore.kernel.org/r/alpine.DEB.2.21.2604271056460.28583@angie.orcam.me.uk/>.
====================

Link: https://patch.msgid.link/alpine.DEB.2.21.2605201115010.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:59 -07:00
Maciej W. Rozycki 014767c709 ethernet: 3c509: Fix most coding style issues
Update the driver for our current coding style according to output from
`checkpatch.pl' and manual code review, where no change to binary code
results, as indicated by `objdump -dr'.  Exceptions are as follows:

- incomplete reverse xmas tree in set_multicast_list(), as that would
  change binary output,

- referring el3_start_xmit() verbatim rather than via `__func__' with
  pr_debug(), likewise,

- a bunch of pr_cont() calls, likewise,

- a long udelay() call in el3_netdev_set_ecmd() made under a spinlock,
  likewise plus it's not eligible for conversion to a sleep in the first
  place,

- a blank line at the start of a block in el3_interrupt(), to improve
  readability where the first statement would otherwise visually merge
  with the controlling expression of the enclosing `while' statement.

These issues are benign and depending on circumstances may be adressed
with suitable code refactoring later on.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201208280.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:56 -07:00
Maciej W. Rozycki 75756cb4b2 ethernet: 3c509: Update documentation to match MAINTAINERS
There has been apparently a single message only ever publicly posted by
David Ruggiero, back in 2002, which added this documentation piece among
others, and MAINTAINERS was never updated accordingly.  It is therefore
doubtful that his maintainer status has actually come into effect.  Just
replace the reference then so as not to confuse people.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201207380.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:56 -07:00
Maciej W. Rozycki 240117bb51 ethernet: 3c509: Add GPL 2.0 SPDX license identifier
This driver has landed with Linux 0.99.13k, which was covered by the GNU
General Public License version 2, and no further conditions as to
licensing terms have been specified within the copyright notice included
with the driver itself.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201206370.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:56 -07:00
Maciej W. Rozycki 029a6b3a14 ethernet: 3c509: Fix AUI transceiver type selection
The transceiver type is held in bits 15:14 of the Address Configuration
Register, with the values of 0b00, 0b01, and 0b11 denoting TP, AUI, and
BNC types respectively.  Therefore switching from BNC to AUI requires
bits to be cleared before setting bit 14 or the setting won't change.

NB this has always been wrong ever since this code was added in 2.5.42.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201205160.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:56 -07:00
Maciej W. Rozycki 28db0338db Revert "drivers: net: 3com: 3c509: Remove this driver"
This reverts commit 91f3a27ae9.

Contrary to the assumption stated with the original commit description
this driver is in use and I'm going to maintain it for the foreseeable
future.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201204260.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:28:56 -07:00
Ilya Maximets 3287e81292 tools: ynl: support listening on all nsids
A new method ntf_listen_all_nsid() to enable listening on events from
all namespaces.  Useful for testing cross-namespace functionality.

recv() replaced with recvmsg() to be able to receive NSID through the
ancillary data.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20260520172317.175168-4-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:23:50 -07:00
Sabrina Dubroca 4db79a322d net: gro: don't merge zcopy skbs
skb_gro_receive() can currently copy frags between the source and GRO
skb, without checking the zerocopy status, and in particular the
SKBFL_MANAGED_FRAG_REFS flag.

When SKBFL_MANAGED_FRAG_REFS is set, the skb doesn't hold a reference
on the pages in shinfo->frags. Appending those frags to another skb's
frags without fixing up the page refcount can lead to UAF.

When either the last skb in the GRO chain (the one we would append
frags to) or the source skb is zerocopy, don't merge the skbs.

Fixes: 753f1ca4e1 ("net: introduce managed frags infrastructure")
Reported-by: Huzaifa Sidhpurwala <huzaifas@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/c3b7f906bbfcbdfd7b4fa9d6c18a438870df85be.1779307748.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:21:33 -07:00
Nikhil P. Rao 3d4432d34c pds_core: ensure null-termination for firmware version strings
The driver passes fw_version directly to devlink_info_version_stored_put()
without ensuring null-termination. While current firmware null-terminates
these strings, the driver should not rely on this behavior. Add explicit
null-termination to prevent potential issues if firmware behavior changes.

Fixes: 45d76f4929 ("pds_core: set up device and adminq")
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
Link: https://patch.msgid.link/20260520205842.1486718-1-nikhil.rao@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:21:11 -07:00
Lorenzo Bianconi 985d4a55e6 net: airoha: Disable GDM2 forwarding before configuring GDM2 loopback
Hw design requires to disable GDM2 forwarding before configuring GDM2
loopback in airoha_set_gdm2_loopback routine.

Fixes: 9cd451d414 ("net: airoha: Add loopback support for GDM2")
Tested-by: Madhur Agrawal <madhur.agrawal@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260520-airoha-disable-gdm2-fwd-v1-1-1eeea5dffc2f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:19:54 -07:00
Justin Iurman e46e6bc97f ipv6: ioam: refresh hdr pointer before ioam6_event()
Reported by Sashiko:

In ipv6_hop_ioam(), the hdr pointer is initialized to point into the
skb's linear data buffer. Later, the code calls skb_ensure_writable(),
which might reallocate the buffer:

	if (skb_ensure_writable(skb, optoff + 2 + hdr->opt_len))
		goto drop;

	/* Trace pointer may have changed */
	trace = (struct ioam6_trace_hdr *)(skb_network_header(skb)
					   + optoff + sizeof(*hdr));

	ioam6_fill_trace_data(skb, ns, trace, true);

	ioam6_event(IOAM6_EVENT_TRACE, dev_net(skb->dev),
		    GFP_ATOMIC, (void *)trace, hdr->opt_len - 2);

If the skb is cloned or lacks sufficient linear headroom,
skb_ensure_writable() will invoke pskb_expand_head(), which reallocates
the skb's data buffer and frees the old one, invalidating pointers to
it. While the code recalculates the trace pointer immediately after the
call to skb_ensure_writable(), it fails to recalculate the hdr pointer.

This patch fixes the above by recalculating the hdr pointer before
passing hdr->opt_len to ioam6_event(), so that we avoid any UaF.

Fixes: f655c78d62 ("net: exthdrs: ioam6: send trace event")
Cc: stable@vger.kernel.org
Signed-off-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260520124242.32320-1-justin.iurman@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:19:25 -07:00
Weiming Shi bddc09212c tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR
In the SIOCGIFHWADDR path, tap_ioctl() copies 16 bytes of an
uninitialised on-stack struct sockaddr_storage to userspace via
ifr_hwaddr, but netif_get_mac_address() only writes sa_family and
dev->addr_len (6 for Ethernet) bytes, leaving sa_data[6..13] uninitialised.

Those 8 trailing bytes leak kernel stack contents; SIOCGIFHWADDR on a
macvtap chardev returns kernel .text and direct-map pointers, defeating
KASLR.

Initialise ss at declaration.

Fixes: 3b23a32a63 ("net: fix dev_ifsioc_locked() race condition")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260520075736.3415676-3-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:16:12 -07:00
Dawei Feng 2bccfb8476 qed: fix double free in qed_cxt_tables_alloc()
If one of the later PF or VF CID bitmap allocations fails,
qed_cid_map_alloc() jumps to cid_map_fail and frees the previously
allocated CID bitmaps before returning an error. qed_cxt_tables_alloc()
then calls qed_cxt_mngr_free(), which invokes qed_cid_map_free()
again.

Fix this by setting each CID bitmap pointer to NULL after bitmap_free()
to avoid double free.

The bug was first flagged by an experimental analysis tool we are
developing for kernel memory-management bugs while analyzing
v6.13-rc1. The tool is still under development and is not yet publicly
available. Manual inspection confirms that the bug is still
present in v7.1-rc3.

Runtime reproduction was not attempted because exercising the failing
allocation path requires device-specific setup.

Fixes: fe56b9e6a8 ("qed: Add module with basic common support")
Cc: stable@vger.kernel.org
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn>
Link: https://patch.msgid.link/20260520070323.2762379-1-dawei.feng@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:15:00 -07:00
Aditya Garg b809d04099 net: mana: validate rx_req_idx to prevent out-of-bounds array access
In mana_hwc_rx_event_handler(), rx_req_idx is derived from
sge->address in DMA-coherent memory. In Confidential VMs
(SEV-SNP/TDX), this memory is shared unencrypted and HW can modify
WQE contents at any time. No bounds check exists on rx_req_idx,
which can lead to an out-of-bounds access into reqs[].

Add bounds check on rx_req_idx in mana_hwc_rx_event_handler() before
using it to index the reqs[] array.

Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260520051553.857120-1-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:14:11 -07:00
Ratheesh Kannoth 9eddc819f0 octeontx2-af: npc: Fix allmulticast skip logic for LBK and SDP VFs
When installing the allmulticast NPC rule, rvu_npc_install_allmulti_entry()
should skip LBK and SDP VFs (only CGX PF/VF may add the entry).  The
code combined is_lbk_vf() and is_sdp_vf() with logical AND, which is
never true for a single pcifunc, so the intended early return never ran.

Use logical OR instead.

Cc: Geetha sowjanya <gakula@marvell.com>
Fixes: ae703539f4 ("octeontx2-af: Cleanup loopback device checks")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260520043036.1523798-1-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:12:59 -07:00
Zhang Cen c367b90821 netpoll: normalize skb->dev to the netpoll device
__netpoll_send_skb() always transmits through np->dev and queues busy
packets on np->dev->npinfo->txq, but it leaves skb->dev unchanged.
Stacked callers such as DSA and macvlan can reach netpoll with skb->dev
still naming the upper device while np->dev is the lower device that
owns the netpoll state.

If the skb has to be deferred, queue_process() later dequeues it from
the lower device's txq but retries it through skb->dev. That can
re-enter the upper ndo_start_xmit path on an already transformed skb,
and if the upper device disappears before the lower txq drains the
workqueue can dereference a stale skb->dev pointer.

The buggy scenario involves two paths, with each column showing the
order within that path:

path A label: netpoll enqueue path   path B label: upper-device teardown
1. Stacked xmit calls netpoll        1. Teardown unregisters the upper
   with lower np->dev and upper         net_device while lower npinfo
   skb->dev.                            stays alive.
2. __netpoll_send_skb() uses         2. netdev_release() runs for the
   np->dev->npinfo as the txq           upper net_device.
   owner.
3. Busy transmit queues the skb      3. The lower txq still owns the
   on that lower txq with upper         deferred skb.
   skb->dev.
4. queue_process() drains the        4. queue_process() dereferences
   lower txq and reads skb->dev.        that stale upper skb->dev.

Normalize skb->dev to np->dev after loading np->dev from the netpoll
instance, before either the direct transmit path or the fallback enqueue.
This keeps the queued skb in the same device and txq domain as the
netpoll state that owns it.

KASAN report as below:

KASAN slab-use-after-free in queue_process+0x7c/0x480
Workqueue: events queue_process
The buggy address belongs to the object at ffff88810906c000 which belongs
to the cache kmalloc-4k of size 4096
The buggy address is located 168 bytes inside of freed 4096-byte region
[ffff88810906c000, ffff88810906d000)
Read of size 8
Call trace:
  dump_stack_lvl+0x73/0xb0 (?:?)
  print_report+0xd1/0x620 (?:?)
  srso_alias_return_thunk+0x5/0xfbef5 (?:?)
  __virt_addr_valid+0x215/0x420 (?:?)
  kasan_complete_mode_report_info+0x64/0x200 (?:?)
  kasan_report+0xf7/0x130 (?:?)
  queue_process+0x7c/0x480 (net/core/netpoll.c:88)
  kasan_check_range+0x10c/0x1c0 (?:?)
  __kasan_check_read+0x15/0x20 (?:?)
  process_one_work+0x8b7/0x1af0 (kernel/workqueue.c:3200)
  assign_work+0x170/0x3f0 (?:?)
  worker_thread+0x574/0xf10 (?:?)
  _raw_spin_unlock_irqrestore+0x4b/0x60 (?:?)
  trace_hardirqs_on+0x2a/0x180 (?:?)
  kthread+0x2fc/0x3f0 (?:?)
  ret_from_fork+0x58b/0x830 (?:?)
  __switch_to+0x58e/0xe90 (?:?)
  __switch_to_asm+0x39/0x70 (?:?)
  ret_from_fork_asm+0x1a/0x30 (?:?)
Freed by task stack:
  kasan_save_stack+0x3d/0x60 (?:?)
  kasan_save_track+0x18/0x40 (?:?)
  kasan_save_free_info+0x3f/0x60 (?:?)
  __kasan_slab_free+0x48/0x70 (?:?)
  kfree+0x20e/0x4e0 (?:?)
  kvfree+0x31/0x40 (?:?)
  netdev_release+0x71/0x90 (net/core/net-sysfs.c:2227)
  device_release+0xd2/0x250 (?:?)
  kobject_put+0x181/0x4c0 (lib/kobject.c:730)
  netdev_run_todo+0x700/0x1000 (net/core/dev.c:11666)
  rtnl_dellink+0x396/0xc00 (net/core/rtnetlink.c:3558)
  rtnetlink_rcv_msg+0x740/0xc20 (net/core/rtnetlink.c:6897)
  netlink_rcv_skb+0x147/0x3a0 (?:?)
  rtnetlink_rcv+0x19/0x20 (net/core/rtnetlink.c:7021)
  netlink_unicast+0x4d1/0x830 (net/netlink/af_netlink.c:1327)
  netlink_sendmsg+0x840/0xe10 (net/netlink/af_netlink.c:1812)
  ____sys_sendmsg+0x8a7/0xb50 (?:?)
  ___sys_sendmsg+0x104/0x190 (?:?)
  __sys_sendmsg+0x135/0x1d0 (?:?)
  __x64_sys_sendmsg+0x7b/0xc0 (?:?)
  x64_sys_call+0x205c/0x2130 (?:?)
  do_syscall_64+0x115/0x6a0 (arch/x86/entry/syscall_64.c:87)
  entry_SYSCALL_64_after_hwframe+0x77/0x7f (?:?)

Fixes: 5de4a473bd ("netpoll queue cleanup")
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Link: https://patch.msgid.link/20260519104647.3517990-1-rollkingzzc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:10:18 -07:00
Abdun Nihaal c5d93b2c40 net: wwan: iosm: fix potential memory leaks in ipc_imem_init()
The memory allocated in ipc_protocol_init() is not freed on the error
paths that follow in ipc_imem_init(). Fix that by calling the
corresponding release function ipc_protocol_deinit() in the error path.

Fixes: 3670970dd8 ("net: iosm: shared memory IPC interface")
Cc: stable@vger.kernel.org
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20260519062815.55545-1-nihaal@cse.iitm.ac.in
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 08:06:02 -07:00
Michael Grzeschik 85fac50b58 MAINTAINERS: Update address for Michael Grzeschik
Since I am moving from Pengutronix update my email address for the
ARCNET subsystems to point to my kernel.org address.

Also update .mailmap.

Signed-off-by: Michael Grzeschik <mgr@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Markus Schneider-Pargmann <mail@markussp.com>
Link: https://patch.msgid.link/20260521-maintainer-v1-1-29b5e106682d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 07:47:49 -07:00
Jakub Kicinski 099258bde1 MAINTAINERS: add missing entry for Bluetooth include files
We X-out net/bluetooth/ from "NETWORKING [GENERAL]" so that only
the dedicated list is CCed on patches, and networking gets them
once already processed by Luiz. We missed include/net/bluetooth.

Link: https://patch.msgid.link/20260521004151.625049-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 07:47:40 -07:00
Nimrod Oren dfc0770433 selftests: net: Fix checksums in xdp_native
Data adjustment cases failed with "Data exchange failed" when using IPv4
because the program did not update the IP and UDP checksums in the IPv4
branch. The issue was masked when both IPv4 and IPv6 were configured,
since the test harness prefers IPv6.

While here, generalize csum_fold_helper() to fold twice so it works for
any 32-bit input.

Fixes: 0b65cfcef9 ("selftests: drv-net: Test tail-adjustment support")
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20260520153928.3371765-1-noren@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 07:47:00 -07:00
Yuho Choi 1341db3224 ipv6: route: Unregister netdevice notifier on BPF init failure
ip6_route_init() registers ip6_route_dev_notifier before registering the
IPv6 route BPF iterator target. If bpf_iter_register() fails after the
notifier has been registered, the error path currently jumps to
out_register_late_subsys and unwinds the RTNL handlers and pernet route
state without removing the notifier from the netdevice notifier chain.

This leaves ip6_route_dev_notify() callable after the IPv6 route state it
uses has been torn down. Add a separate unwind label for the BPF iterator
failure path and unregister the netdevice notifier before continuing with
the existing cleanup.

Fixes: 138d0be35b ("net: bpf: Add netlink and ipv6_route bpf_iter targets")
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260520030329.1061183-1-dbgh9129@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21 07:43:15 -07:00