mirror-linux/net/netfilter
Florian Westphal b2f742c846 netfilter: nf_tables: restart set lookup on base_seq change
The hash, hash_fast, rhash and bitwise sets may indicate no result even
though a matching element exists during a short time window while other
cpu is finalizing the transaction.

This happens when the hash lookup/bitwise lookup function has picked up
the old genbit, right before it was toggled by nf_tables_commit(), but
then the same cpu managed to unlink the matching old element from the
hash table:

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:
					A) observes old genbit
   increments base_seq
I) increments the genbit
II) removes old element from the set
					B) finds matching element
					C) returns no match: found
					element is not valid in old
					generation

					Next lookup observes new genbit and
					finds matching e2.

Consider a packet matching element e1, e2.

cpu0 processes following transaction:
1. remove e1
2. adds e2, which has same key as e1.

P matches both e1 and e2.  Therefore, cpu1 should always find a match
for P. Due to above race, this is not the case:

cpu1 observed the old genbit.  e2 will not be considered once it is found.
The element e1 is not found anymore if cpu0 managed to unlink it from the
hlist before cpu1 found it during list traversal.

The situation only occurs for a brief time period, lookups happening
after I) observe new genbit and return e2.

This problem exists in all set types except nft_set_pipapo, so fix it once
in nft_lookup rather than each set ops individually.

Sample the base sequence counter, which gets incremented right before the
genbit is changed.

Then, if no match is found, retry the lookup if the base sequence was
altered in between.

If the base sequence hasn't changed:
 - No update took place: no-match result is expected.
   This is the common case.  or:
 - nf_tables_commit() hasn't progressed to genbit update yet.
   Old elements were still visible and nomatch result is expected, or:
 - nf_tables_commit updated the genbit:
   We picked up the new base_seq, so the lookup function also picked
   up the new genbit, no-match result is expected.

If the old genbit was observed, then nft_lookup also picked up the old
base_seq: nft_lookup_should_retry() returns true and relookup is performed
in the new generation.

This problem was added when the unconditional synchronize_rcu() call
that followed the current/next generation bit toggle was removed.

Thanks to Pablo Neira Ayuso for reviewing an earlier version of this
patchset, for suggesting re-use of existing base_seq and placement of
the restart loop in nft_set_do_lookup().

Fixes: 0cbc06b3fa ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-10 20:30:37 +02:00
..
ipset treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
ipvs ipvs: Fix estimator kthreads preferred affinity 2025-08-13 08:34:33 +02:00
Kconfig netfilter: Exclude LEGACY TABLES on PREEMPT_RT. 2025-07-25 18:38:50 +02:00
Makefile netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
core.c netfilter: nf_dup{4, 6}: Move duplication check to task_struct 2025-05-23 13:57:12 +02:00
nf_bpf_link.c bpf: Check netfilter ctx accesses are aligned 2025-08-01 09:22:44 -07:00
nf_conncount.c netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree() 2025-03-12 15:28:33 +01:00
nf_conntrack_acct.c
nf_conntrack_amanda.c
nf_conntrack_bpf.c
nf_conntrack_broadcast.c
nf_conntrack_core.c netfilter: conntrack: Remove unused net in nf_conntrack_double_lock() 2025-07-25 18:38:41 +02:00
nf_conntrack_ecache.c
nf_conntrack_expect.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
nf_conntrack_extend.c
nf_conntrack_ftp.c
nf_conntrack_h323_asn1.c
nf_conntrack_h323_main.c
nf_conntrack_h323_types.c
nf_conntrack_helper.c netfilter: conntrack: helper: Replace -EEXIST by -EBUSY 2025-08-27 11:53:38 +02:00
nf_conntrack_irc.c
nf_conntrack_labels.c
nf_conntrack_netbios_ns.c
nf_conntrack_netlink.c netfilter: ctnetlink: remove refcounting in expectation dumpers 2025-08-07 13:19:26 +02:00
nf_conntrack_ovs.c
nf_conntrack_pptp.c
nf_conntrack_proto.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_conntrack_proto_generic.c
nf_conntrack_proto_gre.c
nf_conntrack_proto_icmp.c
nf_conntrack_proto_icmpv6.c
nf_conntrack_proto_sctp.c
nf_conntrack_proto_tcp.c
nf_conntrack_proto_udp.c
nf_conntrack_sane.c
nf_conntrack_seqadj.c
nf_conntrack_sip.c
nf_conntrack_snmp.c
nf_conntrack_standalone.c netfilter: conntrack: clean up returns in nf_conntrack_log_invalid_sysctl() 2025-08-07 13:19:26 +02:00
nf_conntrack_tftp.c
nf_conntrack_timeout.c
nf_conntrack_timestamp.c
nf_dup_netdev.c netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit 2025-05-23 13:57:12 +02:00
nf_flow_table_bpf.c
nf_flow_table_core.c netfilter: conntrack: fix erronous removal of offload bit 2025-04-17 11:14:22 +02:00
nf_flow_table_inet.c
nf_flow_table_ip.c
nf_flow_table_offload.c
nf_flow_table_procfs.c
nf_flow_table_xdp.c
nf_hooks_lwtunnel.c
nf_internals.h
nf_log.c netfilter: load nf_log_syslog on enabling nf_conntrack_log_invalid 2025-07-25 18:35:41 +02:00
nf_log_syslog.c tcp: extend TCP flags to allow AE bit/ACE field 2025-03-17 13:49:46 +00:00
nf_nat_amanda.c
nf_nat_bpf.c
nf_nat_core.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_nat_ftp.c
nf_nat_helper.c
nf_nat_irc.c
nf_nat_masquerade.c
nf_nat_ovs.c
nf_nat_proto.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_nat_redirect.c
nf_nat_sip.c
nf_nat_tftp.c
nf_queue.c
nf_sockopt.c
nf_synproxy_core.c
nf_tables_api.c netfilter: nf_tables: restart set lookup on base_seq change 2025-09-10 20:30:37 +02:00
nf_tables_core.c netfilter: nf_tables: Only use nf_skip_indirect_calls() when MITIGATION_RETPOLINE 2025-03-23 10:53:47 +01:00
nf_tables_offload.c netfilter: nf_tables: Have a list of nf_hook_ops in nft_hook 2025-05-23 13:57:13 +02:00
nf_tables_trace.c netfilter: nf_tables: hide clash bit from userspace 2025-07-14 15:22:35 +02:00
nfnetlink.c Revert "netfilter: nf_tables: Add notifications for hook changes" 2025-07-14 15:22:47 +02:00
nfnetlink_acct.c
nfnetlink_cthelper.c
nfnetlink_cttimeout.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nfnetlink_hook.c netfilter: nfnetlink_hook: Dump flowtable info 2025-07-25 18:40:01 +02:00
nfnetlink_log.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
nfnetlink_osf.c
nfnetlink_queue.c netfilter: nfnetlink_queue: Initialize ctx to avoid memory allocation error 2025-03-23 10:20:33 +01:00
nft_bitwise.c
nft_byteorder.c
nft_chain_filter.c Revert "netfilter: nf_tables: Add notifications for hook changes" 2025-07-14 15:22:47 +02:00
nft_chain_nat.c
nft_chain_route.c
nft_cmp.c
nft_compat.c
nft_connlimit.c
nft_counter.c
nft_ct.c
nft_ct_fast.c
nft_dup_netdev.c
nft_dynset.c netfilter: nft_set: remove indirection from update API call 2025-07-25 18:40:23 +02:00
nft_exthdr.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nft_fib.c
nft_fib_inet.c
nft_fib_netdev.c
nft_flow_offload.c netfilter: nf_tables: Introduce nft_hook_find_ops{,_rcu}() 2025-05-23 13:57:12 +02:00
nft_fwd_netdev.c
nft_hash.c
nft_immediate.c
nft_inner.c netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx 2025-05-23 13:57:12 +02:00
nft_last.c
nft_limit.c
nft_log.c
nft_lookup.c netfilter: nf_tables: restart set lookup on base_seq change 2025-09-10 20:30:37 +02:00
nft_masq.c
nft_meta.c
nft_nat.c
nft_numgen.c
nft_objref.c netfilter: nft_set: remove one argument from lookup and update functions 2025-07-25 18:40:16 +02:00
nft_osf.c
nft_payload.c
nft_queue.c
nft_quota.c netfilter: nft_quota: match correctly when the quota just depleted 2025-05-05 13:15:09 +02:00
nft_range.c
nft_redir.c
nft_reject.c
nft_reject_inet.c
nft_reject_netdev.c
nft_rt.c
nft_set_bitmap.c netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation 2025-09-10 20:28:24 +02:00
nft_set_hash.c netfilter: nft_set: remove indirection from update API call 2025-07-25 18:40:23 +02:00
nft_set_pipapo.c netfilter: nft_set_pipapo: don't check genbit from packetpath lookups 2025-09-10 20:30:37 +02:00
nft_set_pipapo.h
nft_set_pipapo_avx2.c netfilter: nft_set_pipapo: don't check genbit from packetpath lookups 2025-09-10 20:30:37 +02:00
nft_set_pipapo_avx2.h
nft_set_rbtree.c netfilter: nft_set_rbtree: continue traversal if element is inactive 2025-09-10 20:30:37 +02:00
nft_socket.c netfilter: nft_socket: remove WARN_ON_ONCE with huge level value 2025-08-07 13:19:26 +02:00
nft_synproxy.c
nft_tproxy.c
nft_tunnel.c netfilter: nft_tunnel: fix geneve_opt dump 2025-05-23 13:57:12 +02:00
nft_xfrm.c
utils.c
x_tables.c netfilter: Exclude LEGACY TABLES on PREEMPT_RT. 2025-07-25 18:38:50 +02:00
xt_AUDIT.c
xt_CHECKSUM.c
xt_CLASSIFY.c
xt_CONNSECMARK.c
xt_CT.c
xt_DSCP.c
xt_HL.c
xt_HMARK.c
xt_IDLETIMER.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
xt_LED.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
xt_LOG.c
xt_MASQUERADE.c
xt_NETMAP.c
xt_NFLOG.c
xt_NFQUEUE.c
xt_RATEEST.c
xt_REDIRECT.c
xt_SECMARK.c
xt_TCPMSS.c
xt_TCPOPTSTRIP.c netfilter: xtables: support arpt_mark and ipv6 optstrip for iptables-nft only builds 2025-05-22 17:16:02 +02:00
xt_TEE.c
xt_TPROXY.c
xt_TRACE.c
xt_addrtype.c
xt_bpf.c
xt_cgroup.c net: cgroup: Guard users of sock_cgroup_classid() 2025-04-24 16:04:02 +02:00
xt_cluster.c
xt_comment.c
xt_connbytes.c
xt_connlabel.c
xt_connlimit.c
xt_connmark.c
xt_conntrack.c
xt_cpu.c
xt_dccp.c
xt_devgroup.c
xt_dscp.c
xt_ecn.c
xt_esp.c
xt_hashlimit.c netfilter: xt_hashlimit: replace vmalloc calls with kvmalloc 2025-03-12 16:37:48 +01:00
xt_helper.c
xt_hl.c
xt_ipcomp.c
xt_iprange.c
xt_ipvs.c
xt_l2tp.c
xt_length.c
xt_limit.c
xt_mac.c
xt_mark.c netfilter: xtables: support arpt_mark and ipv6 optstrip for iptables-nft only builds 2025-05-22 17:16:02 +02:00
xt_multiport.c
xt_nat.c
xt_nfacct.c netfilter: xt_nfacct: don't assume acct name is null-terminated 2025-07-25 18:40:43 +02:00
xt_osf.c
xt_owner.c
xt_physdev.c
xt_pkttype.c
xt_policy.c
xt_quota.c
xt_rateest.c
xt_realm.c
xt_recent.c
xt_repldata.h netfilter: xtables: Use strscpy() instead of strscpy_pad() 2025-03-23 10:53:47 +01:00
xt_sctp.c
xt_set.c
xt_socket.c
xt_state.c
xt_statistic.c
xt_string.c
xt_tcpmss.c
xt_tcpudp.c
xt_time.c
xt_u32.c