mirror-linux/net/core
John Fastabend 90d1f74c3c bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
[ Upstream commit 8866730aed ]

AF_UNIX stream sockets are paired sockets, so sending on one of the pair
looks up the peer socket as part of the send operation. It is possible,
however, to put just one of the pair in a BPF map. This currently increments
the refcnt on the sock in the sockmap to ensure it is not free'd by the
stack before sockmap cleans up its state and stops any skbs being sent/recv'd
to that socket.

But we missed a case. If the peer socket is closed it will be free'd by the
stack. However, the paired socket can still be referenced from the BPF sockmap
side because we hold a reference there. Then, if we are sending traffic through
BPF sockmap to that socket, it will try to dereference the free'd pair in its
send logic, creating a use-after-free and the following splat:

   [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
   [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
   [...]
   [59.905468] Call Trace:
   [59.905787]  <TASK>
   [59.906066]  dump_stack_lvl+0x130/0x1d0
   [59.908877]  print_report+0x16f/0x740
   [59.910629]  kasan_report+0x118/0x160
   [59.912576]  sk_wake_async+0x31/0x1b0
   [59.913554]  sock_def_readable+0x156/0x2a0
   [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
   [59.916398]  sock_sendmsg+0x20e/0x250
   [59.916854]  skb_send_sock+0x236/0xac0
   [59.920527]  sk_psock_backlog+0x287/0xaa0

To fix this, let BPF sockmap hold a refcnt on both the socket in the sockmap and
its paired socket. It wasn't obvious how to contain the fix to bpf_unix logic.
The primary problem with keeping this logic in bpf_unix was: in the sock close()
we could handle the deref by having a close handler. But when we are destroying
the psock through a map delete operation, we wouldn't have gotten any signal
through the proto struct other than it being replaced. If we do the deref from
the proto replace it's too early, because we need to deref the sk_pair after the
backlog worker has been stopped.

Given all this it seems best to just cache it at the end of the psock and eat 8B
for the af_unix and vsock users. Notice dgram sockets are OK because they handle
locking already.
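
The ordering described above can be illustrated with a small user-space
model. This is only a sketch, not the kernel patch: every toy_* name below is
hypothetical, the refcount is a plain int rather than the kernel's refcount_t,
and a "freed" flag stands in for actually releasing memory so the state can be
inspected. It shows the psock taking a reference on both the socket and its
peer, and dropping the cached peer reference only after the backlog worker
has been stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sock: a refcount and a pointer to the peer. */
struct toy_sock {
	int refcnt;
	bool freed;              /* set instead of calling free() */
	struct toy_sock *peer;
};

/* Toy stand-in for the psock, with the peer cached at the end. */
struct toy_psock {
	struct toy_sock *sk;
	bool backlog_running;
	struct toy_sock *sk_pair; /* cached peer, holds its own reference */
};

static void toy_sock_hold(struct toy_sock *sk)
{
	assert(!sk->freed);
	sk->refcnt++;
}

static void toy_sock_put(struct toy_sock *sk)
{
	assert(!sk->freed && sk->refcnt > 0);
	if (--sk->refcnt == 0)
		sk->freed = true;
}

/* Adding a socket to the map: hold the socket and, if present, its peer. */
static void toy_psock_init(struct toy_psock *psock, struct toy_sock *sk)
{
	psock->sk = sk;
	psock->sk_pair = NULL;
	psock->backlog_running = true;
	toy_sock_hold(sk);
	if (sk->peer) {
		toy_sock_hold(sk->peer);
		psock->sk_pair = sk->peer;
	}
}

/* Backlog send path: it touches the peer, so the peer must still be alive. */
static bool toy_backlog_send(struct toy_psock *psock)
{
	if (!psock->backlog_running || !psock->sk->peer)
		return false;
	return !psock->sk->peer->freed; /* true: dereference is safe */
}

/* Teardown: stop the backlog first, only then drop the peer reference. */
static void toy_psock_destroy(struct toy_psock *psock)
{
	psock->backlog_running = false;
	if (psock->sk_pair) {
		toy_sock_put(psock->sk_pair);
		psock->sk_pair = NULL;
	}
	toy_sock_put(psock->sk);
	psock->sk = NULL;
}
```

In this model, closing the peer while the socket is still in the map only
drops the peer's refcount from 2 to 1, so a backlog send still sees a live
peer. The peer is released in toy_psock_destroy(), after the backlog flag has
been cleared.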

Fixes: 94531cfcbe ("af_unix: Add unix_stream_proto for sockmap")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20231129012557.95371-2-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
..
Makefile devlink: move code to a dedicated directory 2023-08-30 16:11:00 +02:00
bpf_sk_storage.c bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing 2023-08-11 12:08:12 +02:00
datagram.c net: datagram: fix data-races in datagram_poll() 2023-05-24 17:32:32 +01:00
dev.c net: check dev->gso_max_size in gso_features_check() 2024-01-01 12:38:58 +00:00
dev.h net: check for altname conflicts when changing netdev's netns 2023-10-25 12:03:08 +02:00
dev_addr_lists.c
dev_addr_lists_test.c
dev_ioctl.c
drop_monitor.c drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group 2023-12-13 18:39:12 +01:00
dst.c
dst_cache.c
failover.c
fib_notifier.c
fib_rules.c
filter.c bpf: sockmap, updating the sg structure should also update curr 2023-12-13 18:39:11 +01:00
flow_dissector.c net/core: Fix ETH_P_1588 flow dissector 2023-10-06 14:56:36 +02:00
flow_offload.c
gen_estimator.c
gen_stats.c
gro.c skb: Do mix page pool and page referenced frags in GRO 2023-02-09 11:28:05 +01:00
gro_cells.c net: drop the weight argument from netif_napi_add 2022-09-28 18:57:14 -07:00
hwbm.c
link_watch.c
lwt_bpf.c lwt: Fix return values of BPF xmit ops 2023-09-13 09:42:33 +02:00
lwtunnel.c xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available 2022-10-12 10:45:51 +02:00
neighbour.c neighbour: fix various data-races 2023-11-02 09:35:27 +01:00
net-procfs.c
net-sysfs.c net-sysfs: Convert to use sysfs_emit() APIs 2022-09-30 12:27:44 +01:00
net-sysfs.h
net-traces.c
net_namespace.c net: fix UaF in netns ops registration error path 2023-02-01 08:34:43 +01:00
netclassid_cgroup.c
netevent.c
netpoll.c net: don't let netpoll invoke NAPI if in xmit context 2023-04-13 16:55:21 +02:00
netprio_cgroup.c
of_net.c
page_pool.c net: page_pool: add missing free_percpu when page_pool_init fail 2023-11-20 11:52:16 +01:00
pktgen.c net: pktgen: Fix interface flags printing 2023-10-25 12:03:08 +02:00
ptp_classifier.c
request_sock.c
rtnetlink.c netlink: Correct offload_xstats size 2023-10-25 12:03:07 +02:00
scm.c io_uring/af_unix: disable sending io_uring over sockets 2023-12-13 18:39:17 +01:00
secure_seq.c
selftests.c
skbuff.c net: annotate data-races around sk->sk_tsflags 2024-01-10 17:10:23 +01:00
skmsg.c bpf, sockmap: af_unix stream sockets need to hold ref for pair sock 2024-01-10 17:10:32 +01:00
sock.c net: Implement missing SO_TIMESTAMPING_NEW cmsg support 2024-01-10 17:10:26 +01:00
sock_destructor.h
sock_diag.c
sock_map.c bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets 2023-10-10 22:00:41 +02:00
sock_reuseport.c soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-12-31 13:32:04 +01:00
stream.c net: Return error from sk_stream_wait_connect() if sk_wait_event() fails 2024-01-01 12:38:56 +00:00
sysctl_net_core.c
timestamping.c
tso.c
utils.c
xdp.c xdp: improve page_pool xdp_return performance 2022-09-26 11:28:19 -07:00