Patch series "mm: In-kernel support for memory-deny-write-execute (MDWE)",
v2.
The background to this is that systemd has a configuration option called
MemoryDenyWriteExecute [2], implemented as a SECCOMP BPF filter. Its aim
is to prevent a user task from inadvertently creating an executable
mapping that is (or was) writeable. Since such BPF filter is stateless,
it cannot detect mappings that were previously writeable but subsequently
changed to read-only. Therefore the filter simply rejects any
mprotect(PROT_EXEC). The side-effect is that on arm64 with BTI support
(Branch Target Identification), the dynamic loader cannot change an ELF
section from PROT_EXEC to PROT_EXEC|PROT_BTI using mprotect(). For
libraries, it can resort to unmapping and re-mapping but for the main
executable it does not have a file descriptor. The original bug report in
the Red Hat bugzilla - [3] - and subsequent glibc workaround for libraries
- [4].
This series adds in-kernel support for this feature as a prctl
PR_SET_MDWE, that is inherited on fork(). The prctl denies PROT_WRITE |
PROT_EXEC mappings. Like the systemd BPF filter it also denies adding
PROT_EXEC to mappings. However unlike the BPF filter it only denies it if
the mapping didn't previous have PROT_EXEC. This allows to PROT_EXEC ->
PROT_EXEC | PROT_BTI with mprotect(), which is a problem with the BPF
filter.
This patch (of 2):
The aim of such policy is to prevent a user task from creating an
executable mapping that is also writeable.
An example of mmap() returning -EACCESS if the policy is enabled:
mmap(0, size, PROT_READ | PROT_WRITE | PROT_EXEC, flags, 0, 0);
Similarly, mprotect() would return -EACCESS below:
addr = mmap(0, size, PROT_READ | PROT_EXEC, flags, 0, 0);
mprotect(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC);
The BPF filter that systemd MDWE uses is stateless, and disallows
mprotect() with PROT_EXEC completely. This new prctl allows PROT_EXEC to
be enabled if it was already PROT_EXEC, which allows the following case:
addr = mmap(0, size, PROT_READ | PROT_EXEC, flags, 0, 0);
mprotect(addr, size, PROT_READ | PROT_EXEC | PROT_BTI);
where PROT_BTI enables branch tracking identification on arm64.
Link: https://lkml.kernel.org/r/20230119160344.54358-1-joey.gouly@arm.com
Link: https://lkml.kernel.org/r/20230119160344.54358-2-joey.gouly@arm.com
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: nd <nd@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Topi Miettinen <toiwoton@gmail.com>
Cc: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a Netlink spec-compatible family for netdevs.
This is a very simple implementation without much
thought going into it.
It allows us to reap all the benefits of Netlink specs,
one can use the generic client to issue the commands:
$ ./cli.py --spec netdev.yaml --dump dev_get
[{'ifindex': 1, 'xdp-features': set()},
{'ifindex': 2, 'xdp-features': {'basic', 'ndo-xmit', 'redirect'}},
{'ifindex': 3, 'xdp-features': {'rx-sg'}}]
the generic python library does not have flags-by-name
support, yet, but we also don't have to carry strings
in the messages, as user space can get the names from
the spec.
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/327ad9c9868becbe1e601b580c962549c8cd81f2.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch introduces gso_ipv4_max_size and gro_ipv4_max_size
per device and adds netlink attributes for them, so that IPV4
BIG TCP can be guarded by a separate tunable in the next patch.
To not break the old application using "gso/gro_max_size" for
IPv4 GSO packets, this patch updates "gso/gro_ipv4_max_size"
in netif_set_gso/gro_max_size() if the new size isn't greater
than GSO_LEGACY_MAX_SIZE, so that nothing will change even if
userspace doesn't realize the new netlink attributes.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce TP_STATUS_GSO_TCP tp_status flag to tell the af_packet user
that this is a TCP GSO packet. When parsing IPv4 BIG TCP packets in
tcpdump/libpcap, it can use tp_len as the IPv4 packet len when this
flag is set, as iph tot_len is set to 0 for IPv4 BIG TCP packets.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The guid_t type and respective macros are being used internally only.
The uuid_le has its user outside the kernel. Decouple these types and
macros, and make guid_t completely internal type to the kernel.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230124133838.22645-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge tag 'v6.2-rc6' into sched/core, to pick up fixes
Pick up fixes before merging another batch of cpuidle updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge v6.2-rc6 into drm-next
Due to holidays we started -next with more -fixes in-flight than
usual, and people have been asking where they are. Backmerge to get
things better in sync.
Conflicts:
- Tiny conflict in drm_fbdev_generic.c between variable rename and
missing error handling that got added.
- Conflict in drm_fb_helper.c between the added call to vgaswitcheroo
in drm_fb_helper_single_fb_probe and a refactor patch that extracted
lots of helpers and incidentally removed the dev local variable.
Readd it to make things compile.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
iJgEABYKAEAWIQTAnvhxs4J7QT+XHKnMPy2AAyfeZAUCY8R0CCIcbGF1cmVudC5w
aW5jaGFydEBpZGVhc29uYm9hcmQuY29tAAoJEMw/LYADJ95k3XwBAMn3MAcX185i
DNPc1tWKCA0+j7eLqgCrNHOQgqYSV7ZOAQDaoDt/drJffp8S9TdCpP1DhT9wcSIB
i6i6j7mlzGsqBw==
=yAlT
-----END PGP SIGNATURE-----
Merge tag 'media-uvc-next-20230115' of git://git.kernel.org/pub/scm/linux/kernel/git/pinchartl/linux into usb-next
Merge in this tag from the media tree so that future USB uvc patches
will apply properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'media-uvc-next-20230115' of git://git.kernel.org/pub/scm/linux/kernel/git/pinchartl/linux: (27 commits)
media: uvcvideo: Silence memcpy() run-time false positive warnings
media: uvcvideo: Quirk for autosuspend in Logitech B910 and C910
media: uvcvideo: Fix race condition with usb_kill_urb
media: uvcvideo: Use standard names for menus
media: uvcvideo: Fix power line control for Lenovo Integrated Camera
media: uvcvideo: Refactor power_line_frequency_controls_limited
media: uvcvideo: Refactor uvc_ctrl_mappings_uvcXX
media: uvcvideo: Implement mask for V4L2_CTRL_TYPE_MENU
media: uvcvideo: Extend documentation of uvc_video_clock_decode()
media: uvcvideo: Limit power line control for Acer EasyCamera
media: uvcvideo: Refactor __uvc_ctrl_add_mapping
media: uvcvideo: Fix handling on Bitmask controls
media: uvcvideo: Do not return positive errors in uvc_query_ctrl()
media: uvcvideo: Return -EACCES for Wrong state error
media: uvcvideo: Improve error logging in uvc_query_ctrl()
media: uvcvideo: Check for INACTIVE in uvc_ctrl_is_accessible()
media: uvcvideo: Factor out usb_string() calls
media: uvcvideo: Limit power line control for Acer EasyCamera
media: uvcvideo: Recover stalled ElGato devices
media: uvcvideo: Remove void casting for the status endpoint
...
- bump version strings, by Simon Wunderlich
- drop prandom.h includes, by Sven Eckelmann
- fix mailing list address, by Sven Eckelmann
- multicast feature preparation, by Linus Lüssing (2 patches)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmPTpRsWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZWoEAC7F92qmyNAye5ylC4O7XeG5KXL
tPHzOSp5n5Q98HOmLGrqgOTEgfoX8q/GkAQh4lauiG5LPhs8iOYfHhiz5u0rXXnv
Dm26PCLTKxOd+nif2H28rh4ahzgZUSizQgtg3S6tiYAvvTc8TM67UbJpPkU3sXf6
Xh9gT7W5YvZ8lvEQF/l/otdhuE3jpiBgjzyfDKTyLlVweWgli4XZSZ9BqIhv8c/O
gSa6vp72FGyJEZJCHyXh4jD1yNU/S+CRFpj6Gy1OsqLVMnNVlJviwEyVJ+ZuzSw2
KSOeeUXxN4XjeDulIOPgvvxFRsAPCeNoGqOzHtlLgEugyZGowKY9q5Bh33K63A7t
YzayKnfq2CKAVg9VujePurGkiMTY5gjrtuS7KlcWxXn30V8rry3bN6Y8XLphr80A
p1fVYjPQtFhbjjrrmsKbGSCmJiNo+jx2Rs6y1aQ86ykNVK3Et6rSUrISaMRjZEEG
OIMGOkzE0ESH2yK99sdaHOKiauZIMPMHTK7rB53cSMCL2eyP+6wP1jGQFwiytaA6
9GzMD8w8LVrJQc7PjkYP6f75iFKxtWzpf8RzwePhZFiRGVBROW0+6ECv6NAx2ImG
PoMYcqy0EV3ffwUhNnWtML+f4keqBaOKXa63GyWGBKymHLr5yLBzyvjgsr9jMJNb
3Sj/qf7JQy908jSxQQ==
=yGYm
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-pullrequest-20230127' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- drop prandom.h includes, by Sven Eckelmann
- fix mailing list address, by Sven Eckelmann
- multicast feature preparation, by Linus Lüssing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
unprivileged ublk device is helpful for container use case, such
as: ublk device created in one unprivileged container can be controlled
and accessed by this container only.
Implement this feature by adding flag of UBLK_F_UNPRIVILEGED_DEV, and if
this flag isn't set, any control command has been run from privileged
user. Otherwise, any control command can be sent from any unprivileged
user, but the user has to be permitted to access the ublk char device
to be controlled.
In case of UBLK_F_UNPRIVILEGED_DEV:
1) for command UBLK_CMD_ADD_DEV, it is always allowed, and user needs
to provide owner's uid/gid in this command, so that udev can set correct
ownership for the created ublk device, since the device owner uid/gid
can be queried via command of UBLK_CMD_GET_DEV_INFO.
2) for other control commands, they can only be run successfully if the
current user is allowed to access the specified ublk char device, for
running the permission check, path of the ublk char device has to be
provided by these commands.
Also add one control of command UBLK_CMD_GET_DEV_INFO2 which always
include the char dev path in payload since userspace may not have
knowledge if this device is created in unprivileged mode.
For applying this mechanism, system administrator needs to take
the following policies:
1) chmod 0666 /dev/ublk-control
2) change ownership of ublkcN & ublkbN
- chown owner_uid:owner_gid /dev/ublkcN
- chown owner_uid:owner_gid /dev/ublkbN
Both can be done via one simple udev rule.
Userspace:
https://github.com/ming1/ubdsrv/tree/unprivileged-ublk
'ublk add -t $TYPE --un_privileged=1' is for creating one un-privileged
ublk device if the user is un-privileged.
Link: https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Userspace side only knows device ID, but the associated path of ublkc* and
ublkb* could be changed by udev, and that depends on userspace's policy, so
add parameter of UBLK_PARAM_TYPE_DEVT for retrieving major/minor of the
ublkc* and ublkb*, then user may figure out major/minor of the ublk disks
he/she owns. With major/minor, it is easy to find the device node path.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch adds a new flag (IORING_MSG_RING_FLAGS_PASS) in the message
ring operations (IORING_OP_MSG_RING). This new flag enables the sender
to specify custom flags, which will be copied over to cqe->flags in the
receiving ring. These custom flags should be specified using the
sqe->file_index field.
This mechanism provides additional flexibility when sending messages
between rings.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20230103160507.617416-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RqJgAKCRDbK58LschI
gw2IAP9G5uhFO5abBzYLupp6SY3T5j97MUvPwLfFqUEt7EXmuwEA2lCUEWeW0KtR
QX+QmzCa6iHxrW7WzP4DUYLue//FJQY=
=yYqA
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2023-01-28
We've added 124 non-merge commits during the last 22 day(s) which contain
a total of 124 files changed, 6386 insertions(+), 1827 deletions(-).
The main changes are:
1) Implement XDP hints via kfuncs with initial support for RX hash and
timestamp metadata kfuncs, from Stanislav Fomichev and
Toke Høiland-Jørgensen.
Measurements on overhead: https://lore.kernel.org/bpf/875yellcx6.fsf@toke.dk
2) Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case, from Andrii Nakryiko.
3) Significantly reduce the search time for module symbols by livepatch
and BPF, from Jiri Olsa and Zhen Lei.
4) Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs
in different time intervals, from David Vernet.
5) Fix several issues in the dynptr processing such as stack slot liveness
propagation, missing checks for PTR_TO_STACK variable offset, etc,
from Kumar Kartikeya Dwivedi.
6) Various performance improvements, fixes, and introduction of more
than just one XDP program to XSK selftests, from Magnus Karlsson.
7) Big batch to BPF samples to reduce deprecated functionality,
from Daniel T. Lee.
8) Enable struct_ops programs to be sleepable in verifier,
from David Vernet.
9) Reduce pr_warn() noise on BTF mismatches when they are expected under
the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien.
10) Describe modulo and division by zero behavior of the BPF runtime
in BPF's instruction specification document, from Dave Thaler.
11) Several improvements to libbpf API documentation in libbpf.h,
from Grant Seltzer.
12) Improve resolve_btfids header dependencies related to subcmd and add
proper support for HOSTCC, from Ian Rogers.
13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room()
helper along with BPF selftests, from Ziyang Xuan.
14) Simplify the parsing logic of structure parameters for BPF trampoline
in the x86-64 JIT compiler, from Pu Lehui.
15) Get BTF working for kernels with CONFIG_RUST enabled by excluding
Rust compilation units with pahole, from Martin Rodriguez Reboredo.
16) Get bpf_setsockopt() working for kTLS on top of TCP sockets,
from Kui-Feng Lee.
17) Disable stack protection for BPF objects in bpftool given BPF backends
don't support it, from Holger Hoffstätte.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits)
selftest/bpf: Make crashes more debuggable in test_progs
libbpf: Add documentation to map pinning API functions
libbpf: Fix malformed documentation formatting
selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata
selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket.
bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().
bpf/selftests: Verify struct_ops prog sleepable behavior
bpf: Pass const struct bpf_prog * to .check_member
libbpf: Support sleepable struct_ops.s section
bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable
selftests/bpf: Fix vmtest static compilation error
tools/resolve_btfids: Alter how HOSTCC is forced
tools/resolve_btfids: Install subcmd headers
bpf/docs: Document the nocast aliasing behavior of ___init
bpf/docs: Document how nested trusted fields may be defined
bpf/docs: Document cpumask kfuncs in a new file
selftests/bpf: Add selftest suite for cpumask kfuncs
selftests/bpf: Add nested trust selftests suite
bpf: Enable cpumasks to be queried and used as kptrs
bpf: Disallow NULLable pointers for trusted kfuncs
...
====================
Link: https://lore.kernel.org/r/20230128004827.21371-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zero-length arrays are deprecated[1]. Replace struct io_uring_buf_ring's
"bufs" with a flexible array member. (How is the size of this array
verified?) Detected with GCC 13, using -fstrict-flex-arrays=3:
In function 'io_ring_buffer_select',
inlined from 'io_buffer_select' at io_uring/kbuf.c:183:10:
io_uring/kbuf.c:141:23: warning: array subscript 255 is outside the bounds of an interior zero-length array 'struct io_uring_buf[0]' [-Wzero-length-bounds]
141 | buf = &br->bufs[head];
| ^~~~~~~~~~~~~~~
In file included from include/linux/io_uring.h:7,
from io_uring/kbuf.c:10:
include/uapi/linux/io_uring.h: In function 'io_buffer_select':
include/uapi/linux/io_uring.h:628:41: note: while referencing 'bufs'
628 | struct io_uring_buf bufs[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: stable@vger.kernel.org
Cc: io-uring@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230105190507.gonna.131-kees@kernel.org
Permission to create an object (create, mkdir, symlink, mknod) needs to
take supplementary groups into account.
Add a supplementary group request extension. This can contain an arbitrary
number of group IDs and can be added to any request. This extension is not
added to any request by default.
Add FUSE_CREATE_SUPP_GROUP init flag to enable supplementary group info in
creation requests. This adds just a single supplementary group that
matches the parent group in the case described above. In other cases the
extension is not added.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Will need to add supplementary groups to create messages, so add the
general concept of a request extension. A request extension is appended to
the end of the main request. It has a header indicating the size and type
of the extension.
The create security context (fuse_secctx_*) is similar to the generic
request extension, so include that as well in a backward compatible manner.
Add the total extension length to the request header. The offset of the
extension block within the request can be calculated by:
inh->len - inh->total_extlen * 8
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
There are multiple ICMP rate limiting mechanisms:
* Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec
* v4 per-host limits: net.ipv4.icmp_ratelimit/ratemask
* v6 per-host limits: net.ipv6.icmp_ratelimit/ratemask
However, when ICMP output is limited, there is no way to tell
which limit has been hit or even if the limits are responsible
for the lack of ICMP output.
Add counters for each of the cases above. As we are within
local_bh_disable(), use the __INC stats variant.
Example output:
# nstat -sz "*RateLimit*"
IcmpOutRateLimitGlobal 134 0.0
IcmpOutRateLimitHost 770 0.0
Icmp6OutRateLimitHost 84 0.0
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Suggested-by: Abhishek Rawal <rawal.abhishek92@gmail.com>
Link: https://lore.kernel.org/r/273b32241e6b7fdc5c609e6f5ebc68caf3994342.1674605770.git.jamie.bainbridge@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Users who want to share a single public IP address for outgoing connections
between several hosts traditionally reach for SNAT. However, SNAT requires
state keeping on the node(s) performing the NAT.
A stateless alternative exists, where a single IP address used for egress
can be shared between several hosts by partitioning the available ephemeral
port range. In such a setup:
1. Each host gets assigned a disjoint range of ephemeral ports.
2. Applications open connections from the host-assigned port range.
3. Return traffic gets routed to the host based on both, the destination IP
and the destination port.
An application which wants to open an outgoing connection (connect) from a
given port range today can choose between two solutions:
1. Manually pick the source port by bind()'ing to it before connect()'ing
the socket.
This approach has a couple of downsides:
a) Search for a free port has to be implemented in the user-space. If
the chosen 4-tuple happens to be busy, the application needs to retry
from a different local port number.
Detecting if 4-tuple is busy can be either easy (TCP) or hard
(UDP). In TCP case, the application simply has to check if connect()
returned an error (EADDRNOTAVAIL). That is assuming that the local
port sharing was enabled (REUSEADDR) by all the sockets.
# Assume desired local port range is 60_000-60_511
s = socket(AF_INET, SOCK_STREAM)
s.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
s.bind(("192.0.2.1", 60_000))
s.connect(("1.1.1.1", 53))
# Fails only if 192.0.2.1:60000 -> 1.1.1.1:53 is busy
# Application must retry with another local port
In case of UDP, the network stack allows binding more than one socket
to the same 4-tuple, when local port sharing is enabled
(REUSEADDR). Hence detecting the conflict is much harder and involves
querying sock_diag and toggling the REUSEADDR flag [1].
b) For TCP, bind()-ing to a port within the ephemeral port range means
that no connecting sockets, that is those which leave it to the
network stack to find a free local port at connect() time, can use
the this port.
IOW, the bind hash bucket tb->fastreuse will be 0 or 1, and the port
will be skipped during the free port search at connect() time.
2. Isolate the app in a dedicated netns and use the use the per-netns
ip_local_port_range sysctl to adjust the ephemeral port range bounds.
The per-netns setting affects all sockets, so this approach can be used
only if:
- there is just one egress IP address, or
- the desired egress port range is the same for all egress IP addresses
used by the application.
For TCP, this approach avoids the downsides of (1). Free port search and
4-tuple conflict detection is done by the network stack:
system("sysctl -w net.ipv4.ip_local_port_range='60000 60511'")
s = socket(AF_INET, SOCK_STREAM)
s.setsockopt(SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1)
s.bind(("192.0.2.1", 0))
s.connect(("1.1.1.1", 53))
# Fails if all 4-tuples 192.0.2.1:60000-60511 -> 1.1.1.1:53 are busy
For UDP this approach has limited applicability. Setting the
IP_BIND_ADDRESS_NO_PORT socket option does not result in local source
port being shared with other connected UDP sockets.
Hence relying on the network stack to find a free source port, limits the
number of outgoing UDP flows from a single IP address down to the number
of available ephemeral ports.
To put it another way, partitioning the ephemeral port range between hosts
using the existing Linux networking API is cumbersome.
To address this use case, add a new socket option at the SOL_IP level,
named IP_LOCAL_PORT_RANGE. The new option can be used to clamp down the
ephemeral port range for each socket individually.
The option can be used only to narrow down the per-netns local port
range. If the per-socket range lies outside of the per-netns range, the
latter takes precedence.
UAPI-wise, the low and high range bounds are passed to the kernel as a pair
of u16 values in host byte order packed into a u32. This avoids pointer
passing.
PORT_LO = 40_000
PORT_HI = 40_511
s = socket(AF_INET, SOCK_STREAM)
v = struct.pack("I", PORT_HI << 16 | PORT_LO)
s.setsockopt(SOL_IP, IP_LOCAL_PORT_RANGE, v)
s.bind(("127.0.0.1", 0))
s.getsockname()
# Local address between ("127.0.0.1", 40_000) and ("127.0.0.1", 40_511),
# if there is a free port. EADDRINUSE otherwise.
[1] https://github.com/cloudflare/cloudflare-blog/blob/232b432c1d57/2022-02-connectx/connectx.py#L116
Reviewed-by: Marek Majkowski <marek@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building a list of filter events, it can sometimes be a challenge
to fit all the events needed to adequately restrict the guest into the
limited space available in the pmu event filter. This stems from the
fact that the pmu event filter requires each event (i.e. event select +
unit mask) be listed, when the intention might be to restrict the
event select all together, regardless of it's unit mask. Instead of
increasing the number of filter events in the pmu event filter, add a
new encoding that is able to do a more generalized match on the unit mask.
Introduce masked events as another encoding the pmu event filter
understands. Masked events has the fields: mask, match, and exclude.
When filtering based on these events, the mask is applied to the guest's
unit mask to see if it matches the match value (i.e. umask & mask ==
match). The exclude bit can then be used to exclude events from that
match. E.g. for a given event select, if it's easier to say which unit
mask values shouldn't be filtered, a masked event can be set up to match
all possible unit mask values, then another masked event can be set up to
match the unit mask values that shouldn't be filtered.
Userspace can query to see if this feature exists by looking for the
capability, KVM_CAP_PMU_EVENT_MASKED_EVENTS.
This feature is enabled by setting the flags field in the pmu event
filter to KVM_PMU_EVENT_FLAG_MASKED_EVENTS.
Events can be encoded by using KVM_PMU_ENCODE_MASKED_ENTRY().
It is an error to have a bit set outside the valid bits for a masked
event, and calls to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in
such cases, including the high bits of the event select (35:32) if
called on Intel.
With these updates the filter matching code has been updated to match on
a common event. Masked events were flexible enough to handle both event
types, so they were used as the common event. This changes how guest
events get filtered because regardless of the type of event used in the
uAPI, they will be converted to masked events. Because of this there
could be a slight performance hit because instead of matching the filter
event with a lookup on event select + unit mask, it does a lookup on event
select then walks the unit masks to find the match. This shouldn't be a
big problem because I would expect the set of common event selects to be
small, and if they aren't the set can likely be reduced by using masked
events to generalize the unit mask. Using one type of event when
filtering guest events allows for a common code path to be used.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20221220161236.555143-5-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Regenerate the FOU uAPI header from the YAML spec.
The flags now come before attributes which use them,
and the comments for type disappear (coders should look
at the spec instead).
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
An SCTP endpoint can start an association through a path and tear it
down over another one. That means the initial path will not see the
shutdown sequence, and the conntrack entry will remain in ESTABLISHED
state for 5 days.
By merging the HEARTBEAT_ACKED and ESTABLISHED states into one
ESTABLISHED state, there remains no difference between a primary or
secondary path. The timeout for the merged ESTABLISHED state is set to
210 seconds (hb_interval * max_path_retrans + rto_max). So, even if a
path doesn't see the shutdown sequence, it will expire in a reasonable
amount of time.
With this change in place, there is now more than one state from which
we can transition to ESTABLISHED, COOKIE_ECHOED and HEARTBEAT_SENT, so
handle the setting of ASSURED bit whenever a state change has happened
and the new state is ESTABLISHED. Removed the check for dir==REPLY since
the transition to ESTABLISHED can happen only in the reply direction.
Fixes: 9fb9cbb108 ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This reverts commit (bff3d0534804: "netfilter: conntrack: add sctp
DATA_SENT state")
Using DATA/SACK to detect a new connection on secondary/alternate paths
works only on new connections, while a HEARTBEAT is required on
connection re-use. It is probably consistent to wait for HEARTBEAT to
create a secondary connection in conntrack.
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
First set of patches for v6.3. The most important change here is that
the old Wireless Extension user space interface is not supported on
Wi-Fi 7 devices at all. We also added a warning if anyone with modern
drivers (ie. cfg80211 and mac80211 drivers) tries to use Wireless
Extensions, everyone should switch to using nl80211 interface instead.
Static WEP support is removed, there wasn't any driver using that
anyway so there's no user impact. Otherwise it's smaller features and
fixes as usual.
Note: As mt76 had tricky conflicts due to the fixes in wireless tree,
we decided to merge wireless into wireless-next to solve them easily.
There should not be any merge problems anymore.
Major changes:
cfg80211
* remove never used static WEP support
* warn if Wireless Extention interface is used with cfg80211/mac80211 drivers
* stop supporting Wireless Extensions with Wi-Fi 7 devices
* support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting
rfkill
* add GPIO DT support
bitfield
* add FIELD_PREP_CONST()
mt76
* per-PHY LED support
rtw89
* support new Bluetooth co-existance version
rtl8xxxu
* support RTL8188EU
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmPOYeQRHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZvSlAf/Y5ZY5xLEytUma7fBkBObXEfP/7tlBBsu
RoRKVx77D1LGfGu0WXG9PCdvyY70e2QtrkdeLHF3gfzLYpNZIyB/eOFhwzCtbJrD
ls2yXhdTm9OwDOHAdvXLXx3fmF4bXni7dYdi78VrGCFOnU6XE6X5JpnZYU1SmQ1U
8Ro7H6D9yp8MKfh5Ct19PYSTS5hmHB09vfJ4rbkjHp7kEGvJjYNbvAqGsxatPnh9
Zw35TEIwmhZO4GsXxsG12g6LZa8W8RO8uCwepHxtFM8oGsF68Yb/lkLcdtMiuN6V
WdB6qn24faEWjdmt5BzJGueA3Td8KI6t5cHhGbQVKjyFD8lAC+IJQA==
=Nq9U
-----END PGP SIGNATURE-----
Merge tag 'wireless-next-2023-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.3
First set of patches for v6.3. The most important change here is that
the old Wireless Extension user space interface is not supported on
Wi-Fi 7 devices at all. We also added a warning if anyone with modern
drivers (ie. cfg80211 and mac80211 drivers) tries to use Wireless
Extensions, everyone should switch to using nl80211 interface instead.
Static WEP support is removed, there wasn't any driver using that
anyway so there's no user impact. Otherwise it's smaller features and
fixes as usual.
Note: As mt76 had tricky conflicts due to the fixes in wireless tree,
we decided to merge wireless into wireless-next to solve them easily.
There should not be any merge problems anymore.
Major changes:
cfg80211
- remove never used static WEP support
- warn if Wireless Extention interface is used with cfg80211/mac80211 drivers
- stop supporting Wireless Extensions with Wi-Fi 7 devices
- support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting
rfkill
- add GPIO DT support
bitfield
- add FIELD_PREP_CONST()
mt76
- per-PHY LED support
rtw89
- support new Bluetooth co-existance version
rtl8xxxu
- support RTL8188EU
* tag 'wireless-next-2023-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (123 commits)
wifi: wireless: deny wireless extensions on MLO-capable devices
wifi: wireless: warn on most wireless extension usage
wifi: mac80211: drop extra 'e' from ieeee80211... name
wifi: cfg80211: Deduplicate certificate loading
bitfield: add FIELD_PREP_CONST()
wifi: mac80211: add kernel-doc for EHT structure
mac80211: support minimal EHT rate reporting on RX
wifi: mac80211: Add HE MU-MIMO related flags in ieee80211_bss_conf
wifi: mac80211: Add VHT MU-MIMO related flags in ieee80211_bss_conf
wifi: cfg80211: Use MLD address to indicate MLD STA disconnection
wifi: cfg80211: Support 32 bytes KCK key in GTK rekey offload
wifi: cfg80211: Fix extended KCK key length check in nl80211_set_rekey_data()
wifi: cfg80211: remove support for static WEP
wifi: rtl8xxxu: Dump the efuse only for untested devices
wifi: rtl8xxxu: Print the ROM version too
wifi: rtw88: Use non-atomic sta iterator in rtw_ra_mask_info_update()
wifi: rtw88: Use rtw_iterate_vifs() for rtw_vif_watch_dog_iter()
wifi: rtw88: Move register access from rtw_bf_assoc() outside the RCU
wifi: rtl8xxxu: Use a longer retry limit of 48
wifi: rtl8xxxu: Report the RSSI to the firmware
...
====================
Link: https://lore.kernel.org/r/20230123103338.330CBC433EF@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
New flag BPF_F_XDP_DEV_BOUND_ONLY plus all the infra to have a way
to associate a netdev with a BPF program at load time.
netdevsim checks are dropped in favor of generic check in dev_xdp_attach.
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-6-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
We need the USB fixes in here and this resolves merge conflicts as
reported in linux-next in the following files:
drivers/usb/host/xhci.c
drivers/usb/host/xhci.h
drivers/usb/typec/ucsi/ucsi.c
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
IEEE 802.3-2018 clause 99 defines a MAC Merge sublayer which contains an
Express MAC and a Preemptible MAC. Both MACs are hidden to higher and
lower layers and visible as a single MAC (packet classification to eMAC
or pMAC on TX is done based on priority; classification on RX is done
based on SFD).
For devices which support a MAC Merge sublayer, it is desirable to
retrieve individual packet counters from the eMAC and the pMAC, as well
as aggregate statistics (their sum).
Introduce a new ETHTOOL_A_STATS_SRC attribute which is part of the
policy of ETHTOOL_MSG_STATS_GET and, and an ETHTOOL_A_PAUSE_STATS_SRC
which is part of the policy of ETHTOOL_MSG_PAUSE_GET (accepted when
ETHTOOL_FLAG_STATS is set in the common ethtool header). Both of these
take values from enum ethtool_mac_stats_src, defaulting to "aggregate"
in the absence of the attribute.
Existing drivers do not need to pay attention to this enum which was
added to all driver-facing structures, just the ones which report the
MAC merge layer as supported.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2
specifications (the other being Frame Preemption; IEEE 802.1Q-2018
clause 6.7.2), which work together to minimize latency caused by frame
interference at TX. The overall goal of TSN is for normal traffic and
traffic with a bounded deadline to be able to cohabitate on the same L2
network and not bother each other too much.
The standards achieve this (partly) by introducing the concept of
preemptible traffic, i.e. Ethernet frames that have a custom value for
the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented
and reassembled at L2 on a link-local basis. The non-preemptible frames
are called express traffic, they are transmitted using a normal SFD, and
they can preempt preemptible frames, therefore having lower latency,
which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo
frames around 9K). Preemption is not recursive, i.e. a P frame cannot
preempt another P frame. Preemption also does not depend upon priority,
or otherwise said, an E frame with prio 0 will still preempt a P frame
with prio 7.
In terms of implementation, the standards talk about the presence of an
express MAC (eMAC) which handles express traffic, and a preemptible MAC
(pMAC) which handles preemptible traffic, and these MACs are multiplexed
on the same MII by a MAC merge layer.
To support frame preemption, the definition of the SFD was generalized
to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an
Ethernet frame fragment, or a complete frame. Stations unaware of an SMD
value different from the standard SFD will treat P frames as error
frames. To prevent that from happening, a negotiation process is
defined.
On RX, packets are dispatched to the eMAC or pMAC after being filtered
by their SMD. On TX, the eMAC/pMAC classification decision is taken by
the 802.1Q spec, based on packet priority (each of the 8 user priority
values may have an admin-status of preemptible or express).
The MAC Merge layer and the Frame Preemption parameters have some degree
of independence in terms of how software stacks are supposed to deal
with them. The activation of the MM layer is supposed to be controlled
by an LLDP daemon (after it has been communicated that the link partner
also supports it), after which a (hardware-based or not) verification
handshake takes place, before actually enabling the feature. So the
process is intended to be relatively plug-and-play. Whereas FP settings
are supposed to be coordinated across a network using something
approximating NETCONF.
The support contained here is exclusively for the 802.3 (MAC Merge)
portions and not for the 802.1Q (Frame Preemption) parts. This API is
sufficient for an LLDP daemon to do its job. The FP adminStatus variable
from 802.1Q is outside the scope of an LLDP daemon.
I have taken a few creative licenses and augmented the Linux kernel UAPI
compared to the standard managed objects recommended by IEEE 802.3.
These are:
- ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive
Processing state diagram, a MAC Merge layer is always supposed to be
able to receive P frames. However, this implies keeping the pMAC
powered on, which will consume needless power in applications where FP
will never be used. If LLDP is used, the reception of an Additional
Ethernet Capabilities TLV from the link partner is sufficient
indication that the pMAC should be enabled. So my proposal is that in
Linux, we keep the pMAC turned off by default and that user space
turns it on when needed.
- ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called
aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in
the boolean netlink attributes offered, so this is also positive here.
Other than the meaning being reversed, they correspond to the same
thing.
- ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP
daemon to maximize the verifyTime variable (delay between SMD-V
transmissions), to maximize its chances that the LP replies. IEEE says
that the verifyTime can range between 1 and 128 ms, but the NXP ENETC
stupidly keeps this variable in a 7 bit register, so the maximum
supported value is 127 ms. I could have chosen to hardcode this in the
LLDP daemon to a lower value, but why not let the kernel expose its
supported range directly.
- ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called
aMACMergeAddFragSize, and expresses the "additional" fragment size
(on top of ETH_ZLEN), whereas this expresses the absolute value of the
fragment size.
- ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed
object mandated by the standard, but user space clearly needs to know
what is the minimum supported fragment size of our local receiver,
since LLDP must advertise a value no lower than that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The meye driver does not use the vb2 framework for streaming
video, instead it implements this in the driver. This is error prone,
and nobody stepped in to convert this driver to that framework.
The hardware is very old, so the decision was made to remove it
altogether.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Add support to manage configurations (format, crop, compose) per stream,
instead of per pad. This is accomplished with data structures that hold
an array of all subdev's stream configurations.
The number of streams can vary at runtime based on routing. Every time
the routing is changed, the stream configurations need to be
re-initialized.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Add support for subdev internal routing. A route is defined as a single
stream from a sink pad to a source pad.
The userspace can configure the routing via two new ioctls,
VIDIOC_SUBDEV_G_ROUTING and VIDIOC_SUBDEV_S_ROUTING, and subdevs can
implement the functionality with v4l2_subdev_pad_ops.set_routing().
- Add sink and source streams for multiplexed links
- Copy the argument back in case of an error. This is needed to let the
caller know the number of routes.
- Expand and refine documentation.
- Make the 'routes' pointer a __u64 __user pointer so that a compat32
version of the ioctl is not required.
- Add struct v4l2_subdev_krouting to be used for subdevice operations.
- Fix typecasing warnings
- Check sink & source pad types
- Add 'which' field
- Routing to subdev state
- Dropped get_routing subdev op
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Add a subdev capability flag to expose to userspace if a subdev supports
multiplexed streams.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Prepare TVLV infrastructure for more packet types, in particular the
upcoming batman-adv multicast packet type.
For that swap the OGM vs. unicast-tvlv packet boolean indicator to an
explicit unsigned integer packet type variable. And provide the skb
to a call to batadv_tvlv_containers_process(), as later the multicast
packet's TVLV handler will need to have access not only to the TVLV but
the full skb for forwarding. Forwarding will be invoked from the
multicast packet's TVLVs' contents later.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Implement support for a new note type NT_ARM64_ZT providing access to
ZT0 when implemented. Since ZT0 is a register with constant size this is
much simpler than for other SME state.
As ZT0 is only accessible when PSTATE.ZA is set writes to ZT0 cause
PSTATE.ZA to be set, the main alternative would be to return -EBUSY in
this case but this seemed more constructive. Practical users are also
going to be working with ZA anyway and have some understanding of the
state.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-12-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add new rewrite table and all the required functions, offload hooks and
bookkeeping for maintaining it. The rewrite table reuses the app struct,
and the entire set of app selectors. As such, some bookeeping code can
be shared between the rewrite- and the APP table.
New functions for getting, setting and deleting entries has been added.
Apart from operating on the rewrite list, these functions do not emit a
DCB_APP_EVENT when the list os modified. The new dcb_getrewr does a
lookup based on selector and priority and returns the protocol, so that
mappings from priority to protocol, for a given selector and ifindex is
obtained.
Also, a new nested attribute has been added, that encapsulates one or
more app structs. This attribute is used to distinguish the two tables.
The dcb_lock used for the APP table is reused for the rewrite table.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An async transaction to a frozen process will still be successfully
put in the queue. But this pending async transaction won't be processed
until the target process is unfrozen at an unspecified time in the
future. Pass this important information back to the user space caller
by returning BR_TRANSACTION_PENDING_FROZEN.
Signed-off-by: Li Li <dualli@google.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20221123201654.589322-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The KD_FONT_OP_SET/GET operations hardcode vpitch to be 32 pixels,
which only dates from the old VGA hardware which as asserting this.
Drivers such as fbcon however do not have such limitation, so this
introduces KD_FONT_OP_SET/GET_TALL operations, which userland can try
to use to avoid this limitation, thus opening the patch to >32 pixels
font height.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Link: https://lore.kernel.org/r/20230119151935.013597162@ens-lyon.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't use magic literals & comments but define a real field instead
for UART_IIR_FIFO_ENABLED and name also the values.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20221125130509.8482-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a custom (non-USB IF) extension to the USB standard:
https://wicg.github.io/webusb/
This specification is published under the W3C Community Contributor
Agreement, which in particular allows to implement the specification
without any royalties.
The specification allows USB gadgets to announce an URL to landing
page and describes a Javascript interface for websites to interact
with the USB gadget, if the user allows it. It is currently
supported by Chromium-based browsers, such as Chrome, Edge and
Opera on all major operating systems including Linux.
This patch adds optional support for Linux-based USB gadgets
wishing to expose such a landing page.
During device enumeration, a host recognizes that the announced
USB version is at least 2.01, which means, that there are BOS
descriptors available. The device than announces WebUSB support
using a platform device capability. This includes a vendor code
under which the landing page URL can be retrieved using a
vendor-specific request.
Previously, the BOS descriptors would unconditionally include an
LPM related descriptor, as BOS descriptors were only ever sent
when the device was LPM capable. As this is no longer the case,
this patch puts this descriptor behind a lpm_capable condition.
Usage is modeled after os_desc descriptors:
echo 1 > webusb/use
echo "https://www.kernel.org" > webusb/landingPage
lsusb will report the device with the following lines:
Platform Device Capability:
bLength 24
bDescriptorType 16
bDevCapabilityType 5
bReserved 0
PlatformCapabilityUUID {3408b638-09a9-47a0-8bfd-a0768815b665}
WebUSB:
bcdVersion 1.00
bVendorCode 0
iLandingPage 1 https://www.kernel.org
Signed-off-by: Jó Ágila Bitsch <jgilab@gmail.com>
Link: https://lore.kernel.org/r/Y8Crf8P2qAWuuk/F@jo-einhundert
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The new MFD_NOEXEC_SEAL and MFD_EXEC flags allows application to set
executable bit at creation time (memfd_create).
When MFD_NOEXEC_SEAL is set, memfd is created without executable bit
(mode:0666), and sealed with F_SEAL_EXEC, so it can't be chmod to be
executable (mode: 0777) after creation.
when MFD_EXEC flag is set, memfd is created with executable bit
(mode:0777), this is the same as the old behavior of memfd_create.
The new pid namespaced sysctl vm.memfd_noexec has 3 values:
0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
MFD_EXEC was set.
1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
MFD_NOEXEC_SEAL was set.
2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.
The sysctl allows finer control of memfd_create for old-software that
doesn't set the executable bit, for example, a container with
vm.memfd_noexec=1 means the old-software will create non-executable memfd
by default. Also, the value of memfd_noexec is passed to child namespace
at creation time. For example, if the init namespace has
vm.memfd_noexec=2, all its children namespaces will be created with 2.
[akpm@linux-foundation.org: add stub functions to fix build]
[akpm@linux-foundation.org: remove unneeded register_pid_ns_ctl_table_vm() stub, per Jeff]
[akpm@linux-foundation.org: s/pr_warn_ratelimited/pr_warn_once/, per review]
[akpm@linux-foundation.org: fix CONFIG_SYSCTL=n warning]
Link: https://lkml.kernel.org/r/20221215001205.51969-4-jeffxu@google.com
Signed-off-by: Jeff Xu <jeffxu@google.com>
Co-developed-by: Daniel Verkamp <dverkamp@chromium.org>
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm/memfd: introduce MFD_NOEXEC_SEAL and MFD_EXEC", v8.
Since Linux introduced the memfd feature, memfd have always had their
execute bit set, and the memfd_create() syscall doesn't allow setting it
differently.
However, in a secure by default system, such as ChromeOS, (where all
executables should come from the rootfs, which is protected by Verified
boot), this executable nature of memfd opens a door for NoExec bypass and
enables “confused deputy attack”. E.g, in VRP bug [1]: cros_vm
process created a memfd to share the content with an external process,
however the memfd is overwritten and used for executing arbitrary code and
root escalation. [2] lists more VRP in this kind.
On the other hand, executable memfd has its legit use, runc uses memfd’s
seal and executable feature to copy the contents of the binary then
execute them, for such system, we need a solution to differentiate runc's
use of executable memfds and an attacker's [3].
To address those above, this set of patches add following:
1> Let memfd_create() set X bit at creation time.
2> Let memfd to be sealed for modifying X bit.
3> A new pid namespace sysctl: vm.memfd_noexec to control the behavior of
X bit.For example, if a container has vm.memfd_noexec=2, then
memfd_create() without MFD_NOEXEC_SEAL will be rejected.
4> A new security hook in memfd_create(). This make it possible to a new
LSM, which rejects or allows executable memfd based on its security policy.
This patch (of 5):
The new F_SEAL_EXEC flag will prevent modification of the exec bits:
written as traditional octal mask, 0111, or as named flags, S_IXUSR |
S_IXGRP | S_IXOTH. Any chmod(2) or similar call that attempts to modify
any of these bits after the seal is applied will fail with errno EPERM.
This will preserve the execute bits as they are at the time of sealing, so
the memfd will become either permanently executable or permanently
un-executable.
Link: https://lkml.kernel.org/r/20221215001205.51969-1-jeffxu@google.com
Link: https://lkml.kernel.org/r/20221215001205.51969-2-jeffxu@google.com
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Co-developed-by: Jeff Xu <jeffxu@google.com>
Signed-off-by: Jeff Xu <jeffxu@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We use station's MLD address to report disconnection of MLD station.
Update the documentation in multiple places to indicate this.
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/20221206080226.1702646-4-quic_vjakkam@quicinc.com
[update commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, maximum KCK key length supported for GTK rekey offload is 24
bytes but with some newer AKMs the KCK key length can be 32 bytes. e.g.,
00-0F-AC:24 AKM suite with SAE finite cyclic group 21. Add support to
allow 32 bytes KCK keys in GTK rekey offload.
Signed-off-by: Shivani Baranwal <quic_shivbara@quicinc.com>
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/20221206143715.1802987-3-quic_vjakkam@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Introduce NFT_MSG_DESTROY* message type. The destroy operation performs a
delete operation but ignoring the ENOENT errors.
This is useful for the transaction semantics, where failing to delete an
object which does not exist results in aborting the transaction.
This new command allows the transaction to proceed in case the object
does not exist.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
We will report extack message if there is an error via netlink_ack(). But
if the rule is not to be exclusively executed by the hardware, extack is not
passed along and offloading failures don't get logged.
In commit 81c7288b17 ("sched: cls: enable verbose logging") Marcelo
made cls could log verbose info for offloading failures, which helps
improving Open vSwitch debuggability when using flower offloading.
It would also be helpful if userspace monitor tools, like "tc monitor",
could log this kind of message, as it doesn't require vswitchd log level
adjusment. Let's add a new tc attributes to report the extack message so
the monitor program could receive the failures. e.g.
# tc monitor
added chain dev enp3s0f1np1 parent ffff: chain 0
added filter dev enp3s0f1np1 ingress protocol all pref 49152 flower chain 0 handle 0x1
ct_state +trk+new
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1
Warning: mlx5_core: matching on ct_state +new isn't supported.
In this patch I only report the extack message on add/del operations.
It doesn't look like we need to report the extack message on get/dump
operations.
Note this message not only reporte to multicast groups, it could also
be reported unicast, which may affect the current usersapce tool's behaivor.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230113034353.2766735-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The memcpy() in uvc_video_decode_meta() intentionally copies across the
length and flags members and into the trailing buf flexible array.
Split the copy so that the compiler can better reason about (the lack
of) buffer overflows here. Avoid the run-time false positive warning:
memcpy: detected field-spanning write (size 12) of single field "&meta->length" at drivers/media/usb/uvc/uvc_video.c:1355 (size 1)
Additionally fix a typo in the documentation for struct uvc_meta_buf.
Reported-by: ionut_n2001@yahoo.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216810
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Instead of duplicating the menu info, use the one from the core.
Also, do not use extra memory for 1:1 mappings.
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
Main use case is for using cls_bpf on ingress hook to decapsulate
IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
new IP header version after decapsulating the outer IP header.
Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/b268ec7f0ff9431f4f43b1b40ab856ebb28cb4e1.1673574419.git.william.xuanziyang@huawei.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
- Fix PAGE_TABLE_CHECK failures on hugepage splitting path
- Fix PSCI encoding of MEM_PROTECT_RANGE function in UAPI header
- Fix NULL deref when accessing debugfs node if PSCI is not present
- Fix MTE core dumping when VMA list is being updated concurrently
- Fix SME signal frame handling when SVE is not implemented by the CPU
- Fix asm constraints for cmpxchg_double() to hazard both words
- Fix build failure with stack tracer and older versions of Clang
- Bring back workaround for Cortex-A715 erratum 2645198
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmO9SzwQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLdYB/9pX4El38TX4Y4M6sR2yl+m1rkGRiU4nV3N
MKJ3ZVjrx87QZ8CKVYmJbnHzolN0Art9WvqFnyxtPMBlZyWzHjtsrQnad3VwLDOu
4qmqjDCXvPod1EncCxBiGu28FZ88HoLqhnwWB6O2Su6TlczD0kJTfzincdyzqvi2
r0uUlBd9gtFt3sjV+sLPjE6NqMf9MfhoOLLafijz7ZMElQL+2/BjZxhpHLaWhUz1
aHIp4w841TJOuSlCwstX20Nc6Q9+6ta07bw+TD/flyQ+IGUptgDEoIrpjdSO5b2t
zFFHHN5IXovAJPDfhAdXGAbC2SDFyYJtURCpv6hVt/SSsilGEbYg
=241k
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Here's a sizeable batch of Friday the 13th arm64 fixes for -rc4. What
could possibly go wrong?
The obvious reason we have so much here is because of the holiday
season right after the merge window, but we've also brought back an
erratum workaround that was previously dropped at the last minute and
there's an MTE coredumping fix that strays outside of the arch/arm64
directory.
Summary:
- Fix PAGE_TABLE_CHECK failures on hugepage splitting path
- Fix PSCI encoding of MEM_PROTECT_RANGE function in UAPI header
- Fix NULL deref when accessing debugfs node if PSCI is not present
- Fix MTE core dumping when VMA list is being updated concurrently
- Fix SME signal frame handling when SVE is not implemented by the
CPU
- Fix asm constraints for cmpxchg_double() to hazard both words
- Fix build failure with stack tracer and older versions of Clang
- Bring back workaround for Cortex-A715 erratum 2645198"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y
arm64/mm: Define dummy pud_user_exec() when using 2-level page-table
arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption
firmware/psci: Don't register with debugfs if PSCI isn't available
firmware/psci: Fix MEM_PROTECT_RANGE function numbers
arm64/signal: Always allocate SVE signal frames on SME only systems
arm64/signal: Always accept SVE signal frames on SME only systems
arm64/sme: Fix context switch for SME only systems
arm64: cmpxchg_double*: hazard against entire exchange variable
arm64/uprobes: change the uprobe_opcode_t typedef to fix the sparse warning
arm64: mte: Avoid the racy walk of the vma list during core dump
elfcore: Add a cprm parameter to elf_core_extra_{phdrs,data_size}
arm64: mte: Fix double-freeing of the temporary tag storage during coredump
arm64: ptrace: Use ARM64_SME to guard the SME register enumerations
arm64/mm: add pud_user_exec() check in pud_user_accessible_page()
arm64/mm: fix incorrect file_map_count for invalid pmd
Add the following ethtool tx aggregation parameters:
ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES
Maximum size in bytes of a tx aggregated block of frames.
ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES
Maximum number of frames that can be aggregated into a block.
ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS
Time in usecs after the first packet arrival in an aggregated
block for the block to be sent.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the link modes for the IEEE 802.3cg Clause 147 10BASE-T1S
Ethernet PHY. According to the specifications, the 10BASE-T1S supports
Point-To-Point Full-Duplex, Point-To-Point Half-Duplex and/or
Point-To-Multipoint (AKA Multi-Drop) Half-Duplex operations.
Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for configuring the PLCA Reconciliation Sublayer on
multi-drop PHYs that support IEEE802.3cg-2019 Clause 148 (e.g.,
10BASE-T1S). This patch adds the appropriate netlink interface
to ethtool.
Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide a method to query previously issued registrations.
Signed-off-by: Michal Clapinski <mclapinski@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20221207164338.1535591-2-mclapinski@google.com
Zero-length arrays are deprecated[1]. Replace struct ethtool_rxnfc's
"rule_locs" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:
net/ethtool/common.c: In function 'ethtool_get_max_rxnfc_channel':
net/ethtool/common.c:558:55: warning: array subscript i is outside array bounds of '__u32[0]' {aka 'unsigned int[]'} [-Warray-bounds=]
558 | .fs.location = info->rule_locs[i],
| ~~~~~~~~~~~~~~~^~~
In file included from include/linux/ethtool.h:19,
from include/uapi/linux/ethtool_netlink.h:12,
from include/linux/ethtool_netlink.h:6,
from net/ethtool/common.c:3:
include/uapi/linux/ethtool.h:1186:41: note: while referencing
'rule_locs'
1186 | __u32 rule_locs[0];
| ^~~~~~~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: kernel test robot <lkp@intel.com>
Cc: Oleksij Rempel <linux@rempel-privat.de>
Cc: Sean Anderson <sean.anderson@seco.com>
Cc: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230106042844.give.885-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zero-length arrays are deprecated[1]. Replace struct ipv6_rpl_sr_hdr's
"segments" union of 0-length arrays with flexible arrays. Detected with
GCC 13, using -fstrict-flex-arrays=3:
In function 'rpl_validate_srh',
inlined from 'rpl_build_state' at ../net/ipv6/rpl_iptunnel.c:96:7:
../net/ipv6/rpl_iptunnel.c:60:28: warning: array subscript <unknown> is outside array bounds of 'struct in6_addr[0]' [-Warray-bounds=]
60 | if (ipv6_addr_type(&srh->rpl_segaddr[srh->segments_left - 1]) &
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../include/net/rpl.h:12,
from ../net/ipv6/rpl_iptunnel.c:13:
../include/uapi/linux/rpl.h: In function 'rpl_build_state':
../include/uapi/linux/rpl.h:40:33: note: while referencing 'addr'
40 | struct in6_addr addr[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230105221533.never.711-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmO4SiAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgc9D/0XJufUgHsLeFCF5G+q6iL5Bz+d7ymw+VFv
xrNjOz8wUKYKXcJqxLrPdkmL1tcd1+fESNGgyBidn4P53BWoHB9dtbs8+Lova08t
I4lQmZHgxgbAMhSOwGvHlOTkdlBIw/fBgQ6XdI+1qmpxzma5+gjImjyp7oH+pODP
zqsg3DKRQmDApKWtvB6D5iItsWc1Jx5TEuOfU5/JjLuVZWl6O2qynNVUccF5T89O
jkt624yO+r70CVfX3NAdFTm/mOEUiGH97l4l/8OkekJ40pf73xzvNRF/S8z8nHb/
QUGY1tKvr08xfPusl3epmQ5aO938F0aFpKi2x6P+z3G6Uq+dqMMrjJl8XMDG+J+d
+yBow5yRH7o6oBb0YPPz/6S5zBjslsHtuKFd/rs4mCDfjp9GHiIIiIpdLxZEWawJ
WaYlc5WlzSdopT/IxfaRZ9HMHzscdKadjiFngSKdpEdCUw7wxdIey+/9xbKR+xh0
Es13MzyCCurj4OnyDl5cnetGJUNNiL1JvQmIaFVndyxnMfvOaZBBmKW7h9RYBIU/
nqi4vZwYoafnGUIfLFL6uq9F627lF/EhodDuLheqz0G2pWhmFJITOJUAakGNFf83
22CiKY2GyTrOy5tKqkNzv7BG/KyJZGP+CxyyQ/7xm0k2C9wEjYSpZHKcjaNZygU5
eswPKbZMkw==
=LJ5Q
-----END PGP SIGNATURE-----
Merge tag 'block-2023-01-06' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"The big change here is obviously the revert of the pktcdvd driver
removal. Outside of that, just minor tweaks. In detail:
- Re-instate the pktcdvd driver, which necessitates adding back
bio_copy_data_iter() and the fops->devnode() hook for now (me)
- Fix for splitting of a bio marked as NOWAIT, causing either nowait
reads or writes to error with EAGAIN even if parts of the IO
completed (me)
- Fix for ublk, punting management commands to io-wq as they can all
easily block for extended periods of time (Ming)
- Removal of SRCU dependency for the block layer (Paul)"
* tag 'block-2023-01-06' of git://git.kernel.dk/linux:
block: Remove "select SRCU"
Revert "pktcdvd: remove driver."
Revert "block: remove devnode callback from struct block_device_operations"
Revert "block: bio_copy_data_iter"
ublk: honor IO_URING_F_NONBLOCK for handling control command
block: don't allow splitting of a REQ_NOWAIT bio
block: handle bio_split_to_limits() NULL return
PSCI v1.1 offers 32-bit and 64-bit variants of the MEM_PROTECT_RANGE
call using function identifier 20.
Fix the incorrect definitions of the MEM_PROTECT_CHECK_RANGE calls in
the PSCI UAPI header.
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 3137f2e600 ("firmware/psci: Add debugfs support to ease debugging")
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20221125101826.22404-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Current release - regressions:
- bpf: fix nullness propagation for reg to reg comparisons,
avoid null-deref
- inet: control sockets should not use current thread task_frag
- bpf: always use maximal size for copy_array()
- eth: bnxt_en: don't link netdev to a devlink port for VFs
Current release - new code bugs:
- rxrpc: fix a couple of potential use-after-frees
- netfilter: conntrack: fix IPv6 exthdr error check
- wifi: iwlwifi: fw: skip PPAG for JF, avoid FW crashes
- eth: dsa: qca8k: various fixes for the in-band register access
- eth: nfp: fix schedule in atomic context when sync mc address
- eth: renesas: rswitch: fix getting mac address from device tree
- mobile: ipa: use proper endpoint mask for suspend
Previous releases - regressions:
- tcp: add TIME_WAIT sockets in bhash2, fix regression caught
by Jiri / python tests
- net: tc: don't intepret cls results when asked to drop, fix
oob-access
- vrf: determine the dst using the original ifindex for multicast
- eth: bnxt_en:
- fix XDP RX path if BPF adjusted packet length
- fix HDS (header placement) and jumbo thresholds for RX packets
- eth: ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf,
avoid memory corruptions
Previous releases - always broken:
- ulp: prevent ULP without clone op from entering the LISTEN status
- veth: fix race with AF_XDP exposing old or uninitialized descriptors
- bpf:
- pull before calling skb_postpull_rcsum() (fix checksum support
and avoid a WARN())
- fix panic due to wrong pageattr of im->image (when livepatch
and kretfunc coexist)
- keep a reference to the mm, in case the task is dead
- mptcp: fix deadlock in fastopen error path
- netfilter:
- nf_tables: perform type checking for existing sets
- nf_tables: honor set timeout and garbage collection updates
- ipset: fix hash:net,port,net hang with /0 subnet
- ipset: avoid hung task warning when adding/deleting entries
- selftests: net:
- fix cmsg_so_mark.sh test hang on non-x86 systems
- fix the arp_ndisc_evict_nocarrier test for IPv6
- usb: rndis_host: secure rndis_query check against int overflow
- eth: r8169: fix dmar pte write access during suspend/resume with WOL
- eth: lan966x: fix configuration of the PCS
- eth: sparx5: fix reading of the MAC address
- eth: qed: allow sleep in qed_mcp_trace_dump()
- eth: hns3:
- fix interrupts re-initialization after VF FLR
- fix handling of promisc when MAC addr table gets full
- refine the handling for VF heartbeat
- eth: mlx5:
- properly handle ingress QinQ-tagged packets on VST
- fix io_eq_size and event_eq_size params validation on big endian
- fix RoCE setting at HCA level if not supported at all
- don't turn CQE compression on by default for IPoIB
- eth: ena:
- fix toeplitz initial hash key value
- account for the number of XDP-processed bytes in interface stats
- fix rx_copybreak value update
Misc:
- ethtool: harden phy stat handling against buggy drivers
- docs: netdev: convert maintainer's doc from FAQ to a normal document
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmO3MLcACgkQMUZtbf5S
IrsEQBAAijPrpxsGMfX+VMqZ8RPKA3Qg8XF3ji2fSp4c0kiKv6lYI7PzPTR3u/fj
CAlhQMHv7z53uM6Zd7FdUVl23paaEycu8YnlwSubg9z+wSeh/RQ6iq94mSk1PV+K
LLVR/yop2N35Yp/oc5KZMb9fMLkxRG9Ci73QUVVYgvIrSd4Zdm13FjfVjL2C1MZH
Yp003wigMs9IkIHOpHjNqwn/5s//0yXsb1PgKxCsaMdMQsG0yC+7eyDmxshCqsji
xQm15mkGMjvWEYJaa4Tj4L3JW6lWbQzCu9nqPUX16KpmrnScr8S8Is+aifFZIBeW
GZeDYgvjSxNWodeOrJnD3X+fnbrR9+qfx7T9y7XighfytAz5DNm1LwVOvZKDgPFA
s+LlxOhzkDNEqbIsusK/LW+04EFc5gJyTI2iR6s4SSqmH3c3coJZQJeyRFWDZy/x
1oqzcCcq8SwGUTJ9g6HAmDQoVkhDWDT/ZcRKhpWG0nJub972lB2iwM7LrAu+HoHI
r8hyCkHpOi5S3WZKI9gPiGD+yOlpVAuG2wHg2IpjhKQvtd9DFUChGDhFeoB2rqJf
9uI3RJBBYTDkeNu3kpfy5uMh2XhvbIZntK5kwpJ4VettZWFMaOAzn7KNqk8iT4gJ
ASMrUrX59X0TAN0MgpJJm7uGtKbKZOu4lHNm74TUxH7V7bYn7dk=
=TlcN
-----END PGP SIGNATURE-----
Merge tag 'net-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, wifi, and netfilter.
Current release - regressions:
- bpf: fix nullness propagation for reg to reg comparisons, avoid
null-deref
- inet: control sockets should not use current thread task_frag
- bpf: always use maximal size for copy_array()
- eth: bnxt_en: don't link netdev to a devlink port for VFs
Current release - new code bugs:
- rxrpc: fix a couple of potential use-after-frees
- netfilter: conntrack: fix IPv6 exthdr error check
- wifi: iwlwifi: fw: skip PPAG for JF, avoid FW crashes
- eth: dsa: qca8k: various fixes for the in-band register access
- eth: nfp: fix schedule in atomic context when sync mc address
- eth: renesas: rswitch: fix getting mac address from device tree
- mobile: ipa: use proper endpoint mask for suspend
Previous releases - regressions:
- tcp: add TIME_WAIT sockets in bhash2, fix regression caught by
Jiri / python tests
- net: tc: don't intepret cls results when asked to drop, fix
oob-access
- vrf: determine the dst using the original ifindex for multicast
- eth: bnxt_en:
- fix XDP RX path if BPF adjusted packet length
- fix HDS (header placement) and jumbo thresholds for RX packets
- eth: ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf,
avoid memory corruptions
Previous releases - always broken:
- ulp: prevent ULP without clone op from entering the LISTEN status
- veth: fix race with AF_XDP exposing old or uninitialized
descriptors
- bpf:
- pull before calling skb_postpull_rcsum() (fix checksum support
and avoid a WARN())
- fix panic due to wrong pageattr of im->image (when livepatch and
kretfunc coexist)
- keep a reference to the mm, in case the task is dead
- mptcp: fix deadlock in fastopen error path
- netfilter:
- nf_tables: perform type checking for existing sets
- nf_tables: honor set timeout and garbage collection updates
- ipset: fix hash:net,port,net hang with /0 subnet
- ipset: avoid hung task warning when adding/deleting entries
- selftests: net:
- fix cmsg_so_mark.sh test hang on non-x86 systems
- fix the arp_ndisc_evict_nocarrier test for IPv6
- usb: rndis_host: secure rndis_query check against int overflow
- eth: r8169: fix dmar pte write access during suspend/resume with
WOL
- eth: lan966x: fix configuration of the PCS
- eth: sparx5: fix reading of the MAC address
- eth: qed: allow sleep in qed_mcp_trace_dump()
- eth: hns3:
- fix interrupts re-initialization after VF FLR
- fix handling of promisc when MAC addr table gets full
- refine the handling for VF heartbeat
- eth: mlx5:
- properly handle ingress QinQ-tagged packets on VST
- fix io_eq_size and event_eq_size params validation on big endian
- fix RoCE setting at HCA level if not supported at all
- don't turn CQE compression on by default for IPoIB
- eth: ena:
- fix toeplitz initial hash key value
- account for the number of XDP-processed bytes in interface stats
- fix rx_copybreak value update
Misc:
- ethtool: harden phy stat handling against buggy drivers
- docs: netdev: convert maintainer's doc from FAQ to a normal
document"
* tag 'net-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
caif: fix memory leak in cfctrl_linkup_request()
inet: control sockets should not use current thread task_frag
net/ulp: prevent ULP without clone op from entering the LISTEN status
qed: allow sleep in qed_mcp_trace_dump()
MAINTAINERS: Update maintainers for ptp_vmw driver
usb: rndis_host: Secure rndis_query check against int overflow
net: dpaa: Fix dtsec check for PCS availability
octeontx2-pf: Fix lmtst ID used in aura free
drivers/net/bonding/bond_3ad: return when there's no aggregator
netfilter: ipset: Rework long task execution when adding/deleting entries
netfilter: ipset: fix hash:net,port,net hang with /0 subnet
net: sparx5: Fix reading of the MAC address
vxlan: Fix memory leaks in error path
net: sched: htb: fix htb_classify() kernel-doc
net: sched: cbq: dont intepret cls results when asked to drop
net: sched: atm: dont intepret cls results when asked to drop
dt-bindings: net: marvell,orion-mdio: Fix examples
dt-bindings: net: sun8i-emac: Add phy-supply property
net: ipa: use proper endpoint mask for suspend
selftests: net: return non-zero for failures reported in arp_ndisc_evict_nocarrier
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY7X/4wAKCRDbK58LschI
g7gzAQCjKsLtAWg1OplW+B7pvEPwkQ8g3O1+PYWlToCUACTlzQD+PEMrqGnxB573
oQAk6I2yOTwLgvlHkrm+TIdKSouI4gs=
=2hUY
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2023-01-04
We've added 45 non-merge commits during the last 21 day(s) which contain
a total of 50 files changed, 1454 insertions(+), 375 deletions(-).
The main changes are:
1) Fixes, improvements and refactoring of parts of BPF verifier's
state equivalence checks, from Andrii Nakryiko.
2) Fix a few corner cases in libbpf's BTF-to-C converter in particular
around padding handling and enums, also from Andrii Nakryiko.
3) Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better
support decap on GRE tunnel devices not operating in collect metadata,
from Christian Ehrig.
4) Improve x86 JIT's codegen for PROBE_MEM runtime error checks,
from Dave Marchevsky.
5) Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers, from Jiri Olsa.
6) Add proper documentation for BPF_MAP_TYPE_SOCK{MAP,HASH} maps,
from Maryam Tahhan.
7) Improvements in libbpf's btf_parse_elf error handling, from Changbin Du.
8) Bigger batch of improvements to BPF tracing code samples,
from Daniel T. Lee.
9) Add LoongArch support to libbpf's bpf_tracing helper header,
from Hengqi Chen.
10) Fix a libbpf compiler warning in perf_event_open_probe on arm32,
from Khem Raj.
11) Optimize bpf_local_storage_elem by removing 56 bytes of padding,
from Martin KaFai Lau.
12) Use pkg-config to locate libelf for resolve_btfids build,
from Shen Jiamin.
13) Various libbpf improvements around API documentation and errno
handling, from Xin Liu.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
libbpf: Return -ENODATA for missing btf section
libbpf: Add LoongArch support to bpf_tracing.h
libbpf: Restore errno after pr_warn.
libbpf: Added the description of some API functions
libbpf: Fix invalid return address register in s390
samples/bpf: Use BPF_KSYSCALL macro in syscall tracing programs
samples/bpf: Fix tracex2 by using BPF_KSYSCALL macro
samples/bpf: Change _kern suffix to .bpf with syscall tracing program
samples/bpf: Use vmlinux.h instead of implicit headers in syscall tracing program
samples/bpf: Use kyscall instead of kprobe in syscall tracing program
bpf: rename list_head -> graph_root in field info types
libbpf: fix errno is overwritten after being closed.
bpf: fix regs_exact() logic in regsafe() to remap IDs correctly
bpf: perform byte-by-byte comparison only when necessary in regsafe()
bpf: reject non-exact register type matches in regsafe()
bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule
bpf: reorganize struct bpf_reg_state fields
bpf: teach refsafe() to take into account ID remapping
bpf: Remove unused field initialization in bpf's ctl_table
selftests/bpf: Add jit probe_mem corner case tests to s390x denylist
...
====================
Link: https://lore.kernel.org/r/20230105000926.31350-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mostly fixes all over the place, a couple of cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmOsSY4PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpnEoH/i7iGllhlBKKula2o97CixSWqMYY7ncryfTN
c3F3Wjk4imlhR8Je7n+xmDPLoE9vFkKdZL6o7kIESCJfNjCQ6r6Rnl8LoHkOZhvj
oLJ8zUfbOa5+B5kzOdnh2WV+3uNkssjayTrozvkDQ4r9MeSVRaC/xZ+lz85PQ+c7
FaNNyT2G+fZ6HoQg5U+Q+bj2lqxP9FYKMqURWGxUqa/gIMmky6VJeUQvmVdQuzCq
cHR48USpn8WE7j8uTf6yCQQuCElOQo1j7Op3FPyaNR3bPWbgzjvOnPci+giQ6hJ5
OAKxJ6HX0ApqcpPoPaA+Qg43EAha+Cu7kw4I58J0U09nQIf/N9w=
=4CJb
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"Mostly fixes all over the place, a couple of cleanups"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (32 commits)
virtio_blk: Fix signedness bug in virtblk_prep_rq()
vdpa_sim_net: should not drop the multicast/broadcast packet
vdpasim: fix memory leak when freeing IOTLBs
vdpa: conditionally fill max max queue pair for stats
vdpa/vp_vdpa: fix kfree a wrong pointer in vp_vdpa_remove
vduse: Validate vq_num in vduse_validate_config()
tools/virtio: remove smp_read_barrier_depends()
tools/virtio: remove stray characters
vhost_vdpa: fix the crash in unmap a large memory
virtio: Implementing attribute show with sysfs_emit
virtio-crypto: fix memory leak in virtio_crypto_alg_skcipher_close_session()
tools/virtio: Variable type completion
vdpa_sim: fix vringh initialization in vdpasim_queue_ready()
virtio_blk: use UINT_MAX instead of -1U
vhost-vdpa: fix an iotlb memory leak
vhost: fix range used in translate_desc()
vringh: fix range used in iotlb_translate()
vhost/vsock: Fix error handling in vhost_vsock_init()
vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init()
tools: Delete the unneeded semicolon after curly braces
...
This reverts commit f40eb99897.
There are apparently still users out there of this driver. While we'd
love to remove it to ease the maintenance burden, let's reinstate it
for now until better (userspace) solutions can be developed.
Link: https://lore.kernel.org/lkml/20230104190115.ceglfefco475ev6c@pali/
Reported-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
UAPI Changes:
* connector: Support analog-TV mode property
* media: Add MEDIA_BUS_FMT_RGB565_1X24_CPADHI,
MEDIA_BUS_FMT_RGB666_1X18 and MEDIA_BUS_FMT_RGB666_1X24_CPADHI
Cross-subsystem Changes:
* dma-buf: Documentation fixes
* i2c: Introduce i2c_client_get_device_id() helper
Core Changes:
* Improve support for analog TV output
* bridge: Remove unused drm_bridge_chain functions
* debugfs: Add per-device helpers and convert various DRM drivers
* dp-mst: Various fixes
* fbdev emulation: Always pick 32 bpp as default
* KUnit: Add tests for managed helpers; Various cleanups
* panel-orientation: Add quirks for Lenovo Yoga Tab 3 X90F and DynaBook K50
* TTM: Open-code ttm_bo_wait() and remove the helper
Driver Changes:
* Fix preferred depth and bpp values throughout DRM drivers
* Remove #CONFIG_PM guards throughout DRM drivers
* ast: Various fixes
* bridge: Implement i2c's probe_new in various drivers; Fixes; ite-it6505:
Locking fixes, Cache EDID data; ite-it66121: Support IT6610 chip,
Cleanups; lontium-tl9611: Fix HDMI on DragonBoard 845c; parade-ps8640:
Use atomic bridge functions
* gud: Convert to DRM shadow-plane helpers; Perform flushing synchronously
during atomic update
* ili9486: Support 16-bit pixel data
* imx: Split off IPUv3 driver; Various fixes
* mipi-dbi: Convert to DRM shadow-plane helpers plus rsp driver changes;i
Support separate I/O-voltage supply
* mxsfb: Depend on ARCH_MXS or ARCH_MXC
* omapdrm: Various fixes
* panel: Use ktime_get_boottime() to measure power-down delay in various
drivers; Fix auto-suspend delay in various drivers; orisetech-ota5601a:
Add support
* sprd: Cleanups
* sun4i: Convert to new TV-mode property
* tidss: Various fixes
* v3d: Various fixes
* vc4: Convert to new TV-mode property; Support Kunit tests; Cleanups;
dpi: Support RGB565 and RGB666 formats; dsi: Convert DSI driver to
bridge
* virtio: Improve tracing
* vkms: Support small cursors in IGT tests; Various fixes
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmO0B0QACgkQaA3BHVML
eiMKtAf9EXt5yaEonR4gsGNVD70VkebW+jPEWbEMg5hMFDHE9sjBja8T7bOxeeL1
BVofJiAuEZW9176eqeWvShuwOuiE7lf4WQLMvXfFmtHF/Nac9HUtEvOcvc1vEDUB
y6VFlVHe8mSp+Iy0WPLyZCtT4d7v2eM+VYm0HgFa74dSTzQTLGPiUI/XVb/YDaA7
FHKB5NvsMX9S1XvjxCsq0jA5bo8SSzh5CVKerdAZkBJMCSmKiY09o2p5C7vw3EAz
WUXL5CXqd9bwgvZa/9ZClQtpJaikNGoOaMSEl67rbWdVTk0LIpqx3nTY15py5LXd
SG4Wtfor3Vf1REa9TrR2uCIOmh+1gQ==
=0/PC
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2023-01-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.3:
UAPI Changes:
* connector: Support analog-TV mode property
* media: Add MEDIA_BUS_FMT_RGB565_1X24_CPADHI,
MEDIA_BUS_FMT_RGB666_1X18 and MEDIA_BUS_FMT_RGB666_1X24_CPADHI
Cross-subsystem Changes:
* dma-buf: Documentation fixes
* i2c: Introduce i2c_client_get_device_id() helper
Core Changes:
* Improve support for analog TV output
* bridge: Remove unused drm_bridge_chain functions
* debugfs: Add per-device helpers and convert various DRM drivers
* dp-mst: Various fixes
* fbdev emulation: Always pick 32 bpp as default
* KUnit: Add tests for managed helpers; Various cleanups
* panel-orientation: Add quirks for Lenovo Yoga Tab 3 X90F and DynaBook K50
* TTM: Open-code ttm_bo_wait() and remove the helper
Driver Changes:
* Fix preferred depth and bpp values throughout DRM drivers
* Remove #CONFIG_PM guards throughout DRM drivers
* ast: Various fixes
* bridge: Implement i2c's probe_new in various drivers; Fixes; ite-it6505:
Locking fixes, Cache EDID data; ite-it66121: Support IT6610 chip,
Cleanups; lontium-tl9611: Fix HDMI on DragonBoard 845c; parade-ps8640:
Use atomic bridge functions
* gud: Convert to DRM shadow-plane helpers; Perform flushing synchronously
during atomic update
* ili9486: Support 16-bit pixel data
* imx: Split off IPUv3 driver; Various fixes
* mipi-dbi: Convert to DRM shadow-plane helpers plus rsp driver changes;i
Support separate I/O-voltage supply
* mxsfb: Depend on ARCH_MXS or ARCH_MXC
* omapdrm: Various fixes
* panel: Use ktime_get_boottime() to measure power-down delay in various
drivers; Fix auto-suspend delay in various drivers; orisetech-ota5601a:
Add support
* sprd: Cleanups
* sun4i: Convert to new TV-mode property
* tidss: Various fixes
* v3d: Various fixes
* vc4: Convert to new TV-mode property; Support Kunit tests; Cleanups;
dpi: Support RGB565 and RGB666 formats; dsi: Convert DSI driver to
bridge
* virtio: Improve tracing
* vkms: Support small cursors in IGT tests; Various fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y7QIwlfElAYWxRcR@linux-uq9g
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOt37UQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpouKEACDwViycFQjYhITw6ENTvJMfAAY7TU1p8Di
widhpRYUYQxuVJHdfyvOs/NTWMzj7okfsjilVLapq+v8jyNc+9TAMmbOJhYTydP0
7MXCt8hRP304zKil0EsPR+U6WT1n8H8Jf5fxqduyYG3tfrB1ao+EMEyEZVKiufGj
x78BOYfDQqtnbMNsxnUDfjl0zJP0tAjHYOncW09u/43Y/3XeOqdT1/fCGqO0N0Fn
RnnnKtCRiZVSnnImTGqVHPkUKSL4FRCmChI8k42cYirP6rtjaWP/zmHyTAYs1ww0
DHeq2n8MnRC156WpIAMGWoSv9G4YMpsnIfVH1JeAgp3FOErngLTX6+eMgcYITfJG
BfcqxmqDdLSuJJp6uSZo14j5eIRDtxFryuPLLhuWfegc1FY7Ls1d+uaibOg9NMh3
aYH/ij/VKzNhtdCh59mUeV/nrs6tXnsYpw1GMzNpriH3P9YeuYR2h3LdAiAAUHcC
Z1/OtwThxWyKC4qEJPvQCqrJViDtLd5wEZDMicPu7HMKpYiypFiGlzA6sraHRKHp
Bz/VnBFTWN+YqlJORKJPoDYbjMHd8gHzkbLV0dSjVOqHAw5S385YL7u8x8pPWD0f
GSz96Nxi6UpRrW9zIcgFOrBleyYtr9Ip2r/6F6cXdDxONFOwtObB3Wrxhc7+Vwgh
uNfRVUL6WQ==
=SQf6
-----END PGP SIGNATURE-----
Merge tag 'io_uring-6.2-2022-12-29' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Two fixes for mutex grabbing when the task state is != TASK_RUNNING
(me)
- Check for invalid opcode in io_uring_register() a bit earlier, to
avoid going through the quiesce machinery just to return -EINVAL
later in the process (me)
- Fix for the uapi io_uring header, skipping including time_types.h
when necessary (Stefan)
* tag 'io_uring-6.2-2022-12-29' of git://git.kernel.dk/linux:
uapi:io_uring.h: allow linux/time_types.h to be skipped
io_uring: check for valid register opcode earlier
io_uring/cancel: re-grab ctx mutex after finishing wait
io_uring: finish waiting before flushing overflow entries
x86:
* several fixes to nested VMX execution controls
* fixes and clarification to the documentation for Xen emulation
* do not unnecessarily release a pmu event with zero period
* MMU fixes
* fix Coverity warning in kvm_hv_flush_tlb()
selftests:
* fixes for the ucall mechanism in selftests
* other fixes mostly related to compilation with clang
We can merge VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES with
VDPA_ATTR_DEV_FEATURES which is functionally equivalent.
While at it, tweak the comment in header file to make
user provioned device features distinguished from those
supported by the parent mgmtdev device: the former of
which can be inherited as a whole from the latter, or
can be a subset of the latter if explicitly specified.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1665422823-18364-1-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
If a memory map has fewer threads than there are cores on the system, or
is limited to run on few cores concurrently through sched affinity or
cgroup cpusets, the concurrency IDs will be values close to 0, thus
allowing efficient use of user-space memory for per-cpu data structures.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-9-mathieu.desnoyers@efficios.com
Adding the NUMA node id to struct rseq is a straightforward thing to do,
and a good way to figure out if anything in the user-space ecosystem
prevents extending struct rseq.
This NUMA node id field allows memory allocators such as tcmalloc to
take advantage of fast access to the current NUMA node id to perform
NUMA-aware memory allocation.
It can also be useful for implementing fast-paths for NUMA-aware
user-space mutexes.
It also allows implementing getcpu(2) purely in user-space.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-5-mathieu.desnoyers@efficios.com
Export the rseq feature size supported by the kernel as well as the
required allocation alignment for the rseq per-thread area to user-space
through ELF auxiliary vector entries.
This is part of the extensible rseq ABI.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-3-mathieu.desnoyers@efficios.com
These are (uint64_t)-1 magic values are a userspace ABI, allowing the
shared info pages and other enlightenments to be disabled. This isn't
a Xen ABI because Xen doesn't let the guest turn these off except with
the full SHUTDOWN_soft_reset mechanism. Under KVM, the userspace VMM is
expected to handle soft reset, and tear down the kernel parts of the
enlightenments accordingly.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mostly small bug fixes and small updates. The only things of note is
a qla2xxx fix for crash on hotplug and timeout and the addition of a
user exposed abstraction layer for persistent reservation error return
handling (which necessitates the conversion of nvme.c as well as
SCSI).
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY6SZISYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQkdAP9Juri0
ihkyA9tVx1ZslVOp8V8mWK3P2VROA4ArvcMRVwD/Qxf2REP8Fx2GIgC0sNaRedg3
+ncveg3EpZ1n/NXXeDw=
=q+XO
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes and small updates.
The only things of note is a qla2xxx fix for crash on hotplug and
timeout and the addition of a user exposed abstraction layer for
persistent reservation error return handling (which necessitates the
conversion of nvme.c as well as SCSI)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix crash when I/O abort times out
nvme: Convert NVMe errors to PR errors
scsi: sd: Convert SCSI errors to PR errors
scsi: core: Rename status_byte to sg_status_byte
block: Add error codes for common PR failures
scsi: sd: sd_zbc: Trace zone append emulation
scsi: libfc: Include the correct header
There are only three fairly simple patches. The #include
change to linux/swab.h addresses a userspace build issue,
and the change to the mmio tracing logic helps provide
more useful traces.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOgtJUACgkQmmx57+YA
GNln8Q//dvQ2FRIWBXKh4r6CxtiCx2aktGmnP1YAuaIVuzjGSn/8EQZAoTYN5jKY
Io8rFt1/FfOMtu3E32JtGpgfDAP/8sfz3Lao9bzJR/Fjv059qL5QCoI3qbEFTNz9
vzUqiddFZGppn76qsXSA6aItHVDS4Y97XiYRSwSMlpIz+9a84rYxCo04bNR4ut4t
PR5+lvlTDfGfmj+SebrCt/IEi/FF9ckEYCLJHfaSPcQcujLDZDKPcT2RbubgwHgB
OfE5Rx25xJxR4BU5MFe74sKn5Qi5HOfr1GrsjL3RbMNiYuHgbwLcZkMXvbZukdHz
50Gt8UXMAxvZYKz92kyQLYuiKEtFSrQ8JccgqVUWL/lRLDoUkTg4hz4tmGUZE6KP
ElxdgIBem9yrFX0oCaPNkY5d3MRU2i19FvBfKWKC54NbcmBjpHxxSg+WW/P7Jw+N
uegj7qcEh7RcQU4w97OW4nS+eZmnXb4O4qXZeFwhXHS/snH7p3iBApyoPlyb+KOs
np5MWRNaGFfi8BWWeVTX78U2VW8Ql8nnlRIlk/Wwm8AkVaNFQDnffKPi87paZd9o
Kl+a9broMf4v0Oq5JTxqPMzmn9zUV0rHa1VanRBnNKqTOWalmNcsfsg1Ih9PhAAT
p3u2CN0cBI7QmrcymJHrCuv0eNJRjsYa5FB4xmhJcJkD2qjsqXI=
=05F5
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are only three fairly simple patches.
The #include change to linux/swab.h addresses a userspace build issue,
and the change to the mmio tracing logic helps provide more useful
traces"
* tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard
asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info
include/uapi/linux/swab: Fix potentially missing __always_inline
This patch allows to remove TUNNEL_KEY from the tunnel flags bitmap
when using bpf_skb_set_tunnel_key by providing a BPF_F_NO_TUNNEL_KEY
flag. On egress, the resulting tunnel header will not contain a tunnel
key if the protocol and implementation supports it.
At the moment bpf_tunnel_key wants a user to specify a numeric tunnel
key. This will wrap the inner packet into a tunnel header with the key
bit and value set accordingly. This is problematic when using a tunnel
protocol that supports optional tunnel keys and a receiving tunnel
device that is not expecting packets with the key bit set. The receiver
won't decapsulate and drop the packet.
RFC 2890 and RFC 2784 GRE tunnels are examples where this flag is
useful. It allows for generating packets, that can be decapsulated by
a GRE tunnel device not operating in collect metadata mode or not
expecting the key bit set.
Signed-off-by: Christian Ehrig <cehrig@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221218051734.31411-1-cehrig@cloudflare.com
New support:
- Qualcomm SDM670, SM6115 and SM6375 GPI controller support
- Ingenic JZ4755 dmaengine support
- Removal of iop-adma driver
Updates:
- Tegra support for dma-channel-mask
- at_hdmac cleanup and virt-chan support for this driver
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmOfJXIACgkQfBQHDyUj
g0eC1RAAr8WdSJo1Vv9EmzgmxIwx5l8nuGWJ+mRBVRvsiK0bVk5a2pDN8XGnyLim
sQVd/RpH42MSYBk35GHsFaLPNvsn9h7cX3oDPwNzbQmijZsnX7ht7yHiYwamxNhC
p/hzmRtBZvGwdUrInHwKG/tiLleCBDDALyDMnRkLsJ5qPxZpeVEe5WrfTQUachLZ
yIHoOg3l+WrXkfqQwPZ5kO3pNTA0igdh3LZwZqKYXveuXVXYYG4zOLtgcJWnArG1
oLpUaJp75QPgHcknAZLDf47pwdDevV/mpv+2deoz3EbnhQc1b0bMZHRHRVc8B06w
Xgx8U51L5iZsfiaueNDQTq+amCu3OehMiG0eouiXd2CmWC/jAB7NYehyJj7003t7
2BFIWSOEIKaLaoGVb7NJssc7yrkWH3QBvdAwQorFT1FnclAQnnaC5zVRng10BvA0
Ul/W9pmHXHrk5CvGGCUzlVRHF04KoRCEifQjMzAM9rgJJSJPM1ZybAtQzDhb1HGM
5lqHmi+BpKududS9MEHiMtfpch0hYdLkBfYemd0v8qFDKNsmNwtpbbg+aS1vwsZP
lnRqCo50YVMsvCsEsjtfR0LyF0vx0ppuO8t7MKu+SIaa4YeJ9KK5JKMpnD33dJWe
jSo+OPgsE2W+14ysdqVXnZpr6Aik34PZlSd13UcrRm9S9u2yyY8=
=pURR
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New support:
- Qualcomm SDM670, SM6115 and SM6375 GPI controller support
- Ingenic JZ4755 dmaengine support
- Removal of iop-adma driver
Updates:
- Tegra support for dma-channel-mask
- at_hdmac cleanup and virt-chan support for this driver"
* tag 'dmaengine-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (46 commits)
dmaengine: Revert "dmaengine: remove s3c24xx driver"
dmaengine: tegra: Add support for dma-channel-mask
dt-bindings: dmaengine: Add dma-channel-mask to Tegra GPCDMA
dmaengine: idxd: Remove linux/msi.h include
dt-bindings: dmaengine: qcom: gpi: add compatible for SM6375
dmaengine: idxd: Fix crc_val field for completion record
dmaengine: at_hdmac: Convert driver to use virt-dma
dmaengine: at_hdmac: Remove unused member of at_dma_chan
dmaengine: at_hdmac: Rename "chan_common" to "dma_chan"
dmaengine: at_hdmac: Rename "dma_common" to "dma_device"
dmaengine: at_hdmac: Use bitfield access macros
dmaengine: at_hdmac: Keep register definitions and structures private to at_hdmac.c
dmaengine: at_hdmac: Set include entries in alphabetic order
dmaengine: at_hdmac: Use pm_ptr()
dmaengine: at_hdmac: Use devm_clk_get()
dmaengine: at_hdmac: Use devm_platform_ioremap_resource
dmaengine: at_hdmac: Use devm_kzalloc() and struct_size()
dmaengine: at_hdmac: Introduce atc_get_llis_residue()
dmaengine: at_hdmac: s/atc_get_bytes_left/atc_get_residue
dmaengine: at_hdmac: Pass residue by address to avoid unnecessary implicit casts
...
* misc. small fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE72YNB0Y/i3JqeVQT2O7X88g7+poFAmOc3c8ACgkQ2O7X88g7
+pomVg//SE7I1Mwz+mVSSqt5aKtNE46TvcwDqC4qGa4mOK58mrZ/WJ5kK5MPL3gj
EmEdJ5z0tzsOv675OLlPY28DhM6CLAHGmKkD7+Gu5cF5MxaIQNm9E24A0QPispjG
s2ntEOIc0sJBQkZxb7oMfv21OF6ENyt3m/uhUpWEo8KBMcLRXfHdnluoqr+XZIc6
K/P5qsWKwPxcFoFaW0lMlIMi8fD7mPrpl1IBHfFiwsbs+RraAQMCK9aFJHPJGRKh
/PAWf/rxO9q71sHp6WBvR0QuHE8t7D2usfbDHYVN84/fw9Gh3zWTHQYxQy/ZdRvF
OO4zFRHaOA6j3wlJ3WRJo5Bt2TWJOysjkNHbCe0X4oSILN38tmx/uyFIJB0MdBmw
7/QSV3JRlToX1Q3p7+o6Gq0bpVJkjUkVJcLeqF3yu5wQFiTBrBS/CcP8G8K14FAX
7PmzAN9mRAwNRUpAm+NCu1RP/RdwF6ugkRmk8iKpd5Jqc6wVYLPLQHFK/qzRMn5g
xSIaacD67rr7TP+AFwkWJzVnIZJKT78gB0dL53lbmfBEeZA+V2CzvJdjFCaT1s5E
qBpa15WQuM3q0CR0a5Mix6Vrz/KB+PEOKTVjY5ckFt9EHMDvhp+BcerMTnf/5Cks
HY/uhFhim+9iQt96wC+MWV55VXLxpdLU2p8gNBDbDkJ+PerXAIA=
=H70N
-----END PGP SIGNATURE-----
Merge tag 'hsi-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi
Pull HSI updates from Sebastian Reichel:
- misc small fixes
* tag 'hsi-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
HSI: omap_ssi_core: Fix error handling in ssi_init()
headers: Remove some left-over license text in include/uapi/linux/hsi/
HSI: omap_ssi_core: fix possible memory leak in ssi_probe()
HSI: omap_ssi_core: fix unbalanced pm_runtime_disable()
HSI: ssi_protocol: Fix return type of ssip_pn_xmit()
Here is the large set of char/misc and other driver subsystem changes
for 6.2-rc1. Nothing earth-shattering in here at all, just a lot of new
driver development and minor fixes. Highlights include:
- fastrpc driver updates
- iio new drivers and updates
- habanalabs driver updates for new hardware and features
- slimbus driver updates
- speakup module parameters added to aid in boot time configuration
- i2c probe_new conversions for lots of different drivers
- other small driver fixes and additions
One semi-interesting change in here is the increase of the number of
misc dynamic minors available to 1048448 to handle new huge-cpu systems.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wrdw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykSDgCdHjUHS62/UnKdB9rLtyAOFxS/6DgAn2X4Unf8
RN8Mn2mUIiBzyu5p+Zc7
=tK3S
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the large set of char/misc and other driver subsystem changes
for 6.2-rc1. Nothing earth-shattering in here at all, just a lot of
new driver development and minor fixes.
Highlights include:
- fastrpc driver updates
- iio new drivers and updates
- habanalabs driver updates for new hardware and features
- slimbus driver updates
- speakup module parameters added to aid in boot time configuration
- i2c probe_new conversions for lots of different drivers
- other small driver fixes and additions
One semi-interesting change in here is the increase of the number of
misc dynamic minors available to 1048448 to handle new huge-cpu
systems.
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (521 commits)
extcon: usbc-tusb320: Convert to i2c's .probe_new()
extcon: rt8973: Convert to i2c's .probe_new()
extcon: fsa9480: Convert to i2c's .probe_new()
extcon: max77843: Replace irqchip mask_invert with unmask_base
chardev: fix error handling in cdev_device_add()
mcb: mcb-parse: fix error handing in chameleon_parse_gdd()
drivers: mcb: fix resource leak in mcb_probe()
coresight: etm4x: fix repeated words in comments
coresight: cti: Fix null pointer error on CTI init before ETM
coresight: trbe: remove cpuhp instance node before remove cpuhp state
counter: stm32-lptimer-cnt: fix the check on arr and cmp registers update
misc: fastrpc: Add dma_mask to fastrpc_channel_ctx
misc: fastrpc: Add mmap request assigning for static PD pool
misc: fastrpc: Safekeep mmaps on interrupted invoke
misc: fastrpc: Add support for audiopd
misc: fastrpc: Rework fastrpc_req_munmap
misc: fastrpc: Use fastrpc_map_put in fastrpc_map_create on fail
misc: fastrpc: Add fastrpc_remote_heap_alloc
misc: fastrpc: Add reserved mem support
misc: fastrpc: Rename audio protection domain to root
...
Here is the "big" set of tty/serial driver changes for 6.2-rc1.
As in previous kernel releases, nothing big here at all, just some small
incremental serial/tty layer cleanups and some individual driver
additions and fixes. Highlights are:
- serial helper macros from Jiri Slaby to reduce the amount of
duplicated code in serial drivers
- api cleanups and consolidations from Ilpo Järvinen in lots of serial
drivers
- the usual set of n_gsm fixes from Daniel Starke as that code gets
exercised more
- TIOCSTI is finally able to be disabled if requested (security
hardening feature from Kees Cook)
- fsl_lpuart driver fixes and features added
- other small serial driver additions and fixes
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wt5A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynSiACffLCC9SFnP1ClJJh73OpCN11alDAAoIfMct2A
/zrJ4on7o0YQU1Ilg2+k
=eD/I
-----END PGP SIGNATURE-----
Merge tag 'tty-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the "big" set of tty/serial driver changes for 6.2-rc1.
As in previous kernel releases, nothing big here at all, just some
small incremental serial/tty layer cleanups and some individual driver
additions and fixes. Highlights are:
- serial helper macros from Jiri Slaby to reduce the amount of
duplicated code in serial drivers
- api cleanups and consolidations from Ilpo Järvinen in lots of
serial drivers
- the usual set of n_gsm fixes from Daniel Starke as that code gets
exercised more
- TIOCSTI is finally able to be disabled if requested (security
hardening feature from Kees Cook)
- fsl_lpuart driver fixes and features added
- other small serial driver additions and fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (97 commits)
serial: atmel: don't stop the transmitter when doing PIO
serial: atmel: cleanup atmel_start+stop_tx()
tty: serial: fsl_lpuart: switch to new dmaengine_terminate_* API
serial: sunsab: Fix error handling in sunsab_init()
serial: altera_uart: fix locking in polling mode
serial: pch: Fix PCI device refcount leak in pch_request_dma()
tty: serial: fsl_lpuart: Use pm_ptr() to avoid need to make pm __maybe_unused
tty: serial: fsl_lpuart: Add runtime pm support
tty: serial: fsl_lpuart: enable wakeup source for lpuart
serdev: Replace poll loop by readx_poll_timeout() macro
tty: synclink_gt: unwind actions in error path of net device open
serial: stm32: move dma_request_chan() before clk_prepare_enable()
dt-bindings: serial: xlnx,opb-uartlite: Drop 'contains' from 'xlnx,use-parity'
serial: pl011: Do not clear RX FIFO & RX interrupt in unthrottle.
serial: amba-pl011: avoid SBSA UART accessing DMACR register
tty: serial: altera_jtaguart: remove struct altera_jtaguart
tty: serial: altera_jtaguart: use uart_port::read_status_mask
tty: serial: altera_jtaguart: remove unused altera_jtaguart::sigs
tty: serial: altera_jtaguart: remove flag from altera_jtaguart_rx_chars()
n_tty: Rename tail to old_tail in n_tty_read()
...
Here is the large set of USB and Thunderbolt driver changes for 6.2-rc1.
Overall, thanks to the removal of a driver, more lines were removed than
added, a nice change. Highlights include:
- removal of the sisusbvga driver that was not used by anyone anymore
- minor thunderbolt driver changes and tweaks
- chipidea driver updates
- usual set of typec driver features and hardware support added
- musb minor driver fixes
- fotg210 driver fixes, bringing that hardware back from the "dead"
- minor dwc3 driver updates
- addition, and then removal, of a list.h helper function for many USB
and other subsystem drivers, that ended up breaking the build. That
will come back for 6.3-rc1, it missed this merge window.
- usual xhci updates and enhancements
- usb-serial driver updates and support for new devices
- other minor USB driver updates
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wvYg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl5DACgssl/ag4zDePHpfoiG5zEGEzH8XsAoMFrzvzu
d43hsH3qsfDGSZRkJJMu
=ORDd
-----END PGP SIGNATURE-----
Merge tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and Thunderbolt driver updates from Greg KH:
"Here is the large set of USB and Thunderbolt driver changes for
6.2-rc1. Overall, thanks to the removal of a driver, more lines were
removed than added, a nice change. Highlights include:
- removal of the sisusbvga driver that was not used by anyone anymore
- minor thunderbolt driver changes and tweaks
- chipidea driver updates
- usual set of typec driver features and hardware support added
- musb minor driver fixes
- fotg210 driver fixes, bringing that hardware back from the "dead"
- minor dwc3 driver updates
- addition, and then removal, of a list.h helper function for many
USB and other subsystem drivers, that ended up breaking the build.
That will come back for 6.3-rc1, it missed this merge window.
- usual xhci updates and enhancements
- usb-serial driver updates and support for new devices
- other minor USB driver updates
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (153 commits)
usb: gadget: uvc: Rename bmInterfaceFlags -> bmInterlaceFlags
usb: dwc2: power on/off phy for peripheral mode in dual-role mode
usb: dwc2: disable lpm feature on Rockchip SoCs
dt-bindings: usb: mtk-xhci: add support for mt7986
usb: dwc3: core: defer probe on ulpi_read_id timeout
usb: ulpi: defer ulpi_register on ulpi_read_id timeout
usb: misc: onboard_usb_hub: add Genesys Logic GL850G hub support
dt-bindings: usb: Add binding for Genesys Logic GL850G hub controller
dt-bindings: vendor-prefixes: add Genesys Logic
usb: fotg210-udc: fix potential memory leak in fotg210_udc_probe()
usb: typec: tipd: Set mode of operation for USB Type-C connector
usb: gadget: udc: drop obsolete dependencies on COMPILE_TEST
usb: musb: remove extra check in musb_gadget_vbus_draw
usb: gadget: uvc: Prevent buffer overflow in setup handler
usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init
usb: typec: wusb3801: fix fwnode refcount leak in wusb3801_probe()
usb: storage: Add check for kcalloc
USB: sisusbvga: use module_usb_driver()
USB: sisusbvga: rename sisusb.c to sisusbvga.c
USB: sisusbvga: remove console support
...
- Replace deprecated git://github.com link in MAINTAINERS. (Palmer Dabbelt)
- Simplify vfio/mlx5 with module_pci_driver() helper. (Shang XiaoJing)
- Drop unnecessary buffer from ACPI call. (Rafael Mendonca)
- Correct latent missing include issue in iova-bitmap and fix support
for unaligned bitmaps. Follow-up with better fix through refactor.
(Joao Martins)
- Rework ccw mdev driver to split private data from parent structure,
better aligning with the mdev lifecycle and allowing us to remove
a temporary workaround. (Eric Farman)
- Add an interface to get an estimated migration data size for a device,
allowing userspace to make informed decisions, ex. more accurately
predicting VM downtime. (Yishai Hadas)
- Fix minor typo in vfio/mlx5 array declaration. (Yishai Hadas)
- Simplify module and Kconfig through consolidating SPAPR/EEH code and
config options and folding virqfd module into main vfio module.
(Jason Gunthorpe)
- Fix error path from device_register() across all vfio mdev and sample
drivers. (Alex Williamson)
- Define migration pre-copy interface and implement for vfio/mlx5
devices, allowing portions of the device state to be saved while the
device continues operation, towards reducing the stop-copy state
size. (Jason Gunthorpe, Yishai Hadas, Shay Drory)
- Implement pre-copy for hisi_acc devices. (Shameer Kolothum)
- Fixes to mdpy mdev driver remove path and error path on probe.
(Shang XiaoJing)
- vfio/mlx5 fixes for incorrect return after copy_to_user() fault and
incorrect buffer freeing. (Dan Carpenter)
-----BEGIN PGP SIGNATURE-----
iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmObfPgbHGFsZXgud2ls
bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiDogP/i9GuBKposvZpnfxXWwo
oNpKBZSOVMW8wgavNEuryMb+9WoouIghce8XU49MmONoP26kIh5TA14Zpi3XWkLK
K+NlpwicESvLeZVHU7f3R8meVqmPtlxIi59jE+CfEHB8BW2HIAsEdwdhkxMwus9C
nuiiK/2YYyQWOXYc4LAIkspMzjtGPy6Im5P6AED+dI+TFCEqJAM5qgOLJZFlk4a/
WwZY2xjVKOl6xf5VZXGw+v7fDgz2Ju+j4Bm3X5lx1HgiDrEH83MjXY5h67neAIVb
bXrfNLN++MiuO5niGTFMbUjGVUIFxsfmJzBnL9QrLsuj0JrGEKsu/1JEO78g0Km0
ZCChoJ6UyUOgxt6evEymUAZAAkbcKaaht2gdbAXW71tv9p1TripAbBKwVeah1bQp
SiHPqy9InKJlhaf+GbXL9eux1WVMfQ6FZccU16bNt7VaV2I8js85z/2gqVD0a5Mw
+gnwp5XMUFWNKlJrnc7uVCD0bDExwQhr75OP4rWjMNvvLi9hPXJ2cI2Sg+9OLzQw
vm/I+Df+FfXCuGAgX4Lxq76pqWlYGJH0Qxc14Ds6YoXqygBPz9yvTtuBv8mTHJzE
KdAl/6DmZZxZ/JFD9lPF80KRiAsJ6iNf6tPTWES7hfDBfIdgQ/DZbXridLWJPNoi
xLfaW19yrLTXWKSmR7G2Lsz4
=q9xs
-----END PGP SIGNATURE-----
Merge tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Replace deprecated git://github.com link in MAINTAINERS (Palmer
Dabbelt)
- Simplify vfio/mlx5 with module_pci_driver() helper (Shang XiaoJing)
- Drop unnecessary buffer from ACPI call (Rafael Mendonca)
- Correct latent missing include issue in iova-bitmap and fix support
for unaligned bitmaps. Follow-up with better fix through refactor
(Joao Martins)
- Rework ccw mdev driver to split private data from parent structure,
better aligning with the mdev lifecycle and allowing us to remove a
temporary workaround (Eric Farman)
- Add an interface to get an estimated migration data size for a
device, allowing userspace to make informed decisions, ex. more
accurately predicting VM downtime (Yishai Hadas)
- Fix minor typo in vfio/mlx5 array declaration (Yishai Hadas)
- Simplify module and Kconfig through consolidating SPAPR/EEH code and
config options and folding virqfd module into main vfio module (Jason
Gunthorpe)
- Fix error path from device_register() across all vfio mdev and sample
drivers (Alex Williamson)
- Define migration pre-copy interface and implement for vfio/mlx5
devices, allowing portions of the device state to be saved while the
device continues operation, towards reducing the stop-copy state size
(Jason Gunthorpe, Yishai Hadas, Shay Drory)
- Implement pre-copy for hisi_acc devices (Shameer Kolothum)
- Fixes to mdpy mdev driver remove path and error path on probe (Shang
XiaoJing)
- vfio/mlx5 fixes for incorrect return after copy_to_user() fault and
incorrect buffer freeing (Dan Carpenter)
* tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio: (42 commits)
vfio/mlx5: error pointer dereference in error handling
vfio/mlx5: fix error code in mlx5vf_precopy_ioctl()
samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe()
hisi_acc_vfio_pci: Enable PRE_COPY flag
hisi_acc_vfio_pci: Move the dev compatibility tests for early check
hisi_acc_vfio_pci: Introduce support for PRE_COPY state transitions
hisi_acc_vfio_pci: Add support for precopy IOCTL
vfio/mlx5: Enable MIGRATION_PRE_COPY flag
vfio/mlx5: Fallback to STOP_COPY upon specific PRE_COPY error
vfio/mlx5: Introduce multiple loads
vfio/mlx5: Consider temporary end of stream as part of PRE_COPY
vfio/mlx5: Introduce vfio precopy ioctl implementation
vfio/mlx5: Introduce SW headers for migration states
vfio/mlx5: Introduce device transitions of PRE_COPY
vfio/mlx5: Refactor to use queue based data chunks
vfio/mlx5: Refactor migration file state
vfio/mlx5: Refactor MKEY usage
vfio/mlx5: Refactor PD usage
vfio/mlx5: Enforce a single SAVE command at a time
vfio: Extend the device migration protocol with PRE_COPY
...
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
* Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
* Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d:
"Fix a number of issues with MTE, such as races on the tags being
initialised vs the PG_mte_tagged flag as well as the lack of support
for VM_SHARED when KVM is involved. Patches from Catalin Marinas and
Peter Collingbourne").
* Merge the pKVM shadow vcpu state tracking that allows the hypervisor
to have its own view of a vcpu, keeping that state private.
* Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
* Fix a handful of minor issues around 52bit VA/PA support (64kB pages
only) as a prefix of the oncoming support for 4kB and 16kB pages.
* Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
* Second batch of the lazy destroy patches
* First batch of KVM changes for kernel virtual != physical address support
* Removal of a unused function
x86:
* Allow compiling out SMM support
* Cleanup and documentation of SMM state save area format
* Preserve interrupt shadow in SMM state save area
* Respond to generic signals during slow page faults
* Fixes and optimizations for the non-executable huge page errata fix.
* Reprogram all performance counters on PMU filter change
* Cleanups to Hyper-V emulation and tests
* Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
running on top of a L1 Hyper-V hypervisor)
* Advertise several new Intel features
* x86 Xen-for-KVM:
** Allow the Xen runstate information to cross a page boundary
** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
** Add support for 32-bit guests in SCHEDOP_poll
* Notable x86 fixes and cleanups:
** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
years back when eliminating unnecessary barriers when switching between
vmcs01 and vmcs02.
** Clean up vmread_error_trampoline() to make it more obvious that params
must be passed on the stack, even for x86-64.
** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
of the current guest CPUID.
** Fudge around a race with TSC refinement that results in KVM incorrectly
thinking a guest needs TSC scaling when running on a CPU with a
constant TSC, but no hardware-enumerated TSC frequency.
** Advertise (on AMD) that the SMM_CTL MSR is not supported
** Remove unnecessary exports
Generic:
* Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
* Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
* Fix build errors that occur in certain setups (unsure exactly what is
unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
* Introduce actual atomics for clear/set_bit() in selftests
* Add support for pinning vCPUs in dirty_log_perf_test.
* Rename the so called "perf_util" framework to "memstress".
* Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress tests.
* Add a common ucall implementation; code dedup and pre-work for running
SEV (and beyond) guests in selftests.
* Provide a common constructor and arch hook, which will eventually be
used by x86 to automatically select the right hypercall (AMD vs. Intel).
* A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
breakpoints, stage-2 faults and access tracking.
* x86-specific selftest changes:
** Clean up x86's page table management.
** Clean up and enhance the "smaller maxphyaddr" test, and add a related
test to cover generic emulation failure.
** Clean up the nEPT support checks.
** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
** Fix an ordering issue in the AMX test introduced by recent conversions
to use kvm_cpu_has(), and harden the code to guard against similar bugs
in the future. Anything that tiggers caching of KVM's supported CPUID,
kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
the caching occurs before the test opts in via prctl().
Documentation:
* Remove deleted ioctls from documentation
* Clean up the docs for the x86 MSR filter.
* Various fixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
=QAYX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on (see merge
commit 382b5b87a97d: "Fix a number of issues with MTE, such as
races on the tags being initialised vs the PG_mte_tagged flag as
well as the lack of support for VM_SHARED when KVM is involved.
Patches from Catalin Marinas and Peter Collingbourne").
- Merge the pKVM shadow vcpu state tracking that allows the
hypervisor to have its own view of a vcpu, keeping that state
private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB
pages only) as a prefix of the oncoming support for 4kB and 16kB
pages.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address
support
- Removal of a unused function
x86:
- Allow compiling out SMM support
- Cleanup and documentation of SMM state save area format
- Preserve interrupt shadow in SMM state save area
- Respond to generic signals during slow page faults
- Fixes and optimizations for the non-executable huge page errata
fix.
- Reprogram all performance counters on PMU filter change
- Cleanups to Hyper-V emulation and tests
- Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
guest running on top of a L1 Hyper-V hypervisor)
- Advertise several new Intel features
- x86 Xen-for-KVM:
- Allow the Xen runstate information to cross a page boundary
- Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
- Add support for 32-bit guests in SCHEDOP_poll
- Notable x86 fixes and cleanups:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
- Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
a few years back when eliminating unnecessary barriers when
switching between vmcs01 and vmcs02.
- Clean up vmread_error_trampoline() to make it more obvious that
params must be passed on the stack, even for x86-64.
- Let userspace set all supported bits in MSR_IA32_FEAT_CTL
irrespective of the current guest CPUID.
- Fudge around a race with TSC refinement that results in KVM
incorrectly thinking a guest needs TSC scaling when running on a
CPU with a constant TSC, but no hardware-enumerated TSC
frequency.
- Advertise (on AMD) that the SMM_CTL MSR is not supported
- Remove unnecessary exports
Generic:
- Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
- Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
- Fix build errors that occur in certain setups (unsure exactly what
is unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
- Introduce actual atomics for clear/set_bit() in selftests
- Add support for pinning vCPUs in dirty_log_perf_test.
- Rename the so called "perf_util" framework to "memstress".
- Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress
tests.
- Add a common ucall implementation; code dedup and pre-work for
running SEV (and beyond) guests in selftests.
- Provide a common constructor and arch hook, which will eventually
be used by x86 to automatically select the right hypercall (AMD vs.
Intel).
- A bunch of added/enabled/fixed selftests for ARM64, covering
memslots, breakpoints, stage-2 faults and access tracking.
- x86-specific selftest changes:
- Clean up x86's page table management.
- Clean up and enhance the "smaller maxphyaddr" test, and add a
related test to cover generic emulation failure.
- Clean up the nEPT support checks.
- Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
- Fix an ordering issue in the AMX test introduced by recent
conversions to use kvm_cpu_has(), and harden the code to guard
against similar bugs in the future. Anything that tiggers
caching of KVM's supported CPUID, kvm_cpu_has() in this case,
effectively hides opt-in XSAVE features if the caching occurs
before the test opts in via prctl().
Documentation:
- Remove deleted ioctls from documentation
- Clean up the docs for the x86 MSR filter.
- Various fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
KVM: x86: Add proper ReST tables for userspace MSR exits/flags
KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
KVM: arm64: selftests: Align VA space allocator with TTBR0
KVM: arm64: Fix benign bug with incorrect use of VA_BITS
KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
KVM: x86: Advertise that the SMM_CTL MSR is not supported
KVM: x86: remove unnecessary exports
KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
tools: KVM: selftests: Convert clear/set_bit() to actual atomics
tools: Drop "atomic_" prefix from atomic test_and_set_bit()
tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
perf tools: Use dedicated non-atomic clear/set bit helpers
tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
KVM: arm64: selftests: Enable single-step without a "full" ucall()
KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
KVM: Remove stale comment about KVM_REQ_UNHALT
KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
KVM: Reference to kvm_userspace_memory_region in doc and comments
KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
...
In this round, we've added two features: 1) F2FS_IOC_START_ATOMIC_REPLACE and
2) per-block age-based extent cache. 1) is a variant of the previous atomic
write feature which guarantees a per-file atomicity. It would be more efficient
than AtomicFile implementation in Android framework. 2) implements another type
of extent cache in memory which keeps the per-block age in a file, so that block
allocator could split the hot and cold data blocks more accurately.
Enhancement:
- introduce F2FS_IOC_START_ATOMIC_REPLACE
- refactor extent_cache to add a new per-block-age-based extent cache support
- introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
- add proc entry to show discard_plist info
- optimize iteration over sparse directories
- add barrier mount option
Bug fix
- avoid victim selection from previous victim section
- fix to enable compress for newly created file if extension matches
- set zstd compress level correctly
- initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
- correct i_size change for atomic writes
- allow to read node block after shutdown
- allow to set compression for inlined file
- fix gc mode when gc_urgent_high_remaining is 1
- should put a page when checking the summary info
Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and doc.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmOaTNUACgkQQBSofoJI
UNIQnw//V7Q8DUHw5YNj04jutwXH2DNMLAmn/NJh5S6dIzy/LiywlSzVg53/0/FP
4K577urUkIhgilRO+yncUMSnSQk7BluQvGSx4ja2AV+dpDomjxM3GwIacGzSvr7D
VfVf8Vig10UEFrrtEEKtv1VFlYHAmo8lLpubzrZHV8aZFLHHYO2fakQhPu8BYsaz
eGCJwxjvTZcQUPkaeG9tWto3ChI3F6PzreiQ5TztHhLWSEgw/o0qijpsc+2SthaV
my7uGjeBY8EGPeSYbeCxRtdx8g8Qu11K3ISuDj8zBybmjG3IWOGt1CVcrY6tZbal
aL70CMtHkMqMn03VqbpCTqBtdWNMrrw5sYSL3qXIUdXlX/2yJBh9fLAeNxKNs5Nu
6veSb2WgYMHqIsClkAAcP0xJ8g6kodGoG60wVr4ek0Vdt4osaQqwq+bnffpwwxtQ
F+7aRuinv+rdrHJ4CuFXAmHPKh2lBe2lTTWZEKg2RptTxZ5DhD2Qn6x1khPD2GFA
mG2Aeiq6PVxxEeIO+w/VBCuAgpGTFV2N/ZIF8VfjFNdWiN5OGLWQNHC2KGj2G2uV
+fA+B91txQWtjY9h72YJb2+aGIixcnLY24ni4mDgDItqtpCB4PW56W8cbnbv9Pl+
aXAWdADqJdDyllHoVB/JQ24gr2fATJGRIDeYDnw+vPP4f5ZT5vg=
=f00t
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added two features: F2FS_IOC_START_ATOMIC_REPLACE
and a per-block age-based extent cache.
F2FS_IOC_START_ATOMIC_REPLACE is a variant of the previous atomic
write feature which guarantees a per-file atomicity. It would be more
efficient than AtomicFile implementation in Android framework.
The per-block age-based extent cache implements another type of extent
cache in memory which keeps the per-block age in a file, so that block
allocator could split the hot and cold data blocks more accurately.
Enhancements:
- introduce F2FS_IOC_START_ATOMIC_REPLACE
- refactor extent_cache to add a new per-block-age-based extent cache support
- introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
- add proc entry to show discard_plist info
- optimize iteration over sparse directories
- add barrier mount option
Bug fixes:
- avoid victim selection from previous victim section
- fix to enable compress for newly created file if extension matches
- set zstd compress level correctly
- initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
- correct i_size change for atomic writes
- allow to read node block after shutdown
- allow to set compression for inlined file
- fix gc mode when gc_urgent_high_remaining is 1
- should put a page when checking the summary info
Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and
doc"
* tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
f2fs: reset wait_ms to default if any of the victims have been selected
f2fs: fix some format WARNING in debug.c and sysfs.c
f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
f2fs: fix iostat parameter for discard
f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
f2fs: add block_age-based extent cache
f2fs: allocate the extent_cache by default
f2fs: refactor extent_cache to support for read and more
f2fs: remove unnecessary __init_extent_tree
f2fs: move internal functions into extent_cache.c
f2fs: specify extent cache for read explicitly
f2fs: introduce f2fs_is_readonly() for readability
f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
f2fs: do some cleanup for f2fs module init
MAINTAINERS: Add f2fs bug tracker link
f2fs: remove the unused flush argument to change_curseg
f2fs: open code allocate_segment_by_default
f2fs: remove struct segment_allocation default_salloc_ops
f2fs: introduce discard_urgent_util sysfs node
f2fs: define MIN_DISCARD_GRANULARITY macro
...
API:
- Optimise away self-test overhead when they are disabled.
- Support symmetric encryption via keyring keys in af_alg.
- Flip hwrng default_quality, the default is now maximum entropy.
Algorithms:
- Add library version of aesgcm.
- CFI fixes for assembly code.
- Add arm/arm64 accelerated versions of sm3/sm4.
Drivers:
- Remove assumption on arm64 that kmalloc is DMA-aligned.
- Fix selftest failures in rockchip.
- Add support for RK3328/RK3399 in rockchip.
- Add deflate support in qat.
- Merge ux500 into stm32.
- Add support for TEE for PCI ID 0x14CA in ccp.
- Add mt7986 support in mtk.
- Add MaxLinear platform support in inside-secure.
- Add NPCM8XX support in npcm.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmOZhNQACgkQxycdCkmx
i6edOQ/+IHYe2Z+fLsMGs0qgTVaEV33O0crTRl/PMkfBJai57grz6x/G9QrkwGHS
084u4RmwhVrE7Z/pxvey48m0lHMw3H/ElLTRl5LV1zE2OtGgr4VV63wtqthu1QS1
KblVnjb52DhFhvF1O1IrK9lxyX0lByOiARFVdyZR6+Rb66Xfq8rqk5t8U8mmTUFz
ds9S2Un4HajgtjNEyI78DOX8o4wVST8tltQs0eVii6T9AeXgSgX37ytD7Xtg/zrz
/p61KFgKBQkRT7EEGD6xgNrND0vNAp2w98ZTTRXTZI8+Y0aTUcTYya7cXOLBt9bQ
rA7z9sNKvmwJijTMV6O9eqRGcYfzc2G4qfMhlQqj/P2pjLnEZXdvFNHTTbclR76h
2UFlZXPDQVQukvnNNnB6bmIvv6DsM+jmGH0pK5BnBJXnD5SOZh1RqjJxw0Kj6QCM
VxpKDvfStux2Guh6mz1lJna/S44qKy/sVYkWUawcmE4RF2+GfNayM1GUpEUofndE
vz1yZdgLPETSh5QzKrjFkUAnqo/AsAdc5Qxroz9DRz1BCC0GCuIxjUG8ScTWgcth
R/reQDczBckCNpPxrWPHHYoVXnAMwEFySfcjZyuCoMO6t6qVUvcjRShCyKwO/JPl
9YREdRmq0swwIB9cFIrEoWrzc3wjjBtsltDFlkKsa9c92LXoW+g=
=OpWt
-----END PGP SIGNATURE-----
Merge tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Optimise away self-test overhead when they are disabled
- Support symmetric encryption via keyring keys in af_alg
- Flip hwrng default_quality, the default is now maximum entropy
Algorithms:
- Add library version of aesgcm
- CFI fixes for assembly code
- Add arm/arm64 accelerated versions of sm3/sm4
Drivers:
- Remove assumption on arm64 that kmalloc is DMA-aligned
- Fix selftest failures in rockchip
- Add support for RK3328/RK3399 in rockchip
- Add deflate support in qat
- Merge ux500 into stm32
- Add support for TEE for PCI ID 0x14CA in ccp
- Add mt7986 support in mtk
- Add MaxLinear platform support in inside-secure
- Add NPCM8XX support in npcm"
* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
crypto: ux500/cryp - delete driver
crypto: stm32/cryp - enable for use with Ux500
crypto: stm32 - enable drivers to be used on Ux500
dt-bindings: crypto: Let STM32 define Ux500 CRYP
hwrng: geode - Fix PCI device refcount leak
hwrng: amd - Fix PCI device refcount leak
crypto: qce - Set DMA alignment explicitly
crypto: octeontx2 - Set DMA alignment explicitly
crypto: octeontx - Set DMA alignment explicitly
crypto: keembay - Set DMA alignment explicitly
crypto: safexcel - Set DMA alignment explicitly
crypto: hisilicon/hpre - Set DMA alignment explicitly
crypto: chelsio - Set DMA alignment explicitly
crypto: ccree - Set DMA alignment explicitly
crypto: ccp - Set DMA alignment explicitly
crypto: cavium - Set DMA alignment explicitly
crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
crypto: arm64/ghash-ce - use frame_push/pop macros consistently
crypto: arm64/crct10dif - use frame_push/pop macros consistently
crypto: arm64/aes-modes - use frame_push/pop macros consistently
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmOYpTIUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxuZhAAhGjE8voLZeOYwxbvfL69hGTAZ+Me
x2hqRWVhh/IGWXTTaoSLwSjMMokcmAKN5S/wv8qdCG5sB8EN8FyTBIZDy8PuRRdl
8UlqlBMSL+d4oSRDCnYLxFNcynLRNnmx2dfcdw9tJ4zjTLN8Y4o8PHFogR6pJ3MT
sDC8S0myTQKXr4wAGzTZycKsiGManviYtByp6dCcKD3Oy5Q2uZ9OKO2DP2yQpn+F
c3IJSV9oDz3KR8JVJ5Q1iz9cdMXbGwjkM3JLlHpxhedwjN4ErLumPutKcebtzO5C
aTqabN7Nnzc4yJusAIfojFCWH7fgaYUyJ3pxcFyJ4tu4m9Last+2I5UB/kV2sYAD
jWiCYx3sA/mRopNXOnrBGae+Lgy+sQnt8or0grySr0bK+b+ArAGis4uT4A0uASGO
RUQdIQwz7zhHeQrwAladHWxnx4BEDNCatgfn38p4fklIYKydCY5nfZURMDvHezSR
G6Nu08hoE9ZXlmkWTFw+5F23wPWKcCpzZj0hf7OroIouXUp8vqSFSqatH5vGkbCl
bDswck9GdRJ2hl5SvFOeelaXkM42du45TMLU2JmIn6dYYFNrO93JgdvKSU7E2CpG
AmDIpg1Idxo8fEPPGH1I7RVU5+ilzmmPQQY7poQW+va4/dEd/QVp1+ZZTDnMC1qk
qi3ck22VdvPU2VU=
=KULr
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and
make more things static.
- Make portdrv bind to Switch Ports that have AER. Previously, if
these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant
the Ports couldn't be suspended to low-power states. AER on these
Ports doesn't use interrupts, and the AER driver doesn't need to
claim them.
- Assign PCI domain IDs using ida_alloc(), which makes host bridge
add/remove work better.
Resource management:
- To work better with recent BIOSes that use EfiMemoryMappedIO for
PCI host bridge apertures, remove those regions from the E820 map
(E820 entries normally prevent us from allocating BARs). In v5.19,
we added some quirks to disable E820 checking, but that's not very
maintainable. EfiMemoryMappedIO means the OS needs to map the
region for use by EFI runtime services; it shouldn't prevent OS
from using it.
PCIe native device hotplug:
- Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4
PCIe tunneling depends on native PCIe hotplug.
- Enable Command Completed Interrupt only if supported to avoid user
confusion from lspci output that says this is enabled but not
supported.
- Prevent pciehp from binding to Switch Upstream Ports; this happened
because of interaction with acpiphp and caused devices below the
Upstream Port to disappear.
Power management:
- Convert AGP drivers to generic power management. We hope to remove
legacy power management from the PCI core eventually.
Virtualization:
- Fix pci_device_is_present(), which previously always returned
"false" for VFs, causing virtio hangs when unbinding the driver.
Miscellaneous:
- Convert drivers to gpiod API to prepare for dropping some legacy
code.
- Fix DOE fencepost error for the maximum data object length.
Baikal-T1 PCIe controller driver:
- Add driver and DT bindings.
Broadcom STB PCIe controller driver:
- Enable Multi-MSI.
- Delay 100ms after PERST# deassert to allow power and clocks to
stabilize.
- Configure Read Completion Boundary to 64 bytes.
Freescale i.MX6 PCIe controller driver:
- Initialize PHY before deasserting core reset to fix a regression in
v6.0 on boards where the PHY provides the reference.
- Fix imx6sx and imx8mq clock names in DT schema.
Intel VMD host bridge driver:
- Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe
SSDs in VT-d pass-through scenarios.
- Disable MSI remapping, which gets re-enabled by firmware during
suspend/resume.
MediaTek PCIe Gen3 controller driver:
- Add MT7986 and MT8195 support.
Qualcomm PCIe controller driver:
- Add SC8280XP/SA8540P basic interconnect support.
Rockchip DesignWare PCIe controller driver:
- Base DT schema on common Synopsys schema.
Synopsys DesignWare PCIe core:
- Collect DT items shared between Root Port and Endpoint (PERST GPIO,
PHY info, clocks, resets, link speed, number of lanes, number of
iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml.
- Add dma-ranges support for Root Ports and Endpoints.
- Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to
reduce code duplication.
- Add generic names for clocks and resets to encourage more
consistent naming across drivers using DesignWare IP.
- Stop advertising PTM Responder role for Endpoints, which aren't
allowed to be responders.
TI J721E PCIe driver:
- Add j721s2 host mode ID to DT schema.
- Add interrupt properties to DT schema.
Toshiba Visconti PCIe controller driver:
- Fix interrupts array max constraints in DT schema"
* tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits)
x86/PCI: Use pr_info() when possible
x86/PCI: Fix log message typo
x86/PCI: Tidy E820 removal messages
PCI: Skip allocate_resource() if too little space available
efi/x86: Remove EfiMemoryMappedIO from E820 map
PCI/portdrv: Allow AER service only for Root Ports & RCECs
PCI: xilinx-nwl: Fix coding style violations
PCI: mvebu: Switch to using gpiod API
PCI: pciehp: Enable Command Completed Interrupt only if supported
PCI: aardvark: Switch to using devm_gpiod_get_optional()
dt-bindings: PCI: mediatek-gen3: add support for mt7986
dt-bindings: PCI: mediatek-gen3: add SoC based clock config
dt-bindings: PCI: qcom: Allow 'dma-coherent' property
PCI: mt7621: Add sentinel to quirks table
PCI: vmd: Fix secondary bus reset for Intel bridges
PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning
PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db
PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)
PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member
PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path
...
iommufd is the user API to control the IOMMU subsystem as it relates to
managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU device
specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance the
combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a
guest. Dirty tracking enables VM live migration with SRIOV devices and
PASID support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs, which
is currently VFIO and VDPA.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF
YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl
s7fAu+CQ1pr9+9NKGevD+frw8Solsw4=
=jJkd
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd implementation from Jason Gunthorpe:
"iommufd is the user API to control the IOMMU subsystem as it relates
to managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU
device specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance
the combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a guest.
Dirty tracking enables VM live migration with SRIOV devices and PASID
support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs,
which is currently VFIO and VDPA"
For more background, see the extended explanations in Jason's pull request:
https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
iommufd: Change the order of MSI setup
iommufd: Improve a few unclear bits of code
iommufd: Fix comment typos
vfio: Move vfio group specific code into group.c
vfio: Refactor dma APIs for emulated devices
vfio: Wrap vfio group module init/clean code into helpers
vfio: Refactor vfio_device open and close
vfio: Make vfio_device_open() truly device specific
vfio: Swap order of vfio_device_container_register() and open_device()
vfio: Set device->group in helper function
vfio: Create wrappers for group register/unregister
vfio: Move the sanity check of the group to vfio_create_group()
vfio: Simplify vfio_create_group()
iommufd: Allow iommufd to supply /dev/vfio/vfio
vfio: Make vfio_container optionally compiled
vfio: Move container related MODULE_ALIAS statements into container.c
vfio-iommufd: Support iommufd for emulated VFIO devices
vfio-iommufd: Support iommufd for physical VFIO devices
vfio-iommufd: Allow iommufd to be used in place of a container fd
vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
...
Core
----
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, improving considerably the
performances of complex queue discipline configurations.
- Add inet drop monitor support.
- A few GRO performance improvements.
- Add infrastructure for atomic dev stats, addressing long standing
data races.
- De-duplicate common code between OVS and conntrack offloading
infrastructure.
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements.
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up
the workload with the number of available CPUs.
- Add the helper support for connection-tracking OVS offload.
BPF
---
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF.
- Make cgroup local storage available to non-cgroup attached BPF
programs.
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers.
- A relevant bunch of BPF verifier fixes and improvements.
- Veristat tool improvements to support custom filtering, sorting,
and replay of results.
- Add LLVM disassembler as default library for dumping JITed code.
- Lots of new BPF documentation for various BPF maps.
- Add bpf_rcu_read_{,un}lock() support for sleepable programs.
- Add RCU grace period chaining to BPF to wait for the completion
of access from both sleepable and non-sleepable BPF programs.
- Add support storing struct task_struct objects as kptrs in maps.
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values.
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions.
Protocols
---------
- TCP: implement Protective Load Balancing across switch links.
- TCP: allow dynamically disabling TCP-MD5 static key, reverting
back to fast[er]-path.
- UDP: Introduce optional per-netns hash lookup table.
- IPv6: simplify and cleanup sockets disposal.
- Netlink: support different type policies for each generic
netlink operation.
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support.
- MPTCP: add netlink notification support for listener sockets
events.
- SCTP: add VRF support, allowing sctp sockets binding to VRF
devices.
- Add bridging MAC Authentication Bypass (MAB) support.
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios.
- More work for Wi-Fi 7 support, comprising conversion of all
the existing drivers to internal TX queue usage.
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading.
- IPSec: extended ack support for more descriptive XFRM error
reporting.
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, reducing considerably the
required locking.
- IEEE 802154: synchronous send frame and extended filtering
support, initial support for scanning available 15.4 networks.
- Tun: bump the link speed from 10Mbps to 10Gbps.
- Tun/VirtioNet: implement UDP segmentation offload support.
Driver API
----------
- PHY/SFP: improve power level switching between standard
level 1 and the higher power levels.
- New API for netdev <-> devlink_port linkage.
- PTP: convert existing drivers to new frequency adjustment
implementation.
- DSA: add support for rx offloading.
- Autoload DSA tagging driver when dynamically changing protocol.
- Add new PCP and APPTRUST attributes to Data Center Bridging.
- Add configuration support for 800Gbps link speed.
- Add devlink port function attribute to enable/disable RoCE and
migratable.
- Extend devlink-rate to support strict prioriry and weighted fair
queuing.
- Add devlink support to directly reading from region memory.
- New device tree helper to fetch MAC address from nvmem.
- New big TCP helper to simplify temporary header stripping.
New hardware / drivers
----------------------
- Ethernet:
- Marvel Octeon CNF95N and CN10KB Ethernet Switches.
- Marvel Prestera AC5X Ethernet Switch.
- WangXun 10 Gigabit NIC.
- Motorcomm yt8521 Gigabit Ethernet.
- Microchip ksz9563 Gigabit Ethernet Switch.
- Microsoft Azure Network Adapter.
- Linux Automation 10Base-T1L adapter.
- PHY:
- Aquantia AQR112 and AQR412.
- Motorcomm YT8531S.
- PTP:
- Orolia ART-CARD.
- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices.
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices.
- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets.
- Realtek RTL8852BE and RTL8723DS.
- Cypress.CYW4373A0 WiFi + Bluetooth combo device.
Drivers
-------
- CAN:
- gs_usb: bus error reporting support.
- kvaser_usb: listen only and bus error reporting support.
- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping.
- implement devlink-rate support.
- support direct read from memory.
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate.
- Support for enhanced events compression.
- extend H/W offload packet manipulation capabilities.
- implement IPSec packet offload mode.
- nVidia/Mellanox (mlx4):
- better big TCP support.
- Netronome Ethernet NICs (nfp):
- IPsec offload support.
- add support for multicast filter.
- Broadcom:
- RSS and PTP support improvements.
- AMD/SolarFlare:
- netlink extened ack improvements.
- add basic flower matches to offload, and related stats.
- Virtual NICs:
- ibmvnic: introduce affinity hint support.
- small / embedded:
- FreeScale fec: add initial XDP support.
- Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood.
- TI am65-cpsw: add suspend/resume support.
- Mediatek MT7986: add RX wireless wthernet dispatch support.
- Realtek 8169: enable GRO software interrupt coalescing per
default.
- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP.
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support.
- add ip6gre support.
- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support.
- enable flow offload support.
- Renesas:
- add rswitch R-Car Gen4 gPTP support.
- Microchip (lan966x):
- add full XDP support.
- add TC H/W offload via VCAP.
- enable PTP on bridge interfaces.
- Microchip (ksz8):
- add MTU support for KSZ8 series.
- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan.
- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support.
- add ack signal support.
- enable coredump support.
- remain_on_channel support.
- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities.
- 320 MHz channels support.
- RealTek WiFi (rtw89):
- new dynamic header firmware format support.
- wake-over-WLAN support.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmOYXUcSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOk8zQP/R7BZtbJMTPiWkRnSoKHnAyupDVwrz5U
ktukLkwPsCyJuEbAjgxrxf4EEEQ9uq2FFlxNSYuKiiQMqIpFxV6KED7LCUygn4Tc
kxtkp0Q+5XiqisWlQmtfExf2OjuuPqcjV9tWCDBI6GebKUbfNwY/eI44RcMu4BSv
DzIlW5GkX/kZAPqnnuqaLsN3FudDTJHGEAD7NbA++7wJ076RWYSLXlFv0Z+SCSPS
H8/PEG0/ZK/65rIWMAFRClJ9BNIDwGVgp0GrsIvs1gqbRUOlA1hl1rDM21TqtNFf
5QPQT7sIfTcCE/nerxKJD5JE3JyP+XRlRn96PaRw3rt4MgI6I/EOj/HOKQ5tMCNc
oPiqb7N70+hkLZyr42qX+vN9eDPjp2koEQm7EO2Zs+/534/zWDs24Zfk/Aa1ps0I
Fa82oGjAgkBhGe/FZ6i5cYoLcyxqRqZV1Ws9XQMl72qRC7/BwvNbIW6beLpCRyeM
yYIU+0e9dEm+wHQEdh2niJuVtR63hy8tvmPx56lyh+6u0+pondkwbfSiC5aD3kAC
ikKsN5DyEsdXyiBAlytCEBxnaOjQy4RAz+3YXSiS0eBNacXp03UUrNGx4Pzpu/D0
QLFJhBnMFFCgy5to8/DvKnrTPgZdSURwqbIUcZdvU21f1HLR8tUTpaQnYffc/Whm
V8gnt1EL+0cc
=CbJC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, improving considerably the
performances of complex queue discipline configurations
- Add inet drop monitor support
- A few GRO performance improvements
- Add infrastructure for atomic dev stats, addressing long standing
data races
- De-duplicate common code between OVS and conntrack offloading
infrastructure
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up the
workload with the number of available CPUs
- Add the helper support for connection-tracking OVS offload
BPF:
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF
- Make cgroup local storage available to non-cgroup attached BPF
programs
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers
- A relevant bunch of BPF verifier fixes and improvements
- Veristat tool improvements to support custom filtering, sorting,
and replay of results
- Add LLVM disassembler as default library for dumping JITed code
- Lots of new BPF documentation for various BPF maps
- Add bpf_rcu_read_{,un}lock() support for sleepable programs
- Add RCU grace period chaining to BPF to wait for the completion of
access from both sleepable and non-sleepable BPF programs
- Add support storing struct task_struct objects as kptrs in maps
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions
Protocols:
- TCP: implement Protective Load Balancing across switch links
- TCP: allow dynamically disabling TCP-MD5 static key, reverting back
to fast[er]-path
- UDP: Introduce optional per-netns hash lookup table
- IPv6: simplify and cleanup sockets disposal
- Netlink: support different type policies for each generic netlink
operation
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support
- MPTCP: add netlink notification support for listener sockets events
- SCTP: add VRF support, allowing sctp sockets binding to VRF devices
- Add bridging MAC Authentication Bypass (MAB) support
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios
- More work for Wi-Fi 7 support, comprising conversion of all the
existing drivers to internal TX queue usage
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading
- IPSec: extended ack support for more descriptive XFRM error
reporting
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, reducing considerably the
required locking
- IEEE 802154: synchronous send frame and extended filtering support,
initial support for scanning available 15.4 networks
- Tun: bump the link speed from 10Mbps to 10Gbps
- Tun/VirtioNet: implement UDP segmentation offload support
Driver API:
- PHY/SFP: improve power level switching between standard level 1 and
the higher power levels
- New API for netdev <-> devlink_port linkage
- PTP: convert existing drivers to new frequency adjustment
implementation
- DSA: add support for rx offloading
- Autoload DSA tagging driver when dynamically changing protocol
- Add new PCP and APPTRUST attributes to Data Center Bridging
- Add configuration support for 800Gbps link speed
- Add devlink port function attribute to enable/disable RoCE and
migratable
- Extend devlink-rate to support strict prioriry and weighted fair
queuing
- Add devlink support to directly reading from region memory
- New device tree helper to fetch MAC address from nvmem
- New big TCP helper to simplify temporary header stripping
New hardware / drivers:
- Ethernet:
- Marvel Octeon CNF95N and CN10KB Ethernet Switches
- Marvel Prestera AC5X Ethernet Switch
- WangXun 10 Gigabit NIC
- Motorcomm yt8521 Gigabit Ethernet
- Microchip ksz9563 Gigabit Ethernet Switch
- Microsoft Azure Network Adapter
- Linux Automation 10Base-T1L adapter
- PHY:
- Aquantia AQR112 and AQR412
- Motorcomm YT8531S
- PTP:
- Orolia ART-CARD
- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices
- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets
- Realtek RTL8852BE and RTL8723DS
- Cypress.CYW4373A0 WiFi + Bluetooth combo device
Drivers:
- CAN:
- gs_usb: bus error reporting support
- kvaser_usb: listen only and bus error reporting support
- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping
- implement devlink-rate support
- support direct read from memory
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate
- Support for enhanced events compression
- extend H/W offload packet manipulation capabilities
- implement IPSec packet offload mode
- nVidia/Mellanox (mlx4):
- better big TCP support
- Netronome Ethernet NICs (nfp):
- IPsec offload support
- add support for multicast filter
- Broadcom:
- RSS and PTP support improvements
- AMD/SolarFlare:
- netlink extened ack improvements
- add basic flower matches to offload, and related stats
- Virtual NICs:
- ibmvnic: introduce affinity hint support
- small / embedded:
- FreeScale fec: add initial XDP support
- Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
- TI am65-cpsw: add suspend/resume support
- Mediatek MT7986: add RX wireless wthernet dispatch support
- Realtek 8169: enable GRO software interrupt coalescing per
default
- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support
- add ip6gre support
- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support
- enable flow offload support
- Renesas:
- add rswitch R-Car Gen4 gPTP support
- Microchip (lan966x):
- add full XDP support
- add TC H/W offload via VCAP
- enable PTP on bridge interfaces
- Microchip (ksz8):
- add MTU support for KSZ8 series
- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan
- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support
- add ack signal support
- enable coredump support
- remain_on_channel support
- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
- 320 MHz channels support
- RealTek WiFi (rtw89):
- new dynamic header firmware format support
- wake-over-WLAN support"
* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...
This includes a number of small fixes, as usual.
It also includes a new driver for doing the i2c (SSIF) interface
BMC-side, pretty much completing the BMC side interfaces.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAmOYieoACgkQYfOMkJGb
/4EF6A//UX1SL+OT+NOvFYxr6etcKoY6VGZDlEUbUbUAYFE/vv/aFP2aXr+eHf/o
Y+hx3WLE5kveoxVcGfkJMRVYiHbC/4d+Z4CLd1PzXx6Tjp20Abpr8uGCZIyy0263
/gYl+JPmSRXslA46mPImMlTzL/vzsHjoOHVpLCVcv8iDbXKhvqlNEJ2Y8BbNy299
vNkix6MVDmTCD+PR5LE10myL0X53suHJoAN6CbjmRgIgxFb/tfddtFRIjqb/W8Dk
JssgYGd+xFHWV/65xnOGbGDWuciBIQSkFL7fWyZpp0OooPJJRL/Hi/l3VVMC1GTA
y38E+0as+OzQcaHKcG9hnzFWhVtewJYFHL2TZrAv3IbFWjiJwDq7jguraOl5uTAB
vzIt0ML34oWF4PR/ZYva3aZID5xmNKYMRVHp9cxzNmUKd4XcHfwzKD2C9eVVG5W9
qQ+7a+L7mC6PwOCE5t+P/Plh0lC9V8eDAVTf4pfXH/vp0o5oreMaJtDKGnCqvyZR
raU/rmyS8TNYv8iZMItGL8U1h8trYIMVA1SuEzQICpPKd7yITcKuSPIYXLThGUx8
4XWu8iMfaPcHzeCNtOwrRxXjQ8AeRTGPPkKf25pvvM9ETODyUIUsuMYSE9kNlQjP
zFCyDbcMPxVx+WD6YptukbY8pNftCiT8wrFbriV8BMDDGMDq5HM=
=HTb4
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.2-1' of https://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"This includes a number of small fixes, as usual.
It also includes a new driver for doing the i2c (SSIF) interface
BMC-side, pretty much completing the BMC side interfaces"
* tag 'for-linus-6.2-1' of https://github.com/cminyard/linux-ipmi:
ipmi/watchdog: use strscpy() to instead of strncpy()
ipmi: ssif_bmc: Convert to i2c's .probe_new()
ipmi: fix use after free in _ipmi_destroy_user()
ipmi/watchdog: Include <linux/kstrtox.h> when appropriate
ipmi:ssif: Increase the message retry time
ipmi: Fix some kernel-doc warnings
ipmi: ssif_bmc: Use EPOLLIN instead of POLLIN
ipmi: fix msg stack when IPMI is disconnected
ipmi: fix memleak when unload ipmi driver
ipmi: fix long wait in unload when IPMI disconnect
ipmi: kcs: Poll OBF briefly to reduce OBE latency
bindings: ipmi: Add binding for SSIF BMC driver
ipmi: ssif_bmc: Add SSIF BMC driver
- a new driver for Cypress Generation 5 touchscreens
- a new driver for Hynitron cstxxx touchscreens
- a new driver for Himax hx83112b touchscreen
- I2C input devices have been converted to use i2c's probe_new()
- a large number of input devices are now using DEFINE_SIMPLE_DEV_PM_OPS
and pm_sleep_ptr() and no longer use __maybe_unused annotations
- improvements to msg2638 touchscreen driver to also support msg2138
- conversion of several input deevine bindings to yaml/DT schema
- changes to select touch drivers to move handling of wake irqs to the
PM core
- other assorted fixes and improvements.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCY5d+9wAKCRBAj56VGEWX
nN9CAP9R1zCdPc5Y2PmLnE6JHc9XynPhUnVbnx4zHieMxw0nHQD/ZnHot+Cdq/+L
433dkdX50pwK3XxRQgjRaym+efgwfQ0=
=9m4z
-----END PGP SIGNATURE-----
Merge tag 'input-for-v6.2-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a new driver for Cypress Generation 5 touchscreens
- a new driver for Hynitron cstxxx touchscreens
- a new driver for Himax hx83112b touchscreen
- I2C input devices have been converted to use i2c's probe_new()
- a large number of input devices are now using
DEFINE_SIMPLE_DEV_PM_OPS and pm_sleep_ptr() and no longer use
__maybe_unused annotations
- improvements to msg2638 touchscreen driver to also support msg2138
- conversion of several input deevine bindings to yaml/DT schema
- changes to select touch drivers to move handling of wake irqs to the
PM core
- other assorted fixes and improvements.
* tag 'input-for-v6.2-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (165 commits)
Input: elants_i2c - delay longer with reset asserted
dt-bindings: input: Convert ti,drv260x to DT schema
dt-bindings: input: gpio-beeper: Convert to yaml schema
Input: pxspad - fix unused data warning when force feedback not enabled
Input: lpc32xx - allow building with COMPILE_TEST
Input: nomadik-ske-keypad - allow building with COMPILE_TEST
Input: pxa27xx-keypad - allow build with COMPILE_TEST
Input: spear-keyboard - improve build coverage using COMPILE_TEST
Input: tegra-kbc - allow build with COMPILE_TEST
Input: tegra-kbc - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: tca6416-keypad - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: tc3589x - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: st-keyscan - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: sh-keysc - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: qt1070 - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: pxa27x_keypad - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: pmic8xxx-keypad - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: nomadik-ske-keypad - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: mcs-touchkey - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
Input: max7359-keypad - switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmOW44IACgkQCF8+vY7k
4RWt2RAAnUPY7bj2DDGo5rJ54KjMXhz6usdOnh9Hzg5eegGzK2xXAOyKVg4AFsNk
rXWkbEc5Rg2LJnMZg8dojsG/utOV+xtCidQCYdhUKLPDREDMjSuUy/vs3utllwkg
MhO8JDY+OQHhqXaMFRz0suGvr1W4kDmRR7+4VciEEPX9k9CX+FMYnuVlNyxLZG03
Hu/PSDC4ltU+P0xnLap3U681PWfUDAoSvhyQmvde39EspSBxzFTVy7Cw1VL7DvwQ
Idrcxo37buGf8eF9Em02PBgzC00TV6yCy5wOPOemcozBgtDSeLSQjlUUaOqHZgKI
uY4k8LI0efnJPWIqt/rGZ4OREK+m7RbyAKvQ/9ckblm3bjsJV/T8WGtnNHxDRBVD
ypoSvFyJ+RU6eFUw2jG61Fx0vPocK8AGnQLK860ns52h5DxyxpPxWtvPyNZLNs59
bjZPetbU7bgvGZ8aBJno84Q+4Bliel8zXWnQKrAV28gjwCt/q/Lbd9G7sUYCZwIE
EMxcOP9r2J1Q8zQK6s9xdZx2lRINWD+9Hgh1toS2KGhkAtT5BWyBmD2MXqt88v04
8MeyneYt6uiv5Lst41BhxT/hvIyFb9g3pW28TAUCPV9r5pjyJVRNvPjJEv6dnR2e
eRmBHcyLG6/Q1Do+HY2DjjgOsAL7yDxQJNahqFM/cFGYMVmYNFU=
=i0X1
-----END PGP SIGNATURE-----
Merge tag 'media/v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- DVB core changes to avoid refcount troubles and UAF
- DVB API/core has gained support for DVB-C2 and DVB-S2X
- New sensor drivers: ov08x40, ov4689.c, st-vgxy61 and tc358746.c
- Removal of an unused sensor driver: s5k4ecgx
- Move microchip_csi2dc to a new directory, named after the
manufacturer
- Add media controller support to Microship drivers
- Old Atmel/Microship drivers that don't use media controler got moved
to staging
- New drivers added for Renesas RZ/G2L CRU and MIPI CSI-2 support
- Allwinner A31 camera sensor driver code was now split into a bridge
and a separate processor driver
- Added a virtual stateless decoder driver in order to test core
support for stateless drivers and test userspace apps using it
- removed platform-based support for ov9650, as this is not used
anymore
- atomisp now uses videobuf2 and supports normal mmap mode
- the imx7-media-csi driver got promoted from staging
- rcar-vin driver has gained support for gen3 UDS (Up Down Scaler)
- most i2c drivers now use I2C .probe_new() kAPI
- lots of drivers fixes, cleanups and improvements
* tag 'media/v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (544 commits)
media: s5c73m3: Switch to GPIO descriptors
media: i2c: s5k5baf: switch to using gpiod API
media: i2c: s5k6a3: switch to using gpiod API
media: imx: remove code for non-existing config IMX_GPT_ICAP
media: si470x: Fix use-after-free in si470x_int_in_callback()
media: staging: stkwebcam: Restore MEDIA_{USB,CAMERA}_SUPPORT dependencies
media: coda: Add check for kmalloc
media: coda: Add check for dcoda_iram_alloc
dt-bindings: media: s5c73m3: Fix reset-gpio descriptor
media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property
media: s5k4ecgx: Delete driver
media: s5k4ecgx: Switch to GPIO descriptors
media: Switch to use dev_err_probe() helper
headers: Remove some left-over license text in include/uapi/linux/v4l2-*
headers: Remove some left-over license text in include/uapi/linux/dvb/
media: usb: pwc-uncompress: Use flex array destination for memcpy()
media: s5p-mfc: Fix to handle reference queue during finishing
media: s5p-mfc: Clear workbit to handle error condition
media: s5p-mfc: Fix in register read and write for H264
media: imx: Use get_mbus_config instead of parsing upstream DT endpoints
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk
WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9
2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH
YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK
aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII
LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI
0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd
Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7
Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI
XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n
D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr
wxzFGfvHfw==
=k/vv
-----END PGP SIGNATURE-----
Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull requests via Christoph:
- Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
Joshi)
- Refactor PCIe probing and reset (Christoph Hellwig)
- Various fabrics authentication fixes and improvements (Sagi
Grimberg)
- Avoid fallback to sequential scan due to transient issues (Uday
Shankar)
- Implement support for the DEAC bit in Write Zeroes (Christoph
Hellwig)
- Allow overriding the IEEE OUI and firmware revision in configfs
for nvmet (Aleksandr Miloserdov)
- Force reconnect when number of queue changes in nvmet (Daniel
Wagner)
- Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
Grimberg, Christoph Hellwig, Christophe JAILLET)
- Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
- Use the common tagset helpers in nvme-pci driver (Christoph
Hellwig)
- Cleanup the nvme-pci removal path (Christoph Hellwig)
- Use kstrtobool() instead of strtobool (Christophe JAILLET)
- Allow unprivileged passthrough of Identify Controller (Joel
Granados)
- Support io stats on the mpath device (Sagi Grimberg)
- Minor nvmet cleanup (Sagi Grimberg)
- MD pull requests via Song:
- Code cleanups (Christoph)
- Various fixes
- Floppy pull request from Denis:
- Fix a memory leak in the init error path (Yuan)
- Series fixing some batch wakeup issues with sbitmap (Gabriel)
- Removal of the pktcdvd driver that was deprecated more than 5 years
ago, and subsequent removal of the devnode callback in struct
block_device_operations as no users are now left (Greg)
- Fix for partition read on an exclusively opened bdev (Jan)
- Series of elevator API cleanups (Jinlong, Christoph)
- Series of fixes and cleanups for blk-iocost (Kemeng)
- Series of fixes and cleanups for blk-throttle (Kemeng)
- Series adding concurrent support for sync queues in BFQ (Yu)
- Series bringing drbd a bit closer to the out-of-tree maintained
version (Christian, Joel, Lars, Philipp)
- Misc drbd fixes (Wang)
- blk-wbt fixes and tweaks for enable/disable (Yu)
- Fixes for mq-deadline for zoned devices (Damien)
- Add support for read-only and offline zones for null_blk
(Shin'ichiro)
- Series fixing the delayed holder tracking, as used by DM (Yu,
Christoph)
- Series enabling bio alloc caching for IRQ based IO (Pavel)
- Series enabling userspace peer-to-peer DMA (Logan)
- BFQ waker fixes (Khazhismel)
- Series fixing elevator refcount issues (Christoph, Jinlong)
- Series cleaning up references around queue destruction (Christoph)
- Series doing quiesce by tagset, enabling cleanups in drivers
(Christoph, Chao)
- Series untangling the queue kobject and queue references (Christoph)
- Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)
* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
blktrace: Fix output non-blktrace event when blk_classic option enabled
block: sed-opal: Don't include <linux/kernel.h>
sed-opal: allow using IOC_OPAL_SAVE for locking too
blk-cgroup: Fix typo in comment
block: remove bio_set_op_attrs
nvmet: don't open-code NVME_NS_ATTR_RO enumeration
nvme-pci: use the tagset alloc/free helpers
nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
nvme: consolidate setting the tagset flags
nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
block: bio_copy_data_iter
nvme-pci: split out a nvme_pci_ctrl_is_dead helper
nvme-pci: return early on ctrl state mismatch in nvme_reset_work
nvme-pci: rename nvme_disable_io_queues
nvme-pci: cleanup nvme_suspend_queue
nvme-pci: remove nvme_pci_disable
nvme-pci: remove nvme_disable_admin_queue
nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
nvme: use nvme_wait_ready in nvme_shutdown_ctrl
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOSYp4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgWNEADTcHejQ5y8Ctc1BEsCiEnCbR5DeIwESBKN
SfK7uK5N5GjKZgKeMtkdxNeckH2wSZ79VFpz2/b4fszM11H+P81r/PP+OwQojcra
1YyPbp2jXlQZGEYijiXYDohnKr8NnJYj/nuwm4T73OPOD48ekbrY36/t3Hd75jKk
M5L/HOpbKhA+fypPHYlm3XlwEM9/4wupsDeiabTuAeFpLh66V/h85ZLY91c3Bf7j
yllzT2CCGQsh+fnqdovpsqUxM4Sh6sHfa2QhwkfJFfU9OLwDz0TLlcvSVwWWICD1
DGeDE/MPBKZ4Z5zp8t94vTnZDAiExwkZbmCOnOkdYoPdMDqCDhp8a9WywV3IiJJi
+hRN6Au6BWw8Rj3b3pbobfdXFJAvoHqXHM1SDAsWmEIMpuMhNalIpYqtna9JwRDe
KV2ERMpGOUpmvQiRaJa5Wtz6hlFetmuav6Ij9DWb5f8+NXikiFAXk4g6TtD1Z7Cb
15klwM8qW50zX7YUlzBLcjNdj7+HuIswLx0obZ3Uv1ogDwMQKXEvPaAHUwVN02q6
cWP0P2Bc0EKvdwTSNpgNflhNsiV2DFJZpZfxkdPauN4t2EkJ38iEpHewffy0ecM7
uyaXGQVQW7WFDuGno4cGbThQIG95MqkRPBhEyB4cOmcyS/aZ92ZFtV1iI8dVDA+v
uuEIMc3OCA==
=EgDc
-----END PGP SIGNATURE-----
Merge tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
- Always ensure proper ordering in case of CQ ring overflow, which then
means we can remove some work-arounds for that (Dylan)
- Support completion batching for multishot, greatly increasing the
efficiency for those (Dylan)
- Flag epoll/eventfd wakeups done from io_uring, so that we can easily
tell if we're recursing into io_uring again.
Previously, this would have resulted in repeated multishot
notifications if we had a dependency there. That could happen if an
eventfd was registered as the ring eventfd, and we multishot polled
for events on it. Or if an io_uring fd was added to epoll, and
io_uring had a multishot request for the epoll fd.
Test cases here:
https://git.kernel.dk/cgit/liburing/commit/?id=919755a7d0096fda08fb6d65ac54ad8d0fe027cd
Previously these got terminated when the CQ ring eventually
overflowed, now it's handled gracefully (me).
- Tightening of the IOPOLL based completions (Pavel)
- Optimizations of the networking zero-copy paths (Pavel)
- Various tweaks and fixes (Dylan, Pavel)
* tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux: (41 commits)
io_uring: keep unlock_post inlined in hot path
io_uring: don't use complete_post in kbuf
io_uring: spelling fix
io_uring: remove io_req_complete_post_tw
io_uring: allow multishot polled reqs to defer completion
io_uring: remove overflow param from io_post_aux_cqe
io_uring: add lockdep assertion in io_fill_cqe_aux
io_uring: make io_fill_cqe_aux static
io_uring: add io_aux_cqe which allows deferred completion
io_uring: allow defer completion for aux posted cqes
io_uring: defer all io_req_complete_failed
io_uring: always lock in io_apoll_task_func
io_uring: remove iopoll spinlock
io_uring: iopoll protect complete_post
io_uring: inline __io_req_complete_put()
io_uring: remove io_req_tw_post_queue
io_uring: use io_req_task_complete() in timeout
io_uring: hold locks for io_req_complete_failed
io_uring: add completion locking for iopoll
io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post()
...
-----BEGIN PGP SIGNATURE-----
iIYEABYIAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCY5b27RAcbWljQGRpZ2lr
b2QubmV0AAoJEOXj0OiMgvbSg9YA/0K10H+VsGt1+qqR4+w9SM7SFzbgszrV3Yw9
rwiPgaPVAP9rxXPr2bD2hAk7/Lv9LeJ2kfM9RzMErP1A6UsC5YVbDA==
=mAG7
-----END PGP SIGNATURE-----
Merge tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This adds file truncation support to Landlock, contributed by Günther
Noack. As described by Günther [1], the goal of these patches is to
work towards a more complete coverage of file system operations that
are restrictable with Landlock.
The known set of currently unsupported file system operations in
Landlock is described at [2]. Out of the operations listed there,
truncate is the only one that modifies file contents, so these patches
should make it possible to prevent the direct modification of file
contents with Landlock.
The new LANDLOCK_ACCESS_FS_TRUNCATE access right covers both the
truncate(2) and ftruncate(2) families of syscalls, as well as open(2)
with the O_TRUNC flag. This includes usages of creat() in the case
where existing regular files are overwritten.
Additionally, this introduces a new Landlock security blob associated
with opened files, to track the available Landlock access rights at
the time of opening the file. This is in line with Unix's general
approach of checking the read and write permissions during open(), and
associating this previously checked authorization with the opened
file. An ongoing patch documents this use case [3].
In order to treat truncate(2) and ftruncate(2) calls differently in an
LSM hook, we split apart the existing security_path_truncate hook into
security_path_truncate (for truncation by path) and
security_file_truncate (for truncation of previously opened files)"
Link: https://lore.kernel.org/r/20221018182216.301684-1-gnoack3000@gmail.com [1]
Link: https://www.kernel.org/doc/html/v6.1/userspace-api/landlock.html#filesystem-flags [2]
Link: https://lore.kernel.org/r/20221209193813.972012-1-mic@digikod.net [3]
* tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
samples/landlock: Document best-effort approach for LANDLOCK_ACCESS_FS_REFER
landlock: Document Landlock's file truncation support
samples/landlock: Extend sample tool to support LANDLOCK_ACCESS_FS_TRUNCATE
selftests/landlock: Test ftruncate on FDs created by memfd_create(2)
selftests/landlock: Test FD passing from restricted to unrestricted processes
selftests/landlock: Locally define __maybe_unused
selftests/landlock: Test open() and ftruncate() in multiple scenarios
selftests/landlock: Test file truncation support
landlock: Support file truncation
landlock: Document init_layer_masks() helper
landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed()
security: Create file_truncate hook from path_truncate hook
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmOSLtIACgkQxWXV+ddt
WDvpQA//dQ3Wosz5puFNiZvoSUn/BnYJueZHjwF0bWY8OYINkF1PvDenu/WotyFz
Ozf4Yl4Afxncz+FjDnOtlpr6KsSU5NqdGM3NrY0eNsxd2t1KrTsN0LgkA4m24p8b
YsYp7pygbMm7c+h0X4uFpebY4lABkEPCBXnI//ktsls0xG5sOvGfZA3rdUP0bou2
JTn6hk+s0cLTNoTiOCGNHRJbeTzHLR0viZj/E4LCJfCeJvAmOLZamUjqe9sBNYAg
YtsrZTpUIL3JgmRi5B6jG4fHSXOnE14mKmRIR3xPME6J6eoYyNOeuSh1oNmJEuoE
B7nD5We+x5+isjXNw/V5CQrs7FF09UbdpbNb9NF5CYQWv40OCeefuai1opGtBUxX
dvbfmf1blYpWW/wfFOKQwMOsl8kZIZYx68FW2OBUNglB6yRpX/3QgFSGb8kPCr83
DW2ttqwkpSNPMKk92I/owIc4BRvZ+LMR/PimEHB/Sa2apZA2/L+7RGwoaaei1QNX
1tJxHWeJFLDZ+YRxjO1eKqhWdGQPn1kkq8LoXLi3tGaNF4kYQfhWOSM3WRowvx1q
f99XRgA8JQnqZS83zqRIspWlpFK0CFdvzG1Zlqx+eoxERfeaMNA2fHxv1YCyFV4+
TiXgsnCo+PIBwlvL/HjUWZgYE9+AD+NN5vyoE2UDYff4AgBFTE8=
=Nqg9
-----END PGP SIGNATURE-----
Merge tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This round there are a lot of cleanups and moved code so the diffstat
looks huge, otherwise there are some nice performance improvements and
an update to raid56 reliability.
User visible features:
- raid56 reliability vs performance trade off:
- fix destructive RMW for raid5 data (raid6 still needs work): do
full checksum verification for all data during RMW cycle, this
should prevent rewriting potentially corrupted data without
notice
- stripes are cached in memory which should reduce the performance
impact but still can hurt some workloads
- checksums are verified after repair again
- this is the last option without introducing additional features
(write intent bitmap, journal, another tree), the extra checksum
read/verification was supposed to be avoided by the original
implementation exactly for performance reasons but that caused
all the reliability problems
- discard=async by default for devices that support it
- implement emergency flush reserve to avoid almost all unnecessary
transaction aborts due to ENOSPC in cases where there are too many
delayed refs or delayed allocation
- skip block group synchronization if there's no change in used
bytes, can reduce transaction commit count for some workloads
Performance improvements:
- fiemap and lseek:
- overall speedup due to skipping unnecessary or duplicate
searches (-40% run time)
- cache some data structures and sharedness of extents (-30% run
time)
- send:
- faster backref resolution when finding clones
- cached leaf to root mapping for faster backref walking
- improved clone/sharing detection
- overall run time improvements (-70%)
Core:
- module initialization converted to a table of function pointers run
in a sequence
- preparation for fscrypt, extend passing file names across calls,
dir item can store encryption status
- raid56 updates:
- more accurate error tracking of sectors within stripe
- simplify recovery path and remove dedicated endio worker kthread
- simplify scrub call paths
- refactoring to support the extra data checksum verification
during RMW cycle
- tree block parentness checks consolidated and done at metadata read
time
- improved error handling
- cleanups:
- move a lot of code for better synchronization between kernel and
user space sources, split big files
- enum cleanups
- GFP flag cleanups
- header file cleanups, prototypes, dependencies
- redundant parameter cleanups
- inline extent handling simplifications
- inode parameter conversion
- data structure cleanups, reductions, renames, merges"
* tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
btrfs: print transaction aborted messages with an error level
btrfs: sync some cleanups from progs into uapi/btrfs.h
btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
btrfs: remove outdated logic from overwrite_item() and add assertion
btrfs: unify overwrite_item() and do_overwrite_item()
btrfs: replace strncpy() with strscpy()
btrfs: fix uninitialized variable in find_first_clear_extent_bit
btrfs: fix uninitialized parent in insert_state
btrfs: add might_sleep() annotations
btrfs: add stack helpers for a few btrfs items
btrfs: add nr_global_roots to the super block definition
btrfs: remove BTRFS_LEAF_DATA_OFFSET
btrfs: add helpers for manipulating leaf items and data
btrfs: add eb to btrfs_node_key_ptr_offset
btrfs: pass the extent buffer for the btrfs_item_nr helpers
btrfs: move the csum helpers into ctree.h
btrfs: move eb offset helpers into extent_io.h
btrfs: move file_extent_item helpers into file-item.h
btrfs: move leaf_data_end into ctree.c
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCY5cH0wAKCRDh3BK/laaZ
PAIJAQC658ZaXRgWBZ/XmCqbb+c8g/InrccE+PXhtVGYTiWTiwD/ZEy7r/X/uJaO
gb5anxJT5jVJ9Qfk1VbyZqdwqUqydAo=
=If6t
-----END PGP SIGNATURE-----
Merge tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse update from Miklos Szeredi:
- Allow some write requests to proceed in parallel
- Fix a performance problem with allow_sys_admin_access
- Add a special kind of invalidation that doesn't immediately purge
submounts
- On revalidation treat the target of rename(RENAME_NOREPLACE) the same
as open(O_EXCL)
- Use type safe helpers for some mnt_userns transformations
* tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Rearrange fuse_allow_current_process checks
fuse: allow non-extending parallel direct writes on the same file
fuse: remove the unneeded result variable
fuse: port to vfs{g,u}id_t and associated helpers
fuse: Remove user_ns check for FUSE_DEV_IOC_CLONE
fuse: always revalidate rename target dentry
fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
fs/fuse: Replace kmap() with kmap_local_page()
This release adds SM4 encryption support, contributed by Tianjia Zhang.
SM4 is a Chinese block cipher that is an alternative to AES.
I recommend against using SM4, but (according to Tianjia) some people
are being required to use it. Since SM4 has been turning up in many
other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.), it
hasn't been very controversial, and some people have to use it, I don't
think it would be fair for me to reject this optional feature.
Besides the above, there are a couple cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY5auyBQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1u4AP4lhLxaEJ9upkHZrPAvEdF7QjLhO/ju
h1LrvWHcEbvr6AEA/8ptc5RA1BAoSTDcqIWxIAWRztvptP4gUETb1b9C/ws=
=An5w
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"This release adds SM4 encryption support, contributed by Tianjia
Zhang. SM4 is a Chinese block cipher that is an alternative to AES.
I recommend against using SM4, but (according to Tianjia) some people
are being required to use it. Since SM4 has been turning up in many
other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.),
it hasn't been very controversial, and some people have to use it, I
don't think it would be fair for me to reject this optional feature.
Besides the above, there are a couple cleanups"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: add additional documentation for SM4 support
fscrypt: remove unused Speck definitions
fscrypt: Add SM4 XTS/CTS symmetric algorithm support
blk-crypto: Add support for SM4-XTS blk crypto mode
fscrypt: add comment for fscrypt_valid_enc_modes_v1()
fscrypt: pass super_block to fscrypt_put_master_key_activeref()
Add the 'MDBE_ATTR_RTPORT' attribute to allow user space to specify the
routing protocol of the MDB port group entry. Enforce a minimum value of
'RTPROT_STATIC' to prevent user space from using protocol values that
should only be set by the kernel (e.g., 'RTPROT_KERNEL'). Maintain
backward compatibility by defaulting to 'RTPROT_STATIC'.
The protocol is already visible to user space in RTM_NEWMDB responses
and notifications via the 'MDBA_MDB_EATTR_RTPROT' attribute.
The routing protocol allows a routing daemon to distinguish between
entries configured by it and those configured by the administrator. Once
MDB flush is supported, the protocol can be used as a criterion
according to which the flush is performed.
Examples:
# bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto kernel
Error: integer out of range.
# bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto static
# bridge mdb add dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent proto zebra
# bridge mdb add dev br0 port dummy10 grp 239.1.1.2 permanent source_list 198.51.100.1,198.51.100.2 filter_mode include proto 250
# bridge -d mdb show
dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.2 permanent filter_mode include proto 250
dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.1 permanent filter_mode include proto 250
dev br0 port dummy10 grp 239.1.1.2 permanent filter_mode include source_list 198.51.100.2/0.00,198.51.100.1/0.00 proto 250
dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra
dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude proto static
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add new netlink attributes to the RTM_NEWMDB request that allow user
space to add (*, G) with a source list and filter mode.
The RTM_NEWMDB message can already dump such entries (created by the
kernel) so there is no need to add dump support. However, the message
contains a different set of attributes depending if it is a request or a
response. The naming and structure of the new attributes try to follow
the existing ones used in the response.
Request:
[ struct nlmsghdr ]
[ struct br_port_msg ]
[ MDBA_SET_ENTRY ]
struct br_mdb_entry
[ MDBA_SET_ENTRY_ATTRS ]
[ MDBE_ATTR_SOURCE ]
struct in_addr / struct in6_addr
[ MDBE_ATTR_SRC_LIST ] // new
[ MDBE_SRC_LIST_ENTRY ]
[ MDBE_SRCATTR_ADDRESS ]
struct in_addr / struct in6_addr
[ ...]
[ MDBE_ATTR_GROUP_MODE ] // new
u8
Response:
[ struct nlmsghdr ]
[ struct br_port_msg ]
[ MDBA_MDB ]
[ MDBA_MDB_ENTRY ]
[ MDBA_MDB_ENTRY_INFO ]
struct br_mdb_entry
[ MDBA_MDB_EATTR_TIMER ]
u32
[ MDBA_MDB_EATTR_SOURCE ]
struct in_addr / struct in6_addr
[ MDBA_MDB_EATTR_RTPROT ]
u8
[ MDBA_MDB_EATTR_SRC_LIST ]
[ MDBA_MDB_SRCLIST_ENTRY ]
[ MDBA_MDB_SRCATTR_ADDRESS ]
struct in_addr / struct in6_addr
[ MDBA_MDB_SRCATTR_TIMER ]
u8
[...]
[ MDBA_MDB_EATTR_GROUP_MODE ]
u8
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Add a new VID/PID 0489/e0f2 for MT7922
- Add Realtek RTL8852BE support ID 0x0cb8:0xc559
- Add a new PID/VID 13d3/3549 for RTL8822CU
- Add support for broadcom BCM43430A0 & BCM43430A1
- Add CONFIG_BT_HCIBTUSB_POLL_SYNC
- Add CONFIG_BT_LE_L2CAP_ECRED
- Add support for CYW4373A0
- Add support for RTL8723DS
- Add more device IDs for WCN6855
- Add Broadcom BCM4377 family PCIe Bluetooth
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmOXqQYZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKa4LEACMpjtWb7IhfSt42FC+p7sf
gXLZ1GOEZnP3x+rJUXfHRdVfRvkpABixnEFuqbzkCZLQMy6W01abtBJ0xC7mADIu
CKvef5IjUdEQJWbbAE5ZwFPxdWbWjoBnO39crl9W1hlpxatLy0QEFfC0SqZRpiiZ
Jo2BZ8g6eFP5l1GARhW+B1idjxaXI7JIg41s43PFFHOK/TkfpQxCVCeg4bv36wTD
Pp14oixAmJKXsxQoozWJcY9zLy6/6KEedfBzj0mKrOG5/v0O8YyXKoe2T0yjx1d8
t/VSV+XquYiyJ6FA9FsW8spS5GFxGjuEvKagGn11t+in9GIb8t1ve4oAL+y43nC8
br/Sx8FVbFRWu5UA0Xv4issEMp/s/1Zty7U15CHiaFwv6VNZuyXDDLZVzvgUULLl
3ikBrM1JG/AKdoHiIjwmv4qLpteZ7WWOqb3qT3cPMdnUaGBd9fMJ17qCRYs8roW7
9QoGcHUfsPUUAiM6OyBz5L2HbfZuZTdVWvPVqoeaQ7LEmGeqKe4KnGeep0fVpVe8
viKGtLbpi4+S/2IKQvEVvulmsXOved2uJef/x+leH8HShPDceQU9c+TG8RgxOygy
f8dhAgn5wV+AgaKzLDJj2xT3xb+0j9jctTWX5SDjvHUPmu23bcNcsA0jv+dknMFh
VhOZTE+Qze08ZVGfnRT4uw==
=Oj2B
-----END PGP SIGNATURE-----
Merge tag 'for-net-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Add a new VID/PID 0489/e0f2 for MT7922
- Add Realtek RTL8852BE support ID 0x0cb8:0xc559
- Add a new PID/VID 13d3/3549 for RTL8822CU
- Add support for broadcom BCM43430A0 & BCM43430A1
- Add CONFIG_BT_HCIBTUSB_POLL_SYNC
- Add CONFIG_BT_LE_L2CAP_ECRED
- Add support for CYW4373A0
- Add support for RTL8723DS
- Add more device IDs for WCN6855
- Add Broadcom BCM4377 family PCIe Bluetooth
* tag 'for-net-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits)
Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to complete
Bluetooth: ISO: Avoid circular locking dependency
Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: hci_bcsp: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: hci_h5: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: hci_ll: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: hci_qca: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: btusb: don't call kfree_skb() under spin_lock_irqsave()
Bluetooth: btintel: Fix missing free skb in btintel_setup_combined()
Bluetooth: hci_conn: Fix crash on hci_create_cis_sync
Bluetooth: btintel: Fix existing sparce warnings
Bluetooth: btusb: Fix existing sparce warning
Bluetooth: btusb: Fix new sparce warnings
Bluetooth: btusb: Add a new PID/VID 13d3/3549 for RTL8822CU
Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0cb8:0xc559
dt-bindings: net: realtek-bluetooth: Add RTL8723DS
Bluetooth: btusb: Add a new VID/PID 0489/e0f2 for MT7922
dt-bindings: bluetooth: broadcom: add BCM43430A0 & BCM43430A1
Bluetooth: hci_bcm4377: Fix missing pci_disable_device() on error in bcm4377_probe()
...
====================
Link: https://lore.kernel.org/r/20221212222322.1690780-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
1) Incorrect error check in nft_expr_inner_parse(), from Dan Carpenter.
2) Add DATA_SENT state to SCTP connection tracking helper, from
Sriram Yagnaraman.
3) Consolidate nf_confirm for ipv4 and ipv6, from Florian Westphal.
4) Add bitmask support for ipset, from Vishwanath Pai.
5) Handle icmpv6 redirects as RELATED, from Florian Westphal.
6) Add WARN_ON_ONCE() to impossible case in flowtable datapath,
from Li Qiong.
7) A large batch of IPVS updates to replace timer-based estimators by
kthreads to scale up wrt. CPUs and workload (millions of estimators).
Julian Anastasov says:
This patchset implements stats estimation in kthread context.
It replaces the code that runs on single CPU in timer context every 2
seconds and causing latency splats as shown in reports [1], [2], [3].
The solution targets setups with thousands of IPVS services,
destinations and multi-CPU boxes.
Spread the estimation on multiple (configured) CPUs and multiple
time slots (timer ticks) by using multiple chains organized under RCU
rules. When stats are not needed, it is recommended to use
run_estimation=0 as already implemented before this change.
RCU Locking:
- As stats are now RCU-locked, tot_stats, svc and dest which
hold estimator structures are now always freed from RCU
callback. This ensures RCU grace period after the
ip_vs_stop_estimator() call.
Kthread data:
- every kthread works over its own data structure and all
such structures are attached to array. For now we limit
kthreads depending on the number of CPUs.
- even while there can be a kthread structure, its task
may not be running, eg. before first service is added or
while the sysctl var is set to an empty cpulist or
when run_estimation is set to 0 to disable the estimation.
- the allocated kthread context may grow from 1 to 50
allocated structures for timer ticks which saves memory for
setups with small number of estimators
- a task and its structure may be released if all
estimators are unlinked from its chains, leaving the
slot in the array empty
- every kthread data structure allows limited number
of estimators. Kthread 0 is also used to initially
calculate the max number of estimators to allow in every
chain considering a sub-100 microsecond cond_resched
rate. This number can be from 1 to hundreds.
- kthread 0 has an additional job of optimizing the
adding of estimators: they are first added in
temp list (est_temp_list) and later kthread 0
distributes them to other kthreads. The optimization
is based on the fact that newly added estimator
should be estimated after 2 seconds, so we have the
time to offload the adding to chain from controlling
process to kthread 0.
- to add new estimators we use the last added kthread
context (est_add_ktid). The new estimators are linked to
the chains just before the estimated one, based on add_row.
This ensures their estimation will start after 2 seconds.
If estimators are added in bursts, common case if all
services and dests are initially configured, we may
spread the estimators to more chains and as result,
reducing the initial delay below 2 seconds.
Many thanks to Jiri Wiesner for his valuable comments
and for spending a lot of time reviewing and testing
the changes on different platforms with 48-256 CPUs and
1-8 NUMA nodes under different cpufreq governors.
The new IPVS estimators do not use workqueue infrastructure
because:
- The estimation can take long time when using multiple IPVS rules (eg.
millions estimator structures) and especially when box has multiple
CPUs due to the for_each_possible_cpu usage that expects packets from
any CPU. With est_nice sysctl we have more control how to prioritize the
estimation kthreads compared to other processes/kthreads that have
latency requirements (such as servers). As a benefit, we can see these
kthreads in top and decide if we will need some further control to limit
their CPU usage (max number of structure to estimate per kthread).
- with kthreads we run code that is read-mostly, no write/lock
operations to process the estimators in 2-second intervals.
- work items are one-shot: as estimators are processed every
2 seconds, they need to be re-added every time. This again
loads the timers (add_timer) if we use delayed works, as there are
no kthreads to do the timings.
[1] Report from Yunhong Jiang:
https://lore.kernel.org/netdev/D25792C1-1B89-45DE-9F10-EC350DC04ADC@gmail.com/
[2] https://marc.info/?l=linux-virtual-server&m=159679809118027&w=2
[3] Report from Dust:
https://archive.linuxvirtualserver.org/html/lvs-devel/2020-12/msg00000.html
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
ipvs: run_estimation should control the kthread tasks
ipvs: add est_cpulist and est_nice sysctl vars
ipvs: use kthreads for stats estimation
ipvs: use u64_stats_t for the per-cpu counters
ipvs: use common functions for stats allocation
ipvs: add rcu protection to stats
netfilter: flowtable: add a 'default' case to flowtable datapath
netfilter: conntrack: set icmpv6 redirects as RELATED
netfilter: ipset: Add support for new bitmask parameter
netfilter: conntrack: merge ipv4+ipv6 confirm functions
netfilter: conntrack: add sctp DATA_SENT state
netfilter: nft_inner: fix IS_ERR() vs NULL check
====================
Link: https://lore.kernel.org/r/20221211101204.1751-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmOXYmwACgkQaDWVMHDJ
krD8hg/+J0hUTfljmlCctwGZyqVR3Y2E722wL9oTvbgYiUAtFrARzfPF0WNwvHi5
Ywvod5hQ4unPoluthdVAD/uJqcPVhjIZ7CvNTGrS8J7ED5x5ydGLNWAL3Rn+9s6O
xkz/DsV4zl+cPQ60XLsO+3Mc6RhwVs9DUthpUovl22epmgmRPCovkHWkvQsZajJq
ceF/78ThfrkG4dDouaIXi1gsmKLLzU4KdHeBATMg0bgPQXFJZSGBCLaeJXWmLapq
7N3SznUqDMn4Plr/IuP4XuMA6VTVojrakCcBmw5SGVqhkVWGM1/FMg7jHSQS7Z5V
5uG7CkhTBqh17v9xKwDMPh34D51TLtNifA7jbecyL5155czFkj7BoSwEFINU/wCz
agUO9NvK9j1chUnA2UGqGQigM3nWGZHMwaQjfgBWyq5gqF8HURUUrjx6XuunOfmB
1byyrDu0g48u/zaQ/RpNfewz1ZY+WylDPcqOhYaVWF1PYThStML/VMBKpdsl1Ovw
nytUdQsaBIjFHQdB+snizaF93+/0FG+FTGAlDnHYmey/8plL2LYuzrcDnDYnGEXa
tN3HFd2lAi4JBLmvmgF39gH+BLXuKTLweIhwTXZTn91cfire3yxiXAnLd0tuptMP
aXFddxKMdMpxTqzy2X+8gJjqCr2lZ9gZkxaPsWwrBM+xrJf0p2w=
=JGnq
-----END PGP SIGNATURE-----
Merge tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 tdx updates from Dave Hansen:
"This includes a single chunk of new functionality for TDX guests which
allows them to talk to the trusted TDX module software and obtain an
attestation report.
This report can then be used to prove the trustworthiness of the guest
to a third party and get access to things like storage encryption
keys"
* tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/tdx: Test TDX attestation GetReport support
virt: Add TDX guest driver
x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module
The current version of the configuration structure has unaligned
16-bit fields, but according to the specification [1], access to
the configuration space must be aligned.
Add a second, aligned version of the configuration structure
and a new feature bit indicating that this version is being used.
[1] https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.pdf
Signed-off-by: Igor Skalkin <Igor.Skalkin@opensynergy.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
response to physical memory reconfiguration, or memory-side data
invalidation from operations like secure erase or memory-device unlock.
- Add a facility for the kernel to warn about collisions between kernel
and userspace access to PCI configuration registers
- Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1)
- Add handling and reporting of CXL errors reported via the PCIe AER
mechanism
- Add support for CXL Persistent Memory Security commands
- Add support for the "XOR" algorithm for CXL host bridge interleave
- Rework / simplify CXL to NVDIMM interactions
- Miscellaneous cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY5UpyAAKCRDfioYZHlFs
Z0ttAP4uxCjIibKsFVyexpSgI4vaZqQ9yt9NesmPwonc0XookwD+PlwP6Xc0d0Ox
t0gJ6+pwdh11NRzhcNE1pAaPcJZU4gs=
=HAQk
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for 6.2.
While it may seem backwards, the CXL update this time around includes
some focus on CXL 1.x enabling where the work to date had been with
CXL 2.0 (VH topologies) in mind.
First generation CXL can mostly be supported via BIOS, similar to DDR,
however it became clear there are use cases for OS native CXL error
handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
hosts (Restricted CXL Host (RCH) topologies). So, this update brings
RCH topologies into the Linux CXL device model.
In support of the ongoing CXL 2.0+ enabling two new core kernel
facilities are added.
One is the ability for the kernel to flag collisions between userspace
access to PCI configuration registers and kernel accesses. This is
brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
mailbox over config-cycles.
The other is a cpu_cache_invalidate_memregion() API that maps to
wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
VMs and architectures that do not support it yet. The CXL paths that
need it, dynamic memory region creation and security commands (erase /
unlock), are disabled when it is not present.
As for the CXL 2.0+ this cycle the subsystem gains support Persistent
Memory Security commands, error handling in response to PCIe AER
notifications, and support for the "XOR" host bridge interleave
algorithm.
Summary:
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
response to physical memory reconfiguration, or memory-side data
invalidation from operations like secure erase or memory-device
unlock.
- Add a facility for the kernel to warn about collisions between
kernel and userspace access to PCI configuration registers
- Add support for Restricted CXL Host (RCH) topologies (formerly CXL
1.1)
- Add handling and reporting of CXL errors reported via the PCIe AER
mechanism
- Add support for CXL Persistent Memory Security commands
- Add support for the "XOR" algorithm for CXL host bridge interleave
- Rework / simplify CXL to NVDIMM interactions
- Miscellaneous cleanups and fixes"
* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
cxl/region: Fix memdev reuse check
cxl/pci: Remove endian confusion
cxl/pci: Add some type-safety to the AER trace points
cxl/security: Drop security command ioctl uapi
cxl/mbox: Add variable output size validation for internal commands
cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
cxl/security: Fix Get Security State output payload endian handling
cxl: update names for interleave ways conversion macros
cxl: update names for interleave granularity conversion macros
cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
tools/testing/cxl: Require cache invalidation bypass
cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
cxl/region: Fix spelling mistake "memergion" -> "memregion"
cxl/regs: Fix sparse warning
cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
tools/testing/cxl: Add an RCH topology
cxl/port: Add RCD endpoint port enumeration
cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
tools/testing/cxl: Add XOR Math support to cxl_test
cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
...
x86 Xen-for-KVM:
* Allow the Xen runstate information to cross a page boundary
* Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
* add support for 32-bit guests in SCHEDOP_poll
x86 fixes:
* One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
* Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
years back when eliminating unnecessary barriers when switching between
vmcs01 and vmcs02.
* Clean up the MSR filter docs.
* Clean up vmread_error_trampoline() to make it more obvious that params
must be passed on the stack, even for x86-64.
* Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
of the current guest CPUID.
* Fudge around a race with TSC refinement that results in KVM incorrectly
thinking a guest needs TSC scaling when running on a CPU with a
constant TSC, but no hardware-enumerated TSC frequency.
* Advertise (on AMD) that the SMM_CTL MSR is not supported
* Remove unnecessary exports
Selftests:
* Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
* Fix an ordering issue in the AMX test introduced by recent conversions
to use kvm_cpu_has(), and harden the code to guard against similar bugs
in the future. Anything that tiggers caching of KVM's supported CPUID,
kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
the caching occurs before the test opts in via prctl().
* Fix build errors that occur in certain setups (unsure exactly what is
unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
* Introduce actual atomics for clear/set_bit() in selftests
Documentation:
* Remove deleted ioctls from documentation
* Various fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmOWgtsACgkQ6rmadz2v
bTpT2g//WzQRsODtPVVmg87fEo1GSTXvoXq/fhg95OKNZrVKgx1N6EVlFSLSqEjL
TAmOuv5cZT28ZpMPMNjnU/c/lFf/6/UWbbTusA+F3MtSCBSbP5DPsWDD0yvNT9DL
EZbGoQDSyt1M+BakZLzwOV6HPn9oDhj5p/4lMw+gptTY+3IeYUbS50DinM8eLz+Q
067aF01p3ROF6LNUx9Az0cLPdU05oHzL2MvRsj/F7h/sWoSW5B/1Kx/m1vsT9lwn
T2vbm6r4Jo0m0ZvpEMeRyKNZgVKIc64C7NH9CV7V66giJaONmxvLwkc0zWFwbXJ2
V9aPQbbBUx/CZXoC72LEsvVcoAFl7LAL1IALm2HVt1iQjpj1yDlWw3WV0PMQ9Rn7
xRVDOfQNGZ6jnkv6LB2j7V1z7hVENWQQwM48dgO2pAnJwYmUW9wZaAGE5kadUrZf
eCD4c1U+qcZkSk4vwvpr8ubJ0PWPMUZqI0FrHUxfPxqkdy78c1h3qNQufZvAHWff
Ca9NZqraFACTx58ZBsN1V5Xzv7azoK8Zgr9+JwVNahpFxclrbL8xuceThkC4smBl
fiZJC9fClD9ATquIdj177jNMVC8F4B5yrKF/ehJDcNQhcqUdWx9Sbj461enf+3HI
nfTP+77ZzyIJ76iRXJBV/jr9wkaPWhAZVeBGxmw5clTvB9/RBbU=
=fzwv
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:
====================
pull-request: bpf-next 2022-12-11
We've added 74 non-merge commits during the last 11 day(s) which contain
a total of 88 files changed, 3362 insertions(+), 789 deletions(-).
The main changes are:
1) Decouple prune and jump points handling in the verifier, from Andrii.
2) Do not rely on ALLOW_ERROR_INJECTION for fmod_ret, from Benjamin.
Merged from hid tree.
3) Do not zero-extend kfunc return values. Necessary fix for 32-bit archs,
from Björn.
4) Don't use rcu_users to refcount in task kfuncs, from David.
5) Three reg_state->id fixes in the verifier, from Eduard.
6) Optimize bpf_mem_alloc by reusing elements from free_by_rcu, from Hou.
7) Refactor dynptr handling in the verifier, from Kumar.
8) Remove the "/sys" mount and umount dance in {open,close}_netns
in bpf selftests, from Martin.
9) Enable sleepable support for cgrp local storage, from Yonghong.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (74 commits)
selftests/bpf: test case for relaxed prunning of active_lock.id
selftests/bpf: Add pruning test case for bpf_spin_lock
bpf: use check_ids() for active_lock comparison
selftests/bpf: verify states_equal() maintains idmap across all frames
bpf: states_equal() must build idmap for all function frames
selftests/bpf: test cases for regsafe() bug skipping check_id()
bpf: regsafe() must not skip check_ids()
docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE
selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
bpf: Use memmove for bpf_dynptr_{read,write}
bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
bpf: Rework check_func_arg_reg_off
bpf: Rework process_dynptr_func
bpf: Propagate errors from process_* checks in check_func_arg
bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
bpf: Skip rcu_barrier() if rcu_trace_implies_rcu_gp() is true
bpf: Reuse freed element in free_by_rcu during allocation
selftests/bpf: Bring test_offload.py back to life
bpf: Fix comment error in fixup_kfunc_call function
bpf: Do not zero-extend kfunc return values
...
====================
Link: https://lore.kernel.org/r/20221212024701.73809-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Add timens support (when switching mm). This version has survived
in -next for the entire cycle (Andrei Vagin).
- Various small bug fixes, refactoring, and readability improvements
(Bernd Edlinger, Rolf Eike Beer, Bo Liu, Li Zetao Liu Shixin).
- Remove FOLL_FORCE for stack setup (Kees Cook).
- Whilespace cleanups (Rolf Eike Beer, Kees Cook).
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOOjsgWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJqwED/9mjtKL2GwHOYKsfhtc0m4HVGBw
gxTEKuyo5mRwaRLg2bfuWe1OQfeGWQd9+IZ83Kr2ijzm4R16Gslv9i69Iwdf2tce
iFf2R+iR7On+zNokHxaNflRH9fMsZLobVFqzLvB73BUF82ybJlTR3WMnQhS6HZQB
Gse8jRfueOnVgKldRLlgdxIucPVsXYSoBS4B0nvIUuQn3aNzDNuuctMe/5NFK0ud
+TWMXtKzS3B9pcLTXy3e0bPk/Ptio18CBUEI+iLMAHswtNCoxx1ZCcuvnEcrd5Qr
h2WGaRvYJ7oSUXeEsqPKuDdhqEJQH2AQoX8FzvD+hyIutQJCJzVYlHvwGCqn/Km6
0Dalng9Pjb6z2LEie/N42LDXEQmLZO2WtJ4otpORJlsJ7ZkrLjB4u+hDU1JA/Q14
YPWvth3fMA5vAFKvGCtpEc7YdHmghmXCW+YGXOBm625fPYnwFSXOarHfow1RKNE5
MOM4l60WwzLIHgmr8AFUaLf8TbutXN+BKvbMRh2ToWzDYXEoywxAedHDyo4LVwEy
mZEca/3izT1ynBcyZg1t8shf4htgLjcPHqM0B+Hq0iNMIrwtecqAcYL/Oj6XssPx
OuQYv341KF9fV/hMy84GM2HMr0ygUmrP7b9x+PEvCwzWf/2Glaw6Z4rtCdYC+TjW
8ZWqPqEY+LRsZsL18Q==
=ZDYk
-----END PGP SIGNATURE-----
Merge tag 'execve-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
"Most are small refactorings and bug fixes, but three things stand out:
switching timens (which got reverted before) looks solid now,
FOLL_FORCE has been removed (no failures seen yet across several weeks
in -next), and some whitespace cleanups (which are long overdue).
- Add timens support (when switching mm). This version has survived
in -next for the entire cycle (Andrei Vagin)
- Various small bug fixes, refactoring, and readability improvements
(Bernd Edlinger, Rolf Eike Beer, Bo Liu, Li Zetao Liu Shixin)
- Remove FOLL_FORCE for stack setup (Kees Cook)
- Whitespace cleanups (Rolf Eike Beer, Kees Cook)"
* tag 'execve-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
binfmt_misc: fix shift-out-of-bounds in check_special_flags
binfmt: Fix error return code in load_elf_fdpic_binary()
exec: Remove FOLL_FORCE for stack setup
binfmt_elf: replace IS_ERR() with IS_ERR_VALUE()
binfmt_elf: simplify error handling in load_elf_phdrs()
binfmt_elf: fix documented return value for load_elf_phdrs()
exec: simplify initial stack size expansion
binfmt: Fix whitespace issues
exec: Add comments on check_unsafe_exec() fs counting
ELF uapi: add spaces before '{'
selftests/timens: add a test for vfork+exit
fs/exec: switch timens when a task gets a new mm
Added new GSO type for USO: VIRTIO_NET_HDR_GSO_UDP_L4.
Feature VIRTIO_NET_F_HOST_USO allows to enable NETIF_F_GSO_UDP_L4.
Separated VIRTIO_NET_F_GUEST_USO4 & VIRTIO_NET_F_GUEST_USO6 features
required for Windows guests.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added 2 additional offlloads for USO(IPv4 & IPv6).
Separate offloads are required for Windows VM guests,
g.e. Windows may set USO rx only for IPv4.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmOS/ooACgkQrB3Eaf9P
W7cOVA/+L8rHwLe78DDz/PNESyShtTVCYBDF/ngYMV8AIvjSfPresMbFV3NKqO5E
3qbMl199QH2eWI7dhQaQ+edynSG0QCx5FmPai0UuHPLxATct1pNPJPpvBryO/4jC
ZouYBIVjdMbq6Y8vD2gJ8UtA7TZpncP0HYOKTvYyDL9kQ+nUmu9KUYxcEcNHL5w+
TjL9jJafR+GqczCRiwAoMKIFV7lUrTFzh7slfINNN5DVTuzN33H7Tp70z6IKOfVL
1LATlZv7mqpLVF6dQuMXOt6kd/BEBl1y4ZHTHow5nstJvwu99P96iKwEfIXuOvWK
fulhDU61eIik8D9QJWeM7TuZDbYewWI77plwVY/R/zRt0At4VLpq7I1m33CmLLMY
Fb5fMxJPkM8YAtDID+BknYPrSAcxo8ji04BWFrVqQ6InPmtGfnP83XSSkYfxY7FB
3hUfz4igsJpV5vrS1EFRhjklNwI+jY2yAvIggQtdkJ97ubSUY3E4ACfNqlJ5lJbv
2KqWnSKlG21F9ZTR68VzcQVhFIQF6j/EuQqro+4TQUIdZswcml2iK32zrel0rs9C
iAsgQQaMV9a2vEaScRZqdOJ4HENTbm9wD7Mso/i5vr+lnpr1ThKjQo8osU8YUlbC
SDTMeWRRos+esFML6SP+YZ7SM/qXMluou204x/llJ/VDMXQ5e8k=
=enQp
-----END PGP SIGNATURE-----
Merge tag 'ipsec-next-2022-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
ipsec-next 2022-12-09
1) Add xfrm packet offload core API.
From Leon Romanovsky.
2) Add xfrm packet offload support for mlx5.
From Leon Romanovsky and Raed Salem.
3) Fix a typto in a error message.
From Colin Ian King.
* tag 'ipsec-next-2022-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: (38 commits)
xfrm: Fix spelling mistake "oflload" -> "offload"
net/mlx5e: Open mlx5 driver to accept IPsec packet offload
net/mlx5e: Handle ESN update events
net/mlx5e: Handle hardware IPsec limits events
net/mlx5e: Update IPsec soft and hard limits
net/mlx5e: Store all XFRM SAs in Xarray
net/mlx5e: Provide intermediate pointer to access IPsec struct
net/mlx5e: Skip IPsec encryption for TX path without matching policy
net/mlx5e: Add statistics for Rx/Tx IPsec offloaded flows
net/mlx5e: Improve IPsec flow steering autogroup
net/mlx5e: Configure IPsec packet offload flow steering
net/mlx5e: Use same coding pattern for Rx and Tx flows
net/mlx5e: Add XFRM policy offload logic
net/mlx5e: Create IPsec policy offload tables
net/mlx5e: Generalize creation of default IPsec miss group and rule
net/mlx5e: Group IPsec miss handles into separate struct
net/mlx5e: Make clear what IPsec rx_err does
net/mlx5e: Flatten the IPsec RX add rule path
net/mlx5e: Refactor FTE setup code to be more clear
net/mlx5e: Move IPsec flow table creation to separate function
...
====================
Link: https://lore.kernel.org/r/20221209093310.4018731-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support to count upall packets, when kmod of openvswitch
upcall to count the number of packets for upcall succeed and
failed, which is a better way to see how many packets upcalled
on every interfaces.
Signed-off-by: wangchuanlei <wangchuanlei@inspur.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on.
- Merge the pKVM shadow vcpu state tracking that allows the hypervisor
to have its own view of a vcpu, keeping that state private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB pages
only) as a prefix of the oncoming support for 4kB and 16kB pages.
- Add/Enable/Fix a bunch of selftests covering memslots, breakpoints,
stage-2 faults and access tracking. You name it, we got it, we
probably broke it.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
As a side effect, this tag also drags:
- The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring
series
- A shared branch with the arm64 tree that repaints all the system
registers to match the ARM ARM's naming, and resulting in
interesting conflicts
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmOODb0PHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDztsQAInRnsgLl57/SpqhZzExNCllN6AT/bdeB3uz
rnw3ScJOV174uNKp8lnPWoTvu2YUGiVtBp6tFHhDI8le7zHX438ZT8KE5mcs8p5i
KfFKnb8SHV2DDpqkcy24c0Xl/6vsg1qkKrdfJb49yl5ZakRITDpynW/7tn6dXsxX
wASeGFdCYeW4g2xMQzsCbtx6LgeQ8uomBmzRfPrOtZHYYxAn6+4Mj4595EC1sWxM
AQnbp8tW3Vw46saEZAQvUEOGOW9q0Nls7G21YqQ52IA+ZVDK1LmAF2b1XY3edjkk
pX8EsXOURfqdasBxfSfF3SgnUazoz9GHpSzp1cTVTktrPp40rrT7Ldtml0ktq69d
1malPj47KVMDsIq0kNJGnMxciXFgAHw+VaCQX+k4zhIatNwviMbSop2fEoxj22jc
4YGgGOxaGrnvmAJhreCIbr4CkZk5CJ8Zvmtfg+QM6npIp8BY8896nvORx/d4i6tT
H4caadd8AAR56ANUyd3+KqF3x0WrkaU0PLHJLy1tKwOXJUUTjcpvIfahBAAeUlSR
qEFrtb+EEMPgAwLfNOICcNkPZR/yyuYvM+FiUQNVy5cNiwFkpztpIctfOFaHySGF
K07O2/a1F6xKL0OKRUg7hGKknF9ecmux4vHhiUMuIk9VOgNTWobHozBDorLKXMzC
aWa6oGVC
=iIPT
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.2
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on.
- Merge the pKVM shadow vcpu state tracking that allows the hypervisor
to have its own view of a vcpu, keeping that state private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB pages
only) as a prefix of the oncoming support for 4kB and 16kB pages.
- Add/Enable/Fix a bunch of selftests covering memslots, breakpoints,
stage-2 faults and access tracking. You name it, we got it, we
probably broke it.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
As a side effect, this tag also drags:
- The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring
series
- A shared branch with the arm64 tree that repaints all the system
registers to match the ARM ARM's naming, and resulting in
interesting conflicts
Add an option to initialize SOF_TIMESTAMPING_OPT_ID for TCP from
write_seq sockets instead of snd_una.
This should have been the behavior from the start. Because processes
may now exist that rely on the established behavior, do not change
behavior of the existing option, but add the right behavior with a new
flag. It is encouraged to always set SOF_TIMESTAMPING_OPT_ID_TCP on
stream sockets along with the existing SOF_TIMESTAMPING_OPT_ID.
Intuitively the contract is that the counter is zero after the
setsockopt, so that the next write N results in a notification for
the last byte N - 1.
On idle sockets snd_una == write_seq and this holds for both. But on
sockets with data in transmission, snd_una records the unacked offset
in the stream. This depends on the ACK response from the peer. A
process cannot learn this in a race free manner (ioctl SIOCOUTQ is one
racy approach).
write_seq records the offset at the last byte written by the process.
This is a better starting point. It matches the intuitive contract in
all circumstances, unaffected by external behavior.
The new timestamp flag necessitates increasing sk_tsflags to 32 bits.
Move the field in struct sock to avoid growing the socket (for some
common CONFIG variants). The UAPI interface so_timestamping.flags is
already int, so 32 bits wide.
Reported-by: Sotirios Delimanolis <sotodel@meta.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20221207143701.29861-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the stack that is passed into the callback. To
reflect such a state, a special register type was created.
However, some checks have been bypassed incorrectly during the addition
of this feature. First, for arg_type with MEM_UNINIT flag which
initialize a dynptr, they must be rejected for such register type.
Secondly, in the future, there are plans to add dynptr helpers that
operate on the dynptr itself and may change its offset and other
properties.
In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
to such helpers, however the current code simply returns 0.
The rejection for helpers that release the dynptr is already handled.
For fixing this, we take a step back and rework existing code in a way
that will allow fitting in all classes of helpers and have a coherent
model for dealing with the variety of use cases in which dynptr is used.
First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
fact. To make the distinction clear, use MEM_RDONLY flag to indicate
that the helper only operates on the memory pointed to by the dynptr,
not the dynptr itself. In C parlance, it would be equivalent to taking
the dynptr as a point to const argument.
When either of these flags are not present, the helper is allowed to
mutate both the dynptr itself and also the memory it points to.
Currently, the read only status of the memory is not tracked in the
dynptr, but it would be trivial to add this support inside dynptr state
of the register.
With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
better reflect its usage, it can no longer be passed to helpers that
initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
A note to reviewers is that in code that does mark_stack_slots_dynptr,
and unmark_stack_slots_dynptr, we implicitly rely on the fact that
PTR_TO_STACK reg is the only case that can reach that code path, as one
cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
both cases such helpers won't be setting that flag.
The next patch will add a couple of selftest cases to make sure this
doesn't break.
Fixes: 2057156738 ("bpf: Add bpf_user_ringbuf_drain() helper")
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207204141.308952-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Usually when closing a crypto device (eg: dm-crypt with LUKS) the
volume key is not required, as it requires root privileges anyway, and
root can deny access to a disk in many ways regardless. Requiring the
volume key to lock the device is a peculiarity of the OPAL
specification.
Given we might already have saved the key if the user requested it via
the 'IOC_OPAL_SAVE' ioctl, we can use that key to lock the device if no
key was provided here and the locking range matches, and the user sets
the appropriate flag with 'IOC_OPAL_SAVE'. This allows integrating OPAL
with tools and libraries that are used to the common behaviour and do
not ask for the volume key when closing a device.
Callers can always pass a non-zero key and it will be used regardless,
as before.
Suggested-by: Štěpán Horáček <stepan.horacek@gmail.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20221206092913.4625-1-luca.boccassi@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In the specification documents for the Uncompressed and MJPEG USB
Video Payloads, the field name is bmInterlaceFlags - it has been
misnamed within the kernel.
Although renaming the field does break the kernel's interface to
userspace it should be low-risk in this instance. The field is read
only and hardcoded to 0, so there was never any value in anyone
reading it. A search of the uvc-gadget application and all the
forks that I could find for it did not reveal any users either.
Fixes: cdda479f15 ("USB gadget: video class function driver")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
Link: https://lore.kernel.org/r/20221206161203.1562827-1-dan.scally@ideasonboard.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Expose port function commands to enable / disable migratable
capability, this is used to set the port function as migratable.
Live migration is the process of transferring a live virtual machine
from one physical host to another without disrupting its normal
operation.
In order for a VM to be able to perform LM, all the VM components must
be able to perform migration. e.g.: to be migratable.
In order for VF to be migratable, VF must be bound to VFIO driver with
migration support.
When migratable capability is enabled for a function of the port, the
device is making the necessary preparations for the function to be
migratable, which might include disabling features which cannot be
migrated.
Example of LM with migratable function configuration:
Set migratable of the VF's port function.
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
function:
hw_addr 00:00:00:00:00:00 migratable disable
$ devlink port function set pci/0000:06:00.0/2 migratable enable
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
function:
hw_addr 00:00:00:00:00:00 migratable enable
Bind VF to VFIO driver with migration support:
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/unbind
$ echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:08:00.0/driver_override
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/bind
Attach VF to the VM.
Start the VM.
Perform LM.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose port function commands to enable / disable RoCE, this is used to
control the port RoCE device capabilities.
When RoCE is disabled for a function of the port, function cannot create
any RoCE specific resources (e.g GID table).
It also saves system memory utilization. For example disabling RoCE enable a
VF/SF saves 1 Mbytes of system memory per function.
Example of a PCI VF port which supports function configuration:
Set RoCE of the VF's port function.
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
function:
hw_addr 00:00:00:00:00:00 roce enable
$ devlink port function set pci/0000:06:00.0/2 roce disable
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
function:
hw_addr 00:00:00:00:00:00 roce disable
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tag branch
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
* tag 'br-v6.2i' of git://linuxtv.org/hverkuil/media_tree: (31 commits)
media: s5c73m3: Switch to GPIO descriptors
media: i2c: s5k5baf: switch to using gpiod API
media: i2c: s5k6a3: switch to using gpiod API
media: imx: remove code for non-existing config IMX_GPT_ICAP
media: si470x: Fix use-after-free in si470x_int_in_callback()
media: staging: stkwebcam: Restore MEDIA_{USB,CAMERA}_SUPPORT dependencies
media: coda: Add check for kmalloc
media: coda: Add check for dcoda_iram_alloc
dt-bindings: media: s5c73m3: Fix reset-gpio descriptor
media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property
media: s5k4ecgx: Delete driver
media: s5k4ecgx: Switch to GPIO descriptors
media: Switch to use dev_err_probe() helper
headers: Remove some left-over license text in include/uapi/linux/v4l2-*
headers: Remove some left-over license text in include/uapi/linux/dvb/
media: usb: pwc-uncompress: Use flex array destination for memcpy()
media: s5p-mfc: Fix to handle reference queue during finishing
media: s5p-mfc: Clear workbit to handle error condition
media: s5p-mfc: Fix in register read and write for H264
media: imx: Use get_mbus_config instead of parsing upstream DT endpoints
...
Remove some left-over from commit e2be04c7f9 ("License cleanup: add SPDX
license identifier to uapi header files with a license")
When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.
Remove it now.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Remove some left-over from commit e2be04c7f9 ("License cleanup: add
SPDX license identifier to uapi header files with a license")
When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.
Remove it now.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
CXL PMEM security operations are routed through the NVDIMM sysfs
interface. For this reason the corresponding commands are marked
"exclusive" to preclude collisions between the ioctl ABI and the sysfs
ABI. However, a better way to preclude that collision is to simply
remove the ioctl ABI (command-id definitions) for those operations.
Now that cxl_internal_send_cmd() (formerly cxl_mbox_send_cmd()) no
longer needs to talk the cxl_mem_commands array, all of the uapi
definitions for the security commands can be dropped.
These never appeared in a released kernel, so no regression risk.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030056464.4044561.11486507095384253833.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The optional PRE_COPY states open the saving data transfer FD before
reaching STOP_COPY and allows the device to dirty track internal state
changes with the general idea to reduce the volume of data transferred
in the STOP_COPY stage.
While in PRE_COPY the device remains RUNNING, but the saving FD is open.
Only if the device also supports RUNNING_P2P can it support PRE_COPY_P2P,
which halts P2P transfers while continuing the saving FD.
PRE_COPY, with P2P support, requires the driver to implement 7 new arcs
and exists as an optional FSM branch between RUNNING and STOP_COPY:
RUNNING -> PRE_COPY -> PRE_COPY_P2P -> STOP_COPY
A new ioctl VFIO_MIG_GET_PRECOPY_INFO is provided to allow userspace to
query the progress of the precopy operation in the driver with the idea it
will judge to move to STOP_COPY at least once the initial data set is
transferred, and possibly after the dirty size has shrunk appropriately.
This ioctl is valid only in PRE_COPY states and kernel driver should
return -EINVAL from any other migration state.
Compared to the v1 clarification, STOP_COPY -> PRE_COPY is blocked
and to be defined in future.
We also split the pending_bytes report into the initial and sustaining
values, e.g.: initial_bytes and dirty_bytes.
initial_bytes: Amount of initial precopy data.
dirty_bytes: Device state changes relative to data previously retrieved.
These fields are not required to have any bearing to STOP_COPY phase.
It is recommended to leave PRE_COPY for STOP_COPY only after the
initial_bytes field reaches zero. Leaving PRE_COPY earlier might make
things slower.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-3-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
scripts/kernel-doc spouts multiple warnings, so fix them:
include/uapi/linux/dvb/frontend.h:399: warning: Enum value 'QAM_1024' not described in enum 'fe_modulation'
include/uapi/linux/dvb/frontend.h:399: warning: Enum value 'QAM_4096' not described in enum 'fe_modulation'
frontend.h:286: warning: contents before sections
frontend.h:780: warning: missing initial short description on line:
* enum atscmh_rs_code_mode
Fixes: 8220ead805 ("media: dvb/frontend.h: document the uAPI file")
Fixes: 6508a50fe8 ("media: dvb: add DVB-C2 and DVB-S2X parameter values")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Add netlink based support for "ethtool -x <dev> [context x]"
command by implementing ETHTOOL_MSG_RSS_GET netlink message.
This is equivalent to functionality provided via ETHTOOL_GRSSH
in ioctl path. It sends RSS table, hash key and hash function
of an interface to user space.
This patch implements existing functionality available
in ioctl path and enables addition of new RSS context
based parameters in future.
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Link: https://lore.kernel.org/r/20221202002555.241580-1-sudheer.mogilappagari@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When syncing this code into btrfs-progs Dave noticed there's some things
we were losing in the sync that are needed. This syncs those changes
into the kernel, which include a few comments that weren't in the
kernel, some whitespace changes, an attribute, and the cplusplus bit.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We already have this defined in btrfs-progs, add it to the kernel to
make it easier to sync these files into btrfs-progs.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For directories with encrypted files/filenames, we need to store a flag
indicating this fact. There's no room in other fields, so we'll need to
borrow a bit from dir_type. Since it's now a combination of type and
flags, we rename it to dir_flags to reflect its new usage.
The new flag, FT_ENCRYPTED, indicates a directory containing encrypted
data, which is orthogonal to file type; therefore, add the new
flag, and make conversion from directory type to file type strip the
flag.
As the file types almost never change we can afford to use the bits.
Actual usage will be guarded behind an incompat bit, this patch only
adds the support for later use by fscrypt.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have maximum link and name length limits, move these to btrfs_tree.h
as they're on disk limitations.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ reformat comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
The bulk of our on-disk definitions exist in btrfs_tree.h, which user
space can use. Keep things consistent and move the rest of the on disk
definitions out of ctree.h into btrfs_tree.h. Note I did have to update
all u8's to __u8, but otherwise this is a strict copy and paste.
Most of the definitions are mainly for internal use and are not
guaranteed stable public API and may change as we need. Compilation
failures by user applications can happen.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ reformat comments, style fixups ]
Signed-off-by: David Sterba <dsterba@suse.com>
In the next patches, the xfrm core code will be extended to support
new type of offload - packet offload. In that mode, both policy and state
should be specially configured in order to perform whole offloaded data
path.
Full offload takes care of encryption, decryption, encapsulation and
other operations with headers.
As this mode is new for XFRM policy flow, we can "start fresh" with flag
bits and release first and second bit for future use.
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Way back in 2016 in commit 5a8b187c61 ("pktcdvd: mark as unmaintained
and deprecated") this driver was marked as "will be removed soon". 5
years seems long enough to have it stick around after that, so finally
remove the thing now.
Reported-by: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Maier <balagi@justmail.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: linux-block@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20221202182758.1339039-1-gregkh@linuxfoundation.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are still references to the removed kvm_memory_region data structure
but the doc and comments should mention struct kvm_userspace_memory_region
instead, since that is what's used by the ioctl that replaced the old one
and this data structure support the same set of flags.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-4-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The documentation says that the ioctl has been deprecated, but it has been
actually removed and the remaining references are just left overs.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-3-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The documentation says that the ioctl has been deprecated, but it has been
actually removed and the remaining references are just left overs.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-2-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Resolve conflicts in drivers/vfio/vfio_main.c by using the iommfd version.
The rc fix was done a different way when iommufd patches reworked this
code.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This patch adds two new MPTCP netlink event types for PM listening
socket create and close, named MPTCP_EVENT_LISTENER_CREATED and
MPTCP_EVENT_LISTENER_CLOSED.
Add a new function mptcp_event_pm_listener() to push the new events
with family, port and addr to userspace.
Invoke mptcp_event_pm_listener() with MPTCP_EVENT_LISTENER_CREATED in
mptcp_listen() and mptcp_pm_nl_create_listen_socket(), invoke it with
MPTCP_EVENT_LISTENER_CLOSED in __mptcp_close_ssk().
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These old unused definitions were originally left around to prevent the
same mode numbers from being reused. However, we've now decided to
reuse the mode numbers anyway. So let's completely remove these old
unused definitions to avoid confusion. There is no reason for any code
to be using these constants in any way; and indeed, Debian Code Search
shows no uses of them (other than in copies or translations of the
header). So this should be perfectly safe.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221202035529.55992-1-ebiggers@kernel.org
Create callback function to support the nvdimm_security_ops() ->erase()
callback. Translate the operation to send "Passphrase Secure Erase"
security command for CXL memory device.
When the mem device is secure erased, cpu_cache_invalidate_memregion() is
called in order to invalidate all CPU caches before attempting to access
the mem device again.
See CXL 3.0 spec section 8.2.9.8.6.6 for reference.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983615293.2734609.10358657600295932156.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Create callback function to support the nvdimm_security_ops() ->unlock()
callback. Translate the operation to send "Unlock" security command for CXL
mem device.
When the mem device is unlocked, cpu_cache_invalidate_memregion() is called
in order to invalidate all CPU caches before attempting to access the mem
device.
See CXL rev3.0 spec section 8.2.9.8.6.4 for reference.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983614167.2734609.15124543712487741176.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Create callback function to support the nvdimm_security_ops() ->freeze()
callback. Translate the operation to send "Freeze Security State" security
command for CXL memory device.
See CXL rev3.0 spec section 8.2.9.8.6.5 for reference.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983613019.2734609.10645754779802492122.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Create callback function to support the nvdimm_security_ops ->disable()
callback. Translate the operation to send "Disable Passphrase" security
command for CXL memory device. The operation supports disabling a
passphrase for the CXL persistent memory device. In the original
implementation of nvdimm_security_ops, this operation only supports
disabling of the user passphrase. This is due to the NFIT version of
disable passphrase only supported disabling of user passphrase. The CXL
spec allows disabling of the master passphrase as well which
nvidmm_security_ops does not support yet. In this commit, the callback
function will only support user passphrase.
See CXL rev3.0 spec section 8.2.9.8.6.3 for reference.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983611878.2734609.10602135274526390127.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Create callback function to support the nvdimm_security_ops ->change_key()
callback. Translate the operation to send "Set Passphrase" security command
for CXL memory device. The operation supports setting a passphrase for the
CXL persistent memory device. It also supports the changing of the
currently set passphrase. The operation allows manipulation of a user
passphrase or a master passphrase.
See CXL rev3.0 spec section 8.2.9.8.6.2 for reference.
However, the spec leaves a gap WRT master passphrase usages. The spec does
not define any ways to retrieve the status of if the support of master
passphrase is available for the device, nor does the commands that utilize
master passphrase will return a specific error that indicates master
passphrase is not supported. If using a device does not support master
passphrase and a command is issued with a master passphrase, the error
message returned by the device will be ambiguous.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983610751.2734609.4445075071552032091.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add support for XTS and CTS mode variant of SM4 algorithm. The former is
used to encrypt file contents, while the latter (SM4-CTS-CBC) is used to
encrypt filenames.
SM4 is a symmetric algorithm widely used in China, and is even mandatory
algorithm in some special scenarios. We need to provide these users with
the ability to encrypt files or disks using SM4-XTS.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221201125819.36932-3-tianjia.zhang@linux.alibaba.com
Add the BGR666 format MEDIA_BUS_FMT_BGR666_1X24_CPADHI supported by the
RaspberryPi.
Signed-off-by: Joerg Quinten <aBUGSworstnightmare@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20221013-rpi-dpi-improvements-v3-3-eb76e26a772d@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Add the BGR666 format MEDIA_BUS_FMT_BGR666_1X18 supported by the
RaspberryPi.
Signed-off-by: Joerg Quinten <aBUGSworstnightmare@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20221013-rpi-dpi-improvements-v3-2-eb76e26a772d@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Add the MEDIA_BUS_FMT_RGB565_1X24_CPADHI format used by the Geekworm
MZP280 panel for the Raspberry Pi.
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://lore.kernel.org/r/20221013-rpi-dpi-improvements-v3-1-eb76e26a772d@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
To read from a region, user space must currently request a new snapshot of
the region and then read from that snapshot. This can sometimes be overkill
if user space only reads a tiny portion. They first create the snapshot,
then request a read, then destroy the snapshot.
For regions which have a single underlying "contents", it makes sense to
allow supporting direct reading of the region data.
Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if
requested via the new DEVLINK_ATTR_REGION_DIRECT. If this attribute is set,
then perform a direct read instead of using a snapshot. Direct read is
mutually exclusive with DEVLINK_ATTR_REGION_SNAPSHOT_ID, and care is taken
to ensure that we reject commands which provide incorrect attributes.
Regions must enable support for direct read by implementing the .read()
callback function. If a region does not support such direct reads, a
suitable extended error message is reported.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If a PR operation fails we can return a device-specific error which is
impossible to handle in some cases because we could have a mix of devices
when DM is used, or future users like LIO only knows it's interacting with
a block device so it doesn't know the type.
This patch adds a new pr_status enum so drivers can convert errors to a
common type which can be handled by the caller.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20221122032603.32766-2-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add nvdimm_security_ops support for CXL memory device with the introduction
of the ->get_flags() callback function. This is part of the "Persistent
Memory Data-at-rest Security" command set for CXL memory device support.
The ->get_flags() function provides the security state of the persistent
memory device defined by the CXL 3.0 spec section 8.2.9.8.6.1.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983609611.2734609.13231854299523325319.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
iommufd can directly implement the /dev/vfio/vfio container IOCTLs by
mapping them into io_pagetable operations.
A userspace application can test against iommufd and confirm compatibility
then simply make a small change to open /dev/iommu instead of
/dev/vfio/vfio.
For testing purposes /dev/vfio/vfio can be symlinked to /dev/iommu and
then all applications will use the compatibility path with no code
changes. A later series allows /dev/vfio/vfio to be directly provided by
iommufd, which allows the rlimit mode to work the same as well.
This series just provides the iommufd side of compatibility. Actually
linking this to VFIO_SET_CONTAINER is a followup series, with a link in
the cover letter.
Internally the compatibility API uses a normal IOAS object that, like
vfio, is automatically allocated when the first device is
attached.
Userspace can also query or set this IOAS object directly using the
IOMMU_VFIO_IOAS ioctl. This allows mixing and matching new iommufd only
features while still using the VFIO style map/unmap ioctls.
While this is enough to operate qemu, it has a few differences:
- Resource limits rely on memory cgroups to bound what userspace can do
instead of the module parameter dma_entry_limit.
- VFIO P2P is not implemented. The DMABUF patches for vfio are a start at
a solution where iommufd would import a special DMABUF. This is to avoid
further propogating the follow_pfn() security problem.
- A full audit for pedantic compatibility details (eg errnos, etc) has
not yet been done
- powerpc SPAPR is left out, as it is not connected to the iommu_domain
framework. It seems interest in SPAPR is minimal as it is currently
non-working in v6.1-rc1. They will have to convert to the iommu
subsystem framework to enjoy iommfd.
The following are not going to be implemented and we expect to remove them
from VFIO type1:
- SW access 'dirty tracking'. As discussed in the cover letter this will
be done in VFIO.
- VFIO_TYPE1_NESTING_IOMMU
https://lore.kernel.org/all/0-v1-0093c9b0e345+19-vfio_no_nesting_jgg@nvidia.com/
- VFIO_DMA_MAP_FLAG_VADDR
https://lore.kernel.org/all/Yz777bJZjTyLrHEQ@nvidia.com/
Link: https://lore.kernel.org/r/15-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Connect the IOAS to its IOCTL interface. This exposes most of the
functionality in the io_pagetable to userspace.
This is intended to be the core of the generic interface that IOMMUFD will
provide. Every IOMMU driver should be able to implement an iommu_domain
that is compatible with this generic mechanism.
It is also designed to be easy to use for simple non virtual machine
monitor users, like DPDK:
- Universal simple support for all IOMMUs (no PPC special path)
- An IOVA allocator that considers the aperture and the allowed/reserved
ranges
- io_pagetable allows any number of iommu_domains to be connected to the
IOAS
- Automatic allocation and re-use of iommu_domains
Along with room in the design to add non-generic features to cater to
specific HW functionality.
Link: https://lore.kernel.org/r/11-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This is the basic infrastructure of a new miscdevice to hold the iommufd
IOCTL API.
It provides:
- A miscdevice to create file descriptors to run the IOCTL interface over
- A table based ioctl dispatch and centralized extendable pre-validation
step
- An xarray mapping userspace ID's to kernel objects. The design has
multiple inter-related objects held within in a single IOMMUFD fd
- A simple usage count to build a graph of object relations and protect
against hostile userspace racing ioctls
The only IOCTL provided in this patch is the generic 'destroy any object
by handle' operation.
Link: https://lore.kernel.org/r/6-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a new parameter to complement the existing 'netmask' option. The
main difference between netmask and bitmask is that bitmask takes any
arbitrary ip address as input, it does not have to be a valid netmask.
The name of the new parameter is 'bitmask'. This lets us mask out
arbitrary bits in the ip address, for example:
ipset create set1 hash:ip bitmask 255.128.255.0
ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
SCTP conntrack currently assumes that the SCTP endpoints will
probe secondary paths using HEARTBEAT before sending traffic.
But, according to RFC 9260, SCTP endpoints can send any traffic
on any of the confirmed paths after SCTP association is up.
SCTP endpoints that sends INIT will confirm all peer addresses
that upper layer configures, and the SCTP endpoint that receives
COOKIE_ECHO will only confirm the address it sent the INIT_ACK to.
So, we can have a situation where the INIT sender can start to
use secondary paths without the need to send HEARTBEAT. This patch
allows DATA/SACK packets to create new connection tracking entry.
A new state has been added to indicate that a DATA/SACK chunk has
been seen in the original direction - SCTP_CONNTRACK_DATA_SENT.
State transitions mostly follows the HEARTBEAT_SENT, except on
receiving HEARTBEAT/HEARTBEAT_ACK/DATA/SACK in the reply direction.
State transitions in original direction:
- DATA_SENT behaves similar to HEARTBEAT_SENT for all chunks,
except that it remains in DATA_SENT on receving HEARTBEAT,
HEARTBEAT_ACK/DATA/SACK chunks
State transitions in reply direction:
- DATA_SENT behaves similar to HEARTBEAT_SENT for all chunks,
except that it moves to HEARTBEAT_ACKED on receiving
HEARTBEAT/HEARTBEAT_ACK/DATA/SACK chunks
Note: This patch still doesn't solve the problem when the SCTP
endpoint decides to use primary paths for association establishment
but uses a secondary path for association shutdown. We still have
to depend on timeout for connections to expire in such a case.
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Closer inspection of the Xen code shows that we aren't supposed to be
using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be
explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall.
If we randomly set the top bit of ->state_entry_time for a guest that
hasn't asked for it and doesn't expect it, that could make the runtimes
fail to add up and confuse the guest. Without the flag it's perfectly
safe for a vCPU to read its own vcpu_runstate_info; just not for one
vCPU to read *another's*.
I briefly pondered adding a word for the whole set of VMASST_TYPE_*
flags but the only one we care about for HVM guests is this, so it
seemed a bit pointless.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221127122210.248427-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY4AC5QAKCRDbK58LschI
g1e0AQCfAqduTy7mYd02jDNCV0wLphNp9FbPiP9OrQT37ABpKAEA1ulj1X59bX3d
HnZdDKuatcPZT9MV5hDLM7MFJ9GjOA4=
=fNmM
-----END PGP SIGNATURE-----
Daniel Borkmann says:
====================
bpf-next 2022-11-25
We've added 101 non-merge commits during the last 11 day(s) which contain
a total of 109 files changed, 8827 insertions(+), 1129 deletions(-).
The main changes are:
1) Support for user defined BPF objects: the use case is to allocate own
objects, build own object hierarchies and use the building blocks to
build own data structures flexibly, for example, linked lists in BPF,
from Kumar Kartikeya Dwivedi.
2) Add bpf_rcu_read_{,un}lock() support for sleepable programs,
from Yonghong Song.
3) Add support storing struct task_struct objects as kptrs in maps,
from David Vernet.
4) Batch of BPF map documentation improvements, from Maryam Tahhan
and Donald Hunter.
5) Improve BPF verifier to propagate nullness information for branches
of register to register comparisons, from Eduard Zingerman.
6) Fix cgroup BPF iter infra to hold reference on the start cgroup,
from Hou Tao.
7) Fix BPF verifier to not mark fentry/fexit program arguments as trusted
given it is not the case for them, from Alexei Starovoitov.
8) Improve BPF verifier's realloc handling to better play along with dynamic
runtime analysis tools like KASAN and friends, from Kees Cook.
9) Remove legacy libbpf mode support from bpftool,
from Sahid Orentino Ferdjaoui.
10) Rework zero-len skb redirection checks to avoid potentially breaking
existing BPF test infra users, from Stanislav Fomichev.
11) Two small refactorings which are independent and have been split out
of the XDP queueing RFC series, from Toke Høiland-Jørgensen.
12) Fix a memory leak in LSM cgroup BPF selftest, from Wang Yufen.
13) Documentation on how to run BPF CI without patch submission,
from Daniel Müller.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Link: https://lore.kernel.org/r/20221125012450.441-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
introduce a new ioctl to replace the whole content of a file atomically,
which means it induces truncate and content update at the same time.
We can start it with F2FS_IOC_START_ATOMIC_REPLACE and complete it with
F2FS_IOC_COMMIT_ATOMIC_WRITE. Or abort it with
F2FS_IOC_ABORT_ATOMIC_WRITE.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
- First batch of KVM changes for kernel virtual != physical address support
- Removal of a unused function
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmN/eYwACgkQ41TmuOI4
ufjoxA/9Et38aXO/IhmUt8v0QhA4yec+sc5GSFfQSYehej/1Vqhw0DXx+ORUiRgg
+rtiXJSSqkuD2dL+BDffY2xoul6nzNdVf4AbkcnrWscfWr6xwVYlPvuL0ymGI6J2
U/IPedRoKw0bHw/wHs05yV5PubrRwDFERKhtyXWYGbPJhX0w2n3IFOoKH1oWBhLW
Dc8jEs6t3gDbJ71Er0xoeBUoiuu+PgZG06cpOvzBZ0KclRgjADXyISqqk8/4mu8w
R+/Wf8NcrbQYV1jfCeq5zIsKC8uvnFj25UuyTLumn5vh+dNNsvE72Khe4tz7LI0I
ZPZ+GZuemu7Yi12dKjw4Sw3ui0ejWH/5XL1SVB0X/xYIWrBqOot+Lq6538GCng+c
tJt+zsu64VFgXCCZ8O9qO4uE4DBL70H3ThT7VZxIghSTZtY0xh3uFc64f3/3d9dy
K4WTJHrmMxhXaA/rqtIa8I53JvFl8CztofZATiQQesyPuc7lZ01w1Co5el4xYaxe
YknyMTq11qf/iYqVOW7sjoWW/YRuuMZ4+FhpI3o/SllVdN98iTwkk1kP3wcoBO5P
bvzpm+WXHbv9OxifPrqkqv34+upbjfEmSogHudQzagBX4vl3rZRfBCdQGCAha0Uc
ZYyg68kiil5sWmHI/Ln/ZjANYfbS5sF0CreuWxnmqcwKl2NSN/E=
=/1yt
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-6.2-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support
- Removal of a unused function
add contiguous nv12 tiled format nv12_8l128 and nv12_10be_8l128
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Extend the DVB frontend parameter enums with additional values specified
by the DVB-C2 (ETSI EN 302 769) and DVB-S2X (ETSI EN 302 307-2)
standards to be ready for frontend drivers for such receivers.
While most parameters will be "read-only" due to being autodetected by
the receiver and only being reported back for informational purposes,
the addition of SYS_DVBC2 to the delivery systems enum is required,
because there are DVB-C2 capable receivers which are not capable of
DVB-C/C2 autodetection and thus need this enum value to be explicitly
instructed to search for a DVB-C2 signal.
As for DVB-S2X, as that is an extension to DVB-S2, the same delivery
system enum as for DVB-S2 can be used.
Add the additional enum values and comments to the documentation.
Link: https://lore.kernel.org/linux-media/trinity-1b7c5a66-85d4-4595-a690-0fde965d49b3-1642146228587@3c-app-gmx-bap69
Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
After the commits for userspace (see Link tags below) the uuid field
is not being used in the ACRN code. Update kernel to reflect these
changes, i.e. do the following:
- adding a comment explaining that it's not used anymore
- replacing the specific type by a raw buffer
- updating the example code accordingly
The advertised field confused users and actually never been used.
So the wrong part here is that kernel puts something which userspace
never used and hence this may confuse a reader of this code.
Note, that there is only a single tool that had been prepared a year
ago for these forthcoming changes in the kernel.
Link: https://github.com/projectacrn/acrn-hypervisor/commit/da0d24326ed6
Link: https://github.com/projectacrn/acrn-hypervisor/commit/bb0327e70097
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221116162956.72658-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE to signal that the
KVM_PV_ASYNC_DISABLE and KVM_PV_ASYNC_DISABLE_PREPARE commands for the
KVM_S390_PV_COMMAND ioctl are available.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-4-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-4-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Until now, destroying a protected guest was an entirely synchronous
operation that could potentially take a very long time, depending on
the size of the guest, due to the time needed to clean up the address
space from protected pages.
This patch implements an asynchronous destroy mechanism, that allows a
protected guest to reboot significantly faster than previously.
This is achieved by clearing the pages of the old guest in background.
In case of reboot, the new guest will be able to run in the same
address space almost immediately.
The old protected guest is then only destroyed when all of its memory
has been destroyed or otherwise made non protected.
Two new PV commands are added for the KVM_S390_PV_COMMAND ioctl:
KVM_PV_ASYNC_CLEANUP_PREPARE: set aside the current protected VM for
later asynchronous teardown. The current KVM VM will then continue
immediately as non-protected. If a protected VM had already been
set aside for asynchronous teardown, but without starting the teardown
process, this call will fail. There can be at most one VM set aside at
any time. Once it is set aside, the protected VM only exists in the
context of the Ultravisor, it is not associated with the KVM VM
anymore. Its protected CPUs have already been destroyed, but not its
memory. This command can be issued again immediately after starting
KVM_PV_ASYNC_CLEANUP_PERFORM, without having to wait for completion.
KVM_PV_ASYNC_CLEANUP_PERFORM: tears down the protected VM previously
set aside using KVM_PV_ASYNC_CLEANUP_PREPARE. Ideally the
KVM_PV_ASYNC_CLEANUP_PERFORM PV command should be issued by userspace
from a separate thread. If a fatal signal is received (or if the
process terminates naturally), the command will terminate immediately
without completing. All protected VMs whose teardown was interrupted
will be put in the need_cleanup list. The rest of the normal KVM
teardown process will take care of properly cleaning up all remaining
protected VMs, including the ones on the need_cleanup list.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-2-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-2-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
In general, as of now, in FUSE, direct writes on the same file are
serialized over inode lock i.e we hold inode lock for the full duration of
the write request. I could not find in fuse code and git history a comment
which clearly explains why this exclusive lock is taken for direct writes.
Following might be the reasons for acquiring an exclusive lock but not be
limited to
1) Our guess is some USER space fuse implementations might be relying on
this lock for serialization.
2) The lock protects against file read/write size races.
3) Ruling out any issues arising from partial write failures.
This patch relaxes the exclusive lock for direct non-extending writes only.
File size extending writes might not need the lock either, but we are not
entirely sure if there is a risk to introduce any kind of regression.
Furthermore, benchmarking with fio does not show a difference between patch
versions that take on file size extension a) an exclusive lock and b) a
shared lock.
A possible example of an issue with i_size extending writes are write error
cases. Some writes might succeed and others might fail for file system
internal reasons - for example ENOSPACE. With parallel file size extending
writes it _might_ be difficult to revert the action of the failing write,
especially to restore the right i_size.
With these changes, we allow non-extending parallel direct writes on the
same file with the help of a flag called FOPEN_PARALLEL_DIRECT_WRITES. If
this flag is set on the file (flag is passed from libfuse to fuse kernel as
part of file open/create), we do not take exclusive lock anymore, but
instead use a shared lock that allows non-extending writes to run in
parallel. FUSE implementations which rely on this inode lock for
serialization can continue to do so and serialized direct writes are still
the default. Implementations that do not do write serialization need to be
updated and need to set the FOPEN_PARALLEL_DIRECT_WRITES flag in their file
open/create reply.
On patch review there were concerns that network file systems (or vfs
multiple mounts of the same file system) might have issues with parallel
writes. We believe this is not the case, as this is just a local lock,
which network file systems could not rely on anyway. I.e. this lock is
just for local consistency.
Signed-off-by: Dharmendra Singh <dsingh@ddn.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Add a flag to entry expiration that lets the filesystem expire a dentry
without kicking it out from the cache immediately.
This makes a difference for overmounted dentries, where plain invalidation
would detach all submounts before dropping the dentry from the cache. If
only expiry is set on the dentry, then any overmounts are left alone and
until ->d_revalidate() is called.
Note: ->d_revalidate() is not called for the case of following a submount,
so invalidation will only be triggered for the non-overmounted case. The
dentry could also be mounted in a different mount instance, in which case
any submounts will still be detached.
Suggested-by: Jakob Blomer <jblomer@cern.ch>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
HUTRR72 added 3 new usage codes for keys that are supposed to enable,
disable and toggle camera access. These are useful, considering many
laptops today have key(s) for toggling access to camera.
This patch adds new key definitions for KEY_CAMERA_ACCESS_ENABLE,
KEY_CAMERA_ACCESS_DISABLE and KEY_CAMERA_ACCESS_TOGGLE. Additionally
hid-debug is adjusted to recognize this new usage codes as well.
Signed-off-by: Eray Orçunus <erayorcunus@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/20221029120311.11152-3-erayorcunus@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
We can have dependencies between epoll and io_uring. Consider an epoll
context, identified by the epfd file descriptor, and an io_uring file
descriptor identified by iofd. If we add iofd to the epfd context, and
arm a multishot poll request for epfd with iofd, then the multishot
poll request will repeatedly trigger and generate events until terminated
by CQ ring overflow. This isn't a desired behavior.
Add EPOLL_URING so that io_uring can pass it in as part of the poll wakeup
key, and io_uring can check for that to detect a potential recursive
invocation.
Cc: stable@vger.kernel.org # 6.0
Signed-off-by: Jens Axboe <axboe@kernel.dk>
I've added a flex array to struct nlmsghdr in
commit 738136a0e3 ("netlink: split up copies in the ack construction")
to allow accessing the data easily. It leads to warnings with clang,
if user space wraps this structure into another struct and the flex
array is not at the end of the container.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/all/20221114023927.GA685@u2004-local/
Link: https://lore.kernel.org/r/20221118033903.1651026-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To fully utilize offload capabilities of Intel 100G card QoS capabilities
new attribute 'tx_weight' needs to be introduced. This attribute allows
for usage of Weighted Fair Queuing arbitration scheme among siblings.
This arbitration scheme can be used simultaneously with the strict
priority.
Introduce new attribute in devlink-rate that will allow for configuration
of Weighted Fair Queueing. New attribute is optional.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To fully utilize offload capabilities of Intel 100G card QoS capabilities
new attribute 'tx_priority' needs to be introduced. This attribute allows
for usage of strict priority arbiter among siblings. This arbitration
scheme attempts to schedule nodes based on their priority as long as the
nodes remain within their bandwidth limit.
Introduce new attribute in devlink-rate that will allow for configuration
of strict priority. New attribute is optional.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Many of the drivers which implement ethtool_ops::get_drvinfo() will
prints the .driver, .version or .bus_info of struct ethtool_drvinfo.
To have a glance of current state, do:
$ git grep -W "get_drvinfo(struct"
Printing in those three fields is useless because:
- since [1], the driver version should be the kernel version (at
least for upstream drivers). Arguably, out of tree drivers might
still want to set a custom version, but out of tree is not our
focus.
- since [2], the core is able to provide default values for .driver
and .bus_info.
In summary, drivers may provide .fw_version and .erom_version, the
rest is expected to be done by the core.
In struct ethtool_ops doc from linux/ethtool: rephrase field
get_drvinfo() doc to discourage developers from implementing this
callback.
In struct ethtool_drvinfo doc from uapi/linux/ethtool.h: remove the
paragraph mentioning what drivers should do. Rationale: no need to
repeat what is already written in struct ethtool_ops doc. But add a
note that .fw_version and .erom_version are driver defined.
Also update the dummy driver and simply remove the callback in order
not to confuse the newcomers: most of the drivers will not need this
callback function any more.
[1] commit 6a7e25c7fb ("net/core: Replace driver version to be
kernel version")
Link: https://git.kernel.org/torvalds/linux/c/6a7e25c7fb48
[2] commit edaf5df22c ("ethtool: ethtool_get_drvinfo: populate
drvinfo fields even if callback exits")
Link: https://git.kernel.org/netdev/net-next/c/edaf5df22cb8
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/20221116171828.4093-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove some left-over from commit e2be04c7f9 ("License cleanup: add SPDX
license identifier to uapi header files with a license")
When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Kai Vehmanen <kvcontact@nosignal.fi>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
TDX guest driver exposes IOCTL interfaces to service TDX guest
user-specific requests. Currently, it is only used to allow the user to
get the TDREPORT to support TDX attestation.
Details about the TDX attestation process are documented in
Documentation/x86/tdx.rst, and the IOCTL details are documented in
Documentation/virt/coco/tdx-guest.rst.
Operations like getting TDREPORT involves sending a blob of data as
input and getting another blob of data as output. It was considered
to use a sysfs interface for this, but it doesn't fit well into the
standard sysfs model for configuring values. It would be possible to
do read/write on files, but it would need multiple file descriptors,
which would be somewhat messy. IOCTLs seem to be the best fitting
and simplest model for this use case. The AMD sev-guest driver also
uses the IOCTL interface to support attestation.
[Bagas Sanjaya: Ack is for documentation portion]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/all/20221116223820.819090-3-sathyanarayanan.kuppuswamy%40linux.intel.com
Current release - regressions:
- tls: fix memory leak in tls_enc_skb() and tls_sw_fallback_init()
Previous releases - regressions:
- bridge: fix memory leaks when changing VLAN protocol
- dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims
- dsa: don't leak tagger-owned storage on switch driver unbind
- eth: mlxsw: avoid warnings when not offloaded FDB entry with IPv6 is removed
- eth: stmmac: ensure tx function is not running in stmmac_xdp_release()
- eth: hns3: fix return value check bug of rx copybreak
Previous releases - always broken:
- kcm: close race conditions on sk_receive_queue
- bpf: fix alignment problem in bpf_prog_test_run_skb()
- bpf: fix writing offset in case of fault in strncpy_from_kernel_nofault
- eth: macvlan: use built-in RCU list checking
- eth: marvell: add sleep time after enabling the loopback bit
- eth: octeon_ep: fix potential memory leak in octep_device_setup()
Misc:
- tcp: configurable source port perturb table size
- bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmN2FlMSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkWAwQAJcV7XEB7bEssgabFkEmC4uvS/sFlyHC
uSwFRn5ojaB2c56T1CnNYmitg9Wr4arC6Vca28iai6BgqB6t4qLRI/WWTsZiEPhi
mt/pjNN2u9JMyaafHFHYfXnbSDWRF7kPMpNw4l3uL0vkGyjSI7LGAOP4Qh8C1h/d
tNVSDZnj4Laj/3JtDf7Rk6ydCqPYnNdWxFfoZ/SQkjYZKD3Ze9tml7WJykAzCTLp
yUiPC6TvHOnWIZYbB04sVVOQD4V+95TSOgEhB6wzs/CXB7iBEY+N+oCedjP9Xrfw
n3ea4anBoTleDnJXJI57LhdJBkyoXncfbpbYLwXljyIgosr7XVTALvOG8XUhg/DW
FzN5DWQ54jzTsx2eXFJzjQQcDIgyxazk9EdoHdqF8byCasP+fofq1JvzyqtvNSyh
h8Ps6jdMZrWpXuFDVApXUhP32A/+9q+dFSYHJO681m6mf4CIaUXdm4aB1dkxDAvg
PSlk797U94RQCzJgqxhrgsq1PGQPBb+qadZrAiD3aQi26g0NWCTg7uFpCeCEK2ZF
fLwc2XxrwLQm1q7xQVoEg4UxPIIf0mUesvOD9sLDYop0rFIw8x0v7jdYM4kyhN3o
6FWAXKxBe3LJ9jTTsVTbZbfHYpTnS8Q2KSclBN+/dZNHwwsUPHjy17Z2Ct3o3Jlm
lNbiiD30BgsD
=vVJk
-----END PGP SIGNATURE-----
Merge tag 'net-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf.
Current release - regressions:
- tls: fix memory leak in tls_enc_skb() and tls_sw_fallback_init()
Previous releases - regressions:
- bridge: fix memory leaks when changing VLAN protocol
- dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims
- dsa: don't leak tagger-owned storage on switch driver unbind
- eth: mlxsw: avoid warnings when not offloaded FDB entry with IPv6
is removed
- eth: stmmac: ensure tx function is not running in
stmmac_xdp_release()
- eth: hns3: fix return value check bug of rx copybreak
Previous releases - always broken:
- kcm: close race conditions on sk_receive_queue
- bpf: fix alignment problem in bpf_prog_test_run_skb()
- bpf: fix writing offset in case of fault in
strncpy_from_kernel_nofault
- eth: macvlan: use built-in RCU list checking
- eth: marvell: add sleep time after enabling the loopback bit
- eth: octeon_ep: fix potential memory leak in octep_device_setup()
Misc:
- tcp: configurable source port perturb table size
- bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)"
* tag 'net-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
net: use struct_group to copy ip/ipv6 header addresses
net: usb: smsc95xx: fix external PHY reset
net: usb: qmi_wwan: add Telit 0x103a composition
netdevsim: Fix memory leak of nsim_dev->fa_cookie
tcp: configurable source port perturb table size
l2tp: Serialize access to sk_user_data with sk_callback_lock
net: thunderbolt: Fix error handling in tbnet_init()
net: microchip: sparx5: Fix potential null-ptr-deref in sparx_stats_init() and sparx5_start()
net: lan966x: Fix potential null-ptr-deref in lan966x_stats_init()
net: dsa: don't leak tagger-owned storage on switch driver unbind
net/x25: Fix skb leak in x25_lapb_receive_frame()
net: ag71xx: call phylink_disconnect_phy if ag71xx_hw_enable() fail in ag71xx_open()
bridge: switchdev: Fix memory leaks when changing VLAN protocol
net: hns3: fix setting incorrect phy link ksettings for firmware in resetting process
net: hns3: fix return value check bug of rx copybreak
net: hns3: fix incorrect hw rss hash type of rx packet
net: phy: marvell: add sleep time after enabling the loopback bit
net: ena: Fix error handling in ena_init()
kcm: close race conditions on sk_receive_queue
net: ionic: Fix error handling in ionic_init_module()
...
kernel test robot reported warnings when build bonding module with
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/net/bonding/:
from ../drivers/net/bonding/bond_main.c:35:
In function ‘fortify_memcpy_chk’,
inlined from ‘iph_to_flow_copy_v4addrs’ at ../include/net/ip.h:566:2,
inlined from ‘bond_flow_ip’ at ../drivers/net/bonding/bond_main.c:3984:3:
../include/linux/fortify-string.h:413:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of f
ield (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
413 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘fortify_memcpy_chk’,
inlined from ‘iph_to_flow_copy_v6addrs’ at ../include/net/ipv6.h:900:2,
inlined from ‘bond_flow_ip’ at ../drivers/net/bonding/bond_main.c:3994:3:
../include/linux/fortify-string.h:413:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of f
ield (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
413 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is because we try to copy the whole ip/ip6 address to the flow_key,
while we only point the to ip/ip6 saddr. Note that since these are UAPI
headers, __struct_group() is used to avoid the compiler warnings.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: c3f8324188 ("net: Add full IPv6 addresses to flow_keys")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20221115142400.1204786-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
1) Fix sparse warning in the new nft_inner expression, reported
by Jakub Kicinski.
2) Incorrect vlan header check in nft_inner, from Peng Wu.
3) Two patches to pass reset boolean to expression dump operation,
in preparation for allowing to reset stateful expressions in rules.
This adds a new NFT_MSG_GETRULE_RESET command. From Phil Sutter.
4) Inconsistent indentation in nft_fib, from Jiapeng Chong.
5) Speed up siphash calculation in conntrack, from Florian Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: conntrack: use siphash_4u64
netfilter: rpfilter/fib: clean up some inconsistent indenting
netfilter: nf_tables: Introduce NFT_MSG_GETRULE_RESET
netfilter: nf_tables: Extend nft_expr_ops::dump callback parameters
netfilter: nft_inner: fix return value check in nft_inner_parse_l2l3()
netfilter: nft_payload: use __be16 to store gre version
====================
Link: https://lore.kernel.org/r/20221115095922.139954-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For queueing packets in XDP we want to add a new redirect map type with
support for 64-bit indexes. To prepare fore this, expand the width of the
'key' argument to the bpf_redirect_map() helper. Since BPF registers are
always 64-bit, this should be safe to do after the fact.
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221108140601.149971-3-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRSrcquik9wuZrNjXJrQl33mxwedgUCY2U2ZQAKCRBrQl33mxwe
dhaLAP4t7vpFHF6UwPKhUsnBjykDE9cqX5tFY5hV/P4B3dhm3wEAmJFEi38QFQlb
gkISSnYIx5DZGP2iBXT96ftGlPI7qgg=
=wwvs
-----END PGP SIGNATURE-----
Merge tag 'br-v6.2e' of git://linuxtv.org/hverkuil/media_tree into media_stage
Tag branch
* tag 'br-v6.2e' of git://linuxtv.org/hverkuil/media_tree: (29 commits)
media: davinci/vpbe: Fix a typo ("defualt_mode")
media: sun6i-csi: Remove unnecessary print function dev_err()
media: Documentation: Drop deprecated bytesused == 0
media: platform: exynos4-is: fix return value check in fimc_md_probe()
media: dvb-core: remove variable n, turn for-loop to while-loop
media: vivid: fix compose size exceed boundary
media: rkisp1: make const arrays ae_wnd_num and hist_wnd_num static
media: dvb-core: Fix UAF due to refcount races at releasing
media: rkvdec: Add required padding
media: aspeed: Extend debug message
media: aspeed: Support aspeed mode to reduce compressed data
media: Documentation: aspeed-video: Add user documentation for the aspeed-video driver
media: v4l2-ctrls: Reserve controls for ASPEED
media: v4l: Add definition for the Aspeed JPEG format
staging: media: tegra-video: fix device_node use after free
staging: media: tegra-video: fix chan->mipi value on error
media: cedrus: initialize controls a bit later
media: cedrus: prefer untiled capture format
media: cedrus: Remove cedrus_codec enum
media: cedrus: set codec ops immediately
...
Analogous to NFT_MSG_GETOBJ_RESET, but for rules: Reset stateful
expressions like counters or quotas. The latter two are the only
consumers, adjust their 'dump' callbacks to respect the parameter
introduced earlier.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add the support on the map side to parse, recognize, verify, and build
metadata table for a new special field of the type struct bpf_list_head.
To parameterize the bpf_list_head for a certain value type and the
list_node member it will accept in that value type, we use BTF
declaration tags.
The definition of bpf_list_head in a map value will be done as follows:
struct foo {
struct bpf_list_node node;
int data;
};
struct map_value {
struct bpf_list_head head __contains(foo, node);
};
Then, the bpf_list_head only allows adding to the list 'head' using the
bpf_list_node 'node' for the type struct foo.
The 'contains' annotation is a BTF declaration tag composed of four
parts, "contains:name:node" where the name is then used to look up the
type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
the lookup. The node defines name of the member in this type that has
the type struct bpf_list_node, which is actually used for linking into
the linked list. For now, 'kind' part is hardcoded as struct.
This allows building intrusive linked lists in BPF, using container_of
to obtain pointer to entry, while being completely type safe from the
perspective of the verifier. The verifier knows exactly the type of the
nodes, and knows that list helpers return that type at some fixed offset
where the bpf_list_node member used for this list exists. The verifier
also uses this information to disallow adding types that are not
accepted by a certain list.
For now, no elements can be added to such lists. Support for that is
coming in future patches, hence draining and freeing items is done with
a TODO that will be resolved in a future patch.
Note that the bpf_list_head_free function moves the list out to a local
variable under the lock and releases it, doing the actual draining of
the list items outside the lock. While this helps with not holding the
lock for too long pessimizing other concurrent list operations, it is
also necessary for deadlock prevention: unless every function called in
the critical section would be notrace, a fentry/fexit program could
attach and call bpf_map_update_elem again on the map, leading to the
same lock being acquired if the key matches and lead to a deadlock.
While this requires some special effort on part of the BPF programmer to
trigger and is highly unlikely to occur in practice, it is always better
if we can avoid such a condition.
While notrace would prevent this, doing the draining outside the lock
has advantages of its own, hence it is used to also fix the deadlock
related problem.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add an option to get migration data size by introducing a new migration
feature named VFIO_DEVICE_FEATURE_MIG_DATA_SIZE.
Upon VFIO_DEVICE_FEATURE_GET the estimated data length that will be
required to complete STOP_COPY is returned.
This option may better enable user space to consider before moving to
STOP_COPY whether it can meet the downtime SLA based on the returned
data.
The patch also includes the implementation for mlx5 and hisi for this
new option to make it feature complete for the existing drivers in this
area.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20221106174630.25909-2-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The PCIE Data Object Exchange (DOE) mailbox is a protocol run over
configuration cycles. It assumes one initiator at a time. While the
kernel has control of the mailbox user space writes could interfere with
the kernel access.
Mark DOE mailbox config space exclusive when iterated by the CXL driver.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The crc_val in the completion record should be 64 bits and not 32 bits.
Fixes: 4ac823e9cd ("dmaengine: idxd: fix delta_rec and crc size field for completion record")
Reported-by: Nirav N Shah <nirav.n.shah@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20221111012715.2031481-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEET63h6RnJhTJHuKTjXOwUVIRcSScFAmNu2EkACgkQXOwUVIRc
SSebKhAA0ffmp5jJgEJpQYNABGLYIJcwKkBrGClDbMJLtwCjevGZJajT9fpbCLb1
eK6EIhdfR0NTO+0KtUVkZ8WMa81OmLEJYdTNtJfNE23ENMpssiAWhlhDF8AoXeKv
Bo3j719gn3Cw9PWXQoircH3wpj+5RMDnjxy4iYlA5yNrvzC7XVmssMF+WALvQnuK
CGrfR57hxdgmphmasRqeCzEoriwihwPsG3k6eQN8rf7ZytLhs90tMVgT9L3Cd2u9
DafA0Xl8mZdz2mHhThcJhQVq4MUymZj44ufuHDiOs1j6nhUlWToyQuvegPOqxKti
uLGtZul0ls+3UP0Lbrv1oEGU/MWMxyDz4IBc0EVs0k3ItQbmSKs6r9WuPFGd96Sb
GHk68qFVySeLGN0LfKe3rCHJ9ZoIOPYJg9qT8Rd5bOhetgGwSsxZTxUI39BxkFup
CEqwIDnts1TMU37GDjj+vssKW91k4jEzMZVtRfsL3J36aJs28k/Ez4AqLXg6WU6u
ADqFaejVPcXbN9rX90onIYxxiL28gZSeT+i8qOPELZtqTQmNWz+tC/ySVuWnD8Mn
Nbs7PZ1IWiNZpsKS8pZnpd6j4mlBeJnwXkPKiFy+xHGuwRSRdYl6G9e5CtlRely/
rwQ8DtaOpRYMrGhnmBEdAOCa9t/iqzrzHzjoigjJ7iAST4ToJ5s=
=Y+/e
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:
====================
bpf-next 2022-11-11
We've added 49 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 3592 insertions(+), 1371 deletions(-).
The main changes are:
1) Veristat tool improvements to support custom filtering, sorting, and replay
of results, from Andrii Nakryiko.
2) BPF verifier precision tracking fixes and improvements,
from Andrii Nakryiko.
3) Lots of new BPF documentation for various BPF maps, from Dave Tucker,
Donald Hunter, Maryam Tahhan, Bagas Sanjaya.
4) BTF dedup improvements and libbpf's hashmap interface clean ups, from
Eduard Zingerman.
5) Fix veth driver panic if XDP program is attached before veth_open, from
John Fastabend.
6) BPF verifier clean ups and fixes in preparation for follow up features,
from Kumar Kartikeya Dwivedi.
7) Add access to hwtstamp field from BPF sockops programs,
from Martin KaFai Lau.
8) Various fixes for BPF selftests and samples, from Artem Savkov,
Domenico Cerasuolo, Kang Minchul, Rong Tao, Yang Jihong.
9) Fix redirection to tunneling device logic, preventing skb->len == 0, from
Stanislav Fomichev.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
selftests/bpf: fix veristat's singular file-or-prog filter
selftests/bpf: Test skops->skb_hwtstamp
selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test
bpf: Add hwtstamp field for the sockops prog
selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch
bpf, docs: Document BPF_MAP_TYPE_ARRAY
docs/bpf: Document BPF map types QUEUE and STACK
docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS
docs/bpf: Document BPF_MAP_TYPE_CPUMAP map
docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map
libbpf: Hashmap.h update to fix build issues using LLVM14
bpf: veth driver panics when xdp prog attached before veth_open
selftests: Fix test group SKIPPED result
selftests/bpf: Tests for btf_dedup_resolve_fwds
libbpf: Resolve unambigous forward declarations
libbpf: Hashmap interface update to allow both long and void* keys/values
samples/bpf: Fix sockex3 error: Missing BPF prog type
selftests/bpf: Fix u32 variable compared with less than zero
Documentation: bpf: Escape underscore in BPF type name prefix
selftests/bpf: Use consistent build-id type for liburandom_read.so
...
====================
Link: https://lore.kernel.org/r/20221111233733.1088228-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmNuf4IQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgppvMD/9K2kFcAiD85QmRoIgwlIRM604KZ6aGXqk3
BjTavxfB+3DJcb82FHywBF5DC0sUtrBOTn7+DJpf13lb4L2DZY1lfLkRL7SKHSs5
o1z+1uLcBtZtGCq5M+yhpxbAzJ2kNWdRe+FutSA6wiz03ATXTwo2qE1MLaw1jxap
DowK08DUtLNaFNoEGdpW8iub9ql1OVWWZdOaxZmVJkdPWeWMD6Zaqwi/MeyNv0aY
KbVpYHa2AGxGY6+2krLpL09kqYlW++UvFsofM6RJrHTlLyBdYKvM2Z+Tv9I6w81s
ZerVl5srC2pVj1K0isO7A25GTVIVzI9im/GCzStNTasFtlzW85CwLEcDS8T679bY
I0P+Wl3ZoLJztChrcSufiAaOfJIichML7H3h/iEkSE51+9cBr42fqJO64dc+s/Bi
OGmaFowYgJgOClzpAJ2upd2aNu4sLiR2DUb3qdHDpcio9bfpIme1Do1yB94kRR//
yIFrs47PW+JumE90iKJPnDRHWrl3dVUK27MqkAWSBuvOkBjKxLBSVHIARs1lGWy1
25y4atEMaEYnvjC3ATwM0WX0LY+5jCVqOXyfMPAMmEZ7WDbER7FfGxnnmw/pwka7
D4CiSWn5H2Jp9Lq7HiblgYucXXNCPYgSx9JiXnY/KBpARaKUIXuTOq2PuJ/FW4UG
dsJap0W2rw==
=s8Z1
-----END PGP SIGNATURE-----
Merge tag 'io_uring-6.1-2022-11-11' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
"Nothing major, just a few minor tweaks:
- Tweak for the TCP zero-copy io_uring self test (Pavel)
- Rather than use our internal cached value of number of CQ events
available, use what the user can see (Dylan)
- Fix a typo in a comment, added in this release (me)
- Don't allow wrapping while adding provided buffers (me)
- Fix a double poll race, and add a lockdep assertion for it too
(Pavel)"
* tag 'io_uring-6.1-2022-11-11' of git://git.kernel.dk/linux:
io_uring/poll: lockdep annote io_poll_req_insert_locked
io_uring/poll: fix double poll req->flags races
io_uring: check for rollover of buffer ID when providing buffers
io_uring: calculate CQEs from the user visible value
io_uring: fix typo in io_uring.h comment
selftests/net: don't tests batched TCP io_uring zc
The bpf-tc prog has already been able to access the
skb_hwtstamps(skb)->hwtstamp. This patch extends the same hwtstamp
access to the sockops prog.
In sockops, the skb is also available to the bpf prog during
the BPF_SOCK_OPS_PARSE_HDR_OPT_CB event. There is a use case
that the hwtstamp will be useful to the sockops prog to better
measure the one-way-delay when the sender has put the tx
timestamp in the tcp header option.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221107230420.4192307-2-martin.lau@linux.dev
Driver fixes for:
- Pile of at_hdmac driver rework which fixes many long standing issues
for this driver.
- couple of stm32 driver fixes for clearing structure and race fix
- idxd fixes for RO device state and batch size
- ti driver mem leak fix
- apple fix for grabbing channels in xlate
- resource leak fix in mv xor
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmNt7eYACgkQfBQHDyUj
g0dx3A/+NiVgApGZO1nvRCOFeBXSggZIzPYSDqS/ZqaYai63MoIB/em6CR36cYNm
xztCv/5MfpH8xcztoDBA4TwxcKzHo6/QxvHli3T+gMa0Imz/JSr8wxz2wfMnJStL
dcdCTQTRtUhdh8rlxAcGD4gWdhoZTrVoNSGQK3MgQC7GrcCET5K6hZB8JWOg9nd+
gxpZu5a4DirWAWW/MFdtfhfHjff4SDhLpRi+kWknA8UHQb9jlsidqvVjRQNWAy4u
QGhM4dxXsO+vSFxHcRZoxQjMQHoJ9v65Jh6cvADDpwa3/BbKFnljpCDA775hK2aL
SLqdUXtV9KeGpMVs6D8tradc2MqFgs+UfbNApZM+NY7UPpV0fQZh4wF8SQcS3lAq
1SIz5myxp1XLuac+TFR/nSlisdtEB/eVFcJKibJpc+RrEBZKIG0XzcBjTI1TrFz5
7p//84SE2EKd7MGKVUGZ1gWRG46VYGTsvQQpZ1YDaXdd9JKBl6jzgAPEoLIPEUC/
Pg60TIDXzM0pNtQUDIp6g4LPf6Q88poL8WxejsN2OS0Kxy+isYsY6S8GZIQB39Tw
9Q0JuN0AkQv+wSx5GGbVEaosZhoR/Vk8DJiq22H9joTUSVjrMGFlT+/Kw3X+H6gF
Pi3GcqIJeUbN+UPOMekR+UXS1U36/cB1Dh/ksLyEt3vAbm3U2NQ=
=UR0f
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-fix-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"Misc minor driver fixes and a big pile of at_hdmac driver fixes. More
work on this driver is done and sitting in next:
- Pile of at_hdmac driver rework which fixes many long standing
issues for this driver.
- couple of stm32 driver fixes for clearing structure and race fix
- idxd fixes for RO device state and batch size
- ti driver mem leak fix
- apple fix for grabbing channels in xlate
- resource leak fix in mv xor"
* tag 'dmaengine-fix-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (24 commits)
dmaengine: at_hdmac: Check return code of dma_async_device_register
dmaengine: at_hdmac: Fix impossible condition
dmaengine: at_hdmac: Don't allow CPU to reorder channel enable
dmaengine: at_hdmac: Fix completion of unissued descriptor in case of errors
dmaengine: at_hdmac: Fix descriptor handling when issuing it to hardware
dmaengine: at_hdmac: Fix concurrency over the active list
dmaengine: at_hdmac: Free the memset buf without holding the chan lock
dmaengine: at_hdmac: Fix concurrency over descriptor
dmaengine: at_hdmac: Fix concurrency problems by removing atc_complete_all()
dmaengine: at_hdmac: Protect atchan->status with the channel lock
dmaengine: at_hdmac: Do not call the complete callback on device_terminate_all
dmaengine: at_hdmac: Fix premature completion of desc in issue_pending
dmaengine: at_hdmac: Start transfer for cyclic channels in issue_pending
dmaengine: at_hdmac: Don't start transactions at tx_submit level
dmaengine: at_hdmac: Fix at_lli struct definition
dmaengine: stm32-dma: fix potential race between pause and resume
dmaengine: ti: k3-udma-glue: fix memory leak when register device fail
dmaengine: mv_xor_v2: Fix a resource leak in mv_xor_v2_remove()
dmaengine: apple-admac: Fix grabbing of channels in of_xlate
dmaengine: idxd: fix RO device state error after been disabled/reset
...
Current release - new code bugs:
- can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet
Previous releases - regressions:
- bpf, sockmap: fix the sk->sk_forward_alloc warning
- wifi: mac80211: fix general-protection-fault in
ieee80211_subif_start_xmit()
- can: af_can: fix NULL pointer dereference in can_rx_register()
- can: dev: fix skb drop check, avoid o-o-b access
- nfnetlink: fix potential dead lock in nfnetlink_rcv_msg()
Previous releases - always broken:
- bpf: fix wrong reg type conversion in release_reference()
- gso: fix panic on frag_list with mixed head alloc types
- wifi: brcmfmac: fix buffer overflow in brcmf_fweh_event_worker()
- wifi: mac80211: set TWT Information Frame Disabled bit as 1
- eth: macsec offload related fixes, make sure to clear the keys
from memory
- tun: fix memory leaks in the use of napi_get_frags
- tun: call napi_schedule_prep() to ensure we own a napi
- tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent
- ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg
to network
- tipc: fix a msg->req tlv length check
- sctp: clear out_curr if all frag chunks of current msg are pruned,
avoid list corruption
- mctp: fix an error handling path in mctp_init(), avoid leaks
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmNtnlEACgkQMUZtbf5S
IrvSfg//axNePPwFiAdbYUmSNmnnv2Zpyz1l9a2/WvKKMeyAH3d4zuQGyTz7VgoJ
at4k1fr14vm+3qBhlL0UFdd+h/wBewwuuWLiogIfhgqDO7KavZsbTJWQ59DSHH08
ujihvt7dF9ByVd3hOpUDjrYGd2rPghqXk8l/2gpPp/KIrbj1jSW0DdF7Y48/0RRw
PYzNYZ9tqICw1crBT52ZilNEebGaUuWpPLzV2owlhJpzqyRLcgd9GWN9DkKieiiw
wF0Wi7A8b/+cR/Wo93RAXtvEayN9vp/t6iyiI1opv3Yg6bhAMlzDUX/v79ccnAM6
wJ3b8bKyLgph5ZTNmbL8GwC2pwl/20hOgCVLb/Haykqrk4oO2+xD39fjKniFP/71
IBYuLCethi0zmiSyR8yO4iyrfJCnkJffoxtcG8O5x+FuCfMI1xQWx44bSc34KlqT
vDw/VmnIfXH9K3F+QdWtlZfLiM0F6vd7RNGIxX0cC2wQCwaubCo0LOs5vl2+jpR8
Xclo+OquQtX5XRqGGQDtA7kCM9jfuc/DWla1v10wy7ZagiKkdfrV7Zu7r431Dtwn
BWeKZAA38o9WNRb4FD5GGUN0dK5R5V25LmbpvYuerq5Ub3pGJgHMsdA15LqsqTnW
MGIokGFhu7ToAQEnaRkF96jh3c3yoMU/sWXsqh7x/G6Tir7JGUw=
=WPta
-----END PGP SIGNATURE-----
Merge tag 'net-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wifi, can and bpf.
Current release - new code bugs:
- can: af_can: can_exit(): add missing dev_remove_pack() of
canxl_packet
Previous releases - regressions:
- bpf, sockmap: fix the sk->sk_forward_alloc warning
- wifi: mac80211: fix general-protection-fault in
ieee80211_subif_start_xmit()
- can: af_can: fix NULL pointer dereference in can_rx_register()
- can: dev: fix skb drop check, avoid o-o-b access
- nfnetlink: fix potential dead lock in nfnetlink_rcv_msg()
Previous releases - always broken:
- bpf: fix wrong reg type conversion in release_reference()
- gso: fix panic on frag_list with mixed head alloc types
- wifi: brcmfmac: fix buffer overflow in brcmf_fweh_event_worker()
- wifi: mac80211: set TWT Information Frame Disabled bit as 1
- eth: macsec offload related fixes, make sure to clear the keys from
memory
- tun: fix memory leaks in the use of napi_get_frags
- tun: call napi_schedule_prep() to ensure we own a napi
- tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent
- ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to
network
- tipc: fix a msg->req tlv length check
- sctp: clear out_curr if all frag chunks of current msg are pruned,
avoid list corruption
- mctp: fix an error handling path in mctp_init(), avoid leaks"
* tag 'net-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
eth: sp7021: drop free_netdev() from spl2sw_init_netdev()
MAINTAINERS: Move Vivien to CREDITS
net: macvlan: fix memory leaks of macvlan_common_newlink
ethernet: tundra: free irq when alloc ring failed in tsi108_open()
net: mv643xx_eth: disable napi when init rxq or txq failed in mv643xx_eth_open()
ethernet: s2io: disable napi when start nic failed in s2io_card_up()
net: atlantic: macsec: clear encryption keys from the stack
net: phy: mscc: macsec: clear encryption keys when freeing a flow
stmmac: dwmac-loongson: fix missing of_node_put() while module exiting
stmmac: dwmac-loongson: fix missing pci_disable_device() in loongson_dwmac_probe()
stmmac: dwmac-loongson: fix missing pci_disable_msi() while module exiting
cxgb4vf: shut down the adapter when t4vf_update_port_info() failed in cxgb4vf_open()
mctp: Fix an error handling path in mctp_init()
stmmac: intel: Update PCH PTP clock rate from 200MHz to 204.8MHz
net: cxgb3_main: disable napi when bind qsets failed in cxgb_up()
net: cpsw: disable napi in cpsw_ndo_open()
iavf: Fix VF driver counting VLAN 0 filters
ice: Fix spurious interrupt during removal of trusted VF
net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions
net/mlx5e: E-Switch, Fix comparing termination table instance
...
ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is
enabled. It's conflicting with that ring-based dirty page tracking always
requires a running VCPU context.
Introduce a new flavor of dirty ring that requires the use of both VCPU
dirty rings and a dirty bitmap. The expectation is that for non-VCPU
sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to
the dirty bitmap. Userspace should scan the dirty bitmap before migrating
the VM to the target.
Use an additional capability to advertise this behavior. The newly added
capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before
KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added
capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.
Suggested-by: Marc Zyngier <maz@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Co-developed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-4-gshan@redhat.com
Add the mask KVM_MSR_EXIT_REASON_VALID_MASK for the MSR exit reason
flags. This simplifies checks that validate these flags, and makes it
easier to introduce new flags in the future.
No functional change intended.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220921151525.904162-3-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit bc27fb68aa ("include/uapi/linux/byteorder, swab: force inlining
of some byteswap operations") added __always_inline to swab functions
and commit 283d757378 ("uapi/linux/stddef.h: Provide __always_inline to
userspace headers") added a definition of __always_inline for use in
exported headers when the kernel's compiler.h is not available.
However, since swab.h does not include stddef.h, if the header soup does
not indirectly include it, the definition of __always_inline is missing,
resulting in a compilation failure, which was observed compiling the
perf tool using exported headers containing this commit:
In file included from /usr/include/linux/byteorder/little_endian.h:12:0,
from /usr/include/asm/byteorder.h:14,
from tools/include/uapi/linux/perf_event.h:20,
from perf.h:8,
from builtin-bench.c:18:
/usr/include/linux/swab.h:160:8: error: unknown type name `__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
Fix this by replacing the inclusion of linux/compiler.h with
linux/stddef.h to ensure that we pick up that definition if required,
without relying on it's indirect inclusion. compiler.h is then included
indirectly, via stddef.h.
Fixes: 283d757378 ("uapi/linux/stddef.h: Provide __always_inline to userspace headers")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Petr Vaněk <arkamar@atlas.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmNpimwUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMptRAAmDe4f1MokIzJUBQOrFOR8Zw2zWDc
snW1dSRTudP4fjV2GX4XVb4X0YFtmnsBXk1GZJpWZWTMshdSxgvS/rnzOT6svqrk
mDeAWVhtiPdyB5Xj3bFNnKz7vzvTmgCcsHJ0NqTwEk0nh2vS+NoIwsJRvNgEVmJb
8HN1uIFYzHF83Ij5+ejBaF/8Xvkc5kKjrhvs68R5YmOeH+9EuVi88S/FckF1HkWP
7WBmsD1bgDoU4UFIiri3w5FPrWQNqLcR7ZQISizCU3C8B9U84tCe5ifxLmNr3RmX
9UrT0THiZHd1iV+uDaaIfiHS+fpvpZn5CbSvPiPXYkybpoeUMNgz+pxTq12nR0eX
xB6CGPUgT51R6qI0gQCCSazhXz4wUy0Jhkyo4hwruW/7bo11chft6Oktzt0Ij/Pd
zyYT5ad+J3Ufub/QSIyo8yvq/oawlmxibMTuDo1mwCkMDDNEsqLJbYJESOBw3/0P
XtbvZC5oChVur3ozBepeKV/B1trvPtmih/RaX3ARVDm6LTobCykFuUwu5bwPhyDz
/iTpZbbez5jgVI1kJd3TCyCVYXwtalNbAH70XPvlygXKwQZpOM9LUrkggX8BSooT
Rq+c2bUT0HZ9Yade4Aw55HlKOTbrzRKlaMWFtlBEz32EVq8tknDxUHnzVPv3zMA/
g8pjmS+k+xD/Ywk=
=Ycwp
-----END PGP SIGNATURE-----
Merge tag 'audit-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
"A small audit patch to fix an instance of undefined behavior in a
shift operator caused when shifting a signed value too far, the same
case as the lsm patch merged previously.
While the fix is trivial and I can't imagine it causing a problem in a
backport, I'm not explicitly marking it for stable on the off chance
that there is some system out there which is relying on some wonky
unexpected behavior which this patch could break; *if* it does break,
IMO it's better that to happen in a minor or -rcX release and not in a
stable backport"
* tag 'audit-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: fix undefined behavior in bit shift for AUDIT_BIT
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmNpiisUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMFzg//WEQgUIwRkmOj2vQdXdvhSBj9T6Ec
sEkh3nBhT0D++21mZeDQiDAUQC3jE3xr/vFwrkECR9xd9ap2Y/RxT5tCucBGOua+
trUHyqMbvH5Ec0lomUfVsDoez4GcrCtZ++p7TP8YXxgvjAbDSFtbDqVyfbWpV+E/
UV0nByDvhX+HQqGVJRDbK4d8JDQFccki/+SlaGnWtaKYA8CjHKkpTrhhkCVm3Fow
iA/qg/sPLX1/5g7yHrhWaFy//MkFM1C7cmLq88nlR46OVVjHGlFqIGXbFclNZNnG
cfLzvcPGuDZ9Ih7Pun3wESEDWxMlSXArNZzC12xIw3STTHiHP8fUEnw9bfpKzWUs
K+3nu6gN+Mh7xRL7dw0ISqx7tQM/SJ7lF91zD7pIEvuXLMKXIfM3D7KyTkwmBVuz
A6nZphAEmmY5R+ez88ry7c0FtNEEc1dST8rVjD8XStvFXxRNqIWOZ3Z2QjhA9SI9
Y/v8H2/VW7hsgGnyozVqmFJmY+x5ij2lge5TEnhfRvCi1xf25Rdii50+lYCdvnM1
v/IY2Xxeq+gvyew8XB6B13Gv6TUKKIgL7sOwMdwEB8Q1Lk2xegzZkl7thsGoVLZn
zZny4+8WxFoGhu51V3EtuTCVWHJ884fPSM1PEntWo5+oe/6cJZ7/rcYRP6PtYfha
4usMsFgUJnzntmY=
=1ZF/
-----END PGP SIGNATURE-----
Merge tag 'lsm-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm fix from Paul Moore:
"A small capability patch to fix an instance of undefined behavior in a
shift operator caused when shifting a signed value too far.
While the fix is trivial and I can't imagine it causing a problem in a
backport, I'm not explicitly marking it for stable on the off chance
that there is some system out there which is relying on some wonky
unexpected behavior which this patch could break; *if* it does break,
IMO it's better that to happen in a minor or -rcX release and not in a
stable backport"
* tag 'lsm-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
capabilities: fix undefined behavior in bit shift for CAP_TO_MASK
This patch is to add helper support in act_ct for OVS actions=ct(alg=xxx)
offloading, which is corresponding to Commit cae3a26275 ("openvswitch:
Allow attaching helpers to ct action") in OVS kernel part.
The difference is when adding TC actions family and proto cannot be got
from the filter/match, other than helper name in tb[TCA_CT_HELPER_NAME],
we also need to send the family in tb[TCA_CT_HELPER_FAMILY] and the
proto in tb[TCA_CT_HELPER_PROTO] to kernel.
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The previous attempt to augment carrier_down (see Link)
was not met with much enthusiasm so let's do the simple
thing of exposing what some devices already maintain.
Add a common ethtool statistic for link going down.
Currently users have to maintain per-driver mapping
to extract the right stat from the vendor-specific ethtool -S
stats. carrier_down does not fit the bill because it counts
a lot of software related false positives.
Add the statistic to the extended link state API to steer
vendors towards implementing all of it.
Implement for bnxt and all Linux-controlled PHYs. mlx5 and (possibly)
enic also have a counter for this but I leave the implementation
to their maintainers.
Link: https://lore.kernel.org/r/20220520004500.2250674-1-kuba@kernel.org
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221104190125.684910-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* git://linuxtv.org/sailus/media_tree: (47 commits)
media: i2c: ov4689: code cleanup
media: ov9650: Drop platform data code path
media: ov7670: Drop unused include
media: ov2640: Drop legacy includes
media: tc358746: add Toshiba TC358746 Parallel to CSI-2 bridge driver
media: dt-bindings: add bindings for Toshiba TC358746
phy: dphy: add support to calculate the timing based on hs_clk_rate
phy: dphy: refactor get_default_config
v4l: subdev: Warn if disabling streaming failed, return success
dw9768: Enable low-power probe on ACPI
media: i2c: imx290: Replace GAIN control with ANALOGUE_GAIN
media: i2c: imx290: Add crop selection targets support
media: i2c: imx290: Factor out format retrieval to separate function
media: i2c: imx290: Move registers with fixed value to init array
media: i2c: imx290: Create controls for fwnode properties
media: i2c: imx290: Implement HBLANK and VBLANK controls
media: i2c: imx290: Split control initialization to separate function
media: i2c: imx290: Fix max gain value
media: i2c: imx290: Add exposure time control
media: i2c: imx290: Define more register macros
...
Just a basic s/thig/this swap, fixing up a typo introduced by a commit
added in the 6.1 release.
Fixes: 9cda70f622 ("io_uring: introduce fixed buffer support for io_uring_cmd")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. The UBSAN warning calltrace like below:
UBSAN: shift-out-of-bounds in security/commoncap.c:1252:2
left shift of 1 by 31 places cannot be represented in type 'int'
Call Trace:
<TASK>
dump_stack_lvl+0x7d/0xa5
dump_stack+0x15/0x1b
ubsan_epilogue+0xe/0x4e
__ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
cap_task_prctl+0x561/0x6f0
security_task_prctl+0x5a/0xb0
__x64_sys_prctl+0x61/0x8f0
do_syscall_64+0x58/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>
Fixes: e338d263a7 ("Add 64-bit capability support to the kernel")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
aspeed supports differential jpeg format which only compress the parts
which are changed. In this way, it reduces both the amount of data to be
transferred by network and those to be decoded on the client side.
2 new ctrls are added:
* Aspeed HQ Mode: to control aspeed's high quality(2-pass) compression mode
This only works with yuv444 subsampling.
* Aspeed HQ Quality: to control the quality of aspeed's HQ mode
only useful if Aspeed HQ mode is enabled
Aspeed JPEG Format requires an additional buffer, called bcd, to store
the information about which macro block in the new frame is different
from the previous one.
To have bcd correctly working, we need to swap the buffers for src0/1 to
make src1 refer to previous frame and src0 to the coming new frame.
Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil: fix logging dma_addr_t, use %pad for that]
Reserve controls for ASPEED video family. Aspeed video engine contains a
few features which improve video quality, reduce amount of compressed
data, and etc. Hence, 16 controls are reserved for these aspeed
proprietary features.
Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
This introduces support for the Aspeed JPEG format, where the new frame
can refer to previous frame to reduce the amount of compressed data.
The concept is similar to I/P frame of video compression. It will
compare the new frame with previous one to decide which macroblock's
data is changed, and only the changed macroblocks will be compressed.
This Aspeed JPEG format is used by the video engine on Aspeed platforms,
which is generally adapted for remote KVM.
Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Expose devlink port handle related to netdev over rtnetlink. Introduce a
new nested IFLA attribute to carry the info. Call into devlink code to
fill-up the nest with existing devlink attributes that are used over
devlink netlink.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hosts that support 802.1X authentication are able to authenticate
themselves by exchanging EAPOL frames with an authenticator (Ethernet
bridge, in this case) and an authentication server. Access to the
network is only granted by the authenticator to successfully
authenticated hosts.
The above is implemented in the bridge using the "locked" bridge port
option. When enabled, link-local frames (e.g., EAPOL) can be locally
received by the bridge, but all other frames are dropped unless the host
is authenticated. That is, unless the user space control plane installed
an FDB entry according to which the source address of the frame is
located behind the locked ingress port. The entry can be dynamic, in
which case learning needs to be enabled so that the entry will be
refreshed by incoming traffic.
There are deployments in which not all the devices connected to the
authenticator (the bridge) support 802.1X. Such devices can include
printers and cameras. One option to support such deployments is to
unlock the bridge ports connecting these devices, but a slightly more
secure option is to use MAB. When MAB is enabled, the MAC address of the
connected device is used as the user name and password for the
authentication.
For MAB to work, the user space control plane needs to be notified about
MAC addresses that are trying to gain access so that they will be
compared against an allow list. This can be implemented via the regular
learning process with the sole difference that learned FDB entries are
installed with a new "locked" flag indicating that the entry cannot be
used to authenticate the device. The flag cannot be set by user space,
but user space can clear the flag by replacing the entry, thereby
authenticating the device.
Locked FDB entries implement the following semantics with regards to
roaming, aging and forwarding:
1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports,
in which case the "locked" flag is cleared. FDB entries cannot roam
to locked ports regardless of MAB being enabled or not. Therefore,
locked FDB entries are only created if an FDB entry with the given {MAC,
VID} does not already exist. This behavior prevents unauthenticated
devices from disrupting traffic destined to already authenticated
devices.
2. Aging: Locked FDB entries age and refresh by incoming traffic like
regular entries.
3. Forwarding: Locked FDB entries forward traffic like regular entries.
If user space detects an unauthorized MAC behind a locked port and
wishes to prevent traffic with this MAC DA from reaching the host, it
can do so using tc or a different mechanism.
Enable the above behavior using a new bridge port option called "mab".
It can only be enabled on a bridge port that is both locked and has
learning enabled. Locked FDB entries are flushed from the port once MAB
is disabled. A new option is added because there are pure 802.1X
deployments that are not interested in notifications about locked FDB
entries.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY2RS7QAKCRDbK58LschI
g6RVAQC1FdSXMrhn369NGCG1Vox1QYn2/5P32LSIV1BKqiQsywEAsxgYNrdCPTua
ie91Q5IJGT9pFl1UR50UrgL11DI5BgI=
=sdhO
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
bpf 2022-11-04
We've added 8 non-merge commits during the last 3 day(s) which contain
a total of 10 files changed, 113 insertions(+), 16 deletions(-).
The main changes are:
1) Fix memory leak upon allocation failure in BPF verifier's stack state
tracking, from Kees Cook.
2) Fix address leakage when BPF progs release reference to an object,
from Youlin Li.
3) Fix BPF CI breakage from buggy in.h uapi header dependency,
from Andrii Nakryiko.
4) Fix bpftool pin sub-command's argument parsing, from Pu Lehui.
5) Fix BPF sockmap lockdep warning by cancelling psock work outside
of socket lock, from Cong Wang.
6) Follow-up for BPF sockmap to fix sk_forward_alloc accounting,
from Wang Yufen.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add verifier test for release_reference()
bpf: Fix wrong reg type conversion in release_reference()
bpf, sock_map: Move cancel_work_sync() out of sock lock
tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CI
net/ipv4: Fix linux/in.h header dependencies
bpftool: Fix NULL pointer dereference when pin {PROG, MAP, LINK} without FILE
bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues
bpf, verifier: Fix memory leak in array reallocation for stack state
====================
Link: https://lore.kernel.org/r/20221104000445.30761-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add new apptrust extension attributes to the 8021Qaz APP managed object.
Two new attributes, DCB_ATTR_DCB_APP_TRUST_TABLE and
DCB_ATTR_DCB_APP_TRUST, has been added. Trusted selectors are passed in
the nested attribute DCB_ATTR_DCB_APP_TRUST, in order of precedence.
The new attributes are meant to allow drivers, whose hw supports the
notion of trust, to be able to set whether a particular app selector is
trusted - and in which order.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add new PCP selector for the 8021Qaz APP managed object.
As the PCP selector is not part of the 8021Qaz standard, a new non-std
extension attribute DCB_ATTR_DCB_APP has been introduced. Also two
helper functions to translate between selector and app attribute type
has been added. The new selector has been given a value of 255, to
minimize the risk of future overlap of std- and non-std attributes.
The new DCB_ATTR_DCB_APP is sent alongside the ieee std attribute in the
app table. This means that the dcb_app struct can now both contain std-
and non-std app attributes. Currently there is no overlap between the
selector values of the two attributes.
The purpose of adding the PCP selector, is to be able to offload
PCP-based queue classification to the 8021Q Priority Code Point table,
see 6.9.3 of IEEE Std 802.1Q-2018.
PCP and DEI is encoded in the protocol field as 8*dei+pcp, so that a
mapping of PCP 2 and DEI 1 to priority 3 is encoded as {255, 10, 3}.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
__DECLARE_FLEX_ARRAY is defined in include/uapi/linux/stddef.h but
doesn't seem to be explicitly included from include/uapi/linux/in.h,
which breaks BPF selftests builds (once we sync linux/stddef.h into
tools/include directory in the next patch). Fix this by explicitly
including linux/stddef.h.
Given this affects BPF CI and bpf tree, targeting this for bpf tree.
Fixes: 5854a09b49 ("net/ipv4: Use __DECLARE_FLEX_ARRAY() helper")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20221102182517.2675301-1-andrii@kernel.org
Convert struct serial_rs485 comments to kernel doc format and include
it into documentation.
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20221019093343.9546-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY2GuKgAKCRDbK58LschI
gy32AP9PI0e/bUGDExKJ8g97PeeEtnpj4TTI6g+XKILtYnyXlgD/Rk4j2D/f3IBF
Ha9TmqYvAUim+U/g50vUrNuoNLNJ5w8=
=OKC1
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2022-11-02
We've added 70 non-merge commits during the last 14 day(s) which contain
a total of 96 files changed, 3203 insertions(+), 640 deletions(-).
The main changes are:
1) Make cgroup local storage available to non-cgroup attached BPF programs
such as tc BPF ones, from Yonghong Song.
2) Avoid unnecessary deadlock detection and failures wrt BPF task storage
helpers, from Martin KaFai Lau.
3) Add LLVM disassembler as default library for dumping JITed code
in bpftool, from Quentin Monnet.
4) Various kprobe_multi_link fixes related to kernel modules,
from Jiri Olsa.
5) Optimize x86-64 JIT with emitting BMI2-based shift instructions,
from Jie Meng.
6) Improve BPF verifier's memory type compatibility for map key/value
arguments, from Dave Marchevsky.
7) Only create mmap-able data section maps in libbpf when data is exposed
via skeletons, from Andrii Nakryiko.
8) Add an autoattach option for bpftool to load all object assets,
from Wang Yufen.
9) Various memory handling fixes for libbpf and BPF selftests,
from Xu Kuohai.
10) Initial support for BPF selftest's vmtest.sh on arm64,
from Manu Bretelle.
11) Improve libbpf's BTF handling to dedup identical structs,
from Alan Maguire.
12) Add BPF CI and denylist documentation for BPF selftests,
from Daniel Müller.
13) Check BPF cpumap max_entries before doing allocation work,
from Florian Lehner.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (70 commits)
samples/bpf: Fix typo in README
bpf: Remove the obsolte u64_stats_fetch_*_irq() users.
bpf: check max_entries before allocating memory
bpf: Fix a typo in comment for DFS algorithm
bpftool: Fix spelling mistake "disasembler" -> "disassembler"
selftests/bpf: Fix bpftool synctypes checking failure
selftests/bpf: Panic on hard/soft lockup
docs/bpf: Add documentation for new cgroup local storage
selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x
selftests/bpf: Add selftests for new cgroup local storage
selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str
bpftool: Support new cgroup local storage
libbpf: Support new cgroup local storage
bpf: Implement cgroup storage available to non-cgroup-attached bpf progs
bpf: Refactor some inode/task/sk storage functions for reuse
bpf: Make struct cgroup btf id global
selftests/bpf: Tracing prog can still do lookup under busy lock
selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection
bpf: Add new bpf_task_storage_delete proto with no deadlock detection
bpf: bpf_task_storage_delete_recur does lookup first before the deadlock check
...
====================
Link: https://lore.kernel.org/r/20221102062120.5724-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. The UBSAN warning calltrace like below:
UBSAN: shift-out-of-bounds in kernel/auditfilter.c:179:23
left shift of 1 by 31 places cannot be represented in type 'int'
Call Trace:
<TASK>
dump_stack_lvl+0x7d/0xa5
dump_stack+0x15/0x1b
ubsan_epilogue+0xe/0x4e
__ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
audit_register_class+0x9d/0x137
audit_classes_init+0x4d/0xb8
do_one_initcall+0x76/0x430
kernel_init_freeable+0x3b3/0x422
kernel_init+0x24/0x1e0
ret_from_fork+0x1f/0x30
</TASK>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
[PM: remove bad 'Fixes' tag as issue predates git, added in v2.6.6-rc1]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Clean up the use of unsafe_memcpy() by adding a flexible array
at the end of netlink message header and splitting up the header
and data copies.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmNe+HAeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGrFwH/jYeSZl8ACW2GR7r
8zsyHK/qy7tOmwV99QZjS94b1KK1hfi2L93Ro8LXohqI3HDvk6eBXWApz3jSicFW
OeTCq1vu2Bw1iimhUThVqVCWF8EaVwdYYobvNqIznA2QCkOO+oTG+3oIiWJaEOVE
xKJypq95b16D0KAGXrgZznjdBi2DlOUzdsjNT3E8xSXeEjvX7y7optHhuMlCgir4
19kEgelGAOp28rd5l0qZ2pt0ltMN+COx4EF8LCNvsD2kvXPbyDN/V2Ql04mYBMde
hriWyP5W0WDOCvjrcxV4CgyosZ2UgkfAdZ1JTJQi3RkfJ1B52BpyfRiBkfQeqcC+
RoHbW7w=
=RWs8
-----END PGP SIGNATURE-----
Merge 6.1-rc3 into usb-next
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Add Alder and Raptor Lakes support to RAPL
- Make sure raw sample data is output with tracepoints
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmNeRnEACgkQEsHwGGHe
VUpmTRAAmLvQhTN15L4qr6BSIUhlOk1xmM4pKtUXfpzX9Nki+bhPvH8sczaUXg1N
90u6pD8+uOIFsd2s+bUVyR/h3cWnjpy9Or1oSYlNTTPxwlqC1XsLqsWjy7/AA91d
YAUZNfmIsBNTUDtjygslnZ2yZIIPWXGI5utvrkS3W2cbfZtQhuDVTo5KAnx3+0fC
inKfiO+lAEouNu9l/+GdqPhgiDVB+oK12ROMosAr9++Ewuf61Jnk0nVEynNVoGT0
OLxbNT6xU3TlOm/n2zwmWnM95ZJ9sM5SEJg+c55VZ9biTAgayd+7Hw8H3CAqIhdD
utFoxkQpblp7Lq6IporcfjpGISA4WdbaiJaMN56azucGcZsk6VXUzNk6AimXvqjP
d8z7nVYDGDxYoIWyoSfO7XuIhqek38KRTEbl3qvyRZoF/FRjaWCvZeir9W32mRbx
bVKPTQ8FgSUtkBLhGZrldHP8PRsw1nf60wJb19p8s5aWNMzimgUN0As0kf78k6l+
fapTvhuU84EDVjiUS7BTrMq1r3ieaZiN2Ofi4EAG8c4R3S4C3hHKFQH3suIOp3vf
UpCyYi+29LfdTgiuNX+efklUSu5T2EccJXke07CJQM5BBppvuPqeG7YJET5/YDuz
tSPIbTZ9lxomeNWJSu9cyuynqPIS0f/j6FtpodA7MY1f/AX7OHM=
=zS3P
-----END PGP SIGNATURE-----
Merge tag 'perf_urgent_for_v6.1_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Rename a perf memory level event define to denote it is of CXL type
- Add Alder and Raptor Lakes support to RAPL
- Make sure raw sample data is output with tracepoints
* tag 'perf_urgent_for_v6.1_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/mem: Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL
perf/x86/rapl: Add support for Intel Raptor Lake
perf/x86/rapl: Add support for Intel AlderLake-N
perf: Fix missing raw data on tracepoint events
Extend packet socket option PACKET_IGNORE_OUTGOING to fanout groups.
The socket option sets ptype.ignore_outgoing, which makes
dev_queue_xmit_nit skip the socket.
When the socket joins a fanout group, the option is not reflected in
the struct ptype of the group. dev_queue_xmit_nit only tests the
fanout ptype, so the flag is ignored once a socket joins a
fanout group.
Inheriting the option from a socket would change established behavior.
Different sockets in the group can set different flags, and can also
change them at runtime.
Testing in packet_rcv_fanout defeats the purpose of the original
patch, which is to avoid skb_clone in dev_queue_xmit_nit (esp. for
MSG_ZEROCOPY packets).
Instead, introduce a new fanout group flag with the same behavior.
Tested with https://github.com/wdebruij/kerneltools/blob/master/tests/test_psock_fanout_ignore_outgoing.c
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221027211014.3581513-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7.
All mac80211 driver are now converted to use internal TX queues, this
might cause some regressions so we wanted to do this early in the
cycle.
Note: wireless tree was merged[1] to wireless-next to avoid some
conflicts with mac80211 patches between the trees. Unfortunately there
are still two smaller conflicts in net/mac80211/util.c which Stephen
also reported[2]. In the first conflict initialise scratch_len to
"params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and
in the second conflict take the version which uses elems->scratch_pos.
Git diff output should like this:
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@@ -1506,7 -1648,7 +1650,7 @@@ ieee802_11_parse_elems_full(struct ieee
const struct element *non_inherit = NULL;
u8 *nontransmitted_profile;
int nontransmitted_profile_len = 0;
- size_t scratch_len = params->len;
- size_t scratch_len = params->scratch_len ?: 2 * params->len;
++ size_t scratch_len = params->scratch_len ?: 3 * params->len;
elems = kzalloc(sizeof(*elems) + scratch_len, GFP_ATOMIC);
if (!elems)
[1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91
[2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/
Major changes:
mac80211
* preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues
* add API to show the link STAs in debugfs
* all mac80211 drivers are now using mac80211 internal TX queues (iTXQs)
rtw89
* support 8852BE
rtl8xxxu
* support RTL8188FU
brmfmac
* support two station interfaces concurrently
bcma
* support SPROM rev 11
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmNb2KwRHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZs6hggAqrmgHaiRbPYlLVE0hFVaGeVXslfpK9nj
ZGr3bwG5FpioxjAsI7NfwEwpq+FoxTUigG/SShw1Rr8lw43Nt7E57+5McK3qWdKC
5/WndfChPppjtJUpc6PN+UcmCuhcm4TAyMAojbTe5lJ9cJ4yyOWZChDwah3BVZQk
/tHK4qKq8gpppwgkiJ88VdTVoVv5anIvXHH3lAIteljxk0zkaEM7lIQhIFst/zpX
cu/cWvKz2Rh6vnYjkzl8BcrbW8e0/VwnzWggtWs4/kIuRVnHRrrGwlnsULShOscM
fmK+RI2CbJIMucttUOHOXjCg4wwppSit1rH4xOkcwlvrFa7AvDbdHg==
=Zils
-----END PGP SIGNATURE-----
Kalle Valo says:
====================
pull-request: wireless-next-2022-10-28
First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7.
All mac80211 driver are now converted to use internal TX queues, this
might cause some regressions so we wanted to do this early in the
cycle.
Note: wireless tree was merged[1] to wireless-next to avoid some
conflicts with mac80211 patches between the trees. Unfortunately there
are still two smaller conflicts in net/mac80211/util.c which Stephen
also reported[2]. In the first conflict initialise scratch_len to
"params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and
in the second conflict take the version which uses elems->scratch_pos.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91
[2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/
mac80211
- preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues
- add API to show the link STAs in debugfs
- all mac80211 drivers are now using mac80211 internal TX queues (iTXQs)
rtw89
- support 8852BE
rtl8xxxu
- support RTL8188FU
brmfmac
- support two station interfaces concurrently
bcma
- support SPROM rev 11
====================
Link: https://lore.kernel.org/r/20221028132943.304ECC433B5@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rcv_wnd can be useful to diagnose TCP performance where receiver window
becomes the bottleneck. rehash reports the PLB and timeout triggered
rehash attempts by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A u32 counter is added to tcp_sock for counting the number of PLB
triggered rehashes for a TCP connection. An SNMP counter is also
added to count overall PLB triggered rehash events for a host. These
counters are hooked up to PLB implementation for DCTCP.
TCP_NLA_REHASH is added to SCM_TIMESTAMPING_OPT_STATS that reports
the rehash attempts triggered due to PLB or timeouts. This gives
a historical view of sustained congestion or timeouts experienced
by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to leverage keyring to store sensitive keys, and then use those
keys for symmetric encryption via the crypto API. Among the key types we
wish to support are: user, logon, encrypted, and trusted.
User key types are already able to have their data copied to user space,
but logon does not support this. Further, trusted and encrypted keys will
return their encrypted data back to user space on read, which does not
make them ideal for symmetric encryption.
To support symmetric encryption for these key types, add a new
ALG_SET_KEY_BY_KEY_SERIAL setsockopt() option to the crypto API. This
allows users to pass a key_serial_t to the crypto API to perform
symmetric encryption. The behavior is the same as ALG_SET_KEY, but
the crypto key data is copied in kernel space from a keyring key,
which allows for the support of logon, encrypted, and trusted key types.
Keyring keys must have the KEY_(POS|USR|GRP|OTH)_SEARCH permission set
to leverage this feature. This follows the asymmetric_key type where key
lookup calls eventually lead to keyring_search_rcu() without the
KEYRING_SEARCH_NO_CHECK_PERM flag set.
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
1) Move struct nft_payload_set definition to .c file where it is
only used.
2) Shrink transport and inner header offset fields in the nft_pktinfo
structure to 16-bits, from Florian Westphal.
3) Get rid of nft_objref Kbuild toggle, make it built-in into
nf_tables. This expression is used to instantiate conntrack helpers
in nftables. After removing the conntrack helper auto-assignment
toggle it this feature became more important so move it to the nf_tables
core module. Also from Florian.
4) Extend the existing function to calculate payload inner header offset
to deal with the GRE and IPIP transport protocols.
6) Add inner expression support for nf_tables. This new expression
provides a packet parser for tunneled packets which uses a userspace
description of the expected inner headers. The inner expression
invokes the payload expression (via direct call) to match on the
inner header protocol fields using the inner link, network and
transport header offsets.
An example of the bytecode generated from userspace to match on
IP source encapsulated in a VxLAN packet:
# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
netdev x y
[ meta load l4proto => reg 1 ]
[ cmp eq reg 1 0x00000011 ]
[ payload load 2b @ transport header + 2 => reg 1 ]
[ cmp eq reg 1 0x0000b512 ]
[ inner type vxlan hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
[ cmp eq reg 1 0x00000008 ]
[ inner type vxlan hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
[ cmp eq reg 1 0x04030201 ]
7) Store inner link, network and transport header offsets in percpu
area to parse inner packet header once only. Matching on a different
tunnel type invalidates existing offsets in the percpu area and it
invokes the inner tunnel parser again.
8) Add support for inner meta matching. This support for
NFTA_META_PROTOCOL, which specifies the inner ethertype, and
NFT_META_L4PROTO, which specifies the inner transport protocol.
9) Extend nft_inner to parse GENEVE optional fields to calculate the
link layer offset.
10) Update inner expression so tunnel offset points to GRE header
to normalize tunnel header handling. This also allows to perform
different interpretations of the GRE header from userspace.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nft_inner: set tunnel offset to GRE header offset
netfilter: nft_inner: add geneve support
netfilter: nft_meta: add inner match support
netfilter: nft_inner: add percpu inner context
netfilter: nft_inner: support for inner tunnel header matching
netfilter: nft_payload: access ipip payload for inner offset
netfilter: nft_payload: access GRE payload via inner offset
netfilter: nft_objref: make it builtin
netfilter: nf_tables: reduce nft_pktinfo by 8 bytes
netfilter: nft_payload: move struct nft_payload_set definition where it belongs
====================
Link: https://lore.kernel.org/r/20221026132227.3287-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
2871edb32f ("can: kvaser_usb: Fix possible completions during init_completion")
abb8670938 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
8d21f5927a ("can: kvaser_usb_leaf: Fix improved state not being reported")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
PERF_MEM_LVLNUM_EXTN_MEM was introduced to cover CXL devices but it's
bit ambiguous name and also not generic enough to cover cxl.cache and
cxl.io devices. Rename it to PERF_MEM_LVLNUM_CXL to be more specific.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/f6268268-b4e9-9ed6-0453-65792644d953@amd.com
Similar to sk/inode/task storage, implement similar cgroup local storage.
There already exists a local storage implementation for cgroup-attached
bpf programs. See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper
bpf_get_local_storage(). But there are use cases such that non-cgroup
attached bpf progs wants to access cgroup local storage data. For example,
tc egress prog has access to sk and cgroup. It is possible to use
sk local storage to emulate cgroup local storage by storing data in socket.
But this is a waste as it could be lots of sockets belonging to a particular
cgroup. Alternatively, a separate map can be created with cgroup id as the key.
But this will introduce additional overhead to manipulate the new map.
A cgroup local storage, similar to existing sk/inode/task storage,
should help for this use case.
The life-cycle of storage is managed with the life-cycle of the
cgroup struct. i.e. the storage is destroyed along with the owning cgroup
with a call to bpf_cgrp_storage_free() when cgroup itself
is deleted.
The userspace map operations can be done by using a cgroup fd as a key
passed to the lookup, update and delete operations.
Typically, the following code is used to get the current cgroup:
struct task_struct *task = bpf_get_current_task_btf();
... task->cgroups->dfl_cgrp ...
and in structure task_struct definition:
struct task_struct {
....
struct css_set __rcu *cgroups;
....
}
With sleepable program, accessing task->cgroups is not protected by rcu_read_lock.
So the current implementation only supports non-sleepable program and supporting
sleepable program will be the next step together with adding rcu_read_lock
protection for rcu tagged structures.
Since map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for old cgroup local
storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE is used
for cgroup storage available to non-cgroup-attached bpf programs. The old
cgroup storage supports bpf_get_local_storage() helper to get the cgroup data.
The new cgroup storage helper bpf_cgrp_storage_get() can provide similar
functionality. While old cgroup storage pre-allocates storage memory, the new
mechanism can also pre-allocate with a user space bpf_map_update_elem() call
to avoid potential run-time memory allocation failure.
Therefore, the new cgroup storage can provide all functionality w.r.t.
the old one. So in uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is alias to
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate the old cgroup storage can
be deprecated since the new one can provide the same functionality.
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When searching for a struct definition I often enough end up simply doing
git grep 'struct foobar {'
Sadly some of the ELF structs did not follow the usual coding style so they
were invisible.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/11563980.Ss37MnutNL@mobilepool36.emlix.com
If it is a progressive (non-interlaced) format, then ignore the
interlaced timing values.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 7f68127fa1 ([media] videodev2.h: defines to calculate blanking and frame sizes)
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Geneve tunnel header may contain options, parse geneve header and update
offset to point to the link layer header according to the opt_len field.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This new expression allows you to match on the inner headers that are
encapsulated by any of the existing tunneling protocols.
This expression parses the inner packet to set the link, network and
transport offsets, so the existing expressions (with a few updates) can
be reused to match on the inner headers.
The inner expression supports for different tunnel combinations such as:
- ethernet frame over IPv4/IPv6 packet, eg. VxLAN.
- IPv4/IPv6 packet over IPv4/IPv6 packet, eg. IPIP.
- IPv4/IPv6 packet over IPv4/IPv6 + transport header, eg. GRE.
- transport header (ESP or SCTP) over transport header (usually UDP)
The following fields are used to describe the tunnel protocol:
- flags, which describe how to parse the inner headers:
NFT_PAYLOAD_CTX_INNER_TUN, the tunnel provides its own header.
NFT_PAYLOAD_CTX_INNER_ETHER, the ethernet frame is available as inner header.
NFT_PAYLOAD_CTX_INNER_NH, the network header is available as inner header.
NFT_PAYLOAD_CTX_INNER_TH, the transport header is available as inner header.
For example, VxLAN sets on all of these flags. While GRE only sets on
NFT_PAYLOAD_CTX_INNER_NH and NFT_PAYLOAD_CTX_INNER_TH. Then, ESP over
UDP only sets on NFT_PAYLOAD_CTX_INNER_TH.
The tunnel description is composed of the following attributes:
- header size: in case the tunnel comes with its own header, eg. VxLAN.
- type: this provides a hint to userspace on how to delinearize the rule.
This is useful for VxLAN and Geneve since they run over UDP, since
transport does not provide a hint. This is also useful in case hardware
offload is ever supported. The type is not currently interpreted by the
kernel.
- expression: currently only payload supported. Follow up patch adds
also inner meta support which is required by autogenerated
dependencies. The exthdr expression should be supported too
at some point. There is a new inner_ops operation that needs to be
set on to allow to use an existing expression from the inner expression.
This patch adds a new NFT_PAYLOAD_TUN_HEADER base which allows to match
on the tunnel header fields, eg. vxlan vni.
The payload expression is embedded into nft_inner private area and this
private data area is passed to the payload inner eval function via
direct call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add V4L2_CID_HDR_MODE as a menu item control to set the HDR mode of the
sensor, and its documentation.
Menu items are not standardized as they differ for each sensors.
Signed-off-by: Benjamin Mugnier <benjamin.mugnier@foss.st.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
This extends the greyscale media bus family originally from
MEDIA_BUS_FMT_Y8_1X8 up to MEDIA_BUS_FMT_Y14_1X14 by adding
MEDIA_BUS_FMT_Y16_1X16, and behaves the same way with 16 bits.
Add its documentation in subdev-formats.rst
Signed-off-by: Benjamin Mugnier <benjamin.mugnier@foss.st.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Added p_s32 and p_s64 pointers to the union in struct v4l2_ext_control to
simplify INTEGER and INTEGER64 control array support.
Internally the control framework handles such arrays just fine, but
userspace is missing corresponding pointers to access array elements of
these types.
The internal union v4l2_ctrl_ptr which the control framework uses already
has these types, they just were never added to the public API.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Add support for 800Gbps speed, link modes of 100Gbps per lane.
As mentioned in slide 21 in IEEE documentation [1], all adopted 802.3df
copper and optical PMDs baselines using 100G/lane will be supported.
Add the relevant PMDs which are mentioned in slide 5 in IEEE
documentation [1] and were approved on 10-2022 [2]:
BP - KR8
Cu Cable - CR8
MMF 50m - VR8
MMF 100m - SR8
SMF 500m - DR8
SMF 2km - DR8-2
[1]: https://www.ieee802.org/3/df/public/22_10/22_1004/shrikhande_3df_01a_221004.pdf
[2]: https://ieee802.org/3/df/KeyMotions_3df_221005.pdf
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmNTCJ8ACgkQCF8+vY7k
4RV/Ag//Ws8bIdedAQsbakBOq9JPOMiqHZnBde5DWn3XqU8aAY9bby70Rf2OTbl7
8mpqzeJY6wFfvesGCJ8L2hprpVqNW1KCrNAxSyaq/8Wau1D77wyEofqPNarNJzqn
oKbH0JWd8hfStJpgmwyxUXjLanDXNx2s4lRm6R1WMWPH6dLeHydx4CtFMbmOn1L8
+jTtLK6631plWw/Kkp1A9z8N1D/9b4iMOgpoQZZLuzL1DouoYWlltz+Kw9HU7rsQ
1/wGmMwTwiV6Zt2UPwB4qudq3UpUMB3tm0KWprkmSx3Xv14Rr1o3zdwALTXib0Ez
wZuzWzWaf9Fjp7CHOfEpm4x3+kU9181iw4ACk34cq7SglMYCdQ2hiwW5b9hhTN2m
tYxv78fXJD2lHyxZQAHNN7XRmiWfMWMA0Z7GwCLVFXJ24Vjzv5AfuD3rJEE6Fv3X
UOjPTNdNt4tpxX8A2Yd7WlfIBBGm2h63MVIYh50R54JCdLLLB8vhtob7pP2Y94pg
FqXxfwc216cArKVsIjmUUkJs153IlQPYzBv9xXBBbD2DXhguWhLQnf9L/KdCnFkF
6NTULAHNezkss6dbLPIL08lCEIvTqeQabPBlCEtXNqqxBWfJwdwLbeS8mg2dTxao
wwR5D37JbNuDSj0/4N/DlvVJozcCLJ2ZZ9R3c2j8/4Z0HERIhqA=
=gJf4
-----END PGP SIGNATURE-----
Merge tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull missed media updates from Mauro Carvalho Chehab:
"It seems I screwed-up my previous pull request: it ends up that only
half of the media patches that were in linux-next got merged in -rc1.
The script which creates the signed tags silently failed due to
5.19->6.0 so it ended generating a tag with incomplete stuff.
So here are the missing parts:
- a DVB core security fix
- lots of fixes and cleanups for atomisp staging driver
- old drivers that are VB1 are being moved to staging to be
deprecated
- several driver updates - mostly for embedded systems, but there are
also some things addressing issues with some PC webcams, in the UVC
video driver"
* tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (163 commits)
media: sun6i-csi: Move csi buffer definition to main header file
media: sun6i-csi: Introduce and use video helper functions
media: sun6i-csi: Add media ops with link notify callback
media: sun6i-csi: Remove controls handler from the driver
media: sun6i-csi: Register the media device after creation
media: sun6i-csi: Pass and store csi device directly in video code
media: sun6i-csi: Tidy up video code
media: sun6i-csi: Tidy up v4l2 code
media: sun6i-csi: Tidy up Kconfig
media: sun6i-csi: Use runtime pm for clocks and reset
media: sun6i-csi: Define and use variant to get module clock rate
media: sun6i-csi: Always set exclusive module clock rate
media: sun6i-csi: Tidy up platform code
media: sun6i-csi: Refactor main driver data structures
media: sun6i-csi: Define and use driver name and (reworked) description
media: cedrus: Add a Kconfig dependency on RESET_CONTROLLER
media: sun8i-rotate: Add a Kconfig dependency on RESET_CONTROLLER
media: sun8i-di: Add a Kconfig dependency on RESET_CONTROLLER
media: sun4i-csi: Add a Kconfig dependency on RESET_CONTROLLER
media: sun6i-csi: Add a Kconfig dependency on RESET_CONTROLLER
...
For the userspace it is needed to distinguish between requests for the
control or streaming interface. The userspace would have to parse the
configfs to know which interface index it has to compare the ctrl
requests against. Since the interface numbers are not fixed, e.g. for
composite gadgets, the interface offset depends on the setup.
The kernel has this information when handing over the ctrl request to
the userspace. This patch removes the offset from the interface numbers
and expose the default interface defines in the uapi g_uvc.h.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Link: https://lore.kernel.org/r/20221011075348.1786897-1-m.grzeschik@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>