Unlike e.g. netdev features, the ethtool ioctl interface requires link mode
table to be in sync between kernel and userspace for userspace to be able
to display and set all link modes supported by kernel. The way arbitrary
length bitsets are implemented in netlink interface, this will be no longer
needed.
To allow userspace to access all link modes running kernel supports, add
table of ethernet link mode names and make it available as a string set to
userspace GET_STRSET requests. Add build time check to make sure names
are defined for all modes declared in enum ethtool_link_mode_bit_indices.
Once the string set is available, make it also accessible via ioctl.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Permanent hardware address of a network device was traditionally provided
via ethtool ioctl interface but as Jiri Pirko pointed out in a review of
ethtool netlink interface, rtnetlink is much more suitable for it so let's
add it to the RTM_NEWLINK message.
Add IFLA_PERM_ADDRESS attribute to RTM_NEWLINK messages unless the
permanent address is all zeros (i.e. device driver did not fill it). As
permanent address is not modifiable, reject userspace requests containing
IFLA_PERM_ADDRESS attribute.
Note: we already provide permanent hardware address for bond slaves;
unfortunately we cannot drop that attribute for backward compatibility
reasons.
v5 -> v6: only add the attribute if permanent address is not zero
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we submit an unknown opcode and have fd == -1, io_op_needs_file()
will return true as we default to needing a file. Then when we go and
assign the file, we find the 'fd' invalid and return -EBADF. We really
should be returning -EINVAL for that case, as we normally do for
unsupported opcodes.
Change io_op_needs_file() to have the following return values:
0 - does not need a file
1 - does need a file
< 0 - error value
and use this to pass back the right value for this invalid case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The VMADDR_CID_RESERVED (1) was used by VMCI, but now it is not
used anymore, so we can reuse it for local communication
(loopback) adding the new well-know CID: VMADDR_CID_LOCAL.
Cc: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow for audit messages to be emitted upon BPF program load and
unload for having a timeline of events. The load itself is in
syscall context, so additional info about the process initiating
the BPF prog creation can be logged and later directly correlated
to the unload event.
The only info really needed from BPF side is the globally unique
prog ID where then audit user space tooling can query / dump all
info needed about the specific BPF program right upon load event
and enrich the record, thus these changes needed here can be kept
small and non-intrusive to the core.
Raw example output:
# auditctl -D
# auditctl -a always,exit -F arch=x86_64 -S bpf
# ausearch --start recent -m 1334
...
----
time->Wed Nov 27 16:04:13 2019
type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf"
type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321 \
success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477 \
pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001 \
egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf" \
exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf" \
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD
----
time->Wed Nov 27 16:04:13 2019
type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD
...
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org
Add support for reading out the uniq information from the underlying HID
device. This might be the iSerialNumber in case of USB or the BD_ADDR in
case of Bluetooth.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The staging isdn drivers are gone, and CONFIG_BT_CMTP is now
the only user. This means a lot of the code in the subsystem
has no remaining callers and can be removed.
Change the capi user space front-end to be part of kernelcapi,
and the combined module to only be compiled if BT_CMTP is
also enabled, then remove the interfaces that have no remaining
callers.
As the notifier list and the capi_drivers list have no callers
outside of kcapi.c, the implementation gets much simpler.
Some definitions from the include/linux/*.h headers are only
needed internally and are moved to kcapi.h.
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20191210210455.3475361-2-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As described in drivers/staging/isdn/TODO, the drivers are all
assumed to be unmaintained and unused now, with gigaset being the
last one to stop being maintained after Paul Bolle lost access
to an ISDN network.
The CAPI subsystem remains for now, as it is still required by
bluetooth/cmtp.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20191210210455.3475361-1-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This framework allows a unified userspace interface for dma-buf
exporters, allowing userland to allocate specific types of memory
for use in dma-buf sharing.
Each heap is given its own device node, which a user can allocate
a dma-buf fd from using the DMA_HEAP_IOC_ALLOC.
This code is an evoluiton of the Android ION implementation,
and a big thanks is due to its authors/maintainers over time
for their effort:
Rebecca Schultz Zavin, Colin Cross, Benjamin Gaignard,
Laura Abbott, and many other contributors!
Cc: Laura Abbott <labbott@redhat.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Vincent Donnefort <Vincent.Donnefort@arm.com>
Cc: Sudipto Paul <Sudipto.Paul@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Acked-by: Sandeep Patil <sspatil@android.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191203172641.66642-2-john.stultz@linaro.org
Some commands will invariably end in a failure in the sense that the
completion result will be less than zero. One such example is timeouts
that don't have a completion count set, they will always complete with
-ETIME unless cancelled.
For linked commands, we sever links and fail the rest of the chain if
the result is less than zero. Since we have commands where we know that
will happen, add IOSQE_IO_HARDLINK as a stronger link that doesn't sever
regardless of the completion result. Note that the link will still sever
if we fail submitting the parent request, hard links are only resilient
in the presence of completion results for requests that did submit
correctly.
Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Wait for rcu grace period after releasing netns in ctnetlink,
from Florian Westphal.
2) Incorrect command type in flowtable offload ndo invocation,
from wenxu.
3) Incorrect callback type in flowtable offload flow tuple
updates, also from wenxu.
4) Fix compile warning on flowtable offload infrastructure due to
possible reference to uninitialized variable, from Nathan Chancellor.
5) Do not inline nf_ct_resolve_clash(), this is called from slow
path / stress situations. From Florian Westphal.
6) Missing IPv6 flow selector description in flowtable offload.
7) Missing check for NETDEV_UNREGISTER in nf_tables offload
infrastructure, from wenxu.
8) Update NAT selftest to use randomized netns names, from
Florian Westphal.
9) Restore nfqueue bridge support, from Marco Oliverio.
10) Compilation warning in SCTP_CHUNKMAP_*() on xt_sctp header.
From Phil Sutter.
11) Fix bogus lookup/get match for non-anonymous rbtree sets.
12) Missing netlink validation for NFT_SET_ELEM_INTERVAL_END
elements.
13) Missing netlink validation for NFT_DATA_VALUE after
nft_data_init().
14) If rule specifies no actions, offload infrastructure returns
EOPNOTSUPP.
15) Module refcount leak in object updates.
16) Missing sanitization for ARP traffic from br_netfilter, from
Eric Dumazet.
17) Compilation breakage on big-endian due to incorrect memcpy()
size in the flowtable offload infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With 'bytes(__u32)' being 32, a left-shift of 31 may happen which is
undefined for the signed 32-bit value 1. Avoid this by declaring 1 as
unsigned.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented
as a TCP ULP, overriding in particular the sendmsg and recvmsg
operations. A Stream Parser is used to extract messages out of the TCP
stream using the first 2 bytes as length marker. Received IKE messages
are put on "ike_queue", waiting to be dequeued by the custom recvmsg
implementation. Received ESP messages are sent to XFRM, like with UDP
encapsulation.
Some of this code is taken from the original submission by Herbert
Xu. Currently, only IPv4 is supported, like for UDP encapsulation.
Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
WireGuard is a layer 3 secure networking tunnel made specifically for
the kernel, that aims to be much simpler and easier to audit than IPsec.
Extensive documentation and description of the protocol and
considerations, along with formal proofs of the cryptography, are
available at:
* https://www.wireguard.com/
* https://www.wireguard.com/papers/wireguard.pdf
This commit implements WireGuard as a simple network device driver,
accessible in the usual RTNL way used by virtual network drivers. It
makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
networking subsystem APIs. It has a somewhat novel multicore queueing
system designed for maximum throughput and minimal latency of encryption
operations, but it is implemented modestly using workqueues and NAPI.
Configuration is done via generic Netlink, and following a review from
the Netlink maintainer a year ago, several high profile userspace tools
have already implemented the API.
This commit also comes with several different tests, both in-kernel
tests and out-of-kernel tests based on network namespaces, taking profit
of the fact that sockets used by WireGuard intentionally stay in the
namespace the WireGuard interface was originally created, exactly like
the semantics of userspace tun devices. See wireguard.com/netns/ for
pictures and examples.
The source code is fairly short, but rather than combining everything
into a single file, WireGuard is developed as cleanly separable files,
making auditing and comprehension easier. Things are laid out as
follows:
* noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
cryptographic aspects of the protocol, and are mostly data-only in
nature, taking in buffers of bytes and spitting out buffers of
bytes. They also handle reference counting for their various shared
pieces of data, like keys and key lists.
* ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
ratelimiting certain types of cryptographic operations in accordance
with particular WireGuard semantics.
* allowedips.[ch], peerlookup.[ch]: The main lookup structures of
WireGuard, the former being trie-like with particular semantics, an
integral part of the design of the protocol, and the latter just
being nice helper functions around the various hashtables we use.
* device.[ch]: Implementation of functions for the netdevice and for
rtnl, responsible for maintaining the life of a given interface and
wiring it up to the rest of WireGuard.
* peer.[ch]: Each interface has a list of peers, with helper functions
available here for creation, destruction, and reference counting.
* socket.[ch]: Implementation of functions related to udp_socket and
the general set of kernel socket APIs, for sending and receiving
ciphertext UDP packets, and taking care of WireGuard-specific sticky
socket routing semantics for the automatic roaming.
* netlink.[ch]: Userspace API entry point for configuring WireGuard
peers and devices. The API has been implemented by several userspace
tools and network management utility, and the WireGuard project
distributes the basic wg(8) tool.
* queueing.[ch]: Shared function on the rx and tx path for handling
the various queues used in the multicore algorithms.
* send.c: Handles encrypting outgoing packets in parallel on
multiple cores, before sending them in order on a single core, via
workqueues and ring buffers. Also handles sending handshake and cookie
messages as part of the protocol, in parallel.
* receive.c: Handles decrypting incoming packets in parallel on
multiple cores, before passing them off in order to be ingested via
the rest of the networking subsystem with GRO via the typical NAPI
poll function. Also handles receiving handshake and cookie messages
as part of the protocol, in parallel.
* timers.[ch]: Uses the timer wheel to implement protocol particular
event timeouts, and gives a set of very simple event-driven entry
point functions for callers.
* main.c, version.h: Initialization and deinitialization of the module.
* selftest/*.h: Runtime unit tests for some of the most security
sensitive functions.
* tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
script using network namespaces.
This commit aims to be as self-contained as possible, implementing
WireGuard as a standalone module not needing much special handling or
coordination from the network subsystem. I expect for future
optimizations to the network stack to positively improve WireGuard, and
vice-versa, but for the time being, this exists as intentionally
standalone.
We introduce a menu option for CONFIG_WIREGUARD, as well as providing a
verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: David Miller <davem@davemloft.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull more input updates from Dmitry Torokhov:
- fixups for Synaptics RMI4 driver
- a quirk for Goodinx touchscreen on Teclast tablet
- a new keycode definition for activating privacy screen feature found
on a few "enterprise" laptops
- updates to snvs_pwrkey driver
- polling uinput device for writing (which is always allowed) now works
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers
Input: synaptics-rmi4 - re-enable IRQs in f34v7_do_reflash
Input: goodix - add upside-down quirk for Teclast X89 tablet
Input: add privacy screen toggle keycode
Input: uinput - fix returning EPOLLOUT from uinput_poll
Input: snvs_pwrkey - remove gratuitous NULL initializers
Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl3puL4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpjOJD/4vV2ENSYcwZ7Qezgn9tx5NEdADN6jC0DXe
tJDnA0sDLfLTlVM9I1U2/wD6Wu31cvV2dHKmxO+WWWRP2M0orI00UA3I6BssVptY
42e/YJd13IM+iVrhfhm+hmcAHvQUmWHTPZg/7F7OZPkEtNcbCjn6s+HTyzu9gqtt
/kbJVDJ75TR9dEWCPY/A4jUcJYLw2leoHI4u2eqYRsTFNJHUQoaFUHDyNHyj7UNI
kIUi2UixJv+a4pkFvyLPKhJcLr6DBC2TeUnfP2Re5oX1Z/XkDR7iX+L5dUwRHHbT
4M7aJJ+mHDm8Z4IwwBqsSDRU5wUpzuplwBQBY/EQilUZAyOALVzUQABlRa0zxH8t
0D4U6LDDOopYeK0by/FGdD88S6Z0LKm0HJETbdPaGd9fqSlhBn19iGeBKYk0cCIu
Uecm9DKF+QLfKhTbN6h8nPyR3bfkqyUptQJ1UnheWo9f6L32bfp19sdI9LdtRUnS
k661QApajaYQu/641u9CDZiI+DvI+57Wm9I4eCF3GXqOYWPAxSgWP7rJVgS+mfA3
wo99/r11SruaA1Wqf+bOoN/wjsQCiUPFa8po8sh05ER4MH5Brb5EFlg04ZlQZkR9
jOdiL8ISM9t86eyKwtR4PanE7/pPEYUjEWm4ntCC6ViTwCvgV3d6s2We6QPw1j2l
nQ9p/c2bhg==
=7ZFw
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20191205' of git://git.kernel.dk/linux-block
Pull more block and io_uring updates from Jens Axboe:
"I wasn't expecting this to be so big, and if I was, I would have used
separate branches for this. Going forward I'll be doing separate
branches for the current tree, just like for the next kernel version
tree. In any case, this contains:
- Series from Christoph that fixes an inherent race condition with
zoned devices and revalidation.
- null_blk zone size fix (Damien)
- Fix for a regression in this merge window that caused busy spins by
sending empty disk uevents (Eric)
- Fix for a regression in this merge window for bfq stats (Hou)
- Fix for io_uring creds allocation failure handling (me)
- io_uring -ERESTARTSYS send/recvmsg fix (me)
- Series that fixes the need for applications to retain state across
async request punts for io_uring. This one is a bit larger than I
would have hoped, but I think it's important we get this fixed for
5.5.
- connect(2) improvement for io_uring, handling EINPROGRESS instead
of having applications needing to poll for it (me)
- Have io_uring use a hash for poll requests instead of an rbtree.
This turned out to work much better in practice, so I think we
should make the switch now. For some workloads, even with a fair
amount of cancellations, the insertion sort is just too expensive.
(me)
- Various little io_uring fixes (me, Jackie, Pavel, LimingWu)
- Fix for brd unaligned IO, and a warning for the future (Ming)
- Fix for a bio integrity data leak (Justin)
- bvec_iter_advance() improvement (Pavel)
- Xen blkback page unmap fix (SeongJae)
The major items in here are all well tested, and on the liburing side
we continue to add regression and feature test cases. We're up to 50
topic cases now, each with anywhere from 1 to more than 10 cases in
each"
* tag 'for-linus-20191205' of git://git.kernel.dk/linux-block: (33 commits)
block: fix memleak of bio integrity data
io_uring: fix a typo in a comment
bfq-iosched: Ensure bio->bi_blkg is valid before using it
io_uring: hook all linked requests via link_list
io_uring: fix error handling in io_queue_link_head
io_uring: use hash table for poll command lookups
io-wq: clear node->next on list deletion
io_uring: ensure deferred timeouts copy necessary data
io_uring: allow IO_SQE_* flags on IORING_OP_TIMEOUT
null_blk: remove unused variable warning on !CONFIG_BLK_DEV_ZONED
brd: warn on un-aligned buffer
brd: remove max_hw_sectors queue limit
xen/blkback: Avoid unmapping unmapped grant pages
io_uring: handle connect -EINPROGRESS like -EAGAIN
block: set the zone size in blk_revalidate_disk_zones atomically
block: don't handle bio based drivers in blk_revalidate_disk_zones
block: allocate the zone bitmaps lazily
block: replace seq_zones_bitmap with conv_zones_bitmap
block: simplify blkdev_nr_zones
block: remove the empty line at the end of blk-zoned.c
...
Merge more updates from Andrew Morton:
"Most of the rest of MM and various other things. Some Kconfig rework
still awaits merges of dependent trees from linux-next.
Subsystems affected by this patch series: mm/hotfixes, mm/memcg,
mm/vmstat, mm/thp, procfs, sysctl, misc, notifiers, core-kernel,
bitops, lib, checkpatch, epoll, binfmt, init, rapidio, uaccess, kcov,
ubsan, ipc, bitmap, mm/pagemap"
* akpm: (86 commits)
mm: remove __ARCH_HAS_4LEVEL_HACK and include/asm-generic/4level-fixup.h
um: add support for folded p4d page tables
um: remove unused pxx_offset_proc() and addr_pte() functions
sparc32: use pgtable-nopud instead of 4level-fixup
parisc/hugetlb: use pgtable-nopXd instead of 4level-fixup
parisc: use pgtable-nopXd instead of 4level-fixup
nds32: use pgtable-nopmd instead of 4level-fixup
microblaze: use pgtable-nopmd instead of 4level-fixup
m68k: mm: use pgtable-nopXd instead of 4level-fixup
m68k: nommu: use pgtable-nopud instead of 4level-fixup
c6x: use pgtable-nopud instead of 4level-fixup
arm: nommu: use pgtable-nopud instead of 4level-fixup
alpha: use pgtable-nopud instead of 4level-fixup
gpio: pca953x: tighten up indentation
gpio: pca953x: convert to use bitmap API
gpio: pca953x: use input from regs structure in pca953x_irq_pending()
gpio: pca953x: remove redundant variable and check in IRQ handler
lib/bitmap: introduce bitmap_replace() helper
lib/test_bitmap: fix comment about this file
lib/test_bitmap: move exp1 and exp2 upper for others to use
...
Patch series " kcov: collect coverage from usb and vhost", v3.
This patchset extends kcov to allow collecting coverage from backgound
kernel threads. This extension requires custom annotations for each of
the places where coverage collection is desired. This patchset
implements this for hub events in the USB subsystem and for vhost
workers. See the first patch description for details about the kcov
extension. The other two patches apply this kcov extension to USB and
vhost.
Examples of other subsystems that might potentially benefit from this
when custom annotations are added (the list is based on
process_one_work() callers for bugs recently reported by syzbot):
1. fs: writeback wb_workfn() worker,
2. net: addrconf_dad_work()/addrconf_verify_work() workers,
3. net: neigh_periodic_work() worker,
4. net/p9: p9_write_work()/p9_read_work() workers,
5. block: blk_mq_run_work_fn() worker.
These patches have been used to enable coverage-guided USB fuzzing with
syzkaller for the last few years, see the details here:
https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md
This patchset has been pushed to the public Linux kernel Gerrit
instance:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524
This patch (of 3):
Add background thread coverage collection ability to kcov.
With KCOV_ENABLE coverage is collected only for syscalls that are issued
from the current process. With KCOV_REMOTE_ENABLE it's possible to
collect coverage for arbitrary parts of the kernel code, provided that
those parts are annotated with kcov_remote_start()/kcov_remote_stop().
This allows to collect coverage from two types of kernel background
threads: the global ones, that are spawned during kernel boot in a
limited number of instances (e.g. one USB hub_event() worker thread is
spawned per USB HCD); and the local ones, that are spawned when a user
interacts with some kernel interface (e.g. vhost workers).
To enable collecting coverage from a global background thread, a unique
global handle must be assigned and passed to the corresponding
kcov_remote_start() call. Then a userspace process can pass a list of
such handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field
of the kcov_remote_arg struct. This will attach the used kcov device to
the code sections, that are referenced by those handles.
Since there might be many local background threads spawned from
different userspace processes, we can't use a single global handle per
annotation. Instead, the userspace process passes a non-zero handle
through the common_handle field of the kcov_remote_arg struct. This
common handle gets saved to the kcov_handle field in the current
task_struct and needs to be passed to the newly spawned threads via
custom annotations. Those threads should in turn be annotated with
kcov_remote_start()/kcov_remote_stop().
Internally kcov stores handles as u64 integers. The top byte of a
handle is used to denote the id of a subsystem that this handle belongs
to, and the lower 4 bytes are used to denote the id of a thread instance
within that subsystem. A reserved value 0 is used as a subsystem id for
common handles as they don't belong to a particular subsystem. The
bytes 4-7 are currently reserved and must be zero. In the future the
number of bytes used for the subsystem or handle ids might be increased.
When a particular userspace process collects coverage by via a common
handle, kcov will collect coverage for each code section that is
annotated to use the common handle obtained as kcov_handle from the
current task_struct. However non common handles allow to collect
coverage selectively from different subsystems.
Link: http://lkml.kernel.org/r/e90e315426a384207edbec1d6aa89e43008e4caf.1572366574.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Userspace cannot compile <linux/scc.h>
CC usr/include/linux/scc.h.s
In file included from <command-line>:32:0:
usr/include/linux/scc.h:20:20: error: `SIOCDEVPRIVATE' undeclared here (not in a function)
SIOCSCCRESERVED = SIOCDEVPRIVATE,
^~~~~~~~~~~~~~
Include <linux/sockios.h> to make it self-contained, and add it to the
compile-test coverage.
Link: http://lkml.kernel.org/r/20191108055809.26969-1-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add keycode for toggling electronic privacy screen to the keycodes
definition. Some new laptops have a privacy screen which can be toggled
with a key on the keyboard.
Signed-off-by: Mathew King <mathewk@chromium.org>
Link: https://lore.kernel.org/r/20191017163208.235518-1-mathewk@chromium.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
* small x86 cleanup
* fix for an x86-specific out-of-bounds write on a ioctl (not guest triggerable,
data not attacker-controlled)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJd551cAAoJEL/70l94x66D+JkH/R3eEOyvckPmYmzd0lnV8mQ/
7e0n2G/aD+iLZkcCbUnMaImdmSJmoEEJCPjgPk/5nJ3zUi5b/ABWyidEM5uf19Hl
rzKBg0DR7BiQptPnZv2JMwEVKu3JOTchMykqu9xXChQlICocZ0xjdOA6nQ19p0Lv
FulDw5MUaWrXevIzCBskQ38zJejRQA6CpD1lQkHn7LKS9p3p+BsAOd/Ouy87RfWG
b3ktECNbXyO6KStrrhgm+z8pviWY+kqYklyBlDOOwxWif0x8WvNDpQLoVo+ZuhLU
Me8YJ1BN75vFlxzh6ZK5exBUnm9E3fGVKIaaF+dpuds2x+j4HnYl+lZCm89MdqY=
=Q4v7
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
- PPC secure guest support
- small x86 cleanup
- fix for an x86-specific out-of-bounds write on a ioctl (not guest
triggerable, data not attacker-controlled)
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vmx: Stop wasting a page for guest_msrs
KVM: x86: fix out-of-bounds write in KVM_GET_EMULATED_CPUID (CVE-2019-19332)
Documentation: kvm: Fix mention to number of ioctls classes
powerpc: Ultravisor: Add PPC_UV config option
KVM: PPC: Book3S HV: Support reset of secure guest
KVM: PPC: Book3S HV: Handle memory plug/unplug to secure VM
KVM: PPC: Book3S HV: Radix changes for secure guest
KVM: PPC: Book3S HV: Shared pages support for secure guests
KVM: PPC: Book3S HV: Support for running secure guests
mm: ksm: Export ksm_madvise()
KVM x86: Move kvm cpuid support out of svm
Here is the "big" tty and serial driver patches for 5.5-rc1. It's a bit
later in the merge window than normal as I wanted to make sure some
last-minute patches applied to it were all sane. They seem to be :)
There's a lot of little stuff in here, for the tty core, and for lots of
serial drivers:
- reverts of uartlite serial driver patches that were wrong
- msm-serial driver fixes
- serial core updates and fixes
- tty core fixes
- serial driver dma mapping api changes
- lots of other tiny fixes and updates for serial drivers
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXebFIQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylnmACgjfMcfQWa7uC9Q6m2DaQaRMaW6QoAnjg+TgBB
eW9EhvyXL2VbrsuUl+iH
=Am9O
-----END PGP SIGNATURE-----
Merge tag 'tty-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the "big" tty and serial driver patches for 5.5-rc1.
It's a bit later in the merge window than normal as I wanted to make
sure some last-minute patches applied to it were all sane. They seem
to be :)
There's a lot of little stuff in here, for the tty core, and for lots
of serial drivers:
- reverts of uartlite serial driver patches that were wrong
- msm-serial driver fixes
- serial core updates and fixes
- tty core fixes
- serial driver dma mapping api changes
- lots of other tiny fixes and updates for serial drivers
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (58 commits)
Revert "serial/8250: Add support for NI-Serial PXI/PXIe+485 devices"
vcs: prevent write access to vcsu devices
tty: vt: keyboard: reject invalid keycodes
tty: don't crash in tty_init_dev when missing tty_port
serial: stm32: fix clearing interrupt error flags
tty: Fix Kconfig indentation, continued
serial: serial_core: Perform NULL checks for break_ctl ops
tty: remove unused argument from tty_open_by_driver()
tty: Fix Kconfig indentation
{tty: serial, nand: onenand}: samsung: rename to fix build warning
serial: ifx6x60: add missed pm_runtime_disable
serial: pl011: Fix DMA ->flush_buffer()
Revert "serial-uartlite: Move the uart register"
Revert "serial-uartlite: Add get serial id if not provided"
Revert "serial-uartlite: Do not use static struct uart_driver out of probe()"
Revert "serial-uartlite: Add runtime support"
Revert "serial-uartlite: Change logic how console_port is setup"
Revert "serial-uartlite: Use allocated structure instead of static ones"
tty: serial: msm_serial: Use dma_request_chan() directly for channel request
tty: serial: tegra: Use dma_request_chan() directly for channel request
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl3leXUUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vyY3g/9FAVVdPEaadNtAhQ/zIxcjozDovKq
0q7yOA3aTBTUoNEinm88an6p0dcC4gNKtGukXmzVH2Hhxm9kLRdtpZGYY00tpLUB
9rI7XsgwwHa+hLwsHbIs507sKGFGy5FLr0ChTTGLDEMppnEvjA2hZooYmcB/OgrC
LlFcwbNKGOk/Si9u2bF2nLO0JDoVHnwzpF99saew/nqc7Lfj9e9IPZFom+VjPBUh
AOvRp2H7uBN+WQlpLeFeMDDoeXh34lX0kYqIV/cVkXVnknDGYKV2CBTg2aeX7jd0
QiPHZh6zlW8zNQgaCZRiBAbatVEOnRMRJ++yiqB8hBYp1LMXm6kJ01YSQpXkugoY
Vp9dtzzTARWV/XkKwD4brw9ZEmIDnO+Ed2x2VbUkPJVcXAvzSQWAx82IU0Iuqmcb
9qr6U2Zf/Xk5aFlGPYVH8QOG+QqzIbZNRQ7NlhDlITyW4P6QPu0mw374yYP2wDGL
sP5YSS3YGa0sQcEgDtVnd4z+WTZI4AwXLPaeaLkDhdfHp2FsERUY4TrPs33J99xw
og4EyokVFzjYzlnBPU6WWn7LL+jj5ccXkL3MA4DR4FJOnNGHh7NXfQUH56rrgsq7
F9/8shL5DuTbQkde1uSyUG9Iq/RigVLlV5DQavFm3dSXvZi0E16t5alC5URNTzk7
at8Bogn53QhlmYc=
=uUXw
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Warn if a host bridge has no NUMA info (Yunsheng Lin)
- Add PCI_STD_NUM_BARS for the number of standard BARs (Denis
Efremov)
Resource management:
- Fix boot-time Embedded Controller GPE storm caused by incorrect
resource assignment after ACPI Bus Check Notification (Mika
Westerberg)
- Protect pci_reassign_bridge_resources() against concurrent
addition/removal (Benjamin Herrenschmidt)
- Fix bridge dma_ranges resource list cleanup (Rob Herring)
- Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control
the MMIO and prefetchable MMIO window sizes of hotplug bridges
independently (Nicholas Johnson)
- Fix MMIO/MMIO_PREF window assignment that assigned more space than
desired (Nicholas Johnson)
- Only enforce bus numbers from bridge EA if the bridge has EA
devices downstream (Subbaraya Sundeep)
- Consolidate DT "dma-ranges" parsing and convert all host drivers to
use shared parsing (Rob Herring)
Error reporting:
- Restore AER capability after resume (Mayurkumar Patel)
- Add PoisonTLPBlocked AER counter (Rajat Jain)
- Use for_each_set_bit() to simplify AER code (Andy Shevchenko)
- Fix AER kernel-doc (Andy Shevchenko)
- Add "pcie_ports=dpc-native" parameter to allow native use of DPC
even if platform didn't grant control over AER (Olof Johansson)
Hotplug:
- Avoid returning prematurely from sysfs requests to enable or
disable a PCIe hotplug slot (Lukas Wunner)
- Don't disable interrupts twice when suspending hotplug ports (Mika
Westerberg)
- Fix deadlocks when PCIe ports are hot-removed while suspended (Mika
Westerberg)
Power management:
- Remove unnecessary ASPM locking (Bjorn Helgaas)
- Add support for disabling L1 PM Substates (Heiner Kallweit)
- Allow re-enabling Clock PM after it has been disabled (Heiner
Kallweit)
- Add sysfs attributes for controlling ASPM link states (Heiner
Kallweit)
- Remove CONFIG_PCIEASPM_DEBUG, including "link_state" and "clk_ctl"
sysfs files (Heiner Kallweit)
- Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on
USB 2.0 or 1.1 connect events (Kai-Heng Feng)
- Move power state check out of pci_msi_supported() (Bjorn Helgaas)
- Fix incorrect MSI-X masking on resume and revert related nvme quirk
for Kingston NVME SSD running FW E8FK11.T (Jian-Hong Pan)
- Always return devices to D0 when thawing to fix hibernation with
drivers like mlx4 that used legacy power management (previously we
only did it for drivers with new power management ops) (Dexuan Cui)
- Clear PCIe PME Status even for legacy power management (Bjorn
Helgaas)
- Fix PCI PM documentation errors (Bjorn Helgaas)
- Use dev_printk() for more power management messages (Bjorn Helgaas)
- Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas)
- Convert xen-platform from legacy to generic power management (Bjorn
Helgaas)
- Removed unused .resume_early() and .suspend_late() legacy power
management hooks (Bjorn Helgaas)
- Rearrange power management code for clarity (Rafael J. Wysocki)
- Decode power states more clearly ("4" or "D4" really refers to
"D3cold") (Bjorn Helgaas)
- Notice when reading PM Control register returns an error (~0)
instead of interpreting it as being in D3hot (Bjorn Helgaas)
- Add missing link delays required by the PCIe spec (Mika Westerberg)
Virtualization:
- Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI (Bjorn
Helgaas)
- Allow VFs to use PRI (the PF PRI is shared by the VFs, but the code
previously didn't recognize that) (Kuppuswamy Sathyanarayanan)
- Allow VFs to use PASID (the PF PASID capability is shared by the
VFs, but the code previously didn't recognize that) (Kuppuswamy
Sathyanarayanan)
- Disconnect PF and VF ATS enablement, since ATS in PFs and
associated VFs can be enabled independently (Kuppuswamy
Sathyanarayanan)
- Cache PRI and PASID capability offsets (Kuppuswamy Sathyanarayanan)
- Cache the PRI PRG Response PASID Required bit (Bjorn Helgaas)
- Consolidate ATS declarations in linux/pci-ats.h (Krzysztof
Wilczynski)
- Remove unused PRI and PASID stubs (Bjorn Helgaas)
- Removed unnecessary EXPORT_SYMBOL_GPL() from ATS, PRI, and PASID
interfaces that are only used by built-in IOMMU drivers (Bjorn
Helgaas)
- Hide PRI and PASID state restoration functions used only inside the
PCI core (Bjorn Helgaas)
- Add a DMA alias quirk for the Intel VCA NTB (Slawomir Pawlowski)
- Serialize sysfs sriov_numvfs reads vs writes (Pierre Crégut)
- Update Cavium ACS quirk for ThunderX2 and ThunderX3 (George
Cherian)
- Fix the UPDCR register address in the Intel ACS quirk (Steffen
Liebergeld)
- Unify ACS quirk implementations (Bjorn Helgaas)
Amlogic Meson host bridge driver:
- Fix meson PERST# GPIO polarity problem (Remi Pommarel)
- Add DT bindings for Amlogic Meson G12A (Neil Armstrong)
- Fix meson clock names to match DT bindings (Neil Armstrong)
- Add meson support for Amlogic G12A SoC with separate shared PHY
(Neil Armstrong)
- Add meson extended PCIe PHY functions for Amlogic G12A USB3+PCIe
combo PHY (Neil Armstrong)
- Add arm64 DT for Amlogic G12A PCIe controller node (Neil Armstrong)
- Add commented-out description of VIM3 USB3/PCIe mux in arm64 DT
(Neil Armstrong)
Broadcom iProc host bridge driver:
- Invalidate iProc PAXB address mapping before programming it
(Abhishek Shah)
- Fix iproc-msi and mvebu __iomem annotations (Ben Dooks)
Cadence host bridge driver:
- Refactor Cadence PCIe host controller to use as a library for both
host and endpoint (Tom Joseph)
Freescale Layerscape host bridge driver:
- Add layerscape LS1028a support (Xiaowei Bao)
Intel VMD host bridge driver:
- Add VMD bus 224-255 restriction decode (Jon Derrick)
- Add VMD 8086:9A0B device ID (Jon Derrick)
- Remove Keith from VMD maintainer list (Keith Busch)
Marvell ARMADA 3700 / Aardvark host bridge driver:
- Use LTSSM state to build link training flag since Aardvark doesn't
implement the Link Training bit (Remi Pommarel)
- Delay before training Aardvark link in case PERST# was asserted
before the driver probe (Remi Pommarel)
- Fix Aardvark issues with Root Control reads and writes (Remi
Pommarel)
- Don't rely on jiffies in Aardvark config access path since
interrupts may be disabled (Remi Pommarel)
- Fix Aardvark big-endian support (Grzegorz Jaszczyk)
Marvell ARMADA 370 / XP host bridge driver:
- Make mvebu_pci_bridge_emul_ops static (Ben Dooks)
Microsoft Hyper-V host bridge driver:
- Add hibernation support for Hyper-V virtual PCI devices (Dexuan
Cui)
- Track Hyper-V pci_protocol_version per-hbus, not globally (Dexuan
Cui)
- Avoid kmemleak false positive on hv hbus buffer (Dexuan Cui)
Mobiveil host bridge driver:
- Change mobiveil csr_read()/write() function names that conflict
with riscv arch functions (Kefeng Wang)
NVIDIA Tegra host bridge driver:
- Fix Tegra CLKREQ dependency programming (Vidya Sagar)
Renesas R-Car host bridge driver:
- Remove unnecessary header include from rcar (Andrew Murray)
- Tighten register index checking for rcar inbound range programming
(Marek Vasut)
- Fix rcar inbound range alignment calculation to improve packing of
multiple entries (Marek Vasut)
- Update rcar MACCTLR setting to match documentation (Yoshihiro
Shimoda)
- Clear bit 0 of MACCTLR before PCIETCTLR.CFINIT per manual
(Yoshihiro Shimoda)
- Add Marek Vasut and Yoshihiro Shimoda as R-Car maintainers (Simon
Horman)
Rockchip host bridge driver:
- Make rockchip 0V9 and 1V8 power regulators non-optional (Robin
Murphy)
Socionext UniPhier host bridge driver:
- Set uniphier to host (RC) mode always (Kunihiko Hayashi)
Endpoint drivers:
- Fix endpoint driver sign extension problem when shifting page
number to phys_addr_t (Alan Mikhak)
Misc:
- Add NumaChip SPDX header (Krzysztof Wilczynski)
- Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski)
- Remove unused includes (Krzysztof Wilczynski)
- Removed unused sysfs attribute groups (Ben Dooks)
- Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas)
- Add PCIe Link Control 2 register field definitions to replace magic
numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas)
- Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and
Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas)
- Use pcie_capability_read_word() instead of pci_read_config_word()
in AMDGPU and Radeon CIK/SI (Frederick Lawler)
- Remove unused pci_irq_get_node() Greg Kroah-Hartman)
- Make asm/msi.h mandatory and simplify PCI_MSI_IRQ_DOMAIN Kconfig
(Palmer Dabbelt, Michal Simek)
- Read all 64 bits of Switchtec part_event_bitmap (Logan Gunthorpe)
- Fix erroneous intel-iommu dependency on CONFIG_AMD_IOMMU (Bjorn
Helgaas)
- Fix bridge emulation big-endian support (Grzegorz Jaszczyk)
- Fix dwc find_next_bit() usage (Niklas Cassel)
- Fix pcitest.c fd leak (Hewenliang)
- Fix typos and comments (Bjorn Helgaas)
- Fix Kconfig whitespace errors (Krzysztof Kozlowski)"
* tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (160 commits)
PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist
asm-generic: Make msi.h a mandatory include/asm header
Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"
PCI/MSI: Fix incorrect MSI-X masking on resume
PCI/MSI: Move power state check out of pci_msi_supported()
PCI/MSI: Remove unused pci_irq_get_node()
PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer
PCI: hv: Change pci_protocol_version to per-hbus
PCI: hv: Add hibernation support
PCI: hv: Reorganize the code in preparation of hibernation
MAINTAINERS: Remove Keith from VMD maintainer
PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code
PCI/ASPM: Add sysfs attributes for controlling ASPM link states
PCI: Fix indentation
drm/radeon: Prefer pcie_capability_read_word()
drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
drm/radeon: Correct Transmit Margin masks
drm/amdgpu: Prefer pcie_capability_read_word()
PCI: uniphier: Set mode register to host mode
drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
...
If this flag is set, applications can be certain that any data for
async offload has been consumed when the kernel has consumed the
SQE.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
plus a whole load of minor updates and fixes. The two major core
changes are Al Viro's reworking of sg's handling of copy to/from user,
Ming Lei's removal of the host busy counter to avoid contention in the
multiqueue case and Damien Le Moal's fixing of residual tracking
across error handling.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
HkiUww3zbcgl0FWFkUM=
=cdVU
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: aacraid, ufs, zfcp,
NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
plus a whole load of minor updates and fixes.
The major core changes are Al Viro's reworking of sg's handling of
copy to/from user, Ming Lei's removal of the host busy counter to
avoid contention in the multiqueue case and Damien Le Moal's fixing of
residual tracking across error handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
scsi: target: core: Fix a pr_debug() argument
scsi: iscsi: Don't send data to unbound connection
scsi: target: iscsi: Wait for all commands to finish before freeing a session
scsi: target: core: Release SPC-2 reservations when closing a session
scsi: target: core: Document target_cmd_size_check()
scsi: bnx2i: fix potential use after free
Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
scsi: NCR5380: Add disconnect_mask module parameter
scsi: NCR5380: Unconditionally clear ICR after do_abort()
scsi: NCR5380: Call scsi_set_resid() on command completion
scsi: scsi_debug: num_tgts must be >= 0
scsi: lpfc: use hdwq assigned cpu for allocation
scsi: arcmsr: fix indentation issues
scsi: qla4xxx: fix double free bug
scsi: pm80xx: Modified the logic to collect fatal dump
scsi: pm80xx: Tie the interrupt name to the module instance
scsi: pm80xx: Controller fatal error through sysfs
scsi: pm80xx: Do not request 12G sas speeds
scsi: pm80xx: Cleanup command when a reset times out
...
Including:
- Conversion of the AMD IOMMU driver to use the dma-iommu code
for imlementing the DMA-API. This gets rid of quite some code
in the driver itself, but also has some potential for
regressions (non are known at the moment).
- Support for the Qualcomm SMMUv2 implementation in the SDM845
SoC. This also includes some firmware interface changes, but
those are acked by the respective maintainers.
- Preparatory work to support two distinct page-tables per
domain in the ARM-SMMU driver
- Power management improvements for the ARM SMMUv2
- Custom PASID allocator support
- Multiple PCI DMA alias support for the AMD IOMMU driver
- Adaption of the Mediatek driver to the changed IO/TLB flush
interface of the IOMMU core code.
- Preparatory patches for the Renesas IOMMU driver to support
future hardware.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl3hCC4ACgkQK/BELZcB
GuN9QQ/+LFh2TdhtiemIakA/19nv1FTP719uje7vjX4gGBGD++NEzW7mBcAXSEnD
rBta1GsD6N8h0fdT53Nw8cezQ1ldBomKG3j+mzcju7TcuRwebhCEQaxh2iWy+I6g
cp6HxTu3G0E6Zy7wd+MWyJzvXa7MXV2p8iCDs7Dp8yEow+c55b4LAIoeRWx3rjsT
rat29MuJ8TGLP6vOYHcpI+REGfda4rsog75980RIoOEuqRjMG6JPj9clPeakSNtQ
Rl1EtgrDskbRCgDSujbzDMHAYRUKvdCuTuTM1De/GQO+GWYsOtzqBHkct67sGn9I
H518Be9m4xfYyyktVM6K9bSpxzCOtor+u6LFOejufJN/7vL2qtePZX7EHL/ks8zh
Mn80H/1ch1UcFcF9p7V7QCMUSyaiX/VWhgwWIdPf3CGrKVaLnQ8mkB82Zf0VNuQT
OzcfJcVF+skhDkXdFL5xUkQtqqTHhpaK2CzvvTDAsR1KXMCc6mH/MT/9m+mOFQFK
P+klgGdU5rVniru10k4pamT5LlLubRV0NBpaAiGr2R3dfyYyiS/D9FBSLanqO+JM
AgSnmOSbl7y927DxufkVPH8M7TxSdtQVo7VoQFjSWE8B9bh4qU6MVV30enLvY0Fj
g4DP+8srOvY0vNsWNiBe2JpldABGEAbumFt78g1WV2tFi1d/NUI=
=ntaE
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Conversion of the AMD IOMMU driver to use the dma-iommu code for
imlementing the DMA-API. This gets rid of quite some code in the
driver itself, but also has some potential for regressions (non are
known at the moment).
- Support for the Qualcomm SMMUv2 implementation in the SDM845 SoC.
This also includes some firmware interface changes, but those are
acked by the respective maintainers.
- Preparatory work to support two distinct page-tables per domain in
the ARM-SMMU driver
- Power management improvements for the ARM SMMUv2
- Custom PASID allocator support
- Multiple PCI DMA alias support for the AMD IOMMU driver
- Adaption of the Mediatek driver to the changed IO/TLB flush interface
of the IOMMU core code.
- Preparatory patches for the Renesas IOMMU driver to support future
hardware.
* tag 'iommu-updates-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (62 commits)
iommu/rockchip: Don't provoke WARN for harmless IRQs
iommu/vt-d: Turn off translations at shutdown
iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved
iommu/arm-smmu: Remove duplicate error message
iommu/arm-smmu-v3: Don't display an error when IRQ lines are missing
iommu/ipmmu-vmsa: Add utlb_offset_base
iommu/ipmmu-vmsa: Add helper functions for "uTLB" registers
iommu/ipmmu-vmsa: Calculate context registers' offset instead of a macro
iommu/ipmmu-vmsa: Add helper functions for MMU "context" registers
iommu/ipmmu-vmsa: tidyup register definitions
iommu/ipmmu-vmsa: Remove all unused register definitions
iommu/mediatek: Reduce the tlb flush timeout value
iommu/mediatek: Get rid of the pgtlock
iommu/mediatek: Move the tlb_sync into tlb_flush
iommu/mediatek: Delete the leaf in the tlb_flush
iommu/mediatek: Use gather to achieve the tlb range flush
iommu/mediatek: Add a new tlb_lock for tlb_flush
iommu/mediatek: Correct the flush_iotlb_all callback
iommu/io-pgtable-arm: Rename IOMMU_QCOM_SYS_CACHE and improve doc
iommu/io-pgtable-arm: Rationalise MAIR handling
...
Pull HID updates from Jiri Kosina:
- Support for Logitech G15 (Hans de Goede)
- HID parser improvements, improving support for some devices; e.g.
Windows Precision Touchpad, products from Primax, etc. (Blaž
Hrastnik, Candle Sun)
- robustification of tablet mode support in google-whiskers driver
(Dmitry Torokhov)
- assorted small fixes, device-specific quirks and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (23 commits)
HID: rmi: Check that the RMI_STARTED bit is set before unregistering the RMI transport device
HID: quirks: remove hid-led devices from hid_have_special_driver
HID: Improve Windows Precision Touchpad detection.
HID: i2c-hid: Reset ALPS touchpads on resume
HID: i2c-hid: fix no irq after reset on raydium 3118
HID: logitech-hidpp: Silence intermittent get_battery_capacity errors
HID: i2c-hid: remove orphaned member sleep_delay
HID: quirks: Add quirk for HP MSU1465 PIXART OEM mouse
HID: core: check whether Usage Page item is after Usage ID items
HID: intel-ish-hid: Spelling s/diconnect/disconnect/
HID: google: Detect base folded usage instead of hard-coding whiskers
HID: logitech: Add depends on LEDS_CLASS to Logitech Kconfig entry
HID: lg-g15: Add support for the G510's M1-M3 and MR LEDs
HID: lg-g15: Add support for controlling the G510's RGB backlight
HID: lg-g15: Add support for the G510 keyboards' gaming keys
HID: lg-g15: Add support for the M1-M3 and MR LEDs
HID: lg-g15: Add keyboard and LCD backlight control
HID: Add driver for Logitech gaming keyboards (G15, G15 v2)
Input: Add event-codes for macro keys found on various keyboards
HID: hidraw: replace printk() with corresponding pr_xx() variant
...
Core changes:
- Expose pull up/down flags for the GPIO character device to
userspace. After clear input from the RaspberryPi and Beagle
communities, it has been established that prototyping,
industrial automation and make communities strongly need
this feature, and as we want people to use the character
device, we have implemented the simple pull up/down
interface for GPIO lines. This means we can specify that
a (chip-specific) pull up/down resistor can be enabled,
but does not offer fine-grained control such as cases
where the resistance of the same pull resistor can be
controlled (yet).
- Introduce devm_fwnode_gpiod_get_index() and start to phase out
the old symbol devm_fwnode_get_index_gpiod_from_child().
- A bit of documentation clean-up work.
- Introduce a define for GPIO line directions and deploy it
in all GPIO drivers in the drivers/gpio directory.
- Add a special callback to populate pin ranges when
cooperating with the pin control subsystem and registering
ranges as part of adding a gpiolib driver and a
gpio_irq_chip driver at the same time. This is also
deployed in the Intel Merrifield driver.
New drivers:
- RDA Micro GPIO controller.
- XGS-iproc GPIO driver.
Driver improvements:
- Wake event and debounce support on the Tegra 186 driver.
- Finalize the Aspeed SGPIO driver.
- MPC8xxx uses a normal IRQ handler rather than a chained
handler.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl3hIDcACgkQQRCzN7AZ
XXOOyw/8DcaBV6j3EZPDS+b+N74/flNf9JitdJRCtUPn8mjdm+uKNPxbtL/znZc8
zd3rlpiZOqHy8klB3gOPJsJhgXY9QX/b+F5j8fHZvu0DACugRndCqMJ7wrgwUiPn
2ni6KJmz6z5urhcfaAIgyinTWOMegvSVfqjISaUCRCAg3F9dIeQoulRvTPk2ybvv
f31jAGemGbvvQPpd81SYCxuTbMg+jxBIgnOCaX+VmUaBLzh8J2W4It3Myp6M4Sbg
Td6QU4b2J2Q0quk/Dc7c4saT+qRODkg2syPKV2YqmWwdLRDxZyKESMdkCXLWlWJU
fP+KZ4lDxhCaOAYUrY2sEAYMw4E8MzrfeWikdIe0nk0DWqNQhWvDyzsNsB90XGFb
aGgeCPH2W1MdE6usPhLidBaHbLeowzndw5BiEl0UCJUqz7tzTCd5iMIAhoSU/Sr5
ymO8J45G9rdx5pscA3cXhpR/PmqaETYQ/uNrLuxTdI4F4xY12+M0vPrV8z3oDPxB
U/uL0v6HndDcFAavQQiMd9eL6Hocirnn+Z2xFut3nOznHY96ozXSnZb3lzvH/kqI
Du2C8geboVcZsiZJTKVN1zxnfIA8oDauzTOEpGFbIGFhmy0zt4RRRptL4W8NxeFm
KCOk/HqlGeuqJ49epta3mqhUC0MSASA9fdicCdiDqvw+puEznwU=
=o13I
-----END PGP SIGNATURE-----
Merge tag 'gpio-v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v5.5 kernel cycle
Core changes:
- Expose pull up/down flags for the GPIO character device to
userspace.
After clear input from the RaspberryPi and Beagle communities, it
has been established that prototyping, industrial automation and
make communities strongly need this feature, and as we want people
to use the character device, we have implemented the simple pull
up/down interface for GPIO lines.
This means we can specify that a (chip-specific) pull up/down
resistor can be enabled, but does not offer fine-grained control
such as cases where the resistance of the same pull resistor can be
controlled (yet).
- Introduce devm_fwnode_gpiod_get_index() and start to phase out the
old symbol devm_fwnode_get_index_gpiod_from_child().
- A bit of documentation clean-up work.
- Introduce a define for GPIO line directions and deploy it in all
GPIO drivers in the drivers/gpio directory.
- Add a special callback to populate pin ranges when cooperating with
the pin control subsystem and registering ranges as part of adding
a gpiolib driver and a gpio_irq_chip driver at the same time. This
is also deployed in the Intel Merrifield driver.
New drivers:
- RDA Micro GPIO controller.
- XGS-iproc GPIO driver.
Driver improvements:
- Wake event and debounce support on the Tegra 186 driver.
- Finalize the Aspeed SGPIO driver.
- MPC8xxx uses a normal IRQ handler rather than a chained handler"
* tag 'gpio-v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (64 commits)
gpio: Add TODO item for regmap helper
Documentation: gpio: driver.rst: Fix warnings
gpio: of: Fix bogus reference to gpiod_get_count()
gpiolib: Grammar s/manager/managed/
gpio: lynxpoint: Setup correct IRQ handlers
MAINTAINERS: Replace my email by one @kernel.org
gpiolib: acpi: Make acpi_gpiochip_alloc_event always return AE_OK
gpio/mpc8xxx: fix qoriq GPIO reading
gpio: mpc8xxx: Don't overwrite default irq_set_type callback
gpiolib: acpi: Print pin number on acpi_gpiochip_alloc_event errors
gpiolib: fix coding style in gpiod_hog()
drm/bridge: ti-tfp410: switch to using fwnode_gpiod_get_index()
gpio: merrifield: Pass irqchip when adding gpiochip
gpio: merrifield: Add GPIO <-> pin mapping ranges via callback
gpiolib: Introduce ->add_pin_ranges() callback
gpio: mmio: remove untrue leftover comment
gpio: em: Use platform_get_irq() to obtain interrupts
gpio: tegra186: Add debounce support
gpio: tegra186: Program interrupt route mapping
gpio: tegra186: Derive register offsets from bank/port
...
This is a series of cleanups for the y2038 work, mostly intended
for namespace cleaning: the kernel defines the traditional
time_t, timeval and timespec types that often lead to y2038-unsafe
code. Even though the unsafe usage is mostly gone from the kernel,
having the types and associated functions around means that we
can still grow new users, and that we may be missing conversions
to safe types that actually matter.
There are still a number of driver specific patches needed to
get the last users of these types removed, those have been
submitted to the respective maintainers.
Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJd3D+wAAoJEJpsee/mABjZfdcQAJvl6e+4ddKoDMIVJqVCE25N
meFRgA7S8jy6BefEVeUgI8TxK+amGO36szMBUEnZxSSxq9u+gd13m5bEK6Xq/ov7
4KTAiA3Irm/W5FBTktu1zc5ROIra1Xj7jLdubf8wEC3viSXIXB3+68Y28iBN7D2O
k9kSpwINC5lWeC8guZy2I+2yc4ywUEXao9nVh8C/J+FQtU02TcdLtZop9OhpAa8u
U19VVH3WHkQI7ZfLvBTUiYK6tlYTiYCnpr8l6sm850CnVv1fzBW+DzmVhPJ6FdFd
4m5staC0sQ6gVqtjVMBOtT5CdzREse6hpwbKo2GRWFroO5W9tljMOJJXHvv/f6kz
DxrpUmj37JuRbqAbr8KDmQqPo6M2CRkxFxjol1yh5ER63u1xMwLm/PQITZIMDvPO
jrFc2C2SdM2E9bKP/RMCVoKSoRwxCJ5IwJ2AF237rrU0sx/zB2xsrOGssx5CWEgc
3bbk6tDQujJJubnCfgRy1tTxpLZOHEEKw8YhFLLbR2LCtA9pA/0rfLLad16cjA5e
5jIHxfsFc23zgpzrJeB7kAF/9xgu1tlA5BotOs3VBE89LtWOA9nK5dbPXng6qlUe
er3xLCfS38ovhUw6DusQpaYLuaYuLM7DKO4iav9kuTMcY9GkbPk7vDD3KPGh2goy
hY5cSM8+kT1q/THLnUBH
=Bdbv
-----END PGP SIGNATURE-----
Merge tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 cleanups from Arnd Bergmann:
"y2038 syscall implementation cleanups
This is a series of cleanups for the y2038 work, mostly intended for
namespace cleaning: the kernel defines the traditional time_t, timeval
and timespec types that often lead to y2038-unsafe code. Even though
the unsafe usage is mostly gone from the kernel, having the types and
associated functions around means that we can still grow new users,
and that we may be missing conversions to safe types that actually
matter.
There are still a number of driver specific patches needed to get the
last users of these types removed, those have been submitted to the
respective maintainers"
Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
* tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits)
y2038: alarm: fix half-second cut-off
y2038: ipc: fix x32 ABI breakage
y2038: fix typo in powerpc vdso "LOPART"
y2038: allow disabling time32 system calls
y2038: itimer: change implementation to timespec64
y2038: move itimer reset into itimer.c
y2038: use compat_{get,set}_itimer on alpha
y2038: itimer: compat handling to itimer.c
y2038: time: avoid timespec usage in settimeofday()
y2038: timerfd: Use timespec64 internally
y2038: elfcore: Use __kernel_old_timeval for process times
y2038: make ns_to_compat_timeval use __kernel_old_timeval
y2038: socket: use __kernel_old_timespec instead of timespec
y2038: socket: remove timespec reference in timestamping
y2038: syscalls: change remaining timeval to __kernel_old_timeval
y2038: rusage: use __kernel_old_timeval
y2038: uapi: change __kernel_time_t to __kernel_old_time_t
y2038: stat: avoid 'time_t' in 'struct stat'
y2038: ipc: remove __kernel_time_t reference from headers
y2038: vdso: powerpc: avoid timespec references
...
As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need support
for time64_t.
In completely unrelated work, I spent time on cleaning up parts of this
file in the past, moving things out into drivers instead.
After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the rest
of it and move it all into drivers.
This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which is
the scsi ioctl handlers. I have patches for those as well, but they need
more testing or possibly a rewrite.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJdsHCdAAoJEJpsee/mABjZtYkP/1JGl3jFv3Iq/5BCdPkaePP1
RtMJRNfURgK3GeuHUui330PvVjI/pLWXU/VXMK2MPTASpJLzYz3uCaZrpVWEMpDZ
+ImzGmgJkITlW1uWU3zOcQhOxTyb1hCZ0Ci+2xn9QAmyOL7prXoXCXDWv3h6iyiF
lwG+nW+HNtyx41YG+9bRfKNoG0ZJ+nkJ70BV6u0acQHXWn7Xuupa9YUmBL87hxAL
6dlJfLTJg6q8QSv/Q6LxslfWk2Ti8OOJZOwtFM5R8Bgl0iUcvshiRCKfv/3t9jXD
dJNvF1uq8z+gracWK49Qsfq5dnZ2ZxHFUo9u0NjbCrxNvWH/sdvhbaUBuJI75seH
VIznCkdxFhrqitJJ8KmxANxG08u+9zSKjSlxG2SmlA4qFx/AoStoHwQXcogJscNb
YIXYKmWBvwPzYu09QFAXdHFPmZvp/3HhMWU6o92lvDhsDwzkSGt3XKhCJea4DCaT
m+oCcoACqSWhMwdbJOEFofSub4bY43s5iaYuKes+c8O261/Dwg6v/pgIVez9mxXm
TBnvCsotq5m8wbwzv99eFqGeJH8zpDHrXxEtRR5KQqMqjLq/OQVaEzmpHZTEuK7n
e/V/PAKo2/V63g4k6GApQXDxnjwT+m0aWToWoeEzPYXS6KmtWC91r4bWtslu3rdl
bN65armTm7bFFR32Avnu
=lgCl
-----END PGP SIGNATURE-----
Merge tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
"As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need
support for time64_t.
In completely unrelated work, I spent time on cleaning up parts of
this file in the past, moving things out into drivers instead.
After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the
rest of it and move it all into drivers.
This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which
is the scsi ioctl handlers. I have patches for those as well, but they
need more testing or possibly a rewrite"
* tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
scsi: sd: enable compat ioctls for sed-opal
pktcdvd: add compat_ioctl handler
compat_ioctl: move SG_GET_REQUEST_TABLE handling
compat_ioctl: ppp: move simple commands into ppp_generic.c
compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
compat_ioctl: unify copy-in of ppp filters
tty: handle compat PPP ioctls
compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
compat_ioctl: handle SIOCOUTQNSD
af_unix: add compat_ioctl support
compat_ioctl: reimplement SG_IO handling
compat_ioctl: move WDIOC handling into wdt drivers
fs: compat_ioctl: move FITRIM emulation into file systems
gfs2: add compat_ioctl support
compat_ioctl: remove unused convert_in_user macro
compat_ioctl: remove last RAID handling code
compat_ioctl: remove /dev/raw ioctl translation
compat_ioctl: remove PCI ioctl translation
compat_ioctl: remove joystick ioctl translation
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl3dbM4UHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMagw/+MiOlIHFykjK0NGOyUbXR7AwjdrHz
1Fgkh7VZCAHMfCcdlljIIkYe7P+ybfdK51E1QLBiTPGh353JJRAvrjFbDcIyT4kf
u7AVddUeT0QQefFs39ZFWTV0mvJjDBfjFkmiL3cdY9ulx4ZX8V426qjyl8KIwTHe
YkQF3pYMmO28G1SfZu298zTmrFRA10FezxCUbBRZTTE9FcD7mnGQRB7w/wY9t0H1
ebIDgDA0EBh3oznGD3qxD63b5ULdSrImTzvSmEfhzZZsoZB4XGO5MOnLRqgBwFbT
qfRSRfHVl+1JtDHCU43zOUlzScAsff5mvfVNdDMLAFtGGSjDGxgxKNyLwp8+5wmH
GzvB99QQECvCN+gbaedm6adWBGzi7vpoCcgfqY0UPLYvCqsNFbZw4U/iu5ONKWy/
cEGVUGzHf0pWofugDIJfGQuRt6iS2XT9Ode6+QMx8OiC3auZcluhTuJSxMQIhFZ1
5XmoHOQddBwlalmIx8fsIHSo6xsAjNFwTOikEFVzUB+ECR2Urs8eHvQj+d94xz6e
q9LrNkt/eVIDiI+PeP+UBD5IJlfmRSoyUd/mIDFMfmqMBubBSQl70Eyt3dUdt1Bz
0PBs6xjYztpgk3Q7s35TMn8EvDcEksG0WEoFM5fEohYPLWQf8tJ5BbyYJJqc0sod
ExlXu1lPfg9ppKc=
=fc7R
-----END PGP SIGNATURE-----
Merge tag 'audit-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"Audit is back for v5.5, albeit with only two patches:
- Allow for the auditing of suspicious O_CREAT usage via the new
AUDIT_ANOM_CREAT record.
- Remove a redundant if-conditional check found during code analysis.
It's a minor change, but when the pull request is only two patches
long, you need filler in the pull request email"
[ Heh on the pull request filler. I wish more people tried to write
better pull request messages, even if maybe it's not worth it for the
trivial cases ;^) - Linus ]
* tag 'audit-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: remove redundant condition check in kauditd_thread()
audit: Report suspicious O_CREAT usage
Highlights:
- Infrastructure for secure boot on some bare metal Power9 machines. The
firmware support is still in development, so the code here won't actually
activate secure boot on any existing systems.
- A change to xmon (our crash handler / pseudo-debugger) to restrict it to
read-only mode when the kernel is lockdown'ed, otherwise it's trivial to drop
into xmon and modify kernel data, such as the lockdown state.
- Support for KASLR on 32-bit BookE machines (Freescale / NXP).
- Fixes for our flush_icache_range() and __kernel_sync_dicache() (VDSO) to work
with memory ranges >4GB.
- Some reworks of the pseries CMM (Cooperative Memory Management) driver to
make it behave more like other balloon drivers and enable some cleanups of
generic mm code.
- A series of fixes to our hardware breakpoint support to properly handle
unaligned watchpoint addresses.
Plus a bunch of other smaller improvements, fixes and cleanups.
Thanks to:
Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V, Anthony Steinhauser,
Cédric Le Goater, Chris Packham, Chris Smart, Christophe Leroy, Christopher M.
Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Hildenbrand,
Deb McLemore, Diana Craciun, Eric Richter, Geert Uytterhoeven, Greg
Kroah-Hartman, Greg Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason
Yan, Krzysztof Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M.
Rodrigues, Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes, Ravi
Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth, Tyrel
Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl3hBycTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgApBEACk91MEQDYJ9MF9I6uN+85qb5p4pMsp
rGzqnpt+XFidbDAc3eP63pYfIDSo3jtkQ2YL7shAnDOTvkO0md+Vqkl9Aq/G6FIf
lDBlwbgkXMSxS/O2Lpvfn4NZAoK6dKmiV55LSgfliM62X3e2Saeg6TR55wBTgJ6/
SlYPDwZfcVHOAiFS3UmfB+hkiIZk+AI5Zr5VAZvT2ZmeH36yAWkq4JgJI1uAk6m1
/7iCnlfUjx/nl/BhnA3kjjmAgGCJ5s/WuVgwFMz47XpMBWGBhLWpMh/NqDTFb8ca
kpiVQoVPQe2xyO3pL/kOwBy6sii26ftfHDhLKMy1hJdEhVQzS5LerPIMeh1qsU8Q
hV/Cj+jfsrS/vBDOehj3jwx93t+861PmTOqgLnpYQ6Ltrt+2B/74+fufGMHE1kI3
Ffo7xvNw4sw6bSziDxDFqUx2P1dFN5D5EJsJsYM98ekkVAAkzNqCDRvfD2QI8Pif
VXWPYXqtNJTrVPJA0D7Yfo9FDNwhANd0f1zi7r/U5mVXBFUyKOlGqTQSkXgMrVeK
3I7wHPOVGgdA5UUkfcd3pcuqsY081U9E//o5PUfj8ybO5JCwly8NoatbG+xHmKia
a72uJT8MjCo9mGCHKDrwi9l/kqms6ZSv8RP+yMhGuB52YoiGc6PpVyab5jXIUd1N
yTtBlC0YGW1JYw==
=JHzg
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Infrastructure for secure boot on some bare metal Power9 machines.
The firmware support is still in development, so the code here
won't actually activate secure boot on any existing systems.
- A change to xmon (our crash handler / pseudo-debugger) to restrict
it to read-only mode when the kernel is lockdown'ed, otherwise it's
trivial to drop into xmon and modify kernel data, such as the
lockdown state.
- Support for KASLR on 32-bit BookE machines (Freescale / NXP).
- Fixes for our flush_icache_range() and __kernel_sync_dicache()
(VDSO) to work with memory ranges >4GB.
- Some reworks of the pseries CMM (Cooperative Memory Management)
driver to make it behave more like other balloon drivers and enable
some cleanups of generic mm code.
- A series of fixes to our hardware breakpoint support to properly
handle unaligned watchpoint addresses.
Plus a bunch of other smaller improvements, fixes and cleanups.
Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V,
Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart,
Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio
Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana
Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg
Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof
Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues,
Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes,
Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth,
Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing"
* tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits)
powerpc/fixmap: fix crash with HIGHMEM
x86/efi: remove unused variables
powerpc: Define arch_is_kernel_initmem_freed() for lockdep
powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp
powerpc: Avoid clang warnings around setjmp and longjmp
powerpc: Don't add -mabi= flags when building with Clang
powerpc: Fix Kconfig indentation
powerpc/fixmap: don't clear fixmap area in paging_init()
selftests/powerpc: spectre_v2 test must be built 64-bit
powerpc/powernv: Disable native PCIe port management
powerpc/kexec: Move kexec files into a dedicated subdir.
powerpc/32: Split kexec low level code out of misc_32.S
powerpc/sysdev: drop simple gpio
powerpc/83xx: map IMMR with a BAT.
powerpc/32s: automatically allocate BAT in setbat()
powerpc/ioremap: warn on early use of ioremap()
powerpc: Add support for GENERIC_EARLY_IOREMAP
powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt()
powerpc/8xx: use the fixmapped IMMR in cpm_reset()
powerpc/8xx: add __init to cpm1 init functions
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl3f/C0QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgprUjD/4t5ituIXmfPDulW4178LchRbAdC2FquGeU
Okft4laeQsJ8ucTBm3RAziVDNnkeJ0mLy5JTMpdPwKW+1F/d8jPuDeTlmmRyKGyW
vt2NlI9fKaOt8cU+LPp3MiGIamleZFOEPsRHU03HYASFcRKQwAswgjS2DZnR7oTK
66XGB9CdwREh1mM63wERYX07+t+ylocTClcLQDCi99H4HI5ScsJXB5aSqpombycy
XujZGlEAUT6WLp0HAwihhAnQIu/fEw+tXMWLvozphIP9xZGpk+a2nbd2WxTCdLZy
MoksoCpl76RmYPuZ+TC1L+tHf2KqGHa/mo64L/6v15QS8hROw3ZTMdJI0mEC/90g
5JwWuBqWOcRiMcV1Uwkz5OYL3KYQOWebS+FBbIc+2jqBlXJeN+1R/Ik5HP8xtGaE
5p317/q09h32piZN0c+BmkXCdnAYZeBez0EjSXf6j3PW3LAqmLOjxc0yA2OmwFWj
iudOfBOg48HtHkMaNrg2f+0THMP8zHqUVoRh96BKCSjnOXmbxGFt3ACHvb+VS0CX
RmlVOcaq+SzL1EsT805Hb6N3x7mK2l9CoanGfLxYXQZlOzD6TcRdlBzs9JCgqiAs
3jk5nAq9wtYvigAkVarMw/LAJNatEO4sy0AHyWJwAcfOsVJquNvhhokT8VxOcVZs
pMxUhUYKZA==
=HqiO
-----END PGP SIGNATURE-----
Merge tag 'for-5.5/io_uring-post-20191128' of git://git.kernel.dk/linux-block
Pull more io_uring updates from Jens Axboe:
"As mentioned in the first pull request, there was a later batch as
well. This contains fixes to the stuff that already went in, cleanups,
and a few later additions. In particular, this contains:
- Cleanups/fixes/unification of the submission and completion path
(Pavel,me)
- Linked timeouts improvements (Pavel,me)
- Error path fixes (me)
- Fix lookup window where cancellations wouldn't work (me)
- Improve DRAIN support (Pavel)
- Fix backlog flushing -EBUSY on submit (me)
- Add support for connect(2) (me)
- Fix for non-iter based fixed IO (Pavel)
- creds inheritance for async workers (me)
- Disable cmsg/ancillary data for sendmsg/recvmsg (me)
- Shrink io_kiocb to 3 cachelines (me)
- NUMA fix for io-wq (Jann)"
* tag 'for-5.5/io_uring-post-20191128' of git://git.kernel.dk/linux-block: (42 commits)
io_uring: make poll->wait dynamically allocated
io-wq: shrink io_wq_work a bit
io-wq: fix handling of NUMA node IDs
io_uring: use kzalloc instead of kcalloc for single-element allocations
io_uring: cleanup io_import_fixed()
io_uring: inline struct sqe_submit
io_uring: store timeout's sqe->off in proper place
net: disallow ancillary data for __sys_{send,recv}msg_file()
net: separate out the msghdr copy from ___sys_{send,recv}msg()
io_uring: remove superfluous check for sqe->off in io_accept()
io_uring: async workers should inherit the user creds
io-wq: have io_wq_create() take a 'data' argument
io_uring: fix dead-hung for non-iter fixed rw
io_uring: add support for IORING_OP_CONNECT
net: add __sys_connect_file() helper
io_uring: only return -EBUSY for submit on non-flushed backlog
io_uring: only !null ptr to io_issue_sqe()
io_uring: simplify io_req_link_next()
io_uring: pass only !null to io_req_find_next()
io_uring: remove io_free_req_find_next()
...
- Protect pci_reassign_bridge_resources() against concurrent
addition/removal (Benjamin Herrenschmidt)
- Fix bridge dma_ranges resource list cleanup (Rob Herring)
- Add PCI_STD_NUM_BARS for the number of standard BARs (Denis Efremov)
- Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control the
MMIO and prefetchable MMIO window sizes of hotplug bridges
independently (Nicholas Johnson)
- Fix MMIO/MMIO_PREF window assignment that assigned more space than
desired (Nicholas Johnson)
- Only enforce bus numbers from bridge EA if the bridge has EA devices
downstream (Subbaraya Sundeep)
* pci/resource:
PCI: Do not use bus number zero from EA capability
PCI: Avoid double hpmemsize MMIO window assignment
PCI: Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters
PCI: Add PCI_STD_NUM_BARS for the number of standard BARs
PCI: Fix missing bridge dma_ranges resource list cleanup
PCI: Protect pci_reassign_bridge_resources() against concurrent addition/removal
Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF.
This ioctl will be issued by QEMU during reset and includes the
the following steps:
- Release all device pages of the secure guest.
- Ask UV to terminate the guest via UV_SVM_TERMINATE ucall
- Unpin the VPA pages so that they can be migrated back to secure
side when guest becomes secure again. This is required because
pinned pages can't be migrated.
- Reinit the partition scoped page tables
After these steps, guest is ready to issue UV_ESM call once again
to switch to secure mode.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[Implementation of uv_svm_terminate() and its call from
guest shutdown path]
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[Unpinning of VPA pages]
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl3dKsYACgkQCF8+vY7k
4RWmzBAAnIL1V0Jb6vJo9AJUmntDiO/3lVGjz+6bzvlhsjbPD0/6r1AY2ZRZxO97
U3TTnB9m1nRjtJaaZ5Lmx9kc9suB8u1s9HziTcKIQ971goSZapmrx3IKPK64jt5D
Uw6nzfinvAgRw0Rt0t6yd+yciw5bsqmGR5iIz/A29BXHnVu5P2TcGE8IM/oloxqV
ZQkff30jieSM3CXq9hQSoekJkipyCXN3xbKncVVHc4TEonvXSCcAQxC6tRulKpi/
Y/MOxB3v7Vh3snVe5lI3fBYPr8Iw8xpBTMXg7s5e4Sb9aCeZioF0cb2nLJFG8tw8
2TviW1Fktt/N1nwZreRgVMkl19fGm3IdVd+1ukWgXS4EvV3FFsx3FZMFgAebhvKZ
1SHxhGyC5yc5yDl8f9/yo/cyKRzeDj7h6SSD2TUKRmnKXWBin9cNr3CPNJDvTk6Q
rgb2B/Sbs00fZFihRdkt8n6SmHpARS4QBZiAKpHECZhmBKNqFLg050v0uzB1Z7D8
r7gK8NIWQORnmLWmHCa/kCAB8syUm8cW6LibWAvCbfN0FEaYdujFtvC+v0BQSTVJ
f9BNFwUs3mCJ6x30F2DN60f2z3G1ehuNPMxFBFJpa9sLgh2wEPVaqEakGWE2iMUP
Rv/cz0CJQweJutdhIAHtk3nVOQZeiAcqZMeHYemX9lL62Mjprmo=
=VN90
-----END PGP SIGNATURE-----
Merge tag 'media/v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- uAPI documentation for stateless decoders
- Added a new CEC ioctl together with its documentation
- Improved IPU3 documentation
- New i2c drivers: hi556 and imx290
- Added support on Vivid driver for meta streams
- Added de-interlace support for sunxi subdriver
- Added a few new remote controler keymaps
- Added H.265 support for Sunxi Cedrus driver
- Another round of random driver cleanups, fixes and improvements
* tag 'media/v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (361 commits)
media: Revert "media: mtk-vcodec: Remove extra area allocation in an input buffer on encoding"
media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly
media: hantro: Remove now unused H264 pic_size
media: hantro: Use output buffer width and height for H264 decoding
media: hantro: Reduce H264 extra space for motion vectors
media: hantro: Fix H264 motion vector buffer offset
media: ti-vpe: vpe: fix compatible to match bindings
media: dt-bindings: media: ti-vpe: Document VPE driver
media: zr364xx: remove redundant assigmnent to idx, clean up code
media: Documentation: media: *_DEFAULT targets for subdevs
media: hantro: Fix s_fmt for dynamic resolution changes
media: i2c: Use the correct style for SPDX License Identifier
media: siano: Use the correct style for SPDX License Identifier
media: vicodec: media_device_cleanup was called too early
media: vim2m: media_device_cleanup was called too early
media: cedrus: Increase maximum supported size
media: cedrus: Fix H264 4k support
media: cedrus: Properly signal size in mode register
media: v4l2-ctrl: Lock main_hdl on operations of requests_queued.
media: si470x-i2c: add missed operations in remove
...
Pull perf updates from Ingo Molnar:
"The main kernel side changes in this cycle were:
- Various Intel-PT updates and optimizations (Alexander Shishkin)
- Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)
- Add support for LSM and SELinux checks to control access to the
perf syscall (Joel Fernandes)
- Misc other changes, optimizations, fixes and cleanups - see the
shortlog for details.
There were numerous tooling changes as well - 254 non-merge commits.
Here are the main changes - too many to list in detail:
- Enhancements to core tooling infrastructure, perf.data, libperf,
libtraceevent, event parsing, vendor events, Intel PT, callchains,
BPF support and instruction decoding.
- There were updates to the following tools:
perf annotate
perf diff
perf inject
perf kvm
perf list
perf maps
perf parse
perf probe
perf record
perf report
perf script
perf stat
perf test
perf trace
- And a lot of other changes: please see the shortlog and Git log for
more details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
perf parse: Fix potential memory leak when handling tracepoint errors
perf probe: Fix spelling mistake "addrees" -> "address"
libtraceevent: Fix memory leakage in copy_filter_type
libtraceevent: Fix header installation
perf intel-bts: Does not support AUX area sampling
perf intel-pt: Add support for decoding AUX area samples
perf intel-pt: Add support for recording AUX area samples
perf pmu: When using default config, record which bits of config were changed by the user
perf auxtrace: Add support for queuing AUX area samples
perf session: Add facility to peek at all events
perf auxtrace: Add support for dumping AUX area samples
perf inject: Cut AUX area samples
perf record: Add aux-sample-size config term
perf record: Add support for AUX area sampling
perf auxtrace: Add support for AUX area sample recording
perf auxtrace: Move perf_evsel__find_pmu()
perf record: Add a function to test for kernel support for AUX area sampling
perf tools: Add kernel AUX area sampling definitions
perf/core: Make the mlock accounting simple again
perf report: Jump to symbol source view from total cycles view
...
Pull networking updates from David Miller:
"Another merge window, another pull full of stuff:
1) Support alternative names for network devices, from Jiri Pirko.
2) Introduce per-netns netdev notifiers, also from Jiri Pirko.
3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara
Larsen.
4) Allow compiling out the TLS TOE code, from Jakub Kicinski.
5) Add several new tracepoints to the kTLS code, also from Jakub.
6) Support set channels ethtool callback in ena driver, from Sameeh
Jubran.
7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED,
SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long.
8) Add XDP support to mvneta driver, from Lorenzo Bianconi.
9) Lots of netfilter hw offload fixes, cleanups and enhancements,
from Pablo Neira Ayuso.
10) PTP support for aquantia chips, from Egor Pomozov.
11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From
Josh Hunt.
12) Add smart nagle to tipc, from Jon Maloy.
13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat
Duvvuru.
14) Add a flow mask cache to OVS, from Tonghao Zhang.
15) Add XDP support to ice driver, from Maciej Fijalkowski.
16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak.
17) Support UDP GSO offload in atlantic driver, from Igor Russkikh.
18) Support it in stmmac driver too, from Jose Abreu.
19) Support TIPC encryption and auth, from Tuong Lien.
20) Introduce BPF trampolines, from Alexei Starovoitov.
21) Make page_pool API more numa friendly, from Saeed Mahameed.
22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni.
23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits)
libbpf: Fix usage of u32 in userspace code
mm: Implement no-MMU variant of vmalloc_user_node_flags
slip: Fix use-after-free Read in slip_open
net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
macvlan: schedule bc_work even if error
enetc: add support Credit Based Shaper(CBS) for hardware offload
net: phy: add helpers phy_(un)lock_mdio_bus
mdio_bus: don't use managed reset-controller
ax88179_178a: add ethtool_op_get_ts_info()
mlxsw: spectrum_router: Fix use of uninitialized adjacency index
mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels
bpf: Simplify __bpf_arch_text_poke poke type handling
bpf: Introduce BPF_TRACE_x helper for the tracing tests
bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT
bpf, testing: Add various tail call test cases
bpf, x86: Emit patchable direct jump as tail call
bpf: Constant map key tracking for prog array pokes
bpf: Add poke dependency tracking for prog array maps
bpf: Add initial poke descriptor table for jit images
bpf: Move owner type, jited info into array auxiliary data
...
This allows an application to call connect() in an async fashion. Like
other opcodes, we first try a non-blocking connect, then punt to async
context if we have to.
Note that we can still return -EINPROGRESS, and in that case the caller
should use IORING_OP_POLL_ADD to do an async wait for completion of the
connect request (just like for regular connect(2), except we can do it
async here too).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXdfjBwAKCRCRxhvAZXjc
onCBAP47WZ/ie7yjoDWhOI1QB7II3NGSzToakxpgJaWoB+NjTwEA7PGrSYVEbPrf
pUhiEaEJ29t+cWUxX3+yDO+k7SA6BAY=
=Ra58
-----END PGP SIGNATURE-----
Merge tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull thread management updates from Christian Brauner:
- A pidfd's fdinfo file currently contains the field "Pid:\t<pid>"
where <pid> is the pid of the process in the pid namespace of the
procfs instance the fdinfo file for the pidfd was opened in.
The fdinfo file has now gained a new "NSpid:\t<ns-pid1>[\t<ns-pid2>[...]]"
field which lists the pids of the process in all child pid namespaces
provided the pid namespace of the procfs instance it is looked up
under has an ancestoral relationship with the pid namespace of the
process. If it does not 0 will be shown and no further pid namespaces
will be listed. Tests included. (Christian Kellner)
- If the process the pidfd references has already exited, print -1 for
the Pid and NSpid fields in the pidfd's fdinfo file. Tests included.
(me)
- Add CLONE_CLEAR_SIGHAND. This lets callers clear all signal handler
that are not SIG_DFL or SIG_IGN at process creation time. This
originated as a feature request from glibc to improve performance and
elimate races in their posix_spawn() implementation. Tests included.
(me)
- Add support for choosing a specific pid for a process with clone3().
This is the feature which was part of the thread update for v5.4 but
after a discussion at LPC in Lisbon we decided to delay it for one
more cycle in order to make the interface more generic. This has now
done. It is now possible to choose a specific pid in a whole pid
namespaces (sub)hierarchy instead of just one pid namespace. In order
to choose a specific pid the caller must have CAP_SYS_ADMIN in all
owning user namespaces of the target pid namespaces. Tests included.
(Adrian Reber)
- Test improvements and extensions. (Andrei Vagin, me)
* tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
selftests/clone3: skip if clone3() is ENOSYS
selftests/clone3: check that all pids are released on error paths
selftests/clone3: report a correct number of fails
selftests/clone3: flush stdout and stderr before clone3() and _exit()
selftests: add tests for clone3() with *set_tid
fork: extend clone3() to support setting a PID
selftests: add tests for clone3()
tests: test CLONE_CLEAR_SIGHAND
clone3: add CLONE_CLEAR_SIGHAND
pid: use pid_has_task() in pidfd_open()
exit: use pid_has_task() in do_wait()
pid: use pid_has_task() in __change_pid()
test: verify fdinfo for pidfd of reaped process
pidfd: check pid has attached task in fdinfo
pidfd: add tests for NSpid info in fdinfo
pidfd: add NSpid entries to fdinfo
- Data abort report and injection
- Steal time support
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt polling counters
- Make the emulated timer PREEMPT_RT compliant
s390:
- Small fixes and cleanups
- selftest improvements
- yield improvements
PPC:
- Add capability to tell userspace whether we can single-step the guest.
- Improve the allocation of XIVE virtual processor IDs
- Rewrite interrupt synthesis code to deliver interrupts in virtual
mode when appropriate.
- Minor cleanups and improvements.
x86:
- XSAVES support for AMD
- more accurate report of nested guest TSC to the nested hypervisor
- retpoline optimizations
- support for nested 5-level page tables
- PMU virtualization optimizations, and improved support for nested
PMU virtualization
- correct latching of INITs for nested virtualization
- IOAPIC optimization
- TSX_CTRL virtualization for more TAA happiness
- improved allocation and flushing of SEV ASIDs
- many bugfixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJd27PMAAoJEL/70l94x66DspsH+gPc6YWtKJFJH58Zj8NrNh6y
t0FwDFcvUa51+m4jaY4L5Y8+zqu1dZFnPPhFGqNWpxrjCEvE/glQJv3BiUX06Seh
aYUHNymGoYCTJOHaaGhV+NlgQaDuZOCOkIsOLAPehyFd1KojwB+FRC0xmO6aROPw
9yQgYrKuK1UUn5HwxBNrMS4+Xv+2iKv/9sTnq1G4W2qX2NZQg84LVPg1zIdkCh3D
3GOvoCBEk3ivQqjmdE7rP/InPr0XvW0b6TFhchIk8J6jEIQFHsmOUefiTvTxsIHV
OKAZwvyeYPrYHA/aDZpaBmY2aR0ydfKDUQcviNIJoF1vOktGs0hvl3VbsmG8QCg=
=OSI1
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- data abort report and injection
- steal time support
- GICv4 performance improvements
- vgic ITS emulation fixes
- simplify FWB handling
- enable halt polling counters
- make the emulated timer PREEMPT_RT compliant
s390:
- small fixes and cleanups
- selftest improvements
- yield improvements
PPC:
- add capability to tell userspace whether we can single-step the
guest
- improve the allocation of XIVE virtual processor IDs
- rewrite interrupt synthesis code to deliver interrupts in virtual
mode when appropriate.
- minor cleanups and improvements.
x86:
- XSAVES support for AMD
- more accurate report of nested guest TSC to the nested hypervisor
- retpoline optimizations
- support for nested 5-level page tables
- PMU virtualization optimizations, and improved support for nested
PMU virtualization
- correct latching of INITs for nested virtualization
- IOAPIC optimization
- TSX_CTRL virtualization for more TAA happiness
- improved allocation and flushing of SEV ASIDs
- many bugfixes and cleanups"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints
KVM: x86: Grab KVM's srcu lock when setting nested state
KVM: x86: Open code shared_msr_update() in its only caller
KVM: Fix jump label out_free_* in kvm_init()
KVM: x86: Remove a spurious export of a static function
KVM: x86: create mmu/ subdirectory
KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page
KVM: x86: remove set but not used variable 'called'
KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning
KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it
KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality
KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID
KVM: x86: do not modify masked bits of shared MSRs
KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path
KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT
KVM: x86: Unexport kvm_vcpu_reload_apic_access_page()
KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1
KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization
...
Expose the fs-verity bit through statx().
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXdtWqhQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+C9AQCCf8C2KP6DynoGQb9KRYYreJk8js8G
IgtlhazJ3j1RJAD/VijFbdwbxGCmiR1Y6BhKq5eaCYD1El68wSwkKuNO3ww=
=7WpU
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
"Expose the fs-verity bit through statx()"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
docs: fs-verity: mention statx() support
f2fs: support STATX_ATTR_VERITY
ext4: support STATX_ATTR_VERITY
statx: define STATX_ATTR_VERITY
docs: fs-verity: document first supported kernel version
- Add the IV_INO_LBLK_64 encryption policy flag which modifies the
encryption to be optimized for UFS inline encryption hardware.
- For AES-128-CBC, use the crypto API's implementation of ESSIV (which
was added in 5.4) rather than doing ESSIV manually.
- A few other cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXdtVMxQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK8MVAP44iRzj8ZXu62BhqNOYYcF60s/58QfZ
Jo1VdmvO/8MNrAD+P/jW5sqzcB5BLdNzS7pLKGIzsC55uMyp/79xyKK8wQc=
=XKWV
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
- Add the IV_INO_LBLK_64 encryption policy flag which modifies the
encryption to be optimized for UFS inline encryption hardware.
- For AES-128-CBC, use the crypto API's implementation of ESSIV (which
was added in 5.4) rather than doing ESSIV manually.
- A few other cleanups.
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
f2fs: add support for IV_INO_LBLK_64 encryption policies
ext4: add support for IV_INO_LBLK_64 encryption policies
fscrypt: add support for IV_INO_LBLK_64 policies
fscrypt: avoid data race on fscrypt_mode::logged_impl_name
docs: ioctl-number: document fscrypt ioctl numbers
fscrypt: zeroize fscrypt_info before freeing
fscrypt: remove struct fscrypt_ctx
fscrypt: invoke crypto API for ESSIV handling
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl3YCRAACgkQxWXV+ddt
WDuTuQ/7BOibDKqInm2SsL8xMuZqXjxGXUcHDPio5MbzNJ3wpV0j1KqWWsuK8hi0
HAhSI3fu3NG7RQYh3nuRO0CaZy3ENiqKoffrSpg7k5DJG0B7Lm/G/970fmOYUp6a
j6PMNcrKaw+1J3yuljSd20+n6j/hdmfn847ZsSY+7JmZ4zGMJ5GMv3IRipdLFUzR
tmjWmmCI05sF4/8cI6jzUVq588uSFTO1bGXFugmoO0ztpameudCnYniJI0tDBFSV
pqk6lqoOPNcaC9nATuA5KKOpUJ9nSscP/St3DV4D6LaZKkT/M5zs12lXPMJx/pKn
oCHt/A/wBdbDOoy1uHVMWQ9cz9PyVFtU7eSKizcFjoqnHO6fzlnRr9fsmIZKtTw9
H6nXVmP1S+xJg/zTBxCXHVfZR2dqADUsHWztN1LM8Pen/l9+UMwBeMhq9f9Jz68I
kF7zWlfLEtNh8naEYf34LkGVMtCNY4PHFsSztPg/jbfsH34xMvetKvPR2s8lejhp
42YqPHgEh2+8mmVcq65+jl+bPOp/5bdToRtuPiszWiKZSXt/5xplP+5lkSEet0J6
4aNZ8NRAiZ98br45jdTMUVSo6YtI27SS+GdVOUHPQtPI/kWi9XHx+l3E9MVOUtrd
lQ1Z9tPinEnJH4kntiCz2sKdzNKE01IagV4wFylz1Ct+ZqF9jNs=
=JzIp
-----END PGP SIGNATURE-----
Merge tag 'for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"User visible changes:
- new block group profiles: RAID1 with 3- and 4- copies
- RAID1 in btrfs has always 2 copies, now add support for 3 and 4
- this is an incompat feature (named RAID1C34)
- recommended use of RAID1C3 is replacement of RAID6 profile on
metadata, this brings a more reliable resiliency against 2
device loss/damage
- support for new checksums
- per-filesystem, set at mkfs time
- fast hash (crc32c successor): xxhash, 64bit digest
- strong hashes (both 256bit): sha256 (slower, FIPS), blake2b
(faster)
- the blake2b module goes via the crypto tree, btrfs.ko has a
soft dependency
- speed up lseek, don't take inode locks unnecessarily, this can
speed up parallel SEEK_CUR/SEEK_SET/SEEK_END by 80%
- send:
- allow clone operations within the same file
- limit maximum number of sent clone references to avoid slow
backref walking
- error message improvements: device scan prints process name and PID
Core changes:
- cleanups
- remove unique workqueue helpers, used to provide a way to avoid
deadlocks in the workqueue code, now done in a simpler way
- remove lots of indirect function calls in compression code
- extent IO tree code moved out of extent_io.c
- cleanup backup superblock handling at mount time
- transaction life cycle documentation and cleanups
- locking code cleanups, annotations and documentation
- add more cold, const, pure function attributes
- removal of unused or redundant struct members or variables
- new tree-checker sanity tests
- try to detect missing INODE_ITEM, cross-reference checks of
DIR_ITEM, DIR_INDEX, INODE_REF, and XATTR_* items
- remove own bio scheduling code (used to avoid checksum submissions
being stuck behind other IO), replaced by cgroup controller-based
code to allow better control and avoid priority inversions in cases
where the custom and cgroup scheduling disagreed
Fixes:
- avoid getting stuck during cyclic writebacks
- fix trimming of ranges crossing block group boundaries
- fix rename exchange on subvolumes, all involved subvolumes need to
be recorded in the transaction"
* tag 'for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits)
btrfs: drop bdev argument from submit_extent_page
btrfs: remove extent_map::bdev
btrfs: drop bio_set_dev where not needed
btrfs: get bdev directly from fs_devices in submit_extent_page
btrfs: record all roots for rename exchange on a subvol
Btrfs: fix block group remaining RO forever after error during device replace
btrfs: scrub: Don't check free space before marking a block group RO
btrfs: change btrfs_fs_devices::rotating to bool
btrfs: change btrfs_fs_devices::seeding to bool
btrfs: rename btrfs_block_group_cache
btrfs: block-group: Reuse the item key from caller of read_one_block_group()
btrfs: block-group: Refactor btrfs_read_block_groups()
btrfs: document extent buffer locking
btrfs: access eb::blocking_writers according to ACCESS_ONCE policies
btrfs: set blocking_writers directly, no increment or decrement
btrfs: merge blocking_writers branches in btrfs_tree_read_lock
btrfs: drop incompat bit for raid1c34 after last block group is gone
btrfs: add incompat for raid1 with 3, 4 copies
btrfs: add support for 4-copy replication (raid1c4)
btrfs: add support for 3-copy replication (raid1c3)
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl3WxrEQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpuH5D/9qQKfIIuQDUNO4Xx+dIHimTDCrfiEOeO9e
CRaMuSj+yMxLDMwfX8RnDmR17H3ZVoiIY1CT24U9ZkA5iDjeAH4xmzkH30US7LR7
/64YVZTxB0OrWppRK8RiIhaJJZDQ6+HPUQsn6PRaLVuFHi2unMoTQnj/ZQKz03QA
Pl8Xx7qBtH1JwYCzQ21f/uryAcNg9eWabRLN2f1uiOXLmvRxOfh6Z/iaezlaZlmL
qeJdcdLjjvOgOPwEOfNjfS6pd+XBz3gdEhn0l+11nHITxWZmVBwsWTKyUQlCmKnl
yuCWDVyx5d6zCnlrLYG0l2Fn2lr9SwAkdkq3YAKV03hA/6s6P9q9bm31VvOf828x
7gmr4YVz68y7H9bM0QAHCvDpjll0aIEUw6XFzSOCDtZ9B6/pppYQWzMU71J05eyF
8DOKv2M2EVNLUjf6u0RDyolnWGU0kIjt5ryWE3OsGcezAVa2wYstgUJTKbrn1YgT
j+4KTpaI+sg8GKDFauvxcSa6gwoRp6jweFNW+7vC090/shXmrGmVLOnQZKRuHho/
O4W8y/1/deM8CCIAETpiNxA8RV5U/EZygrFGDFc7yzTtVDGHY356M/B4Bmm2qkVu
K3WgeZp8Fc0lH0QF6Pp9ZlBkZEpGNCAPVsPkXIsxQXbctftkn3KY//uIubfpFEB1
PpHSicvkww==
=HYYq
-----END PGP SIGNATURE-----
Merge tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
"Due to more granular branches, this one is small and will be followed
with other core branches that add specific features. I meant to just
have a core and drivers branch, but external dependencies we ended up
adding a few more that are also core.
The changes are:
- Fixes and improvements for the zoned device support (Ajay, Damien)
- sed-opal table writing and datastore UID (Revanth)
- blk-cgroup (and bfq) blk-cgroup stat fixes (Tejun)
- Improvements to the block stats tracking (Pavel)
- Fix for overruning sysfs buffer for large number of CPUs (Ming)
- Optimization for small IO (Ming, Christoph)
- Fix typo in RWH lifetime hint (Eugene)
- Dead code removal and documentation (Bart)
- Reduction in memory usage for queue and tag set (Bart)
- Kerneldoc header documentation (André)
- Device/partition revalidation fixes (Jan)
- Stats tracking for flush requests (Konstantin)
- Various other little fixes here and there (et al)"
* tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block: (48 commits)
Revert "block: split bio if the only bvec's length is > SZ_4K"
block: add iostat counters for flush requests
block,bfq: Skip tracing hooks if possible
block: sed-opal: Introduce SUM_SET_LIST parameter and append it using 'add_token_u64'
blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1
block: Don't disable interrupts in trigger_softirq()
sbitmap: Delete sbitmap_any_bit_clear()
blk-mq: Delete blk_mq_has_free_tags() and blk_mq_can_queue()
block: split bio if the only bvec's length is > SZ_4K
block: still try to split bio if the bvec crosses pages
blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT
blk-cgroup: reimplement basic IO stats using cgroup rstat
blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive()
blk-throtl: stop using blkg->stat_bytes and ->stat_ios
bfq-iosched: stop using blkg->stat_bytes and ->stat_ios
bfq-iosched: relocate bfqg_*rwstat*() helpers
block: add zone open, close and finish ioctl support
block: add zone open, close and finish operations
block: Simplify REQ_OP_ZONE_RESET_ALL handling
block: Remove REQ_OP_ZONE_RESET plugging
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl3WxNwQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgps4kD/9SIDXhYhhE8fNqeAF7Uouu8fxgwnkY3hSI
43vJwCziiDxWWJH5mYW7/83VNOMZKHIbiYMnU6iEUsRQ/sG/wI0wEfAQZDHLzCKt
cko2q7zAC1/4rtoslwJ3q04hE2Ap/nb93ELZBVr7fOAuODBNFUp/vifAojvsMPKz
hNMNPq/vYg7c/iYMZKSBdtjE3tqceFNBjAVNMB9dHKQLeexEy4ve7AjBeawWsSi7
GesnQ5w5u5LqkMYwLslpv/oVjHiiFWgGnDAvBNvykQvVy+DfB54KSqMV11W1aqdU
l6L+ENfZasEvlk1yMAth2Foq4vlscm5MKEb6VdJhXWHHXtXkcBmz7RBqPmjSvXCY
wS5GZRw8oYtTcid0aQf+t/wgRNTDJsGsnsT32qto41No3Z7vlIDHUDxHZGTA+gEL
E8j9rDx6EXMTo3EFbC8XZcfsorhPJ1HKAyw1YFczHtYzJEQUR9jJe3f/Q9u6K2Vy
s/EhkVeHa/lEd7kb6mI+6lQjGe1FXl7AHauDuaaEfIOZA/xJB3Bad5Wjq1va1cUO
TX+37zjzFzJghhSIBGYq7G7iT4AMecPQgxHzCdCyYfW5S4Uur9tMmIElwVPI/Pjl
kDZ9gdg9lm6JifZ9Ab8QcGhuQQTF3frwX9VfgrVgcqyvm38AiYzVgL9ZJnxRS/Cy
ZfLNkACXqQ==
=YZ9s
-----END PGP SIGNATURE-----
Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"A lot of stuff has been going on this cycle, with improving the
support for networked IO (and hence unbounded request completion
times) being one of the major themes. There's been a set of fixes done
this week, I'll send those out as well once we're certain we're fully
happy with them.
This contains:
- Unification of the "normal" submit path and the SQPOLL path (Pavel)
- Support for sparse (and bigger) file sets, and updating of those
file sets without needing to unregister/register again.
- Independently sized CQ ring, instead of just making it always 2x
the SQ ring size. This makes it more flexible for networked
applications.
- Support for overflowed CQ ring, never dropping events but providing
backpressure on submits.
- Add support for absolute timeouts, not just relative ones.
- Support for generic cancellations. This divorces io_uring from
workqueues as well, which additionally gets us one step closer to
generic async system call support.
- With cancellations, we can support grabbing the process file table
as well, just like we do mm context. This allows support for system
calls that create file descriptors, like accept4() support that's
built on top of that.
- Support for io_uring tracing (Dmitrii)
- Support for linked timeouts. These abort an operation if it isn't
completed by the time noted in the linke timeout.
- Speedup tracking of poll requests
- Various cleanups making the coder easier to follow (Jackie, Pavel,
Bob, YueHaibing, me)
- Update MAINTAINERS with new io_uring list"
* tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits)
io_uring: make POLL_ADD/POLL_REMOVE scale better
io-wq: remove now redundant struct io_wq_nulls_list
io_uring: Fix getting file for non-fd opcodes
io_uring: introduce req_need_defer()
io_uring: clean up io_uring_cancel_files()
io-wq: ensure free/busy list browsing see all items
io-wq: ensure we have a stable view of ->cur_work for cancellations
io_wq: add get/put_work handlers to io_wq_create()
io_uring: check for validity of ->rings in teardown
io_uring: fix potential deadlock in io_poll_wake()
io_uring: use correct "is IO worker" helper
io_uring: fix -ENOENT issue with linked timer with short timeout
io_uring: don't do flush cancel under inflight_lock
io_uring: flag SQPOLL busy condition to userspace
io_uring: make ASYNC_CANCEL work with poll and timeout
io_uring: provide fallback request for OOM situations
io_uring: convert accept4() -ERESTARTSYS into -EINTR
io_uring: fix error clear of ->file_table in io_sqe_files_register()
io_uring: separate the io_free_req and io_free_req_find_next interface
io_uring: keep io_put_req only responsible for release and put req
...
This commit reverts commit 91e6015b08 ("bpf: Emit audit messages
upon successful prog load and unload") and its follow up commit
7599a896f2 ("audit: Move audit_log_task declaration under
CONFIG_AUDITSYSCALL") as requested by Paul Moore. The change needs
close review on linux-audit, tests etc.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
This patch is to allow matching options in erspan.
The options can be described in the form:
VER:INDEX:DIR:HWID/VER:INDEX_MASK:DIR_MASK:HWID_MASK.
When ver is set to 1, index will be applied while dir
and hwid will be ignored, and when ver is set to 2,
dir and hwid will be used while index will be ignored.
Different from geneve, only one option can be set. And
also, geneve options, vxlan options or erspan options
can't be set at the same time.
# ip link add name erspan1 type erspan external
# tc qdisc add dev erspan1 ingress
# tc filter add dev erspan1 protocol ip parent ffff: \
flower \
enc_src_ip 10.0.99.192 \
enc_dst_ip 10.0.99.193 \
enc_key_id 11 \
erspan_opts 1:12:0:0/1:ffff:0:0 \
ip_proto udp \
action mirred egress redirect dev eth0
v1->v2:
- improve some err msgs of extack.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to allow matching gbp option in vxlan.
The options can be described in the form GBP/GBP_MASK,
where GBP is represented as a 32bit hexadecimal value.
Different from geneve, only one option can be set. And
also, geneve options and vxlan options can't be set at
the same time.
# ip link add name vxlan0 type vxlan dstport 0 external
# tc qdisc add dev vxlan0 ingress
# tc filter add dev vxlan0 protocol ip parent ffff: \
flower \
enc_src_ip 10.0.99.192 \
enc_dst_ip 10.0.99.193 \
enc_key_id 11 \
vxlan_opts 01020304/ffffffff \
ip_proto udp \
action mirred egress redirect dev eth0
v1->v2:
- add .strict_start_type for enc_opts_policy as Jakub noticed.
- use Duplicate instead of Wrong in err msg for extack as Jakub
suggested.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to allow setting erspan options using the
act_tunnel_key action. Different from geneve options,
only one option can be set. And also, geneve options,
vxlan options or erspan options can't be set at the
same time.
Options are expressed as ver:index:dir:hwid, when ver
is set to 1, index will be applied while dir and hwid
will be ignored, and when ver is set to 2, dir and
hwid will be used while index will be ignored.
# ip link add name erspan1 type erspan external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
erspan_opts 1:2:0:0 \
action mirred egress redirect dev erspan1
v1->v2:
- do the validation when dst is not yet allocated as Jakub suggested.
- use Duplicate instead of Wrong in err msg for extack.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to allow setting vxlan options using the
act_tunnel_key action. Different from geneve options,
only one option can be set. And also, geneve options
and vxlan options can't be set at the same time.
gbp is the only param for vxlan options:
# ip link add name vxlan0 type vxlan dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
vxlan_opts 01020304 \
action mirred egress redirect dev vxlan0
v1->v2:
- add .strict_start_type for enc_opts_policy as Jakub noticed.
- use Duplicate instead of Wrong in err msg for extack as Jakub
suggested.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add definitions for the Enter Compliance and Transmit Margin fields of the
PCIe Link Control 2 register.
Link: https://lore.kernel.org/r/20191112173503.176611-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt pool counters
- Make the emulated timer PREEMPT_RT compliant
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl3VZ0EPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDwikQAJ/lQT97zEKV3dnpD/jjmEic/3QvTGljS4p+
pbwZAzyoSMc09lAK2pkaGRRc7euPnp4uLdRS4SenToVzmUQCzuxpQEEMdV/wjp4V
WLQ1WnTEAhYkm7k5MVo4uy3eD7nVWHWXgfQJvzL4EYZ5R/gd9NzBrnAc6LLV6hp9
0eLXIYrGFa1GESzF6P6sBDJhYpqVUcQlTI8I43kZH3iCC4+OsBxIkhHREZYsELhW
MZJIM9ZCskg2tvPC4UysaFiGjBYUJNJ0V+fFOrhyGzludP8i8rNRwgA60mznwvNw
V4N6/gLlkGK7nLqP+noUcU2wTBnIu389TcWWsX47CuDUzawCd8Fb9kX2zYauQSyS
ujE0uzoo/nhPFysh9OVVeLUZ6o/wotXoMp2t32t1c5h9N1hISEJvAWavMxTY6KzF
NEn9hWFjNcgBoArz9GKn9p2nBQpCDvu+2SlI4nL/qgZ7lPC4O3U1uq9myCOLj/gu
Can/u5EAwgyIBDVcEPHV+vP2GjyeERdXprGiG2VJTYlbHsdjgISTR+5Fy32KdGlP
YygeZxJtzretr3AYsWqD6Mri30FDSoYy9rUOWBpa+ZHbJPac0M+uKOqntV1OxPX9
QUkmNEdJcDr8fkcKxnEZ/MaZxFTGPp4vfhiT4A7dUkWTFq7ajvGo8IwN1d7PvWxS
LFMij1Js
=SxBH
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm updates for Linux 5.5:
- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt pool counters
- Make the emulated timer PREEMPT_RT compliant
Conflicts:
include/uapi/linux/kvm.h
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-11-20
The following pull-request contains BPF updates for your *net-next* tree.
We've added 81 non-merge commits during the last 17 day(s) which contain
a total of 120 files changed, 4958 insertions(+), 1081 deletions(-).
There are 3 trivial conflicts, resolve it by always taking the chunk from
196e8ca74886c433:
<<<<<<< HEAD
=======
void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
>>>>>>> 196e8ca748
<<<<<<< HEAD
void *bpf_map_area_alloc(u64 size, int numa_node)
=======
static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
>>>>>>> 196e8ca748
<<<<<<< HEAD
if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
=======
/* kmalloc()'ed memory can't be mmap()'ed */
if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
>>>>>>> 196e8ca748
The main changes are:
1) Addition of BPF trampoline which works as a bridge between kernel functions,
BPF programs and other BPF programs along with two new use cases: i) fentry/fexit
BPF programs for tracing with practically zero overhead to call into BPF (as
opposed to k[ret]probes) and ii) attachment of the former to networking related
programs to see input/output of networking programs (covering xdpdump use case),
from Alexei Starovoitov.
2) BPF array map mmap support and use in libbpf for global data maps; also a big
batch of libbpf improvements, among others, support for reading bitfields in a
relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko.
3) Extend s390x JIT with usage of relative long jumps and loads in order to lift
the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich.
4) Add BPF audit support and emit messages upon successful prog load and unload in
order to have a timeline of events, from Daniel Borkmann and Jiri Olsa.
5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode
(XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson.
6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API
call named bpf_get_link_xdp_info() for retrieving the full set of prog
IDs attached to XDP, from Toke Høiland-Jørgensen.
7) Add BTF support for array of int, array of struct and multidimensional arrays
and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau.
8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo.
9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid
xdping to be run as standalone, from Jiri Benc.
10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song.
11) Fix a memory leak in BPF fentry test run data, from Colin Ian King.
12) Various smaller misc cleanups and improvements mostly all over BPF selftests and
samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow for audit messages to be emitted upon BPF program load and
unload for having a timeline of events. The load itself is in
syscall context, so additional info about the process initiating
the BPF prog creation can be logged and later directly correlated
to the unload event.
The only info really needed from BPF side is the globally unique
prog ID where then audit user space tooling can query / dump all
info needed about the specific BPF program right upon load event
and enrich the record, thus these changes needed here can be kept
small and non-intrusive to the core.
Raw example output:
# auditctl -D
# auditctl -a always,exit -F arch=x86_64 -S bpf
# ausearch --start recent -m 1334
[...]
----
time->Wed Nov 20 12:45:51 2019
type=PROCTITLE msg=audit(1574271951.590:8974): proctitle="./test_verifier"
type=SYSCALL msg=audit(1574271951.590:8974): arch=c000003e syscall=321 success=yes exit=14 a0=5 a1=7ffe2d923e80 a2=78 a3=0 items=0 ppid=742 pid=949 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="test_verifier" exe="/root/bpf-next/tools/testing/selftests/bpf/test_verifier" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=UNKNOWN[1334] msg=audit(1574271951.590:8974): auid=0 uid=0 gid=0 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=949 comm="test_verifier" exe="/root/bpf-next/tools/testing/selftests/bpf/test_verifier" prog-id=3260 event=LOAD
----
time->Wed Nov 20 12:45:51 2019
type=UNKNOWN[1334] msg=audit(1574271951.590:8975): prog-id=3260 event=UNLOAD
----
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191120213816.8186-1-jolsa@kernel.org
RFC 8033 suggests an alternative approach to calculate the queue
delay in PIE by using a timestamp on every enqueued packet. This
patch adds an implementation of that approach and sets it as the
default method to calculate queue delay. The previous method (based
on Little's law) to calculate queue delay is set as optional.
Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Acked-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Wildcard support for the net,iface set from Kristian Evensen.
2) Offload support for matching on the input interface.
3) Simplify matching on vlan header fields.
4) Add nft_payload_rebuild_vlan_hdr() function to rebuild the vlan
header from the vlan sk_buff metadata.
5) Pass extack to nft_flow_cls_offload_setup().
6) Add C-VLAN matching support.
7) Use time64_t in xt_time to fix y2038 overflow, from Arnd Bergmann.
8) Use time_t in nft_meta to fix y2038 overflow, also from Arnd.
9) Add flow_action_entry_next() helper function to flowtable offload
infrastructure.
10) Add IPv6 support to the flowtable offload infrastructure.
11) Support for input interface matching from postrouting,
from Phil Sutter.
12) Missing check for ndo callback in flowtable offload, from wenxu.
13) Remove conntrack parameter from flow_offload_fill_dir(), from wenxu.
14) Do not pass flow_rule object for rule removal, cookie is sufficient
to achieve this.
15) Release flow_rule object in case of error from the offload commit
path.
16) Undo offload ruleset updates if transaction fails.
17) Check for error when binding flowtable callbacks, from wenxu.
18) Always unbind flowtable callbacks when unregistering hooks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The new raid1c3 and raid1c4 profiles are backward incompatible and the
name shall be 'raid1c34', the status can be found in the global
supported features in /sys/fs/btrfs/features or in the per-filesystem
directory.
Signed-off-by: David Sterba <dsterba@suse.com>
Add new block group profile to store 4 copies in a simliar way that
current RAID1 does. The profile attributes and constraints are defined
in the raid table and used by the same code that already handles the 2-
and 3-copy RAID1.
The minimum number of devices is 4, the maximum number of devices/chunks
that can be lost/damaged is 3. There is no comparable traditional RAID
level, the profile is added for future needs to accompany triple-parity
and beyond.
Signed-off-by: David Sterba <dsterba@suse.com>
Add new block group profile to store 3 copies in a simliar way that
current RAID1 does. The profile attributes and constraints are defined
in the raid table and used by the same code that already handles the
2-copy RAID1.
The minimum number of devices is 3, the maximum number of devices/chunks
that can be lost/damaged is 2. Like RAID6 but with 33% space
utilization.
Signed-off-by: David Sterba <dsterba@suse.com>
Add sha256 to the list of possible checksumming algorithms used by BTRFS.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add xxhash64 to the list of possible checksumming algorithms used by
BTRFS.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use enum to replace macro definitions of extent types.
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add ability to memory-map contents of BPF array map. This is extremely useful
for working with BPF global data from userspace programs. It allows to avoid
typical bpf_map_{lookup,update}_elem operations, improving both performance
and usability.
There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
- if map is already frozen, no writable mapping is allowed;
- if map has writable memory mappings active (accounted in map->writecnt),
map freezing will keep failing with -EBUSY;
- once number of writable memory mappings drops to zero, map freezing can be
performed again.
Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.
For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory with array->value
being aligned with the start of the second page. On deallocation we need to
accomodate this memory arrangement to free vmalloc()'ed memory correctly.
One important consideration regarding how memory-mapping subsystem functions.
Memory-mapping subsystem provides few optional callbacks, among them open()
and close(). close() is called for each memory region that is unmapped, so
that users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does
initial refcnt bump, while open() will do any extra ones after that. Thus
number of close() calls is equal to number of open() calls plus one more.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
The main motivation to add set_tid to clone3() is CRIU.
To restore a process with the same PID/TID CRIU currently uses
/proc/sys/kernel/ns_last_pid. It writes the desired (PID - 1) to
ns_last_pid and then (quickly) does a clone(). This works most of the
time, but it is racy. It is also slow as it requires multiple syscalls.
Extending clone3() to support *set_tid makes it possible restore a
process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
race free (as long as the desired PID/TID is available).
This clone3() extension places the same restrictions (CAP_SYS_ADMIN)
on clone3() with *set_tid as they are currently in place for ns_last_pid.
The original version of this change was using a single value for
set_tid. At the 2019 LPC, after presenting set_tid, it was, however,
decided to change set_tid to an array to enable setting the PID of a
process in multiple PID namespaces at the same time. If a process is
created in a PID namespace it is possible to influence the PID inside
and outside of the PID namespace. Details also in the corresponding
selftest.
To create a process with the following PIDs:
PID NS level Requested PID
0 (host) 31496
1 42
2 1
For that example the two newly introduced parameters to struct
clone_args (set_tid and set_tid_size) would need to be:
set_tid[0] = 1;
set_tid[1] = 42;
set_tid[2] = 31496;
set_tid_size = 3;
If only the PIDs of the two innermost nested PID namespaces should be
defined it would look like this:
set_tid[0] = 1;
set_tid[1] = 42;
set_tid_size = 2;
The PID of the newly created process would then be the next available
free PID in the PID namespace level 0 (host) and 42 in the PID namespace
at level 1 and the PID of the process in the innermost PID namespace
would be 1.
The set_tid array is used to specify the PID of a process starting
from the innermost nested PID namespaces up to set_tid_size PID namespaces.
set_tid_size cannot be larger then the current PID namespace level.
Signed-off-by: Adrian Reber <areber@redhat.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Link: https://lore.kernel.org/r/20191115123621.142252-1-areber@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Allow FENTRY/FEXIT BPF programs to attach to other BPF programs of any type
including their subprograms. This feature allows snooping on input and output
packets in XDP, TC programs including their return values. In order to do that
the verifier needs to track types not only of vmlinux, but types of other BPF
programs as well. The verifier also needs to translate uapi/linux/bpf.h types
used by networking programs into kernel internal BTF types used by FENTRY/FEXIT
BPF programs. In some cases LLVM optimizations can remove arguments from BPF
subprograms without adjusting BTF info that LLVM backend knows. When BTF info
disagrees with actual types that the verifiers sees the BPF trampoline has to
fallback to conservative and treat all arguments as u64. The FENTRY/FEXIT
program can still attach to such subprograms, but it won't be able to recognize
pointer types like 'struct sk_buff *' and it won't be able to pass them to
bpf_skb_output() for dumping packets to user space. The FENTRY/FEXIT program
would need to use bpf_probe_read_kernel() instead.
The BPF_PROG_LOAD command is extended with attach_prog_fd field. When it's set
to zero the attach_btf_id is one vmlinux BTF type ids. When attach_prog_fd
points to previously loaded BPF program the attach_btf_id is BTF type id of
main function or one of its subprograms.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-18-ast@kernel.org
Introduce BPF trampoline concept to allow kernel code to call into BPF programs
with practically zero overhead. The trampoline generation logic is
architecture dependent. It's converting native calling convention into BPF
calling convention. BPF ISA is 64-bit (even on 32-bit architectures). The
registers R1 to R5 are used to pass arguments into BPF functions. The main BPF
program accepts only single argument "ctx" in R1. Whereas CPU native calling
convention is different. x86-64 is passing first 6 arguments in registers
and the rest on the stack. x86-32 is passing first 3 arguments in registers.
sparc64 is passing first 6 in registers. And so on.
The trampolines between BPF and kernel already exist. BPF_CALL_x macros in
include/linux/filter.h statically compile trampolines from BPF into kernel
helpers. They convert up to five u64 arguments into kernel C pointers and
integers. On 64-bit architectures this BPF_to_kernel trampolines are nops. On
32-bit architecture they're meaningful.
The opposite job kernel_to_BPF trampolines is done by CAST_TO_U64 macros and
__bpf_trace_##call() shim functions in include/trace/bpf_probe.h. They convert
kernel function arguments into array of u64s that BPF program consumes via
R1=ctx pointer.
This patch set is doing the same job as __bpf_trace_##call() static
trampolines, but dynamically for any kernel function. There are ~22k global
kernel functions that are attachable via nop at function entry. The function
arguments and types are described in BTF. The job of btf_distill_func_proto()
function is to extract useful information from BTF into "function model" that
architecture dependent trampoline generators will use to generate assembly code
to cast kernel function arguments into array of u64s. For example the kernel
function eth_type_trans has two pointers. They will be casted to u64 and stored
into stack of generated trampoline. The pointer to that stack space will be
passed into BPF program in R1. On x86-64 such generated trampoline will consume
16 bytes of stack and two stores of %rdi and %rsi into stack. The verifier will
make sure that only two u64 are accessed read-only by BPF program. The verifier
will also recognize the precise type of the pointers being accessed and will
not allow typecasting of the pointer to a different type within BPF program.
The tracing use case in the datacenter demonstrated that certain key kernel
functions have (like tcp_retransmit_skb) have 2 or more kprobes that are always
active. Other functions have both kprobe and kretprobe. So it is essential to
keep both kernel code and BPF programs executing at maximum speed. Hence
generated BPF trampoline is re-generated every time new program is attached or
detached to maintain maximum performance.
To avoid the high cost of retpoline the attached BPF programs are called
directly. __bpf_prog_enter/exit() are used to support per-program execution
stats. In the future this logic will be optimized further by adding support
for bpf_stats_enabled_key inside generated assembly code. Introduction of
preemptible and sleepable BPF programs will completely remove the need to call
to __bpf_prog_enter/exit().
Detach of a BPF program from the trampoline should not fail. To avoid memory
allocation in detach path the half of the page is used as a reserve and flipped
after each attach/detach. 2k bytes is enough to call 40+ BPF programs directly
which is enough for BPF tracing use cases. This limit can be increased in the
future.
BPF_TRACE_FENTRY programs have access to raw kernel function arguments while
BPF_TRACE_FEXIT programs have access to kernel return value as well. Often
kprobe BPF program remembers function arguments in a map while kretprobe
fetches arguments from a map and analyzes them together with return value.
BPF_TRACE_FEXIT accelerates this typical use case.
Recursion prevention for kprobe BPF programs is done via per-cpu
bpf_prog_active counter. In practice that turned out to be a mistake. It
caused programs to randomly skip execution. The tracing tools missed results
they were looking for. Hence BPF trampoline doesn't provide builtin recursion
prevention. It's a job of BPF program itself and will be addressed in the
follow up patches.
BPF trampoline is intended to be used beyond tracing and fentry/fexit use cases
in the future. For example to remove retpoline cost from XDP programs.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-5-ast@kernel.org
User space may request time stamps on rising edges, falling edges, or
both. However, the particular mode may or may not be supported in the
hardware or in the driver. This patch adds a "strict" flag that tells
drivers to ensure that the requested mode will be honored.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 415606588c ("PTP: introduce new versions of IOCTLs")
introduced a new external time stamp ioctl that validates the flags.
This patch extends the validation to ensure that at least one rising
or falling edge flag is set when enabling external time stamps.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We store elapsed time for a crashed process in struct elf_prstatus using
'timeval' structures. Once glibc starts using 64-bit time_t, this becomes
incompatible with the kernel's idea of timeval since the structure layout
no longer matches on 32-bit architectures.
This changes the definition of the elf_prstatus structure to use
__kernel_old_timeval instead, which is hardcoded to the currently used
binary layout. There is no risk of overflow in y2038 though, because
the time values are all relative times, and can store up to 68 years
of process elapsed time.
There is a risk of applications breaking at build time when they
use the new kernel headers and expect the type to be exactly 'timeval'
rather than a structure that has the same fields as before. Those
applications have to be modified to deal with 64-bit time_t anyway.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In order to remove the 'struct timespec' definition and the
timespec64_to_timespec() helper function, change over the in-kernel
definition of 'struct scm_timestamping' to use the __kernel_old_timespec
replacement and open-code the assignment.
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There are two 'struct timeval' fields in 'struct rusage'.
Unfortunately the definition of timeval is now ambiguous when used in
user space with a libc that has a 64-bit time_t, and this also changes
the 'rusage' definition in user space in a way that is incompatible with
the system call interface.
While there is no good solution to avoid all ambiguity here, change
the definition in the kernel headers to be compatible with the kernel
ABI, using __kernel_old_timeval as an unambiguous base type.
In previous discussions, there was also a plan to add a replacement
for rusage based on 64-bit timestamps and nanosecond resolution,
i.e. 'struct __kernel_timespec'. I have patches for that as well,
if anyone thinks we should do that.
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This is mainly a patch for clarification, and to let us remove
the time_t definition from the kernel to prevent new users from
creeping in that might not be y2038-safe.
All remaining uses of 'time_t' or '__kernel_time_t' are part of
the user API that cannot be changed by that either have a
replacement or that do not suffer from the y2038 overflow.
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The 'struct timespec' definition can no longer be part of the uapi headers
because it conflicts with a a now incompatible libc definition. Also,
we really want to remove it in order to prevent new uses from creeping in.
The same namespace conflict exists with time_t, which should also be
removed. __kernel_time_t could be used safely, but adding 'old' in the
name makes it clearer that this should not be used for new interfaces.
Add a replacement __kernel_old_timespec structure and __kernel_old_time_t
along the lines of __kernel_old_timeval.
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
When using the kernel datapath, the upcall don't
include skb hash info relatived. That will introduce
some problem, because the hash of skb is important
in kernel stack. For example, VXLAN module uses
it to select UDP src port. The tx queue selection
may also use the hash in stack.
Hash is computed in different ways. Hash is random
for a TCP socket, and hash may be computed in hardware,
or software stack. Recalculation hash is not easy.
Hash of TCP socket is computed:
tcp_v4_connect
-> sk_set_txhash (is random)
__tcp_transmit_skb
-> skb_set_hash_from_sk
There will be one upcall, without information of skb
hash, to ovs-vswitchd, for the first packet of a TCP
session. The rest packets will be processed in Open vSwitch
modules, hash kept. If this tcp session is forward to
VXLAN module, then the UDP src port of first tcp packet
is different from rest packets.
TCP packets may come from the host or dockers, to Open vSwitch.
To fix it, we store the hash info to upcall, and restore hash
when packets sent back.
+---------------+ +-------------------------+
| Docker/VMs | | ovs-vswitchd |
+----+----------+ +-+--------------------+--+
| ^ |
| | |
| | upcall v restore packet hash (not recalculate)
| +-+--------------------+--+
| tap netdev | | vxlan module
+---------------> +--> Open vSwitch ko +-->
or internal type | |
+-------------------------+
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a statx attribute bit STATX_ATTR_VERITY which will be set if the
file has fs-verity enabled. This is the statx() equivalent of
FS_VERITY_FL which is returned by FS_IOC_GETFLAGS.
This is useful because it allows applications to check whether a file is
a verity file without opening it. Opening a verity file can be
expensive because the fsverity_info is set up on open, which involves
parsing metadata and optionally verifying a cryptographic signature.
This is analogous to how various other bits are exposed through both
FS_IOC_GETFLAGS and statx(), e.g. the encrypt bit.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
AUX data can be used to annotate perf events such as performance counters
or tracepoints/breakpoints by including it in sample records when
PERF_SAMPLE_AUX flag is set. Such samples would be instrumental in debugging
and profiling by providing, for example, a history of instruction flow
leading up to the event's overflow.
The implementation makes use of grouping an AUX event with all the events
that wish to take samples of the AUX data, such that the former is the
group leader. The samplees should also specify the desired size of the AUX
sample via attr.aux_sample_size.
AUX capable PMUs need to explicitly add support for sampling, because it
relies on a new callback to take a snapshot of the buffer without touching
the event states.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Cc: mathieu.poirier@linaro.org
Link: https://lkml.kernel.org/r/20191025140835.53665-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jozsef Kadlecsik says:
====================
ipset patches for nf-next
- Add wildcard support to hash:net,iface which makes possible to
match interface prefixes besides complete interfaces names, from
Kristian Evensen.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can now get rid of the cmm_lock and completely rely on the balloon
compaction internals, which now also manage the page list and the
lock.
Inflated/"loaned" pages are now movable. Memory blocks that contain
such pages can get offlined. Also, all such pages will be marked
PageOffline() and can therefore be excluded in memory dumps using
recent versions of makedumpfile.
Don't switch to balloon_page_alloc() yet (due to the GFP_NOIO). Will
do that separately to discuss this change in detail.
Signed-off-by: David Hildenbrand <david@redhat.com>
[mpe: Add isolated_pages-- in cmm_migratepage() as suggested by David]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191031142933.10779-9-david@redhat.com
This patch adds the NFTA_FLOWTABLE_FLAGS attribute that allows users to
specify the NF_FLOWTABLE_HW_OFFLOAD flag. This patch also adds a new
setup interface for the flowtable type to perform the flowtable offload
block callback configuration.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the GPIOHANDLE_SET_CONFIG_IOCTL to the gpio chardev.
The ioctl allows some of the configuration of a requested handle to be
changed without having to release the line.
The primary use case is the changing of direction for bi-directional
lines.
Based on initial work by Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Allow pull up/down bias to be disabled, allowing the line to float
or to be biased only by external circuitry.
Use case is for where the bias has been applied previously, either
by default or by the user, but that setting may conflict with the
current use of the line.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Add pull-up/pull-down flags to the gpio line get and
set ioctl() calls. Use cases include a push button
that does not have an external resistor.
Addition use cases described by Limor Fried (ladyada) of
Adafruit in this PR for Adafruit_Blinka Python lib:
https://github.com/adafruit/Adafruit_Blinka/pull/59
Signed-off-by: Drew Fustini <drew@pdp7.com>
[Kent: added BIAS to GPIO flag names and restrict application to input
lines]
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
When setting the dump's time-stamp, use ktime_get_real in addition to
jiffies. This simplifies the user space implementation and bypasses
some inconsistent behavior with translating jiffies to current time.
The time taken is transformed into nsec, to comply with y2038 issue.
Fixes: c8e1da0bf9 ("devlink: Add health report functionality")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) BPF sample build fixes from Björn Töpel
2) Fix powerpc bpf tail call implementation, from Eric Dumazet.
3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet.
4) Fix crash in ebtables when using dnat target, from Florian Westphal.
5) Fix port disable handling whne removing bcm_sf2 driver, from Florian
Fainelli.
6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski.
7) Various KCSAN fixes all over the networking, from Eric Dumazet.
8) Memory leaks in mlx5 driver, from Alex Vesker.
9) SMC interface refcounting fix, from Ursula Braun.
10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu.
11) Add a TX lock to synchonize the kTLS TX path properly with crypto
operations. From Jakub Kicinski.
12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano
Garzarella.
13) Infinite loop in Intel ice driver, from Colin Ian King.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
ixgbe: need_wakeup flag might not be set for Tx
i40e: need_wakeup flag might not be set for Tx
igb/igc: use ktime accessors for skb->tstamp
i40e: Fix for ethtool -m issue on X722 NIC
iavf: initialize ITRN registers with correct values
ice: fix potential infinite loop because loop counter being too small
qede: fix NULL pointer deref in __qede_remove()
net: fix data-race in neigh_event_send()
vsock/virtio: fix sock refcnt holding during the shutdown
net: ethernet: octeon_mgmt: Account for second possible VLAN header
mac80211: fix station inactive_time shortly after boot
net/fq_impl: Switch to kvmalloc() for memory allocation
mac80211: fix ieee80211_txq_setup_flows() failure path
ipv4: Fix table id reference in fib_sync_down_addr
ipv6: fixes rt6_probe() and fib6_nh->last_probe init
net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
net: usb: qmi_wwan: add support for DW5821e with eSIM support
CDC-NCM: handle incomplete transfer of MTU
nfc: netlink: fix double device reference drop
NFC: st21nfca: fix double free
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl3F+DIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpkVmD/9FIa092Q6obga0RqA16GlbI85tgtNyRFZU
neIO/g9R3G/uBTGbiUeXHXHDz9CqUXIRYX7pmI2u0b07iGLRz8oUsOsgyVEfCMen
VitqwkJJAZ9j9OifyKpLYCZX9ulVDWX5hEz/vm2cNWDkjCbOpXvRuQmkXEzp7RNM
F7K25PpGLvJHfqC90q9FXXxNDlB2i1M/rh5I7eUqhb6rHmzfJGKCd+H80t+REoB1
iXAygPj86agQKLOUKZtJjXUjq9Ol/0FD+OKY+eP3EfVv/FJvIeWWYe78WplxJpRD
BYb9dhLMCSo619WVVy4hNYCPjSfPKVT2cO5QJmVRpgOI1urFuTNgNoIiw06AgvkZ
09vrlrJZ5A7eFEppuAFQC4WRYKWCQCQfq8wxt2iGUivXgHfskjJ7qJz1Mh5Nlxsr
JGm1hSVw9UCBzjqC75K2CR+vVt4T8ovEaizPFvzVj6lQ0lRTmSchisxbXzTdevOn
kFvBOntBdpeSq/CwZ0x5PDP4AbRsgH3ny47LCJoEpFZlxlOLuEB/dAx564dn6ZgZ
rRz6mKU7rlO7brrkW+DbRZ0XiOn5qzN6FrcSGX1DBzc4+hMus/5PPjv8Gk+Wo1PU
388mu6N6DObSjw1ij1AqkwyzudmbrziKM4isJY4I72I0YGSq9cuG5VXgM2GqbGlU
XXpzsu8pSw==
=8B/z
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Two NVMe device removal crash fixes, and a compat fixup for for an
ioctl that was introduced in this release (Anton, Charles, Max - via
Keith)
- Missing error path mutex unlock for drbd (Dan)
- cgroup writeback fixup on dead memcg (Tejun)
- blkcg online stats print fix (Tejun)
* tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block:
cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
block: drbd: remove a stray unlock in __drbd_send_protocol()
blkcg: make blkcg_print_stat() print stats only for online blkgs
nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
nvme-rdma: fix a segmentation fault during module unload
Section 7.2 of rfc7829: "Peer Address Thresholds (SCTP_PEER_ADDR_THLDS)
Socket Option" extends 'struct sctp_paddrthlds' with 'spt_pathcpthld'
added to allow a user to change ps_retrans per sock/asoc/transport, as
other 2 paddrthlds: pf_retrans, pathmaxrxt.
Note: to not break the user's program, here to support pf_retrans dump
and setting by adding a new sockopt SCTP_PEER_ADDR_THLDS_V2, and a new
structure sctp_paddrthlds_v2 instead of extending sctp_paddrthlds.
Also, when setting ps_retrans, the value is not allowed to be greater
than pf_retrans.
v1->v2:
- use SCTP_PEER_ADDR_THLDS_V2 to set/get pf_retrans instead,
as Marcelo and David Laight suggested.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a sockopt defined in section 7.3 of rfc7829: "Exposing
the Potentially Failed Path State", by which users can change
pf_expose per sock and asoc.
The new sockopt SCTP_EXPOSE_POTENTIALLY_FAILED_STATE is also
known as SCTP_EXPOSE_PF_STATE for short.
v2->v3:
- return -EINVAL if params.assoc_value > SCTP_PF_EXPOSE_MAX.
- define SCTP_EXPOSE_PF_STATE SCTP_EXPOSE_POTENTIALLY_FAILED_STATE.
v3->v4:
- improve changelog.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SCTP Quick failover draft section 5.1, point 5 has been removed
from rfc7829. Instead, "the sender SHOULD (i) notify the Upper
Layer Protocol (ULP) about this state transition", as said in
section 3.2, point 8.
So this patch is to add SCTP_ADDR_POTENTIALLY_FAILED, defined
in section 7.1, "which is reported if the affected address
becomes PF". Also remove transport cwnd's update when moving
from PF back to ACTIVE , which is no longer in rfc7829 either.
Note that ulp_notify will be set to false if asoc->expose is
not 'enabled', according to last patch.
v2->v3:
- define SCTP_ADDR_PF SCTP_ADDR_POTENTIALLY_FAILED.
v3->v4:
- initialize spc_state with SCTP_ADDR_AVAILABLE, as Marcelo suggested.
- check asoc->pf_expose in sctp_assoc_control_transport(), as Marcelo
suggested.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As said in rfc7829, section 3, point 12:
The SCTP stack SHOULD expose the PF state of its destination
addresses to the ULP as well as provide the means to notify the
ULP of state transitions of its destination addresses from
active to PF, and vice versa. However, it is recommended that
an SCTP stack implementing SCTP-PF also allows for the ULP to be
kept ignorant of the PF state of its destinations and the
associated state transitions, thus allowing for retention of the
simpler state transition model of [RFC4960] in the ULP.
Not only does it allow to expose the PF state to ULP, but also
allow to ignore sctp-pf to ULP.
So this patch is to add pf_expose per netns, sock and asoc. And in
sctp_assoc_control_transport(), ulp_notify will be set to false if
asoc->expose is not 'enabled' in next patch.
It also allows a user to change pf_expose per netns by sysctl, and
pf_expose per sock and asoc will be initialized with it.
Note that pf_expose also works for SCTP_GET_PEER_ADDR_INFO sockopt,
to not allow a user to query the state of a sctp-pf peer address
when pf_expose is 'disabled', as said in section 7.3.
v1->v2:
- Fix a build warning noticed by Nathan Chancellor.
v2->v3:
- set pf_expose to UNUSED by default to keep compatible with old
applications.
v3->v4:
- add a new entry for pf_expose on ip-sysctl.txt, as Marcelo suggested.
- change this patch to 1/5, and move sctp_assoc_control_transport
change into 2/5, as Marcelo suggested.
- use SCTP_PF_EXPOSE_UNSET instead of SCTP_PF_EXPOSE_UNUSED, and
set SCTP_PF_EXPOSE_UNSET to 0 in enum, as Marcelo suggested.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds two netlink commands to TIPC in order for user to be
able to set or remove AEAD keys:
- TIPC_NL_KEY_SET
- TIPC_NL_KEY_FLUSH
When the 'KEY_SET' is given along with the key data, the key will be
initiated and attached to TIPC crypto. On the other hand, the
'KEY_FLUSH' command will remove all existing keys if any.
Acked-by: Ying Xue <ying.xue@windreiver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new structure 'tipc_aead_key' is added to the 'tipc.h' for user to
be able to transfer a key to TIPC in kernel. Netlink will be used for
this purpose in the later commits.
Acked-by: Ying Xue <ying.xue@windreiver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* typo fixes in docs
* APIs for station separation using VLAN tags rather
than separate wifi netdevs
* some preparations for upcoming features (802.3 offload
and airtime queue limits (AQL)
* stack reduction in ieee80211_assoc_success()
* use DEFINE_DEBUGFS_ATTRIBUTE in hwsim
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl3FYNEACgkQB8qZga/f
l8TQig//aZIXq2qlrci+Tzydi5clhrog8mzGBOECRUrgAFUZB9Q1qSD0IvqQrU8o
3xY+9lDxYvaOYKAm83oyk2st8FIgGXwPmrfPMEodFlFWla4Dkev/n8+CdbVgJDGI
9j0+9tPsQFwvIIOwD1OUKzJyRUIwmJ7+f8DRhMZ8LlkHZwSx217yD1RGq8zbyXUf
tIndHM2yAY0H9ip1GPaTiGMyWN9wbpcllNq6h126OJZ0kGRLbY9X9ySoMIw8zSPF
ul1qofvhcdN8BEOqCd97JIsDF6d3fG/LZyNm+UM415C/Se9bSG1O9i9AakGkk2SW
zqSjkhqeBK5DUfMqGlh6uxx+1CH46vTZhBKuYIe2uWrdia14d88qkQE88R4cRcir
dTJHUP90XOXXh3DrDSgEsfb341E5/IMVgUIIu6U72efNqZ8eMCbsXnp8Bnpervtw
sBPYC4aTzU5wBWbl8XDgm0KWx+nh06Ah5+PXW7gzkKZQpW8qOJp4ATcDkqHrMPBf
OFd8zhYKULBVRH8dsshzXdeVOurNwazJnEN/h72U1eZTa7ZxFL+CMShpY20d4t9I
ldh6EG//b/D7WjBIpbs3VBBQ6fnblwtq41GUS0TgndoDUD1wmBf3mJL9ZvEsYHwC
OmsTaAcsAkHBPdOZDpSyyGYddxXp7YNCT6ua3lNdRAbVn5Y0A/8=
=gDUI
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-net-next-2019-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Some relatively small changes:
* typo fixes in docs
* APIs for station separation using VLAN tags rather
than separate wifi netdevs
* some preparations for upcoming features (802.3 offload
and airtime queue limits (AQL)
* stack reduction in ieee80211_assoc_success()
* use DEFINE_DEBUGFS_ATTRIBUTE in hwsim
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This provides an alternative mechanism for AP VLAN support where a
single netdev is used with VLAN tagged frames instead of separate
netdevs for each VLAN without tagged frames from the WLAN driver.
By setting NL80211_EXT_FEATURE_VLAN_OFFLOAD flag the driver indicates
support for a single netdev with VLAN tagged frames. Separate
VLAN-specific netdevs can be added using RTM_NEWLINK/IFLA_VLAN_ID
similarly to Ethernet. NL80211_CMD_NEW_KEY (for group keys),
NL80211_CMD_NEW_STATION, and NL80211_CMD_SET_STATION will optionally
specify vlan_id using NL80211_ATTR_VLAN_ID.
Signed-off-by: Gurumoorthi Gnanasambandhan <gguru@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20191031214640.5012-1-jouni@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull on for-linus to resolve what otherwise would have been a conflict
with the cgroups rstat patchset from Tejun.
* for-linus: (942 commits)
blkcg: make blkcg_print_stat() print stats only for online blkgs
nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
nvme-rdma: fix a segmentation fault during module unload
iocost: don't nest spin_lock_irq in ioc_weight_write()
io_uring: ensure we clear io_kiocb->result before each issue
um-ubd: Entrust re-queue to the upper layers
nvme-multipath: remove unused groups_only mode in ana log
nvme-multipath: fix possible io hang after ctrl reconnect
io_uring: don't touch ctx in setup after ring fd install
io_uring: Fix leaked shadow_req
Linux 5.4-rc5
riscv: cleanup do_trap_break
nbd: verify socket is supported during setup
ata: libahci_platform: Fix regulator_get_optional() misuse
nbd: handle racing with error'ed out commands
nbd: protect cmd->status with cmd->lock
io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
io_uring: used cached copies of sq->dropped and cq->overflow
ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
...
Currently we drop completion events, if the CQ ring is full. That's fine
for requests with bounded completion times, but it may make it harder or
impossible to use io_uring with networked IO where request completion
times are generally unbounded. Or with POLL, for example, which is also
unbounded.
After this patch, we never overflow the ring, we simply store requests
in a backlog for later flushing. This flushing is done automatically by
the kernel. To prevent the backlog from growing indefinitely, if the
backlog is non-empty, we apply back pressure on IO submissions. Any
attempt to submit new IO with a non-empty backlog will get an -EBUSY
return from the kernel. This is a signal to the application that it has
backlogged CQ events, and that it must reap those before being allowed
to submit more IO.
Note that if we do return -EBUSY, we will have filled whatever
backlogged events into the CQ ring first, if there's room. This means
the application can safely reap events WITHOUT entering the kernel and
waiting for them, they are already available in the CQ ring.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
While we have support for generic timeouts, we don't have a way to tie
a timeout to a specific SQE. The generic timeouts simply trigger wakeups
on the CQ ring.
This adds support for IORING_OP_LINK_TIMEOUT. This command is only valid
as a link to a previous command. The timeout specific can be either
relative or absolute, following the same rules as IORING_OP_TIMEOUT. If
the timeout triggers before the dependent command completes, it will
attempt to cancel that command. Likewise, if the dependent command
completes before the timeout triggers, it will cancel the timeout.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Introduce three new ioctl commands BLKOPENZONE, BLKCLOSEZONE and
BLKFINISHZONE to allow applications to control the condition of zones
on a zoned block device through the execution of the REQ_OP_ZONE_OPEN,
REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH operations.
Contains contributions from Matias Bjorling, Hans Holmberg,
Dmitry Fomichev, Keith Busch, Damien Le Moal and Christoph Hellwig.
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Based on the code framework built on the last patch, to
support setting and dumping for vxlan, we only need to
add ip_tun_parse_opts_erspan() for .build_state and
ip_tun_fill_encap_opts_erspan() for .fill_encap and
if (tun_flags & TUNNEL_ERSPAN_OPT) for .get_encap_size.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on the code framework built on the last patch, to
support setting and dumping for vxlan, we only need to
add ip_tun_parse_opts_vxlan() for .build_state and
ip_tun_fill_encap_opts_vxlan() for .fill_encap and
if (tun_flags & TUNNEL_VXLAN_OPT) for .get_encap_size.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To add options setting and dumping, .build_state(), .fill_encap() and
.get_encap_size() in ip_tun_lwt_ops needs to be extended:
ip_tun_build_state():
ip_tun_parse_opts():
ip_tun_parse_opts_geneve()
ip_tun_fill_encap_info():
ip_tun_fill_encap_opts():
ip_tun_fill_encap_opts_geneve()
ip_tun_encap_nlsize()
ip_tun_opts_nlsize():
if (tun_flags & TUNNEL_GENEVE_OPT)
ip_tun_parse_opts(), ip_tun_fill_encap_opts() and ip_tun_opts_nlsize()
processes LWTUNNEL_IP_OPTS.
ip_tun_parse_opts_geneve(), ip_tun_fill_encap_opts_geneve() and
if (tun_flags & TUNNEL_GENEVE_OPT) processes LWTUNNEL_IP_OPTS_GENEVE.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inline encryption hardware compliant with the UFS v2.1 standard or with
the upcoming version of the eMMC standard has the following properties:
(1) Per I/O request, the encryption key is specified by a previously
loaded keyslot. There might be only a small number of keyslots.
(2) Per I/O request, the starting IV is specified by a 64-bit "data unit
number" (DUN). IV bits 64-127 are assumed to be 0. The hardware
automatically increments the DUN for each "data unit" of
configurable size in the request, e.g. for each filesystem block.
Property (1) makes it inefficient to use the traditional fscrypt
per-file keys. Property (2) precludes the use of the existing
DIRECT_KEY fscrypt policy flag, which needs at least 192 IV bits.
Therefore, add a new fscrypt policy flag IV_INO_LBLK_64 which causes the
encryption to modified as follows:
- The encryption keys are derived from the master key, encryption mode
number, and filesystem UUID.
- The IVs are chosen as (inode_number << 32) | file_logical_block_num.
For filenames encryption, file_logical_block_num is 0.
Since the file nonces aren't used in the key derivation, many files may
share the same encryption key. This is much more efficient on the
target hardware. Including the inode number in the IVs and mixing the
filesystem UUID into the keys ensures that data in different files is
nevertheless still encrypted differently.
Additionally, limiting the inode and block numbers to 32 bits and
placing the block number in the low bits maintains compatibility with
the 64-bit DUN convention (property (2) above).
Since this scheme assumes that inode numbers are stable (which may
preclude filesystem shrinking) and that inode and file logical block
numbers are at most 32-bit, IV_INO_LBLK_64 will only be allowed on
filesystems that meet these constraints. These are acceptable
limitations for the cases where this format would actually be used.
Note that IV_INO_LBLK_64 is an on-disk format, not an implementation.
This patch just adds support for it using the existing filesystem layer
encryption. A later patch will add support for inline encryption.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Now the kernel uses 64bit packet counters in scheduler layer,
we want to export these counters to user space.
Instead risking breaking user space by adding fields
to struct gnet_stats_basic, add a new TCA_STATS_PKT64.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gnet_stats_basic_packed was really meant to be private kernel structure.
If this proves to be a problem, we will have to rename the in-kernel
version.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing nvme_passthru_cmd64 to add a field: rsvd2. This field is an explicit
marker for the padding space added on certain platforms as a result of the
enlargement of the result field from 32 bit to 64 bits in size, and
fixes differences in struct size when using compat ioctl for 32-bit
binaries on 64-bit architecture.
Fixes: 65e68edce0 ("nvme: allow 64-bit results in passthru commands")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Charles Machalow <csm10495@gmail.com>
[changelog]
Signed-off-by: Keith Busch <kbusch@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXcGMxwAKCRCRxhvAZXjc
otEzAP9lHvP97TtRG9gP6dj6YovZ5Djdo3IscmkTqy5Nt8sVNQD+NWg1LZnSMFdJ
ExETgRVlsjF8q2sblswtn/8Ab53O6AM=
=RVGl
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2019-11-05' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull clone3 stack argument update from Christian Brauner:
"This changes clone3() to do basic stack validation and to set up the
stack depending on whether or not it is growing up or down.
With clone3() the expectation is now very simply that the .stack
argument points to the lowest address of the stack and that
.stack_size specifies the initial stack size. This is diferent from
legacy clone() where the "stack" argument had to point to the lowest
or highest address of the stack depending on the architecture.
clone3() was released with 5.3. Currently, it is not documented and
very unclear to userspace how the stack and stack_size argument have
to be passed. After talking to glibc folks we concluded that changing
clone3() to determine stack direction and doing basic validation is
the right course of action.
Note, this is a potentially user visible change. In the very unlikely
case, that it breaks someone's use-case we will revert. (And then e.g.
place the new behavior under an appropriate flag.)
Note that passing an empty stack will continue working just as before.
Breaking someone's use-case is very unlikely. Neither glibc nor musl
currently expose a wrapper for clone3(). There is currently also no
real motivation for anyone to use clone3() directly. First, because
using clone{3}() with stacks requires some assembly (see glibc and
musl). Second, because it does not provide features that legacy
clone() doesn't. New features for clone3() will first happen in v5.5
which is why v5.4 is still a good time to try and make that change now
and backport it to v5.3.
I did a codesearch on https://codesearch.debian.net, github, and
gitlab and could not find any software currently relying directly on
clone3(). I expect this to change once we land CLONE_CLEAR_SIGHAND
which was a request coming from glibc at which point they'll likely
start using it"
* tag 'for-linus-2019-11-05' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
clone3: validate stack arguments
Validate the stack arguments and setup the stack depening on whether or not
it is growing down or up.
Legacy clone() required userspace to know in which direction the stack is
growing and pass down the stack pointer appropriately. To make things more
confusing microblaze uses a variant of the clone() syscall selected by
CONFIG_CLONE_BACKWARDS3 that takes an additional stack_size argument.
IA64 has a separate clone2() syscall which also takes an additional
stack_size argument. Finally, parisc has a stack that is growing upwards.
Userspace therefore has a lot nasty code like the following:
#define __STACK_SIZE (8 * 1024 * 1024)
pid_t sys_clone(int (*fn)(void *), void *arg, int flags, int *pidfd)
{
pid_t ret;
void *stack;
stack = malloc(__STACK_SIZE);
if (!stack)
return -ENOMEM;
#ifdef __ia64__
ret = __clone2(fn, stack, __STACK_SIZE, flags | SIGCHLD, arg, pidfd);
#elif defined(__parisc__) /* stack grows up */
ret = clone(fn, stack, flags | SIGCHLD, arg, pidfd);
#else
ret = clone(fn, stack + __STACK_SIZE, flags | SIGCHLD, arg, pidfd);
#endif
return ret;
}
or even crazier variants such as [3].
With clone3() we have the ability to validate the stack. We can check that
when stack_size is passed, the stack pointer is valid and the other way
around. We can also check that the memory area userspace gave us is fine to
use via access_ok(). Furthermore, we probably should not require
userspace to know in which direction the stack is growing. It is easy
for us to do this in the kernel and I couldn't find the original
reasoning behind exposing this detail to userspace.
/* Intentional user visible API change */
clone3() was released with 5.3. Currently, it is not documented and very
unclear to userspace how the stack and stack_size argument have to be
passed. After talking to glibc folks we concluded that trying to change
clone3() to setup the stack instead of requiring userspace to do this is
the right course of action.
Note, that this is an explicit change in user visible behavior we introduce
with this patch. If it breaks someone's use-case we will revert! (And then
e.g. place the new behavior under an appropriate flag.)
Breaking someone's use-case is very unlikely though. First, neither glibc
nor musl currently expose a wrapper for clone3(). Second, there is no real
motivation for anyone to use clone3() directly since it does not provide
features that legacy clone doesn't. New features for clone3() will first
happen in v5.5 which is why v5.4 is still a good time to try and make that
change now and backport it to v5.3. Searches on [4] did not reveal any
packages calling clone3().
[1]: https://lore.kernel.org/r/CAG48ez3q=BeNcuVTKBN79kJui4vC6nw0Bfq6xc-i0neheT17TA@mail.gmail.com
[2]: https://lore.kernel.org/r/20191028172143.4vnnjpdljfnexaq5@wittgenstein
[3]: 5238e95759/src/basic/raw-clone.h (L31)
[4]: https://codesearch.debian.net
Fixes: 7f192e3cd3 ("fork: add clone3")
Cc: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 5.3
Cc: GNU C Library <libc-alpha@sourceware.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/r/20191031113608.20713-1-christian.brauner@ubuntu.com
The "GPL-2.0" license identifier changed to "GPL-2.0-only" in SPDX v3.0.
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The net,iface equal functions currently compares the full interface
names. In several cases, wildcard (or prefix) matching is useful. For
example, when converting a large iptables rule-set to make use of ipset,
I was able to significantly reduce the number of set elements by making
use of wildcard matching.
Wildcard matching is enabled by adding "wildcard" when adding an element
to a set. Internally, this causes the IPSET_FLAG_IFACE_WILDCARD-flag to
be set. When this flag is set, only the initial part of the interface
name is used for comparison.
Wildcard matching is done per element and not per set, as there are many
cases where mixing wildcard and non-wildcard elements are useful. This
means that is up to the user to handle (avoid) overlapping interface
names.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
This feature gives the user RW access to any opal table with admin1
authority. The flags described in the new structure determines if the user
wants to read/write the data. Flags are checked for valid values in
order to allow future features to be added to the ioctl.
The user can provide the desired table's UID. Also, the ioctl provides a
size and offset field and internally will loop data accesses to return
the full data block. Read overrun is prevented by the initiator's
sec_send_recv() backend. The ioctl provides a private field with the
intention to accommodate any future expansions to the ioctl.
Reviewed-by: Scott Bauer <sbauer@plzdonthack.me>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Revanth Rajashekar <revanth.rajashekar@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-11-02
The following pull-request contains BPF updates for your *net-next* tree.
We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 41 files changed, 1864 insertions(+), 474 deletions(-).
The main changes are:
1) Fix long standing user vs kernel access issue by introducing
bpf_probe_read_user() and bpf_probe_read_kernel() helpers, from Daniel.
2) Accelerated xskmap lookup, from Björn and Maciej.
3) Support for automatic map pinning in libbpf, from Toke.
4) Cleanup of BTF-enabled raw tracepoints, from Alexei.
5) Various fixes to libbpf and selftests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.
The rest were (relatively) trivial in nature.
Signed-off-by: David S. Miller <davem@davemloft.net>
The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.
However, plain use of probe_kernel_read() for both cases will attempt to
always access kernel space address space given access is performed under
KERNEL_DS and some archs in-fact have overlapping address spaces where a
kernel pointer and user pointer would have the /same/ address value and
therefore accessing application memory via bpf_probe_read{,_str}() would
read garbage values.
Lets fix BPF side by making use of recently added 3d7081822f ("uaccess:
Add non-pagefault user-space read functions"). Unfortunately, the only way
to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
helpers are kept as-is to retain their current behavior.
The two *_user() variants attempt the access always under USER_DS set, the
two *_kernel() variants will -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, also avoiding
throwing the kernel warning via 00c42373d3 ("x86-64: add warning for
non-canonical user access address dereferences").
Fixes: a5e8c07059 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32 ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
This adds support for IORING_OP_ASYNC_CANCEL, which will attempt to
cancel requests that have been punted to async context and are now
in-flight. This works for regular read/write requests to files, as
long as they haven't been started yet. For socket based IO (or things
like accept4(2)), we can cancel work that is already running as well.
To cancel a request, the sqe must have ->addr set to the user_data of
the request it wishes to cancel. If the request is cancelled
successfully, the original request is completed with -ECANCELED
and the cancel request is completed with a result of 0. If the
request was already running, the original may or may not complete
in error. The cancel request will complete with -EALREADY for that
case. And finally, if the request to cancel wasn't found, the cancel
request is completed with -ENOENT.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The bpf program type raw_tp together with 'expected_attach_type'
was the most appropriate api to indicate BTF-enabled raw_tp programs.
But during development it became apparent that 'expected_attach_type'
cannot be used and new 'attach_btf_id' field had to be introduced.
Which means that the information is duplicated in two fields where
one of them is ignored.
Clean it up by introducing new program type where both
'expected_attach_type' and 'attach_btf_id' fields have
specific meaning.
In the future 'expected_attach_type' will be extended
with other attach points that have similar semantics to raw_tp.
This patch is replacing BTF-enabled BPF_PROG_TYPE_RAW_TRACEPOINT with
prog_type = BPF_RPOG_TYPE_TRACING
expected_attach_type = BPF_TRACE_RAW_TP
attach_btf_id = btf_id of raw tracepoint inside the kernel
Future patches will add
expected_attach_type = BPF_TRACE_FENTRY or BPF_TRACE_FEXIT
where programs have the same input context and the same helpers,
but different attach points.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-2-ast@kernel.org
Extend TCA_ACT space with nla_bitfield32 flags. Add
TCA_ACT_FLAGS_NO_PERCPU_STATS as the only allowed flag. Parse the flags in
tcf_action_init_1() and pass resulting value as additional argument to
a_o->init().
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit a69b0e855d.
This patchset doesn't meet the UAPI requirements set out in [1] for the DRM
subsystem. Once the userspace component is reviewed and ready for merge
we can try again.
[1]- https://01.org/linuxgraphics/gfx-docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
Fixes: a69b0e855d ("dma-buf: Add dma-buf heaps framework")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Vincent Donnefort <Vincent.Donnefort@arm.com>
Cc: Sudipto Paul <Sudipto.Paul@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Brian Starkey <brian.starkey@arm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: "Andrew F. Davis" <afd@ti.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030203003.101156-6-sean@poorly.run
We introduce a feature that works like a combination of TCP_NAGLE and
TCP_CORK, but without some of the weaknesses of those. In particular,
we will not observe long delivery delays because of delayed acks, since
the algorithm itself decides if and when acks are to be sent from the
receiving peer.
- The nagle property as such is determined by manipulating a new
'maxnagle' field in struct tipc_sock. If certain conditions are met,
'maxnagle' will define max size of the messages which can be bundled.
If it is set to zero no messages are ever bundled, implying that the
nagle property is disabled.
- A socket with the nagle property enabled enters nagle mode when more
than 4 messages have been sent out without receiving any data message
from the peer.
- A socket leaves nagle mode whenever it receives a data message from
the peer.
In nagle mode, messages smaller than 'maxnagle' are accumulated in the
socket write queue. The last buffer in the queue is marked with a new
'ack_required' bit, which forces the receiving peer to send a CONN_ACK
message back to the sender upon reception.
The accumulated contents of the write queue is transmitted when one of
the following events or conditions occur.
- A CONN_ACK message is received from the peer.
- A data message is received from the peer.
- A SOCK_WAKEUP pseudo message is received from the link level.
- The write queue contains more than 64 1k blocks of data.
- The connection is being shut down.
- There is no CONN_ACK message to expect. I.e., there is currently
no outstanding message where the 'ack_required' bit was set. As a
consequence, the first message added after we enter nagle mode
is always sent directly with this bit set.
This new feature gives a 50-100% improvement of throughput for small
(i.e., less than MTU size) messages, while it might add up to one RTT
to latency time when the socket is in nagle mode.
Acked-by: Ying Xue <ying.xue@windreiver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows an application to call accept4() in an async fashion. Like
other opcodes, we first try a non-blocking accept, then punt to async
context if we have to.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXbgx9wAKCRDh3BK/laaZ
PIjMAQCTrWPf6pLHSoq+Pll7b8swu1m2mW97fj5k8coED1/DSQEA4222PkIdMmhu
qyHnoX0fTtYXg6NnHbDVWsNL4uG8YAM=
=RE9K
-----END PGP SIGNATURE-----
Merge tag 'fuse-fixes-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"Mostly virtiofs fixes, but also fixes a regression and couple of
longstanding data/metadata writeback ordering issues"
* tag 'fuse-fixes-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: redundant get_fuse_inode() calls in fuse_writepages_fill()
fuse: Add changelog entries for protocols 7.1 - 7.8
fuse: truncate pending writes on O_TRUNC
fuse: flush dirty data/metadata before non-truncate setattr
virtiofs: Remove set but not used variable 'fc'
virtiofs: Retry request submission from worker context
virtiofs: Count pending forgets as in_flight forgets
virtiofs: Set FR_SENT flag only after request has been sent
virtiofs: No need to check fpq->connected state
virtiofs: Do not end request in submission context
fuse: don't advise readdirplus for negative lookup
fuse: don't dereference req->args on finished request
virtio-fs: don't show mount options
virtio-fs: Change module name to virtiofs.ko
We might have cases where the need for a specific timeout is gone, add
support for canceling an existing timeout operation. This works like the
POLL_REMOVE command, where the application passes in the user_data of
the timeout it wishes to cancel in the sqe->addr field.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is a pretty trivial addition on top of the relative timeouts
we have now, but it's handy for ensuring tighter timing for those
that are building scheduling primitives on top of io_uring.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently size the CQ ring as twice the SQ ring, to allow some
flexibility in not overflowing the CQ ring. This is done because the
SQE life time is different than that of the IO request itself, the SQE
is consumed as soon as the kernel has seen the entry.
Certain application don't need a huge SQ ring size, since they just
submit IO in batches. But they may have a lot of requests pending, and
hence need a big CQ ring to hold them all. By allowing the application
to control the CQ ring size multiplier, we can cater to those
applications more efficiently.
If an application wants to define its own CQ ring size, it must set
IORING_SETUP_CQSIZE in the setup flags, and fill out
io_uring_params->cq_entries. The value must be a power of two.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Allows the application to remove/replace/add files to/from a file set.
Passes in a struct:
struct io_uring_files_update {
__u32 offset;
__s32 *fds;
};
that holds an array of fds, size of array passed in through the usual
nr_args part of the io_uring_register() system call. The logic is as
follows:
1) If ->fds[i] is -1, the existing file at i + ->offset is removed from
the set.
2) If ->fds[i] is a valid fd, the existing file at i + ->offset is
replaced with ->fds[i].
For case #2, is the existing file is currently empty (fd == -1), the
new fd is simply added to the array.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix various misspellings of "configuration" and "configure".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch from BIT(0) to (1UL << 0).
First, there are already two different forms used in the header, so there's
no need to add a third. Second, the BIT() macros is kernel internal and
afaict not actually exposed to userspace. Maybe there's some magic there
I'm missing but it definitely causes issues when compiling a program that
tries to use SECCOMP_USER_NOTIF_FLAG_CONTINUE. It currently fails in the
following way:
# github.com/lxc/lxd/lxd
/usr/bin/ld: $WORK/b001/_x003.o: in function
`__do_user_notification_continue':
lxd/main_checkfeature.go:240: undefined reference to `BIT'
collect2: error: ld returned 1 exit status
Switching to (1UL << 0) should prevent that and is more in line what is
already done in the rest of the header.
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191024212539.4059-1-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-10-27
The following pull-request contains BPF updates for your *net-next* tree.
We've added 52 non-merge commits during the last 11 day(s) which contain
a total of 65 files changed, 2604 insertions(+), 1100 deletions(-).
The main changes are:
1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF
assembly code. The work here teaches BPF verifier to recognize
kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints
such that verifier allows direct use of bpf_skb_event_output() helper
used in tc BPF et al (w/o probing memory access) that dumps skb data
into perf ring buffer. Also add direct loads to probe memory in order
to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.
2) Big batch of changes to improve libbpf and BPF kselftests. Besides
others: generalization of libbpf's CO-RE relocation support to now
also include field existence relocations, revamp the BPF kselftest
Makefile to add test runner concept allowing to exercise various
ways to build BPF programs, and teach bpf_object__open() and friends
to automatically derive BPF program type/expected attach type from
section names to ease their use, from Andrii Nakryiko.
3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.
4) Allow to read BTF as raw data from bpftool. Most notable use case
is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.
5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which
manages to improve "rx_drop" performance by ~4%., from Björn Töpel.
6) Fix to restore the flow dissector after reattach BPF test and also
fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.
7) Improve verifier's BTF ctx access for use outside of raw_tp, from
Martin KaFai Lau.
8) Improve documentation for AF_XDP with new sections and to reflect
latest features, from Magnus Karlsson.
9) Add back 'version' section parsing to libbpf for old kernels, from
John Fastabend.
10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(),
from KP Singh.
11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order
to improve insn coverage for built BPF progs, from Yonghong Song.
12) Misc minor cleanups and fixes, from various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next,
more specifically:
* Updates for ipset:
1) Coding style fix for ipset comment extension, from Jeremy Sowden.
2) De-inline many functions in ipset, from Jeremy Sowden.
3) Move ipset function definition from header to source file.
4) Move ip_set_put_flags() to source, export it as a symbol, remove
inline.
5) Move range_to_mask() to the source file where this is used.
6) Move ip_set_get_ip_port() to the source file where this is used.
* IPVS selftests and netns improvements:
7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.
8) Three patches to add selftest script for ipvs, also from
Haishuang Yan.
* Conntrack updates and new nf_hook_slow_list() function:
9) Document ct ecache extension, from Florian Westphal.
10) Skip ct extensions from ctnetlink dump, from Florian.
11) Free ct extension immediately, from Florian.
12) Skip access to ecache extension from nf_ct_deliver_cached_events()
this is not correct as reported by Syzbot.
13) Add and use nf_hook_slow_list(), from Florian.
* Flowtable infrastructure updates:
14) Move priority to nf_flowtable definition.
15) Dynamic allocation of per-device hooks in flowtables.
16) Allow to include netdevice only once in flowtable definitions.
17) Rise maximum number of devices per flowtable.
* Netfilter hardware offload infrastructure updates:
18) Add nft_flow_block_chain() helper function.
19) Pass callback list to nft_setup_cb_call().
20) Add nft_flow_cls_offload_setup() helper function.
21) Remove rules for the unregistered device via netdevice event.
22) Support for multiple devices in a basechain definition at the
ingress hook.
22) Add nft_chain_offload_cmd() helper function.
23) Add nft_flow_block_offload_init() helper function.
24) Rewind in case of failing to bind multiple devices to hook.
25) Typo in IPv6 tproxy module description, from Norman Rasmussen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
or not data-in-SYN was ack'd on both the client and server side. We'd like
to gather more information on the client-side in the failure case in order
to indicate the reason for the failure. This can be useful for not only
debugging TFO, but also for creating TFO socket policies. For example, if
a middle box removes the TFO option or drops a data-in-SYN, we can
can detect this case, and turn off TFO for these connections saving the
extra retransmits.
The newly added tcpi_fastopen_client_fail status is 2 bits and has the
following 4 states:
1) TFO_STATUS_UNSPEC
Catch-all state which includes when TFO is disabled via black hole
detection, which is indicated via LINUX_MIB_TCPFASTOPENBLACKHOLE.
2) TFO_COOKIE_UNAVAILABLE
If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
is available in the cache.
3) TFO_DATA_NOT_ACKED
Data was sent with SYN, we received a SYN/ACK but it did not cover the data
portion. Cookie is not accepted by server because the cookie may be invalid
or the server may be overloaded.
4) TFO_SYN_RETRANSMITTED
Data was sent with SYN, we received a SYN/ACK which did not cover the data
after at least 1 additional SYN was sent (without data). It may be the case
that a middle-box is dropping data-in-SYN packets. Thus, it would be more
efficient to not use TFO on this connection to avoid extra retransmits
during connection establishment.
These new fields do not cover all the cases where TFO may fail, but other
failures, such as SYN/ACK + data being dropped, will result in the
connection not becoming established. And a connection blackhole after
session establishment shows up as a stalled connection.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to commit message in the original commit c75b1d9421 ("fs:
add fcntl() interface for setting/getting write life time hints"),
as well as userspace library[1] and man page update[2], R/W hint constants
are intended to have RWH_* prefix. However, RWF_WRITE_LIFE_NOT_SET retained
"RWF_*" prefix used in the early versions of the proposed patch set[3].
Rename it and provide the old name as a synonym for the new one
for backward compatibility.
[1] https://github.com/axboe/fio/commit/bd553af6c849
[2] https://github.com/mkerrisk/man-pages/commit/580082a186fd
[3] https://www.mail-archive.com/linux-block@vger.kernel.org/msg09638.html
Fixes: c75b1d9421 ("fs: add fcntl() interface for setting/getting write life time hints")
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
SEV INIT command loads the SEV related persistent data from NVS
and initializes the platform context. The firmware validates the
persistent state. If validation fails, the firmware will reset
the persisent state and return an integrity check failure status.
At this point, a subsequent INIT command should succeed, so retry
the command. The INIT command retry is only done during driver
initialization.
Additional enums along with SEV_RET_SECURE_DATA_INVALID are added
to sev_ret_code to maintain continuity and relevance of enum values.
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This framework allows a unified userspace interface for dma-buf
exporters, allowing userland to allocate specific types of memory
for use in dma-buf sharing.
Each heap is given its own device node, which a user can allocate
a dma-buf fd from using the DMA_HEAP_IOC_ALLOC.
This code is an evoluiton of the Android ION implementation,
and a big thanks is due to its authors/maintainers over time
for their effort:
Rebecca Schultz Zavin, Colin Cross, Benjamin Gaignard,
Laura Abbott, and many other contributors!
Cc: Laura Abbott <labbott@redhat.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Vincent Donnefort <Vincent.Donnefort@arm.com>
Cc: Sudipto Paul <Sudipto.Paul@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Tested-by: Ayan Kumar Halder <ayan.halder@arm.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021190310.85221-2-john.stultz@linaro.org
Add new metadata format to support metadata output in vivid.
Signed-off-by: Vandana BN <bnvandana@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
The ppp_idle structure is defined in terms of __kernel_time_t, which is
defined as 'long' on all architectures, and this usage is not affected
by the y2038 problem since it transports a time interval rather than an
absolute time.
However, the ppp user space defines the same structure as time_t, which
may be 64-bit wide on new libc versions even on 32-bit architectures.
It's easy enough to just handle both possible structure layouts on
all architectures, to deal with the possibility that a user space ppp
implementation comes with its own ppp_idle structure definition, as well
as to document the fact that the driver is y2038-safe.
Doing this also avoids the need for a special compat mode translation,
since 32-bit and 64-bit kernels now support the same interfaces. The old
32-bit structure is also available on native 64-bit architectures now,
but this is harmless.
Cc: netdev@vger.kernel.org
Cc: linux-ppp@vger.kernel.org
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch allows you to register one netdev basechain to multiple
devices. This adds a new NFTA_HOOK_DEVS netlink attribute to specify
the list of netdevices. Basechains store a list of hooks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reset all signal handlers of the child not set to SIG_IGN to SIG_DFL.
Mutually exclusive with CLONE_SIGHAND to not disturb other thread's
signal handler.
In the spirit of closer cooperation between glibc developers and kernel
developers (cf. [2]) this patchset came out of a discussion on the glibc
mailing list for improving posix_spawn() (cf. [1], [3], [4]). Kernel
support for this feature has been explicitly requested by glibc and I
see no reason not to help them with this.
The child helper process on Linux posix_spawn must ensure that no signal
handlers are enabled, so the signal disposition must be either SIG_DFL
or SIG_IGN. However, it requires a sigprocmask to obtain the current
signal mask and at least _NSIG sigaction calls to reset the signal
handlers for each posix_spawn call or complex state tracking that might
lead to data corruption in glibc. Adding this flags lets glibc avoid
these problems.
[1]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00149.html
[3]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00158.html
[4]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00160.html
[2]: https://lwn.net/Articles/799331/
'[...] by asking for better cooperation with the C-library projects
in general. They should be copied on patches containing ABI
changes, for example. I noted that there are often times where
C-library developers wish the kernel community had done things
differently; how could those be avoided in the future? Members of
the audience suggested that more glibc developers should perhaps
join the linux-api list. The other suggestion was to "copy Florian
on everything".'
Cc: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20191014104538.3096-1-christian.brauner@ubuntu.com
Allow user space to inform the KVM host where in the physical memory
map the paravirtualized time structures should be located.
User space can set an attribute on the VCPU providing the IPA base
address of the stolen time structure for that VCPU. This must be
repeated for every VCPU in the VM.
The address is given in terms of the physical address visible to
the guest and must be 64 byte aligned. The guest will discover the
address via a hypercall.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
In some scenarios, such as buggy guest or incorrect configuration of the
VMM and firmware description data, userspace will detect a memory access
to a portion of the IPA, which is not mapped to any MMIO region.
For this purpose, the appropriate action is to inject an external abort
to the guest. The kernel already has functionality to inject an
external abort, but we need to wire up a signal from user space that
lets user space tell the kernel to do this.
It turns out, we already have the set event functionality which we can
perfectly reuse for this.
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
For a long time, if a guest accessed memory outside of a memslot using
any of the load/store instructions in the architecture which doesn't
supply decoding information in the ESR_EL2 (the ISV bit is not set), the
kernel would print the following message and terminate the VM as a
result of returning -ENOSYS to userspace:
load/store instruction decoding not implemented
The reason behind this message is that KVM assumes that all accesses
outside a memslot is an MMIO access which should be handled by
userspace, and we originally expected to eventually implement some sort
of decoding of load/store instructions where the ISV bit was not set.
However, it turns out that many of the instructions which don't provide
decoding information on abort are not safe to use for MMIO accesses, and
the remaining few that would potentially make sense to use on MMIO
accesses, such as those with register writeback, are not used in
practice. It also turns out that fetching an instruction from guest
memory can be a pretty horrible affair, involving stopping all CPUs on
SMP systems, handling multiple corner cases of address translation in
software, and more. It doesn't appear likely that we'll ever implement
this in the kernel.
What is much more common is that a user has misconfigured his/her guest
and is actually not accessing an MMIO region, but just hitting some
random hole in the IPA space. In this scenario, the error message above
is almost misleading and has led to a great deal of confusion over the
years.
It is, nevertheless, ABI to userspace, and we therefore need to
introduce a new capability that userspace explicitly enables to change
behavior.
This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV)
which does exactly that, and introduces a new exit reason to report the
event to userspace. User space can then emulate an exception to the
guest, restart the guest, suspend the guest, or take any other
appropriate action as per the policy of the running system.
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
This patch adds support for the V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
flag.
It also adds a new V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF
capability.
Drivers should set vb2_queue->subsystem_flags to
VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF to indicate support
for this flag.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
When calling the KVM_SET_GUEST_DEBUG ioctl, userspace might request
the next instruction to be single stepped via the
KVM_GUESTDBG_SINGLESTEP control bit of the kvm_guest_debug structure.
This patch adds the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability in order
to inform userspace about the state of single stepping support.
We currently don't have support for guest single stepping implemented
in Book3S HV so the capability is only present for Book3S PR and
BookE.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Introduce new helper that reuses existing skb perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct sk_buff *' as tracepoint argument or
can walk other kernel data structures to skb pointer.
In order to do that teach verifier to resolve true C types
of bpf helpers into in-kernel BTF ids.
The type of kernel pointer passed by raw tracepoint into bpf
program will be tracked by the verifier all the way until
it's passed into helper function.
For example:
kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
bpf programs receives that skb pointer and may eventually
pass it into bpf_skb_output() bpf helper which in-kernel is
implemented via bpf_skb_event_output() kernel function.
Its first argument in the kernel is 'struct sk_buff *'.
The verifier makes sure that types match all the way.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
Add attach_btf_id attribute to prog_load command.
It's similar to existing expected_attach_type attribute which is
used in several cgroup based program types.
Unfortunately expected_attach_type is ignored for
tracing programs and cannot be reused for new purpose.
Hence introduce attach_btf_id to verify bpf programs against
given in-kernel BTF type id at load time.
It is strictly checked to be valid for raw_tp programs only.
In a later patches it will become:
btf_id == 0 semantics of existing raw_tp progs.
btd_id > 0 raw_tp with BTF and additional type safety.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-5-ast@kernel.org
Add support for 400Gbps speed, link modes of 50Gbps per lane
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guest shared virtual address (SVA) may require host to shadow guest
PASID tables. Guest PASID can also be allocated from the host via
enlightened interfaces. In this case, guest needs to bind the guest
mm, i.e. cr3 in guest physical address to the actual PASID table in
the host IOMMU. Nesting will be turned on such that guest virtual
address can go through a two level translation:
- 1st level translates GVA to GPA
- 2nd level translates GPA to HPA
This patch introduces APIs to bind guest PASID data to the assigned
device entry in the physical IOMMU. See the diagram below for usage
explanation.
.-------------. .---------------------------.
| vIOMMU | | Guest process mm, FL only |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush -
'-------------' |
| | V
| | GP
'-------------'
Guest
------| Shadow |----------------------- GP->HP* ---------
v v |
Host v
.-------------. .----------------------.
| pIOMMU | | Bind FL for GVA-GPA |
| | '----------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.---------------------.
| | |Set SL to GPA-HPA |
| | '---------------------'
'-------------'
Where:
- FL = First level/stage one page tables
- SL = Second level/stage two page tables
- GP = Guest PASID
- HP = Host PASID
* Conversion needed if non-identity GP-HP mapping option is chosen.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In any virtualization use case, when the first translation stage
is "owned" by the guest OS, the host IOMMU driver has no knowledge
of caching structure updates unless the guest invalidation activities
are trapped by the virtualizer and passed down to the host.
Since the invalidation data can be obtained from user space and will be
written into physical IOMMU, we must allow security check at various
layers. Therefore, generic invalidation data format are proposed here,
model specific IOMMU drivers need to convert them into their own format.
Signed-off-by: Yi L Liu <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-10-14
The following pull-request contains BPF updates for your *net-next* tree.
12 days of development and
85 files changed, 1889 insertions(+), 1020 deletions(-)
The main changes are:
1) auto-generation of bpf_helper_defs.h, from Andrii.
2) split of bpf_helpers.h into bpf_{helpers, helper_defs, endian, tracing}.h
and move into libbpf, from Andrii.
3) Track contents of read-only maps as scalars in the verifier, from Andrii.
4) small x86 JIT optimization, from Daniel.
5) cross compilation support, from Ivan.
6) bpf flow_dissector enhancements, from Jakub and Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Code that iterates over all standard PCI BARs typically uses
PCI_STD_RESOURCE_END. However, that requires the unusual test
"i <= PCI_STD_RESOURCE_END" rather than something the typical
"i < PCI_STD_NUM_BARS".
Add a definition for PCI_STD_NUM_BARS and change loops to use the more
idiomatic C style to help avoid fencepost errors.
Link: https://lore.kernel.org/r/20190927234026.23342-1-efremov@linux.com
Link: https://lore.kernel.org/r/20190927234308.23935-1-efremov@linux.com
Link: https://lore.kernel.org/r/20190916204158.6889-3-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com> # arch/s390/
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> # video/fbdev/
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> # pci/controller/dwc/
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> # scsi/pm8001/
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi/pm8001/
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # memstick/
* minstrel improvements from Felix
* a TX aggregation simplification
* some additional capabilities for hwsim
* minor cleanups & docs updates
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl2gQp8ACgkQB8qZga/f
l8T5kA//Yo07t93OespsNwJZXWx7l/WWBtIydnTAk9hNXDV4kd6oLgN0oadqpx7g
5bryCRqmS4vx2IjiEQiSK08JqmhpruQSXYe8oixVVCOknw2pfkK6lr+OHCqJO11A
iWu5Nz7bTl4pfSO0XIfWk1STUjNuXWCTUgbLSwU4oaoqk8oib2VeV1QdXX0hvgXF
gSlToWQqliI/c6HS69iUJGRqXZCMO7GPWE9Sqj8cvmeAFXWQz9zcan6Fcd2XJyLq
qJxNbxGD0JQ6vdbg2bFnio8PlwYMJ7ohrRDds8euYzViVtyTVZ6WtD9/gKB6UGVe
RS5NEsmZLISCrQbV8nK/q0G/mBdNNegj4ezUkWxMvuYDEvl83Xniyz5CoAC++9mp
0M0//+NgwoVqDvaoV0s+TZBYv5arJyeUCY9kkmPCFFVV6cvmXfRFpn9yU95he2Eb
duY5P+uKNlFU+sYVh1d6QC26mEAIa0y4qZszp3HurVWXe/aG/fLumW2USAOdqDOw
9HF9vOqGc3FRZTX1l15F+5nPn9gMyMJJGqOeT4oS1mQJT/KdzQCGLmhQ+IR+00Un
zF6QsfCCtbuO5xLErqoARa7qKzddDxgkEBbdmQmjUwdyzAxSxZxGDBLLcpZ0OQwo
Kxx7ELz97f55unLbByDrFMoZvEXaCeGcbZeTJWGvDRElw/BhRJU=
=IJ7D
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-net-next-2019-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
A few more small things, nothing really stands out:
* minstrel improvements from Felix
* a TX aggregation simplification
* some additional capabilities for hwsim
* minor cleanups & docs updates
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are some small tty and serial driver fixes for 5.4-rc3 that resolve
a number of reported issues and regressions.
None of these are huge, full details are in the shortlog. THere's also
a MAINTAINERS update that I think you might have already taken in your
tree already, but git should handle that merge easily.
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXaH51A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykeBQCeMWoqZHpHTXiadziVgtHrjnj5lL8An2Y+3/Wh
ZbNxvHnUK/VUJQJ2Szsm
=e84r
-----END PGP SIGNATURE-----
Merge tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 5.4-rc3 that
resolve a number of reported issues and regressions.
None of these are huge, full details are in the shortlog. There's also
a MAINTAINERS update that I think you might have already taken in your
tree already, but git should handle that merge easily.
All have been in linux-next with no reported issues"
* tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
MAINTAINERS: kgdb: Add myself as a reviewer for kgdb/kdb
tty: serial: imx: Use platform_get_irq_optional() for optional IRQs
serial: fix kernel-doc warning in comments
serial: 8250_omap: Fix gpio check for auto RTS/CTS
serial: mctrl_gpio: Check for NULL pointer
tty: serial: fsl_lpuart: Fix lpuart_flush_buffer()
tty: serial: Fix PORT_LINFLEXUART definition
tty: n_hdlc: fix build on SPARC
serial: uartps: Fix uartps_major handling
serial: uartlite: fix exit path null pointer
tty: serial: linflexuart: Fix magic SysRq handling
serial: sh-sci: Use platform_get_irq_optional() for optional interrupts
dt-bindings: serial: sh-sci: Document r8a774b1 bindings
serial/sifive: select SERIAL_EARLYCON
tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
This allows the seccomp notifier to continue a syscall. A positive
discussion about this feature was triggered by a post to the
ksummit-discuss mailing list (cf. [3]) and took place during KSummit
(cf. [1]) and again at the containers/checkpoint-restore
micro-conference at Linux Plumbers.
Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
which enables a process (watchee) to retrieve an fd for its seccomp
filter. This fd can then be handed to another (usually more privileged)
process (watcher). The watcher will then be able to receive seccomp
messages about the syscalls having been performed by the watchee.
This feature is heavily used in some userspace workloads. For example,
it is currently used to intercept mknod() syscalls in user namespaces
aka in containers.
The mknod() syscall can be easily filtered based on dev_t. This allows
us to only intercept a very specific subset of mknod() syscalls.
Furthermore, mknod() is not possible in user namespaces toto coelo and
so intercepting and denying syscalls that are not in the whitelist on
accident is not a big deal. The watchee won't notice a difference.
In contrast to mknod(), a lot of other syscall we intercept (e.g.
setxattr()) cannot be easily filtered like mknod() because they have
pointer arguments. Additionally, some of them might actually succeed in
user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
currently cannot tell seccomp to continue from a user notifier we are
stuck with performing all of the syscalls in lieu of the container. This
is a huge security liability since it is extremely difficult to
correctly assume all of the necessary privileges of the calling task
such that the syscall can be successfully emulated without escaping
other additional security restrictions (think missing CAP_MKNOD for
mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
telling seccomp to resume the syscall.
One thing that came up in the discussion was the problem that another
thread could change the memory after userspace has decided to let the
syscall continue which is a well known TOCTOU with seccomp which is
present in other ways already.
The discussion showed that this feature is already very useful for any
syscall without pointer arguments. For any accidentally intercepted
non-pointer syscall it is safe to continue.
For syscalls with pointer arguments there is a race but for any cautious
userspace and the main usec cases the race doesn't matter. The notifier
is intended to be used in a scenario where a more privileged watcher
supervises the syscalls of lesser privileged watchee to allow it to get
around kernel-enforced limitations by performing the syscall for it
whenever deemed save by the watcher. Hence, if a user tricks the watcher
into allowing a syscall they will either get a deny based on
kernel-enforced restrictions later or they will have changed the
arguments in such a way that they manage to perform a syscall with
arguments that they would've been allowed to do anyway.
In general, it is good to point out again, that the notifier fd was not
intended to allow userspace to implement a security policy but rather to
work around kernel security mechanisms in cases where the watcher knows
that a given action is safe to perform.
/* References */
[1]: https://linuxplumbersconf.org/event/4/contributions/560
[2]: https://linuxplumbersconf.org/event/4/contributions/477
[3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
[4]: commit 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
CC: Tyler Hicks <tyhicks@canonical.com>
Link: https://lore.kernel.org/r/20190920083007.11475-2-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
This control returns the unit cell size in nanometres. The struct provides
the width and the height in separated fields to take into consideration
asymmetric pixels and/or hardware binning.
This control is required for automatic calibration of sensors/cameras.
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Ricardo Ribalda Delgado <ribalda@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
This type contains the width and the height of a rectangular area.
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Signed-off-by: Ricardo Ribalda Delgado <ribalda@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
This patch is to add a new event SCTP_SEND_FAILED_EVENT described in
rfc6458#section-6.1.11. It's a update of SCTP_SEND_FAILED event:
struct sctp_sndrcvinfo ssf_info is replaced with
struct sctp_sndinfo ssfe_info in struct sctp_send_failed_event.
SCTP_SEND_FAILED is being deprecated, but we don't remove it in this
patch. Both are being processed in sctp_datamsg_destroy() when the
corresp event flag is set.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When the new CEC_OP_UI_CMD defines were added I forgot to update this
header to use these new defines. This is now fixed.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
The CEC_MSG_GIVE_DECK_STATUS and CEC_MSG_GIVE_TUNER_DEVICE_STATUS commands
both have a status_req argument: ON, OFF, ONCE. If ON or ONCE, then the
follower will reply with a STATUS message. Either once or whenever the
status changes (status_req == ON).
If status_req == OFF, then it will stop sending continuous status updates,
but the follower will *not* send a STATUS message in that case.
This means that if status_req == OFF, then msg->reply should be 0 as well
since no reply is expected in that case.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Various small fixes to BPF helper documentation comments, enabling
automatic header generation with a list of BPF helpers.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a statistic for number of RX resyncs sent down to the NIC.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a statistic for TLS record decryption errors.
Since devices are supposed to pass records as-is when they
encounter errors this statistic will count bad records in
both pure software and inline crypto configurations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add SNMP stats for number of sockets with successfully
installed sessions. Break them down to software and
hardware ones. Note that if hardware offload fails
stack uses software implementation, and counts the
session appropriately.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a skeleton structure for adding TLS statistics.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add casts to fix these warnings:
./usr/include/linux/netfilter_arp/arp_tables.h:200:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_bridge/ebtables.h:197:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_ipv4/ip_tables.h:223:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_ipv6/ip6_tables.h:263:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/tipc_config.h:310:28: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/tipc_config.h:410:24: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/virtio_ring.h:170:16: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
Those are theoretical probably but kernel doesn't control compiler flags
in userspace.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All devlink instances are created in init_net and stay there for a
lifetime. Allow user to be able to move devlink instances into
namespaces during devlink reload operation. That ensures proper
re-instantiation of driver objects, including netdevices.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXZZIgQAKCRCRxhvAZXjc
orNOAP98B2nmoxvq8d5Z6PhoyTBC5NIUuJ5h2YMwcX/hAaj5uQEA58NTKtPmOPDR
2ffUFFerGZ2+brlHgACa0ZKdH27TjAA=
=QryD
-----END PGP SIGNATURE-----
Merge tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull copy_struct_from_user() helper from Christian Brauner:
"This contains the copy_struct_from_user() helper which got split out
from the openat2() patchset. It is a generic interface designed to
copy a struct from userspace.
The helper will be especially useful for structs versioned by size of
which we have quite a few. This allows for backwards compatibility,
i.e. an extended struct can be passed to an older kernel, or a legacy
struct can be passed to a newer kernel. For the first case (extended
struct, older kernel) the new fields in an extended struct can be set
to zero and the struct safely passed to an older kernel.
The most obvious benefit is that this helper lets us get rid of
duplicate code present in at least sched_setattr(), perf_event_open(),
and clone3(). More importantly it will also help to ensure that users
implementing versioning-by-size end up with the same core semantics.
This point is especially crucial since we have at least one case where
versioning-by-size is used but with slighly different semantics:
sched_setattr(), perf_event_open(), and clone3() all do do similar
checks to copy_struct_from_user() while rt_sigprocmask(2) always
rejects differently-sized struct arguments.
With this pull request we also switch over sched_setattr(),
perf_event_open(), and clone3() to use the new helper"
* tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
usercopy: Add parentheses around assignment in test_copy_struct_from_user
perf_event_open: switch to copy_struct_from_user()
sched_setattr: switch to copy_struct_from_user()
clone3: switch to copy_struct_from_user()
lib: introduce copy_struct_from_user() helper
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXZZKNgAKCRCRxhvAZXjc
otfIAPsHUZn+Wfa/8uftNDJ6RLDXDsq6l8xiQTkz+k4YdnDj2AD/aIPjrM950jrS
W7+8R7CSSQOLmIif6R+S0A1fyFoVlQA=
=HVz0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull clone3/pidfd fixes from Christian Brauner:
"This contains a couple of fixes:
- Fix pidfd selftest compilation (Shuah Kahn)
Due to a false linking instruction in the Makefile compilation for
the pidfd selftests would fail on some systems.
- Fix compilation for glibc on RISC-V systems (Seth Forshee)
In some scenarios linux/uapi/linux/sched.h is included where
__ASSEMBLY__ is defined causing a build failure because struct
clone_args was not guarded by an #ifndef __ASSEMBLY__.
- Add missing clone3() and struct clone_args kernel-doc (Christian Brauner)
clone3() and struct clone_args were missing kernel-docs. (The goal
is to use kernel-doc for any function or type where it's worth it.)
For struct clone_args this also contains a comment about the fact
that it's versioned by size"
* tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
sched: add kernel-doc for struct clone_args
fork: add kernel-doc for clone3
selftests: pidfd: Fix undefined reference to pthread_create()
sched: Add __ASSEMBLY__ guards around struct clone_args
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2WrkYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpjf1D/9wy2L/yAXA/KLQkwDZ2kn3tMtzMrsv6bOJ
4UTdlLWKH2ihGg69iAE5f19iQSeNmqqXxBz0VvIBpFR7TqcRbGe0d9iVtoOsDhpZ
c1OvQ2Ey35Xx1T2w6uMh0llgeWY2J/gJkY64unUxwUBUZwNOPA8ZjxqeXcMQmyAt
sYmXpNLCT/f4YTYOYDgzoh5960TsCB/H+m/bLEVRvr0MaonvlaBUKRcysQikDFCQ
yobzDmlSJqGKqlFJ2fnRVSkJC0BmBE5p8Ric9HHiUOT8BO31079IHUGbkbSh/csH
0yPipNaYNMv+Hr0t9pgfcNbAt2weMK5HFgtpQwv8Frl4xjvBSWDS5fQesCVDjkZt
+ROeOvQtjfeKtLy5PCu6BJwYpu8iYG9eGF8zxBQ4FBHM3tghcVhqssaNbfrVOW+u
YXYbAuLMkLwKlmJ+6WBiVIMefyF59ue3+UJGECiCrj/BrgxUyw8HcGKwpKEAZSok
VFGDukL0Y3flnoO/gyOf0GFaD5Uovr1sx82DCz05B/XEMfkqFMJRGkbyZBarJL69
9QrnyGpF4rwtfg+usR1PmJ+9/oY/ypSk8N9MAIkoK9e1YIexxvBiXAf0k8AxuDyC
uPuOiQgKcqUr3aF+ivao8dQB9NiK1bJGc4pqBPPN4ZYRSjMSfBT/cms4IeUyj0K6
sokcB1p+CQ==
=vKVl
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Mandate timespec64 for the io_uring timeout ABI (Arnd)
- Set of NVMe changes via Sagi:
- controller removal race fix from Balbir
- quirk additions from Gabriel and Jian-Hong
- nvme-pci power state save fix from Mario
- Add 64bit user commands (for 64bit registers) from Marta
- nvme-rdma/nvme-tcp fixes from Max, Mark and Me
- Minor cleanups and nits from James, Dan and John
- Two s390 dasd fixes (Jan, Stefan)
- Have loop change block size in DIO mode (Martijn)
- paride pg header ifdef guard (Masahiro)
- Two blk-mq queue scheduler tweaks, fixing an ordering issue on zoned
devices and suboptimal performance on others (Ming)
* tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block: (22 commits)
block: sed-opal: fix sparse warning: convert __be64 data
block: sed-opal: fix sparse warning: obsolete array init.
block: pg: add header include guard
Revert "s390/dasd: Add discard support for ESE volumes"
s390/dasd: Fix error handling during online processing
io_uring: use __kernel_timespec in timeout ABI
loop: change queue block size to match when using DIO
blk-mq: apply normal plugging for HDD
blk-mq: honor IO scheduler for multiqueue devices
nvme-rdma: fix possible use-after-free in connect timeout
nvme: Move ctrl sqsize to generic space
nvme: Add ctrl attributes for queue_count and sqsize
nvme: allow 64-bit results in passthru commands
nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T
nvmet-tcp: remove superflous check on request sgl
Added QUIRKs for ADATA XPG SX8200 Pro 512GB
nvme-rdma: Fix max_hw_sectors calculation
nvme: fix an error code in nvme_init_subsystem()
nvme-pci: Save PCI state before putting drive into deepest state
nvme-tcp: fix wrong stop condition in io_work
...
The port type macros should have different values for different devices.
Currently, PORT_LINFLEXUART conflicts with PORT_SUNIX.
Fixes: 09864c1cdf ("tty: serial: Add linflexuart driver for S32V234")
Signed-off-by: Stefan-Gabriel Mirea <stefan-gabriel.mirea@nxp.com>
Link: https://lore.kernel.org/r/20191004135058.18007-1-stefan-gabriel.mirea@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit documents the expectation for NL80211_ATTR_IE when included
in NL80211_CMD_CONNECT, as following.
Driver shall not modify the IEs specified through NL80211_ATTR_IE if
NL80211_ATTR_MAC is included. However, if NL80211_ATTR_MAC_HINT is
included, these IEs through NL80211_ATTR_IE are specified by the user
space based on the best possible BSS selected. Thus, if the driver ends
up selecting a different BSS, it can modify these IEs accordingly (e.g.
userspace asks the driver to perform PMKSA caching with BSS1 and the
driver ends up selecting BSS2 with different PMKSA cache entry. RSNIE
has to get updated with the apt PMKID).
Signed-off-by: Sunil Dutt <usdutt@codeaurora.org>
Link: https://lore.kernel.org/r/1568378504-15179-1-git-send-email-usdutt@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Various keyboards have macro keys, which are intended to have user
programmable actions / key-sequences bound to them. In some cases these
macro keys are actually programmable in hardware, but more often they
basically are just extra keys and the playback of the key-sequence is done
by software running on the host.
One example of keyboards with macro-keys are various "internet" / "office"
keyboards have a set of so-called "Smart Keys", typically a set of 4 keys
labeled "[A]" - "[D]".
Another example are gaming keyboards, such as the Logitech G15 Gaming
keyboard, which has 18 "G"aming keys labeled "G1" to G18", 3 keys to select
macro presets labeled "M1" - "M3" and a key to start recording a macro
called "MR" note that even though there us a record key everything is
handled in sw on the host.
Besides macro keys the G15 (and other gaming keyboards) also has a buildin
LCD panel where the contents are controlled by the host. There are 5 keys
directly below the LCD intended for controlling a menu shown on the LCD.
The Microsoft SideWinder X6 keyboard is another gaming keyboard example,
this keyboard has 30 "S"idewinder keys and a key to cycle through
macro-presets.
After discussion between various involved userspace people we've come to
the conclusion that since these are all really just extra keys we should
simply treat them as such and give them their own event-codes, see:
https://github.com/libratbag/libratbag/issues/172
This commit adds the following new KEY_ defines for this:
KEY_MACRO1 - KEY_MACRO30. KEY_MACRO_RECORD_START/-STOP,
KEY_MACRO_PRESET_CYCLE, KEY_MACRO_PRESET1 - KEY_MACRO_PRESET3,
KEY_KBD_LCD_MENU1 - KEY_KBD_LCD_MENU5.
The defines leave room for adding some more LCD-menu, preset or macro keys,
the maximum values above are based on the maximum values to support all
currently known internet, office and gaming keyboards.
BugLink: https://github.com/libratbag/libratbag/issues/172
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This renames the very specific audit_log_link_denied() to
audit_log_path_denied() and adds the AUDIT_* type as an argument. This
allows for the creation of the new AUDIT_ANOM_CREAT that can be used to
report the fifo/regular file creation restrictions that were introduced
in commit 30aba6656f ("namei: allow restricted O_CREAT of FIFOs and
regular files").
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add two commands to add and delete list of link properties. Implement
the first property type along - alternative ifnames.
Each net device can have multiple alternative names.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now the connector info API was a kernel-internal API only.
This moves it to the public API and adds the new ioctl to retrieve
this information.
Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
CEC_OP_REC_FLAG_NOT_USED is 0 and CEC_OP_REC_FLAG_USED is 1, not the
other way around.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: Jiunn Chang <c0d1n61at3@gmail.com>
Cc: <stable@vger.kernel.org> # for v4.10 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
These defines were never added to this CEC header, likely due
to laziness on the part of the original author, i.e. me.
But it is useful to have them, so add them.
Also update the cec.h.rst.exceptions file to avoid errors when
building the documentation.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Switch clone3() syscall from it's own copying struct clone_args from
userspace to the new dedicated copy_struct_from_user() helper.
The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Additionally, explicitly
define CLONE_ARGS_SIZE_VER0 to match the other users of the
struct-extension pattern.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
[christian.brauner@ubuntu.com: improve commit message]
Link: https://lore.kernel.org/r/20191001011055.19283-3-cyphar@cyphar.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
The addition of struct clone_args to uapi/linux/sched.h is not protected
by __ASSEMBLY__ guards, causing a failure to build from source for glibc
on RISC-V. Add the guards to fix this.
Fixes: 7f192e3cd3 ("fork: add clone3")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Cc: <stable@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20190917071853.12385-1-seth.forshee@canonical.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Pull networking fixes from David Miller:
1) Sanity check URB networking device parameters to avoid divide by
zero, from Oliver Neukum.
2) Disable global multicast filter in NCSI, otherwise LLDP and IPV6
don't work properly. Longer term this needs a better fix tho. From
Vijay Khemka.
3) Small fixes to selftests (use ping when ping6 is not present, etc.)
from David Ahern.
4) Bring back rt_uses_gateway member of struct rtable, it's semantics
were not well understood and trying to remove it broke things. From
David Ahern.
5) Move usbnet snaity checking, ignore endpoints with invalid
wMaxPacketSize. From Bjørn Mork.
6) Missing Kconfig deps for sja1105 driver, from Mao Wenan.
7) Various small fixes to the mlx5 DR steering code, from Alaa Hleihel,
Alex Vesker, and Yevgeny Kliteynik
8) Missing CAP_NET_RAW checks in various places, from Ori Nimron.
9) Fix crash when removing sch_cbs entry while offloading is enabled,
from Vinicius Costa Gomes.
10) Signedness bug fixes, generally in looking at the result given by
of_get_phy_mode() and friends. From Dan Crapenter.
11) Disable preemption around BPF_PROG_RUN() calls, from Eric Dumazet.
12) Don't create VRF ipv6 rules if ipv6 is disabled, from David Ahern.
13) Fix quantization code in tcp_bbr, from Kevin Yang.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (127 commits)
net: tap: clean up an indentation issue
nfp: abm: fix memory leak in nfp_abm_u32_knode_replace
tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state
sk_buff: drop all skb extensions on free and skb scrubbing
tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidth
mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions
Documentation: Clarify trap's description
mlxsw: spectrum: Clear VLAN filters during port initialization
net: ena: clean up indentation issue
NFC: st95hf: clean up indentation issue
net: phy: micrel: add Asym Pause workaround for KSZ9021
net: socionext: ave: Avoid using netdev_err() before calling register_netdev()
ptp: correctly disable flags on old ioctls
lib: dimlib: fix help text typos
net: dsa: microchip: Always set regmap stride to 1
nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs
nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs
net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N
vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled
net: sched: sch_sfb: don't call qdisc_put() while holding tree lock
...
- add a new knfsd file cache, so that we don't have to open and
close on each (NFSv2/v3) READ or WRITE. This can speed up
read and write in some cases. It also replaces our readahead
cache.
- Prevent silent data loss on write errors, by treating write
errors like server reboots for the purposes of write caching,
thus forcing clients to resend their writes.
- Tweak the code that allocates sessions to be more forgiving,
so that NFSv4.1 mounts are less likely to hang when a server
already has a lot of clients.
- Eliminate an arbitrary limit on NFSv4 ACL sizes; they should
now be limited only by the backend filesystem and the
maximum RPC size.
- Allow the server to enforce use of the correct kerberos
credentials when a client reclaims state after a reboot.
And some miscellaneous smaller bugfixes and cleanup.
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl2OoFcVHGJmaWVsZHNA
ZmllbGRzZXMub3JnAAoJECebzXlCjuG+dRoP/3OW1NxPjpjbCQWZL0M+O3AYJJla
W8E+uoZKMosFEe/ymokMD0Vn5s47jPaMCifMjHZa2GygW8zHN9X2v0HURx/lob+o
/rJXwMn78N/8kdbfDz2FvaCPeT0IuNzRIFBV8/sSXofqwCBwvPo+cl0QGrd4/xLp
X35qlupx62TRk+kbdRjvv8kpS5SJ7BvR+FSA1WubNYWw2hpdEsr2OCFdGq2Wvthy
DK6AfGBXfJGsOE+HGCSj6ejRV6i0UOJ17P8gRSsx+YT0DOe5E7ROjt+qvvRwk489
wmR8Vjuqr1e40eGAUq3xuLfk5F5NgycY4ekVxk/cTVFNwWcz2DfdjXQUlyPAbrSD
SqIyxN1qdKT24gtr7AHOXUWJzBYPWDgObCVBXUGzyL81RiDdhf38HRNjL2TcSDld
tzCjQ0wbPw+iT74v6qQRY05oS+h3JOtDjU4pxsBnxVtNn4WhGJtaLfWW8o1C1QwU
bc4aX3TlYhDmzU7n7Zjt4rFXGJfyokM+o6tPao1Z60Pmsv1gOk4KQlzLtW/jPHx4
ZwYTwVQUKRDBfC62nmgqDyGI3/Qu11FuIxL2KXUCgkwDxNWN4YkwYjOGw9Lb5qKM
wFpxq6CDNZB/IWLEu8Yg85kDPPUJMoI657lZb7Osr/MfBpU0YljcMOIzLBy8uV1u
9COUbPaQipiWGu/0
=diBo
-----END PGP SIGNATURE-----
Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Highlights:
- Add a new knfsd file cache, so that we don't have to open and close
on each (NFSv2/v3) READ or WRITE. This can speed up read and write
in some cases. It also replaces our readahead cache.
- Prevent silent data loss on write errors, by treating write errors
like server reboots for the purposes of write caching, thus forcing
clients to resend their writes.
- Tweak the code that allocates sessions to be more forgiving, so
that NFSv4.1 mounts are less likely to hang when a server already
has a lot of clients.
- Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be
limited only by the backend filesystem and the maximum RPC size.
- Allow the server to enforce use of the correct kerberos credentials
when a client reclaims state after a reboot.
And some miscellaneous smaller bugfixes and cleanup"
* tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits)
sunrpc: clean up indentation issue
nfsd: fix nfs read eof detection
nfsd: Make nfsd_reset_boot_verifier_locked static
nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
nfsd: handle drc over-allocation gracefully.
nfsd: add support for upcall version 2
nfsd: add a "GetVersion" upcall for nfsdcld
nfsd: Reset the boot verifier on all write I/O errors
nfsd: Don't garbage collect files that might contain write errors
nfsd: Support the server resetting the boot verifier
nfsd: nfsd_file cache entries should be per net namespace
nfsd: eliminate an unnecessary acl size limit
Deprecate nfsd fault injection
nfsd: remove duplicated include from filecache.c
nfsd: Fix the documentation for svcxdr_tmpalloc()
nfsd: Fix up some unused variable warnings
nfsd: close cached files prior to a REMOVE or RENAME that would replace target
nfsd: rip out the raparms cache
nfsd: have nfsd_test_lock use the nfsd_file cache
nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXYx2zAAKCRDh3BK/laaZ
PFpHAQD2G+F8a9e41jFTJg5YpNKMD8/Pl4T6v9chIO9qPXF2IAEAji0P1JterRfv
ixiBhv54hSwYbk527nxNWE9tP5gAHAQ=
=WCHy
-----END PGP SIGNATURE-----
Merge tag 'virtio-fs-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse virtio-fs support from Miklos Szeredi:
"Virtio-fs allows exporting directory trees on the host and mounting
them in guest(s).
This isn't actually a new filesystem, but a glue layer between the
fuse filesystem and a virtio based back-end.
It's similar in functionality to the existing virtio-9p solution, but
significantly faster in benchmarks and has better POSIX compliance.
Further permformance improvements can be achieved by sharing the page
cache between host and guest, allowing for faster I/O and reduced
memory use.
Kata Containers have been including the out-of-tree virtio-fs (with
the shared page cache patches as well) since version 1.7 as an
experimental feature. They have been active in development and plan to
switch from virtio-9p to virtio-fs as their default solution. There
has been interest from other sources as well.
The userspace infrastructure is slated to be merged into qemu once the
kernel part hits mainline.
This was developed by Vivek Goyal, Dave Gilbert and Stefan Hajnoczi"
* tag 'virtio-fs-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
virtio-fs: add virtiofs filesystem
virtio-fs: add Documentation/filesystems/virtiofs.rst
fuse: reserve values for mapping protocol
Some additional RISC-V updates for v5.4-rc1. This includes one
significant fix:
- Prevent interrupts from being unconditionally re-enabled during
exception handling if they were disabled in the context in which the
exception occurred
Also a few other fixes:
- Fix a build error when sparse memory support is manually enabled
- Prevent CPUs beyond CONFIG_NR_CPUS from being enabled in early boot
And a few minor improvements:
- DT improvements: in the FU540 SoC DT files, improve U-Boot
compatibility by adding an "ethernet0" alias, drop an unnecessary
property from the DT files, and add support for the PWM device
- KVM preparation: add a KVM-related macro for future RISC-V KVM
support, and export some symbols required to build KVM support as
modules
- defconfig additions: build more drivers by default for QEMU
configurations
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl2OUysACgkQx4+xDQu9
Kks5Lg//bVTjarTnzgelTMz99UWiVhFc9X2H/r0zPWAms6if1bQVxIPDzffJnO9n
ldnYMLKSo3s7vUNdbs8NqdrgxZ21XSU1x3KllnEPub89e5e12wjMpqFcow0Yx0GF
16QEsHfvUsAfIcXJinf6YnQ1AO3n+DnvhxGE3ey+8AIJk676cnsxU0R5o9defLdk
XCC/o7BAt+qBZytJiSeIKQkozQkPbrzoQs2HZpqm/pDllVMFrpYAKaVLWMiqdvjO
teMqEgeQLjnHxrAHrfeKG8ssPmnf81TWOFM1fmUy2kqxsGiZSPj2LrpaghQpN/EB
1j0irgzesI0vX90rYwV4WtYoeBKICqcXo+c6wVrKaUrnBX0NiQbt55Ryf2k0JFw7
YlEPjEt46qRMLQ5J5LFALkp1bSCwX4XU3BVKCHLuzTKlg3ntHJGYAA9gPbGIdmYK
NkNYqQUowOWI6x9SkPJwStpXyckqCkj9nSp8WV5YqmgrpfZjw0b7yaU1GxfLhVkt
ygGyl0EMlL3iwcGHfn+qXg8m2zB9PjcdJZGwkAks0kWroxxg+xnEfA413zaiVNO2
AKFRLQtoZG6mtujoDBNxyc09ewovte7lXil/+rLQEcQhZJPMpK3oiM53NH736Zi5
fpt7YhanIGRswPaNQgwz981R23aDb6NZXUUaI4XStfDxJN5Px6M=
=wfz5
-----END PGP SIGNATURE-----
Merge tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Paul Walmsley:
"Some additional RISC-V updates.
This includes one significant fix:
- Prevent interrupts from being unconditionally re-enabled during
exception handling if they were disabled in the context in which
the exception occurred
Also a few other fixes:
- Fix a build error when sparse memory support is manually enabled
- Prevent CPUs beyond CONFIG_NR_CPUS from being enabled in early boot
And a few minor improvements:
- DT improvements: in the FU540 SoC DT files, improve U-Boot
compatibility by adding an "ethernet0" alias, drop an unnecessary
property from the DT files, and add support for the PWM device
- KVM preparation: add a KVM-related macro for future RISC-V KVM
support, and export some symbols required to build KVM support as
modules
- defconfig additions: build more drivers by default for QEMU
configurations"
* tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Avoid interrupts being erroneously enabled in handle_exception()
riscv: dts: sifive: Drop "clock-frequency" property of cpu nodes
riscv: dts: sifive: Add ethernet0 to the aliases node
RISC-V: Export kernel symbols for kvm
KVM: RISC-V: Add KVM_REG_RISCV for ONE_REG interface
arch/riscv: disable excess harts before picking main boot hart
RISC-V: Enable VIRTIO drivers in RV64 and RV32 defconfig
RISC-V: Fix building error when CONFIG_SPARSEMEM_MANUAL=y
riscv: dts: Add DT support for SiFive FU540 PWM driver
* The usual accuracy improvements for nested virtualization
* The usual round of code cleanups from Sean
* Added back optimizations that were prematurely removed in 5.2
(the bare minimum needed to fix the regression was in 5.3-rc8,
here comes the rest)
* Support for UMWAIT/UMONITOR/TPAUSE
* Direct L2->L0 TLB flushing when L0 is Hyper-V and L1 is KVM
* Tell Windows guests if SMT is disabled on the host
* More accurate detection of vmexit cost
* Revert a pvqspinlock pessimization
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdjfaKAAoJEL/70l94x66D8MAH/2thJnM47tYtMTFA4GBFugeH
mAx8OApWFBo8apOip+8ElFLPQ8FQdZCzr9ti8H4JkuzKxgsxCs1iqEg5pHEKxSTi
K9kLOZwoFtwgy3XmxC0PIZ9lT2Wx74ruh1HF+QG/YsjKH636UPv2VpmulsTNbm62
2ryzOb3TlGT/cjf+gv9l6IYIxZa2Ff19PF4i//H8u4YRBj358/jr99CK01iE0M9r
4NhEKiQZywzREWtKxymGOM6HEbwbWcIa+loYjj2htq8epep6f9Y1zQ0Jcn5+nPA0
cn1T2gGJAJ0OUahKLwNbz8pzrFDkW+eoQgqCBJZ4RT9Uf8WCESfl14p+/vRkAMg=
=tk5S
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
"x86 KVM changes:
- The usual accuracy improvements for nested virtualization
- The usual round of code cleanups from Sean
- Added back optimizations that were prematurely removed in 5.2 (the
bare minimum needed to fix the regression was in 5.3-rc8, here
comes the rest)
- Support for UMWAIT/UMONITOR/TPAUSE
- Direct L2->L0 TLB flushing when L0 is Hyper-V and L1 is KVM
- Tell Windows guests if SMT is disabled on the host
- More accurate detection of vmexit cost
- Revert a pvqspinlock pessimization"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (56 commits)
KVM: nVMX: cleanup and fix host 64-bit mode checks
KVM: vmx: fix build warnings in hv_enable_direct_tlbflush() on i386
KVM: x86: Don't check kvm_rebooting in __kvm_handle_fault_on_reboot()
KVM: x86: Drop ____kvm_handle_fault_on_reboot()
KVM: VMX: Add error handling to VMREAD helper
KVM: VMX: Optimize VMX instruction error and fault handling
KVM: x86: Check kvm_rebooting in kvm_spurious_fault()
KVM: selftests: fix ucall on x86
Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
kvm: nvmx: limit atomic switch MSRs
kvm: svm: Intercept RDPRU
kvm: x86: Add "significant index" flag to a few CPUID leaves
KVM: x86/mmu: Skip invalid pages during zapping iff root_count is zero
KVM: x86/mmu: Explicitly track only a single invalid mmu generation
KVM: x86/mmu: Revert "KVM: x86/mmu: Remove is_obsolete() call"
KVM: x86/mmu: Revert "Revert "KVM: MMU: reclaim the zapped-obsolete page first""
KVM: x86/mmu: Revert "Revert "KVM: MMU: collapse TLB flushes when zap all pages""
KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""
KVM: x86/mmu: Revert "Revert "KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages""
KVM: x86/mmu: Revert "Revert "KVM: MMU: show mmu_valid_gen in shadow page related tracepoints""
...
Pull NVMe changes from Sagi:
"This set consists of various fixes and cleanups:
- controller removal race fix from Balbir
- quirk additions from Gabriel and Jian-Hong
- nvme-pci power state save fix from Mario
- Add 64bit user commands (for 64bit registers) from Marta
- nvme-rdma/nvme-tcp fixes from Max, Mark and Me
- Minor cleanups and nits from James, Dan and John"
* 'nvme-5.4' of git://git.infradead.org/nvme:
nvme-rdma: fix possible use-after-free in connect timeout
nvme: Move ctrl sqsize to generic space
nvme: Add ctrl attributes for queue_count and sqsize
nvme: allow 64-bit results in passthru commands
nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T
nvmet-tcp: remove superflous check on request sgl
Added QUIRKs for ADATA XPG SX8200 Pro 512GB
nvme-rdma: Fix max_hw_sectors calculation
nvme: fix an error code in nvme_init_subsystem()
nvme-pci: Save PCI state before putting drive into deepest state
nvme-tcp: fix wrong stop condition in io_work
nvme-pci: Fix a race in controller removal
nvmet: change ppl to lpp
Commit 415606588c ("PTP: introduce new versions of IOCTLs",
2019-09-13) introduced new versions of the PTP ioctls which actually
validate that the flags are acceptable values.
As part of this, it cleared the flags value using a bitwise
and+negation, in an attempt to prevent the old ioctl from accidentally
enabling new features.
This is incorrect for a couple of reasons. First, it results in
accidentally preventing previously working flags on the request ioctl.
By clearing the "valid" flags, we now no longer allow setting the
enable, rising edge, or falling edge flags.
Second, if we add new additional flags in the future, they must not be
set by the old ioctl. (Since the flag wasn't checked before, we could
potentially break userspace programs which sent garbage flag data.
The correct way to resolve this is to check for and clear all but the
originally valid flags.
Create defines indicating which flags are correctly checked and
interpreted by the original ioctls. Use these to clear any bits which
will not be correctly interpreted by the original ioctls.
In the future, new flags must be added to the VALID_FLAGS macros, but
*not* to the V1_VALID_FLAGS macros. In this way, new features may be
exposed over the v2 ioctls, but without breaking previous userspace
which happened to not clear the flags value properly. The old ioctl will
continue to behave the same way, while the new ioctl gains the benefit
of using the flags fields.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Christopher Hall <christopher.s.hall@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Add NFT_CHAIN_POLICY_UNSET to replace hardcoded -1 to
specify that the chain policy is unset. The chain policy
field is actually defined as an 8-bit unsigned integer.
2) Remove always true condition reported by smatch in
chain policy check.
3) Fix element lookup on dynamic sets, from Florian Westphal.
4) Use __u8 in ebtables uapi header, from Masahiro Yamada.
5) Bogus EBUSY when removing flowtable after chain flush,
from Laura Garcia Liebana.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge more updates from Andrew Morton:
- almost all of the rest of -mm
- various other subsystems
Subsystems affected by this patch series:
memcg, misc, core-kernel, lib, checkpatch, reiserfs, fat, fork,
cpumask, kexec, uaccess, kconfig, kgdb, bug, ipc, lzo, kasan, madvise,
cleanups, pagemap
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (77 commits)
arch/sparc/include/asm/pgtable_64.h: fix build
mm: treewide: clarify pgtable_page_{ctor,dtor}() naming
ntfs: remove (un)?likely() from IS_ERR() conditions
IB/hfi1: remove unlikely() from IS_ERR*() condition
xfs: remove unlikely() from WARN_ON() condition
wimax/i2400m: remove unlikely() from WARN*() condition
fs: remove unlikely() from WARN_ON() condition
xen/events: remove unlikely() from WARN() condition
checkpatch: check for nested (un)?likely() calls
hexagon: drop empty and unused free_initrd_mem
mm: factor out common parts between MADV_COLD and MADV_PAGEOUT
mm: introduce MADV_PAGEOUT
mm: change PAGEREF_RECLAIM_CLEAN with PAGE_REFRECLAIM
mm: introduce MADV_COLD
mm: untag user pointers in mmap/munmap/mremap/brk
vfio/type1: untag user pointers in vaddr_get_pfn
tee/shm: untag user pointers in tee_shm_register
media/v4l2-core: untag user pointers in videobuf_dma_contig_user_get
drm/radeon: untag user pointers in radeon_gem_userptr_ioctl
drm/amdgpu: untag user pointers
...
Add a header include guard just in case.
My motivation is to allow Kbuild to detect missing include guard:
https://patchwork.kernel.org/patch/11063011/
Before I enable this checker I want to fix as many headers as possible.
Link: http://lkml.kernel.org/r/20190728154728.11126-1-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is not possible to get 64-bit results from the passthru commands,
what prevents from getting for the Capabilities (CAP) property value.
As a result, it is not possible to implement IOL's NVMe Conformance
test 4.3 Case 1 for Fabrics targets [1] (page 123).
This issue has been already discussed [2], but without a solution.
This patch solves the problem by adding new ioctls with a new
passthru structure, including 64-bit results. The older ioctls stay
unchanged.
[1] https://www.iol.unh.edu/sites/default/files/testsuites/nvme/UNH-IOL_NVMe_Conformance_Test_Suite_v11.0.pdf
[2] http://lists.infradead.org/pipermail/linux-nvme/2018-June/018791.html
Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXYod6wAKCRDh3BK/laaZ
PG3fAP9WXuvUeYh3X7ThQPa2D33VCIMJRd6t+1TVSSc/H8P3dAD/ehN5HIWjnmzz
iZFc3zDtO9UCJUe23IZomblxOQbu6Qk=
=I0S2
-----END PGP SIGNATURE-----
Merge tag 'fuse-update-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Continue separating the transport (user/kernel communication) and the
filesystem layers of fuse. Getting rid of most layering violations
will allow for easier cleanup and optimization later on.
- Prepare for the addition of the virtio-fs filesystem. The actual
filesystem will be introduced by a separate pull request.
- Convert to new mount API.
- Various fixes, optimizations and cleanups.
* tag 'fuse-update-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (55 commits)
fuse: Make fuse_args_to_req static
fuse: fix memleak in cuse_channel_open
fuse: fix beyond-end-of-page access in fuse_parse_cache()
fuse: unexport fuse_put_request
fuse: kmemcg account fs data
fuse: on 64-bit store time in d_fsdata directly
fuse: fix missing unlock_page in fuse_writepage()
fuse: reserve byteswapped init opcodes
fuse: allow skipping control interface and forced unmount
fuse: dissociate DESTROY from fuseblk
fuse: delete dentry if timeout is zero
fuse: separate fuse device allocation and installation in fuse_conn
fuse: add fuse_iqueue_ops callbacks
fuse: extract fuse_fill_super_common()
fuse: export fuse_dequeue_forget() function
fuse: export fuse_get_unique()
fuse: export fuse_send_init_request()
fuse: export fuse_len_args()
fuse: export fuse_end_request()
fuse: fix request limit
...
When CONFIG_UAPI_HEADER_TEST=y, exported headers are compile-tested to
make sure they can be included from user-space.
Currently, linux/netfilter_bridge/ebtables.h is excluded from the test
coverage. To make it join the compile-test, we need to fix the build
errors attached below.
For a case like this, we decided to use __u{8,16,32,64} variable types
in this discussion:
https://lkml.org/lkml/2019/6/5/18
Build log:
CC usr/include/linux/netfilter_bridge/ebtables.h.s
In file included from <command-line>:32:0:
./usr/include/linux/netfilter_bridge/ebtables.h:126:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:139:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:152:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2J9DEQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpqiqD/9D9etX/wXJ54bNKXpQqba3tUQy8e5SLnd4
q1ZzlJhvnmfNz6GqdH/1c+5Q2WBTLEb8PLJ4Z8d+1I8eWxpaCOzz362a7NA7x4l2
d1a8MV8gQy0moJKX7kTL9Pn44d2HofxyhgmHFKkvm/j7P4HIKbAuckzlcGoOS9TS
gV/bnroZ8eSrAfmswRRrxaWIoCe0iB+4x+nvoQjdVz8WXHcpBD8Y7hi5Ulid3Q/7
vU4WJubwKrP1dczk1L64CnYN4QfP2k4pyyxQwEBujueNuAxPKTRD3GXQc58c63Bp
GElE7FGxMUT4JhaY7U8G2pUSdAb2WiyMc9vd4Au7IkX0TEdQeRSHWkocnt2O9OPp
nk7XqPNK5MSskE4T9fjsJkE4KiOHhslixxc5qY6X9g5GuPw03ITM4VBwvU8aYxbf
p++jcWff10ovllck3uKvMpzU1IS20B4mSYYTyj990diocceEN6nsJsYv9wW4lMpk
lBL8b/Q3TaoMG58C5BiZvlf6JQqdKpN5UJhBmyrZAv5ozvgBAcSeLK41V6oxzGzo
y/3qcJeuIE6Exdm4DqWBHp7L82FcednO/EDqz3WnGx9v67kUz5VFqP7lIGNmXKM3
2PaBMWv7BU/Fx087KOsBOI5XWm3aVw+yTkHT2aitVQW0su+UDvb/BvFL0sqbBe3O
qm60F3TWkQ==
=/w5G
-----END PGP SIGNATURE-----
Merge tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block
Pull more io_uring updates from Jens Axboe:
"A collection of later fixes and additions, that weren't quite ready
for pushing out with the initial pull request.
This contains:
- Fix potential use-after-free of shadow requests (Jackie)
- Fix potential OOM crash in request allocation (Jackie)
- kmalloc+memcpy -> kmemdup cleanup (Jackie)
- Fix poll crash regression (me)
- Fix SQ thread not being nice and giving up CPU for !PREEMPT (me)
- Add support for timeouts, making it easier to do epoll_wait()
conversions, for instance (me)
- Ensure io_uring works without f_ops->read_iter() and
f_ops->write_iter() (me)"
* tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block:
io_uring: correctly handle non ->{read,write}_iter() file_operations
io_uring: IORING_OP_TIMEOUT support
io_uring: use cond_resched() in sqthread
io_uring: fix potential crash issue due to io_get_req failure
io_uring: ensure poll commands clear ->sqe
io_uring: fix use-after-free of shadow_req
io_uring: use kmemdup instead of kmalloc and memcpy
Hyper-V direct tlb flush function should be enabled for
guest that only uses Hyper-V hypercall. User space
hypervisor(e.g, Qemu) can disable KVM identification in
CPUID and just exposes Hyper-V identification to make
sure the precondition. Add new KVM capability KVM_CAP_
HYPERV_DIRECT_TLBFLUSH for user space to enable Hyper-V
direct tlb function and this function is default to be
disabled in KVM.
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Fix Tegra OF node reference leak (Nishka Dasgupta)
- Add #defines for PCIe Data Link Feature and Physical Layer 16.0 GT/s
features (Vidya Sagar)
- Disable MSI for Tegra Root Ports since they don't support using MSI for
all Root Port events (Vidya Sagar)
- Group DesignWare write-protected register writes together (Vidya Sagar)
- Move DesignWare capability search interfaces so they can be used by
both host and endpoint drivers (Vidya Sagar)
- Add DesignWare extended capability search interfaces (Vidya Sagar)
- Export dw_pcie_wait_for_link() so drivers can be modules (Vidya Sagar)
- Add "snps,enable-cdm-check" DT binding for Configuration Dependent
Module (CDM) register checking (Vidya Sagar)
- Add DesignWare support for "snps,enable-cdm-check" CDM checking (Vidya
Sagar)
- Add "supports-clkreq" DT binding for host drivers to decide whether to
advertise low power features (Vidya Sagar)
- Add DT binding for Tegra194 (Vidya Sagar)
- Add DT binding for Tegra194 P2U (PIPE to UPHY) block (Vidya Sagar)
- Add support for Tegra194 P2U (PIPE to UPHY) (Vidya Sagar)
- Add support for Tegra194 host controller (Vidya Sagar)
- Add Tegra support for sideband PERST# and CLKREQ# for C5 (Vidya Sagar)
- Add Tegra support for slot regulators for p2972-0000 platform (Vidya
Sagar)
* lorenzo/pci/tegra:
arm64: tegra: Add PCIe slot supply information in p2972-0000 platform
arm64: tegra: Add configuration for PCIe C5 sideband signals
PCI: tegra: Add support to enable slot regulators
PCI: tegra: Add support to configure sideband pins
dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries
dt-bindings: PCI: tegra: Add sideband pins configuration entries
PCI: tegra: Add Tegra194 PCIe support
phy: tegra: Add PCIe PIPE2UPHY support
dt-bindings: PHY: P2U: Add Tegra194 P2U block
dt-bindings: PCI: tegra: Add device tree support for Tegra194
dt-bindings: Add PCIe supports-clkreq property
PCI: dwc: Add support to enable CDM register check
dt-bindings: PCI: designware: Add binding for CDM register check
PCI: dwc: Export dw_pcie_wait_for_link() API
PCI: dwc: Add extended configuration space capability search API
PCI: dwc: Move config space capability search API
PCI: dwc: Group DBI registers writes requiring unlocking
PCI: Disable MSI for Tegra root ports
PCI: Add #defines for some of PCIe spec r4.0 features
PCI: tegra: Fix OF node reference leak
In this round, we introduced casefolding support in f2fs, and fixed various bugs
in individual features such as IO alignment, checkpoint=disable, quota, and
swapfile.
Enhancement:
- support casefolding w/ enhancement in ext4
- support fiemap for directory
- support FS_IO_GET|SET_FSLABEL
Bug fix:
- fix IO stuck during checkpoint=disable
- avoid infinite GC loop
- fix panic/overflow related to IO alignment feature
- fix livelock in swap file
- fix discard command leak
- disallow dio for atomic_write
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl2FL1wACgkQQBSofoJI
UNIRdg/+M6QNiAazIbqzPXyUUkyZQYR5YKhu2abd49o/g38xjTq1afH0PZpQyDrA
9RncR4xsW8F241vPLVoCSanfaa+MxN6xHi3TrD8zYtZWxOcPF6v1ETHeUXGTHuJ2
gqlk+mm+CnY02M6rxW7XwixuXwttT3bF9+cf1YBWRpNoVrR+SjNqgeJS7FmJwXKd
nGKb+94OxuygL1NUop+LDUo3qRQjc0Sxv/7qj/K4lhqgTjhAxMYT2KvUP/1MZ7U0
Kh9WIayDXnpoioxMPnt4VEb+JgXfLLFELvQzNjwulk15GIweuJzwVYCBXcRoX0cK
eRBRmRy/kRp/e0R1gvl3kYrXQC2AC5QTlBVH/0ESwnaukFiUBKB509vH4aqE/vpB
Krldjfg+uMHkc7XiNBf1boDp713vJ76iRKUDWoVb6H/sPbdJ+jtrnUNeBP8CVpWh
u31SY1MppnmKhhsoCHQRbhbXO/Z29imBQgF9Tm3IFWImyLY3IU40vFj2fR15gJkL
X3x/HWxQynSqyqEOwAZrvhCRTvBAIGIVy5292Di1RkqIoh8saxcqiaywgLz1+eVE
0DCOoh8R6sSbfN/EEh+yZqTxmjo0VGVTw30XVI6QEo4cY5Vfc9u6dN6SRWVRvbjb
kPb3dKcMrttgbn3fcXU8Jbw1AOor9N6afHaqs0swQJyci2RwJyc=
=oonf
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we introduced casefolding support in f2fs, and fixed
various bugs in individual features such as IO alignment,
checkpoint=disable, quota, and swapfile.
Enhancement:
- support casefolding w/ enhancement in ext4
- support fiemap for directory
- support FS_IO_GET|SET_FSLABEL
Bug fix:
- fix IO stuck during checkpoint=disable
- avoid infinite GC loop
- fix panic/overflow related to IO alignment feature
- fix livelock in swap file
- fix discard command leak
- disallow dio for atomic_write"
* tag 'f2fs-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
f2fs: add a condition to detect overflow in f2fs_ioc_gc_range()
f2fs: fix to add missing F2FS_IO_ALIGNED() condition
f2fs: fix to fallback to buffered IO in IO aligned mode
f2fs: fix to handle error path correctly in f2fs_map_blocks
f2fs: fix extent corrupotion during directIO in LFS mode
f2fs: check all the data segments against all node ones
f2fs: Add a small clarification to CONFIG_FS_F2FS_FS_SECURITY
f2fs: fix inode rwsem regression
f2fs: fix to avoid accessing uninitialized field of inode page in is_alive()
f2fs: avoid infinite GC loop due to stale atomic files
f2fs: Fix indefinite loop in f2fs_gc()
f2fs: convert inline_data in prior to i_size_write
f2fs: fix error path of f2fs_convert_inline_page()
f2fs: add missing documents of reserve_root/resuid/resgid
f2fs: fix flushing node pages when checkpoint is disabled
f2fs: enhance f2fs_is_checkpoint_ready()'s readability
f2fs: clean up __bio_alloc()'s parameter
f2fs: fix wrong error injection path in inc_valid_block_count()
f2fs: fix to writeout dirty inode during node flush
f2fs: optimize case-insensitive lookups
...
implementation details that do not belong in DM crypt. The wrapper
template for ESSIV generation that was factored out will also be used
by fscrypt in the future.
- Add root hash pkcs#7 signature verification to the DM verity target.
- Add a new "clone" DM target that allows for efficient remote
replication of a device.
- Enhance DM bufio's cache to be tailored to each client based on use.
Clients that make heavy use of the cache get more of it, and those
that use less have reduced cache usage.
- Add a new DM_GET_TARGET_VERSION ioctl to allow userspace to query the
version number of a DM target (even if the associated module isn't yet
loaded).
- Fix invalid memory access in DM zoned target.
- Fix the max_discard_sectors limit advertised by the DM raid target; it
was mistakenly storing the limit in bytes rather than sectors.
- Small optimizations and cleanups in DM writecache target.
- Various fixes and cleanups in DM core, DM raid1 and space map portion
of DM persistent data library.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl2D7ycTHHNuaXR6ZXJA
cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWp9QCACwTkVGzPGMCbAaCVlCACo8B5JyY4OO
FNxucqUlt1MHKuBbzJd4XwNGlLg68xjMUKVPYPlgina7TaDl+wvlTbHchaJS8nak
x1zyhDSywy0F9f6HHiXJi/vshmAfa0xnIM6fQXVPM346S6xf9u7hqOJQMCrdvY92
w4FhuW9nVt5xizo8iC/3LzoWbhrWncT7dyZUZtG3/tmglhkEK7QwctlgQxcD7tXg
H1lhntQzHzpxQAVBefWWdw7ubuDd6XCHuQMaxRhyR++c62P3eKDR8ck9hhd3hZKv
E481gtxcsjKuYLxwULjqFJZaNFitWFNMJ7gppQyKRqCzn2zlGAL6npl8
=m6zD
-----END PGP SIGNATURE-----
Merge tag 'for-5.4/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- crypto and DM crypt advances that allow the crypto API to reclaim
implementation details that do not belong in DM crypt. The wrapper
template for ESSIV generation that was factored out will also be used
by fscrypt in the future.
- Add root hash pkcs#7 signature verification to the DM verity target.
- Add a new "clone" DM target that allows for efficient remote
replication of a device.
- Enhance DM bufio's cache to be tailored to each client based on use.
Clients that make heavy use of the cache get more of it, and those
that use less have reduced cache usage.
- Add a new DM_GET_TARGET_VERSION ioctl to allow userspace to query the
version number of a DM target (even if the associated module isn't
yet loaded).
- Fix invalid memory access in DM zoned target.
- Fix the max_discard_sectors limit advertised by the DM raid target;
it was mistakenly storing the limit in bytes rather than sectors.
- Small optimizations and cleanups in DM writecache target.
- Various fixes and cleanups in DM core, DM raid1 and space map portion
of DM persistent data library.
* tag 'for-5.4/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (22 commits)
dm: introduce DM_GET_TARGET_VERSION
dm bufio: introduce a global cache replacement
dm bufio: remove old-style buffer cleanup
dm bufio: introduce a global queue
dm bufio: refactor adjust_total_allocated
dm bufio: call adjust_total_allocated from __link_buffer and __unlink_buffer
dm: add clone target
dm raid: fix updating of max_discard_sectors limit
dm writecache: skip writecache_wait for pmem mode
dm stats: use struct_size() helper
dm crypt: omit parsing of the encapsulated cipher
dm crypt: switch to ESSIV crypto API template
crypto: essiv - create wrapper template for ESSIV generation
dm space map common: remove check for impossible sm_find_free() return value
dm raid1: use struct_size() with kzalloc()
dm writecache: optimize performance by sorting the blocks for writeback_all
dm writecache: add unlikely for getting two block with same LBA
dm writecache: remove unused member pointer in writeback_struct
dm zoned: fix invalid memory access
dm verity: add root hash pkcs#7 signature verification
...
- Fix spapr iommu error case case (Alexey Kardashevskiy)
- Consolidate region type definitions (Cornelia Huck)
- Restore saved original PCI state on release (hexin)
- Simplify mtty sample driver interrupt path (Parav Pandit)
- Support for reporting valid IOVA regions to user (Shameer Kolothum)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJdhS76AAoJECObm247sIsipU0QAI4eCfwy4R4nV6eeTUymniVO
jxYNwE2ZARQp3YW6DzupuesCIDDuhPL81vTzhFuIP6lHixeWM/D4eS8pKw4CH7UH
SAOhiNJ7j1NryXK6SnZMaE8zZH/Ry4j8QvfmlJpDi7EBSQI/IRykRZOA2qEULFJm
eI1ZwBR03njbqKPwKy9a6RVifjFT+yAnQrX/ADdQY1T1C9qPkqJZnYwTviwOM6G/
pXHkpLgc3eYmz6ZAk7TpE4kxx7Zji3MenKHxSE0ZD9xqixzzntZ3gG9dDXy98IRW
/oWQSM0gusF4louG/iXb22QzlzcRPZXJhQ+02fjWfuR+Ak3W5amI5eIPzPyjvbZw
As54My1HedLMWnKvQKjTS5HYyHqgCACYjdAEocx79LvU40txpcEUOqoZeRhSiEyN
zcc2xaYSs/aw7AZd3MwNKrqeg25LspD9T1FBoqou/C4EzW5Q63gClVJqbfpyrmVX
SPYuGtFzkWqI4eLkgJ8ACAm20CMBtH0nVPhxpE7fOMxwzuUoWLxB+t6x5eEDKUhd
kHhtdUJrfYSsmjgnWc0IoB9OVLaj/EsfQrv08V+tNUjt8JoGb9A6A5jWebf506hI
JhlVS0aNN80GATDgu3nPiBMaS/3ErTDrTNh/zmYQnSDFOKTDw1nj+w3rczkwVY7C
1VFtP4vrLSjwJV49zAIR
=siL2
-----END PGP SIGNATURE-----
Merge tag 'vfio-v5.4-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Fix spapr iommu error case case (Alexey Kardashevskiy)
- Consolidate region type definitions (Cornelia Huck)
- Restore saved original PCI state on release (hexin)
- Simplify mtty sample driver interrupt path (Parav Pandit)
- Support for reporting valid IOVA regions to user (Shameer Kolothum)
* tag 'vfio-v5.4-rc1' of git://github.com/awilliam/linux-vfio:
vfio_pci: Restore original state on release
vfio/type1: remove duplicate retrieval of reserved regions
vfio/type1: Add IOVA range capability support
vfio/type1: check dma map request is within a valid iova range
vfio/spapr_tce: Fix incorrect tce_iommu_group memory free
vfio-mdev/mtty: Simplify interrupt generation
vfio: re-arrange vfio region definitions
vfio/type1: Update iova list on detach
vfio/type1: Check reserved region conflict and update iova list
vfio/type1: Introduce iova list and add iommu aperture validity check
We will be using ONE_REG interface accessing VCPU registers from
user-space hence we add KVM_REG_RISCV for RISC-V VCPU registers.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdgfi4AAoJEAx081l5xIa+uYQP/3lbB75F60oSb0Y17uOtAwrS
/ZMKZ3/EXcCw42JuYTbz17EiQSajkJcOC+tNRo22nlg4d9R0x3/kXwA7O/eu5RWI
8Qi1rfrMZ6LotQXBfc4nVlHvyocsYc/GVNfsCboUCLwU/aNwnrufS9jeEsvWd2Vt
iIn/okeQ7mTyB/3Dm4RFIAexE21+d5is6YTs45xUnDLhWzXYLU7VnHt5S5kXurEI
cmVA7C1EAqV+GAwkeFWFx/jBpBRKqvTPa8EpOu7cQL01x7KwU2cQeNdOyBF6Uf8a
cNKFI7jZZmu/mFp+YqU33ZIZxbLELm5PN1sz4ZvoIT8BJAQf1VmZg+GG87AvQCUz
zbWKrbHGVy/c+sohUmvCOQvmzca/7rZutFyaCOx2mEdrheRZMWQI/w2C03VfkNFS
vPpXrKXaWbVezHwF6x9PemRxvOPvLkeKAgSVuAfK0DhT5kEldqdzFzI7UO9MYfyX
j+HOUIRP/pseshUV6YbnAe9MS3T4zb5P+Qd1zRTGgo8R9/l1AmVHyrkbH1hGNjY0
mECHucCOh/VsyPAdg1XADJHqMg9081prySK8hNV6oazwSHdC38GdajuOmdyO3azQ
OpJZDQd0eP4fHPMU6F5HSzLOO/wYuAie8gWVSZ3ylDxDPIKfqcjVo+4bxJ8sbmpI
akj6BoMX7we0fjhlbdit
=5CRH
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"This is the main pull request for 5.4-rc1 merge window. I don't think
there is anything outstanding so next week should just be fixes, but
we'll see if I missed anything. I landed some fixes earlier in the
week but got delayed writing summary and sending it out, due to a mix
of sick kid and jetlag!
There are some fixes pending, but I'd rather get the main merge out of
the way instead of delaying it longer.
It's also pretty large in commit count and new amd header file size.
The largest thing is four new amdgpu products (navi12/14, arcturus and
renoir APU support).
Otherwise it's pretty much lots of work across the board, i915 has
started landing tigerlake support, lots of icelake fixes and lots of
locking reworking for future gpu support, lots of header file rework
(drmP.h is nearly gone), some old legacy hacks (DRM_WAIT_ON) have been
put into the places they are needed.
uapi:
- content protection type property for HDCP
core:
- rework include dependencies
- lots of drmP.h removals
- link rate calculation robustness fix
- make fb helper map only when required
- add connector->DDC adapter link
- DRM_WAIT_ON removed
- drop DRM_AUTH usage from drivers
dma-buf:
- reservation object fence helper
dma-fence:
- shrink dma_fence struct
- merge signal functions
- store timestamps in dma_fence
- selftests
ttm:
- embed drm_get_object struct into ttm_buffer_object
- release_notify callback
bridges:
- sii902x - audio graph card support
- tc358767 - aux data handling rework
- ti-snd64dsi86 - debugfs support, DSI mode flags support
panels:
- Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191, Boe
Himax8279d, Sharp LD-D5116Z01B
- TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01,
Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1
i915:
- Initial tigerlake platform support
- Locking simplification work, general all over refactoring.
- Selftests
- HDCP debug info improvements
- DSI properties
- Icelake display PLL fixes, colorspace fixes, bandwidth fixes, DSI
suspend/resume
- GuC fixes
- Perf fixes
- ElkhartLake enablement
- DP MST fixes
- GVT - command parser enhancements
amdgpu:
- add wipe memory on release flag for buffer creation
- Navi12/14 support (may be marked experimental)
- Arcturus support
- Renoir APU support
- mclk DPM for Navi
- DC display fixes
- Raven scatter/gather support
- RAS support for GFX
- Navi12 + Arcturus power features
- GPU reset for Picasso
- smu11 i2c controller support
amdkfd:
- navi12/14 support
- Arcturus support
radeon:
- kexec fix
nouveau:
- improved display color management
- detect lack of GPU power cables
vmwgfx:
- evicition priority support
- remove unused security feature
msm:
- msm8998 display support
- better async commit support for cursor updates
etnaviv:
- per-process address space support
- performance counter fixes
- softpin support
mcde:
- DCS transfers fix
exynos:
- drmP.h cleanup
lima:
- reduce logging
kirin:
- misc clenaups
komeda:
- dual-link support
- DT memory regions
hisilicon:
- misc fixes
imx:
- IPUv3 image converter fixes
- 32-bit RGB V4L2 pixel format support
ingenic:
- more support for panel related cases
mgag200:
- cursor support fix
panfrost:
- export GPU features register to userspace
- gpu heap allocations
- per-fd address space support
pl111:
- CLD pads wiring support removed from DT
rockchip:
- rework to use DRM PSR helpers
- fix bug in VOP_WIN_GET macro
- DSI DT binding rework
sun4i:
- improve support for color encoding and range
- DDC enabled GPIO
tinydrm:
- rework SPI support
- improve MIPI-DBI support
- moved to drm/tiny
vkms:
- rework CRC tracking
dw-hdmi:
- get_eld and i2s improvements
gm12u320:
- misc fixes
meson:
- global code cleanup
- vpu feature detect
omap:
- alpha/pixel blend mode properties
rcar-du:
- misc fixes"
* tag 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm: (2112 commits)
drm/nouveau/bar/gm20b: Avoid BAR1 teardown during init
drm/nouveau: Fix ordering between TTM and GEM release
drm/nouveau/prime: Extend DMA reservation object lock
drm/nouveau: Fix fallout from reservation object rework
drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors
drm/i915: Use NOEVICT for first pass on attemping to pin a GGTT mmap
drm/i915: to make vgpu ppgtt notificaiton as atomic operation
drm/i915: Flush the existing fence before GGTT read/write
drm/i915: Hold irq-off for the entire fake lock period
drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915
drm/i915/gvt: update vgpu workload head pointer correctly
drm/mcde: Fix DSI transfers
drm/msm: Use the correct dma_sync calls harder
drm/msm: remove unlikely() from WARN_ON() conditions
drm/msm/dsi: Fix return value check for clk_get_parent
drm/msm: add atomic traces
drm/msm/dpu: async commit support
drm/msm: async commit support
drm/msm: split power control from prepare/complete_commit
drm/msm: add kms->flush_commit()
...
Pull input updates from Dmitry Torokhov:
- input core allows hardware drivers to specify a [more precise]
timestamp (normally taken in top half) to better track velocity of
contacts
- input_dev instances now support "polling" mode so that drivers could
use the same object for polled and interrupt-driven operation. The
plan is to convert existing drivers and retire input_polled_dev API
- a new driver for the FlySky FS-iA6B RC receiver
- a refresh of BU21013 touchpad driver
- w90x900 keyboard and touchpad drivers are removed as the platform is
gone
- assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits)
Input: sidewinder - make array seq static const, makes object smaller
Input: reset device timestamp on sync
Input: bu21013_ts - switch to using standard touchscreen properties
Input: bu21013_ts - switch to using MT-B (slotted) protocol
Input: bu21013_ts - fix suspend when wake source
Input: bu21013_ts - use interrupt from I2C client
Input: bu21013_ts - remove support for platform data
Input: bu21013_ts - convert to using managed resources
Input: bu21013_ts - remove useless comments
Input: bu21013_ts - annotate supend/resume methods as __maybe_unused
Input: bu21013_ts - rename some variables
Input: bu21013_ts - convert to use GPIO descriptors
ARM: ux500: improve BU21013 touchpad bindings
Input: i8042 - enable wakeup on a stable struct device
Input: soc_button_array - use platform_device_register_resndata()
Input: psmouse - drop all unneeded functions from mouse headers
Input: add support for polling to input devices
Input: wacom_w8001 - allocate additional space for 'phys'
Input: cros_ec_keyb - add back missing mask for event_type
Input: remove dev_err() usage after platform_get_irq()
...
vmlinux BTF has more than 64k types.
Its string section is also at the offset larger than 64k.
Adjust both limits to make in-kernel BTF verifier successfully parse in-kernel BTF.
Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl1/hCoACgkQxWXV+ddt
WDs0lQ//flGLX4fvaY2vuWA26t1elITnIatyX8S+xP4pUsT1Tyy1egeGpR8Jku/7
sCOgUlEM2MNXqveOdkQqPJuFPp3B6tInz4S/fowtLlz4enp7uTXw2SFuS3bhOJ+b
rpxK9VTc6QV3aipBCG31m8fnDiMaj2Hcspp0oej3V2mBhLUvzn69+P4eo7WN+46w
r2F605+lfURauHE6WjM09HINx3NGSfPqdSA5rJvHSm0jlxhb9l3DJOX8cYkbf8lo
MAbLDZmtiDiQAqRcsQPi6LZ1LKBkOYaeSnVvnXnH23FI04LBra3duk03qpvWCW2R
c1tFnKF5vACCyBQp1z8WYP9GjjoW5WT33R2iXufgwXP6pkLpS/12qLLeXqO2K4p5
zINKrIkF3P+GHxiDsQZE3G9A4UpKWFHCxKdxyWIV8LQDEBrgE2Mo3NThEyRBbP+8
1dia4j+qFHvPTMNBvBCjCZMqDwbCe9H70WOXKGE36JITW2le91mn4qHl4SuWReUP
IoHYDVcC/eBGRegc9X+bLJNjJYqo+XFo6u32/fUC5YVhngycQEi2vg1vv8fWQ7dB
g/Ruo3Inrk8h5kPmrHvbOzGazgANIt5ELHrYMRMA5WSgaq29jtGt9oTnsrd+I88G
aPJtwAZfLwdSjl/pwJw8atEPrf04DA2w+gO7rZ/AmeLshnGfOTc=
=bY+a
-----END PGP SIGNATURE-----
Merge tag 'for-5.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This continues with work on code refactoring, sanity checks and space
handling. There are some less user visible changes, nothing that would
particularly stand out.
User visible changes:
- tree checker, more sanity checks of:
- ROOT_ITEM (key, size, generation, level, alignment, flags)
- EXTENT_ITEM and METADATA_ITEM checks (key, size, offset,
alignment, refs)
- tree block reference items
- EXTENT_DATA_REF (key, hash, offset)
- deprecate flag BTRFS_SUBVOL_CREATE_ASYNC for subvolume creation
ioctl, scheduled removal in 5.7
- delete stale and unused UAPI definitions
BTRFS_DEV_REPLACE_ITEM_STATE_*
- improved export of debugging information available via existing
sysfs directory structure
- try harder to delete relations between qgroups and allow to delete
orphan entries
- remove unreliable space checks before relocation starts
Core:
- space handling:
- improved ticket reservations and other high level logic in
order to remove special cases
- factor flushing infrastructure and use it for different
contexts, allows to remove some special case handling
- reduce metadata reservation when only updating inodes
- reduce global block reserve minimum size (affects small
filesystems)
- improved overcommit logic wrt global block reserve
- tests:
- fix memory leaks in extent IO tree
- catch all TRIM range
Fixes:
- fix ENOSPC errors, leading to transaction aborts, when cloning
extents
- several fixes for inode number cache (mount option inode_cache)
- fix potential soft lockups during send when traversing large trees
- fix unaligned access to space cache pages with SLUB debug on
(PowerPC)
Other:
- refactoring public/private functions, moving to new or more
appropriate files
- defines converted to enums
- error handling improvements
- more assertions and comments
- old code deletion"
* tag 'for-5.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (138 commits)
btrfs: Relinquish CPUs in btrfs_compare_trees
btrfs: Don't assign retval of btrfs_try_tree_write_lock/btrfs_tree_read_lock_atomic
btrfs: create structure to encode checksum type and length
btrfs: turn checksum type define into an enum
btrfs: add enospc debug messages for ticket failure
btrfs: do not account global reserve in can_overcommit
btrfs: use btrfs_try_granting_tickets in update_global_rsv
btrfs: always reserve our entire size for the global reserve
btrfs: change the minimum global reserve size
btrfs: rename btrfs_space_info_add_old_bytes
btrfs: remove orig_bytes from reserve_ticket
btrfs: fix may_commit_transaction to deal with no partial filling
btrfs: rework wake_all_tickets
btrfs: refactor the ticket wakeup code
btrfs: stop partially refilling tickets when releasing space
btrfs: add space reservation tracepoint for reserved bytes
btrfs: roll tracepoint into btrfs_space_info_update helper
btrfs: do not allow reservations if we have pending tickets
btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
...
Please consider pulling fs-verity for 5.4.
fs-verity is a filesystem feature that provides Merkle tree based
hashing (similar to dm-verity) for individual readonly files, mainly for
the purpose of efficient authenticity verification.
This pull request includes:
(a) The fs/verity/ support layer and documentation.
(b) fs-verity support for ext4 and f2fs.
Compared to the original fs-verity patchset from last year, the UAPI to
enable fs-verity on a file has been greatly simplified. Lots of other
things were cleaned up too.
fs-verity is planned to be used by two different projects on Android;
most of the userspace code is in place already. Another userspace tool
("fsverity-utils"), and xfstests, are also available. e2fsprogs and
f2fs-tools already have fs-verity support. Other people have shown
interest in using fs-verity too.
I've tested this on ext4 and f2fs with xfstests, both the existing tests
and the new fs-verity tests. This has also been in linux-next since
July 30 with no reported issues except a couple minor ones I found
myself and folded in fixes for.
Ted and I will be co-maintaining fs-verity.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXX8ZUBQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK2YOAQCbnBAKWDxXS3alLARRwjQLjmEtQIGl
gsek+WurFIg/zAEAlpSzHwu13LvYzTqv3rhO2yhSlvhnDu4GQEJPXPm0wgM=
=ID0n
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fs-verity support from Eric Biggers:
"fs-verity is a filesystem feature that provides Merkle tree based
hashing (similar to dm-verity) for individual readonly files, mainly
for the purpose of efficient authenticity verification.
This pull request includes:
(a) The fs/verity/ support layer and documentation.
(b) fs-verity support for ext4 and f2fs.
Compared to the original fs-verity patchset from last year, the UAPI
to enable fs-verity on a file has been greatly simplified. Lots of
other things were cleaned up too.
fs-verity is planned to be used by two different projects on Android;
most of the userspace code is in place already. Another userspace tool
("fsverity-utils"), and xfstests, are also available. e2fsprogs and
f2fs-tools already have fs-verity support. Other people have shown
interest in using fs-verity too.
I've tested this on ext4 and f2fs with xfstests, both the existing
tests and the new fs-verity tests. This has also been in linux-next
since July 30 with no reported issues except a couple minor ones I
found myself and folded in fixes for.
Ted and I will be co-maintaining fs-verity"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
f2fs: add fs-verity support
ext4: update on-disk format documentation for fs-verity
ext4: add fs-verity read support
ext4: add basic fs-verity support
fs-verity: support builtin file signatures
fs-verity: add SHA-512 support
fs-verity: implement FS_IOC_MEASURE_VERITY ioctl
fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
fs-verity: add data verification hooks for ->readpages()
fs-verity: add the hook for file ->setattr()
fs-verity: add the hook for file ->open()
fs-verity: add inode and superblock fields
fs-verity: add Kconfig and the helper functions for hashing
fs: uapi: define verity bit for FS_IOC_GETFLAGS
fs-verity: add UAPI header
fs-verity: add MAINTAINERS file entry
fs-verity: add a documentation file
This is a large update to fs/crypto/ which includes:
- Add ioctls that add/remove encryption keys to/from a filesystem-level
keyring. These fix user-reported issues where e.g. an encrypted home
directory can break NetworkManager, sshd, Docker, etc. because they
don't get access to the needed keyring. These ioctls also provide a
way to lock encrypted directories that doesn't use the vm.drop_caches
sysctl, so is faster, more reliable, and doesn't always need root.
- Add a new encryption policy version ("v2") which switches to a more
standard, secure, and flexible key derivation function, and starts
verifying that the correct key was supplied before using it. The key
derivation improvement is needed for its own sake as well as for
ongoing feature work for which the current way is too inflexible.
Work is in progress to update both Android and the 'fscrypt' userspace
tool to use both these features. (Working patches are available and
just need to be reviewed+merged.) Chrome OS will likely use them too.
This has also been tested on ext4, f2fs, and ubifs with xfstests -- both
the existing encryption tests, and the new tests for this. This has
also been in linux-next since Aug 16 with no reported issues. I'm also
using an fscrypt v2-encrypted home directory on my personal desktop.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXX8L/BQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3DqAQDER8ji5uMWbh00h4+eywfIQdcrUWI0
t2iEdqfNOoGTWAEAhE2u0SebIVwjluQ3N3HU9b/U6e5R0ZkZU9IQdwkZhQ0=
=J5WG
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"This is a large update to fs/crypto/ which includes:
- Add ioctls that add/remove encryption keys to/from a
filesystem-level keyring.
These fix user-reported issues where e.g. an encrypted home
directory can break NetworkManager, sshd, Docker, etc. because they
don't get access to the needed keyring. These ioctls also provide a
way to lock encrypted directories that doesn't use the
vm.drop_caches sysctl, so is faster, more reliable, and doesn't
always need root.
- Add a new encryption policy version ("v2") which switches to a more
standard, secure, and flexible key derivation function, and starts
verifying that the correct key was supplied before using it.
The key derivation improvement is needed for its own sake as well
as for ongoing feature work for which the current way is too
inflexible.
Work is in progress to update both Android and the 'fscrypt' userspace
tool to use both these features. (Working patches are available and
just need to be reviewed+merged.) Chrome OS will likely use them too.
This has also been tested on ext4, f2fs, and ubifs with xfstests --
both the existing encryption tests, and the new tests for this. This
has also been in linux-next since Aug 16 with no reported issues. I'm
also using an fscrypt v2-encrypted home directory on my personal
desktop"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: (27 commits)
ext4 crypto: fix to check feature status before get policy
fscrypt: document the new ioctls and policy version
ubifs: wire up new fscrypt ioctls
f2fs: wire up new fscrypt ioctls
ext4: wire up new fscrypt ioctls
fscrypt: require that key be added when setting a v2 encryption policy
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl
fscrypt: allow unprivileged users to add/remove keys for v2 policies
fscrypt: v2 encryption policy support
fscrypt: add an HKDF-SHA512 implementation
fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
fscrypt: rename keyinfo.c to keysetup.c
fscrypt: move v1 policy key setup to keysetup_v1.c
fscrypt: refactor key setup code in preparation for v2 policies
fscrypt: rename fscrypt_master_key to fscrypt_direct_key
fscrypt: add ->ci_inode to fscrypt_info
fscrypt: use FSCRYPT_* definitions, not FS_*
fscrypt: use FSCRYPT_ prefix for uapi constants
...
Pull networking updates from David Miller:
1) Support IPV6 RA Captive Portal Identifier, from Maciej Żenczykowski.
2) Use bio_vec in the networking instead of custom skb_frag_t, from
Matthew Wilcox.
3) Make use of xmit_more in r8169 driver, from Heiner Kallweit.
4) Add devmap_hash to xdp, from Toke Høiland-Jørgensen.
5) Support all variants of 5750X bnxt_en chips, from Michael Chan.
6) More RTNL avoidance work in the core and mlx5 driver, from Vlad
Buslov.
7) Add TCP syn cookies bpf helper, from Petar Penkov.
8) Add 'nettest' to selftests and use it, from David Ahern.
9) Add extack support to drop_monitor, add packet alert mode and
support for HW drops, from Ido Schimmel.
10) Add VLAN offload to stmmac, from Jose Abreu.
11) Lots of devm_platform_ioremap_resource() conversions, from
YueHaibing.
12) Add IONIC driver, from Shannon Nelson.
13) Several kTLS cleanups, from Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1930 commits)
mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer
mlxsw: spectrum: Register CPU port with devlink
mlxsw: spectrum_buffers: Prevent changing CPU port's configuration
net: ena: fix incorrect update of intr_delay_resolution
net: ena: fix retrieval of nonadaptive interrupt moderation intervals
net: ena: fix update of interrupt moderation register
net: ena: remove all old adaptive rx interrupt moderation code from ena_com
net: ena: remove ena_restore_ethtool_params() and relevant fields
net: ena: remove old adaptive interrupt moderation code from ena_netdev
net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*()
net: ena: enable the interrupt_moderation in driver_supported_features
net: ena: reimplement set/get_coalesce()
net: ena: switch to dim algorithm for rx adaptive interrupt moderation
net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it
net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable
ethtool: implement Energy Detect Powerdown support via phy-tunable
xen-netfront: do not assume sk_buff_head list is empty in error handling
s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb”
net: ena: don't wake up tx queue when down
drop_monitor: Better sanitize notified packets
...
Pull crypto updates from Herbert Xu:
"API:
- Add the ability to abort a skcipher walk.
Algorithms:
- Fix XTS to actually do the stealing.
- Add library helpers for AES and DES for single-block users.
- Add library helpers for SHA256.
- Add new DES key verification helper.
- Add surrounding bits for ESSIV generator.
- Add accelerations for aegis128.
- Add test vectors for lzo-rle.
Drivers:
- Add i.MX8MQ support to caam.
- Add gcm/ccm/cfb/ofb aes support in inside-secure.
- Add ofb/cfb aes support in media-tek.
- Add HiSilicon ZIP accelerator support.
Others:
- Fix potential race condition in padata.
- Use unbound workqueues in padata"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (311 commits)
crypto: caam - Cast to long first before pointer conversion
crypto: ccree - enable CTS support in AES-XTS
crypto: inside-secure - Probe transform record cache RAM sizes
crypto: inside-secure - Base RD fetchcount on actual RD FIFO size
crypto: inside-secure - Base CD fetchcount on actual CD FIFO size
crypto: inside-secure - Enable extended algorithms on newer HW
crypto: inside-secure: Corrected configuration of EIP96_TOKEN_CTRL
crypto: inside-secure - Add EIP97/EIP197 and endianness detection
padata: remove cpu_index from the parallel_queue
padata: unbind parallel jobs from specific CPUs
padata: use separate workqueues for parallel and serial work
padata, pcrypt: take CPU hotplug lock internally in padata_alloc_possible
crypto: pcrypt - remove padata cpumask notifier
padata: make padata_do_parallel find alternate callback CPU
workqueue: require CPU hotplug read exclusion for apply_workqueue_attrs
workqueue: unconfine alloc/apply/free_workqueue_attrs()
padata: allocate workqueue internally
arm64: dts: imx8mq: Add CAAM node
random: Use wait_event_freezable() in add_hwgenerator_randomness()
crypto: ux500 - Fix COMPILE_TEST warnings
...
Add a basic file system module for virtio-fs. This does not yet contain
shared data support between host and guest or metadata coherency speedups.
However it is already significantly faster than virtio-9p.
Design Overview
===============
With the goal of designing something with better performance and local file
system semantics, a bunch of ideas were proposed.
- Use fuse protocol (instead of 9p) for communication between guest and
host. Guest kernel will be fuse client and a fuse server will run on
host to serve the requests.
- For data access inside guest, mmap portion of file in QEMU address space
and guest accesses this memory using dax. That way guest page cache is
bypassed and there is only one copy of data (on host). This will also
enable mmap(MAP_SHARED) between guests.
- For metadata coherency, there is a shared memory region which contains
version number associated with metadata and any guest changing metadata
updates version number and other guests refresh metadata on next access.
This is yet to be implemented.
How virtio-fs differs from existing approaches
==============================================
The unique idea behind virtio-fs is to take advantage of the co-location of
the virtual machine and hypervisor to avoid communication (vmexits).
DAX allows file contents to be accessed without communication with the
hypervisor. The shared memory region for metadata avoids communication in
the common case where metadata is unchanged.
By replacing expensive communication with cheaper shared memory accesses,
we expect to achieve better performance than approaches based on network
file system protocols. In addition, this also makes it easier to achieve
local file system semantics (coherency).
These techniques are not applicable to network file system protocols since
the communications channel is bypassed by taking advantage of shared memory
on a local machine. This is why we decided to build virtio-fs rather than
focus on 9P or NFS.
Caching Modes
=============
Like virtio-9p, different caching modes are supported which determine the
coherency level as well. The “cache=FOO” and “writeback” options control
the level of coherence between the guest and host filesystems.
- cache=none
metadata, data and pathname lookup are not cached in guest. They are
always fetched from host and any changes are immediately pushed to host.
- cache=always
metadata, data and pathname lookup are cached in guest and never expire.
- cache=auto
metadata and pathname lookup cache expires after a configured amount of
time (default is 1 second). Data is cached while the file is open
(close to open consistency).
- writeback/no_writeback
These options control the writeback strategy. If writeback is disabled,
then normal writes will immediately be synchronized with the host fs.
If writeback is enabled, then writes may be cached in the guest until
the file is closed or an fsync(2) performed. This option has no effect
on mmap-ed writes or writes going through the DAX mechanism.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Here is the big char/misc driver pull request for 5.4-rc1.
As has been happening in previous releases, more and more individual
driver subsystem trees are ending up in here. Now if that is good or
bad I can't tell, but hopefully it makes your life easier as it's more
of an aggregation of trees together to one merge point for you.
Anyway, lots of stuff in here:
- habanalabs driver updates
- thunderbolt driver updates
- misc driver updates
- coresight and intel_th hwtracing driver updates
- fpga driver updates
- extcon driver updates
- some dma driver updates
- char driver updates
- android binder driver updates
- nvmem driver updates
- phy driver updates
- parport driver fixes
- pcmcia driver fix
- uio driver updates
- w1 driver updates
- configfs fixes
- other assorted driver updates
All of these have been in linux-next for a long time with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXYIT1g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym9lwCgrHZlMMvfYNVm6GQ5ge58JJsVTL4AoNatTcL4
hfVMA6pCHWBjV65xVSf6
=Tijw
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver pull request for 5.4-rc1.
As has been happening in previous releases, more and more individual
driver subsystem trees are ending up in here. Now if that is good or
bad I can't tell, but hopefully it makes your life easier as it's more
of an aggregation of trees together to one merge point for you.
Anyway, lots of stuff in here:
- habanalabs driver updates
- thunderbolt driver updates
- misc driver updates
- coresight and intel_th hwtracing driver updates
- fpga driver updates
- extcon driver updates
- some dma driver updates
- char driver updates
- android binder driver updates
- nvmem driver updates
- phy driver updates
- parport driver fixes
- pcmcia driver fix
- uio driver updates
- w1 driver updates
- configfs fixes
- other assorted driver updates
All of these have been in linux-next for a long time with no reported
issues"
* tag 'char-misc-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (200 commits)
misc: mic: Use PTR_ERR_OR_ZERO rather than its implementation
habanalabs: correctly cast variable to __le32
habanalabs: show correct id in error print
habanalabs: stop using the acronym KMD
habanalabs: display card name as sensors header
habanalabs: add uapi to retrieve aggregate H/W events
habanalabs: add uapi to retrieve device utilization
habanalabs: Make the Coresight timestamp perpetual
habanalabs: explicitly set the queue-id enumerated numbers
habanalabs: print to kernel log when reset is finished
habanalabs: replace __le32_to_cpu with le32_to_cpu
habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64
habanalabs: Handle HW_IP_INFO if device disabled or in reset
habanalabs: Expose devices after initialization is done
habanalabs: improve security in Debug IOCTL
habanalabs: use default structure for user input in Debug IOCTL
habanalabs: Add descriptive name to PSOC app status register
habanalabs: Add descriptive names to PSOC scratch-pad registers
habanalabs: create two char devices per ASIC
habanalabs: change device_setup_cdev() to be more generic
...
Here is the big staging/iio driver update for 5.4-rc1.
Lots of churn here, with a few driver/filesystems moving out of staging
finally:
- erofs moved out of staging
- greybus core code moved out of staging
Along with that, a new filesytem has been added:
- extfat
to provide support for those devices requiring that filesystem (i.e.
transfer devices to/from windows systems or printers.)
Other than that, there a number of new IIO drivers, and lots and lots
and lots of staging driver cleanups and minor fixes as people continue
to dig into those for easy changes.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXYIW0g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynJgwCgt22YQdsWRrOuJnZp3xpFq/ZSMWAAn0dAgFf5
SlSTI2nQhbW7jjdShrPR
=Ra09
-----END PGP SIGNATURE-----
Merge tag 'staging-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver updates from Greg KH:
"Here is the big staging/iio driver update for 5.4-rc1.
Lots of churn here, with a few driver/filesystems moving out of
staging finally:
- erofs moved out of staging
- greybus core code moved out of staging
Along with that, a new filesytem has been added:
- extfat
to provide support for those devices requiring that filesystem (i.e.
transfer devices to/from windows systems or printers)
Other than that, there a number of new IIO drivers, and lots and lots
and lots of staging driver cleanups and minor fixes as people continue
to dig into those for easy changes.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (453 commits)
Staging: gasket: Use temporaries to reduce line length.
Staging: octeon: Avoid several usecases of strcpy
staging: vhciq_core: replace snprintf with scnprintf
staging: wilc1000: avoid twice IRQ handler execution for each single interrupt
staging: wilc1000: remove unused interrupt status handling code
staging: fbtft: make several arrays static const, makes object smaller
staging: rtl8188eu: make two arrays static const, makes object smaller
staging: rtl8723bs: core: Remove Macro "IS_MAC_ADDRESS_BROADCAST"
dt-bindings: anybus-controller: move to staging/ tree
staging: emxx_udc: remove local TRUE/FALSE definition
staging: wilc1000: look for rtc_clk clock
staging: dt-bindings: wilc1000: add optional rtc_clk property
staging: nvec: make use of devm_platform_ioremap_resource
staging: exfat: drop unused function parameter
Staging: exfat: Avoid use of strcpy
staging: exfat: use integer constants
staging: exfat: cleanup spacing for casts
staging: exfat: cleanup spacing for operators
staging: rtl8723bs: hal: remove redundant variable n
staging: pi433: Fix typo in documentation
...
Even in this age, people are still making new serial port silicon,
why...
Anyway, here's the TTY and Serial driver update for 5.4-rc1. Lots of
changes in here for a number of embedded serial port devices that are
being worked on because people really like to see those console logs...
Other than that, nothing major here, no core tty changes that anyone
should care about.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXYIXfw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymmNACeIYq7038NXGBXBj1DfN90GeGNenwAnRkjzF+6
YwEhFRxOMglnbvDevgPU
=E6jv
-----END PGP SIGNATURE-----
Merge tag 'tty-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Even in this age, people are still making new serial port silicon,
why...
Anyway, here's the TTY and Serial driver update for 5.4-rc1. Lots of
changes in here for a number of embedded serial port devices that are
being worked on because people really like to see those console
logs...
Other than that, nothing major here, no core tty changes that anyone
should care about.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (125 commits)
serial: tegra: Add PIO mode support
serial: tegra: report clk rate errors
serial: tegra: add support to adjust baud rate
serial: tegra: DT for Adjusted baud rates
serial: tegra: add support to use 8 bytes trigger
serial: tegra: set maximum num of uart ports to 8
serial: tegra: check for FIFO mode enabled status
dt-binding: serial: tegra: add new chips
serial: tegra: report error to upper tty layer
serial: tegra: flush the RX fifo on frame error
serial: tegra: avoid reg access when clk disabled
serial: tegra: add support to ignore read
serial: sprd: correct the wrong sequence of arguments
dt-bindings: serial: Convert riscv,sifive-serial to json-schema
serial: max310x: turn off transmitter before activating AutoCTS or auto transmitter flow control
serial: max310x: Properly set flags in AutoCTS mode
tty: serial: fix platform_no_drv_owner.cocci warnings
dt-bindings: serial: Document Freescale LINFlexD UART
serial: fsl_linflexuart: Update compatible string
tty: n_gsm: avoid recursive locking with async port hangup
...
Here is the big set of USB patches for 5.4-rc1.
Two major chunks of code are moving out of the tree and into the staging
directory, uwb and wusb (wireless USB support), because there are no
devices that actually use this protocol anymore, and what we have today
probably doesn't work at all given that the maintainers left many many
years ago. So move it to staging where it will be removed in a few
releases if no one screams.
Other than that, lots of little things. The usual gadget and xhci and
usb serial driver updates, along with a bunch of sysfs file cleanups due
to the driver core changes to support that. Nothing really major, just
constant forward progress.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXYIYYQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymn8ACfTi8Y2ku/yqw8Nvs4vQjc08MUDhgAoNnbhsI6
H3HUrZTjJJzuxHCM22Lh
=8ZRm
-----END PGP SIGNATURE-----
Merge tag 'usb-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB updates from Greg KH:
"Here is the big set of USB patches for 5.4-rc1.
Two major chunks of code are moving out of the tree and into the
staging directory, uwb and wusb (wireless USB support), because there
are no devices that actually use this protocol anymore, and what we
have today probably doesn't work at all given that the maintainers
left many many years ago. So move it to staging where it will be
removed in a few releases if no one screams.
Other than that, lots of little things. The usual gadget and xhci and
usb serial driver updates, along with a bunch of sysfs file cleanups
due to the driver core changes to support that. Nothing really major,
just constant forward progress.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (159 commits)
USB: usbcore: Fix slab-out-of-bounds bug during device reset
usb: cdns3: Remove redundant dev_err call in cdns3_probe()
USB: rio500: Fix lockdep violation
USB: rio500: simplify locking
usb: mtu3: register a USB Role Switch for dual role mode
usb: common: add USB GPIO based connection detection driver
usb: common: create Kconfig file
usb: roles: get usb-role-switch from parent
usb: roles: Add fwnode_usb_role_switch_get() function
device connection: Add fwnode_connection_find_match()
usb: roles: Introduce stubs for the exiting functions in role.h
dt-bindings: usb: mtu3: add properties about USB Role Switch
dt-bindings: usb: add binding for USB GPIO based connection detection driver
dt-bindings: connector: add optional properties for Type-B
dt-binding: usb: add usb-role-switch property
usbip: Implement SG support to vhci-hcd and stub driver
usb: roles: intel: Enable static DRD mode for role switch
xhci-ext-caps.c: Add property to disable Intel SW switch
usb: dwc3: remove generic PHY calibrate() calls
usb: core: phy: add support for PHY calibration
...
* ARM: ITS translation cache; support for 512 vCPUs, various cleanups
and bugfixes
* PPC: various minor fixes and preparation
* x86: bugfixes all over the place (posted interrupts, SVM, emulation
corner cases, blocked INIT), some IPI optimizations
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdf7fdAAoJEL/70l94x66DJzkIAKDcuWXJB4Qtoto6yUvPiHZm
LYkY/Dn1zulb/DhzrBoXFey/jZXwl9kxMYkVTefnrAl0fRwFGX+G1UYnQrtAL6Gr
ifdTYdy3kZhXCnnp99QAantWDswJHo1THwbmHrlmkxS4MdisEaTHwgjaHrDRZ4/d
FAEwW2isSonP3YJfTtsKFFjL9k2D4iMnwZ/R2B7UOaWvgnerZ1GLmOkilvnzGGEV
IQ89IIkWlkKd4SKgq8RkDKlfW5JrLrSdTK2Uf0DvAxV+J0EFkEaR+WlLsqumra0z
Eg3KwNScfQj0DyT0TzurcOxObcQPoMNSFYXLRbUu1+i0CGgm90XpF1IosiuihgU=
=w6I3
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"s390:
- ioctl hardening
- selftests
ARM:
- ITS translation cache
- support for 512 vCPUs
- various cleanups and bugfixes
PPC:
- various minor fixes and preparation
x86:
- bugfixes all over the place (posted interrupts, SVM, emulation
corner cases, blocked INIT)
- some IPI optimizations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits)
KVM: X86: Use IPI shorthands in kvm guest when support
KVM: x86: Fix INIT signal handling in various CPU states
KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode
KVM: VMX: Stop the preemption timer during vCPU reset
KVM: LAPIC: Micro optimize IPI latency
kvm: Nested KVM MMUs need PAE root too
KVM: x86: set ctxt->have_exception in x86_decode_insn()
KVM: x86: always stop emulation on page fault
KVM: nVMX: trace nested VM-Enter failures detected by H/W
KVM: nVMX: add tracepoint for failed nested VM-Enter
x86: KVM: svm: Fix a check in nested_svm_vmrun()
KVM: x86: Return to userspace with internal error on unexpected exit reason
KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code
KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers
doc: kvm: Fix return description of KVM_SET_MSRS
KVM: X86: Tune PLE Window tracepoint
KVM: VMX: Change ple_window type to unsigned int
KVM: X86: Remove tailing newline for tracepoints
KVM: X86: Trace vcpu_id for vmexit
KVM: x86: Manually calculate reserved bits when loading PDPTRS
...
There's been a few requests for functionality similar to io_getevents()
and epoll_wait(), where the user can specify a timeout for waiting on
events. I deliberately did not add support for this through the system
call initially to avoid overloading the args, but I can see that the use
cases for this are valid.
This adds support for IORING_OP_TIMEOUT. If a user wants to get woken
when waiting for events, simply submit one of these timeout commands
with your wait call (or before). This ensures that the application
sleeping on the CQ ring waiting for events will get woken. The timeout
command is passed in as a pointer to a struct timespec. Timeouts are
relative. The timeout command also includes a way to auto-cancel after
N events has passed.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl1/essACgkQCF8+vY7k
4RV4YQ/+IILxwEuXn6cU2H/ss2+6QmOdOYISGDniWRQGTE9NN4AuvF7jv3qxek5o
FqisJXFJWCzpO7+LpKomAQXweWmVGaSV99QSJhRUyM0lNIrXD+rACO77yYL5ZwSX
r5REvk+ykozgRrQrWpiHCGOyH08Kt/LO/jA29VaJokX6yX4w0xvRc0rUBe2OlL+K
rcaXDnVw8SloVpyDaMY0DDqP8v0iGsUufTxFwAoskL36+qudXgunrg1yoNwmvTL+
spxLFs/I2LBIuBdvSg5W+FmwC3YnMdE/BDiaor9jKvhqUT9hOIozZ+pNmVTEsfvk
1FdTobf2mhvzmJrn4bdQSfFb7BiDNgMuf6/IpeRJqQSpQHzyHQhVGBG8h3axSOhM
5QWHBqrA0MAhcwVsKBUKj+mcfg7uNxtGPqmoeSVlMz+LH1hhokNWTAq5p2FgiDh8
wrSsZyJItHg2hbXPkWP9EgA40Q/xzwsvUCvoz0/+dCd08DwIeSi2HIQbNddxg7y8
NllZY0o4h6ETeDIfQb6k3zrRnAqOwlxxvXIMlaGanYegZBXMltiBIqiWrUG1qQh5
CrBTAE28vBCNNTxQJyEjut6H5/RpWEFwebcUkn1BkeUAPZ1w72+LTJ/obekiC0/6
Dxv7VvoTG+UFO14Fz7L3lYYTK2+OGvAsuXHAhQgrW5bWfmm8Cog=
=QTT+
-----END PGP SIGNATURE-----
Merge tag 'media/v5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- a new sensor driver for ov5675
- a new platform driver for Allwinner A10 sensor interface
- some new remote controller keymaps
- some cosmetic changes at V4L2 core in order to avoid #ifdefs and to
merge two core modules into one
- removal of bcm2048 radio driver from staging
- removal of davinci_vpfe video driver from staging
- regression fix since Kernel 5.1 at the legacy VideoBuffer version 1
core
- added some documentation for remote controller protocols
- pixel format documentation was split on two files
- lots of other driver improvements and cleanups
* tag 'media/v5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (321 commits)
media: videobuf-core.c: poll_wait needs a non-NULL buf pointer
media: sun4i: Make sun4i_csi_formats static
media: imx: remove unused including <linux/version.h>
media: stm32-dcmi: Delete an unnecessary of_node_put() call in dcmi_probe()
media: pvrusb2: qctrl.flag will be uninitlaized if cx2341x_ctrl_query() returns error code
media: em28xx: Fix exception handling in em28xx_alloc_urbs()
media: don't do a 31 bit shift on a signed int
media: use the BIT() macro
media: ov9650: add a sanity check
media: aspeed-video: address a protential usage of an unitialized var
media: vicodec: make life easier for static analyzers
media: remove include stdarg.h from some drivers
v4l2-core: fix coding style for the two new c files
media: v4l2-core: Remove BUG() from i2c and spi helpers
media: v4l2-core: introduce a helper to unregister a i2c subdev
media: v4l2-core: introduce a helper to unregister a spi subdev
media: v4l2-core: move i2c helpers out of v4l2-common.c
media: v4l2-core: move spi helpers out of v4l2-common.c
media: v4l2-core: Module re-organization
media: usbvision: Remove dead code
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl1/no0QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmo9EACFXMbdNmEEUMyRSdOkVLlr7ZlTyQi1tLpB
YESDPxdBfybzpi0qa8JSaysGIfvSkSjmSAqBqrWPmASOSOL6CK4bbA4fTYbgPplk
XeHUdgGiG34oCQUn8Xil5reYaTm7I6LQWnWTpVa5fIhAyUYaGJL+987ykoGmpQmB
Dvf3YSc+8H0RTp9PCMVd6UCGPkZbVlLImGad3PF5ULvTEaE4RCXC2aiAgh0p1l5A
J2CkRZ+/mio3zN2O4YN7VdPGfr1Wo1iZ834xbIGLegv1miHXagFk7jwTcC7zIt5t
oSnJnqIg3iCe7SpWt4Bkzw/zy/2UqaspifbCMgw8vychlViVRUHFO5h85Yboo7kQ
OMLEQPcwjm6dTHv5h1iXF9LW1O7NoiYmmgvApU9uOo1HUrl1X7PZ3JEfUsVHxkOO
T4D5igf0Krsl1eAbiwEUQzy7vFZ8PlRHqrHgK+fkyotzHu1BJR7OQkYygEfGFOB/
EfMxplGDpmibYGuWCwDX2bPAmLV3SPUQENReHrfPJRDt5TD1UkFpVGv/PLLhbr0p
cLYI78DKpDSigBpVMmwq5nTYpnex33eyDTTA8C0sakcsdzdmU5qv30y3wm4nTiep
f6gZo6IMXwRg/rCgVVrd9SKQAr/8wEzVlsDW3qyi2pVT8sHIgm0tFv7paihXGdDV
xsKgmTrQQQ==
=Qt+h
-----END PGP SIGNATURE-----
Merge tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- Two NVMe pull requests:
- ana log parse fix from Anton
- nvme quirks support for Apple devices from Ben
- fix missing bio completion tracing for multipath stack devices
from Hannes and Mikhail
- IP TOS settings for nvme rdma and tcp transports from Israel
- rq_dma_dir cleanups from Israel
- tracing for Get LBA Status command from Minwoo
- Some nvme-tcp cleanups from Minwoo, Potnuri and Myself
- Some consolidation between the fabrics transports for handling
the CAP register
- reset race with ns scanning fix for fabrics (move fabrics
commands to a dedicated request queue with a different lifetime
from the admin request queue)."
- controller reset and namespace scan races fixes
- nvme discovery log change uevent support
- naming improvements from Keith
- multiple discovery controllers reject fix from James
- some regular cleanups from various people
- Series fixing (and re-fixing) null_blk debug printing and nr_devices
checks (André)
- A few pull requests from Song, with fixes from Andy, Guoqing,
Guilherme, Neil, Nigel, and Yufen.
- REQ_OP_ZONE_RESET_ALL support (Chaitanya)
- Bio merge handling unification (Christoph)
- Pick default elevator correctly for devices with special needs
(Damien)
- Block stats fixes (Hou)
- Timeout and support devices nbd fixes (Mike)
- Series fixing races around elevator switching and device add/remove
(Ming)
- sed-opal cleanups (Revanth)
- Per device weight support for BFQ (Fam)
- Support for blk-iocost, a new model that can properly account cost of
IO workloads. (Tejun)
- blk-cgroup writeback fixes (Tejun)
- paride queue init fixes (zhengbin)
- blk_set_runtime_active() cleanup (Stanley)
- Block segment mapping optimizations (Bart)
- lightnvm fixes (Hans/Minwoo/YueHaibing)
- Various little fixes and cleanups
* tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block: (186 commits)
null_blk: format pr_* logs with pr_fmt
null_blk: match the type of parameter nr_devices
null_blk: do not fail the module load with zero devices
block: also check RQF_STATS in blk_mq_need_time_stamp()
block: make rq sector size accessible for block stats
bfq: Fix bfq linkage error
raid5: use bio_end_sector in r5_next_bio
raid5: remove STRIPE_OPS_REQ_PENDING
md: add feature flag MD_FEATURE_RAID0_LAYOUT
md/raid0: avoid RAID0 data corruption due to layout confusion.
raid5: don't set STRIPE_HANDLE to stripe which is in batch list
raid5: don't increment read_errors on EILSEQ return
nvmet: fix a wrong error status returned in error log page
nvme: send discovery log page change events to userspace
nvme: add uevent variables for controller devices
nvme: enable aen regardless of the presence of I/O queues
nvme-fabrics: allow discovery subsystems accept a kato
nvmet: Use PTR_ERR_OR_ZERO() in nvmet_init_discovery()
nvme: Remove redundant assignment of cq vector
nvme: Assign subsys instance from first ctrl
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl1+zs8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvIOD/9P4MhnjTc9sFz978YGw2kP/yUjW0Vs194B
nV5wmM+J3I6VUBgurNQeB7f14b/1CZnnPjisLKGzz1SKrhRQT/HuXzItsYpzRql3
3GvD4aRNJd52pfkX3p7tA8uMr5Hh2gnNLhJtlPTaaAiIc8XCme+hHMnlmhwqMZ0T
FKTc0Pw0yxvYIbxmt+pfWxoKhHrq12dYLG61Iv8XeSCb3k+EzdZV8jQdnN/TVhsF
ngzxRMvbPjJ/TvDcsdV0qNqmLTojgmuiAvDs/3YPCreGVhMBkm4w/tlZGqTJHSdb
Ljxzwp0Ungu94O2Ct3gN+pIxTJ5hgfnArfT40fbWczZrlXzUKw4d3mW/QG3g/Elf
aUfCY+3WWR/Fs+XT5ZDasJDewwO4j5wYgO8VYeyTX08jobcxC/osqoMtGUqsljwJ
CTWMYSD8MTXEnOIZFK9sgjg0AoJYVDE8SXN/zx+7K7iVk9+5xQTkuL/cCd+3/Nih
73QqvwpRm0Tik4yLGcw+EJc2ttvNUsI5EEETloiGNaMeK1oOJvzSu8qhh0G9VdYo
DWka5Yo+kLzwuUMbtzVSq288Un+RjbUIoqrqMFqm5QLqUYYg735IH2McIZvIEHEj
o0vjudw/yxcpURGqV5cokeusIHex/LC8hItCwHY4U9959A2yOtiCs/Bi7LIj3lAu
FO5gTlGDHA==
=KSaR
-----END PGP SIGNATURE-----
Merge tag 'for-5.4/io_uring-2019-09-15' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
- Allocate SQ/CQ ring together, more efficient. Expose this through a
feature flag as well, so we can reduce the number of mmaps by 1
(Hristo and me)
- Fix for sequence logic with SQ thread (Jackie).
- Add support for links with drain commands (Jackie).
- Improved async merging (me)
- Improved buffered async write performance (me)
- Support SQ poll wakeup + event get in single io_uring_enter() (me)
- Support larger SQ ring size. For epoll conversions, the 4k limit was
too small for some prod workloads (Daniel).
- put_user_page() usage (John)
* tag 'for-5.4/io_uring-2019-09-15' of git://git.kernel.dk/linux-block:
io_uring: increase IORING_MAX_ENTRIES to 32K
io_uring: make sqpoll wakeup possible with getevents
io_uring: extend async work merging
io_uring: limit parallelism of buffered writes
io_uring: add io_queue_async_work() helper
io_uring: optimize submit_and_wait API
io_uring: add support for link with drain
io_uring: fix wrong sequence setting logic
io_uring: expose single mmap capability
io_uring: allocate the two rings together
fs/io_uring.c: convert put_page() to put_user_page*()
Pull x86 cpu-feature updates from Ingo Molnar:
- Rework the Intel model names symbols/macros, which were decades of
ad-hoc extensions and added random noise. It's now a coherent, easy
to follow nomenclature.
- Add new Intel CPU model IDs:
- "Tiger Lake" desktop and mobile models
- "Elkhart Lake" model ID
- and the "Lightning Mountain" variant of Airmont, plus support code
- Add the new AVX512_VP2INTERSECT instruction to cpufeatures
- Remove Intel MPX user-visible APIs and the self-tests, because the
toolchain (gcc) is not supporting it going forward. This is the
first, lowest-risk phase of MPX removal.
- Remove X86_FEATURE_MFENCE_RDTSC
- Various smaller cleanups and fixes
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
x86/cpu: Update init data for new Airmont CPU model
x86/cpu: Add new Airmont variant to Intel family
x86/cpu: Add Elkhart Lake to Intel family
x86/cpu: Add Tiger Lake to Intel family
x86: Correct misc typos
x86/intel: Add common OPTDIFFs
x86/intel: Aggregate microserver naming
x86/intel: Aggregate big core graphics naming
x86/intel: Aggregate big core mobile naming
x86/intel: Aggregate big core client naming
x86/cpufeature: Explain the macro duplication
x86/ftrace: Remove mcount() declaration
x86/PCI: Remove superfluous returns from void functions
x86/msr-index: Move AMD MSRs where they belong
x86/cpu: Use constant definitions for CPU models
lib: Remove redundant ftrace flag removal
x86/crash: Remove unnecessary comparison
x86/bitops: Use __builtin_constant_p() directly instead of IS_IMMEDIATE()
x86: Remove X86_FEATURE_MFENCE_RDTSC
x86/mpx: Remove MPX APIs
...
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Improved kbprobes robustness
- Intel PEBS support for PT hardware tracing
- Other Intel PT improvements: high order pages memory footprint
reduction and various related cleanups
- Misc cleanups
The perf tooling side has been very busy in this cycle, with over 300
commits. This is an incomplete high-level summary of the many
improvements done by over 30 developers:
- Lots of updates to the following tools:
'perf c2c'
'perf config'
'perf record'
'perf report'
'perf script'
'perf test'
'perf top'
'perf trace'
- Updates to libperf and libtraceevent, and a consolidation of the
proliferation of x86 instruction decoder libraries.
- Vendor event updates for Intel and PowerPC CPUs,
- Updates to hardware tracing tooling for ARM and Intel CPUs,
- ... and lots of other changes and cleanups - see the shortlog and
Git log for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (322 commits)
kprobes: Prohibit probing on BUG() and WARN() address
perf/x86: Make more stuff static
x86, perf: Fix the dependency of the x86 insn decoder selftest
objtool: Ignore intentional differences for the x86 insn decoder
objtool: Update sync-check.sh from perf's check-headers.sh
perf build: Ignore intentional differences for the x86 insn decoder
perf intel-pt: Use shared x86 insn decoder
perf intel-pt: Remove inat.c from build dependency list
perf: Update .gitignore file
objtool: Move x86 insn decoder to a common location
perf metricgroup: Support multiple events for metricgroup
perf metricgroup: Scale the metric result
perf pmu: Change convert_scale from static to global
perf symbols: Move mem_info and branch_info out of symbol.h
perf auxtrace: Uninline functions that touch perf_session
perf tools: Remove needless evlist.h include directives
perf tools: Remove needless evlist.h include directives
perf tools: Remove needless thread_map.h include directives
perf tools: Remove needless thread.h include directives
perf tools: Remove needless map.h include directives
...
Pull parisc updates from Helge Deller:
- Make the powerpc implementation to read elf files available as a
public kexec interface so it can be re-used on other architectures
(Sven)
- Implement kexec on parisc (Sven)
- Add kprobes on ftrace on parisc (Sven)
- Fix kernel crash with HSC-PCI cards based on card-mode Dino
- Add assembly implementations for memset, strlen, strcpy, strncpy and
strcat
- Some cleanups, documentation updates, warning fixes, ...
* 'parisc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (25 commits)
parisc: Have git ignore generated real2.S and firmware.c
parisc: Disable HP HSC-PCI Cards to prevent kernel crash
parisc: add support for kexec_file_load() syscall
parisc: wire up kexec_file_load syscall
parisc: add kexec syscall support
parisc: add __pdc_cpu_rendezvous()
kprobes/parisc: remove arch_kprobe_on_func_entry()
kexec_elf: support 32 bit ELF files
kexec_elf: remove unused variable in kexec_elf_load()
kexec_elf: remove Elf_Rel macro
kexec_elf: remove PURGATORY_STACK_SIZE
kexec_elf: remove parsing of section headers
kexec_elf: change order of elf_*_to_cpu() functions
kexec: add KEXEC_ELF
parisc: Save some bytes in dino driver
parisc: Drop comments which are already in pci.h
parisc: Convert eisa_enumerator to use pr_cont()
parisc: Avoid warning when loading hppb driver
parisc: speed up flush_tlb_all_local with qemu
parisc: Add ALTERNATIVE_CODE() and ALT_COND_RUN_ON_QEMU
...
- 52-bit virtual addressing in the kernel
- New ABI to allow tagged user pointers to be dereferenced by syscalls
- Early RNG seeding by the bootloader
- Improve robustness of SMP boot
- Fix TLB invalidation in light of recent architectural clarifications
- Support for i.MX8 DDR PMU
- Remove direct LSE instruction patching in favour of static keys
- Function error injection using kprobes
- Support for the PPTT "thread" flag introduced by ACPI 6.3
- Move PSCI idle code into proper cpuidle driver
- Relaxation of implicit I/O memory barriers
- Build with RELR relocations when toolchain supports them
- Numerous cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl1yYREQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNAM3CAChqDFQkryXoHwdeEcaukMRVNxtxOi4pM4g
5xqkb7PoqRJssIblsuhaXjrSD97yWCgaqCmFe6rKoes++lP4bFcTe22KXPPyPBED
A+tK4nTuKKcZfVbEanUjI+ihXaHJmKZ/kwAxWsEBYZ4WCOe3voCiJVNO2fHxqg1M
8TskZ2BoayTbWMXih0eJg2MCy/xApBq4b3nZG4bKI7Z9UpXiKN1NYtDh98ZEBK4V
d/oNoHsJ2ZvIQsztoBJMsvr09DTCazCijWZiECadm6l41WEPFizngrACiSJLLtYo
0qu4qxgg9zgFlvBCRQmIYSggTuv35RgXSfcOwChmW5DUjHG+f9GK
=Ru4B
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Although there isn't tonnes of code in terms of line count, there are
a fair few headline features which I've noted both in the tag and also
in the merge commits when I pulled everything together.
The part I'm most pleased with is that we had 35 contributors this
time around, which feels like a big jump from the usual small group of
core arm64 arch developers. Hopefully they all enjoyed it so much that
they'll continue to contribute, but we'll see.
It's probably worth highlighting that we've pulled in a branch from
the risc-v folks which moves our CPU topology code out to where it can
be shared with others.
Summary:
- 52-bit virtual addressing in the kernel
- New ABI to allow tagged user pointers to be dereferenced by
syscalls
- Early RNG seeding by the bootloader
- Improve robustness of SMP boot
- Fix TLB invalidation in light of recent architectural
clarifications
- Support for i.MX8 DDR PMU
- Remove direct LSE instruction patching in favour of static keys
- Function error injection using kprobes
- Support for the PPTT "thread" flag introduced by ACPI 6.3
- Move PSCI idle code into proper cpuidle driver
- Relaxation of implicit I/O memory barriers
- Build with RELR relocations when toolchain supports them
- Numerous cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (114 commits)
arm64: remove __iounmap
arm64: atomics: Use K constraint when toolchain appears to support it
arm64: atomics: Undefine internal macros after use
arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL
arm64: asm: Kill 'asm/atomic_arch.h'
arm64: lse: Remove unused 'alt_lse' assembly macro
arm64: atomics: Remove atomic_ll_sc compilation unit
arm64: avoid using hard-coded registers for LSE atomics
arm64: atomics: avoid out-of-line ll/sc atomics
arm64: Use correct ll/sc atomic constraints
jump_label: Don't warn on __exit jump entries
docs/perf: Add documentation for the i.MX8 DDR PMU
perf/imx_ddr: Add support for AXI ID filtering
arm64: kpti: ensure patched kernel text is fetched from PoU
arm64: fix fixmap copy for 16K pages and 48-bit VA
perf/smmuv3: Validate groups for global filtering
perf/smmuv3: Validate group size
arm64: Relax Documentation/arm64/tagged-pointers.rst
arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_F
arm64: mm: Ignore spurious translation faults taken from the kernel
...
The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like
this feature is common across other PHYs (like EEE), and defining
`ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long.
The way EDPD works, is that the RX block is put to a lower power mode,
except for link-pulse detection circuits. The TX block is also put to low
power mode, but the PHY wakes-up periodically to send link pulses, to avoid
lock-ups in case the other side is also in EDPD mode.
Currently, there are 2 PHY drivers that look like they could use this new
PHY tunable feature: the `adin` && `micrel` PHYs.
The ADIN's datasheet mentions that TX pulses are at intervals of 1 second
default each, and they can be disabled. For the Micrel KSZ9031 PHY, the
datasheet does not mention whether they can be disabled, but mentions that
they can modified.
The way this change is structured, is similar to the PHY tunable downshift
control:
* a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default
TX interval; some PHYs could specify a certain value that makes sense
* `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled
* `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD
As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1
millisecond, which should cover a reasonable range of intervals:
- from 1 millisecond, which does not sound like much of a power-saver
- to ~65 seconds which is quite a lot to wait for a link to come up when
plugging a cable
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows taprio to offload the schedule enforcement to capable
network cards, resulting in more precise windows and less CPU usage.
The gate mask acts on traffic classes (groups of queues of same
priority), as specified in IEEE 802.1Q-2018, and following the existing
taprio and mqprio semantics.
It is up to the driver to perform conversion between tc and individual
netdev queues if for some reason it needs to make that distinction.
Full offload is requested from the network interface by specifying
"flags 2" in the tc qdisc creation command, which in turn corresponds to
the TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD bit.
The important detail here is the clockid which is implicitly /dev/ptpN
for full offload, and hence not configurable.
A reference counting API is added to support the use case where Ethernet
drivers need to keep the taprio offload structure locally (i.e. they are
a multi-port switch driver, and configuring a port depends on the
settings of other ports as well). The refcount_t variable is kept in a
private structure (__tc_taprio_qopt_offload) and not exposed to drivers.
In the future, the private structure might also be expanded with a
backpointer to taprio_sched *q, to implement the notification system
described in the patch (of when admin became oper, or an error occurred,
etc, so the offload can be monitored with 'tc qdisc show').
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXXe8mQAKCRCRxhvAZXjc
ou7oAQCszihkNfpjORSSSOqenMDrxxDW++A7TIOLuq7UyZQl8QD+LM1wvT/xypfJ
ORD9XX8+Wrv07AQn85fZBEFXGrnengk=
=o+VL
-----END PGP SIGNATURE-----
Merge tag 'core-process-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd/waitid updates from Christian Brauner:
"This contains two features and various tests.
First, it adds support for waiting on process through pidfds by adding
the P_PIDFD type to the waitid() syscall. This completes the basic
functionality of the pidfd api (cf. [1]). In the meantime we also have
a new adition to the userspace projects that make use of the pidfd
api. The qt project was nice enough to send a mail pointing out that
they have a pr up to switch to the pidfd api (cf. [2]).
Second, this tag contains an extension to the waitid() syscall to make
it possible to wait on the current process group in a race free manner
(even though the actual problem is very unlikely) by specifing 0
together with the P_PGID type. This extension traces back to a
discussion on the glibc development mailing list.
There are also a range of tests for the features above. Additionally,
the test-suite which detected the pidfd-polling race we fixed in [3]
is included in this tag"
[1] https://lwn.net/Articles/794707/
[2] https://codereview.qt-project.org/c/qt/qtbase/+/108456
[3] commit b191d6491b ("pidfd: fix a poll race when setting exit_state")
* tag 'core-process-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
waitid: Add support for waiting for the current process group
tests: add pidfd poll tests
tests: move common definitions and functions into pidfd.h
pidfd: add pidfd_wait tests
pidfd: add P_PIDFD to waitid()
Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP
performance problems --
> (1) Usually when we're diagnosing TCP performance problems, we do so
> from the sender, since the sender makes most of the
> performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
> From the sender-side the thing that would be most useful is to see
> tp->snd_wnd, the receive window that the receiver has advertised to
> the sender.
This serves the purpose of adding an additional __u32 to avoid the
would-be hole caused by the addition of the tcpi_rcvi_ooopack field.
Signed-off-by: Thomas Higdon <tph@fb.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For receive-heavy cases on the server-side, we want to track the
connection quality for individual client IPs. This counter, similar to
the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
tracks out-of-order packet reception. By providing this counter in
TCP_INFO, it will allow understanding to what degree receive-heavy
sockets are experiencing out-of-order delivery and packet drops
indicating congestion.
Please note that this is similar to the counter in NetBSD TCP_INFO, and
has the same name.
Also note that we avoid increasing the size of the tcp_sock struct by
taking advantage of a hole.
Signed-off-by: Thomas Higdon <tph@fb.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit introduces a new ioctl DM_GET_TARGET_VERSION. It will load a
target that is specified in the "name" entry in the parameter structure
and return its version.
This functionality is intended to be used by cryptsetup, so that it can
query kernel capabilities before activating the device.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Currently the fact that devlink reload failed is stored in drivers.
Move this flag into devlink core. Also, expose it to the user.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to a bug introduced in Linux 3.14 we cannot determine the
correctly layout for a multi-zone RAID0 array - there are two
possibilities.
It is possible to tell the kernel which to chose using a module
parameter, but this can be clumsy to use. It would be best if
the choice were recorded in the metadata.
So add a feature flag for this purpose.
If it is set, then the 'layout' field of the superblock is used
to determine which layout to use.
If this flag is not set, then mddev->layout gets set to -1,
which causes the module parameter to be required.
Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
Some controllers allow for a one-shot output pulse, in contrast to
periodic output. Now that we have extensible versions of our IOCTLs, we
can finally make use of the 'flags' field to pass a bit telling driver
that if we want one-shot pulse output.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current version of the IOCTL have a small problem which prevents us
from extending the API by making use of reserved fields. In these new
IOCTLs, we are now making sure that flags and rsv fields are zero which
will allow us to extend the API in the future.
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Fix error path of nf_tables_updobj(), from Dan Carpenter.
2) Move large structure away from stack in the nf_tables offload
infrastructure, from Arnd Bergmann.
3) Move indirect flow_block logic to nf_tables_offload.
4) Support for synproxy objects, from Fernando Fernandez Mancera.
5) Support for fwd and dup offload.
6) Add __nft_offload_get_chain() helper, this implicitly fixes missing
mutex and check for offload flags in the indirect block support,
patch from wenxu.
7) Remove rules on device unregistration, from wenxu. This includes
two preparation patches to reuse nft_flow_offload_chain() and
nft_flow_offload_rule().
Large batch from Jeremy Sowden to make a second pass to the
CONFIG_HEADER_TEST support and a bit of housekeeping:
8) Missing include guard in conntrack label header, from Jeremy Sowden.
9) A few coding style errors: trailing whitespace, incorrect indent in
Kconfig, and semicolons at the end of function definitions.
10) Remove unused ipt_init() and ip6t_init() declarations.
11) Inline xt_hashlimit, ebt_802_3 and xt_physdev headers. They are
only used once.
12) Update include directive in several netfilter files.
13) Remove unused include/net/netfilter/ipv6/nf_conntrack_icmpv6.h.
14) Move nf_ip6_ext_hdr() to include/linux/netfilter_ipv6.h
15) Move several synproxy structure definitions to nf_synproxy.h
16) Move nf_bridge_frag_data structure to include/linux/netfilter_bridge.h
17) Clean up static inline definitions in nf_conntrack_ecache.h.
18) Replace defined(CONFIG...) || defined(CONFIG...MODULE) with IS_ENABLED(CONFIG...).
19) Missing inline function conditional definitions based on Kconfig
preferences in synproxy and nf_conntrack_timeout.
20) Update br_nf_pre_routing_ipv6() definition.
21) Move conntrack code in linux/skbuff.h to nf_conntrack headers.
22) Several patches to remove superfluous CONFIG_NETFILTER and
CONFIG_NF_CONNTRACK checks in headers, coming from the initial batch
support for CONFIG_HEADER_TEST for netfilter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
virtio fs tunnels fuse over a virtio channel. One issue is two sides might
be speaking different endian-ness. To detects this, host side looks at the
opcode value in the FUSE_INIT command. Works fine at the moment but might
fail if a future version of fuse will use such an opcode for
initialization. Let's reserve this opcode so we remember and don't do
this.
Same for CUSE_INIT.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
SETUPMAPPING is a command for use with 'virtiofsd', a fuse-over-virtio
implementation; it may find use in other fuse impelementations as well in
which the kernel does not have access to the address space of the daemon
directly.
A SETUPMAPPING operation causes a section of a file to be mapped into a
memory window visible to the kernel. The offsets in the file and the
window are defined by the kernel performing the operation.
The daemon may reject the request, for reasons including permissions and
limited resources.
When a request perfectly overlaps a previous mapping, the previous mapping
is replaced. When a mapping partially overlaps a previous mapping, the
previous mapping is split into one or two smaller mappings.
REMOVEMAPPING is the complement to SETUPMAPPING; it unmaps a range of
mapped files from the window visible to the kernel.
The map_alignment field communicates the alignment constraint for
FUSE_SETUPMAPPING/FUSE_REMOVEMAPPING and allows the daemon to constrain the
addresses and file offsets chosen by the kernel.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Similar to the 'reset_dev_on_drv_probe' devlink parameter, it is useful
to have an unknown value which can be used by drivers to report that the
hardware value isn't recognized or is otherwise invalid instead of
failing the operation.
This is especially useful for u8/enum parameters.
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* a fix in the new 6 GHz channel support
* a fix for recent minstrel (rate control) updates
for an infinite loop
* handle interface type changes better wrt. management frame
registrations (for management frames sent to userspace)
* add in-BSS RX time to survey information
* handle HW rfkill properly if !CONFIG_RFKILL
* send deauth on IBSS station expiry, to avoid state mismatches
* handle deferred crypto tailroom updates in mac80211 better
when device restart happens
* fix a spectre-v1 - really a continuation of a previous patch
* advertise NL80211_CMD_UPDATE_FT_IES as supported if so
* add some missing parsing in VHT extended NSS support
* support HE in mac80211_hwsim
* let mac80211 drivers determine the max MTU themselves
along with the usual cleanups etc.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl148l4ACgkQB8qZga/f
l8TafQ//Yjunnxq49ClKMDxyKwxIBvLRAMr4D3QwaWAcWUpar1/V0Ft/5glnHtbZ
7QptwbxVNUk/N68hYi8wlpMpvfzv/nShZD8QBNS1bk5E3Gng3yow03LhOx5iaYb2
KJXS1GE4jwkMD7Xn65+eMeb8rt1vEj8LleX91cguilq+y5YbcNFsP1nil2RtQyBU
cf5i8CBu4I5rTBoFaRvcz2xn+blqPSm2/piA+yXjzFp9vyVmhD+FjR5T482u48pj
wi/1zersGVUzBNElnZOKg67XPir1fcJqCfLILr7okPItWuYXHdVHGfn+arhimK4W
dyIpv1EfCe0lwKl4VTdhXt1GwhKvWCc2Ja7lz/RnDisGq9CYPJNfW7jFgR3tw4eg
SccnUhnxRIgD1V2KDcvsRadPo+2YsBJBmC61JRX1K6L3zpHLbktDyzxvTwCeVQ9A
TQp5bmQcKqDoq+/60JNHI6IxMbmX/vc2PC7dENGWUkem/JEmSWBB1wcL9gsRkhVi
c4uHyvFeXJaIV7YFA36hWCfa0fr+UUYxfzxudRSvxq/tTpayqBKu6fr3Pvv0SqTj
/BKkezoIdJntClhJv4PcDkZMva4uMCPtHCero9eICX5J+4AanD6caRNefX6L0PfF
DtN1sscVOy9zY2fV4tqvn3IASmOE5yB/dxjtKkMYyNHkZsGiDI0=
=HAuT
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2019-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have a number of changes, but things are settling down:
* a fix in the new 6 GHz channel support
* a fix for recent minstrel (rate control) updates
for an infinite loop
* handle interface type changes better wrt. management frame
registrations (for management frames sent to userspace)
* add in-BSS RX time to survey information
* handle HW rfkill properly if !CONFIG_RFKILL
* send deauth on IBSS station expiry, to avoid state mismatches
* handle deferred crypto tailroom updates in mac80211 better
when device restart happens
* fix a spectre-v1 - really a continuation of a previous patch
* advertise NL80211_CMD_UPDATE_FT_IES as supported if so
* add some missing parsing in VHT extended NSS support
* support HE in mac80211_hwsim
* let mac80211 drivers determine the max MTU themselves
along with the usual cleanups etc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Receiving an unexpected exit reason from hardware should be considered
as a severe bug in KVM. Therefore, instead of just injecting #UD to
guest and ignore it, exit to userspace on internal error so that
it could handle it properly (probably by terminating guest).
In addition, prefer to use vcpu_unimpl() instead of WARN_ONCE()
as handling unexpected exit reason should be a rare unexpected
event (that was expected to never happen) and we prefer to print
a message on it every time it occurs to guest.
Furthermore, dump VMCS/VMCB to dmesg to assist diagnosing such cases.
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Register a new synproxy stateful object type into the stateful object
infrastructure.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add the 'reset_dev_on_drv_probe' devlink parameter, controlling the
device reset policy on driver probe.
This parameter is useful in conjunction with the existing
'fw_load_policy' parameter.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the 'disk' value to the generic 'fw_load_policy' devlink parameter.
This value indicates that firmware should always be loaded from disk
only.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Version 2 upcalls will allow the nfsd to include a hash of the kerberos
principal string in the Cld_Create upcall. If a principal is present in
the svc_cred, then the hash will be included in the Cld_Create upcall.
We attempt to use the svc_cred.cr_raw_principal (which is returned by
gssproxy) first, and then fall back to using the svc_cred.cr_principal
(which is returned by both gssproxy and rpc.svcgssd). Upon a subsequent
restart, the hash will be returned in the Cld_Gracestart downcall and
stored in the reclaim_str_hashtbl so it can be used when handling
reclaim opens.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Add a "GetVersion" upcall to allow nfsd to determine the maximum upcall
version that the nfsdcld userspace daemon supports. If the daemon
responds with -EOPNOTSUPP, then we know it only supports v1.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Turn the checksum type definition into a enum. This eases later addition
of new checksums.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The BTRFS_DEV_REPLACE_ITEM_STATE_x defines, as shown in [1], are
unused in both kernel and btrfs-progs (except for one instance of
BTRFS_DEV_REPLACE_ITEM_STATE_NEVER_STARTED in kernel).
[1]
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED 2
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED 3
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED 4
Further these define-values are different form its counterpart
BTRFS_IOCTL_DEV_REPLACE_STATE_x series as shown in [2].
[2]
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_SUSPENDED 2
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_FINISHED 3
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_CANCELED 4
So this patch deletes the BTRFS_DEV_REPLACE_ITEM_STATE_x altogether, and
one instance of BTRFS_DEV_REPLACE_ITEM_STATE_NEVER_STARTED is replaced
with BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED in the kernel.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In commit c11d2c236c ("Btrfs: add ioctl to get and reset the device
stats") the get_dev_stats ioctl was added.
Shortly thereafter, in commit b27f7c0c15 ("btrfs: join DEV_STATS
ioctls to one") , the flags field was added. However, the calculation
for unused padding space was not updated, which also invalidated the
comment.
Clarify what happened to reduce confusion and wasted time for anyone
implementing this.
Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
I lifted the btrfs label get/set ioctls to the vfs some time ago, but
never followed up to use those common definitions directly in btrfs.
This patch does that.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While parts of the VGIC support a large number of vcpus (we
bravely allow up to 512), other parts are more limited.
One of these limits is visible in the KVM_IRQ_LINE ioctl, which
only allows 256 vcpus to be signalled when using the CPU or PPI
types. Unfortunately, we've cornered ourselves badly by allocating
all the bits in the irq field.
Since the irq_type subfield (8 bit wide) is currently only taking
the values 0, 1 and 2 (and we have been careful not to allow anything
else), let's reduce this field to only 4 bits, and allocate the
remaining 4 bits to a vcpu2_index, which acts as a multiplier:
vcpu_id = 256 * vcpu2_index + vcpu_index
With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
allowing this to be discovered, it becomes possible to inject
PPIs to up to 4096 vcpus. But please just don't.
Whilst we're there, add a clarification about the use of KVM_IRQ_LINE
on arm, which is not completely conditionned by KVM_CAP_IRQCHIP.
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
syzbot reported:
BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
__msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
do_loop_readv_writev fs/read_write.c:703 [inline]
do_iter_write+0x83e/0xd80 fs/read_write.c:961
vfs_writev fs/read_write.c:1004 [inline]
do_writev+0x397/0x840 fs/read_write.c:1039
__do_sys_writev fs/read_write.c:1112 [inline]
__se_sys_writev+0x9b/0xb0 fs/read_write.c:1109
__x64_sys_writev+0x4a/0x70 fs/read_write.c:1109
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
[...]
The problem is that capi_write() is reading past the end of the message.
Fix it by checking the message's length in the needed places.
Reported-and-tested-by: syzbot+0849c524d9c634f5ae66@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add nft_reg_store64() and nft_reg_load64() helpers, from Ander Juaristi.
2) Time matching support, also from Ander Juaristi.
3) VLAN support for nfnetlink_log, from Michael Braun.
4) Support for set element deletions from the packet path, also from Ander.
5) Remove __read_mostly from conntrack spinlock, from Li RongQing.
6) Support for updating stateful objects, this also includes the initial
client for this infrastructure: the quota extension. A follow up fix
for the control plane also comes in this batch. Patches from
Fernando Fernandez Mancera.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 75b28affdd we can get by with just a single mmap to
map both the sq and cq ring. However, userspace doesn't know that.
Add a features variable to io_uring_params, and notify userspace
that the kernel has this ability. This can then be used in liburing
(or in applications directly) to avoid the second mmap.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Daniel Borkmann says:
====================
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add the ability to use unaligned chunks in the AF_XDP umem. By
relaxing where the chunks can be placed, it allows to use an
arbitrary buffer size and place whenever there is a free
address in the umem. Helps more seamless DPDK AF_XDP driver
integration. Support for i40e, ixgbe and mlx5e, from Kevin and
Maxim.
2) Addition of a wakeup flag for AF_XDP tx and fill rings so the
application can wake up the kernel for rx/tx processing which
avoids busy-spinning of the latter, useful when app and driver
is located on the same core. Support for i40e, ixgbe and mlx5e,
from Magnus and Maxim.
3) bpftool fixes for printf()-like functions so compiler can actually
enforce checks, bpftool build system improvements for custom output
directories, and addition of 'bpftool map freeze' command, from Quentin.
4) Support attaching/detaching XDP programs from 'bpftool net' command,
from Daniel.
5) Automatic xskmap cleanup when AF_XDP socket is released, and several
barrier/{read,write}_once fixes in AF_XDP code, from Björn.
6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf
inclusion as well as libbpf versioning improvements, from Andrii.
7) Several new BPF kselftests for verifier precision tracking, from Alexei.
8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya.
9) And more BPF kselftest improvements all over the place, from Stanislav.
10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.
11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.
12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.
13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin.
14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.
15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari,
Peter, Wei, Yue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For high speed adapter like Mellanox CX-5 card, it can reach upto
100 Gbits per second bandwidth. Currently htb already supports 64bit rate
in tc utility. However police action rate and peakrate are still limited
to 32bit value (upto 32 Gbits per second). Add 2 new attributes
TCA_POLICE_RATE64 and TCA_POLICE_RATE64 in kernel for 64bit support
so that tc utility can use them for 64bit rate and peakrate value to
break the 32bit limit, and still keep the backward binary compatibility.
Tested-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offloaded OvS datapath rules are translated one to one to tc rules,
for example the following simplified OvS rule:
recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)
Will be translated to the following tc rule:
$ tc filter add dev dev1 ingress \
prio 1 chain 0 proto ip \
flower tcp ct_state -trk \
action ct pipe \
action goto chain 2
Received packets will first travel though tc, and if they aren't stolen
by it, like in the above rule, they will continue to OvS datapath.
Since we already did some actions (action ct in this case) which might
modify the packets, and updated action stats, we would like to continue
the proccessing with the correct recirc_id in OvS (here recirc_id(2))
where we left off.
To support this, introduce a new skb extension for tc, which
will be used for translating tc chain to ovs recirc_id to
handle these miss cases. Last tc chain index will be set
by tc goto chain action and read by OvS datapath.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove pciehp_set_attention_status() and use pciehp_set_indicators()
instead, since the code is mostly the same.
Link: https://lore.kernel.org/r/20190903111021.1559-4-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
SAE J1939 is the vehicle bus recommended practice used for communication
and diagnostics among vehicle components. Originating in the car and
heavy-duty truck industry in the United States, it is now widely used in
other parts of the world.
J1939, ISO 11783 and NMEA 2000 all share the same high level protocol.
SAE J1939 can be considered the replacement for the older SAE J1708 and
SAE J1587 specifications.
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Bastian Stender <bst@pengutronix.de>
Signed-off-by: Elenita Hinds <ecathinds@gmail.com>
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Signed-off-by: Robin van der Gracht <robin@protonic.nl>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch prepares struct sockaddr_can for SAE J1939.
Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch is a preparation for SAE J1939 and adds CAN_J1939
socket type.
Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Guessing the first tty for a gsm0710 multiplexed serial device is not
currently possible, which makes it racy to use with multiple modems.
Add a way to map the physical serial tty to its related mux devices
using an ioctl.
Signed-off-by: Martin Hundebøll <martin@geanix.com>
Link: https://lore.kernel.org/r/20190812211243.98686-1-martin@geanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce support for LINFlex driver, based on:
- the version of Freescale LPUART driver after commit b3e3bf2ef2 ("Merge
4.0-rc7 into tty-next");
- commit abf1e0a980 ("tty: serial: fsl_lpuart: lock port on console
write").
In this basic version, the driver can be tested using initramfs and relies
on the clocks and pin muxing set up by U-Boot.
Remarks concerning the earlycon support:
- LinFlexD does not allow character transmissions in the INIT mode (see
section 47.4.2.1 in the reference manual[1]). Therefore, a mutual
exclusion between the first linflex_setup_watermark/linflex_set_termios
executions and linflex_earlycon_putchar was employed and the characters
normally sent to earlycon during initialization are kept in a buffer and
sent afterwards.
- Empirically, character transmission is also forbidden within the last 1-2
ms before entering the INIT mode, so we use an explicit timeout
(PREINIT_DELAY) between linflex_earlycon_putchar and the first call to
linflex_setup_watermark.
- U-Boot currently uses the UART FIFO mode, while this driver makes the
transition to the buffer mode. Therefore, the earlycon putchar function
matches the U-Boot behavior before initializations and the Linux behavior
after.
[1] https://www.nxp.com/webapp/Download?colCode=S32V234RM
Signed-off-by: Stoica Cosmin-Stefan <cosmin.stoica@nxp.com>
Signed-off-by: Adrian.Nitu <adrian.nitu@freescale.com>
Signed-off-by: Larisa Grigore <Larisa.Grigore@nxp.com>
Signed-off-by: Ana Nedelcu <B56683@freescale.com>
Signed-off-by: Mihaela Martinas <Mihaela.Martinas@freescale.com>
Signed-off-by: Matthew Nunez <matthew.nunez@nxp.com>
[stefan-gabriel.mirea@nxp.com: Reduced for upstreaming and implemented
earlycon support]
Signed-off-by: Stefan-Gabriel Mirea <stefan-gabriel.mirea@nxp.com>
Link: https://lore.kernel.org/r/20190809112853.15846-6-stefan-gabriel.mirea@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support to Sunix serial boards with up to 16 ports.
Sunix board need its own setup callback instead of using Timedia's, to
properly support more than 4 ports.
Cc: Morris Ku <morris_ku@sunix.com>
Cc: Debbie Liu <debbie_liu@sunix.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Link: https://lore.kernel.org/r/20190809190130.30773-1-kai.heng.feng@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The platform is getting removed, so there are no more users
of this driver.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20190809202749.742267-3-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With only 45 non-merge commits, we have a small merge window from the
Gadget perspective.
The biggest change here is the addition of the Cadence USB3 DRD
Driver. All other changes are small, non-critical fixes or smaller new
features like the improvement to BESL handling in dwc3.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
iQJRBAABCAA7FiEElLzh7wn96CXwjh2IzL64meEamQYFAl1o58UdHGZlbGlwZS5i
YWxiaUBsaW51eC5pbnRlbC5jb20ACgkQzL64meEamQaKtxAAvvrwvleA7lZwInU2
FariXyI7+ihPAUDtlEHA+kHclpUw4hyWPc5rg0Mk7CH5gbtqodj9Uo4vbWtM5EaR
EN6AJTn90KlygBXlw2pJAb6GBiJ2hMK8K5Gl5gdowRg4LExLYdLZCgCDueAv1p2o
2vg7LSJbHaeC0pv+L97W6/YH7q4fdacGjFm/49ROrChCwNNJygoXchO2+eAcXNlo
uMzx2QxF3cHOMeSxSGnXC5VsZYAk6VftXgAiGvVZ5+6EgPIA5b07JovDhn0nAoDR
1aVQTqWDzdcd/tkJKEMBi16mOaeLDzsnGmsr+/2vzPzLJaWKPKwZ/FgGTyBXzDfM
qEVrj2zWioH+6oOE3G5Ar3j1dgBqbgWhTfO47UWjgdQww5HeKZJqewJh19h5t/uS
hxxSFgwNMVIo4B8DJc5J/vIbDsQbk1LsXkT2b3B/UaNYXQuKBdiv4jVpUgLJX+FF
mN46SsfLVBpaYANcBPlgC8zGoOzWJFXgoh0qffxnQ6/Egvh55k0xVL/YPRgHiiw5
FbaGAlnJo9ue9vadzInde0X4SZnpBIQ5WR9fyHYm8fKDT1neB97F3yhZxhJ3apxN
0PM9P7auGtmjYYLAfIIL9687c3BNvDk8YvLehxXI4YqVHUZF3YNAz/DPrnCG2O5D
W33pwIhL1swIZCFRtGWpokGiN+o=
=KUJN
-----END PGP SIGNATURE-----
Merge tag 'usb-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next
Felipe writes:
USB: Changes for v5.4 merge window
With only 45 non-merge commits, we have a small merge window from the
Gadget perspective.
The biggest change here is the addition of the Cadence USB3 DRD
Driver. All other changes are small, non-critical fixes or smaller new
features like the improvement to BESL handling in dwc3.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
* tag 'usb-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (45 commits)
usb: gadget: net2280: Add workaround for AB chip Errata 11
usb: gadget: net2280: Move all "ll" registers in one structure
usb: dwc3: gadget: Workaround Mirosoft's BESL check
usb:cdns3 Fix for stuck packets in on-chip OUT buffer.
usb: cdns3: Add Cadence USB3 DRD Driver
usb: common: Simplify usb_decode_get_set_descriptor function.
usb: common: Patch simplify usb_decode_set_clear_feature function.
usb: common: Separated decoding functions from dwc3 driver.
dt-bindings: add binding for USBSS-DRD controller.
usb: gadget: composite: Set recommended BESL values
usb: dwc3: gadget: Set BESL config parameter
usb: dwc3: Separate field holding multiple properties
usb: gadget: Export recommended BESL values
usb: phy: phy-fsl-usb: Make structure fsl_otg_initdata constant
usb: udc: lpc32xx: silence fall-through warning
usb: dwc3: meson-g12a: fix suspend resume regulator unbalanced disables
usb: udc: lpc32xx: remove set but not used 3 variables
usb: gadget: udc: core: Fix segfault if udc_bind_to_driver() for pending driver fails
usb: dwc3: st: Add of_dev_put() in probe function
usb: dwc3: st: Add of_node_put() before return in probe function
...
When an application configures kernel TLS on top of a TCP socket, it's
now possible for inet_diag_handler() to collect information regarding the
protocol version, the cipher type and TX / RX configuration, in case
INET_DIAG_INFO is requested.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
currently, only getsockopt(TCP_ULP) can be invoked to know if a ULP is on
top of a TCP socket. Extend idiag_get_aux() and idiag_get_aux_size(),
introduced by commit b37e88407c ("inet_diag: allow protocols to provide
additional data"), to report the ULP name and other information that can
be made available by the ULP through optional functions.
Users having CAP_NET_ADMIN privileges will then be able to retrieve this
information through inet_diag_handler, if they specify INET_DIAG_INFO in
the request.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, addresses are chunk size aligned. This means, we are very
restricted in terms of where we can place chunk within the umem. For
example, if we have a chunk size of 2k, then our chunks can only be placed
at 0,2k,4k,6k,8k... and so on (ie. every 2k starting from 0).
This patch introduces the ability to use unaligned chunks. With these
changes, we are no longer bound to having to place chunks at a 2k (or
whatever your chunk size is) interval. Since we are no longer dealing with
aligned chunks, they can now cross page boundaries. Checks for page
contiguity have been added in order to keep track of which pages are
followed by a physically contiguous page.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This is useful for checking how much airtime is being used up by other
transmissions on the channel, e.g. by calculating (time_rx - time_bss_rx)
or (time_busy - time_bss_rx - time_tx)
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20190828102042.58016-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Set the recommended BESL deep and baseline values based on the gadget's
configuration parameters to the extended BOS descriptor. This feature
helps to optimize power savings by maximizing the opportunity for longer
L1 residency time.
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In some cases, ordinary (non-AUX) events can generate data for AUX events.
For example, PEBS events can come out as records in the Intel PT stream
instead of their usual DS records, if configured to do so.
One requirement for such events is to consistently schedule together, to
ensure that the data from the "AUX output" events isn't lost while their
corresponding AUX event is not scheduled. We use grouping to provide this
guarantee: an "AUX output" event can be added to a group where an AUX event
is a group leader, and provided that the former supports writing to the
latter.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kan.liang@linux.intel.com
Link: https://lkml.kernel.org/r/20190806084606.4021-2-alexander.shishkin@linux.intel.com
SCTP_ECN_SUPPORTED sockopt will be added to allow users to change
ep ecn flag, and it's similar with other feature flags.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain
and state pruning.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull networking fixes from David Miller:
1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya
Leoshkevich.
2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
SMC from Jason Baron.
3) ipv6_mc_may_pull() should return 0 for malformed packets, not
-EINVAL. From Stefano Brivio.
4) Don't forget to unpin umem xdp pages in error path of
xdp_umem_reg(). From Ivan Khoronzhuk.
5) Fix sta object leak in mac80211, from Johannes Berg.
6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
switches. From Florian Fainelli.
7) Revert DMA sync removal from r8169 which was causing regressions on
some MIPS Loongson platforms. From Heiner Kallweit.
8) Use after free in flow dissector, from Jakub Sitnicki.
9) Fix NULL derefs of net devices during ICMP processing across
collect_md tunnels, from Hangbin Liu.
10) proto_register() memory leaks, from Zhang Lin.
11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
from John Fastabend.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
r8152: Set memory to all 0xFFs on failed reg reads
openvswitch: Fix conntrack cache with timeout
ipv4: mpls: fix mpls_xmit for iptunnel
nexthop: Fix nexthop_num_path for blackhole nexthops
net: rds: add service level support in rds-info
net: route dump netlink NLM_F_MULTI flag missing
s390/qeth: reject oversized SNMP requests
sock: fix potential memory leak in proto_register()
MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
ipv4/icmp: fix rt dst dev null pointer dereference
openvswitch: Fix log message in ovs conntrack
bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
bpf: fix use after free in prog symbol exposure
bpf: fix precision tracking in presence of bpf2bpf calls
flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
Revert "r8169: remove not needed call to dma_sync_single_for_device"
ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
net/ncsi: Fix the payload copying for the request coming from Netlink
qed: Add cleanup in qed_slowpath_start()
...
This patch implements the delete operation from the ruleset.
It implements a new delete() function in nft_set_rhash. It is simpler
to use than the already existing remove(), because it only takes the set
and the key as arguments, whereas remove() expects a full
nft_set_elem structure.
Signed-off-by: Ander Juaristi <a@juaristi.eus>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, there is no vlan information (e.g. when used with a vlan aware
bridge) passed to userspache, HWHEADER will contain an 08 00 (ip) suffix
even for tagged ip packets.
Therefore, add an extra netlink attribute that passes the vlan information
to userspace similarly to 15824ab29f for nfqueue.
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch introduces meta matches in the kernel for time (a UNIX timestamp),
day (a day of week, represented as an integer between 0-6), and
hour (an hour in the current day, or: number of seconds since midnight).
All values are taken as unsigned 64-bit integers.
The 'time' keyword is internally converted to nanoseconds by nft in
userspace, and hence the timestamp is taken in nanoseconds as well.
Signed-off-by: Ander Juaristi <a@juaristi.eus>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
UBIFS:
- Don't block too long in writeback_inodes_sb()
- Fix for a possible overrun of the log head
- Fix double unlock in orphan_delete()
JFFS2:
- Remove C++ style from UAPI header and unbreak picky toolchains
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAl1ik14WHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wbP2D/4xVW7YP5Yyt6YrABJuclfoib30
2LI6eOz0+5OojQKUbOzXCN9N7Dv4TLJKrCjRc9qKYTIB1DiQXuBDqtYKg6CTBhHb
MjiftEDiBQ6j3jVmRxkQRXZEB9I3Uu9CkA8s65+UmL8peJfgNElpH34omsU1fzup
y0NhZhj77P5jsAG6r7yXvuaofCOTlZIZVPya9FX17J0Ra+3rMOCtVEqnaHk2E5RB
EQPAEByqXUIx7+9mOi1Krw7B7fesB7oOVbCykE5knX1pZQCTURP64yNr35WxN+7Z
crcpdEQtf54qWMCKf4ClIBHiPmmsDIHYJy3JXjgJKOwIYvrB3dZ5E170qPr3JixY
nS+l8x69IYZhWUzHg8gxDizk92iFYKbO1h5vBwI7NUFHkHLzylsgonBK0KdaUnol
OvI5oCO/rdJEMBPr5LEFpOjZJIEptPtXpDvQCpm5tWd5tuW+8edNpI38lDO9LThC
O0diZZUQfsuzD1XrvKRORPU+4lskzGV5b1UA0DWXdGKALqM5VrQZo1XftvA74Zkv
oZQcHNK5wdecQX81Oadfb/0a5SN7FGGtTUCKTpOyBIu0adarGIasC6TQr2aDiiNh
7jLjBoV2XEGhXZQrK2lm8G+6rJ7Mp11B6aoTFgDELzt+SB7htp6dARR2+4aGWXh9
iXgme0n9HXDDeuosag==
=Bsgx
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBIFS and JFFS2 fixes from Richard Weinberger:
"UBIFS:
- Don't block too long in writeback_inodes_sb()
- Fix for a possible overrun of the log head
- Fix double unlock in orphan_delete()
JFFS2:
- Remove C++ style from UAPI header and unbreak picky toolchains"
* tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubifs: Limit the number of pages in shrink_liability
ubifs: Correctly initialize c->min_log_bytes
ubifs: Fix double unlock around orphan_delete()
jffs2: Remove C++ style comments from uapi header
>From IB specific 7.6.5 SERVICE LEVEL, Service Level (SL)
is used to identify different flows within an IBA subnet.
It is carried in the local route header of the packet.
Before this commit, run "rds-info -I". The outputs are as
below:
"
RDS IB Connections:
LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
192.2.95.3 192.2.95.1 2 0 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 1 0 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
"
After this commit, the output is as below:
"
RDS IB Connections:
LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
192.2.95.3 192.2.95.1 2 2 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 1 1 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
"
The commit fe3475af3b ("net: rds: add per rds connection cache
statistics") adds cache_allocs in struct rds_info_rdma_connection
as below:
struct rds_info_rdma_connection {
...
__u32 rdma_mr_max;
__u32 rdma_mr_size;
__u8 tos;
__u32 cache_allocs;
};
The peer struct in rds-tools of struct rds_info_rdma_connection is as
below:
struct rds_info_rdma_connection {
...
uint32_t rdma_mr_max;
uint32_t rdma_mr_size;
uint8_t tos;
uint8_t sl;
uint32_t cache_allocs;
};
The difference between userspace and kernel is the member variable sl.
In the kernel struct, the member variable sl is missing. This will
introduce risks. So it is necessary to use this commit to avoid this risk.
Fixes: fe3475af3b ("net: rds: add per rds connection cache statistics")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Suggested-by: Gerd Rausch <gerd.rausch@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EROFS filesystem has been merged into linux-staging for a year.
EROFS is designed to be a better solution of saving extra storage
space with guaranteed end-to-end performance for read-only files
with the help of reduced metadata, fixed-sized output compression
and decompression inplace technologies.
In the past year, EROFS was greatly improved by many people as
a staging driver, self-tested, betaed by a large number of our
internal users, successfully applied to almost all in-service
HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable
enough to be moved out of staging.
EROFS is a self-contained filesystem driver. Although there are
still some TODOs to be more generic, we have a dedicated team
actively keeping on working on EROFS in order to make it better
with the evolution of Linux kernel as the other in-kernel filesystems.
As Pavel suggested, it's better to do as one commit since git
can do moves and all histories will be saved in this way.
Let's promote it from staging and enhance it more actively as
a "real" part of kernel for more wider scenarios!
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Pavel Machek <pavel@denx.de>
Cc: David Sterba <dsterba@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J . Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: Li Guifu <bluce.liguifu@huawei.com>
Cc: Fang Wei <fangwei1@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Link: https://lore.kernel.org/r/20190822213659.5501-1-hsiangkao@aol.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Timestamps are currently communicated to user space as 'struct
timespec', which is not considered y2038 safe since it uses a 32-bit
signed value for seconds.
Fix this while the API is still not part of any official kernel release
by using 64-bit nanoseconds timestamps instead.
Fixes: ca30707dee ("drop_monitor: Add packet alert mode")
Fixes: 5e58109b1e ("drop_monitor: Add support for packet alert mode for hardware drops")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for including the casefold feature within f2fs, elevate
the EXT4_CASEFOLD_FL flag to FS_CASEFOLD_FL.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Linux kernel tolerates C++ style comments these days. Actually, the
SPDX License tags for .c files start with //.
On the other hand, uapi headers are written in more strict C, where
the C++ comment style is forbidden.
I simply dropped these lines instead of fixing the comment style.
This code has been always commented out since it was added around
Linux 2.4.9 (i.e. commented out for more than 17 years).
'Maybe later...' will never happen.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
I opened /sys/kernel/tracing/trace once and kept reading from it.
bpf_trace_printk somehow did not seem to work, no entries were appended
to that trace file. It turns out that tracing is disabled when that file
is open. Save the next person some time and document this.
The trace file is described in Documentation/trace/ftrace.rst, however
the implication "tracing is disabled" did not immediate translate to
"bpf_trace_printk silently discards entries".
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There is no 'struct pt_reg'.
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
802.11ay specification defines Enhanced Directional Multi-Gigabit
(EDMG) STA and AP which allow channel bonding of 2 channels and more.
Introduce new NL attributes that are needed for enabling and
configuring EDMG support.
Two new attributes are used by kernel to publish driver's EDMG
capabilities to the userspace:
NL80211_BAND_ATTR_EDMG_CHANNELS - bitmap field that indicates the 2.16
GHz channel(s) that are supported by the driver.
When this attribute is not set it means driver does not support EDMG.
NL80211_BAND_ATTR_EDMG_BW_CONFIG - represent the channel bandwidth
configurations supported by the driver.
Additional two new attributes are used by the userspace for connect
command and for AP configuration:
NL80211_ATTR_WIPHY_EDMG_CHANNELS
NL80211_ATTR_WIPHY_EDMG_BW_CONFIG
New rate info flag - RATE_INFO_FLAGS_EDMG, can be reported from driver
and used for bitrate calculation that will take into account EDMG
according to the 802.11ay specification.
Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org>
Link: https://lore.kernel.org/r/1566138918-3823-2-git-send-email-ailizaro@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the 802.11ax specification a new band is introduced, which
is also proposed by FCC for unlicensed use. This band is referred
to as 6GHz spanning frequency range from 5925 to 7125 MHz.
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Leon Zegers <leon.zegers@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://lore.kernel.org/r/1564745465-21234-2-git-send-email-arend.vanspriel@broadcom.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add a new command for the bpf() system call: BPF_BTF_GET_NEXT_ID is used
to cycle through all BTF objects loaded on the system.
The motivation is to be able to inspect (list) all BTF objects presents
on the system.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
SCTP_AUTH_SUPPORTED sockopt is used to set enpoint's auth
flag. With this feature, each endpoint will have its own
flag for its future asoc's auth_capable, instead of netns
auth flag.
Note that when both ep's auth_enable is enabled, endpoint
auth related data should be initialized. If asconf_enable
is also set, SCTP_CID_ASCONF/SCTP_CID_ASCONF_ACK should
be added into auth_chunk_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SCTP_ASCONF_SUPPORTED sockopt is used to set enpoint's asconf
flag. With this feature, each endpoint will have its own flag
for its future asoc's asconf_capable, instead of netns asconf
flag.
Note that when both ep's asconf_enable and auth_enable are
enabled, SCTP_CID_ASCONF and SCTP_CID_ASCONF_ACK should be
added into auth_chunk_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Remove IP MASQUERADING record in MAINTAINERS file,
from Denis Efremov.
2) Counter arguments are swapped in ebtables, from
Todd Seidelmann.
3) Missing netlink attribute validation in flow_offload
extension.
4) Incorrect alignment in xt_nfacct that breaks 32-bits
userspace / 64-bits kernels, from Juliana Rodrigueiro.
5) Missing include guard in nf_conntrack_h323_types.h,
from Masahiro Yamada.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add EEE-related constants. This includes the new MMD EEE registers for
NBase-T / 802.3bz.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows the user-space to retrieve the supported IOVA
range(s), excluding any non-relaxable reserved regions. The
implementation is based on capability chains, added to
VFIO_IOMMU_GET_INFO ioctl.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
It is easy to miss already defined region types. Let's re-arrange
the definitions a bit and add more comments to make it hopefully
a bit clearer.
No functional change.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add an enum_fmt format flag to specifically tag coded formats where
dynamic resolution switching is supported by the device.
This is useful for some codec drivers that can support dynamic
resolution switching for one or more of their listed coded formats. It
allows userspace to know whether it should extract the video parameters
itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
when such changes are detected.
Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Reviewed-by: Alexandre Courbot <acourbot@chromium.org>
Acked-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Add an enum_fmt format flag to specifically tag coded formats where
full bytestream parsing is supported by the device.
Some stateful decoders are capable of fully parsing a bytestream,
but others require that userspace pre-parses the bytestream into
frames or fields (see the corresponding pixelformat descriptions
for details).
If this flag is set, then this pre-parsing step is not required
(but still possible, of course).
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Reviewed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Pull networking fixes from David Miller:
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
When running a 64-bit kernel with a 32-bit iptables binary, the size of
the xt_nfacct_match_info struct diverges.
kernel: sizeof(struct xt_nfacct_match_info) : 40
iptables: sizeof(struct xt_nfacct_match_info)) : 36
Trying to append nfacct related rules results in an unhelpful message.
Although it is suggested to look for more information in dmesg, nothing
can be found there.
# iptables -A <chain> -m nfacct --nfacct-name <acct-object>
iptables: Invalid argument. Run `dmesg' for more information.
This patch fixes the memory misalignment by enforcing 8-byte alignment
within the struct's first revision. This solution is often used in many
other uapi netfilter headers.
Signed-off-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add new helper bpf_sk_storage_clone which optionally clones sk storage
and call it from sk_clone_lock.
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This commit adds support for a new flag called need_wakeup in the
AF_XDP Tx and fill rings. When this flag is set, it means that the
application has to explicitly wake up the kernel Rx (for the bit in
the fill ring) or kernel Tx (for bit in the Tx ring) processing by
issuing a syscall. Poll() can wake up both depending on the flags
submitted and sendto() will wake up tx processing only.
The main reason for introducing this new flag is to be able to
efficiently support the case when application and driver is executing
on the same core. Previously, the driver was just busy-spinning on the
fill ring if it ran out of buffers in the HW and there were none on
the fill ring. This approach works when the application is running on
another core as it can replenish the fill ring while the driver is
busy-spinning. Though, this is a lousy approach if both of them are
running on the same core as the probability of the fill ring getting
more entries when the driver is busy-spinning is zero. With this new
feature the driver now sets the need_wakeup flag and returns to the
application. The application can then replenish the fill queue and
then explicitly wake up the Rx processing in the kernel using the
syscall poll(). For Tx, the flag is only set to one if the driver has
no outstanding Tx completion interrupts. If it has some, the flag is
zero as it will be woken up by a completion interrupt anyway.
As a nice side effect, this new flag also improves the performance of
the case where application and driver are running on two different
cores as it reduces the number of syscalls to the kernel. The kernel
tells user space if it needs to be woken up by a syscall, and this
eliminates many of the syscalls.
This flag needs some simple driver support. If the driver does not
support this, the Rx flag is always zero and the Tx flag is always
one. This makes any application relying on this feature default to the
old behaviour of not requiring any syscalls in the Rx path and always
having to call sendto() in the Tx path.
For backwards compatibility reasons, this feature has to be explicitly
turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
that you always turn it on as it so far always have had a positive
performance impact.
The name and inspiration of the flag has been taken from io_uring by
Jens Axboe. Details about this feature in io_uring can be found in
http://kernel.dk/io_uring.pdf, section 8.3.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add the basic packet trap infrastructure that allows device drivers to
register their supported packet traps and trap groups with devlink.
Each driver is expected to provide basic information about each
supported trap, such as name and ID, but also the supported metadata
types that will accompany each packet trapped via the trap. The
currently supported metadata type is just the input port, but more will
be added in the future. For example, output port and traffic class.
Trap groups allow users to set the action of all member traps. In
addition, users can retrieve per-group statistics in case per-trap
statistics are too narrow. In the future, the trap group object can be
extended with more attributes, such as policer settings which will limit
the amount of traffic generated by member traps towards the CPU.
Beside registering their packet traps with devlink, drivers are also
expected to report trapped packets to devlink along with relevant
metadata. devlink will maintain packets and bytes statistics for each
packet trap and will potentially report the trapped packet with its
metadata to user space via drop monitor netlink channel.
The interface towards the drivers is simple and allows devlink to set
the action of the trap. Currently, only two actions are supported:
'trap' and 'drop'. When set to 'trap', the device is expected to provide
the sole copy of the packet to the driver which will pass it to devlink.
When set to 'drop', the device is expected to drop the packet and not
send a copy to the driver. In the future, more actions can be added,
such as 'mirror'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop monitor has start and stop commands, but so far these were only
used to start and stop monitoring of software drops.
Now that drop monitor can also monitor hardware drops, we should allow
the user to control these as well.
Do that by adding SW and HW flags to these commands. If no flag is
specified, then only start / stop monitoring software drops. This is
done in order to maintain backward-compatibility with existing user
space applications.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In summary alert mode a notification is sent with a list of recent drop
reasons and a count of how many packets were dropped due to this reason.
To avoid expensive operations in the context in which packets are
dropped, each CPU holds an array whose number of entries is the maximum
number of drop reasons that can be encoded in the netlink notification.
Each entry stores the drop reason and a count. When a packet is dropped
the array is traversed and a new entry is created or the count of an
existing entry is incremented.
Later, in process context, the array is replaced with a newly allocated
copy and the old array is encoded in a netlink notification. To avoid
breaking user space, the notification includes the ancillary header,
which is 'struct net_dm_alert_msg' with number of entries set to '0'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a similar fashion to software drops, extend drop monitor to send
netlink events when packets are dropped by the underlying hardware.
The main difference is that instead of encoding the program counter (PC)
from which kfree_skb() was called in the netlink message, we encode the
hardware trap name. The two are mostly equivalent since they should both
help the user understand why the packet was dropped.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAl1TpDwTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRBaxiGjkeSdIMwYB/49sQtE+obVwAOGcXogemkN5+Gj/KTo
KzF0CLYRjVOz2FViGVJL9HQ5QZwy1IxTycIUV2kKnrlI6s0Z17z3G0lV2+J4scuy
O+wVn36QzQHdP3Uj8kGmNQ1v6BliePM8bvm4sHtXzb1diW+qNg/0LmNN4brjPWka
kfvFjS2IdXa7BblT2aiAZj2V+iSQEpy5NieVwcBti0E2AFb+W3qEa8Lg4QORy3Qi
MAHg0QSUHZp0+joXRQiott4WttfuQ17OQFuxzsQ7bQCQhBQWYUmg14Hm/LP4V3g2
9algzS6g3Qapdo7KEJrCSViblzHVD+yL3BemfahNUyK4F8zGkPq48Kp5
=MqG7
-----END PGP SIGNATURE-----
Merge tag 'linux-can-next-for-5.4-20190814' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2019-08-14
this is a pull request for net-next/master consisting of 41 patches.
The first two patches are for the kvaser_pciefd driver: Christer Beskow
removes unnecessary code in the kvaser_pciefd_pwm_stop() function,
YueHaibing removes the unused including of <linux/version.h>.
In the next patch YueHaibing also removes the unused including of
<linux/version.h> in the f81601 driver.
In the ti_hecc driver the next 6 patches are by me and fix checkpatch
warnings. YueHaibing's patch removes an unused variable in the
ti_hecc_mailbox_read() function.
The next 6 patches all target the xilinx_can driver. Anssi Hannula's
patch fixes a chip start failure with an invalid bus. The patch by
Venkatesh Yadav Abbarapu skips an error message in case of a deferred
probe. The 3 patches by Appana Durga Kedareswara rao fix the RX and TX
path for CAN-FD frames. Srinivas Neeli's patch fixes the bit timing
calculations for CAN-FD.
The next 12 patches are by me and several checkpatch warnings in the
af_can, raw and bcm components.
Thomas Gleixner provides a patch for the bcm, which switches the timer
to HRTIMER_MODE_SOFT and removes the hrtimer_tasklet.
Then 6 more patches by me for the gw component, which fix checkpatch
warnings, followed by 2 patches by Oliver Hartkopp to add CAN-FD
support.
The vcan driver gets 3 patches by me, fixing checkpatch warnings.
And finally a patch by Andre Hartmann to fix typos in CAN's netlink
header.
====================
There are two netfilter userspace headers which contain deprecation
warnings. While these headers are not used within the kernel, they are
compiled stand-alone for header-testing.
Pablo informs me that userspace iptables still refer to these headers,
and the intention was to use xt_LOG.h instead and remove these, but
userspace was never updated.
Remove the warnings.
Fixes: 2a475c409f ("kbuild: remove all netfilter headers from header-test blacklist.")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The recent commit 7794f486ed ("usbfs: Add ioctls for runtime power
management") neglected to add a corresponding capability flag. This
patch rectifies the omission.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Mayuresh Kulkarni <mkulkarni@opensource.cirrus.com>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1908131613490.1941-100000@iolanthe.rowland.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next:
1) Rename mss field to mss_option field in synproxy, from Fernando Mancera.
2) Use SYSCTL_{ZERO,ONE} definitions in conntrack, from Matteo Croce.
3) More strict validation of IPVS sysctl values, from Junwei Hu.
4) Remove unnecessary spaces after on the right hand side of assignments,
from yangxingwu.
5) Add offload support for bitwise operation.
6) Extend the nft_offload_reg structure to store immediate date.
7) Collapse several ip_set header files into ip_set.h, from
Jeremy Sowden.
8) Make netfilter headers compile with CONFIG_KERNEL_HEADER_TEST=y,
from Jeremy Sowden.
9) Fix several sparse warnings due to missing prototypes, from
Valdis Kletnieks.
10) Use static lock initialiser to ensure connlabel spinlock is
initialized on boot time to fix sched/act_ct.c, patch
from Florian Westphal.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Daniel Borkmann says:
====================
The following pull-request contains BPF updates for your *net-next* tree.
There is a small merge conflict in libbpf (Cc Andrii so he's in the loop
as well):
for (i = 1; i <= btf__get_nr_types(btf); i++) {
t = (struct btf_type *)btf__type_by_id(btf, i);
if (!has_datasec && btf_is_var(t)) {
/* replace VAR with INT */
t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
<<<<<<< HEAD
/*
* using size = 1 is the safest choice, 4 will be too
* big and cause kernel BTF validation failure if
* original variable took less than 4 bytes
*/
t->size = 1;
*(int *)(t+1) = BTF_INT_ENC(0, 0, 8);
} else if (!has_datasec && kind == BTF_KIND_DATASEC) {
=======
t->size = sizeof(int);
*(int *)(t + 1) = BTF_INT_ENC(0, 0, 32);
} else if (!has_datasec && btf_is_datasec(t)) {
>>>>>>> 72ef80b5ee
/* replace DATASEC with STRUCT */
Conflict is between the two commits 1d4126c4e1 ("libbpf: sanitize VAR to
conservative 1-byte INT") and b03bc6853c ("libbpf: convert libbpf code to
use new btf helpers"), so we need to pick the sanitation fixup as well as
use the new btf_is_datasec() helper and the whitespace cleanup. Looks like
the following:
[...]
if (!has_datasec && btf_is_var(t)) {
/* replace VAR with INT */
t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
/*
* using size = 1 is the safest choice, 4 will be too
* big and cause kernel BTF validation failure if
* original variable took less than 4 bytes
*/
t->size = 1;
*(int *)(t + 1) = BTF_INT_ENC(0, 0, 8);
} else if (!has_datasec && btf_is_datasec(t)) {
/* replace DATASEC with STRUCT */
[...]
The main changes are:
1) Addition of core parts of compile once - run everywhere (co-re) effort,
that is, relocation of fields offsets in libbpf as well as exposure of
kernel's own BTF via sysfs and loading through libbpf, from Andrii.
More info on co-re: http://vger.kernel.org/bpfconf2019.html#session-2
and http://vger.kernel.org/lpc-bpf2018.html#session-2
2) Enable passing input flags to the BPF flow dissector to customize parsing
and allowing it to stop early similar to the C based one, from Stanislav.
3) Add a BPF helper function that allows generating SYN cookies from XDP and
tc BPF, from Petar.
4) Add devmap hash-based map type for more flexibility in device lookup for
redirects, from Toke.
5) Improvements to XDP forwarding sample code now utilizing recently enabled
devmap lookups, from Jesper.
6) Add support for reporting the effective cgroup progs in bpftool, from Jakub
and Takshak.
7) Fix reading kernel config from bpftool via /proc/config.gz, from Peter.
8) Fix AF_XDP umem pages mapping for 32 bit architectures, from Ivan.
9) Follow-up to add two more BPF loop tests for the selftest suite, from Alexei.
10) Add perf event output helper also for other skb-based program types, from Allan.
11) Fix a co-re related compilation error in selftests, from Yonghong.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
This patch fixes some documentation typos in struct can_bittiming_const.
Signed-off-by: Andre Hartmann <aha_1980@gmx.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Introduce CAN FD support which needs an extension of the netlink API to
pass CAN FD type content to the kernel which has a different size to
Classic CAN. Additionally the struct canfd_frame has a new 'flags' element
that can now be modified with can-gw.
The new CGW_FLAGS_CAN_FD option flag defines whether the routing job
handles Classic CAN or CAN FD frames. This setting is very strict at
reception time and enables the new possibilities, e.g. CGW_FDMOD_* and
modifying the flags element of struct canfd_frame, only when
CGW_FLAGS_CAN_FD is set.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
To prepare the CAN FD support this patch implements the first adaptions in
data structures for CAN FD without changing the current functionality.
Additionally some code at the end of this patch is moved or indented to
simplify the review of the next implementation step.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Add #defines only for the Data Link Feature and Physical Layer 16.0 GT/s
features as defined in PCIe spec r4.0, sec 7.7.4 for Data Link Feature and
sec 7.7.5 for Physical Layer 16.0 GT/s.
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
A number of netfilter header-files used declarations and definitions
from other headers without including them. Added include directives to
make those declarations and definitions available.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add SHA-512 support to fs-verity. This is primarily a demonstration of
the trivial changes needed to support a new hash algorithm in fs-verity;
most users will still use SHA-256, due to the smaller space required to
store the hashes. But some users may prefer SHA-512.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a root-only variant of the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl which
removes all users' claims of the key, not just the current user's claim.
I.e., it always removes the key itself, no matter how many users have
added it.
This is useful for forcing a directory to be locked, without having to
figure out which user ID(s) the key was added under. This is planned to
be used by a command like 'sudo fscrypt lock DIR --all-users' in the
fscrypt userspace tool (http://github.com/google/fscrypt).
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Allow the FS_IOC_ADD_ENCRYPTION_KEY and FS_IOC_REMOVE_ENCRYPTION_KEY
ioctls to be used by non-root users to add and remove encryption keys
from the filesystem-level crypto keyrings, subject to limitations.
Motivation: while privileged fscrypt key management is sufficient for
some users (e.g. Android and Chromium OS, where a privileged process
manages all keys), the old API by design also allows non-root users to
set up and use encrypted directories, and we don't want to regress on
that. Especially, we don't want to force users to continue using the
old API, running into the visibility mismatch between files and keyrings
and being unable to "lock" encrypted directories.
Intuitively, the ioctls have to be privileged since they manipulate
filesystem-level state. However, it's actually safe to make them
unprivileged if we very carefully enforce some specific limitations.
First, each key must be identified by a cryptographic hash so that a
user can't add the wrong key for another user's files. For v2
encryption policies, we use the key_identifier for this. v1 policies
don't have this, so managing keys for them remains privileged.
Second, each key a user adds is charged to their quota for the keyrings
service. Thus, a user can't exhaust memory by adding a huge number of
keys. By default each non-root user is allowed up to 200 keys; this can
be changed using the existing sysctl 'kernel.keys.maxkeys'.
Third, if multiple users add the same key, we keep track of those users
of the key (of which there remains a single copy), and won't really
remove the key, i.e. "lock" the encrypted files, until all those users
have removed it. This prevents denial of service attacks that would be
possible under simpler schemes, such allowing the first user who added a
key to remove it -- since that could be a malicious user who has
compromised the key. Of course, encryption keys should be kept secret,
but the idea is that using encryption should never be *less* secure than
not using encryption, even if your key was compromised.
We tolerate that a user will be unable to really remove a key, i.e.
unable to "lock" their encrypted files, if another user has added the
same key. But in a sense, this is actually a good thing because it will
avoid providing a false notion of security where a key appears to have
been removed when actually it's still in memory, available to any
attacker who compromises the operating system kernel.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a new fscrypt policy version, "v2". It has the following changes
from the original policy version, which we call "v1" (*):
- Master keys (the user-provided encryption keys) are only ever used as
input to HKDF-SHA512. This is more flexible and less error-prone, and
it avoids the quirks and limitations of the AES-128-ECB based KDF.
Three classes of cryptographically isolated subkeys are defined:
- Per-file keys, like used in v1 policies except for the new KDF.
- Per-mode keys. These implement the semantics of the DIRECT_KEY
flag, which for v1 policies made the master key be used directly.
These are also planned to be used for inline encryption when
support for it is added.
- Key identifiers (see below).
- Each master key is identified by a 16-byte master_key_identifier,
which is derived from the key itself using HKDF-SHA512. This prevents
users from associating the wrong key with an encrypted file or
directory. This was easily possible with v1 policies, which
identified the key by an arbitrary 8-byte master_key_descriptor.
- The key must be provided in the filesystem-level keyring, not in a
process-subscribed keyring.
The following UAPI additions are made:
- The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a
fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated
from fscrypt_policy/fscrypt_policy_v1 by the version code prefix.
- A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows
getting the v1 or v2 encryption policy of an encrypted file or
directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not
be used because it did not have a way for userspace to indicate which
policy structure is expected. The new ioctl includes a size field, so
it is extensible to future fscrypt policy versions.
- The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY,
and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2
encryption policies. Such keys are kept logically separate from keys
for v1 encryption policies, and are identified by 'identifier' rather
than by 'descriptor'. The 'identifier' need not be provided when
adding a key, since the kernel will calculate it anyway.
This patch temporarily keeps adding/removing v2 policy keys behind the
same permission check done for adding/removing v1 policy keys:
capable(CAP_SYS_ADMIN). However, the next patch will carefully take
advantage of the cryptographically secure master_key_identifier to allow
non-root users to add/remove v2 policy keys, thus providing a full
replacement for v1 policies.
(*) Actually, in the API fscrypt_policy::version is 0 while on-disk
fscrypt_context::format is 1. But I believe it makes the most sense
to advance both to '2' to have them be in sync, and to consider the
numbering to start at 1 except for the API quirk.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key
specified by 'struct fscrypt_key_specifier' (the same way a key is
specified for the other fscrypt key management ioctls), it returns
status information in a 'struct fscrypt_get_key_status_arg'.
The main motivation for this is that applications need to be able to
check whether an encrypted directory is "unlocked" or not, so that they
can add the key if it is not, and avoid adding the key (which may
involve prompting the user for a passphrase) if it already is.
It's possible to use some workarounds such as checking whether opening a
regular file fails with ENOKEY, or checking whether the filenames "look
like gibberish" or not. However, no workaround is usable in all cases.
Like the other key management ioctls, the keyrings syscalls may seem at
first to be a good fit for this. Unfortunately, they are not. Even if
we exposed the keyring ID of the ->s_master_keys keyring and gave
everyone Search permission on it (note: currently the keyrings
permission system would also allow everyone to "invalidate" the keyring
too), the fscrypt keys have an additional state that doesn't map cleanly
to the keyrings API: the secret can be removed, but we can be still
tracking the files that were using the key, and the removal can be
re-attempted or the secret added again.
After later patches, some applications will also need a way to determine
whether a key was added by the current user vs. by some other user.
Reserved fields are included in fscrypt_get_key_status_arg for this and
other future extensions.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an
encryption key to the filesystem's fscrypt keyring ->s_master_keys,
making any files encrypted with that key appear "unlocked".
Why we need this
~~~~~~~~~~~~~~~~
The main problem is that the "locked/unlocked" (ciphertext/plaintext)
status of encrypted files is global, but the fscrypt keys are not.
fscrypt only looks for keys in the keyring(s) the process accessing the
filesystem is subscribed to: the thread keyring, process keyring, and
session keyring, where the session keyring may contain the user keyring.
Therefore, userspace has to put fscrypt keys in the keyrings for
individual users or sessions. But this means that when a process with a
different keyring tries to access encrypted files, whether they appear
"unlocked" or not is nondeterministic. This is because it depends on
whether the files are currently present in the inode cache.
Fixing this by consistently providing each process its own view of the
filesystem depending on whether it has the key or not isn't feasible due
to how the VFS caches work. Furthermore, while sometimes users expect
this behavior, it is misguided for two reasons. First, it would be an
OS-level access control mechanism largely redundant with existing access
control mechanisms such as UNIX file permissions, ACLs, LSMs, etc.
Encryption is actually for protecting the data at rest.
Second, almost all users of fscrypt actually do need the keys to be
global. The largest users of fscrypt, Android and Chromium OS, achieve
this by having PID 1 create a "session keyring" that is inherited by
every process. This works, but it isn't scalable because it prevents
session keyrings from being used for any other purpose.
On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't
similarly abuse the session keyring, so to make 'sudo' work on all
systems it has to link all the user keyrings into root's user keyring
[2]. This is ugly and raises security concerns. Moreover it can't make
the keys available to system services, such as sshd trying to access the
user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to
read certificates from the user's home directory (see [5]); or to Docker
containers (see [6], [7]).
By having an API to add a key to the *filesystem* we'll be able to fix
the above bugs, remove userspace workarounds, and clearly express the
intended semantics: the locked/unlocked status of an encrypted directory
is global, and encryption is orthogonal to OS-level access control.
Why not use the add_key() syscall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use an ioctl for this API rather than the existing add_key() system
call because the ioctl gives us the flexibility needed to implement
fscrypt-specific semantics that will be introduced in later patches:
- Supporting key removal with the semantics such that the secret is
removed immediately and any unused inodes using the key are evicted;
also, the eviction of any in-use inodes can be retried.
- Calculating a key-dependent cryptographic identifier and returning it
to userspace.
- Allowing keys to be added and removed by non-root users, but only keys
for v2 encryption policies; and to prevent denial-of-service attacks,
users can only remove keys they themselves have added, and a key is
only really removed after all users who added it have removed it.
Trying to shoehorn these semantics into the keyrings syscalls would be
very difficult, whereas the ioctls make things much easier.
However, to reuse code the implementation still uses the keyrings
service internally. Thus we get lockless RCU-mode key lookups without
having to re-implement it, and the keys automatically show up in
/proc/keys for debugging purposes.
References:
[1] https://github.com/google/fscrypt
[2] https://goo.gl/55cCrI#heading=h.vf09isp98isb
[3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939
[4] https://github.com/google/fscrypt/issues/116
[5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715
[6] https://github.com/google/fscrypt/issues/128
[7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Update fs/crypto/ to use the new names for the UAPI constants rather
than the old names, then make the old definitions conditional on
!__KERNEL__.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Prefix all filesystem encryption UAPI constants except the ioctl numbers
with "FSCRYPT_" rather than with "FS_". This namespaces the constants
more appropriately and makes it clear that they are related specifically
to the filesystem encryption feature, and to the 'fscrypt_*' structures.
With some of the old names like "FS_POLICY_FLAGS_VALID", it was not
immediately clear that the constant had anything to do with encryption.
This is also useful because we'll be adding more encryption-related
constants, e.g. for the policy version, and we'd otherwise have to
choose whether to use unclear names like FS_POLICY_V1 or inconsistent
names like FS_ENCRYPTION_POLICY_V1.
For source compatibility with existing userspace programs, keep the old
names defined as aliases to the new names.
Finally, as long as new names are being defined anyway, I skipped
defining new names for the fscrypt mode numbers that aren't actually
used: INVALID (0), AES_256_GCM (2), AES_256_CBC (3), SPECK128_256_XTS
(7), and SPECK128_256_CTS (8).
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
More fscrypt definitions are being added, and we shouldn't use a
disproportionate amount of space in <linux/fs.h> for fscrypt stuff.
So move the fscrypt definitions to a new header <linux/fscrypt.h>.
For source compatibility with existing userspace programs, <linux/fs.h>
still includes the new header.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl1QegseHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGavMH/RBaR39ct3XXpPzC
yxmKJn/n692NXFKusfsfasitGHMdFom6HaCcKx4PpzX1QHnR34LJtMd1QvwM8cHz
FtbC68HyZBB91JOUzC38GbRufvVUsSeXg8YeBiF6BOoOP06OcOG+DKoPcKBOMXsR
MtJmgxEyedLT7ozEPTpowVBWulELJxkbe1MCc93xDXdOqT+aMuxZBASMJIgrufS7
uJTsJ/afHz6F29Mj6Q9lfIJJSHqSfMK/rPGP54xRdBgMWmAmNjA2aExyCK8PE/Yb
TChsrDjDz38ePuVaWfjtwFNWlWcq0Do8vJdPuAxZDdfaJlQDXQHCWXsJjrWD6oNy
ZhCq0zE=
=HcEO
-----END PGP SIGNATURE-----
Merge tag 'v5.3-rc4' into next
Sync up with mainline to bring in device_property_count_u32 andother
newer APIs.
Previous patch made the length of the per-CPU skb drop list
configurable. Expose a counter that shows how many packets could not be
enqueued to this list.
This allows users determine the desired queue length.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In packet alert mode, each CPU holds a list of dropped skbs that need to
be processed in process context and sent to user space. To avoid
exhausting the system's memory the maximum length of this queue is
currently set to 1000.
Allow users to tune the length of this queue according to their needs.
The configured length is reported to user space when drop monitor
configuration is queried.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Users should be able to query the current configuration of drop monitor
before they start using it. Add a command to query the existing
configuration which currently consists of alert mode and packet
truncation length.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending dropped packets to user space it is not always necessary to
copy the entire packet as usually only the headers are of interest.
Allow user to specify the truncation length and add the original length
of the packet as additional metadata to the netlink message.
By default no truncation is performed.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far drop monitor supported only one alert mode in which a summary of
locations in which packets were recently dropped was sent to user space.
This alert mode is sufficient in order to understand that packets were
dropped, but lacks information to perform a more detailed analysis.
Add a new alert mode in which the dropped packet itself is passed to
user space along with metadata: The drop location (as program counter
and resolved symbol), ingress netdevice and drop timestamp. More
metadata can be added in the future.
To avoid performing expensive operations in the context in which
kfree_skb() is invoked (can be hard IRQ), the dropped skb is cloned and
queued on per-CPU skb drop list. Then, in process context the netlink
message is allocated, prepared and finally sent to user space.
The per-CPU skb drop list is limited to 1000 skbs to prevent exhausting
the system's memory. Subsequent patches will make this limit
configurable and also add a counter that indicates how many skbs were
tail dropped.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The next patch is going to add another alert mode in which the dropped
packet is notified to user space, instead of only a summary of recent
drops.
Abstract the differences between the modes by adding alert mode
operations. The operations are selected based on the currently
configured mode and associated with the probes and the work item just
before tracing starts.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Generating and retrieving socket cookies are a useful feature that is
exposed to BPF for various program types through bpf_get_socket_cookie()
helper.
The fact that the cookie counter is per netns is quite a limitation
for BPF in practice in particular for programs in host namespace that
use socket cookies as part of a map lookup key since they will be
causing socket cookie collisions e.g. when attached to BPF cgroup hooks
or cls_bpf on tc egress in host namespace handling container traffic
from veth or ipvlan devices with peer in different netns. Change the
counter to be global instead.
Socket cookie consumers must assume the value as opqaue in any case.
Not every socket must have a cookie generated and knowledge of the
counter value itself does not provide much value either way hence
conversion to global is fine.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Martynas Pumputis <m@lambda.lt>
Signed-off-by: David S. Miller <davem@davemloft.net>
core:
- mode parser strncpy fix
i915:
- GLK DSI escape clock setting
- HDCP memleak fix
tegra:
- one gpiod/of regression fix
amdgpu:
- Fixes VCN to handle the latest navi10 firmware
- Fixes for fan control on navi10
- Properly handle SMU metrics table on navi10
- Fix a resume regression on Stoney
- kfd revert a GWS ioctl
vmwgfx:
- memory leak fix
rockchip:
- suspend fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdTQ/bAAoJEAx081l5xIa+YAUP/i0sQUbM41u/CijX14TRSEOt
bdv6yIGsCQ4B5qu2LTm/53XzmQzh8ENsN9zDoQYbqerrdRjtcZ1h9Iux/e5a2mzo
+apVXfBo4QFvn7IelTjg/7MAyK7PO9vZmSEC4/xUxetz/LxgTrED3JWUGLUK5SbR
6zBupVPJmsizBWAOhwQFlwoBpDYbVfL5sJMq7NNBCjwrmgJEXsElZC4ZHFaIgXyh
MtWfJtVHJ9211q8gSVoAbjlFyuinn5qejCbx0Bo83CrZN6LNWu6tCaajsWa7QGGo
arxT1xlU2bMI5MpfNfb+jM3rDycR5vB3uO597S3M3gkVrN05ZiG2AhpX4r9ohMYB
qxGR0rLcu7P7CE8V4fDQWjIeaK0fV24waEYNMa0cUA+iilpt43/0DEKBpIR5oL74
nf445bmDSd4JazgxQXExUy+7szoWHJyqVhkpHnLKi1WdgOR2ZEZpQMnZGmg60RfE
QbgpRlszl/i0tBolMvC4Dgha+fyXjdUMEDgXb8lM7BZZFJ+Ew4Oxxq1V7Zy2+sub
XOrWp7oE6tucwd6XabGKDUakhDnxGp2taIXOMu+2K8b/H38sJdlirecrzAQdQfix
YB5FQcu/VqmQ2IdHU2bmXUm0lbd6e4s23JC4uRcnQFelVcrOey9GooauDX7OBgaj
EBMoR1qdJPjFHeQSuYda
=LjgJ
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2019-08-09' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Usual fixes roundup. Nothing too crazy or serious, one non-released
ioctl is removed in the amdkfd driver.
core:
- mode parser strncpy fix
i915:
- GLK DSI escape clock setting
- HDCP memleak fix
tegra:
- one gpiod/of regression fix
amdgpu:
- fix VCN to handle the latest navi10 firmware
- fix for fan control on navi10
- properly handle SMU metrics table on navi10
- fix a resume regression on Stoney
- kfd revert a GWS ioctl
vmwgfx:
- memory leak fix
rockchip:
- suspend fix"
* tag 'drm-fixes-2019-08-09' of git://anongit.freedesktop.org/drm/drm:
drm/vmwgfx: fix memory leak when too many retries have occurred
Revert "drm/amdkfd: New IOCTL to allocate queue GWS"
Revert "drm/amdgpu: fix transform feedback GDS hang on gfx10 (v2)"
drm/amdgpu: pin the csb buffer on hw init for gfx v8
drm/rockchip: Suspend DP late
drm/i915: Fix wrong escape clock divisor init for GLK
drm/i915: fix possible memory leak in intel_hdcp_auth_downstream()
drm/modes: Fix unterminated strncpy
drm/amd/powerplay: correct navi10 vcn powergate
drm/amd/powerplay: honor hw limit on fetching metrics data for navi10
drm/amd/powerplay: Allow changing of fan_control in smu_v11_0
drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0
drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands
drm/tegra: Fix gpiod_get_from_of_node() regression
UAPI Changes:
- HDCP: Add a Content protection type property
Cross-subsystem Changes:
Core Changes:
- Continue to rework the include dependencies
- fb: Remove the unused drm_gem_fbdev_fb_create function
- drm-dp-helper: Make the link rate calculation more tolerant to
non-explicitly defined, yet supported, rates
- fb-helper: Map DRM client buffer only when required, and instanciate a
shadow buffer when the device has a dirty function or says so
- connector: Add a helper to link the DDC adapter used by that connector to
the userspace
- vblank: Switch from DRM_WAIT_ON to wait_event_interruptible_timeout
- dma-buf: Fix a stack corruption
- ttm: Embed a drm_gem_object struct to make ttm_buffer_object a
superclass of GEM, and convert drivers to use it.
- hdcp: Improvements to report the content protection type to the
userspace
Driver Changes:
- Remove drm_gem_prime_import/export from being defined in the drivers
- Drop DRM_AUTH usage from drivers
- Continue to drop drmP.h
- Convert drivers to the connector ddc helper
- ingenic: Add support for more panel-related cases
- komeda: Support for dual-link
- lima: Reduce logging
- mpag200: Fix the cursor support
- panfrost: Export GPU features register to userspace through an ioctl
- pl111: Remove the CLD pads wiring support from the DT
- rockchip: Rework to use DRM PSR helpers, fix a bug in the VOP_WIN_GET
macro
- sun4i: Improve support for color encoding and range
- tinydrm: Rework SPI support, improve MIPI-DBI support, move to drm/tiny
- vkms: Rework of the CRC tracking
- bridges:
- sii902x: Add support for audio graph card
- tc358767: Rework AUX data handling code
- ti-sn65dsi86: Add Debugfs and proper DSI mode flags support
- panels
- Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191,
Boe Himax8279d, Sharp LD-D5116Z01B
- Conversion of the device tree bindings to the YAML description
- jh057n00900: Rework the enable / disable path
- fbdev:
- ssd1307fb: Support more devices based on that controller
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXUwPUAAKCRDj7w1vZxhR
xQ4lAQDK2ijx29YHeZspbOwP4Nwq95DFs1uQcSm5GvbRt1JSowD9EwkLeNfkPkel
Xv1Ts/Frgq7ckH2e2zkLPyCOFCHd0wA=
=rIUl
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2019-08-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.4:
UAPI Changes:
- HDCP: Add a Content protection type property
Cross-subsystem Changes:
Core Changes:
- Continue to rework the include dependencies
- fb: Remove the unused drm_gem_fbdev_fb_create function
- drm-dp-helper: Make the link rate calculation more tolerant to
non-explicitly defined, yet supported, rates
- fb-helper: Map DRM client buffer only when required, and instanciate a
shadow buffer when the device has a dirty function or says so
- connector: Add a helper to link the DDC adapter used by that connector to
the userspace
- vblank: Switch from DRM_WAIT_ON to wait_event_interruptible_timeout
- dma-buf: Fix a stack corruption
- ttm: Embed a drm_gem_object struct to make ttm_buffer_object a
superclass of GEM, and convert drivers to use it.
- hdcp: Improvements to report the content protection type to the
userspace
Driver Changes:
- Remove drm_gem_prime_import/export from being defined in the drivers
- Drop DRM_AUTH usage from drivers
- Continue to drop drmP.h
- Convert drivers to the connector ddc helper
- ingenic: Add support for more panel-related cases
- komeda: Support for dual-link
- lima: Reduce logging
- mpag200: Fix the cursor support
- panfrost: Export GPU features register to userspace through an ioctl
- pl111: Remove the CLD pads wiring support from the DT
- rockchip: Rework to use DRM PSR helpers, fix a bug in the VOP_WIN_GET
macro
- sun4i: Improve support for color encoding and range
- tinydrm: Rework SPI support, improve MIPI-DBI support, move to drm/tiny
- vkms: Rework of the CRC tracking
- bridges:
- sii902x: Add support for audio graph card
- tc358767: Rework AUX data handling code
- ti-sn65dsi86: Add Debugfs and proper DSI mode flags support
- panels
- Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191,
Boe Himax8279d, Sharp LD-D5116Z01B
- Conversion of the device tree bindings to the YAML description
- jh057n00900: Rework the enable / disable path
- fbdev:
- ssd1307fb: Support more devices based on that controller
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808121423.xzpedzkpyecvsiy4@flea
It has been requested that usbfs should implement runtime power
management, instead of forcing the device to remain at full power as
long as the device file is open. This patch introduces that new
feature.
It does so by adding three new usbfs ioctls:
USBDEVFS_FORBID_SUSPEND: Prevents the device from going into
runtime suspend (and causes a resume if the device is already
suspended).
USBDEVFS_ALLOW_SUSPEND: Allows the device to go into runtime
suspend. Some time may elapse before the device actually is
suspended, depending on things like the autosuspend delay.
USBDEVFS_WAIT_FOR_RESUME: Blocks until the call is interrupted
by a signal or at least one runtime resume has occurred since
the most recent ALLOW_SUSPEND ioctl call (which may mean
immediately, even if the device is currently suspended). In
the latter case, the device is prevented from suspending again
just as if FORBID_SUSPEND was called before the ioctl returns.
For backward compatibility, when the device file is first opened
runtime suspends are forbidden. The userspace program can then allow
suspends whenever it wants, and either resume the device directly (by
forbidding suspends again) or wait for a resume from some other source
(such as a remote wakeup). URBs submitted to a suspended device will
fail or will complete with an appropriate error code.
This combination of ioctls is sufficient for user programs to have
nearly the same degree of control over a device's runtime power
behavior as kernel drivers do.
Still lacking is documentation for the new ioctls. I intend to add it
later, after the existing documentation for the usbfs userspace API is
straightened out into a reasonable form.
Suggested-by: Mayuresh Kulkarni <mkulkarni@opensource.cirrus.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1908071013220.1514-100000@iolanthe.rowland.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drm-fixes-5.3-2019-08-07:
amdgpu:
- Fixes VCN to handle the latest navi10 firmware
- Fixes for fan control on navi10
- Properly handle SMU metrics table on navi10
- Fix a resume regression on Stoney
amdkfd:
- Revert new GWS ioctl. It's not ready.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190807184221.3323-1-alexander.deucher@amd.com
This reverts commit 1a058c3376.
This interface is still in too much flux. Revert until
it's sorted out.
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull networking fixes from David Miller:
"Yeah I should have sent a pull request last week, so there is a lot
more here than usual:
1) Fix memory leak in ebtables compat code, from Wenwen Wang.
2) Several kTLS bug fixes from Jakub Kicinski (circular close on
disconnect etc.)
3) Force slave speed check on link state recovery in bonding 802.3ad
mode, from Thomas Falcon.
4) Clear RX descriptor bits before assigning buffers to them in
stmmac, from Jose Abreu.
5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF
loops, from Nishka Dasgupta.
6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean.
7) Need to hold sock across skb->destructor invocation, from Cong
Wang.
8) IP header length needs to be validated in ipip tunnel xmit, from
Haishuang Yan.
9) Use after free in ip6 tunnel driver, also from Haishuang Yan.
10) Do not use MSI interrupts on r8169 chips before RTL8168d, from
Heiner Kallweit.
11) Upon bridge device init failure, we need to delete the local fdb.
From Nikolay Aleksandrov.
12) Handle erros from of_get_mac_address() properly in stmmac, from
Martin Blumenstingl.
13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef
Kadlecsik.
14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with
some devices, so revert. From Johannes Berg.
15) Fix deadlock in rxrpc, from David Howells.
16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue
Haibing.
17) Fix mvpp2 crash on module removal, from Matteo Croce.
18) Fix race in genphy_update_link, from Heiner Kallweit.
19) bpf_xdp_adjust_head() stopped working with generic XDP when we
fixes generic XDP to support stacked devices properly, fix from
Jesper Dangaard Brouer.
20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from
David Ahern.
21) Several memory leaks in new sja1105 driver, from Vladimir Oltean"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits)
net: dsa: sja1105: Fix memory leak on meta state machine error path
net: dsa: sja1105: Fix memory leak on meta state machine normal path
net: dsa: sja1105: Really fix panic on unregistering PTP clock
net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled
net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus()
net: sched: sample: allow accessing psample_group with rtnl
net: sched: police: allow accessing police->params with rtnl
net: hisilicon: Fix dma_map_single failed on arm64
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: make hip04_tx_reclaim non-reentrant
tc-testing: updated vlan action tests with batch create/delete
net sched: update vlan action for batched events operations
net: stmmac: tc: Do not return a fragment entry
net: stmmac: Fix issues when number of Queues >= 4
net: stmmac: xgmac: Fix XGMAC selftests
be2net: disable bh with spin_lock in be_process_mcc
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs
net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
...
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order to support virtualization usage via PCIe SRIOV, this patch
adds two ioctls under FPGA Management Engine (FME) to release and
assign back the port device. In order to safely turn Port from PF
into VF and enable PCIe SRIOV, it requires user to invoke this
PORT_RELEASE ioctl to release port firstly to remove userspace
interfaces, and then configure the PF/VF access register in FME.
After disable SRIOV, it requires user to invoke this PORT_ASSIGN
ioctl to attach the port back to PF.
Ioctl interfaces:
* DFL_FPGA_FME_PORT_RELEASE
Release platform device of given port, it deletes port platform
device to remove related userspace interfaces on PF. After this
function, then it's safe to configure PF/VF access mode to VF,
and enable VFs via SRIOV.
* DFL_FPGA_FME_PORT_ASSIGN
Assign platform device of given port back to PF. After configure
PF/VF access mode to PF, this ioctl adds port platform device
back to re-enable related userspace interfaces on PF.
Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/1564914022-3710-2-git-send-email-hao.wu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a lot of those warnings with GCC8+ 64-bit,
In file included from ./include/linux/sctp.h:42,
from net/core/skbuff.c:47:
./include/uapi/linux/sctp.h:395:1: warning: alignment 4 of 'struct
sctp_paddr_change' is less than 8 [-Wpacked-not-aligned]
} __attribute__((packed, aligned(4)));
^
./include/uapi/linux/sctp.h:728:1: warning: alignment 4 of 'struct
sctp_setpeerprim' is less than 8 [-Wpacked-not-aligned]
} __attribute__((packed, aligned(4)));
^
./include/uapi/linux/sctp.h:727:26: warning: 'sspp_addr' offset 4 in
'struct sctp_setpeerprim' isn't aligned to 8 [-Wpacked-not-aligned]
struct sockaddr_storage sspp_addr;
^~~~~~~~~
./include/uapi/linux/sctp.h:741:1: warning: alignment 4 of 'struct
sctp_prim' is less than 8 [-Wpacked-not-aligned]
} __attribute__((packed, aligned(4)));
^
./include/uapi/linux/sctp.h:740:26: warning: 'ssp_addr' offset 4 in
'struct sctp_prim' isn't aligned to 8 [-Wpacked-not-aligned]
struct sockaddr_storage ssp_addr;
^~~~~~~~
./include/uapi/linux/sctp.h:792:1: warning: alignment 4 of 'struct
sctp_paddrparams' is less than 8 [-Wpacked-not-aligned]
} __attribute__((packed, aligned(4)));
^
./include/uapi/linux/sctp.h:784:26: warning: 'spp_address' offset 4 in
'struct sctp_paddrparams' isn't aligned to 8 [-Wpacked-not-aligned]
struct sockaddr_storage spp_address;
^~~~~~~~~~~
./include/uapi/linux/sctp.h:905:1: warning: alignment 4 of 'struct
sctp_paddrinfo' is less than 8 [-Wpacked-not-aligned]
} __attribute__((packed, aligned(4)));
^
./include/uapi/linux/sctp.h:899:26: warning: 'spinfo_address' offset 4
in 'struct sctp_paddrinfo' isn't aligned to 8 [-Wpacked-not-aligned]
struct sockaddr_storage spinfo_address;
^~~~~~~~~~~~~~
This is because the commit 20c9c825b1 ("[SCTP] Fix SCTP socket options
to work with 32-bit apps on 64-bit kernels.") added "packed, aligned(4)"
GCC attributes to some structures but one of the members, i.e, "struct
sockaddr_storage" in those structures has the attribute,
"aligned(__alignof__ (struct sockaddr *)" which is 8-byte on 64-bit
systems, so the commit overwrites the designed alignments for
"sockaddr_storage".
To fix this, "struct sockaddr_storage" needs to be aligned to 4-byte as
it is only used in those packed sctp structure which is part of UAPI,
and "struct __kernel_sockaddr_storage" is used in some other
places of UAPI that need not to change alignments in order to not
breaking userspace.
Use an implicit alignment for "struct __kernel_sockaddr_storage" so it
can keep the same alignments as a member in both packed and un-packed
structures without breaking UAPI.
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add header include guards in case they are included multiple times.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This adds the P_PIDFD type to waitid().
One of the last remaining bits for the pidfd api is to make it possible
to wait on pidfds. With P_PIDFD added to waitid() the parts of userspace
that want to use the pidfd api to exclusively manage processes can do so
now.
One of the things this will unblock in the future is the ability to make
it possible to retrieve the exit status via waitid(P_PIDFD) for
non-parent processes if handed a _suitable_ pidfd that has this feature
set. This is similar to what you can do on FreeBSD with kqueue(). It
might even end up being possible to wait on a process as a non-parent if
an appropriate property is enabled on the pidfd.
With P_PIDFD no scoping of the process identified by the pidfd is
possible, i.e. it explicitly blocks things such as wait4(-1), wait4(0),
waitid(P_ALL), waitid(P_PGID) etc. It only allows for semantics
equivalent to wait4(pid), waitid(P_PID). Users that need scoping should
rely on pid-based wait*() syscalls for now.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20190727222229.6516-2-christian@brauner.io
In user-space there's no way to distinguish why an mdb entry was deleted
and that is a problem for daemons which would like to keep the mdb in
sync with remote ends (e.g. mlag) but would also like to converge faster.
In almost all cases we'd like to age-out the remote entry for performance
and convergence reasons except when fast-leave is enabled. In that case we
want explicit immediate remote delete, thus add mdb flag which is set only
when the entry is being deleted due to fast-leave.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree:
1) memleak in ebtables from the error path for the 32/64 compat layer,
from Florian Westphal.
2) Fix inverted meta ifname/ifidx matching when no interface is set
on either from the input/output path, from Phil Sutter.
3) Remove goto label in nft_meta_bridge, also from Phil.
4) Missing include guard in xt_connlabel, from Masahiro Yamada.
5) Two patch to fix ipset destination MAC matching coming from
Stephano Brivio, via Jozsef Kadlecsik.
6) Fix set rename and listing concurrency problem, from Shijie Luo.
Patch also coming via Jozsef Kadlecsik.
7) ebtables 32/64 compat missing base chain policy in rule count,
from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the data structure, policy and parsing code allowing userland to send
the OBSS PD information into the kernel.
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190730163701.18836-2-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This helper function allows BPF programs to try to generate SYN
cookies, given a reference to a listener socket. The function works
from XDP and with an skb context since bpf_skc_lookup_tcp can lookup a
socket in both cases.
Signed-off-by: Petar Penkov <ppenkov@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A common pattern when using xdp_redirect_map() is to create a device map
where the lookup key is simply ifindex. Because device maps are arrays,
this leaves holes in the map, and the map has to be sized to fit the
largest ifindex, regardless of how many devices actually are actually
needed in the map.
This patch adds a second type of device map where the key is looked up
using a hashmap, instead of being used as an array index. This allows maps
to be densely packed, so they can be smaller.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixes in the iommu and balloon devices.
Disable the meta-data optimization for now - I hope we can get it fixed
shortly, but there's no point in making users suffer crashes while we
are working on that.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJdPV3yAAoJECgfDbjSjVRp5qAIAIbzdgGkkuill7++e05fo3zJ
Vus5ApnFb+VopuiKFAxHyrRhvFun2dftcpOEFC6qpZ1xMcErRa1JTDp+Z70gLPcf
ZYrT7WoJv202cTQLjlrKwMA4C+hNTGf86KZWls+uzTXngbsrzib99M89wjOTP6UW
fslOtznbaHw/oPqQSiL40vNUEhU6thnvSxWpaIGJTnU9cx508Q7dE8TpLA5UpuNj
0y0+0HJrwlNdO2CSOay+dLEkZ/3M0vbXxwcmMNwoPIOx3N58ScCTLF3w6/Zuudco
XGhUzY6K5UqonVRVoxXMsQru9ZiAhKGMnf3+ugUojm+riPFOrWBbMNkU7mmNIo0=
=nw3y
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost fixes from Michael Tsirkin:
- Fixes in the iommu and balloon devices.
- Disable the meta-data optimization for now - I hope we can get it
fixed shortly, but there's no point in making users suffer crashes
while we are working on that.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: disable metadata prefetch optimization
iommu/virtio: Update to most recent specification
balloon: fix up comments
mm/balloon_compaction: avoid duplicate page removal
Add a header include guard just in case.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add FS_VERITY_FL to the flags for FS_IOC_GETFLAGS, so that applications
can easily determine whether a file is a verity file at the same time as
they're checking other file flags. This flag will be gettable only;
FS_IOC_SETFLAGS won't allow setting it, since an ioctl must be used
instead to provide more parameters.
This flag matches the on-disk bit that was already allocated for ext4.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Here are 2 tty/vt patches for 5.3-rc2
- delete the netx-serial driver as the arch has been removed, no need
to keep the serial driver for it around either.
- vt console_lock fix to resolve a reported noisy warning at runtime
Both of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2MbA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk0iACeOnBnhE1OKqIVsG11K619UtLb67sAoL9h6Rgk
3PWLkB02N1ApgevMEOs5
=V3Xf
-----END PGP SIGNATURE-----
Merge tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty fixes from Greg KH:
"Here are two tty/vt fixes:
- delete the netx-serial driver as the arch has been removed, no need
to keep the serial driver for it around either.
- vt console_lock fix to resolve a reported noisy warning at runtime
Both of these have been in linux-next with no reported issues"
* tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: Grab console_lock around con_is_bound in show_bind
tty: serial: netx: Delete driver
Here are some small SPDX fixes for 5.3-rc2 for things that came in
during the 5.3-rc1 merge window that we previously missed.
Only 3 small patches here:
- 2 uapi patches to resolve some SPDX tags that were not correct
- fix an invalid SPDX tag in the iomap Makefile file
All have been properly reviewed on the public mailing lists.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2N9w8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylY9wCeJIYfs/eNf3tsjLQXxUBMYAJNqnsAn2IaMiTt
cv2mck7JZm5KyHpP3f5N
=RSZa
-----END PGP SIGNATURE-----
Merge tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX fixes from Greg KH:
"Here are some small SPDX fixes for 5.3-rc2 for things that came in
during the 5.3-rc1 merge window that we previously missed.
Only three small patches here:
- two uapi patches to resolve some SPDX tags that were not correct
- fix an invalid SPDX tag in the iomap Makefile file
All have been properly reviewed on the public mailing lists"
* tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
iomap: fix Invalid License ID
treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again
treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
Commit 1c38c7f220 ("nl80211: send event when CMD_FRAME duration
expires") added the possibility of NL80211_CMD_FRAME_WAIT_CANCEL
being sent whenever the off-channel wait time associated with a
CMD_FRAME completes. Document this in the uapi/linux/nl80211.h file.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190722113312.14031-1-denkenz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add support for exporting ipv6 flow label via bpf_flow_keys.
Export flow label from bpf_flow.c and also return early when
BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed.
Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
C flow dissector supports input flags that tell it to customize parsing
by either stopping early or trying to parse as deep as possible. Pass
those flags to the BPF flow dissector so it can make the same
decisions. In the next commits I'll add support for those flags to
our reference bpf_flow.c
v3:
* Export copy of flow dissector flags instead of moving (Alexei Starovoitov)
Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The Netx ARM machine was deleted from the kernel. This driver
had no users and has to go.
Cc: Robert Schwebel <r.schwebel@pengutronix.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20190722065146.4844-1-linus.walleij@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UAPI headers licensed under GPL are supposed to have exception
"WITH Linux-syscall-note" so that they can be included into non-GPL
user space application code.
The exception note is missing in some UAPI headers.
Some of them slipped in by the treewide conversion commit b24413180f
("License cleanup: add SPDX GPL-2.0 license identifier to files with
no license"). Just run:
$ git show --oneline b24413180f -- arch/x86/include/uapi/asm/
I believe they are not intentional, and should be fixed too.
This patch was generated by the following script:
git grep -l --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
-- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild |
while read file
do
sed -i -e '/[[:space:]]OR[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
-e '/[[:space:]]or[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
-e '/[[:space:]]OR[[:space:]]/!{/[[:space:]]or[[:space:]]/!s/\(GPL-[^[:space:]]*\)/\1 WITH Linux-syscall-note/g}' $file
done
After this patch is applied, there are 5 UAPI headers that do not contain
"WITH Linux-syscall-note". They are kept untouched since this exception
applies only to GPL variants.
$ git grep --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
-- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild
include/uapi/drm/panfrost_drm.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/batman_adv.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/qemu_fw_cfg.h:/* SPDX-License-Identifier: BSD-3-Clause */
include/uapi/linux/vbox_err.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/virtio_iommu.h:/* SPDX-License-Identifier: BSD-3-Clause */
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdOEuhAAoJEL/70l94x66DX/IH/3c6ADaZkuwzUMtJZgib/slX
V7h4ljoW33M85z3nCF5+kY3CNl8c9F2xKGcAIUlJF8MIsZW+zB3HjuU1LC4fCzuk
TqpBf74DpQsKCsv1ngiV02lefPVQ7/VT/QFY7EXNuAqNRfgsBRNoi50244a0ZKpD
KydzKTDKMD5HjE4lHb+bNr+guqkisPx0b0mZtsb4R9uuUSwXEa8DLmWQF2Do7zBj
6G9UD6a1AP5XQBwRRbo5a78b5NZQcF5R9wVEzsmK7OGUw/yC4Em4HVt46z+oT5cm
JK9m59XDqJaL6HMAWC2P/mXUj6o+PP+uBE2uuvkGCNcTLQZwWf+dq9961tWg81E=
=DD/Z
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Bugfixes, a pvspinlock optimization, and documentation moving"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption
Documentation: move Documentation/virtual to Documentation/virt
KVM: nVMX: Set cached_vmcs12 and cached_shadow_vmcs12 NULL after free
KVM: X86: Dynamically allocate user_fpu
KVM: X86: Fix fpu state crash in kvm guest
Revert "kvm: x86: Use task structs fpu field for user"
KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
Renaming docs seems to be en vogue at the moment, so fix on of the
grossly misnamed directories. We usually never use "virtual" as
a shortcut for virtualization in the kernel, but always virt,
as seen in the virt/ top-level directory. Fix up the documentation
to match that.
Fixes: ed16648eb5 ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl01pAkACgkQCF8+vY7k
4RXwTA/7BrsINB+6rRQxwBadSbTJvfhJ0WwNKYgbAJJfE3v91zEYNGkFgbwgYIcw
r6pXQSr2jPxFv6mv1rXRwjU8+eOucrgl2VIjRka0CPVBMrHsF1ENGM0+DbfNqjNU
8ygjyHZlGdThOarECkl76z0GXKprr79MKUMIEfeekDVz18QCQ2ER6gyKhyGz9LfI
Q0uDdWbGnXyu27ubRwLm7Cl6QTZ9Lhmcx3MPbEwD6bgLiy7PAIWXun3RyzGZCj6q
ps5DUBNtPM35k995JWOrGh4raW2VhJyTRWYJKErIfNe24bFg2dX2DMLcIfRaA4oF
joQJ+JvJX+V01QKxz8VksXPhWbsJ+9SQAaVnJ8iTS10HXULgb+Eymst86pmhUMM3
DFzouvcFdLXUwHJk4xXgCMabfMolIZwfhJRB220A+bP8AS+RmbsoP6zIorkTQQH3
3+iGX/s1GM7ARQhSXkwogltmqZuy2GSxq7EZrQr1lmI4ZW/0YxaxMnecgD7+TWEP
tU4e9R1WcMPkHbLYQnM7tzcsGwywkbGjX7vIEK52ERSy4g0TgHmO03tsL+kL6S5j
AQVAAfaDEzEi4Ar4nL7wxm2hoLAfwUUyCv+t84AivTpdHLO7QoznYgvS0GAbRjTB
t7YN5Uk2VVuCtfPsem704i5a9QIubB9pBZsYuvP1/miVxlhpIFw=
=ZxJh
-----END PGP SIGNATURE-----
Merge tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"For two regressions in media core:
- v4l2-subdev: fix regression in check_pad()
- videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already
in use"
* tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use
media: v4l2-subdev: fix regression in check_pad()
Following specification review a few things were changed in v8 of the
virtio-iommu series [1], but have been omitted when merging the base
driver. Add them now:
* Remove the EXEC flag.
* Add feature bit for the MMIO flag.
* Change domain_bits to domain_range.
* Add NOMEM status flag.
[1] https://lore.kernel.org/linux-iommu/20190530170929.19366-1-jean-philippe.brucker@arm.com/
Fixes: edcd69ab9a ("iommu: Add virtio-iommu driver")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
MPX is being removed from the kernel due to a lack of support in the
toolchain going forward (gcc).
The first step is to remove the userspace-visible ABIs so that applications
will stop using it. The most visible one are the enable/disable prctl()s.
Remove them first.
This is the most minimal and least invasive change needed to ensure that
apps stop using MPX with new kernels.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190705175321.DB42F0AD@viggo.jf.intel.com
This patch adds support for the FlySky FS-iA6B RC receiver (serial IBUS).
It allows the usage of the FlySky FS-i6 and other AFHDS compliant remote
controls as a joystick input device.
To use it, a patch to inputattach which adds the FS-iA6B as a 115200 baud
serial device is required. I will upstream it after this patch is merged.
More information about the hardware can be found here:
https://notsyncing.net/?p=blog&b=2018.linux-fsia6b
Signed-off-by: Markus Koch <markus@notsyncing.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
NL80211_HE_MAX_CAPABILITY_LEN has changed between D2.0 and D4.0. It is now
MAC (6) + PHY (11) + MCS (12) + PPE (25) = 54.
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190627095832.19445-1-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull adfs updates from Al Viro:
"More ADFS patches from Russell King"
* 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/adfs: add time stamp and file type helpers
fs/adfs: super: limit idlen according to directory type
fs/adfs: super: fix use-after-free bug
fs/adfs: super: safely update options on remount
fs/adfs: super: correct superblock flags
fs/adfs: clean up indirect disc addresses and fragment IDs
fs/adfs: clean up error message printing
fs/adfs: use %pV for error messages
fs/adfs: use format_version from disc_record
fs/adfs: add helper to get filesystem size
fs/adfs: add helper to get discrecord from map
fs/adfs: correct disc record structure
Pull vfs mount updates from Al Viro:
"The first part of mount updates.
Convert filesystems to use the new mount API"
* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
mnt_init(): call shmem_init() unconditionally
constify ksys_mount() string arguments
don't bother with registering rootfs
init_rootfs(): don't bother with init_ramfs_fs()
vfs: Convert smackfs to use the new mount API
vfs: Convert selinuxfs to use the new mount API
vfs: Convert securityfs to use the new mount API
vfs: Convert apparmorfs to use the new mount API
vfs: Convert openpromfs to use the new mount API
vfs: Convert xenfs to use the new mount API
vfs: Convert gadgetfs to use the new mount API
vfs: Convert oprofilefs to use the new mount API
vfs: Convert ibmasmfs to use the new mount API
vfs: Convert qib_fs/ipathfs to use the new mount API
vfs: Convert efivarfs to use the new mount API
vfs: Convert configfs to use the new mount API
vfs: Convert binfmt_misc to use the new mount API
convenience helper: get_tree_single()
convenience helper get_tree_nodev()
vfs: Kill sget_userns()
...
Pull networking fixes from David Miller:
1) Fix AF_XDP cq entry leak, from Ilya Maximets.
2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit.
3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika.
4) Fix handling of neigh timers wrt. entries added by userspace, from
Lorenzo Bianconi.
5) Various cases of missing of_node_put(), from Nishka Dasgupta.
6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing.
7) Various RDS layer fixes, from Gerd Rausch.
8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from
Cong Wang.
9) Fix FIB source validation checks over loopback, also from Cong Wang.
10) Use promisc for unsupported number of filters, from Justin Chen.
11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
tcp: fix tcp_set_congestion_control() use from bpf hook
ag71xx: fix return value check in ag71xx_probe()
ag71xx: fix error return code in ag71xx_probe()
usb: qmi_wwan: add D-Link DWM-222 A2 device ID
bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
net: dsa: sja1105: Fix missing unlock on error in sk_buff()
gve: replace kfree with kvfree
selftests/bpf: fix test_xdp_noinline on s390
selftests/bpf: fix "valid read map access into a read-only array 1" on s390
net/mlx5: Replace kfree with kvfree
MAINTAINERS: update netsec driver
ipv6: Unlink sibling route in case of failure
liquidio: Replace vmalloc + memset with vzalloc
udp: Fix typo in net/ipv4/udp.c
net: bcmgenet: use promisc for unsupported filters
ipv6: rt6_check should return NULL if 'from' is NULL
tipc: initialize 'validated' field of received packets
selftests: add a test case for rp_filter
fib: relax source validation check for loopback packets
mlxsw: spectrum: Do not process learned records with a dummy FID
...
Alexei Starovoitov says:
====================
pull-request: bpf 2019-07-18
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) verifier precision propagation fix, from Andrii.
2) BTF size fix for typedefs, from Andrii.
3) a bunch of big endian fixes, from Ilya.
4) wide load from bpf_sock_addr fixes, from Stanislav.
5) a bunch of misc fixes from a number of developers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
persistent memory device that allows a guest VM to use DAX mechanisms to
access a host-file with host-page-cache. It arranges for MAP_SYNC to
be disabled and instead triggers a host fsync() when a 'write-cache
flush' command is sent to the virtual disk device.
- Miscellaneous small fixups.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdMHwpAAoJEB7SkWpmfYgCUYoP/3vcgYBAaXNksyALF0iowPoP
z4J0KoaOA1CzRFEQtCWUQa84CWj+XoSewwSeyrIkqKQvx/gghXblK+GVjVzBn0BD
hmmiKr8af4DdxfzYdEXJp65cCpIiVMaJiGr20Aj9ObwvWJb4QZbz9q7hnPt6KgiI
jVND3BpP3OERb4ZFcibdmJT5foKooMcXVG6+luVe+hc1+ZZQxJBsBaqie4brQIFq
j59NX3HfHH2fr1vVwnVH0CO4tgbgYg9wZ2EivGu6wBWvORjrr7KiSSbOYP68EBtd
lUoNps+vQtGnfXGwNzAjp1wuknrQYYh4/KMKjep7hiZD39rgyvBpbHbyynKzQCWV
REe8cXr/nwphsENvBAUBiqY999EWVIxdT2iaVaSA6K/31JQAC5AFyxVK/P2Ke1SK
rvePZ++iLQ1o4phTxQPNlVUqF9jOrFVVICGwMDqaqSkOsD9YKQdFClfOF/1ntlDz
V0bs+Y0Pe8AJCd9ESep4X+vHAWRRIb4EQIuwLaX8RJoY+r1fGye9RPthpYYzvXKp
DI2iJztFO3anzj2i9htNPUFIaiUmIhzEvG32O2If2yc5FL02hMpHPoFx6vHhe6s3
f8OJ+olsJK+/IIrV8+DHqYvhzylOYIhmRTvIxIxaNDPHkhR1i2RDQ6KKK1YZmsr8
MjAZ+Ym0GadDivs+wcM6
=uAMG
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"Primarily just the virtio_pmem driver:
- virtio_pmem
The new virtio_pmem facility introduces a paravirtualized
persistent memory device that allows a guest VM to use DAX
mechanisms to access a host-file with host-page-cache. It arranges
for MAP_SYNC to be disabled and instead triggers a host fsync()
when a 'write-cache flush' command is sent to the virtual disk
device.
- Miscellaneous small fixups"
* tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
virtio_pmem: fix sparse warning
xfs: disable map_sync for async flush
ext4: disable map_sync for async flush
dax: check synchronous mapping is supported
dm: enable synchronous dax
libnvdimm: add dax_dev sync flag
virtio-pmem: Add virtio pmem driver
libnvdimm: nd_region flush callback support
libnvdimm, namespace: Drop uuid_t implementation detail
new iommu device
vhost guest memory access using vmap (just meta-data for now)
minor fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Note: due to code driver changes the driver-core tree, the following
patch is needed when merging tree with commit 92ce7e83b4
("driver_find_device: Unify the match function with
class_find_device()") in the driver-core tree:
From: Nathan Chancellor <natechancellor@gmail.com>
Subject: [PATCH] iommu/virtio: Constify data parameter in viommu_match_node
After commit 92ce7e83b4 ("driver_find_device: Unify the match
function with class_find_device()") in the driver-core tree.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
drivers/iommu/virtio-iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 4620dd221ffd..433f4d2ee956 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -839,7 +839,7 @@ static void viommu_put_resv_regions(struct device *dev, struct list_head *head)
static struct iommu_ops viommu_ops;
static struct virtio_driver virtio_iommu_drv;
-static int viommu_match_node(struct device *dev, void *data)
+static int viommu_match_node(struct device *dev, const void *data)
{
return dev->parent->fwnode == data;
}
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJdJ5qUAAoJECgfDbjSjVRpQs0H/2qWcIG1zjGKyh9KWrfgOusG
/QIqeP50d7SC6oqdyd00tzmExqO1xdGLPFzYixdOsU817te1gHBP4Rfmzo01jZRd
CUzZNnZQ2JRsDshiA6G2ui+wn1/a/cB3RPN4rT1mquDYS53QmsRGDQDnpp84TXMV
aocB8TS6halbRzKMq3VmaWHIvzNXnt4dwQR542+PyeLLn9bUx2QwWj2ON3QwxixK
dVRZow3GwLGBhKTA/Z1Z/Bta4fEfOKjUGP2XWgvL6zOr+nZR4eQ8w5WXVJYzR+d6
1JCfqTxleweT2k6Tu5VwtTNlQkxn/XvQAeisppOiEE6NnPjubyI9wMQIvL7bkpo=
=uJbC
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio, vhost updates from Michael Tsirkin:
"Fixes, features, performance:
- new iommu device
- vhost guest memory access using vmap (just meta-data for now)
- minor fixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-mmio: add error check for platform_get_irq
scsi: virtio_scsi: Use struct_size() helper
iommu/virtio: Add event queue
iommu/virtio: Add probe request
iommu: Add virtio-iommu driver
PCI: OF: Initialize dev->fwnode appropriately
of: Allow the iommu-map property to omit untranslated devices
dt-bindings: virtio: Add virtio-pci-iommu node
dt-bindings: virtio-mmio: Add IOMMU description
vhost: fix clang build warning
vhost: access vq metadata through kernel virtual address
vhost: factor out setting vring addr and num
vhost: introduce helpers to get the size of metadata area
vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch()
vhost: fine grain userspace memory accessors
vhost: generalize adding used elem
Merge more updates from Andrew Morton:
"VM:
- z3fold fixes and enhancements by Henry Burns and Vitaly Wool
- more accurate reclaimed slab caches calculations by Yafang Shao
- fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
Christoph Hellwig
- !CONFIG_MMU fixes by Christoph Hellwig
- new novmcoredd parameter to omit device dumps from vmcore, by
Kairui Song
- new test_meminit module for testing heap and pagealloc
initialization, by Alexander Potapenko
- ioremap improvements for huge mappings, by Anshuman Khandual
- generalize kprobe page fault handling, by Anshuman Khandual
- device-dax hotplug fixes and improvements, by Pavel Tatashin
- enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V
- add pte_devmap() support for arm64, by Robin Murphy
- unify locked_vm accounting with a helper, by Daniel Jordan
- several misc fixes
core/lib:
- new typeof_member() macro including some users, by Alexey Dobriyan
- make BIT() and GENMASK() available in asm, by Masahiro Yamada
- changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better
code generation, by Alexey Dobriyan
- rbtree code size optimizations, by Michel Lespinasse
- convert struct pid count to refcount_t, by Joel Fernandes
get_maintainer.pl:
- add --no-moderated switch to skip moderated ML's, by Joe Perches
misc:
- ptrace PTRACE_GET_SYSCALL_INFO interface
- coda updates
- gdb scripts, various"
[ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits)
fs/select.c: use struct_size() in kmalloc()
mm: add account_locked_vm utility function
arm64: mm: implement pte_devmap support
mm: introduce ARCH_HAS_PTE_DEVMAP
mm: clean up is_device_*_page() definitions
mm/mmap: move common defines to mman-common.h
mm: move MAP_SYNC to asm-generic/mman-common.h
device-dax: "Hotremove" persistent memory that is used like normal RAM
mm/hotplug: make remove_memory() interface usable
device-dax: fix memory and resource leak if hotplug fails
include/linux/lz4.h: fix spelling and copy-paste errors in documentation
ipc/mqueue.c: only perform resource calculation if user valid
include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
scripts/gdb: add helpers to find and list devices
scripts/gdb: add lx-genpd-summary command
drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
kernel/pid.c: convert struct pid count to refcount_t
drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
...
This patch fixes below sparse warning related to __virtio
type in virtio pmem driver. This is reported by Intel test
bot on linux-next tree.
nd_virtio.c:56:28: warning: incorrect type in assignment
(different base types)
nd_virtio.c:56:28: expected unsigned int [unsigned] [usertype] type
nd_virtio.c:56:28: got restricted __virtio32
nd_virtio.c:93:59: warning: incorrect type in argument 2
(different base types)
nd_virtio.c:93:59: expected restricted __virtio32 [usertype] val
nd_virtio.c:93:59: got unsigned int [unsigned] [usertype] ret
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.
There are two reasons for a special syscall-related ptrace request.
Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its
tracer has no reliable means to find out that the syscall was, in
fact, a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the
tracer. Common practice is to keep track of the sequence of
ptrace-stops in order not to mix the two syscall-stops up. But it is
not as simple as it looks; for example, strace had a (just recently
fixed) long-standing bug where attaching strace to a tracee that is
performing the execve system call led to the tracer identifying the
following syscall-exit-stop as syscall-enter-stop, which messed up
all the state tracking.
* Since the introduction of commit 84d77d3f06 ("ptrace: Don't allow
accessing an undumpable mm"), both PTRACE_PEEKDATA and
process_vm_readv become unavailable when the process dumpable flag is
cleared. On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.
Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee. For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.
ptrace(2) man page:
long ptrace(enum __ptrace_request request, pid_t pid,
void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
Retrieve information about the syscall that caused the stop.
The information is placed into the buffer pointed by "data"
argument, which should be a pointer to a buffer of type
"struct ptrace_syscall_info".
The "addr" argument contains the size of the buffer pointed to
by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
The return value contains the number of bytes available
to be written by the kernel.
If the size of data to be written by the kernel exceeds the size
specified by "addr" argument, the output is truncated.
[ldv@altlinux.org: selftests/seccomp/seccomp_bpf: update for PTRACE_GET_SYSCALL_INFO]
Link: http://lkml.kernel.org/r/20190708182904.GA12332@altlinux.org
Link: http://lkml.kernel.org/r/20190510152842.GF28558@altlinux.org
Signed-off-by: Elvira Khabirova <lineprinter@altlinux.org>
Co-developed-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Eugene Syromyatnikov <esyr@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Helge Deller <deller@gmx.de> [parisc]
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: kbuild test robot <lkp@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds support for partial file caching in Coda. Every read, write
and mmap informs the userspace cache manager about what part of a file
is about to be accessed so that the cache manager can ensure the
relevant parts are available before the operation is allowed to proceed.
When a read or write operation completes, this is also reported to allow
the cache manager to track when partially cached content can be
released.
If the cache manager does not support partial file caching, or when the
entire file has been fetched into the local cache, the cache manager may
return an EOPNOTSUPP error to indicate that intent upcalls are no longer
necessary until the file is closed.
[akpm@linux-foundation.org: little whitespace fixup]
Link: http://lkml.kernel.org/r/20190618181301.6960-1-jaharkes@cs.cmu.edu
Signed-off-by: Pedro Cuadra <pjcuadra@gmail.com>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nothing is left in this header that is used by userspace.
Link: http://lkml.kernel.org/r/bb11378cef94739f2cf89425dd6d302a52c64480.1558117389.git.jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the 32-bit time_t problems to userspace.
Link: http://lkml.kernel.org/r/8d089068823bfb292a4020f773922fbd82ffad39.1558117389.git.jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We exchange file timestamps with user space using psdev device
read/write operations with a fixed but architecture specific binary
layout.
On 32-bit systems, this uses a 'timespec' structure that is defined by
the C library to contain two 32-bit values for seconds and nanoseconds.
As we get ready for the year 2038 overflow of the 32-bit signed seconds,
the kernel now uses 64-bit timestamps internally, and user space will do
the same change by changing the 'timespec' definition in the future.
Unfortunately, this breaks the layout of the coda_vattr structure, so we
need to redefine that in terms of something that does not change. I'm
introducing a new 'struct vtimespec' structure here that keeps the
existing layout, and the same change has to be done in the coda user
space copy of linux/coda.h before anyone can use that on a 32-bit
architecture with 64-bit time_t.
An open question is what should happen to actual times past y2038, as
they are now truncated to the last valid date when sent to user space,
and interpreted as pre-1970 times when a timestamp with the MSB set is
read back into the kernel. Alternatively, we could change the new
timespec64_to_coda()/coda_to_timespec64() functions to use a different
interpretation and extend the available range further to the future by
disallowing past timestamps. This would require more changes in the
user space side though.
Link: http://lkml.kernel.org/r/562b7324149461743e4fbe2fedbf7c242f7e274a.1558117389.git.jaharkes@cs.cmu.edu
Link: https://patchwork.kernel.org/patch/10474735/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These constants only used internally and not exposed to userspace.
Link: http://lkml.kernel.org/r/baeafc30dad70d8b422ee679420099c2d8aa7da0.1558117389.git.jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Only users of upc_req in kernel side fs/coda/psdev.c and
fs/coda/upcall.c already include linux/coda_psdev.h.
Suggested by Jan Harkes <jaharkes@cs.cmu.edu> in
https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/
Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace:
linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type
struct list_head uc_chain;
^
linux/coda_psdev.h:13:2: error: unknown type name `caddr_t'
caddr_t uc_data;
^
linux/coda_psdev.h:14:2: error: unknown type name `u_short'
u_short uc_flags;
^
linux/coda_psdev.h:15:2: error: unknown type name `u_short'
u_short uc_inSize; /* Size is at most 5000 bytes */
^
linux/coda_psdev.h:16:2: error: unknown type name `u_short'
u_short uc_outSize;
^
linux/coda_psdev.h:17:2: error: unknown type name `u_short'
u_short uc_opcode; /* copied from data to save lookup */
^
linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t'
wait_queue_head_t uc_sleep; /* process' wait queue */
^
Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389.git.jaharkes@cs.cmu.edu
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Part of a patch by Mikko Rapeli, as Arnd Bergman commented on the
original patch.
pid_t might differ between libc and the kernel, so the kernel
interface has to use types that the kernel defines.
Link: http://lkml.kernel.org/r/f374a71f4d351bc8c8b3ac18ad7765c88d806d10.1558117389.git.jaharkes@cs.cmu.edu
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl0sNWYACgkQxWXV+ddt
WDsyQA/8CGnF68g6hwVuYz4K7f39gOiFlBnRxeN/3RT6vkNSyLZxvRDaDrSTzVIo
cz2G/9qZLXsIll+3EfZlyzZZiA+4f4hEDAfAd4yVPavRom+uu7dbqzAIpgvFlYdH
vhAYKOeWSqWElWJ06hzWO3FCwjY9GKFMk4PS0XHHp+STCT0hq1MkaHr44kiHsqdh
T5nVGDwXz8nGDZ51RO6+mgiSrd5eHbs6kXCd8rW7hmjTx8ClKHa1tdkxN/us+pJm
hTFT669m5ckHhY2AUKmkREoOwpnt2HcXQJNkz6gO+o03IDvYz73SScbhSYdNTlwi
j74GLf89FA52qVM+JDg9MaWYqgf1pQI8AHK/rXw2FNbuP/eL9kuZ85ZIbO6CiO0c
5jAixReSwzSP/V0+MKW3F7k4KtIqbHAV6mkI8zLwrAee4Xj81BOtgL7gYPFQTwSZ
ma0hEoen7IV5+/z9upUuLA5wr4BT+h1T+EllCWe1+9+9mRYOvowtkRNBL8HZWTDI
b65oTITfot54xX9ecKtiuG2qoqJEjjkR+YKdRM4nph6wflSNZxEoezBp3iRFpYOL
Lx+g97RcJ2EEoBVjVMkTqfj93GeiKRifa8yXdRY+A0I2ZXZEcS8DjSJM6rj3AOPy
4idIl+ABscayZowfqu0FSIULf1La0qiRXmbGNeG4ylhN4L6S/og=
=eshk
-----END PGP SIGNATURE-----
Merge tag 'for-5.3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Highlights:
- chunks that have been trimmed and unchanged since last mount are
tracked and skipped on repeated trims
- use hw assissed crc32c on more arches, speedups if native
instructions or optimized implementation is available
- the RAID56 incompat bit is automatically removed when the last
block group of that type is removed
Fixes:
- fsync fix for reflink on NODATACOW files that could lead to ENOSPC
- fix data loss after inode eviction, renaming it, and fsync it
- fix fsync not persisting dentry deletions due to inode evictions
- update ctime/mtime/iversion after hole punching
- fix compression type validation (reported by KASAN)
- send won't be allowed to start when relocation is in progress, this
can cause spurious errors or produce incorrect send stream
Core:
- new tracepoints for space update
- tree-checker: better check for end of extents for some tree items
- preparatory work for more checksum algorithms
- run delayed iput at unlink time and don't push the work to cleaner
thread where it's not properly throttled
- wrap block mapping to structures and helpers, base for further
refactoring
- split large files, part 1:
- space info handling
- block group reservations
- delayed refs
- delayed allocation
- other cleanups and refactoring"
* tag 'for-5.3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (103 commits)
btrfs: fix memory leak of path on error return path
btrfs: move the subvolume reservation stuff out of extent-tree.c
btrfs: migrate the delalloc space stuff to it's own home
btrfs: migrate btrfs_trans_release_chunk_metadata
btrfs: migrate the delayed refs rsv code
btrfs: Evaluate io_tree in find_lock_delalloc_range()
btrfs: migrate the global_block_rsv helpers to block-rsv.c
btrfs: migrate the block-rsv code to block-rsv.c
btrfs: stop using block_rsv_release_bytes everywhere
btrfs: cleanup the target logic in __btrfs_block_rsv_release
btrfs: export __btrfs_block_rsv_release
btrfs: export btrfs_block_rsv_add_bytes
btrfs: move btrfs_block_rsv definitions into it's own header
btrfs: Simplify update of space_info in __reserve_metadata_bytes()
btrfs: unexport can_overcommit
btrfs: move reserve_metadata_bytes and supporting code to space-info.c
btrfs: move dump_space_info to space-info.c
btrfs: export block_rsv_use_bytes
btrfs: move btrfs_space_info_add_*_bytes to space-info.c
btrfs: move the space info update macro to space-info.h
...
During the review of the iproute2 patches for txtime-assist mode, it was
pointed out that it does not make sense for the txtime-delay parameter to
be negative. So, change the type of the parameter from s32 to u32.
Fixes: 4cfd5779bd ("taprio: Add support for txtime-assist mode")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl0tpocACgkQCF8+vY7k
4RWoxA//b/fmDXP3WPzrjjSmpyB9ml0/epKzPbT5S2j0lftqKBmet29k+PCjVrTx
Nq2QauehY9ug5h8UMVUCmzPr95F0tSIGRoqk1vrn7z0K3q6k1SHrtvqbY1Bgb2Uk
Qvh2YFU4fQLJg8WAbExCjxCdbdmBKQVGKTwCtM+tP5OMxwAFOmQrjGaUaKCKIIA2
7Wzrx8CpSji+bJ3uK/d36c+4M9oDly5eaxBhoboL3BI0y+GqwiSASGwTO7BxrPOg
0wq5IZHnqS8+bprT9xQdDOqf+UOY9U1cxE/+sqsHxblfUEx9gfLy/R+FLmJn+SS9
Z3yLy4SqVHQMpWBjEAGodohikF60PAuTdymSC11jqFaKCUxWrIZg5xO+0blMrxPF
7vYIexutCkaBMHBlNaNsHIqB7B/2FGGKoN7QW64hwvwJCGvF7OmJcV+R4bROGvh4
nFuis9/Nm66Fq7I3aw37ThyZ0aWZdaQ0QJTH9ksxU/ZCz2hhMNYu/rXggrDvkS4U
nr77ZT5Gd7nj4b110zf8+99uiGiinY6hTfzPAuTCLBhaxwrv4/xDHAhpwdEB5T4j
8gOkxV8c0XWtL7sKqhGJvs/RRe2za0Y9XH6fyxsYfWcfuLjEvug8ouXMad9gxFWH
DL3WnKJEMGLScei2wux4kGOwEbkR1bUf2cHJfh3GpCB/y8vgLOc=
=smxY
-----END PGP SIGNATURE-----
Merge tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull rst conversion of docs from Mauro Carvalho Chehab:
"As agreed with Jon, I'm sending this big series directly to you, c/c
him, as this series required a special care, in order to avoid
conflicts with other trees"
* tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (77 commits)
docs: kbuild: fix build with pdf and fix some minor issues
docs: block: fix pdf output
docs: arm: fix a breakage with pdf output
docs: don't use nested tables
docs: gpio: add sysfs interface to the admin-guide
docs: locking: add it to the main index
docs: add some directories to the main documentation index
docs: add SPDX tags to new index files
docs: add a memory-devices subdir to driver-api
docs: phy: place documentation under driver-api
docs: serial: move it to the driver-api
docs: driver-api: add remaining converted dirs to it
docs: driver-api: add xilinx driver API documentation
docs: driver-api: add a series of orphaned documents
docs: admin-guide: add a series of orphaned documents
docs: cgroup-v1: add it to the admin-guide book
docs: aoe: add it to the driver-api book
docs: add some documentation dirs to the driver-api book
docs: driver-model: move it to the driver-api book
docs: lp855x-driver.rst: add it to the driver-api book
...
Add explicit check for u64 loads of user_ip6 and msg_src_ip6 and
update the comment.
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
ASUS WMI driver got a big refactoring in order to support the TUF Gaming
laptops. Besides that, the regression with backlight being permanently off
on various EeePC laptops has been fixed.
Accelerometer on HP ProBook 450 G0 shows wrong measurements due to
X axis being inverted. This has been fixed.
Intel PMC core driver has been extended to be ACPI enumerated
if the DSDT provides device with _HID "INT33A1". This allows
to convert the driver to be pure platform and support new hardware
purely based on ACPI DSDT.
From now on the Intel Speed Select Technology is supported thru
a corresponding driver. This driver provides an access to the features
of the ISST, such as Performance Profile, Core Power, Base frequency and
Turbo Frequency.
Mellanox platform drivers has been refactored and now extended
to support more systems, including new coming ones.
The OLPC XO-1.75 platform is now supported.
CB4063 Beckhoff Automation board is using PMC clocks,
provided via pmc_atom driver, for ethernet controllers in a way
that they can't be managed by the clock driver. The quirk
has been extended to cover this case.
Touchscreen on Chuwi Hi10 Plus tablet has been enabled. Meanwhile
the information of Chuwi Hi10 Air has been fixed to cover more models
based on the same platform.
Xiaomi notebooks have WMI interface enabled. Thus, the driver to support it
has been provided. It required some extension of the generic WMI library,
which allows to propagate opaque context to the ->probe() of the
individual drivers.
This release includes debugfs clean up from Greg KH for several drivers
that drop return code check and make debugfs absence or failure non-fatal.
Miscellaneous fixes here and there, mostly for Acer WMI and
various Intel drivers.
The listed below commits are duplicated due to previously pushed fixes in v5.2 cycle:
- 1dd93f873d platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi
- 89ae3a0736 platform/x86: intel-vbtn: Report switch events when event wakes device
- fa882fc80d platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration
- 0bfcd24b39 platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow
The following is an automated git shortlog grouped by driver:
acer-wmi:
- Mark expected switch fall-throughs
- no need to check return value of debugfs_create functions
asus-nb-wmi:
- Add microphone mute key code
asus-wmi:
- Use dev_get_drvdata()
- Do not disable keyboard backlight on unloading
- Switch fan boost mode
- Enhance detection of thermal data
- Organize code into sections
- Refactor error handling
- Support WMI event queue
- Refactor WMI event handling
- Improve DSTS WMI method ID detection
- Increase input buffer size of WMI methods
- Fix preserving keyboard backlight intensity on load
- Fix hwmon device cleanup
- no need to check return value of debugfs_create functions
- Only Tell EC the OS will handle display hotkeys from asus_nb_wmi
dell-laptop:
- no need to check return value of debugfs_create functions
hp_accel:
- Add support for HP ProBook 450 G0
ideapad-laptop:
- no need to check return value of debugfs_create functions
intel_int0002_vgpio:
- Get rid of custom ICPU() macro
intel_menlow:
- avoid null pointer deference error
intel_pmc:
- no need to check return value of debugfs_create functions
intel_pmc_core:
- Attach using APCI HID "INT33A1"
- transform Pkg C-state residency from TSC ticks into microseconds
intel_telemetry:
- no need to check return value of debugfs_create functions
intel-vbtn:
- Report switch events when event wakes device
ISST:
- Restore state on resume
- Add Intel Speed Select PUNIT MSR interface
- Add Intel Speed Select mailbox interface via MSRs
- Add Intel Speed Select mailbox interface via PCI
- Add Intel Speed Select mmio interface
- Add IOCTL to Translate Linux logical CPU to PUNIT CPU number
- Store per CPU information
- Add common API to register and handle ioctls
- Update ioctl-number.txt for Intel Speed Select interface
- A tool to validate Intel Speed Select commands
- Add .gitignore file
MAINTAINERS:
- Update for Intel Speed Select Technology
mlx-platform:
- Fix error handling in mlxplat_init()
- Add more reset cause attributes
- Modify DMI matching order
- Add regmap structure for the next generation systems
- Change API for i2c-mlxcpld driver activation
- Move regmap initialization before all drivers activation
- Fix parent device in i2c-mux-reg device registration
- Add new attribute for mlxreg-io sysfs interfaces
pcengines-apuv2:
- Make two symbols static
- Fix PCENGINES_APU2 Kconfig warning
OLPC:
- Add a config menu category for XO 1.75
- Require CONFIG_POWER_SUPPLY for XO-1.75 EC
- Fix olpc_xo175_ec_cmd() return value
- Make olpc_dt_compatible_match() static __init
- Add INPUT dependencies
- Fix build error without CONFIG_SPI
- Add a regulator for the DCON
- Add XO-1.75 EC driver
- Use BIT() and GENMASK() for event masks
- Avoid a warning if the EC didn't register yet
- Move EC-specific functionality out from x86
- Remove an unused include
- Add OLPC XO-1.75 EC bindings
platform/mellanox:
- mlxreg-hotplug: Add devm_free_irq call to remove flow
pmc_atom:
- Add CB4063 Beckhoff Automation board to critclk_systems DMI table
- no need to check return value of debugfs_create functions
Kconfig:
- Remove left-over BACKLIGHT_LCD_SUPPORT
samsung-laptop:
- no need to check return value of debugfs_create functions
touchscreen_dmi:
- Update Hi10 Air filter
- Add info for the CHUWI Hi10 Plus tablet.
wmi:
- add Xiaomi WMI key driver
- add context argument to the probe function
- add context pointer field to struct wmi_device_id
- Add function to get _UID of WMI device
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl0rP88ACgkQb7wzTHR8
rCis8BAAnRgRgi8x1C7xn66gAUHsDXpY0tF9cp/Fw3HyTmFCQkRSmnLkMM2DqGi+
dvB9U1zPcGWwdwryKFsJXioEK3erYpiYyT2VwLtW4S7P5jQ+N9biT4TZ8yFp0MEr
MZC50LZDV1JTp1a0GQyrMpfoMBnE7UhR2GL8UbGli/WwXFE5BLkrJ1pdrjhYZOHl
rJcgq3HPAhV5qkUkIU7gTC2GGSPydjBqk0OhVIU4dPsYwXIb2gXc0yR0QVwKm5x3
I/NQwBOBMKmdI6uJ8BJyg/p888Strw65YJaTe5wtvG8ljuIbcN/aQ3ZmClNrUnc0
58byqJCpRhg9HN39VpF9rsApEGxKTlitAUAUKy7lgue7/mycHbA1Syzz29AIM+2v
ey2/zgFeeWtgh1cuh2cUWlCE6woW7ED4VpDxhkXlX4xGUp+CILEiFqcsULlcc4j5
sgojCLRPs78roYj9Y84CwYbsd7J/Ce4r2evBpKYPqYxDbUiuH2aVQtEdPTKv9/xC
yHtBuJJSxY7a+sf4OZONRo13dfvRoZIPjcccR8yTOakS2/1Fqph7MpHyDkwFAfeS
M2f+OcJn9IECol1391PTLj9Dx3jApyVk21HJdiIj7sKZgJOSS54AFm0/Ywk0MFpY
XScXKulV48SdL4ZKup5aIpDzyP5zuvXszKQboRitep1dHiR9bl0=
=DC5j
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:
"Gathered a bunch of x86 platform driver changes. It's rather big,
since includes two big refactors and completely new driver:
- ASUS WMI driver got a big refactoring in order to support the TUF
Gaming laptops. Besides that, the regression with backlight being
permanently off on various EeePC laptops has been fixed.
- Accelerometer on HP ProBook 450 G0 shows wrong measurements due to
X axis being inverted. This has been fixed.
- Intel PMC core driver has been extended to be ACPI enumerated if
the DSDT provides device with _HID "INT33A1". This allows to
convert the driver to be pure platform and support new hardware
purely based on ACPI DSDT.
- From now on the Intel Speed Select Technology is supported thru a
corresponding driver. This driver provides an access to the
features of the ISST, such as Performance Profile, Core Power, Base
frequency and Turbo Frequency.
- Mellanox platform drivers has been refactored and now extended to
support more systems, including new coming ones.
- The OLPC XO-1.75 platform is now supported.
- CB4063 Beckhoff Automation board is using PMC clocks, provided via
pmc_atom driver, for ethernet controllers in a way that they can't
be managed by the clock driver. The quirk has been extended to
cover this case.
- Touchscreen on Chuwi Hi10 Plus tablet has been enabled. Meanwhile
the information of Chuwi Hi10 Air has been fixed to cover more
models based on the same platform.
- Xiaomi notebooks have WMI interface enabled. Thus, the driver to
support it has been provided. It required some extension of the
generic WMI library, which allows to propagate opaque context to
the ->probe() of the individual drivers.
This release includes debugfs clean up from Greg KH for several
drivers that drop return code check and make debugfs absence or
failure non-fatal.
Also miscellaneous fixes here and there, mostly for Acer WMI and
various Intel drivers"
* tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86: (74 commits)
platform/x86: Fix PCENGINES_APU2 Kconfig warning
tools/power/x86/intel-speed-select: Add .gitignore file
platform/x86: mlx-platform: Fix error handling in mlxplat_init()
platform/x86: intel_pmc_core: Attach using APCI HID "INT33A1"
platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds
platform/x86: asus-wmi: Use dev_get_drvdata()
Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
platform/x86: mlx-platform: Add more reset cause attributes
platform/x86: mlx-platform: Modify DMI matching order
platform/x86: mlx-platform: Add regmap structure for the next generation systems
platform/x86: mlx-platform: Change API for i2c-mlxcpld driver activation
platform/x86: mlx-platform: Move regmap initialization before all drivers activation
MAINTAINERS: Update for Intel Speed Select Technology
tools/power/x86: A tool to validate Intel Speed Select commands
platform/x86: ISST: Restore state on resume
platform/x86: ISST: Add Intel Speed Select PUNIT MSR interface
platform/x86: ISST: Add Intel Speed Select mailbox interface via MSRs
platform/x86: ISST: Add Intel Speed Select mailbox interface via PCI
platform/x86: ISST: Add Intel Speed Select mmio interface
platform/x86: ISST: Add IOCTL to Translate Linux logical CPU to PUNIT CPU number
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0nUl4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplCaEACa7c7ybRgV6CZpS9PXXVqBJqYILRLBXsnn
MXomrSdjKMs0Y928RAQHFNWh3HaHHtdvSmFvwOpfF5lyrYXVfc9MQ7brqarDp1t2
f4jvkL63BVG2Zs/VL8QVAz+CwtCF39hduzUR/Y9j/4+rJNhSBMNJLY0nlB5weCTy
MJetnLoQ9ETA2+xu49vAM/PFJgBynNUAyUer918y8QysJRj90/VnhieQmrVb4tpG
Q4yZFKq4YPDs0tLEX4Nj6eJERcyW/4MC2oZ0aPXU4g2Dc3SVWaSNOo5WpkP+crGt
0dbyLmhomteE6+Kaco1hAWIkG/RuvgiMzDizryi0enXP51edV3Vnwyg3MQSUhcnf
Pn7vrDkajKBE9rFGlLy8V4gkKdS8XJQy2xA1MWm3aWgGl4v0j64EXIe0IhIK30vU
25A9jLDcdgr74+Lw+vWLLd+oeGD0iFf6wiEp+3jzEdtfVNE/lD6yilTzbdz2V0UK
8T1sRLMEkaG7CbxOVc1UAfcvObjuqQihEI0fQvl4yxV178h8mtWB87YmV2S2EhzP
v6FSxiC1yZ7J+rwb/Mff7+1GoOgzrpS/zESk2WMTgcwVdiwFfv5eIC26ZNWObJ/x
IY+4xRgTf2dEsjBeumOuBzxTfzrZb+pTO4GCa4O+t0UDQRIwl0y20pTXKtxU3y/U
gKPXEjgXrQ==
=jDiB
-----END PGP SIGNATURE-----
Merge tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"This contains:
- Support for recvmsg/sendmsg as first class opcodes.
I don't envision going much further down this path, as there are
plans in progress to support potentially any system call in an
async fashion through io_uring. But I think it does make sense to
have certain core ops available directly, especially those that can
support a "try this non-blocking" flag/mode. (me)
- Handle generic short reads automatically.
This can happen fairly easily if parts of the buffered read is
cached. Since the application needs to issue another request for
the remainder, just do this internally and save kernel/user
roundtrip while providing a nicer more robust API. (me)
- Support for linked SQEs.
This allows SQEs to depend on each other, enabling an application
to eg queue a read-from-this-file,write-to-that-file pair. (me)
- Fix race in stopping SQ thread (Jackie)"
* tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block:
io_uring: fix io_sq_thread_stop running in front of io_sq_thread
io_uring: add support for recvmsg()
io_uring: add support for sendmsg()
io_uring: add support for sqe links
io_uring: punt short reads to async context
uio: make import_iovec()/compat_import_iovec() return bytes on success
- remove headers_{install,check}_all targets
- remove unreasonable 'depends on !UML' from CONFIG_SAMPLES
- re-implement 'make headers_install' more cleanly
- add new header-test-y syntax to compile-test headers
- compile-test exported headers to ensure they are compilable in
user-space
- compile-test headers under include/ to ensure they are self-contained
- remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value flags
- add -Werror=unknown-warning-option for Clang
- add 128-bit built-in types support to genksyms
- fix missed rebuild of modules.builtin
- propagate 'No space left on device' error in fixdep to Make
- allow Clang to use its integrated assembler
- improve some coccinelle scripts
- add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
path for $(srctree).
- do not ignore errors when compression utility is missing
- misc cleanups
-----BEGIN PGP SIGNATURE-----
iQJSBAABCgA8FiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl0oxNkeHHlhbWFkYS5t
YXNhaGlyb0Bzb2Npb25leHQuY29tAAoJED2LAQed4NsGnhcP/AuM8s+3SYFiLitJ
ISbznLFP2Xatq0SPXp5+moez/AMTK6Mm1biPcdo20d+TjVEh4+9F2nq12Ii9U8/D
tds9A6G8+Bb28r9GMIVQPdFohijW6ijtDziS31iQnIWyPsP/yx6PKfLAD9F4ca1x
7/4btmu+BOMjtN0NrMWSNz5MM47xUzoWIALL40SV4PzGVXLCQZ2PBNPeSRIk22Jt
ynDNPuNsmDWcFfwAE+sLSDrhCHZlwM8rg8rf6jmYdc4LcN4cj0oho5+K1TRyC9mn
fO3PT25juFejthxQulxEfyGggnyLM6BNTgPDGcCHSP4nD7mlXA9GcpZICtJOgGGu
SlDadMZ0GRMK5zcZ0MF0GQboeyViwsbXgrRcYuXt6cUFWX4P/1SeAQ5Mf4u1EKqf
hEbwFXV/g81ht0lFS8gyWkvdpoNPtxGHNPusLjp65C4rc0/48/s+7EE/u8JTPl1g
dQTeIOds6XUOkJgqhEfuq+8gfngbjKc9bYhs+ACbkCzBltQdnb6m5aLgk0ODxe8I
WbGn0+cQcS9VVwre7E5DnFSVWVOHAG5taiUwj0KDcHB0Jxw9Gvorq9WU1ppHHYH2
XQIFBx7XHdn28d+plS8R23vAPgDgrGdvE5RYK5tNQLhTJ6BbjlZ1n/Tmxzu62scK
deG3aCOB13Om7OTzTUh9+C3TC9ZQ
=E2Rz
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- remove headers_{install,check}_all targets
- remove unreasonable 'depends on !UML' from CONFIG_SAMPLES
- re-implement 'make headers_install' more cleanly
- add new header-test-y syntax to compile-test headers
- compile-test exported headers to ensure they are compilable in
user-space
- compile-test headers under include/ to ensure they are self-contained
- remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value
flags
- add -Werror=unknown-warning-option for Clang
- add 128-bit built-in types support to genksyms
- fix missed rebuild of modules.builtin
- propagate 'No space left on device' error in fixdep to Make
- allow Clang to use its integrated assembler
- improve some coccinelle scripts
- add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
path for $(srctree).
- do not ignore errors when compression utility is missing
- misc cleanups
* tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits)
kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefix
kbuild: Inform user to pass ARCH= for make mrproper
kbuild: fix compression errors getting ignored
kbuild: add a flag to force absolute path for srctree
kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree
kbuild: remove src and obj from the top Makefile
scripts/tags.sh: remove unused environment variables from comments
scripts/tags.sh: drop SUBARCH support for ARM
kbuild: compile-test kernel headers to ensure they are self-contained
kheaders: include only headers into kheaders_data.tar.xz
kheaders: remove meaningless -R option of 'ls'
kbuild: support header-test-pattern-y
kbuild: do not create wrappers for header-test-y
kbuild: compile-test exported headers to ensure they are self-contained
init/Kconfig: add CONFIG_CC_CAN_LINK
kallsyms: exclude kasan local symbols on s390
kbuild: add more hints about SUBDIRS replacement
coccinelle: api/stream_open: treat all wait_.*() calls as blocking
coccinelle: put_device: Add a cast to an expression for an assignment
coccinelle: put_device: Adjust a message construction
...
* support for chained PMU counters in guests
* improved SError handling
* handle Neoverse N1 erratum #1349291
* allow side-channel mitigation status to be migrated
* standardise most AArch64 system register accesses to msr_s/mrs_s
* fix host MPIDR corruption on 32bit
* selftests ckleanups
x86:
* PMU event {white,black}listing
* ability for the guest to disable host-side interrupt polling
* fixes for enlightened VMCS (Hyper-V pv nested virtualization),
* new hypercall to yield to IPI target
* support for passing cstate MSRs through to the guest
* lots of cleanups and optimizations
Generic:
* Some txt->rST conversions for the documentation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdJzdIAAoJEL/70l94x66DQDoH/i83/8kX4I8AWDlushPru4ts
Q4lCE5VAPha+o4pLb1dtfFL3gTmSbsB1N++JSlqK3JOo6LphIOy6b0wBjQBbAa6U
3CT1dJaHJoScLLj09vyBlvClGUH2ZKEQTWOiquCCf7JfPofxwPUA6vJ7TYsdkckx
zR3ygbADWmnfS7hFfiqN3JzuYh9eoooGNWSU+Giq6VF41SiL3IqhBGZhWS0zE9c2
2c5lpqqdeHmAYNBqsyzNiDRKp7+zLFSmZ7Z5/0L755L8KYwR6F5beTnmBMHvb4lA
PWH/SWOC8EYR+PEowfrH+TxKZwp0gMn1kcAKjilHk0uCRwG1IzuHAr2jlNxICCk=
=t/Oq
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- support for chained PMU counters in guests
- improved SError handling
- handle Neoverse N1 erratum #1349291
- allow side-channel mitigation status to be migrated
- standardise most AArch64 system register accesses to msr_s/mrs_s
- fix host MPIDR corruption on 32bit
- selftests ckleanups
x86:
- PMU event {white,black}listing
- ability for the guest to disable host-side interrupt polling
- fixes for enlightened VMCS (Hyper-V pv nested virtualization),
- new hypercall to yield to IPI target
- support for passing cstate MSRs through to the guest
- lots of cleanups and optimizations
Generic:
- Some txt->rST conversions for the documentation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits)
Documentation: virtual: Add toctree hooks
Documentation: kvm: Convert cpuid.txt to .rst
Documentation: virtual: Convert paravirt_ops.txt to .rst
KVM: x86: Unconditionally enable irqs in guest context
KVM: x86: PMU Event Filter
kvm: x86: Fix -Wmissing-prototypes warnings
KVM: Properly check if "page" is valid in kvm_vcpu_unmap
KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register
KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane
kvm: LAPIC: write down valid APIC registers
KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register
KVM: arm/arm64: Add save/restore support for firmware workaround state
arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests
KVM: arm/arm64: Support chained PMU counters
KVM: arm/arm64: Remove pmc->bitmask
KVM: arm/arm64: Re-create event when setting counter value
KVM: arm/arm64: Extract duplicated code to own function
KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions
KVM: LAPIC: ARBPRI is a reserved register for x2APIC
...
Merge updates from Andrew Morton:
"Am experimenting with splitting MM up into identifiable subsystems
perhaps with a view to gitifying it in complex ways. Also with more
verbose "incoming" emails.
Most of MM is here and a few other trees.
Subsystems affected by this patch series:
- hotfixes
- iommu
- scripts
- arch/sh
- ocfs2
- mm:slab-generic
- mm:slub
- mm:kmemleak
- mm:kasan
- mm:cleanups
- mm:debug
- mm:pagecache
- mm:swap
- mm:memcg
- mm:gup
- mm:pagemap
- mm:infrastructure
- mm:vmalloc
- mm:initialization
- mm:pagealloc
- mm:vmscan
- mm:tools
- mm:proc
- mm:ras
- mm:oom-kill
hotfixes:
mm: vmscan: scan anonymous pages on file refaults
mm/nvdimm: add is_ioremap_addr and use that to check ioremap address
mm/memcontrol: fix wrong statistics in memory.stat
mm/z3fold.c: lock z3fold page before __SetPageMovable()
nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header
MAINTAINERS: nilfs2: update email address
iommu:
include/linux/dmar.h: replace single-char identifiers in macros
scripts:
scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
scripts/decode_stacktrace: look for modules with .ko.debug extension
scripts/spelling.txt: drop "sepc" from the misspelling list
scripts/spelling.txt: add spelling fix for prohibited
scripts/decode_stacktrace: Accept dash/underscore in modules
scripts/spelling.txt: add more spellings to spelling.txt
arch/sh:
arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS
sh: config: remove left-over BACKLIGHT_LCD_SUPPORT
sh: prevent warnings when using iounmap
ocfs2:
fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat"
ocfs2/dlm: use struct_size() helper
ocfs2: add last unlock times in locking_state
ocfs2: add locking filter debugfs file
ocfs2: add first lock wait time in locking_state
ocfs: no need to check return value of debugfs_create functions
fs/ocfs2/dlmglue.c: unneeded variable: "status"
ocfs2: use kmemdup rather than duplicating its implementation
mm:slab-generic:
Patch series "mm/slab: Improved sanity checking":
mm/slab: validate cache membership under freelist hardening
mm/slab: sanity-check page type when looking up cache
lkdtm/heap: add tests for freelist hardening
mm:slub:
mm/slub.c: avoid double string traverse in kmem_cache_flags()
slub: don't panic for memcg kmem cache creation failure
mm:kmemleak:
mm/kmemleak.c: fix check for softirq context
mm/kmemleak.c: change error at _write when kmemleak is disabled
docs: kmemleak: add more documentation details
mm:kasan:
mm/kasan: print frame description for stack bugs
Patch series "Bitops instrumentation for KASAN", v5:
lib/test_kasan: add bitops tests
x86: use static_cpu_has in uaccess region to avoid instrumentation
asm-generic, x86: add bitops instrumentation for KASAN
Patch series "mm/kasan: Add object validation in ksize()", v3:
mm/kasan: introduce __kasan_check_{read,write}
mm/kasan: change kasan_check_{read,write} to return boolean
lib/test_kasan: Add test for double-kzfree detection
mm/slab: refactor common ksize KASAN logic into slab_common.c
mm/kasan: add object validation in ksize()
mm:cleanups:
include/linux/pfn_t.h: remove pfn_t_to_virt()
Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect":
arm: remove ARCH_SELECT_MEMORY_MODEL
s390: remove ARCH_SELECT_MEMORY_MODEL
sparc: remove ARCH_SELECT_MEMORY_MODEL
mm/gup.c: make follow_page_mask() static
mm/memory.c: trivial clean up in insert_page()
mm: make !CONFIG_HUGE_PAGE wrappers into static inlines
include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info
mm: remove the account_page_dirtied export
mm/page_isolation.c: change the prototype of undo_isolate_page_range()
include/linux/vmpressure.h: use spinlock_t instead of struct spinlock
mm: remove the exporting of totalram_pages
include/linux/pagemap.h: document trylock_page() return value
mm:debug:
mm/failslab.c: by default, do not fail allocations with direct reclaim only
Patch series "debug_pagealloc improvements":
mm, debug_pagelloc: use static keys to enable debugging
mm, page_alloc: more extensive free page checking with debug_pagealloc
mm, debug_pagealloc: use a page type instead of page_ext flag
mm:pagecache:
Patch series "fix filler_t callback type mismatches", v2:
mm/filemap.c: fix an overly long line in read_cache_page
mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page
jffs2: pass the correct prototype to read_cache_page
9p: pass the correct prototype to read_cache_page
mm/filemap.c: correct the comment about VM_FAULT_RETRY
mm:swap:
mm, swap: fix race between swapoff and some swap operations
mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()
mm, swap: use rbtree for swap_extent
mm/mincore.c: fix race between swapoff and mincore
mm:memcg:
memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL
memcg, fsnotify: no oom-kill for remote memcg charging
mm, memcg: introduce memory.events.local
mm: memcontrol: dump memory.stat during cgroup OOM
Patch series "mm: reparent slab memory on cgroup removal", v7:
mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache()
mm: memcg/slab: rename slab delayed deactivation functions and fields
mm: memcg/slab: generalize postponed non-root kmem_cache deactivation
mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg()
mm: memcg/slab: unify SLAB and SLUB page accounting
mm: memcg/slab: don't check the dying flag on kmem_cache creation
mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
mm: memcg/slab: rework non-root kmem_cache lifecycle management
mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages
mm: memcg/slab: reparent memcg kmem_caches on cgroup removal
mm, memcg: add a memcg_slabinfo debugfs file
mm:gup:
Patch series "switch the remaining architectures to use generic GUP", v4:
mm: use untagged_addr() for get_user_pages_fast addresses
mm: simplify gup_fast_permitted
mm: lift the x86_32 PAE version of gup_get_pte to common code
MIPS: use the generic get_user_pages_fast code
sh: add the missing pud_page definition
sh: use the generic get_user_pages_fast code
sparc64: add the missing pgd_page definition
sparc64: define untagged_addr()
sparc64: use the generic get_user_pages_fast code
mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP
mm: reorder code blocks in gup.c
mm: consolidate the get_user_pages* implementations
mm: validate get_user_pages_fast flags
mm: move the powerpc hugepd code to mm/gup.c
mm: switch gup_hugepte to use try_get_compound_head
mm: mark the page referenced in gup_hugepte
mm/gup: speed up check_and_migrate_cma_pages() on huge page
mm/gup.c: remove some BUG_ONs from get_gate_page()
mm/gup.c: mark undo_dev_pagemap as __maybe_unused
mm:pagemap:
asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
alpha: switch to generic version of pte allocation
arm: switch to generic version of pte allocation
arm64: switch to generic version of pte allocation
csky: switch to generic version of pte allocation
m68k: sun3: switch to generic version of pte allocation
mips: switch to generic version of pte allocation
nds32: switch to generic version of pte allocation
nios2: switch to generic version of pte allocation
parisc: switch to generic version of pte allocation
riscv: switch to generic version of pte allocation
um: switch to generic version of pte allocation
unicore32: switch to generic version of pte allocation
mm/pgtable: drop pgtable_t variable from pte_fn_t functions
mm/memory.c: fail when offset == num in first check of __vm_map_pages()
mm:infrastructure:
mm/mmu_notifier: use hlist_add_head_rcu()
mm:vmalloc:
Patch series "Some cleanups for the KVA/vmalloc", v5:
mm/vmalloc.c: remove "node" argument
mm/vmalloc.c: preload a CPU with one object for split purpose
mm/vmalloc.c: get rid of one single unlink_va() when merge
mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va()
mm/vmalloc.c: spelling> s/informaion/information/
mm:initialization:
mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist
mm/large system hash: clear hashdist when only one node with memory is booted
mm:pagealloc:
arm64: move jump_label_init() before parse_early_param()
Patch series "add init_on_alloc/init_on_free boot options", v10:
mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
mm: init: report memory auto-initialization features at boot time
mm:vmscan:
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
mm: vmscan: correct some vmscan counters for THP swapout
mm:tools:
tools/vm/slabinfo: order command line options
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add sorting info to help menu
mm:proc:
proc: use down_read_killable mmap_sem for /proc/pid/maps
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/map_files
mm: use down_read_killable for locking mmap_sem in access_remote_vm
mm: smaps: split PSS into components
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm:ras:
mm/memory-failure.c: clarify error message
mm:oom-kill:
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm, oom: refactor dump_tasks for memcg OOMs
mm, oom: remove redundant task_in_mem_cgroup() check
oom: decouple mems_allowed from oom_unkillable_task
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()"
* akpm: (147 commits)
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()
oom: decouple mems_allowed from oom_unkillable_task
mm, oom: remove redundant task_in_mem_cgroup() check
mm, oom: refactor dump_tasks for memcg OOMs
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm/memory-failure.c: clarify error message
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm: smaps: split PSS into components
mm: use down_read_killable for locking mmap_sem in access_remote_vm
proc: use down_read_killable mmap_sem for /proc/pid/map_files
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/maps
tools/vm/slabinfo: add sorting info to help menu
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: order command line options
mm: vmscan: correct some vmscan counters for THP swapout
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
...
cpu_to_le32/le32_to_cpu is defined in include/linux/byteorder/generic.h,
which is not exported to user-space.
UAPI headers must use the ones prefixed with double-underscore.
Detected by compile-testing exported headers:
include/linux/nilfs2_ondisk.h: In function `nilfs_checkpoint_set_snapshot':
include/linux/nilfs2_ondisk.h:536:17: error: implicit declaration of function `cpu_to_le32' [-Werror=implicit-function-declaration]
cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) | \
^
include/linux/nilfs2_ondisk.h:552:1: note: in expansion of macro `NILFS_CHECKPOINT_FNS'
NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
^~~~~~~~~~~~~~~~~~~~
include/linux/nilfs2_ondisk.h:536:29: error: implicit declaration of function `le32_to_cpu' [-Werror=implicit-function-declaration]
cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) | \
^
include/linux/nilfs2_ondisk.h:552:1: note: in expansion of macro `NILFS_CHECKPOINT_FNS'
NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
^~~~~~~~~~~~~~~~~~~~
include/linux/nilfs2_ondisk.h: In function `nilfs_segment_usage_set_clean':
include/linux/nilfs2_ondisk.h:622:19: error: implicit declaration of function `cpu_to_le64' [-Werror=implicit-function-declaration]
su->su_lastmod = cpu_to_le64(0);
^~~~~~~~~~~
Link: http://lkml.kernel.org/r/20190605053006.14332-1-yamada.masahiro@socionext.com
Fixes: e63e88bc53 ("nilfs2: move ioctl interface and disk layout to uapi separately")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: <stable@vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the big USB and PHY driver pull request for 5.3-rc1.
Lots of stuff here, all of which has been in linux-next for a while with
no reported issues. Nothing is earth-shattering, just constant forward
progress for more devices supported and cleanups and small fixes:
- USB gadget driver updates and fixes
- new USB gadget driver for some hardware, followed by a quick revert
of those patches as they were not ready to be merged...
- PHY driver updates
- Lots of new driver additions and cleanups with a few fixes mixed in.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSXjYA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynMYACgnSRP3GylwMywrkc9paVmDeiIgNwAn0N2sika
JEW7C3lkBJZJ7R6V/Ynm
=drla
-----END PGP SIGNATURE-----
Merge tag 'usb-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / PHY updates from Greg KH:
"Here is the big USB and PHY driver pull request for 5.3-rc1.
Lots of stuff here, all of which has been in linux-next for a while
with no reported issues. Nothing is earth-shattering, just constant
forward progress for more devices supported and cleanups and small
fixes:
- USB gadget driver updates and fixes
- new USB gadget driver for some hardware, followed by a quick revert
of those patches as they were not ready to be merged...
- PHY driver updates
- Lots of new driver additions and cleanups with a few fixes mixed
in"
* tag 'usb-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (145 commits)
Revert "usb: gadget: storage: Remove warning message"
Revert "dt-bindings: add binding for USBSS-DRD controller."
Revert "usb:gadget Separated decoding functions from dwc3 driver."
Revert "usb:gadget Patch simplify usb_decode_set_clear_feature function."
Revert "usb:gadget Simplify usb_decode_get_set_descriptor function."
Revert "usb:cdns3 Add Cadence USB3 DRD Driver"
Revert "usb:cdns3 Fix for stuck packets in on-chip OUT buffer."
usb :fsl: Change string format for errata property
usb: host: Stops USB controller init if PLL fails to lock
usb: linux/fsl_device: Add platform member has_fsl_erratum_a006918
usb: phy: Workaround for USB erratum-A005728
usb: fsl: Set USB_EN bit to select ULPI phy
usb: Handle USB3 remote wakeup for LPM enabled devices correctly
drivers/usb/typec/tps6598x.c: fix 4CC cmd write
drivers/usb/typec/tps6598x.c: fix portinfo width
usb: storage: scsiglue: Do not skip VPD if try_vpd_pages is set
usb: renesas_usbhs: add a workaround for a race condition of workqueue
usb: gadget: udc: renesas_usb3: remove redundant assignment to ret
usb: dwc2: use a longer AHB idle timeout in dwc2_core_reset()
USB: gadget: function: fix issue Unneeded variable: "value"
...
Here is the "large" TTY and Serial driver update for 5.3-rc1.
It's in the negative number of lines overall as we removed an obsolete
serial driver that was causing problems for some people who were trying
to clean up some apis (the mpsc.c driver, which only worked for some
pre-production hardware that no one has anymore.)
Other than that, lots of tiny changes, cleaning up small things along
with some platform-specific serial driver updates.
All of these have been in linux-next for a while now with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSXkOA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynidwCgoCX5L3zayZ7pZTWmtLsTOy0O0WYAn1gzleNg
sSRApC0DGfxOIrco7urV
=0KJa
-----END PGP SIGNATURE-----
Merge tag 'tty-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the "large" TTY and Serial driver update for 5.3-rc1.
It's in the negative number of lines overall as we removed an obsolete
serial driver that was causing problems for some people who were
trying to clean up some apis (the mpsc.c driver, which only worked for
some pre-production hardware that no one has anymore.)
Other than that, lots of tiny changes, cleaning up small things along
with some platform-specific serial driver updates.
All of these have been in linux-next for a while now with no reported
issues"
* tag 'tty-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (68 commits)
tty: serial: fsl_lpuart: add imx8qxp support
serial: imx: set_termios(): preserve RTS state
serial: imx: set_termios(): clarify RTS/CTS bits calculation
serial: imx: set_termios(): factor-out 'ucr2' initial value
serial: sh-sci: Terminate TX DMA during buffer flushing
serial: sh-sci: Fix TX DMA buffer flushing and workqueue races
serial: mpsc: Remove obsolete MPSC driver
serial: 8250: 8250_core: Fix missing unlock on error in serial8250_register_8250_port()
serial: stm32: add RX and TX FIFO flush
serial: stm32: add support of RX FIFO threshold
serial: stm32: add support of TX FIFO threshold
serial: stm32: update PIO transmission
serial: stm32: add support of timeout interrupt for RX
Revert "serial: 8250: Don't service RX FIFO if interrupts are disabled"
tty/serial/8250: use mctrl_gpio helpers
serial: mctrl_gpio: Check if GPIO property exisits before requesting it
serial: 8250: pericom_do_set_divisor can be static
tty: serial_core: Set port active bit in uart_port_activate
serial: 8250: Add MSR/MCR TIOCM conversion wrapper functions
serial: 8250: factor out serial8250_{set,clear}_THRI() helpers
...
Pull networking updates from David Miller:
"Some highlights from this development cycle:
1) Big refactoring of ipv6 route and neigh handling to support
nexthop objects configurable as units from userspace. From David
Ahern.
2) Convert explored_states in BPF verifier into a hash table,
significantly decreased state held for programs with bpf2bpf
calls, from Alexei Starovoitov.
3) Implement bpf_send_signal() helper, from Yonghong Song.
4) Various classifier enhancements to mvpp2 driver, from Maxime
Chevallier.
5) Add aRFS support to hns3 driver, from Jian Shen.
6) Fix use after free in inet frags by allocating fqdirs dynamically
and reworking how rhashtable dismantle occurs, from Eric Dumazet.
7) Add act_ctinfo packet classifier action, from Kevin
Darbyshire-Bryant.
8) Add TFO key backup infrastructure, from Jason Baron.
9) Remove several old and unused ISDN drivers, from Arnd Bergmann.
10) Add devlink notifications for flash update status to mlxsw driver,
from Jiri Pirko.
11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.
12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.
13) Various enhancements to ipv6 flow label handling, from Eric
Dumazet and Willem de Bruijn.
14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van
der Merwe, and others.
15) Various improvements to axienet driver including converting it to
phylink, from Robert Hancock.
16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.
17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana
Radulescu.
18) Add devlink health reporting to mlx5, from Moshe Shemesh.
19) Convert stmmac over to phylink, from Jose Abreu.
20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from
Shalom Toledo.
21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.
22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.
23) Track spill/fill of constants in BPF verifier, from Alexei
Starovoitov.
24) Support bounded loops in BPF, from Alexei Starovoitov.
25) Various page_pool API fixes and improvements, from Jesper Dangaard
Brouer.
26) Just like ipv4, support ref-countless ipv6 route handling. From
Wei Wang.
27) Support VLAN offloading in aquantia driver, from Igor Russkikh.
28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.
29) Add flower GRE encap/decap support to nfp driver, from Pieter
Jansen van Vuuren.
30) Protect against stack overflow when using act_mirred, from John
Hurley.
31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.
32) Use page_pool API in netsec driver, Ilias Apalodimas.
33) Add Google gve network driver, from Catherine Sullivan.
34) More indirect call avoidance, from Paolo Abeni.
35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.
36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.
37) Add MPLS manipulation actions to TC, from John Hurley.
38) Add sending a packet to connection tracking from TC actions, and
then allow flower classifier matching on conntrack state. From
Paul Blakey.
39) Netfilter hw offload support, from Pablo Neira Ayuso"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits)
net/mlx5e: Return in default case statement in tx_post_resync_params
mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
net: dsa: add support for BRIDGE_MROUTER attribute
pkt_sched: Include const.h
net: netsec: remove static declaration for netsec_set_tx_de()
net: netsec: remove superfluous if statement
netfilter: nf_tables: add hardware offload support
net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
net: flow_offload: add flow_block_cb_is_busy() and use it
net: sched: remove tcf block API
drivers: net: use flow block API
net: sched: use flow block API
net: flow_offload: add flow_block_cb_{priv, incref, decref}()
net: flow_offload: add list handling functions
net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()
net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
net: flow_offload: add flow_block_cb_setup_simple()
net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC
net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhhgAKCRCRxhvAZXjc
or7kAP9VzDcQaK/WoDd2ezh2C7Wh5hNy9z/qJVCa6Tb+N+g1UgEAxbhFUg55uGOA
JNf7fGar5JF5hBMIXR+NqOi1/sb4swg=
=ELWo
-----END PGP SIGNATURE-----
Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull clone3 system call from Christian Brauner:
"This adds the clone3 syscall which is an extensible successor to clone
after we snagged the last flag with CLONE_PIDFD during the 5.2 merge
window for clone(). It cleanly supports all of the flags from clone()
and thus all legacy workloads.
There are few user visible differences between clone3 and clone.
First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse
this flag. Second, the CSIGNAL flag is deprecated and will cause
EINVAL to be reported. It is superseeded by a dedicated "exit_signal"
argument in struct clone_args thus freeing up even more flags. And
third, clone3 gives CLONE_PIDFD a dedicated return argument in struct
clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr
argument.
The clone3 uapi is designed to be easy to handle on 32- and 64 bit:
/* uapi */
struct clone_args {
__aligned_u64 flags;
__aligned_u64 pidfd;
__aligned_u64 child_tid;
__aligned_u64 parent_tid;
__aligned_u64 exit_signal;
__aligned_u64 stack;
__aligned_u64 stack_size;
__aligned_u64 tls;
};
and a separate kernel struct is used that uses proper kernel typing:
/* kernel internal */
struct kernel_clone_args {
u64 flags;
int __user *pidfd;
int __user *child_tid;
int __user *parent_tid;
int exit_signal;
unsigned long stack;
unsigned long stack_size;
unsigned long tls;
};
The system call comes with a size argument which enables the kernel to
detect what version of clone_args userspace is passing in. clone3
validates that any additional bytes a given kernel does not know about
are set to zero and that the size never exceeds a page.
A nice feature is that this patchset allowed us to cleanup and
simplify various core kernel codepaths in kernel/fork.c by making the
internal _do_fork() function take struct kernel_clone_args even for
legacy clone().
This patch also unblocks the time namespace patchset which wants to
introduce a new CLONE_TIMENS flag.
Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and
xtensa. These were the architectures that did not require special
massaging.
Other architectures treat fork-like system calls individually and
after some back and forth neither Arnd nor I felt confident that we
dared to add clone3 unconditionally to all architectures. We agreed to
leave this up to individual architecture maintainers. This is why
there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3
which any architecture can set once it has implemented support for
clone3. The patch also adds a cond_syscall(clone3) for architectures
such as nios2 or h8300 that generate their syscall table by simply
including asm-generic/unistd.h. The hope is to get rid of
__ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon"
* tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
arch: handle arches who do not yet define clone3
arch: wire-up clone3() syscall
fork: add clone3
- Add support for chained PMU counters in guests
- Improve SError handling
- Handle Neoverse N1 erratum #1349291
- Allow side-channel mitigation status to be migrated
- Standardise most AArch64 system register accesses to msr_s/mrs_s
- Fix host MPIDR corruption on 32bit
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl0kge4VHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDYyQP/3XY5tFcLKkp/h9rnGaCXwAxhNzn
TyF/IZEFBKFTSoDMXKLLc8KllvoPQ7aUl03heYbuayYpyKR1+LCx7lDwu1MYyEf+
aSSuOKlbG//tLUEGp09pTRCgjs2mhhZYqOj5GF2mZ7xpovFVSNOPzTazbXDNQ7tw
zUAs43YNg+bUMwj+SLWpBlizjrLr7T34utIr6daKJE/GSfmIrcYXhGbZqUh0zbO0
z5LNasebws8/pHyeGI7+/yoMIKaQ8foMgywTpsRpBsx6YI+AbOLjEmCk2IBOPcEK
pm9KkSIBZEO2CSxZKl3NQiEow/Qd/lnz2xLMCSfh4XrYoI2Th4gNcsbJpiBDWP5a
0eZ5jSiexxKngIbM+to7jR3m0yc9RgcuzceJg3Uly7Ya0vb5RqKwOX4Ge4XP4VDT
DzIVFdQjxDKdVIf3EvGp1cj4P7dRUU3xbZcbzyuRPEmT3vgjEnbxawmPLs3QMAl1
31Wd2wIsPB86kSxzSMel27Vs5VgMhgyHE26zN91R745CvhDXaDKydIWjGjdVMHsB
GuX/h2kL+ohx+N/OpZPgwsVUAGLSOQFP3pE/EcGtqc2kkfqa+bx12DKcZ3zdmJvy
+cu5ixU8q5thPH/pZob/C3hKUY/eLy02emS34RK0Jh2sZHbQgAOtMsiqUxNHEjUm
6TkpdWa5SRd7CtGV
=yfCs
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm updates for 5.3
- Add support for chained PMU counters in guests
- Improve SError handling
- Handle Neoverse N1 erratum #1349291
- Allow side-channel mitigation status to be migrated
- Standardise most AArch64 system register accesses to msr_s/mrs_s
- Fix host MPIDR corruption on 32bit
Some events can provide a guest with information about other guests or the
host (e.g. L3 cache stats); providing the capability to restrict access
to a "safe" set of events would limit the potential for the PMU to be used
in any side channel attacks. This change introduces a new VM ioctl that
sets an event filter. If the guest attempts to program a counter for
any blacklisted or non-whitelisted event, the kernel counter won't be
created, so any RDPMC/RDMSR will show 0 instances of that event.
Signed-off-by: Eric Hankland <ehankland@google.com>
[Lots of changes. All remaining bugs are probably mine. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The V4L2_PIX_FMT_BGRA444 define clashed with the pre-existing V4L2_PIX_FMT_SGRBG12
which strangely enough used the same fourcc, even though that fourcc made no sense
for a Bayer format. In any case, you can't have duplicates, so change the fourcc of
V4L2_PIX_FMT_BGRA444.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: <stable@vger.kernel.org> # for v5.2 and up
Fixes: 6c84f9b1d2 ("media: v4l: Add definitions for missing 16-bit RGB4444 formats")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Pull m68nommu updates from Greg Ungerer:
"A series of cleanups for the FLAT format binary loader, binfmt_flat,
from Christoph.
The end goal is to support no-MMU on RISC-V, and the last patch
enables that"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
riscv: add binfmt_flat support
binfmt_flat: don't offset the data start
binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c
binfmt_flat: remove the persistent argument from flat_get_addr_from_rp
binfmt_flat: provide an asm-generic/flat.h
binfmt_flat: make support for old format binaries optional
binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option
binfmt_flat: add endianess annotations
binfmt_flat: use fixed size type for the on-disk format
binfmt_flat: consolidate two version of flat_v2_reloc_t
binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition
binfmt_flat: remove the uapi <linux/flat.h> header
binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable
binfmt_flat: remove flat_old_ram_flag
binfmt_flat: provide a default version of flat_get_relocate_addr
binfmt_flat: remove flat_set_persistent
binfmt_flat: remove flat_reloc_valid
Commit 9903c8dc73 changed TC_ETF defines to use _BITUL instead of BIT
but did not add the dependecy on linux/const.h. As a consequence,
importing the uapi headers into iproute2 causes builds to fail. Add
the dependency.
Fixes: 9903c8dc73 ("etf: Don't use BIT() in UAPI headers.")
Cc: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds hardware offload support for nftables through the
existing netdev_ops->ndo_setup_tc() interface, the TC_SETUP_CLSFLOWER
classifier and the flow rule API. This hardware offload support is
available for the NFPROTO_NETDEV family and the ingress hook.
Each nftables expression has a new ->offload interface, that is used to
populate the flow rule object that is attached to the transaction
object.
There is a new per-table NFT_TABLE_F_HW flag, that is set on to offload
an entire table, including all of its chains.
This patch supports for basic metadata (layer 3 and 4 protocol numbers),
5-tuple payload matching and the accept/drop actions; this also includes
basechain hardware offload only.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is done through IORING_OP_RECVMSG. This opcode uses the same
sqe->msg_flags that IORING_OP_SENDMSG added, and we pass in the
msghdr struct in the sqe->addr field as well.
We use MSG_DONTWAIT to force an inline fast path if recvmsg() doesn't
block, and punt to async execution if it would have.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is done through IORING_OP_SENDMSG. There's a new sqe->msg_flags
for the flags argument, and the msghdr struct is passed in the
sqe->addr field.
We use MSG_DONTWAIT to force an inline fast path if sendmsg() doesn't
block, and punt to async execution if it would have.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
New matches for conntrack mark, label, zone, and state.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow sending a packet to conntrack module for connection tracking.
The packet will be marked with conntrack connection's state, and
any metadata such as conntrack mark and label. This state metadata
can later be matched against with tc classifers, for example with the
flower classifier as below.
In addition to committing new connections the user can optionally
specific a zone to track within, set a mark/label and configure nat
with an address range and port range.
Usage is as follows:
$ tc qdisc add dev ens1f0_0 ingress
$ tc qdisc add dev ens1f0_1 ingress
$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 0 proto ip \
flower ip_proto tcp ct_state -trk \
action ct zone 2 pipe \
action goto chain 2
$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower ct_state +trk+new \
action ct zone 2 commit mark 0xbb nat src addr 5.5.5.7 pipe \
action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower ct_zone 2 ct_mark 0xbb ct_state +trk+est \
action ct nat pipe \
action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_1 ingress \
prio 1 chain 0 proto ip \
flower ip_proto tcp ct_state -trk \
action ct zone 2 pipe \
action goto chain 1
$ tc filter add dev ens1f0_1 ingress \
prio 1 chain 1 proto ip \
flower ct_zone 2 ct_mark 0xbb ct_state +trk+est \
action ct nat pipe \
action mirred egress redirect dev ens1f0_0
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Changelog:
V5->V6:
Added CONFIG_NF_DEFRAG_IPV6 in handle fragments ipv6 case
V4->V5:
Reordered nf_conntrack_put() in tcf_ct_skb_nfct_cached()
V3->V4:
Added strict_start_type for act_ct policy
V2->V3:
Fixed david's comments: Removed extra newline after rcu in tcf_ct_params , and indent of break in act_ct.c
V1->V2:
Fixed parsing of ranges TCA_CT_NAT_IPV6_MAX as 'else' case overwritten ipv4 max
Refactored NAT_PORT_MIN_MAX range handling as well
Added ipv4/ipv6 defragmentation
Removed extra skb pull push of nw offset in exectute nat
Refactored tcf_ct_skb_network_trim after pull
Removed TCA_ACT_CT define
Signed-off-by: David S. Miller <davem@davemloft.net>
In an eswitch, PCI VF may have port which is normally represented using
a representor netdevice.
To have better visibility of eswitch port, its association with VF,
and its representor netdevice, introduce a PCI VF port flavour.
When devlink port flavour is PCI VF, fill up PCI VF attributes of
the port.
Extend port name creation using PCI PF and VF number scheme on best
effort basis, so that vendor drivers can skip defining their own scheme.
$ devlink port show
pci/0000:05:00.0/0: type eth netdev eth0 flavour pcipf pfnum 0
pci/0000:05:00.0/1: type eth netdev eth1 flavour pcivf pfnum 0 vfnum 0
pci/0000:05:00.0/2: type eth netdev eth2 flavour pcivf pfnum 0 vfnum 1
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In an eswitch, PCI PF may have port which is normally represented
using a representor netdevice.
To have better visibility of eswitch port, its association with
PF and a representor netdevice, introduce a PCI PF port
flavour and port attriute.
When devlink port flavour is PCI PF, fill up PCI PF attributes of the
port.
Extend port name creation using PCI PF number on best effort basis.
So that vendor drivers can skip defining their own scheme.
$ devlink port show
pci/0000:05:00.0/0: type eth netdev eth0 flavour pcipf pfnum 0
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0jrIMQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgptlFD/9CNsBX+Aap2lO6wKNr6QISwNAK76GMzEay
s4LSY2kGkXvzv8i89mCuY+8UVNI8WH2/22WnU+8CBAJOjWyFQMsIwH/mrq0oZWRD
J6STJE8rTr6Fc2MvJUWryp/xdBh3+eDIsAdIZVHVAkIzqYPBnpIAwEIeIw8t0xsm
v9ngpQ3WD6ep8tOj9pnG1DGKFg1CmukZCC/Y4CQV1vZtmm2I935zUwNV/TB+Egfx
G8JSC0cSV02LMK88HCnA6MnC/XSUC0qgfXbnmP+TpKlgjVX+P/fuB3oIYcZEu2Rk
3YBpIkhsQytKYbF42KRLsmBH72u6oB9G+tNZTgB1STUDrZqdtD9xwX1rjDlY0ZzP
EUDnk48jl/cxbs+VZrHoE2TcNonLiymV7Kb92juHXdIYmKFQStprGcQUbMaTkMfB
6BYrYLifWx0leu1JJ1i7qhNmug94BYCSCxcRmH0p6kPazPcY9LXNmDWMfMuBPZT7
z79VLZnHF2wNXJyT1cBluwRYYJRT4osWZ3XUaBWFKDgf1qyvXJfrN/4zmgkEIyW7
ivXC+KLlGkhntDlWo2pLKbbyOIKY1HmU6aROaI11k5Zyh0ixKB7tHKavK39l+NOo
YB41+4l6VEpQEyxyRk8tO0sbHpKaKB+evVIK3tTwbY+Q0qTExErxjfWUtOgRWhjx
iXJssPRo4w==
=VSYT
-----END PGP SIGNATURE-----
Merge tag 'for-5.3/block-20190708' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"This is the main block updates for 5.3. Nothing earth shattering or
major in here, just fixes, additions, and improvements all over the
map. This contains:
- Series of documentation fixes (Bart)
- Optimization of the blk-mq ctx get/put (Bart)
- null_blk removal race condition fix (Bob)
- req/bio_op() cleanups (Chaitanya)
- Series cleaning up the segment accounting, and request/bio mapping
(Christoph)
- Series cleaning up the page getting/putting for bios (Christoph)
- block cgroup cleanups and moving it to where it is used (Christoph)
- block cgroup fixes (Tejun)
- Series of fixes and improvements to bcache, most notably a write
deadlock fix (Coly)
- blk-iolatency STS_AGAIN and accounting fixes (Dennis)
- Series of improvements and fixes to BFQ (Douglas, Paolo)
- debugfs_create() return value check removal for drbd (Greg)
- Use struct_size(), where appropriate (Gustavo)
- Two lighnvm fixes (Heiner, Geert)
- MD fixes, including a read balance and corruption fix (Guoqing,
Marcos, Xiao, Yufen)
- block opal shadow mbr additions (Jonas, Revanth)
- sbitmap compare-and-exhange improvemnts (Pavel)
- Fix for potential bio->bi_size overflow (Ming)
- NVMe pull requests:
- improved PCIe suspent support (Keith Busch)
- error injection support for the admin queue (Akinobu Mita)
- Fibre Channel discovery improvements (James Smart)
- tracing improvements including nvmetc tracing support (Minwoo Im)
- misc fixes and cleanups (Anton Eidelman, Minwoo Im, Chaitanya
Kulkarni)"
- Various little fixes and improvements to drivers and core"
* tag 'for-5.3/block-20190708' of git://git.kernel.dk/linux-block: (153 commits)
blk-iolatency: fix STS_AGAIN handling
block: nr_phys_segments needs to be zero for REQ_OP_WRITE_ZEROES
blk-mq: simplify blk_mq_make_request()
blk-mq: remove blk_mq_put_ctx()
sbitmap: Replace cmpxchg with xchg
block: fix .bi_size overflow
block: sed-opal: check size of shadow mbr
block: sed-opal: ioctl for writing to shadow mbr
block: sed-opal: add ioctl for done-mark of shadow mbr
block: never take page references for ITER_BVEC
direct-io: use bio_release_pages in dio_bio_complete
block_dev: use bio_release_pages in bio_unmap_user
block_dev: use bio_release_pages in blkdev_bio_end_io
iomap: use bio_release_pages in iomap_dio_bio_end_io
block: use bio_release_pages in bio_map_user_iov
block: use bio_release_pages in bio_unmap_user
block: optionally mark pages dirty in bio_release_pages
block: move the BIO_NO_PAGE_REF check into bio_release_pages
block: skd_main.c: Remove call to memset after dma_alloc_coherent
block: mtip32xx: Remove call to memset after dma_alloc_coherent
...
Many updates in this development cycle are found in ASoC where it got
a wide range of changes for the continued refactoring.
Some highlights are below.
ASoC:
* Continued refactoring work by Morimoto-san toward the full
componentization; the changes are seen allover the places
* Support for force disconnecting muxes in DAPM
* Continued development of ASoC Intel SOF stuff
* New drivers for Cirrus Logic CS47L35, CS47L85 and CS47L90,
Conexant CX2072X, Realtek RT1011 and RT1308
HD-audio:
* More fixes and adjustments for ASoC SOF HD-audio
* Fix for resume problem on some Realtek codecs
USB-audio:
* A few fixes for the issues reported by syzbot USB fuzzer
* Fix for UAC2 extension unit parser
* Quirks for Line6 Helix, Emgaic Unitor 8
FireWire:
* Lots of code refactoring and fixes in most of its components
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAl0kp+oOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE902A/+ISwG/QtN41cnJxBaEMMQryJ9jDG2ux89NSST
TFbmXuJj3CRsBoQ4gkqgRGtEvxf8DK5BohEl8AarxYaTrtZALEjLJCm9oQ5YUF1X
Vkwqw9cuj+lptZmB6j/t4f352edsvuf4ZyG6plb331xbzJzzh06s0PpUyIispdMH
hmxBPrKuWl1pPtpRaLqmhUrq3g4aMA3WKbVChwIdlMc3laJkbsIlMBHs74V4yo8s
MauzOdNaFFgr7KKjopWO6b/qLVRgcMsm5Cr/+PaIcylcbsJ9UiMew6s7H7CFjTj+
AzER9ydTBp9hmJHdXyVmEHzTpYJL+5nMUToYbv8on4gzScIVhimpAjJVDjT6ALui
q7+NyicdnGitpFyrbE0mS3Y0yUtbfwmJA1txrmpoyVnX9HhJ5ZNvRMEW0HmylKi5
CLvJcdg2XG42xCogaNtOkI54AAMZIDgulAQwuYemY5C/wAvsOxQjhjvfv8oTFMPv
3Q0BhTjuH7xACcxIBIZSvZmw822PB07/lrR8lMyJz4CTHJwNts1ycnc5Kw1n/8qs
73f6RZWiWI0eqMMm0Pahv6bGD9lLurp66Y0NnwBl7HXSvXAdL9L+xlzecqqTYeCI
r1czm7qli+mKZY9Wml/sfgXgBaLw0/UhhU99IyzrVYwrTrNRRAb6UJqCRmw0mTuj
1f0Q69w=
=H+RV
-----END PGP SIGNATURE-----
Merge tag 'sound-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"Many updates in this development cycle are found in ASoC where it got
a wide range of changes for the continued refactoring.
Some highlights are below.
ASoC:
- Continued refactoring work by Morimoto-san toward the full
componentization; the changes are seen allover the places
- Support for force disconnecting muxes in DAPM
- Continued development of ASoC Intel SOF stuff
- New drivers for Cirrus Logic CS47L35, CS47L85 and CS47L90, Conexant
CX2072X, Realtek RT1011 and RT1308
HD-audio:
- More fixes and adjustments for ASoC SOF HD-audio
- Fix for resume problem on some Realtek codecs
USB-audio:
- A few fixes for the issues reported by syzbot USB fuzzer
- Fix for UAC2 extension unit parser
- Quirks for Line6 Helix, Emgaic Unitor 8
FireWire:
- Lots of code refactoring and fixes in most of its components"
* tag 'sound-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (626 commits)
ALSA: firewire-lib: code refactoring for local variables
ALSA: firewire-lib: code refactoring for post operation to data block counter
ALSA: firewire-lib: code refactoring for error path of parser for CIP header
ALSA: firewire-lib: fix different data block counter between probed event and transferred isochronous packet
ALSA: firewire-lib: fix initial value of data block count for IR context without CIP_DBC_IS_END_EVENT
ALSA: firewire-lib/fireface: fix initial value of data block counter for IR context with CIP_NO_HEADER
ALSA: firewire-lib: fix invalid length of rx packet payload for tracepoint events
ALSA: usb-audio: fix Line6 Helix audio format rates
firewire-motu: fix wrong reference count for stream functionality at error path of rawmidi interface
ALSA: firewire-digi00x: fix wrong reference count for stream functionality at error path of rawmidi interface
ALSA: dice: fix wrong reference count for stream functionality at error path of rawmidi interface
ALSA: oxfw: fix wrong reference count for stream functionality at error path of rawmidi interface
ALSA: fireworks: fix wrong reference count for stream functionality at error path of rawmidi interface
ALSA: bebob: fix wrong reference count for stream functionality at error path of rawmidi interface
ASoC: SOF: Intel: implement runtime idle for CNL/APL
ASoC: SOF: add runtime idle callback
ASoC: hdac_hdmi: report codec link up/down status to bus
ASoC: SOF: debug: fix possible memory leak in sof_dfsentry_write()
ASoC: sunxi: sun50i-codec-analog: Add earpiece
ASoC: rt5665: remove redundant assignment to variable idx
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl0kcloACgkQCF8+vY7k
4RU0Mg//c0BK1VjPfh45k3HxDvvoQnaTlQjo1ApvEBa64TR10/JxXi9U+QuhY1H1
QEOjJJrLe3OdWrcBwFT5s15cBdRKn6jB8s67FgN7CFA8IzG4xBjOovOP4MXYXztz
TjNlkLwWkkwiQ4C99HInSsmI36ZbnEI3PloJXMrEBnsXQIazjRzMeJ3DKZggSSkN
jhedASDgnHgSun0rCFh0mN8k2kiMUZ/XxVIqiCuWT1AzyycoHp+HXm9YilnxD7q0
43X6v4HtxrfJMQWQ8z3Pxb4McHc1j3L8S7Mgu9oSf7oJb12grsDxokytUCnmhFz8
gvNgx3D2OTQk2nrQlxcfgrbqMs2KXLkXIiqg/dZ35hpNfXIOxiOx77zqqIsg2WHr
j4qaDcw+TNQU1eUIm0dIcPFi82EMOjAUqHRrvsg94EbBQ1dUniE4GCkCHFU+2TLz
YIaWXv/WmavrWTydsoIKqbBBOR9OY9+PDjkxgHsSVPSZ9sAskcPQ5MJdkatoZcu/
glZJD8HEe/w1vi1Ob3ZsXU7KzNWAhIkWvtEWsxAuFGzr8uNfwmzM2a5giNLIV94b
W+ZYhQT363uliVSOPFRjqqkVliC/HmV2lcNVzlVymWgeKhRj8fI5JqrDhpLz0XD+
9Rn90mweCzCSw7/fj/keP9mEnMce4XdkkDB/Pyf+dIMiLMOczoU=
=ABgl
-----END PGP SIGNATURE-----
Merge tag 'media/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- new Atmel microship ISC driver
- coda has gained support for mpeg2 and mpeg4
- cxusb gained support for analog TV
- rockchip staging driver was split into two separate staging drivers
- added a new staging driver for Allegro DVT video IP core
- added a new staging driver for Amlogic Meson video decoder
- lots of improvements and cleanups
* tag 'media/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (398 commits)
media: allegro: use new v4l2_m2m_ioctl_try_encoder_cmd funcs
media: doc-rst: Fix typos
media: radio-raremono: change devm_k*alloc to k*alloc
media: stv0297: fix frequency range limit
media: rc: Prefer KEY_NUMERIC_* for number buttons on remotes
media: dvb_frontend: split dvb_frontend_handle_ioctl function
media: mceusb: disable "nonsensical irdata" messages
media: rc: remove redundant dev_err message
media: cec-notifier: add new notifier functions
media: cec: add struct cec_connector_info support
media: cec-notifier: rename variables, check kstrdup and n->conn_name
media: MAINTAINERS: Add maintainers for Media Controller
media: staging: media: tegra-vde: Defer dmabuf's unmapping
media: staging: media: tegra-vde: Add IOMMU support
media: hdpvr: fix locking and a missing msleep
media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
media: atmel: atmel-isc: fix i386 build error
media: v4l2-ctrl: Move compound control initialization
media: hantro: Use vb2_get_buffer
media: pci: cx88: Change the type of 'missed' to u64
...
Including:
- Patches to make the dma-iommu code more generic so that it can
be used outside of the ARM context with other IOMMU drivers.
Goal is to make use of it on x86 too.
- Generic IOMMU domain support for the Intel VT-d driver. This
driver now makes more use of common IOMMU code to allocate
default domains for the devices it handles.
- An IOMMU fault reporting API to userspace. With that the IOMMU
fault handling can be done in user-space, for example to
forward the faults to a VM.
- Better handling for reserved regions requested by the
firmware. These can be 'relaxed' now, meaning that those don't
prevent a device being attached to a VM.
- Suspend/Resume support for the Renesas IOMMU driver.
- Added support for dumping SVA related fields of the DMAR table
in the Intel VT-d driver via debugfs.
- A pile of smaller fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl0jaOAACgkQK/BELZcB
GuNYTRAAjXuNs1OX/ROJ4TByT3YBWj5BLZVMrpGx75MAEvO68a7rTdaCpGuHv09w
5JneXxA3H2O1q6JCGguLm//Dy5QycJQIn0WuaBGi5Vjo+cGe47sf48Hb6AhIoAFf
exjhfmG6kL3ZzEkLU/7iJWqz+iiCBxRU0yMINJvzFFzliJkfyGWsjoruxobmUEhm
XdEsdeAQFXoPMNVCDiorM+B+3VCl7BfwfhdlFrmX3fpNrMZ5ytlnPTgfJdXAGgHn
DAoK8qGoyYtRnZnivn1y56M13qv5y3XvicKqx/GRKK8CyJpz/vYSBLktI5HO4tOF
y/VzNg+ctDSjQgFyKIq+38k6qu4YWctyxZNObmPVpioNf9j9GaZWT97fxGzXMNS4
G7TqOeq4yxjotAOcd+DdH265uVBtGmLqk7XobigxCZ0xzGVVnmPy4oJtkWwjaMgN
HtNSMPRRirjprK4BUoCiXwP8YyWZs0a93oJmh2orD406yZkHNuqLNaicu24WsKKf
Ze5BsFbNUb0qYdyL7ofOugQrOvYYoWYuREiz8ClF3U4mCcOk4FJ9K2Vf1kSfKBX8
1Skz75NeSZdzXg9JcznNMuiqH2mN3lx/88xKJg3LJxEeFxU++VDMX3Z8kEmpj5w+
9m/v+VRclUXL6g8s6QFNIlIenMknAEjtWw6hVWPhWLaULH6fR6Y=
=Kvk5
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Make the dma-iommu code more generic so that it can be used outside
of the ARM context with other IOMMU drivers. Goal is to make use of
it on x86 too.
- Generic IOMMU domain support for the Intel VT-d driver. This driver
now makes more use of common IOMMU code to allocate default domains
for the devices it handles.
- An IOMMU fault reporting API to userspace. With that the IOMMU fault
handling can be done in user-space, for example to forward the faults
to a VM.
- Better handling for reserved regions requested by the firmware. These
can be 'relaxed' now, meaning that those don't prevent a device being
attached to a VM.
- Suspend/Resume support for the Renesas IOMMU driver.
- Added support for dumping SVA related fields of the DMAR table in the
Intel VT-d driver via debugfs.
- A pile of smaller fixes and cleanups.
* tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (90 commits)
iommu/omap: No need to check return value of debugfs_create functions
iommu/arm-smmu-v3: Invalidate ATC when detaching a device
iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=n
iommu/vt-d: Cleanup unused variable
iommu/amd: Flush not present cache in iommu_map_page
iommu/amd: Only free resources once on init error
iommu/amd: Move gart fallback to amd_iommu_init
iommu/amd: Make iommu_disable safer
iommu/io-pgtable: Support non-coherent page tables
iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag
iommu/io-pgtable-arm: Add support to use system cache
iommu/arm-smmu-v3: Increase maximum size of queues
iommu/vt-d: Silence a variable set but not used
iommu/vt-d: Remove an unused variable "length"
iommu: Fix integer truncation
iommu: Add padding to struct iommu_fault
iommu/vt-d: Consolidate domain_init() to avoid duplication
iommu/vt-d: Cleanup after delegating DMA domain to generic iommu
iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
iommu/vt-d: Allow DMA domain attaching to rmrr locked device
...
Pull cgroup updates from Tejun Heo:
"Documentation updates and the addition of cgroup_parse_float() which
will be used by new controllers including blk-iocost"
* 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs: cgroup-v1: convert docs to ReST and rename to *.rst
cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS
cgroup: add cgroup_parse_float()
-----BEGIN PGP SIGNATURE-----
iQIVAwUAXRyyVvu3V2unywtrAQL3xQ//eifjlELkRAPm2EReWwwahdM+9QL/0bAy
e8eAzP9EaphQGUhpIzM9Y7Cx+a8XW2xACljY8hEFGyxXhDMoLa35oSoJOeay6vQt
QcgWnDYsET8Z7HOsFCP3ZQqlbbqfsB6CbIKtZoEkZ8ib7eXpYcy1qTydu7wqrl4A
AaJalAhlUKKUx9hkGGJTh2xvgmxgSJkxx3cNEWJQ2uGgY/ustBpqqT4iwFDsgA/q
fcYTQFfNQBsC8/SmvQgxJSc+reUdQdp0z1vd8qjpSdFFcTq1qOtK0qDdz1Bbyl24
hAxvNM1KKav83C8aF7oHhEwLrkD+XiYKixdEiCJJp+A2i+vy2v8JnfgtFTpTgLNK
5xu2VmaiWmee9SLCiDIBKE4Ghtkr8DQ/5cKFCwthT8GXgQUtdsdwAaT3bWdCNfRm
DqgU/AyyXhoHXrUM25tPeF3hZuDn2yy6b1TbKA9GCpu5TtznZIHju40Px/XMIpQH
8d6s/pg+u/SnkhjYWaTvTcvsQ2FB/vZY/UzAVyosnoMBkVfL4UtAHGbb8FBVj1nf
Dv5VjSjl4vFjgOr3jygEAeD2cJ7L6jyKbtC/jo4dnOmPrSRShIjvfSU04L3z7FZS
XFjMmGb2Jj8a7vAGFmsJdwmIXZ1uoTwX56DbpNL88eCgZWFPGKU7TisdIWAmJj8U
N9wholjHJgw=
=E3bF
-----END PGP SIGNATURE-----
Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull keyring ACL support from David Howells:
"This changes the permissions model used by keys and keyrings to be
based on an internal ACL by the following means:
- Replace the permissions mask internally with an ACL that contains a
list of ACEs, each with a specific subject with a permissions mask.
Potted default ACLs are available for new keys and keyrings.
ACE subjects can be macroised to indicate the UID and GID specified
on the key (which remain). Future commits will be able to add
additional subject types, such as specific UIDs or domain
tags/namespaces.
Also split a number of permissions to give finer control. Examples
include splitting the revocation permit from the change-attributes
permit, thereby allowing someone to be granted permission to revoke
a key without allowing them to change the owner; also the ability
to join a keyring is split from the ability to link to it, thereby
stopping a process accessing a keyring by joining it and thus
acquiring use of possessor permits.
- Provide a keyctl to allow the granting or denial of one or more
permits to a specific subject. Direct access to the ACL is not
granted, and the ACL cannot be viewed"
* tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Provide KEYCTL_GRANT_PERMISSION
keys: Replace uid/gid/perm permissions checking with an ACL
Currently, TC offers the ability to match on the MPLS fields of a packet
through the use of the flow_dissector_key_mpls struct. However, as yet, TC
actions do not allow the modification or manipulation of such fields.
Add a new module that registers TC action ops to allow manipulation of
MPLS. This includes the ability to push and pop headers as well as modify
the contents of new or existing headers. A further action to decrement the
TTL field of an MPLS header is also provided with a new helper added to
support this.
Examples of the usage of the new action with flower rules to push and pop
MPLS labels are:
tc filter add dev eth0 protocol ip parent ffff: flower \
action mpls push protocol mpls_uc label 123 \
action mirred egress redirect dev eth1
tc filter add dev eth0 protocol mpls_uc parent ffff: flower \
action mpls pop protocol ipv4 \
action mirred egress redirect dev eth1
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>