Commit Graph

5506 Commits (640d4ea83c8aa085407ea93aec65704bf51f4bae)

Author SHA1 Message Date
Linus Torvalds b5dd0c658c Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - some of the rest of MM

 - various misc things

 - dynamic-debug updates

 - checkpatch

 - some epoll speedups

 - autofs

 - rapidio

 - lib/, lib/lzo/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  samples/mic/mpssd/mpssd.h: remove duplicate header
  kernel/fork.c: remove duplicated include
  include/linux/relay.h: fix percpu annotation in struct rchan
  arch/nios2/mm/fault.c: remove duplicate include
  unicore32: stop printing the virtual memory layout
  MAINTAINERS: fix GTA02 entry and mark as orphan
  mm: create the new vm_fault_t type
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
  arch: simplify several early memory allocations
  openrisc: simplify pte_alloc_one_kernel()
  sh: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  powerpc: prefer memblock APIs returning virtual address
  lib/lzo: separate lzo-rle from lzo
  lib/lzo: implement run-length encoding
  lib/lzo: fast 8-byte copy on arm64
  lib/lzo: 64-bit CTZ on arm64
  lib/lzo: tidy-up ifdefs
  ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
  ipc: annotate implicit fall through
  ...
2019-03-07 19:25:37 -08:00
Oleg Nesterov 6eb3c3d0a5 exec: increase BINPRM_BUF_SIZE to 256
Large enterprise clients often run applications out of networked file
systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters.  Bumping this up
to the next order of two solves this problem in all but the most
egregious case while still fitting into a 512b slab.

[oleg@redhat.com: update comment, per Kees]
Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ben Woodard <woodard@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent 60d6d04ca3 autofs: add ignore mount option
Add an autofs file system mount option that can be used to provide a
generic indicator to applications that the mount entry should be ignored
when displaying mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list a no-op mount option ("ignore" is
the one use on the most common OS) is allowed so that autofs file system
users can optionally use it.

The idea is that it be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

Prior to the change to link /etc/mtab to /proc/self/mounts all I needed
to do to achieve this was to use mount(2) and not update the mtab but
now that no longer works.

I know the symlinking happened a long time ago and I considered doing
this then but, at the time I couldn't remember the commonly used option
name and thought persuading the various utility maintainers would be too
hard.

But now I have a RHEL request to do this for compatibility for a widely
used product so I want to go ahead with it and try and enlist the help
of some utility package maintainers.

Clearly, without the option nothing can be done so it's at least a
start.

Link: http://lkml.kernel.org/r/154725123970.11260.6113771566924907275.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Masahiro Yamada 54d50897d5 linux/kernel.h: split *_MAX and *_MIN macros into <linux/limits.h>
<linux/kernel.h> tends to be cluttered because we often put various sort
of unrelated stuff in it.  So, we have split out a sensible chunk of
code into a separate header from time to time.

This commit splits out the *_MAX and *_MIN defines.

The standard header <limits.h> contains various MAX, MIN constants
including numerial limits.  [1]

I think it makes sense to move in-kernel MAX, MIN constants into
include/linux/limits.h.

We already have include/uapi/linux/limits.h to contain some user-space
constants.  I changed its include guard to _UAPI_LINUX_LIMITS_H.  This
change has no impact to the user-space because
scripts/headers_install.sh rips off the '_UAPI' prefix from the include
guards of exported headers.

[1] http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html

Link: http://lkml.kernel.org/r/1549156242-20806-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Alex Elder <elder@linaro.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:31:59 -08:00
Linus Torvalds b1e243957e for-5.1-part1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlx9czQACgkQxWXV+ddt
 WDvC9w/8CxJf1/eZBqb+b+aA38kgZhoaNixMud/IW/IFmIlicX0PoDxk6dh1ZTA+
 3uej/7fyfwjNCVvtrPVVxdT8zhZgyJouHrbhG1PlDWtmTEV2VqV5pBG1xQtCwmZy
 oinQI5oYYM5Le5EXxRGH8TQs6Z3tFuLx2kcrVWBLFKoZ2kZBZxe6KykGF9izve4a
 sVjtOL1CEL1e00vrNLzUmch8qss9Cu0i3qd3k8UANp3SgKIaOkJt4S/HeEcLfy5J
 kf6hVKlgPDuakVtAJKyhbLVQsfHVNkfiyvplta9lDot/iJchJITTRkadP6LblVeo
 knl8V+VO9kzQUvGauxtu66Q3DJ/7mqbzHUwPISetdKCV9ZXkuPFHnu0AEP577mrx
 e1JAPA/a8lF3up5QhqIb0uzH3sczOd8nNN/b1Xnxl7Kogyl8SUjhmX3FFy88borj
 /8Ptv/fFMQZs9IJ0QWlkh5TKRXAtSNAzVy2FpkvLaO0k0gJKQjyJuTKV5ezv/PGU
 +4m5kDtfpyz//KAOZxq4lERj4EMIEDhHhNbA8Qqmdeoj7oaKZ+gW57enOXohCTbi
 gVE6xDr2u4oQ85j3JuQo5W5mZA4uza35Gh4t43n5akdrrkLFVLW584hDxShGx9uS
 B0maToGbzOdGJTZXZ2SLHZ5Da14Lzb/TooCufF8GMAISb99vbjw=
 =D9zc
 -----END PGP SIGNATURE-----

Merge tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This contains usual mix of new features, core changes and fixes; full
  list below. I'm planning second pull request, with a few more fixes
  that arrived recently but too close to merge window, will send it next
  week.

  New features:

   - support zstd compression levels

   - new ioctl to unregister a device from the module (ie. reverse of
     device scan)

   - scrub prints a message to log when it's about to start or finish

  Core changes:

   - qgroups can now skip part of a tree that does not get updated
     during relocation, because this does not affect the quota
     accounting, estimated speedup in run time is about 20%

   - the compression workspace management had to be enhanced due to zstd
     requirements

   - various enospc fixes, when there's high fragmentation the
     over-reservation can cause ENOSPC that might not happen after a
     flush, in such cases try to wait if the situation improves

  Fixes:

   - various ioctls could overwrite previous return value if
     copy_to_user fails, fix this so the original error is reported

   - more reclaim vs GFP_KERNEL fixes

   - other cleanups and refactoring

   - fix a (valid) lockdep warning in a test when device replace is
     destroying worker threads

   - make qgroup async transaction commit more aggressive, this avoids
     some 'quota limit reached' errors if there are not enough data to
     trigger transaction in order to flush

   - fix deadlock between snapshot deletion and quotas when backref
     walking is called from context that already holds the same locks

   - fsync fixes:
      - fix fsync after succession of renames of different files
      - fix fsync after succession of renames and unlink/rmdir"

* tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (92 commits)
  btrfs: Remove unnecessary casts in btrfs_read_root_item
  Btrfs: remove assertion when searching for a key in a node/leaf
  Btrfs: add missing error handling after doing leaf/node binary search
  btrfs: drop the lock on error in btrfs_dev_replace_cancel
  btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
  btrfs: init csum_list before possible free
  Btrfs: remove no longer needed range length checks for deduplication
  Btrfs: fix fsync after succession of renames and unlink/rmdir
  Btrfs: fix fsync after succession of renames of different files
  btrfs: honor path->skip_locking in backref code
  btrfs: qgroup: Make qgroup async transaction commit more aggressive
  btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record
  btrfs: scrub: remove unused nocow worker pointer
  btrfs: scrub: add assertions for worker pointers
  btrfs: scrub: convert scrub_workers_refcnt to refcount_t
  btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get
  btrfs: scrub: fix circular locking dependency warning
  btrfs: fix comment its device list mutex not volume lock
  btrfs: extent_io: Kill the forward declaration of flush_write_bio
  btrfs: Fix grossly misleading argument names in extent io search
  ...
2019-03-07 09:07:30 -08:00
Linus Torvalds 0556161ff9 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlx5SzMACgkQnJ2qBz9k
 QNnP0AgAl9HDtk436P4QPPFhdeXBB6uRTYU8wgZSropXMxoyUzqZotWkeUqUusHs
 I8BAJeQojZeKUnMwET1/RA+dMLsgAlMcxkM3+3eDCLF9wh/+wO2B2G3ywh8KMVed
 Sa3C4V/NZvZzAossoGDV/yWmK+ZYrrW8l/DM3LU54GV1NfAL+Khn4FNwtgWiYiP5
 S4RkRflzwhIEZJSZByMlCLcsrHl/ehtMJuR1opUPY1c0CY8iAGcIobSSzVFqv4f5
 ScRB56rnXqTt6CBGBcDIkWxWqEE9XuTdAn1AVMC7327UezyFNocXUZaqDzJcGcR/
 rM3sf+seZftbA4nid8dIhSRoqAf4Yw==
 =/DQr
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fanotify updates from Jan Kara:
 "Support for fanotify directory events and changes to make waiting for
  fanotify permission event response killable"

* tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (25 commits)
  fanotify: Make waits for fanotify events only killable
  fanotify: Use interruptible wait when waiting for permission events
  fanotify: Track permission event state
  fanotify: Simplify cleaning of access_list
  fsnotify: Create function to remove event from notification list
  fanotify: Move locking inside get_one_event()
  fanotify: Fold dequeue_event() into process_access_response()
  fanotify: Select EXPORTFS
  fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID
  fanotify: add support for create/attrib/move/delete events
  fanotify: support events with data type FSNOTIFY_EVENT_INODE
  fanotify: check FS_ISDIR flag instead of d_is_dir()
  fsnotify: report FS_ISDIR flag with MOVE_SELF and DELETE_SELF events
  fanotify: use vfs_get_fsid() helper instead of vfs_statfs()
  vfs: add vfs_get_fsid() helper
  fanotify: cache fsid in fsnotify_mark_connector
  fanotify: enable FAN_REPORT_FID init flag
  fanotify: copy event fid info to user
  fanotify: encode file identifier for FAN_REPORT_FID
  fanotify: open code fill_event_metadata()
  ...
2019-03-07 09:03:38 -08:00
Linus Torvalds 67e79a6dc2 TTY/Serial patches for 5.1-rc1
Here is the "big" patchset for the tty/serial driver layer for 5.1-rc1.
 
 It's really not all that big, nothing major here.
 
 There are a lot of tiny driver fixes and updates, combined with other
 cleanups for different serial drivers and the vt layer.  Full details
 are in the shortlog.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+g7A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymoRgCgy8MZ6vayopxNdRBnp4mMmgtaVYMAoMIPYq0V
 XuhV/ZSENWdef2QuaJ3O
 =nSZk
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here is the "big" patchset for the tty/serial driver layer for
  5.1-rc1.

  It's really not all that big, nothing major here.

  There are a lot of tiny driver fixes and updates, combined with other
  cleanups for different serial drivers and the vt layer. Full details
  are in the shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'tty-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (70 commits)
  tty: xilinx_uartps: Correct return value in probe
  serial: sprd: Modify the baud rate calculation formula
  dt-bindings: serial: Add Milbeaut serial driver description
  serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
  serial: 8250_pxa: honor the port number from devicetree
  tty: hvc_xen: Mark expected switch fall-through
  tty: n_gsm: Mark expected switch fall-throughs
  tty: serial: msm_serial: Remove __init from msm_console_setup()
  tty: serial: samsung: Enable baud clock during initialisation
  serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
  tty: serial: remove redundant likely annotation
  tty/n_hdlc: mark expected switch fall-through
  serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
  serial: 8250_pci: Fix number of ports for ACCES serial cards
  vt: perform safe console erase in the right order
  tty/nozomi: use pci_iomap instead of ioremap_nocache
  tty/synclink: remove ISA support
  serial: 8250_pci: Replace custom code with pci_match_id()
  serial: max310x: Correction of the initial setting of the MODE1 bits for various supported ICs.
  serial: mps2-uart: Add parentheses around conditional in mps2_uart_shutdown
  ...
2019-03-06 16:35:12 -08:00
Linus Torvalds e266ca36da Staging/IIO patches for 5.1-rc1
Here is the big staging/iio driver pull request for 5.1-rc1.
 
 Lots of good IIO driver updates and cleanups in here as always.
 Combined with the removal of the xgifb driver, we have a net "loss" of
 over 9000 lines in the pull request, always a nice thing.
 
 As the outreachy application process is currently happening, there are
 loads of tiny checkpatch cleanup fixes all over the staging tree, which
 accounts for the majority of the fixups.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+gLQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymBiQCeJpoBhG+W3r+kP8w65ZY8qU+/liIAn0Tkl4/k
 IX1dQzCsEpO1jA8AHj6n
 =7wCH
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO updates from Greg KH:
 "Here is the big staging/iio driver pull request for 5.1-rc1.

  Lots of good IIO driver updates and cleanups in here as always.
  Combined with the removal of the xgifb driver, we have a net "loss" of
  over 9000 lines in the pull request, always a nice thing.

  As the outreachy application process is currently happening, there are
  loads of tiny checkpatch cleanup fixes all over the staging tree,
  which accounts for the majority of the fixups"

* tag 'staging-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (341 commits)
  staging: mt7621-dma: remove license boilerplate text
  staging: mt7621-dma: add SPDX GPL-2.0+ license identifier
  Staging: ks7010: Replace typecast to int
  Staging: vt6655: Align a static function declaration
  staging: speakup: fix line over 80 characters.
  staging: mt7621-eth: Remove license boilerplate text
  staging: mt7621-eth: Add SPDX license identifier
  staging: ks7010: removed custom Michael MIC implementation.
  staging: rtl8192e: Fix space and suspect issue
  Staging: vt6655: Modify comment style of SPDX License Identifier
  Staging: vt6655: Modify comment style for SPDX-License-Identifier
  Staging: vt6655: Align a function declaration
  Staging: vt6655: Alignment of function declaration
  staging: rtl8712: Fix indentation issue
  staging: wilc1000: fix incorrent type in initializer
  staging: rtl8188eu: remove unused P2P_PRIVATE_IOCTL_SET_LEN
  staging: rtl8188eu: remove unused enum P2P_PROTO_WK_ID
  staging: rtl8723bs: Remove duplicated include from drv_types.h
  Staging: vt6655: Alignment should match open parenthesis
  staging: erofs: fix mis-acted TAIL merging behavior
  ...
2019-03-06 16:29:27 -08:00
Linus Torvalds 45763bf4bc Char/Misc driver patches for 5.1-rc1
Here is the big char/misc driver patch pull request for 5.1-rc1.
 
 The largest thing by far is the new habanalabs driver for their AI
 accelerator chip.  For now it is in the drivers/misc directory but will
 probably move to a new directory soon along with other drivers of this
 type.
 
 Other than that, just the usual set of individual driver updates and
 fixes.  There's an "odd" merge in here from the DRM tree that they asked
 me to do as the MEI driver is starting to interact with the i915 driver,
 and it needed some coordination.  All of those patches have been
 properly acked by the relevant subsystem maintainers.
 
 All of these have been in linux-next with no reported issues, most for
 quite some time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+dPQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym1fACgvpZAxjNzoRQJ6f06tc8ujtPk9rUAnR+tCtrZ
 9e3l7H76oe33o96Qjhor
 =8A2k
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big char/misc driver patch pull request for 5.1-rc1.

  The largest thing by far is the new habanalabs driver for their AI
  accelerator chip. For now it is in the drivers/misc directory but will
  probably move to a new directory soon along with other drivers of this
  type.

  Other than that, just the usual set of individual driver updates and
  fixes. There's an "odd" merge in here from the DRM tree that they
  asked me to do as the MEI driver is starting to interact with the i915
  driver, and it needed some coordination. All of those patches have
  been properly acked by the relevant subsystem maintainers.

  All of these have been in linux-next with no reported issues, most for
  quite some time"

* tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits)
  habanalabs: adjust Kconfig to fix build errors
  habanalabs: use %px instead of %p in error print
  habanalabs: use do_div for 64-bit divisions
  intel_th: gth: Fix an off-by-one in output unassigning
  habanalabs: fix little-endian<->cpu conversion warnings
  habanalabs: use NULL to initialize array of pointers
  habanalabs: fix little-endian<->cpu conversion warnings
  habanalabs: soft-reset device if context-switch fails
  habanalabs: print pointer using %p
  habanalabs: fix memory leak with CBs with unaligned size
  habanalabs: return correct error code on MMU mapping failure
  habanalabs: add comments in uapi/misc/habanalabs.h
  habanalabs: extend QMAN0 job timeout
  habanalabs: set DMA0 completion to SOB 1007
  habanalabs: fix validation of WREG32 to DMA completion
  habanalabs: fix mmu cache registers init
  habanalabs: disable CPU access on timeouts
  habanalabs: add MMU DRAM default page mapping
  habanalabs: Dissociate RAZWI info from event types
  misc/habanalabs: adjust Kconfig to fix build errors
  ...
2019-03-06 14:18:59 -08:00
Linus Torvalds 8dcd175bc3 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
  tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
  proc: more robust bulk read test
  proc: test /proc/*/maps, smaps, smaps_rollup, statm
  proc: use seq_puts() everywhere
  proc: read kernel cpu stat pointer once
  proc: remove unused argument in proc_pid_lookup()
  fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
  fs/proc/self.c: code cleanup for proc_setup_self()
  proc: return exit code 4 for skipped tests
  mm,mremap: bail out earlier in mremap_to under map pressure
  mm/sparse: fix a bad comparison
  mm/memory.c: do_fault: avoid usage of stale vm_area_struct
  writeback: fix inode cgroup switching comment
  mm/huge_memory.c: fix "orig_pud" set but not used
  mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
  mm/memcontrol.c: fix bad line in comment
  mm/cma.c: cma_declare_contiguous: correct err handling
  mm/page_ext.c: fix an imbalance with kmemleak
  mm/compaction: pass pgdat to too_many_isolated() instead of zone
  mm: remove zone_lru_lock() function, access ->lru_lock directly
  ...
2019-03-06 10:31:36 -08:00
Linus Torvalds fa29f5ba42 asm-generic changes for v5.1
Only a few small changes this time:
 
 - Michael S. Tsirkin cleans up linux/mman.h
 - Mike Rapoport found a typo
 
 I had originally merged another cleanup series for I/O accessors from
 Hugo Lefeuvre as well, but dropped it after the discussion of the barrier
 semantics and some conflicts. I expect this series to get merged for a
 later release though.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcf9yrAAoJEGCrR//JCVInohIP+wcCZ3XEym0oyb9C5BgIdnMF
 qq69wbOT9ypAnB469PPjsvGhyCsqGo1Fm2Cludac0C/OfmB/lYJCqklAJrc6a7Cp
 jXD5rQkXQZGdmFHS81i24ejcNV9F5fh16vyDwqCgUQDCgg9MeDPirvwaDl958rhq
 5dUsoUu+CJRI35jZ8NPWbsSU5Wa2BpckWnuTs97PtlLnMH/RzEyxmSRysxtHNA7S
 PgYjvWBHbRK6mTNgcY/eojAoQuQmBgiOppYX1XTffHZgwY/VBIjYL3Wxwe1qbvSN
 vvLgE6nvvmRYjdq4VOoVbi9BOtWiCtXw29TWXfxSetN9CWCEwIQJdIm8lHn1CJRh
 6GnOdb8ADqAzraxo7GOf78kNsekl3mxh+QYN5gDaGS6eQASCNlGx70zUh7Y6JcE5
 6NJ8VMy0oEEFzHAsE0mxluu8LHL3F1hwS832D6mQ697Z71T+IoHgIcMT2jHqp9fw
 7/D2taoBAXdGqhToY3hhMMWsi4lFCvxjVVlxCxhp7Ik+cyOgW0O5vG2afLOHrtti
 vWEcn7M6nuKvb3MxBVjg8sK2ln6vIXNgZGjVmkxn70ZAmvZJX+KE+G1cvB2gMGc0
 S/ogtpbETMgMCwuAf2hdYgiBFJpZL725DEWFfS+p+02PTwjdKF+EWAS4swZM0sCE
 U1v1N7yCe/Wb/ijEm0M1
 =C9p7
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "Only a few small changes this time:

   - Michael S. Tsirkin cleans up linux/mman.h

   - Mike Rapoport found a typo

  I had originally merged another cleanup series for I/O accessors from
  Hugo Lefeuvre as well, but dropped it after the discussion of the
  barrier semantics and some conflicts. I expect this series to get
  merged for a later release though"

* tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic/page.h: fix typo in #error text requiring a real asm/page.h
  arch: move common mmap flags to linux/mman.h
  drm: tweak header name
  x86/mpx: tweak header name
2019-03-06 09:18:43 -08:00
Linus Torvalds 203b6609e0 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of tooling updates - too many to list, here's a few highlights:

   - Various subcommand updates to 'perf trace', 'perf report', 'perf
     record', 'perf annotate', 'perf script', 'perf test', etc.

   - CPU and NUMA topology and affinity handling improvements,

   - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

   - BPF updates

   - Tons of infrastructure updates, both on the build system and the
     library support side

   - Documentation updates.

   - ... and lots of other changes, see the changelog for details.

  Kernel side updates:

   - Tighten up kprobes blacklist handling, reduce the number of places
     where developers can install a kprobe and hang/crash the system.

   - Fix/enhance vma address filter handling.

   - Various PMU driver updates, small fixes and additions.

   - refcount_t conversions

   - BPF updates

   - error code propagation enhancements

   - misc other changes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
  perf script python: Add Python3 support to syscall-counts-by-pid.py
  perf script python: Add Python3 support to syscall-counts.py
  perf script python: Add Python3 support to stat-cpi.py
  perf script python: Add Python3 support to stackcollapse.py
  perf script python: Add Python3 support to sctop.py
  perf script python: Add Python3 support to powerpc-hcalls.py
  perf script python: Add Python3 support to net_dropmonitor.py
  perf script python: Add Python3 support to mem-phys-addr.py
  perf script python: Add Python3 support to failed-syscalls-by-pid.py
  perf script python: Add Python3 support to netdev-times.py
  perf tools: Add perf_exe() helper to find perf binary
  perf script: Handle missing fields with -F +..
  perf data: Add perf_data__open_dir_data function
  perf data: Add perf_data__(create_dir|close_dir) functions
  perf data: Fail check_backup in case of error
  perf data: Make check_backup work over directories
  perf tools: Add rm_rf_perf_data function
  perf tools: Add pattern name checking to rm_rf
  perf tools: Add depth checking to rm_rf
  perf data: Add global path holder
  ...
2019-03-06 07:59:36 -08:00
Joel Fernandes (Google) ab3948f58f mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd
Android uses ashmem for sharing memory regions.  We are looking forward
to migrating all usecases of ashmem to memfd so that we can possibly
remove the ashmem driver in the future from staging while also
benefiting from using memfd and contributing to it.  Note staging
drivers are also not ABI and generally can be removed at anytime.

One of the main usecases Android has is the ability to create a region
and mmap it as writeable, then add protection against making any
"future" writes while keeping the existing already mmap'ed
writeable-region active.  This allows us to implement a usecase where
receivers of the shared memory buffer can get a read-only view, while
the sender continues to write to the buffer.  See CursorWindow
documentation in Android for more details:

  https://developer.android.com/reference/android/database/CursorWindow

This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
which prevents any future mmap and write syscalls from succeeding while
keeping the existing mmap active.

A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week
where we don't need to modify core VFS structures to get the same
behavior of the seal.  This solves several side-effects pointed by Andy.
self-tests are provided in later patch to verify the expected semantics.

[1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/

Thanks a lot to Andy for suggestions to improve code.

Link: http://lkml.kernel.org/r/20190112203816.85534-2-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Marc-Andr Lureau <marcandre.lureau@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:19 -08:00
David Hildenbrand ca215086b1 mm: convert PG_balloon to PG_offline
PG_balloon was introduced to implement page migration/compaction for
pages inflated in virtio-balloon.  Nowadays, it is only a marker that a
page is part of virtio-balloon and therefore logically offline.

We also want to make use of this flag in other balloon drivers - for
inflated pages or when onlining a section but keeping some pages offline
(e.g.  used right now by XEN and Hyper-V via set_online_page_callback()).

We are going to expose this flag to dump tools like makedumpfile.  But
instead of exposing PG_balloon, let's generalize the concept of marking
pages as logically offline, so it can be reused for other purposes later
on.

Rename PG_balloon to PG_offline.  This is an indicator that the page is
logically offline, the content stale and that it should not be touched
(e.g.  a hypervisor would have to allocate backing storage in order for
the guest to dump an unused page).  We can then e.g.  exclude such pages
from dumps.

We replace and reuse KPF_BALLOON (23), as this shouldn't really harm
(and for now the semantics stay the same).  In following patches, we
will make use of this bit also in other balloon drivers.  While at it,
document PGTABLE.

[akpm@linux-foundation.org: fix comment text, per David]
Link: http://lkml.kernel.org/r/20181119101616.8901-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Pankaj gupta <pagupta@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Linus Torvalds b1b988a6a0 Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull year 2038 updates from Thomas Gleixner:
 "Another round of changes to make the kernel ready for 2038. After lots
  of preparatory work this is the first set of syscalls which are 2038
  safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceiv_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

  The syscall numbers are identical all over the architectures"

* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  riscv: Use latest system call ABI
  checksyscalls: fix up mq_timedreceive and stat exceptions
  unicore32: Fix __ARCH_WANT_STAT64 definition
  asm-generic: Make time32 syscall numbers optional
  asm-generic: Drop getrlimit and setrlimit syscalls from default list
  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
  compat ABI: use non-compat openat and open_by_handle_at variants
  y2038: add 64-bit time_t syscalls to all 32-bit architectures
  y2038: rename old time and utime syscalls
  y2038: remove struct definition redirects
  y2038: use time32 syscall names on 32-bit
  syscalls: remove obsolete __IGNORE_ macros
  y2038: syscalls: rename y2038 compat syscalls
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  timex: change syscalls to use struct __kernel_timex
  timex: use __kernel_timex internally
  sparc64: add custom adjtimex/clock_adjtime functions
  time: fix sys_timer_settime prototype
  time: Add struct __kernel_timex
  time: make adjtime compat handling available for 32 bit
  ...
2019-03-05 14:08:26 -08:00
Linus Torvalds edaed168e1 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti update from Thomas Gleixner:
 "Just a single change from the anti-performance departement:

   - Add a new PR_SPEC_DISABLE_NOEXEC option which allows to apply the
     speculation protections on a process without inheriting the state
     on exec.

     This remedies a situation where a Java-launcher has speculation
     protections enabled because that's the default for JVMs which
     causes the launched regular harmless processes to inherit the
     protection state which results in unintended performance
     degradation"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
2019-03-05 12:50:34 -08:00
David S. Miller f7fb7c1a1c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-03-04

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP support to libbpf. Rationale is to facilitate writing
   AF_XDP applications by offering higher-level APIs that hide many
   of the details of the AF_XDP uapi. Sample programs are converted
   over to this new interface as well, from Magnus.

2) Introduce a new cant_sleep() macro for annotation of functions
   that cannot sleep and use it in BPF_PROG_RUN() to assert that
   BPF programs run under preemption disabled context, from Peter.

3) Introduce per BPF prog stats in order to monitor the usage
   of BPF; this is controlled by kernel.bpf_stats_enabled sysctl
   knob where monitoring tools can make use of this to efficiently
   determine the average cost of programs, from Alexei.

4) Split up BPF selftest's test_progs similarly as we already
   did with test_verifier. This allows to further reduce merge
   conflicts in future and to get more structure into our
   quickly growing BPF selftest suite, from Stanislav.

5) Fix a bug in BTF's dedup algorithm which can cause an infinite
   loop in some circumstances; also various BPF doc fixes and
   improvements, from Andrii.

6) Various BPF sample cleanups and migration to libbpf in order
   to further isolate the old sample loader code (so we can get
   rid of it at some point), from Jakub.

7) Add a new BPF helper for BPF cgroup skb progs that allows
   to set ECN CE code point and a Host Bandwidth Manager (HBM)
   sample program for limiting the bandwidth used by v2 cgroups,
   from Lawrence.

8) Enable write access to skb->queue_mapping from tc BPF egress
   programs in order to let BPF pick TX queue, from Jesper.

9) Fix a bug in BPF spinlock handling for map-in-map which did
   not propagate spin_lock_off to the meta map, from Yonghong.

10) Fix a bug in the new per-CPU BPF prog counters to properly
    initialize stats for each CPU, from Eric.

11) Add various BPF helper prototypes to selftest's bpf_helpers.h,
    from Willem.

12) Fix various BPF samples bugs in XDP and tracing progs,
    from Toke, Daniel and Yonghong.

13) Silence preemption splat in test_bpf after BPF_PROG_RUN()
    enforces it now everywhere, from Anders.

14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to
    get error handling working, from Dan.

15) Fix bpftool documentation and auto-completion with regards
    to stream_{verdict,parser} attach types, from Alban.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04 10:14:31 -08:00
Francesco Ruggeri 9036b2fe09 net: ipv6: add socket option IPV6_ROUTER_ALERT_ISOLATE
By default IPv6 socket with IPV6_ROUTER_ALERT socket option set will
receive all IPv6 RA packets from all namespaces.
IPV6_ROUTER_ALERT_ISOLATE socket option restricts packets received by
the socket to be only from the socket's namespace.

Signed-off-by: Maxim Martynov <maxim@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:05:10 -08:00
Kevin Darbyshire-Bryant 0b5c7efdfc sch_cake: Permit use of connmarks as tin classifiers
Add flag 'FWMARK' to enable use of firewall connmarks as tin selector.
The connmark (skbuff->mark) needs to be in the range 1->tin_cnt ie.
for diffserv3 the mark needs to be 1->3.

Background

Typically CAKE uses DSCP as the basis for tin selection.  DSCP values
are relatively easily changed as part of the egress path, usually with
iptables & the mangle table, ingress is more challenging.  CAKE is often
used on the WAN interface of a residential gateway where passthrough of
DSCP from the ISP is either missing or set to unhelpful values thus use
of ingress DSCP values for tin selection isn't helpful in that
environment.

An approach to solving the ingress tin selection problem is to use
CAKE's understanding of tc filters.  Naive tc filters could match on
source/destination port numbers and force tin selection that way, but
multiple filters don't scale particularly well as each filter must be
traversed whether it matches or not. e.g. a simple example to map 3
firewall marks to tins:

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
tc filter add dev $DEV parent $MAJOR protocol all handle 0x02 fw action skbedit priority ${MAJOR}2
tc filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3

Another option is to use eBPF cls_act with tc filters e.g.

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR bpf da obj my-bpf-fwmark-to-class.o

This has the disadvantages of a) needing someone to write & maintain
the bpf program, b) a bpf toolchain to compile it and c) needing to
hardcode the major number in the bpf program so it matches the cake
instance (or forcing the cake instance to a particular major number)
since the major number cannot be passed to the bpf program via tc
command line.

As already hinted at by the previous examples, it would be helpful
to associate tins with something that survives the Internet path and
ideally allows tin selection on both egress and ingress.  Netfilter's
conntrack permits setting an identifying mark on a connection which
can also be restored to an ingress packet with tc action connmark e.g.

tc filter add dev eth0 parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev ifb1

Since tc's connmark action has restored any connmark into skb->mark,
any of the previous solutions are based upon it and in one form or
another copy that mark to the skb->priority field where again CAKE
picks this up.

This change cuts out at least one of the (less intuitive &
non-scalable) middlemen and permit direct access to skb->mark.

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
brakmo f7c917ba11 bpf: add bpf helper bpf_skb_ecn_set_ce
This patch adds a new bpf helper BPF_FUNC_skb_ecn_set_ce
"int bpf_skb_ecn_set_ce(struct sk_buff *skb)". It is added to
BPF_PROG_TYPE_CGROUP_SKB typed bpf_prog which currently can
be attached to the ingress and egress path. The helper is needed
because his type of bpf_prog cannot modify the skb directly.

This helper is used to set the ECN field of ECN capable IP packets to ce
(congestion encountered) in the IPv6 or IPv4 header of the skb. It can be
used by a bpf_prog to manage egress or ingress network bandwdith limit
per cgroupv2 by inducing an ECN response in the TCP sender.
This works best when using DCTCP.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-02 10:48:27 -08:00
Ingo Molnar 9ed8f1a6e7 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:27:17 +01:00
Alexei Starovoitov 5f8f8b93ae bpf: expose program stats via bpf_prog_info
Return bpf program run_time_ns and run_cnt via bpf_prog_info

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-27 17:22:50 +01:00
Mohit P. Tahiliani 3f7ae5f3dc net: sched: pie: add more cases to auto-tune alpha and beta
The current implementation scales the local alpha and beta
variables in the calculate_probability function by the same
amount for all values of drop probability below 1%.

RFC 8033 suggests using additional cases for auto-tuning
alpha and beta when the drop probability is less than 1%.

In order to add more auto-tuning cases, MAX_PROB must be
scaled by u64 instead of u32 to prevent underflow when
scaling the local alpha and beta variables in the
calculate_probability function.

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 14:21:03 -08:00
Anand Jain 228a73abde btrfs: introduce new ioctl to unregister a btrfs device
Support for a new command that can be used eg. as a command

  $ btrfs device scan --forget [dev]'
(the final name may change though)

to undo the effects of 'btrfs device scan [dev]'. For this purpose
this patch proposes to use ioctl #5 as it was empty and is next to the
SCAN ioctl.

The new ioctl BTRFS_IOC_FORGET_DEV works only on the control device
(/dev/btrfs-control) to unregister one or all devices, devices that are
not mounted.

The argument is struct btrfs_ioctl_vol_args, ::name specifies the device
path. To unregister all device, the path is an empty string.

Again, the devices are removed only if they aren't part of a mounte
filesystem.

This new ioctl provides:

- release of unwanted btrfs_fs_devices and btrfs_devices structures
  from memory if the device is not going to be mounted

- ability to mount filesystem in degraded mode, when one devices is
  corrupted like in split brain raid1

- running test cases which would require reloading the kernel module
  but this is not possible eg. due to mounted filesystem or built-in

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:30 +01:00
Heiner Kallweit e728fdf062 net: phy: improve definition of __ETHTOOL_LINK_MODE_MASK_NBITS
The way to define __ETHTOOL_LINK_MODE_MASK_NBITS seems to be overly
complicated, go with a standard approach instead.
Whilst we're at it, move the comment to the right place.

v2:
- rebased

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 17:33:59 -08:00
Callum Sinclair ca8d4794f6 ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
Currently the only way to clear the forwarding cache was to delete the
entries one by one using the MRT_DEL_MFC socket option or to destroy and
recreate the socket.

Create a new socket option which with the use of optional flags can
clear any combination of multicast entries (static or not static) and
multicast vifs (static or not static).

Calling the new socket option MRT_FLUSH with the flags MRT_FLUSH_MFC and
MRT_FLUSH_VIFS will clear all entries and vifs on the socket except for
static entries.

Signed-off-by: Callum Sinclair <callum.sinclair@alliedtelesis.co.nz>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 13:05:05 -08:00
Aya Levin 54719527fd devlink: Rename devlink health attributes
Rename devlink health attributes for better reflect the attributes use.
Add COUNT prefix on error counter attribute and recovery counter
attribute.

Fixes: 7afe335a8b ("devlink: Add health get command")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 10:38:51 -08:00
Aya Levin 2736d94f35 ethtool: Added support for 50Gbps per lane link modes
Added support for 50Gbps per lane link modes. Define various 50G, 100G
and 200G link modes using it.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-19 14:15:02 -08:00
David S. Miller 8bbed40f10 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for you net-next
tree:

1) Missing NFTA_RULE_POSITION_ID netlink attribute validation,
   from Phil Sutter.

2) Restrict matching on tunnel metadata to rx/tx path, from wenxu.

3) Avoid indirect calls for IPV6=y, from Florian Westphal.

4) Add two indirections to prepare merger of IPV4 and IPV6 nat
   modules, from Florian Westphal.

5) Broken indentation in ctnetlink, from Colin Ian King.

6) Patches to use struct_size() from netfilter and IPVS,
   from Gustavo A. R. Silva.

7) Display kernel splat only once in case of racing to confirm
   conntrack from bridge plus nfqueue setups, from Chieh-Min Wang.

8) Skip checksum validation for layer 4 protocols that don't need it,
   patch from Alin Nastac.

9) Sparse warning due to symbol that should be static in CLUSTERIP,
   from Wei Yongjun.

10) Add new toggle to disable SDP payload translation when media
    endpoint is reachable though the same interface as the signalling
    peer, from Alin Nastac.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-18 11:38:30 -08:00
Michael S. Tsirkin 746c9398f5 arch: move common mmap flags to linux/mman.h
Now that we have 3 mmap flags shared by all architectures,
let's move them into the common header.

This will help discourage future architectures from duplicating code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-18 17:49:30 +01:00
Jakub Kicinski 76726ccb7f devlink: add flash update command
Add devlink flash update command. Advanced NICs have firmware
stored in flash and often cryptographically secured. Updating
that flash is handled by management firmware. Ethtool has a
flash update command which served us well, however, it has two
shortcomings:
 - it takes rtnl_lock unnecessarily - really flash update has
   nothing to do with networking, so using a networking device
   as a handle is suboptimal, which leads us to the second one:
 - it requires a functioning netdev - in case device enters an
   error state and can't spawn a netdev (e.g. communication
   with the device fails) there is no netdev to use as a handle
   for flashing.

Devlink already has the ability to report the firmware versions,
now with the ability to update the firmware/flash we will be
able to recover devices in bad state.

To enable updates of sub-components of the FW allow passing
component name.  This name should correspond to one of the
versions reported in devlink info.

v1: - replace target id with component name (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-17 15:27:38 -08:00
David S. Miller 885e631959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.

2) test all bpf progs in alu32 mode, from Jiong.

3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.

4) support for IP encap in lwt bpf progs, from Peter.

5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:56:34 -08:00
David S. Miller 3313da8188 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The netfilter conflicts were rather simple overlapping
changes.

However, the cls_tcindex.c stuff was a bit more complex.

On the 'net' side, Cong is fixing several races and memory
leaks.  Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.

What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU.  I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15 12:38:38 -08:00
Deepa Dinamani 460a2db027 errqueue.h: Include time_types.h
Now that we have a separate header for struct __kernel_timespec,
include it directly without relying on userspace to do it.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14 11:51:51 -05:00
Deepa Dinamani ca5e9aba75 time: Add time_types.h
sys/time.h is the mandated include for many time related
defines. However, linux/time.h overlaps sys/time.h
significantly and this makes including both from userspace
or one from the other impossible.

This also means that userspace can get away with including
sys/time.h whenever it needs linux/time.h and this is what's
been happening in the user world usually.

But, we have new data types that we plan to use in the uapi time
interfaces also defined in the linux/time.h. But, we are unable
to use these types when sys/time.h is included.

Hence, move the new types to a new header, time_types.h.
We intend to eventually have all the uapi defines that the kernel
uses defined in this header.
Note that the plan is to replace uapi interfaces with timeval to
use __kernel_old_timeval, timespec to use __kernel_old_timespec etc.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Fixes: 9718475e69 ("socket: Add SO_TIMESTAMPING_NEW")
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14 11:51:51 -05:00
David S. Miller 03b9674202 This feature/cleanup patchset includes the following patches:
- fix memory leak in in batadv_dat_put_dhcp, by Martin Weinelt
 
  - fix typo, by Sven Eckelmann
 
  - netlink restructuring patch series (part 2), by Sven Eckelmann
    (19 patches)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlxj6MYWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoYTMD/9Uyaqe4ULGQ+rLBnCMSuNc5AnQ
 d7hebUep1RVDyyzJaHRNLmPwmSDNscdQX2AaAxOMhrbmDiBceHq1vDpNvLMDCa0g
 4KbljpfICc6Wm8cq6zHMvL5DQYdxvHt3xDDOvTufhTDra0+cZ8b7tLN1GIztrSSL
 BJ4gUkVydtFUIGCPkSfqZoMI82VTo4V/vOQpaAAANJXVehajHhPLOc2+AqnsHTxC
 6EpNFbMib7sxvrDV5/GdkhtyikmNyHXlaZaiboJjH8htdRWmxHSIfelgDZ2KzqJK
 NG5EhZNljLEHHU5cnCBA1ptvjB0zWVF1WEZDyD3fHPraNdmeyCyqI1kuEMFHPSEr
 PZ+gD9wqnmgQUc3D7Xi7Q0faEQ5wcOKtcCvMrWgydbQxanhiftBH8pikkR9VJh+B
 1ADuN2m+OSKaZ8Fw5tqknNbd8paKztlndgSNRHEc9KOWnT91cKO02Zg9wCyNhn7S
 SI/0eLZx0N0v3T/UVsoKbjKGCEiFZzgXMBRRb7jthJghD2hnyx+s1tRzkgxIJj8z
 fI5hwAl//m6tSCDjbOLjCdsNHiGs4AhP0MsFq5jHmpqvsY8ZUX59LcRnpfhX3IaN
 L+KpBxH9O25Pd3JgXgjlcuza2L1L/V7piPUit/C8LVb7goNN3QCi+UtXFVMR1C4c
 J7kI7vMak8BEIs4EuQ==
 =+p14
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20190213' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - fix memory leak in in batadv_dat_put_dhcp, by Martin Weinelt

 - fix typo, by Sven Eckelmann

 - netlink restructuring patch series (part 2), by Sven Eckelmann
   (19 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-13 22:28:11 -08:00
Peter Oskolkov 3e0bd37ce0 bpf: add plumbing for BPF_LWT_ENCAP_IP in bpf_lwt_push_encap
This patch adds all needed plumbing in preparation to allowing
bpf programs to do IP encapping via bpf_lwt_push_encap. Actual
implementation is added in the next patch in the patchset.

Of note:
- bpf_lwt_push_encap can now be called from BPF_PROG_TYPE_LWT_XMIT
  prog types in addition to BPF_PROG_TYPE_LWT_IN;
- if the skb being encapped has GSO set, encapsulation is limited
  to IPIP/IP+GRE/IP+GUE (both IPv4 and IPv6);
- as route lookups are different for ingress vs egress, the single
  external bpf_lwt_push_encap BPF helper is routed internally to
  either bpf_lwt_in_push_encap or bpf_lwt_xmit_push_encap BPF_CALLs,
  depending on prog type.

v8 changes: fixed a typo.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-13 18:27:55 -08:00
Maxime Chevallier 7fd8afa893 net: phy: Add generic support for 2.5GBaseT and 5GBaseT
The 802.3bz specification, based on previous by the NBASET alliance,
defines the 2.5GBaseT and 5GBaseT link modes for ethernet traffic on
cat5e, cat6 and cat7 cables.

These mode integrate with the already defined C45 MDIO PMA/PMD registers
set that added 10G support, by defining some previously reserved bits,
and adding a new register (2.5G/5G Extended abilities).

This commit adds the required definitions in include/uapi/linux/mdio.h
to support these modes, and detect when a link-partner advertises them.

It also adds support for these mode in the generic C45 PHY
infrastructure.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-13 19:17:53 -05:00
Konstantin Khlebnikov 1ec17dbd90 inet_diag: fix reporting cgroup classid and fallback to priority
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
extensions has only 8 bits. Thus extensions starting from DCTCPINFO
cannot be requested directly. Some of them included into response
unconditionally or hook into some of lower 8 bits.

Extension INET_DIAG_CLASS_ID has not way to request from the beginning.

This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space
reservation, and documents behavior for other extensions.

Also this patch adds fallback to reporting socket priority. This filed
is more widely used for traffic classification because ipv4 sockets
automatically maps TOS to priority and default qdisc pfifo_fast knows
about that. But priority could be changed via setsockopt SO_PRIORITY so
INET_DIAG_TOS isn't enough for predicting class.

Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot
reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).

So, after this patch INET_DIAG_CLASS_ID will report socket priority
for most common setup when net_cls isn't set and/or cgroup2 in use.

Fixes: 0888e372c3 ("net: inet: diag: expose sockets cgroup classid")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12 13:35:57 -05:00
Greg Kroah-Hartman c744ca39f2 Merge 5.0-rc6 into tty-next
We need the tty fixes in here for other patches to be based on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:26:45 +01:00
Greg Kroah-Hartman 3b6effbc38 Merge 5.0-rc6 into staging-next
We need the staging fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:25:01 +01:00
Greg Kroah-Hartman 5c07488d99 Merge 5.0-rc6 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:05:58 +01:00
Martin KaFai Lau 655a51e536 bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock
This patch adds a helper function BPF_FUNC_tcp_sock and it
is currently available for cg_skb and sched_(cls|act):

struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk);

int cg_skb_foo(struct __sk_buff *skb) {
	struct bpf_tcp_sock *tp;
	struct bpf_sock *sk;
	__u32 snd_cwnd;

	sk = skb->sk;
	if (!sk)
		return 1;

	tp = bpf_tcp_sock(sk);
	if (!tp)
		return 1;

	snd_cwnd = tp->snd_cwnd;
	/* ... */

	return 1;
}

A 'struct bpf_tcp_sock' is also added to the uapi bpf.h to provide
read-only access.  bpf_tcp_sock has all the existing tcp_sock's fields
that has already been exposed by the bpf_sock_ops.
i.e. no new tcp_sock's fields are exposed in bpf.h.

This helper returns a pointer to the tcp_sock.  If it is not a tcp_sock
or it cannot be traced back to a tcp_sock by sk_to_full_sk(), it
returns NULL.  Hence, the caller needs to check for NULL before
accessing it.

The current use case is to expose members from tcp_sock
to allow a cg_skb_bpf_prog to provide per cgroup traffic
policing/shaping.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau aa65d6960a bpf: Add state, dst_ip4, dst_ip6 and dst_port to bpf_sock
This patch adds "state", "dst_ip4", "dst_ip6" and "dst_port" to the
bpf_sock.  The userspace has already been using "state",
e.g. inet_diag (ss -t) and getsockopt(TCP_INFO).

This patch also allows narrow load on the following existing fields:
"family", "type", "protocol" and "src_port".  Unlike IP address,
the load offset is resticted to the first byte for them but it
can be relaxed later if there is a use case.

This patch also folds __sock_filter_check_size() into
bpf_sock_is_valid_access() since it is not called
by any where else.  All bpf_sock checking is in
one place.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau 46f8bc9275 bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helper
In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
before accessing the fields in sock.  For example, in __netdev_pick_tx:

static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
			    struct net_device *sb_dev)
{
	/* ... */

	struct sock *sk = skb->sk;

		if (queue_index != new_index && sk &&
		    sk_fullsock(sk) &&
		    rcu_access_pointer(sk->sk_dst_cache))
			sk_tx_queue_set(sk, new_index);

	/* ... */

	return queue_index;
}

This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
where a few of the convert_ctx_access() in filter.c has already been
accessing the skb->sk sock_common's fields,
e.g. sock_ops_convert_ctx_access().

"__sk_buff->sk" is a PTR_TO_SOCK_COMMON_OR_NULL in the verifier.
Some of the fileds in "bpf_sock" will not be directly
accessible through the "__sk_buff->sk" pointer.  It is limited
by the new "bpf_sock_common_is_valid_access()".
e.g. The existing "type", "protocol", "mark" and "priority" in bpf_sock
     are not allowed.

The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
can be used to get a sk with all accessible fields in "bpf_sock".
This helper is added to both cg_skb and sched_(cls|act).

int cg_skb_foo(struct __sk_buff *skb) {
	struct bpf_sock *sk;

	sk = skb->sk;
	if (!sk)
		return 1;

	sk = bpf_sk_fullsock(sk);
	if (!sk)
		return 1;

	if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
		return 1;

	/* some_traffic_shaping(); */

	return 1;
}

(1) The sk is read only

(2) There is no new "struct bpf_sock_common" introduced.

(3) Future kernel sock's members could be added to bpf_sock only
    instead of repeatedly adding at multiple places like currently
    in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.

(4) After "sk = skb->sk", the reg holding sk is in type
    PTR_TO_SOCK_COMMON_OR_NULL.

(5) After bpf_sk_fullsock(), the return type will be in type
    PTR_TO_SOCKET_OR_NULL which is the same as the return type of
    bpf_sk_lookup_xxx().

    However, bpf_sk_fullsock() does not take refcnt.  The
    acquire_reference_state() is only depending on the return type now.
    To avoid it, a new is_acquire_function() is checked before calling
    acquire_reference_state().

(6) The WARN_ON in "release_reference_state()" is no longer an
    internal verifier bug.

    When reg->id is not found in state->refs[], it means the
    bpf_prog does something wrong like
    "bpf_sk_release(bpf_sk_fullsock(skb->sk))" where reference has
    never been acquired by calling "bpf_sk_fullsock(skb->sk)".

    A -EINVAL and a verbose are done instead of WARN_ON.  A test is
    added to the test_verifier in a later patch.

    Since the WARN_ON in "release_reference_state()" is no longer
    needed, "__release_reference_state()" is folded into
    "release_reference_state()" also.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Thomas Gleixner 41ea39101d y2038: Add time64 system calls
This series finally gets us to the point of having system calls with
 64-bit time_t on all architectures, after a long time of incremental
 preparation patches.
 
 There was actually one conversion that I missed during the summer,
 i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
 and review comments.
 
 The following system calls are now added on all 32-bit architectures
 using the same system call numbers:
 
 403 clock_gettime64
 404 clock_settime64
 405 clock_adjtime64
 406 clock_getres_time64
 407 clock_nanosleep_time64
 408 timer_gettime64
 409 timer_settime64
 410 timerfd_gettime64
 411 timerfd_settime64
 412 utimensat_time64
 413 pselect6_time64
 414 ppoll_time64
 416 io_pgetevents_time64
 417 recvmmsg_time64
 418 mq_timedsend_time64
 419 mq_timedreceiv_time64
 420 semtimedop_time64
 421 rt_sigtimedwait_time64
 422 futex_time64
 423 sched_rr_get_interval_time64
 
 Each one of these corresponds directly to an existing system call
 that includes a 'struct timespec' argument, or a structure containing
 a timespec or (in case of clock_adjtime) timeval. Not included here
 are new versions of getitimer/setitimer and getrusage/waitid, which
 are planned for the future but only needed to make a consistent API
 rather than for correct operation beyond y2038. These four system
 calls are based on 'timeval', and it has not been finally decided
 what the replacement kernel interface will use instead.
 
 So far, I have done a lot of build testing across most architectures,
 which has found a number of bugs. Runtime testing so far included
 testing LTP on 32-bit ARM with the existing system calls, to ensure
 we do not regress for existing binaries, and a test with a 32-bit
 x86 build of LTP against a modified version of the musl C library
 that has been adapted to the new system call interface [3].
 This library can be used for testing on all architectures supported
 by musl-1.1.21, but it is not how the support is getting integrated
 into the official musl release. Official musl support is planned
 but will require more invasive changes to the library.
 
 Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
 Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
 Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcXf7/AAoJEGCrR//JCVInPSUP/RhsQSCKMGtONB/vVICQhwep
 PybhzBSpHWFxszzTi6BEPN1zS9B069G9mDollRBYZCckyPqL/Bv6sI/vzQZdNk01
 Q6Nw92OnNE1QP8owZ5TjrZhpbtopWdqIXjsbGZlloUemvuJP2JwvKovQUcn5CPTQ
 jbnqU04CVyFFJYVxAnGJ+VSeWNrjW/cm/m+rhLFjUcwW7Y3aodxsPqPP6+K9hY9P
 yIWfcH42WBeEWGm1RSBOZOScQl4SGCPUAhFydl/TqyEQagyegJMIyMOv9wZ5AuTT
 xK644bDVmNsrtJDZDpx+J8hytXCk1LrnKzkHR/uK80iUIraF/8D7PlaPgTmEEjko
 XcrywEkvkXTVU3owCm2/sbV+8fyFKzSPipnNfN1JNxEX71A98kvMRtPjDueQq/GA
 Yh81rr2YLF2sUiArkc2fNpENT7EGhrh1q6gviK3FB8YDgj1kSgPK5wC/X0uolC35
 E7iC2kg4NaNEIjhKP/WKluCaTvjRbvV+0IrlJLlhLTnsqbA57ZKCCteiBrlm7wQN
 4csUtCyxchR9Ac2o/lj+Mf53z68Zv74haIROp18K2dL7ZpVcOPnA3XHeauSAdoyp
 wy2Ek6ilNvlNB+4x+mRntPoOsyuOUGv7JXzB9JvweLWUd9G7tvYeDJQp/0YpDppb
 K4UWcKnhtEom0DgK08vY
 =IZVb
 -----END PGP SIGNATURE-----

Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull y2038 - time64 system calls from Arnd Bergmann:

This series finally gets us to the point of having system calls with 64-bit
time_t on all architectures, after a long time of incremental preparation
patches.

There was actually one conversion that I missed during the summer,
i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
and review comments.

The following system calls are now added on all 32-bit architectures using
the same system call numbers:

403 clock_gettime64
404 clock_settime64
405 clock_adjtime64
406 clock_getres_time64
407 clock_nanosleep_time64
408 timer_gettime64
409 timer_settime64
410 timerfd_gettime64
411 timerfd_settime64
412 utimensat_time64
413 pselect6_time64
414 ppoll_time64
416 io_pgetevents_time64
417 recvmmsg_time64
418 mq_timedsend_time64
419 mq_timedreceiv_time64
420 semtimedop_time64
421 rt_sigtimedwait_time64
422 futex_time64
423 sched_rr_get_interval_time64

Each one of these corresponds directly to an existing system call that
includes a 'struct timespec' argument, or a structure containing a timespec
or (in case of clock_adjtime) timeval. Not included here are new versions
of getitimer/setitimer and getrusage/waitid, which are planned for the
future but only needed to make a consistent API rather than for correct
operation beyond y2038. These four system calls are based on 'timeval', and
it has not been finally decided what the replacement kernel interface will
use instead.

So far, I have done a lot of build testing across most architectures, which
has found a number of bugs. Runtime testing so far included testing LTP on
32-bit ARM with the existing system calls, to ensure we do not regress for
existing binaries, and a test with a 32-bit x86 build of LTP against a
modified version of the musl C library that has been adapted to the new
system call interface [3].  This library can be used for testing on all
architectures supported by musl-1.1.21, but it is not how the support is
getting integrated into the official musl release. Official musl support is
planned but will require more invasive changes to the library.

Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
2019-02-10 21:24:43 +01:00
Eli Cohen eddd2cf195 net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICE
Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
TCA_ID_POLICE and also differentiates these identifier, used in struct
tc_action_ops type field, from other macros starting with TCA_ACT_.

To make things clearer, we name the enum defining the TCA_ID_*
identifiers and also change the "type" field of struct tc_action to
id.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00
Eli Cohen 257eeded20 net: Move all TC actions identifiers to one place
Move all the TC identifiers to one place, to the same enum that defines
the identifier of police action. This makes it easier choose numbers for
new actions since they are now defined in one place. We preserve the
original values for binary compatibility. New IDs should be added inside
the enum.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00
Sven Eckelmann 9a182242f1 batman-adv: Add throughput_override hardif genl configuration
The B.A.T.M.A.N. V implementation tries to estimate the link throughput of
an interface to an originator using different automatic methods. It is
still possible to overwrite it the link throughput for all reachable
originators via this interface.

The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get
the configuration of this feature using the u32
BATADV_ATTR_THROUGHPUT_OVERRIDE attribute. The used unit is in 100 Kbit/s.
If the value is set to 0 then batman-adv will try to estimate the
throughput by itself.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00
Sven Eckelmann a108008290 batman-adv: Add elp_interval hardif genl configuration
The ELP packets are transmitted every elp_interval milliseconds on an
slave/hard-interface. This value can be changed using the configuration
interface.

The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get
the configuration of this feature using the u32 BATADV_ATTR_ELP_INTERVAL
attribute.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00