Merge branch 'for-6.15/io_uring-rx-zc' into for-6.15/io_uring-reg-vec
* for-6.15/io_uring-rx-zc: (80 commits)
  io_uring/zcrx: add selftest case for recvzc with read limit
  io_uring/zcrx: add a read limit to recvzc requests
  io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap
  io_uring: Rename KConfig to Kconfig
  io_uring/zcrx: fix leaks on failed registration
  io_uring/zcrx: recheck ifq on shutdown
  io_uring/zcrx: add selftest
  net: add documentation for io_uring zcrx
  io_uring/zcrx: add copy fallback
  io_uring/zcrx: throttle receive requests
  io_uring/zcrx: set pp memory provider for an rx queue
  io_uring/zcrx: add io_recvzc request
  io_uring/zcrx: dma-map area for the device
  io_uring/zcrx: implement zerocopy receive pp memory provider
  io_uring/zcrx: grab a net device
  io_uring/zcrx: add io_zcrx_area
  io_uring/zcrx: add interface queue and refill queue
  net: add helpers for setting a memory provider on an rx queue
  net: page_pool: add memory provider helpers
  net: prepare for non devmem TCP memory providers
  ...
commit 78b6f6e9bf
@@ -244,7 +244,7 @@ information about the interrupt from the irb parameter.
 --------------------

 The ccwgroup mechanism is designed to handle devices consisting of multiple ccw
-devices, like lcs or ctc.
+devices, like qeth or ctc.

 The ccw driver provides a 'group' attribute. Piping bus ids of ccw devices to
 this attributes creates a ccwgroup device consisting of these ccw devices (if
@@ -44,6 +44,9 @@ properties:
   phy-mode:
     enum:
       - rgmii
+      - rgmii-id
+      - rgmii-rxid
+      - rgmii-txid
       - rmii

   phy-handle: true
@@ -14,9 +14,10 @@ $defs:
     pattern: ^[0-9A-Za-z_-]+( - 1)?$
     minimum: 0
   len-or-limit:
-    # literal int or limit based on fixed-width type e.g. u8-min, u16-max, etc.
+    # literal int, const name, or limit based on fixed-width type
+    # e.g. u8-min, u16-max, etc.
     type: [ string, integer ]
-    pattern: ^[su](8|16|32|64)-(min|max)$
+    pattern: ^[0-9A-Za-z_-]+$
     minimum: 0

 # Schema for specs
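The relaxed pattern lets an integer limit reference a named constant from a
spec's definitions section, not just a literal or a u8/u16/... bound. A
hypothetical spec fragment (the attribute and const names are illustrative,
not from this series):

  definitions:
    -
      name: max-queues
      type: const
      value: 128

  attributes:
    -
      name: queue-id
      type: u32
      checks:
        max: max-queues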
@@ -114,6 +114,9 @@ attribute-sets:
         doc: Bitmask of enabled AF_XDP features.
         type: u64
         enum: xsk-flags
+  -
+    name: io-uring-provider-info
+    attributes: []
   -
     name: page-pool
     attributes:
@@ -171,6 +174,11 @@ attribute-sets:
         name: dmabuf
         doc: ID of the dmabuf this page-pool is attached to.
         type: u32
+      -
+        name: io-uring
+        doc: io-uring memory provider information.
+        type: nest
+        nested-attributes: io-uring-provider-info
   -
     name: page-pool-info
     subset-of: page-pool
@@ -296,6 +304,11 @@ attribute-sets:
         name: dmabuf
         doc: ID of the dmabuf attached to this queue, if any.
         type: u32
+      -
+        name: io-uring
+        doc: io_uring memory provider information.
+        type: nest
+        nested-attributes: io-uring-provider-info

   -
     name: qstats
@@ -572,6 +585,7 @@ operations:
           - inflight-mem
           - detach-time
           - dmabuf
+          - io-uring
     dump:
       reply: *pp-reply
     config-cond: page-pool
@@ -637,6 +651,7 @@ operations:
           - napi-id
           - ifindex
           - dmabuf
+          - io-uring
     dump:
       request:
         attributes:
@@ -63,6 +63,7 @@ Contents:
    gtp
    ila
    ioam6-sysctl
+   iou-zcrx
    ip_dynaddr
    ipsec
    ip-sysctl
@@ -0,0 +1,202 @@
.. SPDX-License-Identifier: GPL-2.0

=====================
io_uring zero copy Rx
=====================

Introduction
============

io_uring zero copy Rx (ZC Rx) is a feature that removes kernel-to-user copy on
the network receive path, allowing packet data to be received directly into
userspace memory. This feature is different to TCP_ZEROCOPY_RECEIVE in that
there are no strict alignment requirements and no need to mmap()/munmap().
Compared to kernel bypass solutions such as DPDK, the packet headers are
processed by the kernel TCP stack as normal.

NIC HW Requirements
===================

Several NIC HW features are required for io_uring ZC Rx to work. For now the
kernel API does not configure the NIC and it must be done by the user.

Header/data split
-----------------

Required to split packets at the L4 boundary into a header and a payload.
Headers are received into kernel memory as normal and processed by the TCP
stack as normal. Payloads are received into userspace memory directly.

Flow steering
-------------

Specific HW Rx queues are configured for this feature, but modern NICs
typically distribute flows across all HW Rx queues. Flow steering is required
to ensure that only desired flows are directed towards HW queues that are
configured for io_uring ZC Rx.

RSS
---

In addition to flow steering above, RSS is required to steer all other non-zero
copy flows away from queues that are configured for io_uring ZC Rx.

Usage
=====

Setup NIC
---------

Must be done out of band for now.

Ensure there are at least two queues::

  ethtool -L eth0 combined 2

Enable header/data split::

  ethtool -G eth0 tcp-data-split on

Carve out half of the HW Rx queues for zero copy using RSS::

  ethtool -X eth0 equal 1

Set up flow steering, bearing in mind that queues are 0-indexed::

  ethtool -N eth0 flow-type tcp6 ... action 1
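For example, a flow could be steered by its destination port (the port number
here is purely illustrative)::

  ethtool -N eth0 flow-type tcp6 dst-port 5201 action 1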
Setup io_uring
--------------

This section describes the low level io_uring kernel API. Please refer to
liburing documentation for how to use the higher level API.

Create an io_uring instance with the following required setup flags::

  IORING_SETUP_SINGLE_ISSUER
  IORING_SETUP_DEFER_TASKRUN
  IORING_SETUP_CQE32
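A minimal liburing sketch of creating such a ring (error handling omitted)::

  struct io_uring_params params = {
          .flags = IORING_SETUP_SINGLE_ISSUER |
                   IORING_SETUP_DEFER_TASKRUN |
                   IORING_SETUP_CQE32,
  };
  struct io_uring ring;

  io_uring_queue_init_params(128, &ring, &params);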
Create memory area
------------------

Allocate userspace memory area for receiving zero copy data::

  void *area_ptr = mmap(NULL, area_size,
                        PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE,
                        0, 0);

Create refill ring
------------------

Allocate memory for a shared ringbuf used for returning consumed buffers::

  void *ring_ptr = mmap(NULL, ring_size,
                        PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE,
                        0, 0);

This refill ring consists of some space for the header, followed by an array of
``struct io_uring_zcrx_rqe``::

  size_t rq_entries = 4096;
  size_t ring_size = rq_entries * sizeof(struct io_uring_zcrx_rqe) + PAGE_SIZE;
  /* align to page size */
  ring_size = (ring_size + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1);

Register ZC Rx
--------------

Fill in registration structs::

  struct io_uring_zcrx_area_reg area_reg = {
          .addr = (__u64)(unsigned long)area_ptr,
          .len = area_size,
          .flags = 0,
  };

  struct io_uring_region_desc region_reg = {
          .user_addr = (__u64)(unsigned long)ring_ptr,
          .size = ring_size,
          .flags = IORING_MEM_REGION_TYPE_USER,
  };

  struct io_uring_zcrx_ifq_reg reg = {
          .if_idx = if_nametoindex("eth0"),
          /* this is the HW queue with desired flow steered into it */
          .if_rxq = 1,
          .rq_entries = rq_entries,
          .area_ptr = (__u64)(unsigned long)&area_reg,
          .region_ptr = (__u64)(unsigned long)&region_reg,
  };

Register with kernel::

  io_uring_register_ifq(ring, &reg);
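liburing returns a negative error code on failure, so a sketch of checking the
result might look like::

  int ret = io_uring_register_ifq(ring, &reg);

  if (ret)
          /* e.g. bad queue index, or header/data split not enabled */
          exit(1);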
Map refill ring
---------------

The kernel fills in fields for the refill ring in the registration ``struct
io_uring_zcrx_ifq_reg``. Map it into userspace::

  struct io_uring_zcrx_rq refill_ring;

  refill_ring.khead = (unsigned *)((char *)ring_ptr + reg.offsets.head);
  refill_ring.ktail = (unsigned *)((char *)ring_ptr + reg.offsets.tail);
  refill_ring.rqes =
          (struct io_uring_zcrx_rqe *)((char *)ring_ptr + reg.offsets.rqes);
  refill_ring.rq_tail = 0;
  refill_ring.ring_ptr = ring_ptr;

Receiving data
--------------

Prepare a zero copy recv request::

  struct io_uring_sqe *sqe;

  sqe = io_uring_get_sqe(ring);
  io_uring_prep_rw(IORING_OP_RECV_ZC, sqe, fd, NULL, 0, 0);
  sqe->ioprio |= IORING_RECV_MULTISHOT;

Now, submit and wait::

  io_uring_submit_and_wait(ring, 1);

Finally, process completions::

  struct io_uring_cqe *cqe;
  unsigned int count = 0;
  unsigned int head;

  io_uring_for_each_cqe(ring, head, cqe) {
          struct io_uring_zcrx_cqe *rcqe = (struct io_uring_zcrx_cqe *)(cqe + 1);

          unsigned long mask = (1ULL << IORING_ZCRX_AREA_SHIFT) - 1;
          unsigned char *data = area_ptr + (rcqe->off & mask);
          /* do something with the data */

          count++;
  }
  io_uring_cq_advance(ring, count);
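The loop above assumes success. IORING_OP_RECV_ZC is multishot, so each CQE may
also be checked for errors and for the end of the multishot sequence; a sketch
following the usual multishot CQE conventions::

  if (cqe->res < 0) {
          /* receive error, e.g. -ENOBUFS when buffers ran out */
  }
  if (!(cqe->flags & IORING_CQE_F_MORE)) {
          /* no more completions will be posted; re-arm with a new SQE */
  }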
Recycling buffers
-----------------

Return buffers back to the kernel to be used again::

  struct io_uring_zcrx_rqe *rqe;
  unsigned mask = refill_ring.ring_entries - 1;
  rqe = &refill_ring.rqes[refill_ring.rq_tail & mask];

  unsigned long area_offset = rcqe->off & ~IORING_ZCRX_AREA_MASK;
  rqe->off = area_offset | area_reg.rq_area_token;
  rqe->len = cqe->res;
  IO_URING_WRITE_ONCE(*refill_ring.ktail, ++refill_ring.rq_tail);
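The snippet above assumes the refill ring has a free slot. A hypothetical
helper to compute free space, assuming the kernel advances ``*khead`` as it
consumes entries::

  static inline unsigned refill_ring_space(struct io_uring_zcrx_rq *rq)
  {
          unsigned khead = IO_URING_READ_ONCE(*rq->khead);

          /* free slots out of rq->ring_entries total */
          return rq->ring_entries - (rq->rq_tail - khead);
  }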
Testing
=======

See ``tools/testing/selftests/drivers/net/hw/iou-zcrx.c``
@@ -30,3 +30,5 @@ source "lib/Kconfig"
 source "lib/Kconfig.debug"

 source "Documentation/Kconfig"
+
+source "io_uring/Kconfig"
@@ -54,7 +54,6 @@ enum interruption_class {
    IRQIO_C70,
    IRQIO_TAP,
    IRQIO_VMR,
-   IRQIO_LCS,
    IRQIO_CTC,
    IRQIO_ADM,
    IRQIO_CSC,
@@ -84,7 +84,6 @@ static const struct irq_class irqclass_sub_desc[] = {
    {.irq = IRQIO_C70, .name = "C70", .desc = "[I/O] 3270"},
    {.irq = IRQIO_TAP, .name = "TAP", .desc = "[I/O] Tape"},
    {.irq = IRQIO_VMR, .name = "VMR", .desc = "[I/O] Unit Record Devices"},
-   {.irq = IRQIO_LCS, .name = "LCS", .desc = "[I/O] LCS"},
    {.irq = IRQIO_CTC, .name = "CTC", .desc = "[I/O] CTC"},
    {.irq = IRQIO_ADM, .name = "ADM", .desc = "[I/O] EADM Subchannel"},
    {.irq = IRQIO_CSC, .name = "CSC", .desc = "[I/O] CHSC Subchannel"},
@@ -432,9 +432,6 @@ static struct net_device *bond_ipsec_dev(struct xfrm_state *xs)
    struct bonding *bond;
    struct slave *slave;

-   if (!bond_dev)
-       return NULL;
-
    bond = netdev_priv(bond_dev);
    if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
        return NULL;
@@ -226,7 +226,6 @@ struct __packed offload_info {
    struct offload_port_info ports;
    struct offload_ka_info kas;
    struct offload_rr_info rrs;
-   u8 buf[];
 };

 struct __packed hw_atl_utils_fw_rpc {
@@ -1433,22 +1433,6 @@ int octeon_wait_for_ddr_init(struct octeon_device *oct, u32 *timeout)
 }
 EXPORT_SYMBOL_GPL(octeon_wait_for_ddr_init);

-/* Get the octeon id assigned to the octeon device passed as argument.
- * This function is exported to other modules.
- * @param dev - octeon device pointer passed as a void *.
- * @return octeon device id
- */
-int lio_get_device_id(void *dev)
-{
-   struct octeon_device *octeon_dev = (struct octeon_device *)dev;
-   u32 i;
-
-   for (i = 0; i < MAX_OCTEON_DEVICES; i++)
-       if (octeon_device[i] == octeon_dev)
-           return octeon_dev->octeon_id;
-   return -1;
-}
-
 void lio_enable_irq(struct octeon_droq *droq, struct octeon_instr_queue *iq)
 {
    u64 instr_cnt;
@@ -705,13 +705,6 @@ octeon_get_dispatch(struct octeon_device *octeon_dev, u16 opcode,
  */
 struct octeon_device *lio_get_device(u32 octeon_id);

-/** Get the octeon id assigned to the octeon device passed as argument.
- * This function is exported to other modules.
- * @param dev - octeon device pointer passed as a void *.
- * @return octeon device id
- */
-int lio_get_device_id(void *dev);
-
 /** Read windowed register.
  * @param oct - pointer to the Octeon device.
  * @param addr - Address of the register to read.
@@ -1211,9 +1211,6 @@ struct adapter {
    struct timer_list flower_stats_timer;
    struct work_struct flower_stats_work;

-   /* Ethtool Dump */
-   struct ethtool_dump eth_dump;
-
    /* HMA */
    struct hma_data hma;

@@ -1233,6 +1230,10 @@ struct adapter {

    /* Ethtool n-tuple */
    struct cxgb4_ethtool_filter *ethtool_filters;
+
+   /* Ethtool Dump */
+   /* Must be last - ends in a flex-array member. */
+   struct ethtool_dump eth_dump;
 };

 /* Support for "sched-class" command to allow a TX Scheduling Class to be
@@ -526,28 +526,6 @@ out:
    return res;
 }

-u32 mlx4_zone_free_entries(struct mlx4_zone_allocator *zones, u32 uid, u32 obj, u32 count)
-{
-   struct mlx4_zone_entry *zone;
-   int res = 0;
-
-   spin_lock(&zones->lock);
-
-   zone = __mlx4_find_zone_by_uid(zones, uid);
-
-   if (NULL == zone) {
-       res = -1;
-       goto out;
-   }
-
-   __mlx4_free_from_zone(zone, obj, count);
-
-out:
-   spin_unlock(&zones->lock);
-
-   return res;
-}
-
 u32 mlx4_zone_free_entries_unique(struct mlx4_zone_allocator *zones, u32 obj, u32 count)
 {
    struct mlx4_zone_entry *zone;
@@ -1478,12 +1478,6 @@ void mlx4_zone_allocator_destroy(struct mlx4_zone_allocator *zone_alloc);
 u32 mlx4_zone_alloc_entries(struct mlx4_zone_allocator *zones, u32 uid, int count,
                int align, u32 skip_mask, u32 *puid);

-/* Free <count> objects, start from <obj> of the uid <uid> from zone_allocator
- * <zones>.
- */
-u32 mlx4_zone_free_entries(struct mlx4_zone_allocator *zones,
-              u32 uid, u32 obj, u32 count);
-
 /* If <zones> was allocated with MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP, instead of
  * specifying the uid when freeing an object, zone allocator could figure it by
  * itself. Other parameters are similar to mlx4_zone_free.
@@ -147,26 +147,6 @@ static int mlx4_set_port_mac_table(struct mlx4_dev *dev, u8 port,
    return err;
 }

-int mlx4_find_cached_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *idx)
-{
-   struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
-   struct mlx4_mac_table *table = &info->mac_table;
-   int i;
-
-   for (i = 0; i < MLX4_MAX_MAC_NUM; i++) {
-       if (!table->refs[i])
-           continue;
-
-       if (mac == (MLX4_MAC_MASK & be64_to_cpu(table->entries[i]))) {
-           *idx = i;
-           return 0;
-       }
-   }
-
-   return -ENOENT;
-}
-EXPORT_SYMBOL_GPL(mlx4_find_cached_mac);
-
 static bool mlx4_need_mf_bond(struct mlx4_dev *dev)
 {
    int i, num_eth_ports = 0;
@@ -296,11 +296,16 @@ enum mlx5e_fec_supported_link_mode {
    MLX5E_FEC_SUPPORTED_LINK_MODE_200G_2X,
    MLX5E_FEC_SUPPORTED_LINK_MODE_400G_4X,
    MLX5E_FEC_SUPPORTED_LINK_MODE_800G_8X,
+   MLX5E_FEC_SUPPORTED_LINK_MODE_200G_1X,
+   MLX5E_FEC_SUPPORTED_LINK_MODE_400G_2X,
+   MLX5E_FEC_SUPPORTED_LINK_MODE_800G_4X,
+   MLX5E_FEC_SUPPORTED_LINK_MODE_1600G_8X,
    MLX5E_MAX_FEC_SUPPORTED_LINK_MODE,
 };

 #define MLX5E_FEC_FIRST_50G_PER_LANE_MODE MLX5E_FEC_SUPPORTED_LINK_MODE_50G_1X
 #define MLX5E_FEC_FIRST_100G_PER_LANE_MODE MLX5E_FEC_SUPPORTED_LINK_MODE_100G_1X
+#define MLX5E_FEC_FIRST_200G_PER_LANE_MODE MLX5E_FEC_SUPPORTED_LINK_MODE_200G_1X

 #define MLX5E_FEC_OVERRIDE_ADMIN_POLICY(buf, policy, write, link) \
    do { \
@@ -320,8 +325,10 @@ static bool mlx5e_is_fec_supported_link_mode(struct mlx5_core_dev *dev,
    return link_mode < MLX5E_FEC_FIRST_50G_PER_LANE_MODE ||
           (link_mode < MLX5E_FEC_FIRST_100G_PER_LANE_MODE &&
        MLX5_CAP_PCAM_FEATURE(dev, fec_50G_per_lane_in_pplm)) ||
-          (link_mode >= MLX5E_FEC_FIRST_100G_PER_LANE_MODE &&
-       MLX5_CAP_PCAM_FEATURE(dev, fec_100G_per_lane_in_pplm));
+          (link_mode < MLX5E_FEC_FIRST_200G_PER_LANE_MODE &&
+       MLX5_CAP_PCAM_FEATURE(dev, fec_100G_per_lane_in_pplm)) ||
+          (link_mode >= MLX5E_FEC_FIRST_200G_PER_LANE_MODE &&
+       MLX5_CAP_PCAM_FEATURE(dev, fec_200G_per_lane_in_pplm));
 }

 /* get/set FEC admin field for a given speed */
@@ -368,6 +375,18 @@ static int mlx5e_fec_admin_field(u32 *pplm, u16 *fec_policy, bool write,
    case MLX5E_FEC_SUPPORTED_LINK_MODE_800G_8X:
        MLX5E_FEC_OVERRIDE_ADMIN_POLICY(pplm, *fec_policy, write, 800g_8x);
        break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_200G_1X:
+       MLX5E_FEC_OVERRIDE_ADMIN_POLICY(pplm, *fec_policy, write, 200g_1x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_400G_2X:
+       MLX5E_FEC_OVERRIDE_ADMIN_POLICY(pplm, *fec_policy, write, 400g_2x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_800G_4X:
+       MLX5E_FEC_OVERRIDE_ADMIN_POLICY(pplm, *fec_policy, write, 800g_4x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_1600G_8X:
+       MLX5E_FEC_OVERRIDE_ADMIN_POLICY(pplm, *fec_policy, write, 1600g_8x);
+       break;
    default:
        return -EINVAL;
    }
@@ -421,6 +440,18 @@ static int mlx5e_get_fec_cap_field(u32 *pplm, u16 *fec_cap,
    case MLX5E_FEC_SUPPORTED_LINK_MODE_800G_8X:
        *fec_cap = MLX5E_GET_FEC_OVERRIDE_CAP(pplm, 800g_8x);
        break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_200G_1X:
+       *fec_cap = MLX5E_GET_FEC_OVERRIDE_CAP(pplm, 200g_1x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_400G_2X:
+       *fec_cap = MLX5E_GET_FEC_OVERRIDE_CAP(pplm, 400g_2x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_800G_4X:
+       *fec_cap = MLX5E_GET_FEC_OVERRIDE_CAP(pplm, 800g_4x);
+       break;
+   case MLX5E_FEC_SUPPORTED_LINK_MODE_1600G_8X:
+       *fec_cap = MLX5E_GET_FEC_OVERRIDE_CAP(pplm, 1600g_8x);
+       break;
    default:
        return -EINVAL;
    }
@@ -494,6 +525,26 @@ out:
    return 0;
 }

+static u16 mlx5e_remap_fec_conf_mode(enum mlx5e_fec_supported_link_mode link_mode,
+                    u16 conf_fec)
+{
+   /* RS fec in ethtool is originally mapped to MLX5E_FEC_RS_528_514.
+    * For link modes up to 25G per lane, the value is kept.
+    * For 50G or 100G per lane, it's remapped to MLX5E_FEC_RS_544_514.
+    * For 200G per lane, remapped to MLX5E_FEC_RS_544_514_INTERLEAVED_QUAD.
+    */
+   if (conf_fec != BIT(MLX5E_FEC_RS_528_514))
+       return conf_fec;
+
+   if (link_mode >= MLX5E_FEC_FIRST_200G_PER_LANE_MODE)
+       return BIT(MLX5E_FEC_RS_544_514_INTERLEAVED_QUAD);
+
+   if (link_mode >= MLX5E_FEC_FIRST_50G_PER_LANE_MODE)
+       return BIT(MLX5E_FEC_RS_544_514);
+
+   return conf_fec;
+}
+
 int mlx5e_set_fec_mode(struct mlx5_core_dev *dev, u16 fec_policy)
 {
    bool fec_50g_per_lane = MLX5_CAP_PCAM_FEATURE(dev, fec_50G_per_lane_in_pplm);
@@ -530,14 +581,7 @@ int mlx5e_set_fec_mode(struct mlx5_core_dev *dev, u16 fec_policy)
        if (!mlx5e_is_fec_supported_link_mode(dev, i))
            break;

-       /* RS fec in ethtool is mapped to MLX5E_FEC_RS_528_514
-        * to link modes up to 25G per lane and to
-        * MLX5E_FEC_RS_544_514 in the new link modes based on
-        * 50G or 100G per lane
-        */
-       if (conf_fec == (1 << MLX5E_FEC_RS_528_514) &&
-           i >= MLX5E_FEC_FIRST_50G_PER_LANE_MODE)
-           conf_fec = (1 << MLX5E_FEC_RS_544_514);
+       conf_fec = mlx5e_remap_fec_conf_mode(i, conf_fec);

        mlx5e_get_fec_cap_field(out, &fec_caps, i);

@@ -61,6 +61,7 @@ enum {
    MLX5E_FEC_NOFEC,
    MLX5E_FEC_FIRECODE,
    MLX5E_FEC_RS_528_514,
+   MLX5E_FEC_RS_544_514_INTERLEAVED_QUAD = 4,
    MLX5E_FEC_RS_544_514 = 7,
    MLX5E_FEC_LLRS_272_257_1 = 9,
 };
@@ -326,7 +326,7 @@ static int mlx5e_ptp_alloc_txqsq(struct mlx5e_ptp *c, int txq_ix,
    int node;

    sq->pdev = c->pdev;
-   sq->clock = &mdev->clock;
+   sq->clock = mdev->clock;
    sq->mkey_be = c->mkey_be;
    sq->netdev = c->netdev;
    sq->priv = c->priv;
@@ -696,7 +696,7 @@ static int mlx5e_init_ptp_rq(struct mlx5e_ptp *c, struct mlx5e_params *params,
    rq->pdev = c->pdev;
    rq->netdev = priv->netdev;
    rq->priv = priv;
-   rq->clock = &mdev->clock;
+   rq->clock = mdev->clock;
    rq->tstamp = &priv->tstamp;
    rq->mdev = mdev;
    rq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
@@ -73,11 +73,6 @@ struct mlx5e_tc_act {
    bool is_terminating_action;
 };

-struct mlx5e_tc_flow_action {
-   unsigned int num_entries;
-   struct flow_action_entry **entries;
-};
-
 extern struct mlx5e_tc_act mlx5e_tc_act_drop;
 extern struct mlx5e_tc_act mlx5e_tc_act_trap;
 extern struct mlx5e_tc_act mlx5e_tc_act_accept;
@@ -46,7 +46,7 @@ static void mlx5e_init_trap_rq(struct mlx5e_trap *t, struct mlx5e_params *params
    rq->pdev = t->pdev;
    rq->netdev = priv->netdev;
    rq->priv = priv;
-   rq->clock = &mdev->clock;
+   rq->clock = mdev->clock;
    rq->tstamp = &priv->tstamp;
    rq->mdev = mdev;
    rq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
@@ -289,9 +289,9 @@ static u64 mlx5e_xsk_fill_timestamp(void *_priv)
    ts = get_cqe_ts(priv->cqe);

    if (mlx5_is_real_time_rq(priv->cq->mdev) || mlx5_is_real_time_sq(priv->cq->mdev))
-       return mlx5_real_time_cyc2time(&priv->cq->mdev->clock, ts);
+       return mlx5_real_time_cyc2time(priv->cq->mdev->clock, ts);

-   return mlx5_timecounter_cyc2time(&priv->cq->mdev->clock, ts);
+   return mlx5_timecounter_cyc2time(priv->cq->mdev->clock, ts);
 }

 static void mlx5e_xsk_request_checksum(u16 csum_start, u16 csum_offset, void *priv)
@@ -72,7 +72,7 @@ static int mlx5e_init_xsk_rq(struct mlx5e_channel *c,
    rq->netdev = c->netdev;
    rq->priv = c->priv;
    rq->tstamp = c->tstamp;
-   rq->clock = &mdev->clock;
+   rq->clock = mdev->clock;
    rq->icosq = &c->icosq;
    rq->ix = c->ix;
    rq->channel = c;
@@ -237,6 +237,27 @@ void mlx5e_build_ptys2ethtool_map(void)
                       ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT,
                       ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT,
                       ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT);
+   MLX5_BUILD_PTYS2ETHTOOL_CONFIG(MLX5E_200GAUI_1_200GBASE_CR1_KR1, ext,
+                      ETHTOOL_LINK_MODE_200000baseCR_Full_BIT,
+                      ETHTOOL_LINK_MODE_200000baseKR_Full_BIT,
+                      ETHTOOL_LINK_MODE_200000baseDR_Full_BIT,
+                      ETHTOOL_LINK_MODE_200000baseDR_2_Full_BIT,
+                      ETHTOOL_LINK_MODE_200000baseSR_Full_BIT,
+                      ETHTOOL_LINK_MODE_200000baseVR_Full_BIT);
+   MLX5_BUILD_PTYS2ETHTOOL_CONFIG(MLX5E_400GAUI_2_400GBASE_CR2_KR2, ext,
+                      ETHTOOL_LINK_MODE_400000baseCR2_Full_BIT,
+                      ETHTOOL_LINK_MODE_400000baseKR2_Full_BIT,
+                      ETHTOOL_LINK_MODE_400000baseDR2_Full_BIT,
+                      ETHTOOL_LINK_MODE_400000baseDR2_2_Full_BIT,
+                      ETHTOOL_LINK_MODE_400000baseSR2_Full_BIT,
+                      ETHTOOL_LINK_MODE_400000baseVR2_Full_BIT);
+   MLX5_BUILD_PTYS2ETHTOOL_CONFIG(MLX5E_800GAUI_4_800GBASE_CR4_KR4, ext,
+                      ETHTOOL_LINK_MODE_800000baseCR4_Full_BIT,
+                      ETHTOOL_LINK_MODE_800000baseKR4_Full_BIT,
+                      ETHTOOL_LINK_MODE_800000baseDR4_Full_BIT,
+                      ETHTOOL_LINK_MODE_800000baseDR4_2_Full_BIT,
+                      ETHTOOL_LINK_MODE_800000baseSR4_Full_BIT,
+                      ETHTOOL_LINK_MODE_800000baseVR4_Full_BIT);
 }

 static void mlx5e_ethtool_get_speed_arr(struct mlx5_core_dev *mdev,
@@ -931,6 +952,7 @@ static const u32 pplm_fec_2_ethtool[] = {
    [MLX5E_FEC_RS_528_514] = ETHTOOL_FEC_RS,
    [MLX5E_FEC_RS_544_514] = ETHTOOL_FEC_RS,
    [MLX5E_FEC_LLRS_272_257_1] = ETHTOOL_FEC_LLRS,
+   [MLX5E_FEC_RS_544_514_INTERLEAVED_QUAD] = ETHTOOL_FEC_RS,
 };

 static u32 pplm2ethtool_fec(u_long fec_mode, unsigned long size)
@@ -737,7 +737,7 @@ static int mlx5e_init_rxq_rq(struct mlx5e_channel *c, struct mlx5e_params *param
    rq->netdev = c->netdev;
    rq->priv = c->priv;
    rq->tstamp = c->tstamp;
-   rq->clock = &mdev->clock;
+   rq->clock = mdev->clock;
    rq->icosq = &c->icosq;
    rq->ix = c->ix;
    rq->channel = c;
@@ -1614,7 +1614,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
    int err;

    sq->pdev = c->pdev;
-   sq->clock = &mdev->clock;
+   sq->clock = mdev->clock;
    sq->mkey_be = c->mkey_be;
    sq->netdev = c->netdev;
    sq->mdev = c->mdev;
@@ -3816,8 +3816,11 @@ static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
    /* MQPRIO is another toplevel qdisc that can't be attached
     * simultaneously with the offloaded HTB.
     */
-   if (WARN_ON(mlx5e_selq_is_htb_enabled(&priv->selq)))
-       return -EINVAL;
+   if (mlx5e_selq_is_htb_enabled(&priv->selq)) {
+       NL_SET_ERR_MSG_MOD(mqprio->extack,
+                  "MQPRIO cannot be configured when HTB offload is enabled.");
+       return -EOPNOTSUPP;
+   }

    switch (mqprio->mode) {
    case TC_MQPRIO_MODE_DCB:
@@ -97,7 +97,7 @@ static int mlx5_lag_create_port_sel_table(struct mlx5_lag *ldev,
                    mlx5_del_flow_rules(lag_definer->rules[idx]);
                }
                j = ldev->buckets;
-           };
+           }
            goto destroy_fg;
        }
    }
@@ -43,6 +43,8 @@
 #include <linux/cpufeature.h>
 #endif /* CONFIG_X86 */

+#define MLX5_RT_CLOCK_IDENTITY_SIZE MLX5_FLD_SZ_BYTES(mrtcq_reg, rt_clock_identity)
+
 enum {
    MLX5_PIN_MODE_IN = 0x0,
    MLX5_PIN_MODE_OUT = 0x1,
@@ -77,6 +79,56 @@ enum {
    MLX5_MTUTC_OPERATION_ADJUST_TIME_EXTENDED_MAX = 200000,
 };

+struct mlx5_clock_dev_state {
+   struct mlx5_core_dev *mdev;
+   struct mlx5_devcom_comp_dev *compdev;
+   struct mlx5_nb pps_nb;
+   struct work_struct out_work;
+};
+
+struct mlx5_clock_priv {
+   struct mlx5_clock clock;
+   struct mlx5_core_dev *mdev;
+   struct mutex lock; /* protect mdev and used in PTP callbacks */
+   struct mlx5_core_dev *event_mdev;
+};
+
+static struct mlx5_clock_priv *clock_priv(struct mlx5_clock *clock)
+{
+   return container_of(clock, struct mlx5_clock_priv, clock);
+}
+
+static void mlx5_clock_lockdep_assert(struct mlx5_clock *clock)
+{
+   if (!clock->shared)
+       return;
+
+   lockdep_assert(lockdep_is_held(&clock_priv(clock)->lock));
+}
+
+static struct mlx5_core_dev *mlx5_clock_mdev_get(struct mlx5_clock *clock)
+{
+   mlx5_clock_lockdep_assert(clock);
+
+   return clock_priv(clock)->mdev;
+}
+
+static void mlx5_clock_lock(struct mlx5_clock *clock)
+{
+   if (!clock->shared)
+       return;
+
+   mutex_lock(&clock_priv(clock)->lock);
+}
+
+static void mlx5_clock_unlock(struct mlx5_clock *clock)
+{
+   if (!clock->shared)
+       return;
+
+   mutex_unlock(&clock_priv(clock)->lock);
+}
+
 static bool mlx5_real_time_mode(struct mlx5_core_dev *mdev)
 {
    return (mlx5_is_real_time_rq(mdev) || mlx5_is_real_time_sq(mdev));
@@ -94,6 +146,22 @@ static bool mlx5_modify_mtutc_allowed(struct mlx5_core_dev *mdev)
    return MLX5_CAP_MCAM_FEATURE(mdev, ptpcyc2realtime_modify);
 }

+static int mlx5_clock_identity_get(struct mlx5_core_dev *mdev,
+                  u8 identify[MLX5_RT_CLOCK_IDENTITY_SIZE])
+{
+   u32 out[MLX5_ST_SZ_DW(mrtcq_reg)] = {};
+   u32 in[MLX5_ST_SZ_DW(mrtcq_reg)] = {};
+   int err;
+
+   err = mlx5_core_access_reg(mdev, in, sizeof(in),
+                  out, sizeof(out), MLX5_REG_MRTCQ, 0, 0);
+   if (!err)
+       memcpy(identify, MLX5_ADDR_OF(mrtcq_reg, out, rt_clock_identity),
+              MLX5_RT_CLOCK_IDENTITY_SIZE);
+
+   return err;
+}
+
 static u32 mlx5_ptp_shift_constant(u32 dev_freq_khz)
 {
    /* Optimal shift constant leads to corrections above just 1 scaled ppm.
@@ -119,21 +187,30 @@ static u32 mlx5_ptp_shift_constant(u32 dev_freq_khz)
             ilog2((U32_MAX / NSEC_PER_MSEC) * dev_freq_khz));
 }

-static s32 mlx5_ptp_getmaxphase(struct ptp_clock_info *ptp)
+static s32 mlx5_clock_getmaxphase(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
-   struct mlx5_core_dev *mdev;
-
-   mdev = container_of(clock, struct mlx5_core_dev, clock);
-
    return MLX5_CAP_MCAM_FEATURE(mdev, mtutc_time_adjustment_extended_range) ?
           MLX5_MTUTC_OPERATION_ADJUST_TIME_EXTENDED_MAX :
           MLX5_MTUTC_OPERATION_ADJUST_TIME_MAX;
 }

+static s32 mlx5_ptp_getmaxphase(struct ptp_clock_info *ptp)
+{
+   struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
+   struct mlx5_core_dev *mdev;
+   s32 ret;
+
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);
+   ret = mlx5_clock_getmaxphase(mdev);
+   mlx5_clock_unlock(clock);
+
+   return ret;
+}
+
 static bool mlx5_is_mtutc_time_adj_cap(struct mlx5_core_dev *mdev, s64 delta)
 {
-   s64 max = mlx5_ptp_getmaxphase(&mdev->clock.ptp_info);
+   s64 max = mlx5_clock_getmaxphase(mdev);

    if (delta < -max || delta > max)
        return false;
@@ -209,7 +286,7 @@ static int mlx5_mtctr_syncdevicetime(ktime_t *device_time,
    if (real_time_mode)
        *device_time = ns_to_ktime(REAL_TIME_TO_NS(device >> 32, device & U32_MAX));
    else
-       *device_time = mlx5_timecounter_cyc2time(&mdev->clock, device);
+       *device_time = mlx5_timecounter_cyc2time(mdev->clock, device);

    return 0;
 }
@@ -220,16 +297,23 @@ static int mlx5_ptp_getcrosststamp(struct ptp_clock_info *ptp,
    struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
    struct system_time_snapshot history_begin = {0};
    struct mlx5_core_dev *mdev;
+   int err;

-   mdev = container_of(clock, struct mlx5_core_dev, clock);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);

-   if (!mlx5_is_ptm_source_time_available(mdev))
-       return -EBUSY;
+   if (!mlx5_is_ptm_source_time_available(mdev)) {
+       err = -EBUSY;
+       goto unlock;
+   }

    ktime_get_snapshot(&history_begin);

-   return get_device_system_crosststamp(mlx5_mtctr_syncdevicetime, mdev,
-                        &history_begin, cts);
+   err = get_device_system_crosststamp(mlx5_mtctr_syncdevicetime, mdev,
+                       &history_begin, cts);
+unlock:
+   mlx5_clock_unlock(clock);
+   return err;
 }
 #endif /* CONFIG_X86 */
@@ -263,8 +347,7 @@ static u64 read_internal_timer(const struct cyclecounter *cc)
 {
    struct mlx5_timer *timer = container_of(cc, struct mlx5_timer, cycles);
    struct mlx5_clock *clock = container_of(timer, struct mlx5_clock, timer);
-   struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev,
-                         clock);
+   struct mlx5_core_dev *mdev = mlx5_clock_mdev_get(clock);

    return mlx5_read_time(mdev, NULL, false) & cc->mask;
 }
@@ -272,7 +355,7 @@ static u64 read_internal_timer(const struct cyclecounter *cc)
 static void mlx5_update_clock_info_page(struct mlx5_core_dev *mdev)
 {
    struct mlx5_ib_clock_info *clock_info = mdev->clock_info;
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;
    struct mlx5_timer *timer;
    u32 sign;

@@ -295,12 +378,10 @@ static void mlx5_update_clock_info_page(struct mlx5_core_dev *mdev)

 static void mlx5_pps_out(struct work_struct *work)
 {
-   struct mlx5_pps *pps_info = container_of(work, struct mlx5_pps,
-                        out_work);
-   struct mlx5_clock *clock = container_of(pps_info, struct mlx5_clock,
-                       pps_info);
-   struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev,
-                         clock);
+   struct mlx5_clock_dev_state *clock_state = container_of(work, struct mlx5_clock_dev_state,
+                               out_work);
+   struct mlx5_core_dev *mdev = clock_state->mdev;
+   struct mlx5_clock *clock = mdev->clock;
    u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
    unsigned long flags;
    int i;
@@ -330,7 +411,8 @@ static long mlx5_timestamp_overflow(struct ptp_clock_info *ptp_info)
    unsigned long flags;

    clock = container_of(ptp_info, struct mlx5_clock, ptp_info);
-   mdev = container_of(clock, struct mlx5_core_dev, clock);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);
    timer = &clock->timer;

    if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
@@ -342,6 +424,7 @@ static long mlx5_timestamp_overflow(struct ptp_clock_info *ptp_info)
    write_sequnlock_irqrestore(&clock->lock, flags);

 out:
+   mlx5_clock_unlock(clock);
    return timer->overflow_period;
 }

|
@ -361,15 +444,12 @@ static int mlx5_ptp_settime_real_time(struct mlx5_core_dev *mdev,
|
|||
return mlx5_set_mtutc(mdev, in, sizeof(in));
|
||||
}
|
||||
|
||||
static int mlx5_ptp_settime(struct ptp_clock_info *ptp, const struct timespec64 *ts)
|
||||
static int mlx5_clock_settime(struct mlx5_core_dev *mdev, struct mlx5_clock *clock,
|
||||
const struct timespec64 *ts)
|
||||
{
|
||||
struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
|
||||
struct mlx5_timer *timer = &clock->timer;
|
||||
struct mlx5_core_dev *mdev;
|
||||
unsigned long flags;
|
||||
|
||||
mdev = container_of(clock, struct mlx5_core_dev, clock);
|
||||
|
||||
if (mlx5_modify_mtutc_allowed(mdev)) {
|
||||
int err = mlx5_ptp_settime_real_time(mdev, ts);
|
||||
|
||||
|
|
@ -385,6 +465,20 @@ static int mlx5_ptp_settime(struct ptp_clock_info *ptp, const struct timespec64
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int mlx5_ptp_settime(struct ptp_clock_info *ptp, const struct timespec64 *ts)
|
||||
{
|
||||
struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
|
||||
struct mlx5_core_dev *mdev;
|
||||
int err;
|
||||
|
||||
mlx5_clock_lock(clock);
|
||||
mdev = mlx5_clock_mdev_get(clock);
|
||||
err = mlx5_clock_settime(mdev, clock, ts);
|
||||
mlx5_clock_unlock(clock);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static
|
||||
struct timespec64 mlx5_ptp_gettimex_real_time(struct mlx5_core_dev *mdev,
|
||||
struct ptp_system_timestamp *sts)
|
||||
|
|
@ -404,7 +498,8 @@ static int mlx5_ptp_gettimex(struct ptp_clock_info *ptp, struct timespec64 *ts,
|
|||
struct mlx5_core_dev *mdev;
|
||||
u64 cycles, ns;
|
||||
|
||||
mdev = container_of(clock, struct mlx5_core_dev, clock);
|
||||
mlx5_clock_lock(clock);
|
||||
mdev = mlx5_clock_mdev_get(clock);
|
||||
if (mlx5_real_time_mode(mdev)) {
|
||||
*ts = mlx5_ptp_gettimex_real_time(mdev, sts);
|
||||
goto out;
|
||||
|
|
@ -414,6 +509,7 @@ static int mlx5_ptp_gettimex(struct ptp_clock_info *ptp, struct timespec64 *ts,
|
|||
ns = mlx5_timecounter_cyc2time(clock, cycles);
|
||||
*ts = ns_to_timespec64(ns);
|
||||
out:
|
||||
mlx5_clock_unlock(clock);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
|
@@ -444,14 +540,16 @@ static int mlx5_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
    struct mlx5_timer *timer = &clock->timer;
    struct mlx5_core_dev *mdev;
    unsigned long flags;
+   int err = 0;

-   mdev = container_of(clock, struct mlx5_core_dev, clock);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);

    if (mlx5_modify_mtutc_allowed(mdev)) {
-       int err = mlx5_ptp_adjtime_real_time(mdev, delta);
+       err = mlx5_ptp_adjtime_real_time(mdev, delta);

        if (err)
-           return err;
+           goto unlock;
    }

    write_seqlock_irqsave(&clock->lock, flags);
@@ -459,17 +557,23 @@ static int mlx5_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
    mlx5_update_clock_info_page(mdev);
    write_sequnlock_irqrestore(&clock->lock, flags);

-   return 0;
+unlock:
+   mlx5_clock_unlock(clock);
+   return err;
 }

 static int mlx5_ptp_adjphase(struct ptp_clock_info *ptp, s32 delta)
 {
    struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
    struct mlx5_core_dev *mdev;
+   int err;

-   mdev = container_of(clock, struct mlx5_core_dev, clock);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);
+   err = mlx5_ptp_adjtime_real_time(mdev, delta);
+   mlx5_clock_unlock(clock);

-   return mlx5_ptp_adjtime_real_time(mdev, delta);
+   return err;
 }

 static int mlx5_ptp_freq_adj_real_time(struct mlx5_core_dev *mdev, long scaled_ppm)
@@ -498,15 +602,17 @@ static int mlx5_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
    struct mlx5_timer *timer = &clock->timer;
    struct mlx5_core_dev *mdev;
    unsigned long flags;
+   int err = 0;
    u32 mult;

-   mdev = container_of(clock, struct mlx5_core_dev, clock);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);

    if (mlx5_modify_mtutc_allowed(mdev)) {
-       int err = mlx5_ptp_freq_adj_real_time(mdev, scaled_ppm);
+       err = mlx5_ptp_freq_adj_real_time(mdev, scaled_ppm);

        if (err)
-           return err;
+           goto unlock;
    }

    mult = (u32)adjust_by_scaled_ppm(timer->nominal_c_mult, scaled_ppm);
@@ -518,7 +624,9 @@ static int mlx5_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
    write_sequnlock_irqrestore(&clock->lock, flags);
    ptp_schedule_worker(clock->ptp, timer->overflow_period);

-   return 0;
+unlock:
+   mlx5_clock_unlock(clock);
+   return err;
 }

 static int mlx5_extts_configure(struct ptp_clock_info *ptp,
@@ -527,18 +635,14 @@ static int mlx5_extts_configure(struct ptp_clock_info *ptp,
 {
    struct mlx5_clock *clock =
            container_of(ptp, struct mlx5_clock, ptp_info);
-   struct mlx5_core_dev *mdev =
-           container_of(clock, struct mlx5_core_dev, clock);
    u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
+   struct mlx5_core_dev *mdev;
    u32 field_select = 0;
    u8 pin_mode = 0;
    u8 pattern = 0;
    int pin = -1;
    int err = 0;

-   if (!MLX5_PPS_CAP(mdev))
-       return -EOPNOTSUPP;
-
    /* Reject requests with unsupported flags */
    if (rq->extts.flags & ~(PTP_ENABLE_FEATURE |
                PTP_RISING_EDGE |
@@ -569,6 +673,14 @@ static int mlx5_extts_configure(struct ptp_clock_info *ptp,
        field_select = MLX5_MTPPS_FS_ENABLE;
    }

+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);
+
+   if (!MLX5_PPS_CAP(mdev)) {
+       err = -EOPNOTSUPP;
+       goto unlock;
+   }
+
    MLX5_SET(mtpps_reg, in, pin, pin);
    MLX5_SET(mtpps_reg, in, pin_mode, pin_mode);
    MLX5_SET(mtpps_reg, in, pattern, pattern);
@@ -577,15 +689,23 @@ static int mlx5_extts_configure(struct ptp_clock_info *ptp,

    err = mlx5_set_mtpps(mdev, in, sizeof(in));
    if (err)
-       return err;
+       goto unlock;

-   return mlx5_set_mtppse(mdev, pin, 0,
-              MLX5_EVENT_MODE_REPETETIVE & on);
+   err = mlx5_set_mtppse(mdev, pin, 0, MLX5_EVENT_MODE_REPETETIVE & on);
+   if (err)
+       goto unlock;
+
+   clock->pps_info.pin_armed[pin] = on;
+   clock_priv(clock)->event_mdev = mdev;
+
+unlock:
+   mlx5_clock_unlock(clock);
+   return err;
 }

 static u64 find_target_cycles(struct mlx5_core_dev *mdev, s64 target_ns)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;
    u64 cycles_now, cycles_delta;
    u64 nsec_now, nsec_delta;
    struct mlx5_timer *timer;
@@ -644,7 +764,7 @@ static int mlx5_perout_conf_out_pulse_duration(struct mlx5_core_dev *mdev,
                           struct ptp_clock_request *rq,
                           u32 *out_pulse_duration_ns)
 {
-   struct mlx5_pps *pps_info = &mdev->clock.pps_info;
+   struct mlx5_pps *pps_info = &mdev->clock->pps_info;
    u32 out_pulse_duration;
    struct timespec64 ts;

@@ -677,7 +797,7 @@ static int perout_conf_npps_real_time(struct mlx5_core_dev *mdev, struct ptp_clo
                      u32 *field_select, u32 *out_pulse_duration_ns,
                      u64 *period, u64 *time_stamp)
 {
-   struct mlx5_pps *pps_info = &mdev->clock.pps_info;
+   struct mlx5_pps *pps_info = &mdev->clock->pps_info;
    struct ptp_clock_time *time = &rq->perout.start;
    struct timespec64 ts;

@@ -712,26 +832,18 @@ static int mlx5_perout_configure(struct ptp_clock_info *ptp,
 {
    struct mlx5_clock *clock =
            container_of(ptp, struct mlx5_clock, ptp_info);
-   struct mlx5_core_dev *mdev =
-           container_of(clock, struct mlx5_core_dev, clock);
-   bool rt_mode = mlx5_real_time_mode(mdev);
    u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
    u32 out_pulse_duration_ns = 0;
+   struct mlx5_core_dev *mdev;
    u32 field_select = 0;
    u64 npps_period = 0;
    u64 time_stamp = 0;
    u8 pin_mode = 0;
    u8 pattern = 0;
+   bool rt_mode;
    int pin = -1;
    int err = 0;

-   if (!MLX5_PPS_CAP(mdev))
-       return -EOPNOTSUPP;
-
-   /* Reject requests with unsupported flags */
-   if (mlx5_perout_verify_flags(mdev, rq->perout.flags))
-       return -EOPNOTSUPP;
-
    if (rq->perout.index >= clock->ptp_info.n_pins)
        return -EINVAL;

@@ -740,14 +852,29 @@ static int mlx5_perout_configure(struct ptp_clock_info *ptp,
    if (pin < 0)
        return -EBUSY;

-   if (on) {
-       bool rt_mode = mlx5_real_time_mode(mdev);
+   mlx5_clock_lock(clock);
+   mdev = mlx5_clock_mdev_get(clock);
+   rt_mode = mlx5_real_time_mode(mdev);

+   if (!MLX5_PPS_CAP(mdev)) {
+       err = -EOPNOTSUPP;
+       goto unlock;
+   }
+
+   /* Reject requests with unsupported flags */
+   if (mlx5_perout_verify_flags(mdev, rq->perout.flags)) {
+       err = -EOPNOTSUPP;
+       goto unlock;
+   }
+
+   if (on) {
        pin_mode = MLX5_PIN_MODE_OUT;
        pattern = MLX5_OUT_PATTERN_PERIODIC;

-       if (rt_mode && rq->perout.start.sec > U32_MAX)
-           return -EINVAL;
+       if (rt_mode && rq->perout.start.sec > U32_MAX) {
+           err = -EINVAL;
+           goto unlock;
+       }

        field_select |= MLX5_MTPPS_FS_PIN_MODE |
                MLX5_MTPPS_FS_PATTERN |
@@ -760,7 +887,7 @@ static int mlx5_perout_configure(struct ptp_clock_info *ptp,
        else
            err = perout_conf_1pps(mdev, rq, &time_stamp, rt_mode);
        if (err)
-           return err;
+           goto unlock;
    }

    MLX5_SET(mtpps_reg, in, pin, pin);
@@ -773,13 +900,16 @@ static int mlx5_perout_configure(struct ptp_clock_info *ptp,
    MLX5_SET(mtpps_reg, in, out_pulse_duration_ns, out_pulse_duration_ns);
    err = mlx5_set_mtpps(mdev, in, sizeof(in));
    if (err)
-       return err;
+       goto unlock;

    if (rt_mode)
-       return 0;
+       goto unlock;

-   return mlx5_set_mtppse(mdev, pin, 0,
-              MLX5_EVENT_MODE_REPETETIVE & on);
+   err = mlx5_set_mtppse(mdev, pin, 0, MLX5_EVENT_MODE_REPETETIVE & on);
+
+unlock:
+   mlx5_clock_unlock(clock);
+   return err;
 }

 static int mlx5_pps_configure(struct ptp_clock_info *ptp,
@@ -866,10 +996,8 @@ static int mlx5_query_mtpps_pin_mode(struct mlx5_core_dev *mdev, u8 pin,
                     mtpps_size, MLX5_REG_MTPPS, 0, 0);
 }

-static int mlx5_get_pps_pin_mode(struct mlx5_clock *clock, u8 pin)
+static int mlx5_get_pps_pin_mode(struct mlx5_core_dev *mdev, u8 pin)
 {
-   struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev, clock);
-
    u32 out[MLX5_ST_SZ_DW(mtpps_reg)] = {};
    u8 mode;
    int err;
@@ -888,8 +1016,9 @@ static int mlx5_get_pps_pin_mode(struct mlx5_clock *clock, u8 pin)
    return PTP_PF_NONE;
 }

-static void mlx5_init_pin_config(struct mlx5_clock *clock)
+static void mlx5_init_pin_config(struct mlx5_core_dev *mdev)
 {
+   struct mlx5_clock *clock = mdev->clock;
    int i;

    if (!clock->ptp_info.n_pins)
@@ -910,15 +1039,15 @@ static void mlx5_init_pin_config(struct mlx5_clock *clock)
             sizeof(clock->ptp_info.pin_config[i].name),
             "mlx5_pps%d", i);
        clock->ptp_info.pin_config[i].index = i;
-       clock->ptp_info.pin_config[i].func = mlx5_get_pps_pin_mode(clock, i);
+       clock->ptp_info.pin_config[i].func = mlx5_get_pps_pin_mode(mdev, i);
        clock->ptp_info.pin_config[i].chan = 0;
    }
 }

 static void mlx5_get_pps_caps(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
    u32 out[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
+   struct mlx5_clock *clock = mdev->clock;

    mlx5_query_mtpps(mdev, out, sizeof(out));

@@ -968,16 +1097,16 @@ static u64 perout_conf_next_event_timer(struct mlx5_core_dev *mdev,
 static int mlx5_pps_event(struct notifier_block *nb,
              unsigned long type, void *data)
 {
-   struct mlx5_clock *clock = mlx5_nb_cof(nb, struct mlx5_clock, pps_nb);
+   struct mlx5_clock_dev_state *clock_state = mlx5_nb_cof(nb, struct mlx5_clock_dev_state,
+                              pps_nb);
+   struct mlx5_core_dev *mdev = clock_state->mdev;
+   struct mlx5_clock *clock = mdev->clock;
    struct ptp_clock_event ptp_event;
    struct mlx5_eqe *eqe = data;
    int pin = eqe->data.pps.pin;
-   struct mlx5_core_dev *mdev;
    unsigned long flags;
    u64 ns;

-   mdev = container_of(clock, struct mlx5_core_dev, clock);
-
    switch (clock->ptp_info.pin_config[pin].func) {
    case PTP_PF_EXTTS:
        ptp_event.index = pin;
@@ -997,11 +1126,15 @@ static int mlx5_pps_event(struct notifier_block *nb,
        ptp_clock_event(clock->ptp, &ptp_event);
        break;
    case PTP_PF_PEROUT:
+       if (clock->shared) {
+           mlx5_core_warn(mdev, " Received unexpected PPS out event\n");
+           break;
+       }
        ns = perout_conf_next_event_timer(mdev, clock);
        write_seqlock_irqsave(&clock->lock, flags);
        clock->pps_info.start[pin] = ns;
        write_sequnlock_irqrestore(&clock->lock, flags);
-       schedule_work(&clock->pps_info.out_work);
+       schedule_work(&clock_state->out_work);
        break;
    default:
        mlx5_core_err(mdev, " Unhandled clock PPS event, func %d\n",
@@ -1013,7 +1146,7 @@ static int mlx5_pps_event(struct notifier_block *nb,

 static void mlx5_timecounter_init(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;
    struct mlx5_timer *timer = &clock->timer;
    u32 dev_freq;

@@ -1029,10 +1162,10 @@ static void mlx5_timecounter_init(struct mlx5_core_dev *mdev)
             ktime_to_ns(ktime_get_real()));
 }

-static void mlx5_init_overflow_period(struct mlx5_clock *clock)
+static void mlx5_init_overflow_period(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev, clock);
    struct mlx5_ib_clock_info *clock_info = mdev->clock_info;
+   struct mlx5_clock *clock = mdev->clock;
    struct mlx5_timer *timer = &clock->timer;
    u64 overflow_cycles;
    u64 frac = 0;
@@ -1065,7 +1198,7 @@ static void mlx5_init_overflow_period(struct mlx5_clock *clock)

 static void mlx5_init_clock_info(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;
    struct mlx5_ib_clock_info *info;
    struct mlx5_timer *timer;

@@ -1088,7 +1221,7 @@ static void mlx5_init_clock_info(struct mlx5_core_dev *mdev)

 static void mlx5_init_timer_max_freq_adjustment(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;
    u32 out[MLX5_ST_SZ_DW(mtutc_reg)] = {};
    u32 in[MLX5_ST_SZ_DW(mtutc_reg)] = {};
    u8 log_max_freq_adjustment = 0;
@@ -1107,7 +1240,7 @@ static void mlx5_init_timer_max_freq_adjustment(struct mlx5_core_dev *mdev)

 static void mlx5_init_timer_clock(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;

    /* Configure the PHC */
    clock->ptp_info = mlx5_ptp_clock_info;
@@ -1123,38 +1256,30 @@ static void mlx5_init_timer_clock(struct mlx5_core_dev *mdev)

    mlx5_timecounter_init(mdev);
    mlx5_init_clock_info(mdev);
-   mlx5_init_overflow_period(clock);
+   mlx5_init_overflow_period(mdev);

    if (mlx5_real_time_mode(mdev)) {
        struct timespec64 ts;

        ktime_get_real_ts64(&ts);
-       mlx5_ptp_settime(&clock->ptp_info, &ts);
+       mlx5_clock_settime(mdev, clock, &ts);
    }
 }

 static void mlx5_init_pps(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
-
    if (!MLX5_PPS_CAP(mdev))
        return;

    mlx5_get_pps_caps(mdev);
-   mlx5_init_pin_config(clock);
+   mlx5_init_pin_config(mdev);
 }

-void mlx5_init_clock(struct mlx5_core_dev *mdev)
+static void mlx5_init_clock_dev(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
-
-   if (!MLX5_CAP_GEN(mdev, device_frequency_khz)) {
-       mlx5_core_warn(mdev, "invalid device_frequency_khz, aborting HW clock init\n");
-       return;
-   }
+   struct mlx5_clock *clock = mdev->clock;

    seqlock_init(&clock->lock);
-   INIT_WORK(&clock->pps_info.out_work, mlx5_pps_out);

    /* Initialize the device clock */
    mlx5_init_timer_clock(mdev);
@@ -1163,35 +1288,27 @@ void mlx5_init_clock(struct mlx5_core_dev *mdev)
    mlx5_init_pps(mdev);

    clock->ptp = ptp_clock_register(&clock->ptp_info,
-                   &mdev->pdev->dev);
+                   clock->shared ? NULL : &mdev->pdev->dev);
    if (IS_ERR(clock->ptp)) {
-       mlx5_core_warn(mdev, "ptp_clock_register failed %ld\n",
+       mlx5_core_warn(mdev, "%sptp_clock_register failed %ld\n",
+                  clock->shared ? "shared clock " : "",
                   PTR_ERR(clock->ptp));
        clock->ptp = NULL;
    }

-   MLX5_NB_INIT(&clock->pps_nb, mlx5_pps_event, PPS_EVENT);
-   mlx5_eq_notifier_register(mdev, &clock->pps_nb);
-
    if (clock->ptp)
        ptp_schedule_worker(clock->ptp, 0);
 }

-void mlx5_cleanup_clock(struct mlx5_core_dev *mdev)
+static void mlx5_destroy_clock_dev(struct mlx5_core_dev *mdev)
 {
-   struct mlx5_clock *clock = &mdev->clock;
+   struct mlx5_clock *clock = mdev->clock;

-   if (!MLX5_CAP_GEN(mdev, device_frequency_khz))
-       return;
-
-   mlx5_eq_notifier_unregister(mdev, &clock->pps_nb);
    if (clock->ptp) {
        ptp_clock_unregister(clock->ptp);
        clock->ptp = NULL;
    }

-   cancel_work_sync(&clock->pps_info.out_work);
-
    if (mdev->clock_info) {
        free_page((unsigned long)mdev->clock_info);
        mdev->clock_info = NULL;
@@ -1199,3 +1316,248 @@ void mlx5_cleanup_clock(struct mlx5_core_dev *mdev)

    kfree(clock->ptp_info.pin_config);
 }
+
+static void mlx5_clock_free(struct mlx5_core_dev *mdev)
+{
+   struct mlx5_clock_priv *cpriv = clock_priv(mdev->clock);
+
+   mlx5_destroy_clock_dev(mdev);
+   mutex_destroy(&cpriv->lock);
+   kfree(cpriv);
+   mdev->clock = NULL;
+}
+
+static int mlx5_clock_alloc(struct mlx5_core_dev *mdev, bool shared)
+{
+   struct mlx5_clock_priv *cpriv;
+   struct mlx5_clock *clock;
+
+   cpriv = kzalloc(sizeof(*cpriv), GFP_KERNEL);
+   if (!cpriv)
+       return -ENOMEM;
+
+   mutex_init(&cpriv->lock);
+   cpriv->mdev = mdev;
+   clock = &cpriv->clock;
+   clock->shared = shared;
+   mdev->clock = clock;
+   mlx5_clock_lock(clock);
+   mlx5_init_clock_dev(mdev);
+   mlx5_clock_unlock(clock);
+
+   if (!clock->shared)
+       return 0;
+
+   if (!clock->ptp) {
+       mlx5_core_warn(mdev, "failed to create ptp dev shared by multiple functions");
+       mlx5_clock_free(mdev);
+       return -EINVAL;
+   }
+
+   return 0;
+}
+
+static void mlx5_shared_clock_register(struct mlx5_core_dev *mdev, u64 key)
+{
+   struct mlx5_core_dev *peer_dev, *next = NULL;
+   struct mlx5_devcom_comp_dev *pos;
+
+   mdev->clock_state->compdev = mlx5_devcom_register_component(mdev->priv.devc,
+                                   MLX5_DEVCOM_SHARED_CLOCK,
+                                   key, NULL, mdev);
+   if (IS_ERR(mdev->clock_state->compdev))
+       return;
+
+   mlx5_devcom_comp_lock(mdev->clock_state->compdev);
+   mlx5_devcom_for_each_peer_entry(mdev->clock_state->compdev, peer_dev, pos) {
+       if (peer_dev->clock) {
+           next = peer_dev;
+           break;
+       }
+   }
+
+   if (next) {
+       mdev->clock = next->clock;
+       /* clock info is shared among all the functions using the same clock */
+       mdev->clock_info = next->clock_info;
+   } else {
+       mlx5_clock_alloc(mdev, true);
+   }
+   mlx5_devcom_comp_unlock(mdev->clock_state->compdev);
+
+   if (!mdev->clock) {
+       mlx5_devcom_unregister_component(mdev->clock_state->compdev);
+       mdev->clock_state->compdev = NULL;
+   }
+}
+
+static void mlx5_shared_clock_unregister(struct mlx5_core_dev *mdev)
+{
+   struct mlx5_core_dev *peer_dev, *next = NULL;
+   struct mlx5_clock *clock = mdev->clock;
+   struct mlx5_devcom_comp_dev *pos;
+
+   mlx5_devcom_comp_lock(mdev->clock_state->compdev);
+   mlx5_devcom_for_each_peer_entry(mdev->clock_state->compdev, peer_dev, pos) {
+       if (peer_dev->clock && peer_dev != mdev) {
+           next = peer_dev;
+           break;
+       }
+   }
+
+   if (next) {
+       struct mlx5_clock_priv *cpriv = clock_priv(clock);
+
+       mlx5_clock_lock(clock);
+       if (mdev == cpriv->mdev)
+           cpriv->mdev = next;
+       mlx5_clock_unlock(clock);
+   } else {
+       mlx5_clock_free(mdev);
+   }
+
+   mdev->clock = NULL;
+   mdev->clock_info = NULL;
+   mlx5_devcom_comp_unlock(mdev->clock_state->compdev);
+
+   mlx5_devcom_unregister_component(mdev->clock_state->compdev);
+}
+
+static void mlx5_clock_arm_pps_in_event(struct mlx5_clock *clock,
+                   struct mlx5_core_dev *new_mdev,
+                   struct mlx5_core_dev *old_mdev)
+{
+   struct ptp_clock_info *ptp_info = &clock->ptp_info;
+   struct mlx5_clock_priv *cpriv = clock_priv(clock);
+   int i;
+
+   for (i = 0; i < ptp_info->n_pins; i++) {
+       if (ptp_info->pin_config[i].func != PTP_PF_EXTTS ||
+           !clock->pps_info.pin_armed[i])
+           continue;
+
+       if (new_mdev) {
+           mlx5_set_mtppse(new_mdev, i, 0, MLX5_EVENT_MODE_REPETETIVE);
+           cpriv->event_mdev = new_mdev;
+       } else {
+           cpriv->event_mdev = NULL;
+       }
+
+       if (old_mdev)
+           mlx5_set_mtppse(old_mdev, i, 0, MLX5_EVENT_MODE_DISABLE);
+   }
+}
+
+void mlx5_clock_load(struct mlx5_core_dev *mdev)
+{
+   struct mlx5_clock *clock = mdev->clock;
+   struct mlx5_clock_priv *cpriv;
+
+   if (!MLX5_CAP_GEN(mdev, device_frequency_khz))
+       return;
+
+   INIT_WORK(&mdev->clock_state->out_work, mlx5_pps_out);
+   MLX5_NB_INIT(&mdev->clock_state->pps_nb, mlx5_pps_event, PPS_EVENT);
+   mlx5_eq_notifier_register(mdev, &mdev->clock_state->pps_nb);
+
+   if (!clock->shared) {
+       mlx5_clock_arm_pps_in_event(clock, mdev, NULL);
+       return;
+   }
+
+   cpriv = clock_priv(clock);
+   mlx5_devcom_comp_lock(mdev->clock_state->compdev);
+   mlx5_clock_lock(clock);
+   if (mdev == cpriv->mdev && mdev != cpriv->event_mdev)
+       mlx5_clock_arm_pps_in_event(clock, mdev, cpriv->event_mdev);
+   mlx5_clock_unlock(clock);
+   mlx5_devcom_comp_unlock(mdev->clock_state->compdev);
+}
+
+void mlx5_clock_unload(struct mlx5_core_dev *mdev)
+{
+   struct mlx5_core_dev *peer_dev, *next = NULL;
+   struct mlx5_clock *clock = mdev->clock;
+   struct mlx5_devcom_comp_dev *pos;
+
+   if (!MLX5_CAP_GEN(mdev, device_frequency_khz))
+       return;
+
+   if (!clock->shared) {
+       mlx5_clock_arm_pps_in_event(clock, NULL, mdev);
+       goto out;
+   }
+
+   mlx5_devcom_comp_lock(mdev->clock_state->compdev);
+   mlx5_devcom_for_each_peer_entry(mdev->clock_state->compdev, peer_dev, pos) {
+       if (peer_dev->clock && peer_dev != mdev) {
+           next = peer_dev;
+           break;
+       }
+   }
+
+   mlx5_clock_lock(clock);
+   if (mdev == clock_priv(clock)->event_mdev)
+       mlx5_clock_arm_pps_in_event(clock, next, mdev);
+   mlx5_clock_unlock(clock);
+   mlx5_devcom_comp_unlock(mdev->clock_state->compdev);
+
+out:
+   mlx5_eq_notifier_unregister(mdev, &mdev->clock_state->pps_nb);
+   cancel_work_sync(&mdev->clock_state->out_work);
+}
+
+static struct mlx5_clock null_clock;
+
+int mlx5_init_clock(struct mlx5_core_dev *mdev)
+{
+   u8 identity[MLX5_RT_CLOCK_IDENTITY_SIZE];
+   struct mlx5_clock_dev_state *clock_state;
u64 key;
|
||||
int err;
|
||||
|
||||
if (!MLX5_CAP_GEN(mdev, device_frequency_khz)) {
|
||||
mdev->clock = &null_clock;
|
||||
mlx5_core_warn(mdev, "invalid device_frequency_khz, aborting HW clock init\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
clock_state = kzalloc(sizeof(*clock_state), GFP_KERNEL);
|
||||
if (!clock_state)
|
||||
return -ENOMEM;
|
||||
clock_state->mdev = mdev;
|
||||
mdev->clock_state = clock_state;
|
||||
|
||||
if (MLX5_CAP_MCAM_REG3(mdev, mrtcq) && mlx5_real_time_mode(mdev)) {
|
||||
if (mlx5_clock_identity_get(mdev, identity)) {
|
||||
mlx5_core_warn(mdev, "failed to get rt clock identity, create ptp dev per function\n");
|
||||
} else {
|
||||
memcpy(&key, &identity, sizeof(key));
|
||||
mlx5_shared_clock_register(mdev, key);
|
||||
}
|
||||
}
|
||||
|
||||
if (!mdev->clock) {
|
||||
err = mlx5_clock_alloc(mdev, false);
|
||||
if (err) {
|
||||
kfree(clock_state);
|
||||
mdev->clock_state = NULL;
|
||||
return err;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void mlx5_cleanup_clock(struct mlx5_core_dev *mdev)
|
||||
{
|
||||
if (!MLX5_CAP_GEN(mdev, device_frequency_khz))
|
||||
return;
|
||||
|
||||
if (mdev->clock->shared)
|
||||
mlx5_shared_clock_unregister(mdev);
|
||||
else
|
||||
mlx5_clock_free(mdev);
|
||||
kfree(mdev->clock_state);
|
||||
mdev->clock_state = NULL;
|
||||
}
|
||||
|
|
|
|||
|
|

@ -33,6 +33,35 @@
#ifndef __LIB_CLOCK_H__
#define __LIB_CLOCK_H__

#include <linux/ptp_clock_kernel.h>

#define MAX_PIN_NUM 8
struct mlx5_pps {
	u8 pin_caps[MAX_PIN_NUM];
	u64 start[MAX_PIN_NUM];
	u8 enabled;
	u64 min_npps_period;
	u64 min_out_pulse_duration_ns;
	bool pin_armed[MAX_PIN_NUM];
};

struct mlx5_timer {
	struct cyclecounter cycles;
	struct timecounter tc;
	u32 nominal_c_mult;
	unsigned long overflow_period;
};

struct mlx5_clock {
	seqlock_t lock;
	struct hwtstamp_config hwtstamp_config;
	struct ptp_clock *ptp;
	struct ptp_clock_info ptp_info;
	struct mlx5_pps pps_info;
	struct mlx5_timer timer;
	bool shared;
};

static inline bool mlx5_is_real_time_rq(struct mlx5_core_dev *mdev)
{
	u8 rq_ts_format_cap = MLX5_CAP_GEN(mdev, rq_ts_format);

@ -54,12 +83,14 @@ static inline bool mlx5_is_real_time_sq(struct mlx5_core_dev *mdev)
typedef ktime_t (*cqe_ts_to_ns)(struct mlx5_clock *, u64);

#if IS_ENABLED(CONFIG_PTP_1588_CLOCK)
void mlx5_init_clock(struct mlx5_core_dev *mdev);
int mlx5_init_clock(struct mlx5_core_dev *mdev);
void mlx5_cleanup_clock(struct mlx5_core_dev *mdev);
void mlx5_clock_load(struct mlx5_core_dev *mdev);
void mlx5_clock_unload(struct mlx5_core_dev *mdev);

static inline int mlx5_clock_get_ptp_index(struct mlx5_core_dev *mdev)
{
	return mdev->clock.ptp ? ptp_clock_index(mdev->clock.ptp) : -1;
	return mdev->clock->ptp ? ptp_clock_index(mdev->clock->ptp) : -1;
}

static inline ktime_t mlx5_timecounter_cyc2time(struct mlx5_clock *clock,

@ -87,8 +118,10 @@ static inline ktime_t mlx5_real_time_cyc2time(struct mlx5_clock *clock,
	return ns_to_ktime(time);
}
#else
static inline void mlx5_init_clock(struct mlx5_core_dev *mdev) {}
static inline int mlx5_init_clock(struct mlx5_core_dev *mdev) { return 0; }
static inline void mlx5_cleanup_clock(struct mlx5_core_dev *mdev) {}
static inline void mlx5_clock_load(struct mlx5_core_dev *mdev) {}
static inline void mlx5_clock_unload(struct mlx5_core_dev *mdev) {}
static inline int mlx5_clock_get_ptp_index(struct mlx5_core_dev *mdev)
{
	return -1;

@ -11,6 +11,7 @@ enum mlx5_devcom_component {
	MLX5_DEVCOM_MPV,
	MLX5_DEVCOM_HCA_PORTS,
	MLX5_DEVCOM_SD_GROUP,
	MLX5_DEVCOM_SHARED_CLOCK,
	MLX5_DEVCOM_NUM_COMPONENTS,
};

@ -1038,7 +1038,11 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)

	mlx5_init_reserved_gids(dev);

	mlx5_init_clock(dev);
	err = mlx5_init_clock(dev);
	if (err) {
		mlx5_core_err(dev, "failed to initialize hardware clock\n");
		goto err_tables_cleanup;
	}

	dev->vxlan = mlx5_vxlan_create(dev);
	dev->geneve = mlx5_geneve_create(dev);

@ -1046,7 +1050,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
	err = mlx5_init_rl_table(dev);
	if (err) {
		mlx5_core_err(dev, "Failed to init rate limiting\n");
		goto err_tables_cleanup;
		goto err_clock_cleanup;
	}

	err = mlx5_mpfs_init(dev);

@ -1123,10 +1127,11 @@ err_mpfs_cleanup:
	mlx5_mpfs_cleanup(dev);
err_rl_cleanup:
	mlx5_cleanup_rl_table(dev);
err_tables_cleanup:
err_clock_cleanup:
	mlx5_geneve_destroy(dev->geneve);
	mlx5_vxlan_destroy(dev->vxlan);
	mlx5_cleanup_clock(dev);
err_tables_cleanup:
	mlx5_cleanup_reserved_gids(dev);
	mlx5_cq_debugfs_cleanup(dev);
	mlx5_fw_reset_cleanup(dev);

@ -1359,6 +1364,8 @@ static int mlx5_load(struct mlx5_core_dev *dev)
		goto err_eq_table;
	}

	mlx5_clock_load(dev);

	err = mlx5_fw_tracer_init(dev->tracer);
	if (err) {
		mlx5_core_err(dev, "Failed to init FW tracer %d\n", err);

@ -1442,6 +1449,7 @@ err_fpga_start:
	mlx5_hv_vhca_cleanup(dev->hv_vhca);
	mlx5_fw_reset_events_stop(dev);
	mlx5_fw_tracer_cleanup(dev->tracer);
	mlx5_clock_unload(dev);
	mlx5_eq_table_destroy(dev);
err_eq_table:
	mlx5_irq_table_destroy(dev);

@ -1468,6 +1476,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
	mlx5_hv_vhca_cleanup(dev->hv_vhca);
	mlx5_fw_reset_events_stop(dev);
	mlx5_fw_tracer_cleanup(dev->tracer);
	mlx5_clock_unload(dev);
	mlx5_eq_table_destroy(dev);
	mlx5_irq_table_destroy(dev);
	mlx5_pagealloc_stop(dev);

@ -1105,6 +1105,9 @@ static const u32 mlx5e_ext_link_speed[MLX5E_EXT_LINK_MODES_NUMBER] = {
	[MLX5E_200GAUI_2_200GBASE_CR2_KR2] = 200000,
	[MLX5E_400GAUI_4_400GBASE_CR4_KR4] = 400000,
	[MLX5E_800GAUI_8_800GBASE_CR8_KR8] = 800000,
	[MLX5E_200GAUI_1_200GBASE_CR1_KR1] = 200000,
	[MLX5E_400GAUI_2_400GBASE_CR2_KR2] = 400000,
	[MLX5E_800GAUI_4_800GBASE_CR4_KR4] = 800000,
};

int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext,

@ -516,30 +516,6 @@ def_xa_destroy:
	return NULL;
}

/* Assure synchronization of the device steering tables with updates made by SW
 * insertion.
 */
int mlx5dr_domain_sync(struct mlx5dr_domain *dmn, u32 flags)
{
	int ret = 0;

	if (flags & MLX5DR_DOMAIN_SYNC_FLAGS_SW) {
		mlx5dr_domain_lock(dmn);
		ret = mlx5dr_send_ring_force_drain(dmn);
		mlx5dr_domain_unlock(dmn);
		if (ret) {
			mlx5dr_err(dmn, "Force drain failed flags: %d, ret: %d\n",
				   flags, ret);
			return ret;
		}
	}

	if (flags & MLX5DR_DOMAIN_SYNC_FLAGS_HW)
		ret = mlx5dr_cmd_sync_steering(dmn->mdev);

	return ret;
}

int mlx5dr_domain_destroy(struct mlx5dr_domain *dmn)
{
	if (WARN_ON_ONCE(refcount_read(&dmn->refcount) > 1))

@ -1331,36 +1331,3 @@ void mlx5dr_send_ring_free(struct mlx5dr_domain *dmn,
	kfree(send_ring->sync_buff);
	kfree(send_ring);
}

int mlx5dr_send_ring_force_drain(struct mlx5dr_domain *dmn)
{
	struct mlx5dr_send_ring *send_ring = dmn->send_ring;
	struct postsend_info send_info = {};
	u8 data[DR_STE_SIZE];
	int num_of_sends_req;
	int ret;
	int i;

	/* Sending this amount of requests makes sure we will get drain */
	num_of_sends_req = send_ring->signal_th * TH_NUMS_TO_DRAIN / 2;

	/* Send fake requests forcing the last to be signaled */
	send_info.write.addr = (uintptr_t)data;
	send_info.write.length = DR_STE_SIZE;
	send_info.write.lkey = 0;
	/* Using the sync_mr in order to write/read */
	send_info.remote_addr = (uintptr_t)send_ring->sync_mr->addr;
	send_info.rkey = send_ring->sync_mr->mkey;

	for (i = 0; i < num_of_sends_req; i++) {
		ret = dr_postsend_icm_data(dmn, &send_info);
		if (ret)
			return ret;
	}

	spin_lock(&send_ring->lock);
	ret = dr_handle_pending_wc(dmn, send_ring);
	spin_unlock(&send_ring->lock);

	return ret;
}

@ -1473,7 +1473,6 @@ struct mlx5dr_send_ring {
int mlx5dr_send_ring_alloc(struct mlx5dr_domain *dmn);
void mlx5dr_send_ring_free(struct mlx5dr_domain *dmn,
			   struct mlx5dr_send_ring *send_ring);
int mlx5dr_send_ring_force_drain(struct mlx5dr_domain *dmn);
int mlx5dr_send_postsend_ste(struct mlx5dr_domain *dmn,
			     struct mlx5dr_ste *ste,
			     u8 *data,

@ -45,8 +45,6 @@ mlx5dr_domain_create(struct mlx5_core_dev *mdev, enum mlx5dr_domain_type type);

int mlx5dr_domain_destroy(struct mlx5dr_domain *domain);

int mlx5dr_domain_sync(struct mlx5dr_domain *domain, u32 flags);

void mlx5dr_domain_set_peer(struct mlx5dr_domain *dmn,
			    struct mlx5dr_domain *peer_dmn,
			    u16 peer_vhca_id);

@ -754,9 +754,6 @@ void
mlxsw_sp_port_vlan_router_leave(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan);
void mlxsw_sp_rif_destroy_by_dev(struct mlxsw_sp *mlxsw_sp,
				 struct net_device *dev);
bool mlxsw_sp_rif_exists(struct mlxsw_sp *mlxsw_sp,
			 const struct net_device *dev);
u16 mlxsw_sp_rif_vid(struct mlxsw_sp *mlxsw_sp, const struct net_device *dev);
u16 mlxsw_sp_router_port(const struct mlxsw_sp *mlxsw_sp);
int mlxsw_sp_router_nve_promote_decap(struct mlxsw_sp *mlxsw_sp, u32 ul_tb_id,
				      enum mlxsw_sp_l3proto ul_proto,

@ -8184,41 +8184,6 @@ mlxsw_sp_rif_find_by_dev(const struct mlxsw_sp *mlxsw_sp,
	return NULL;
}

bool mlxsw_sp_rif_exists(struct mlxsw_sp *mlxsw_sp,
			 const struct net_device *dev)
{
	struct mlxsw_sp_rif *rif;

	mutex_lock(&mlxsw_sp->router->lock);
	rif = mlxsw_sp_rif_find_by_dev(mlxsw_sp, dev);
	mutex_unlock(&mlxsw_sp->router->lock);

	return rif;
}

u16 mlxsw_sp_rif_vid(struct mlxsw_sp *mlxsw_sp, const struct net_device *dev)
{
	struct mlxsw_sp_rif *rif;
	u16 vid = 0;

	mutex_lock(&mlxsw_sp->router->lock);
	rif = mlxsw_sp_rif_find_by_dev(mlxsw_sp, dev);
	if (!rif)
		goto out;

	/* We only return the VID for VLAN RIFs. Otherwise we return an
	 * invalid value (0).
	 */
	if (rif->ops->type != MLXSW_SP_RIF_TYPE_VLAN)
		goto out;

	vid = mlxsw_sp_fid_8021q_vid(rif->fid);

out:
	mutex_unlock(&mlxsw_sp->router->lock);
	return vid;
}

static int mlxsw_sp_router_rif_disable(struct mlxsw_sp *mlxsw_sp, u16 rif)
{
	char ritr_pl[MLXSW_REG_RITR_LEN];

@ -8417,19 +8382,6 @@ u16 mlxsw_sp_ipip_lb_rif_index(const struct mlxsw_sp_rif_ipip_lb *lb_rif)
	return lb_rif->common.rif_index;
}

u16 mlxsw_sp_ipip_lb_ul_vr_id(const struct mlxsw_sp_rif_ipip_lb *lb_rif)
{
	struct net_device *dev = mlxsw_sp_rif_dev(&lb_rif->common);
	u32 ul_tb_id = mlxsw_sp_ipip_dev_ul_tb_id(dev);
	struct mlxsw_sp_vr *ul_vr;

	ul_vr = mlxsw_sp_vr_get(lb_rif->common.mlxsw_sp, ul_tb_id, NULL);
	if (WARN_ON(IS_ERR(ul_vr)))
		return 0;

	return ul_vr->id;
}

u16 mlxsw_sp_ipip_lb_ul_rif_id(const struct mlxsw_sp_rif_ipip_lb *lb_rif)
{
	return lb_rif->ul_rif_id;

@ -90,7 +90,6 @@ struct mlxsw_sp_ipip_entry;
struct mlxsw_sp_rif *mlxsw_sp_rif_by_index(const struct mlxsw_sp *mlxsw_sp,
					   u16 rif_index);
u16 mlxsw_sp_ipip_lb_rif_index(const struct mlxsw_sp_rif_ipip_lb *rif);
u16 mlxsw_sp_ipip_lb_ul_vr_id(const struct mlxsw_sp_rif_ipip_lb *rif);
u16 mlxsw_sp_ipip_lb_ul_rif_id(const struct mlxsw_sp_rif_ipip_lb *lb_rif);
u32 mlxsw_sp_ipip_dev_ul_tb_id(const struct net_device *ol_dev);
int mlxsw_sp_rif_dev_ifindex(const struct mlxsw_sp_rif *rif);

@ -10,6 +10,40 @@

static struct dentry *fbnic_dbg_root;

static void fbnic_dbg_desc_break(struct seq_file *s, int i)
{
	while (i--)
		seq_putc(s, '-');

	seq_putc(s, '\n');
}

static int fbnic_dbg_mac_addr_show(struct seq_file *s, void *v)
{
	struct fbnic_dev *fbd = s->private;
	char hdr[80];
	int i;

	/* Generate Header */
	snprintf(hdr, sizeof(hdr), "%3s %s %-17s %s\n",
		 "Idx", "S", "TCAM Bitmap", "Addr/Mask");
	seq_puts(s, hdr);
	fbnic_dbg_desc_break(s, strnlen(hdr, sizeof(hdr)));

	for (i = 0; i < FBNIC_RPC_TCAM_MACDA_NUM_ENTRIES; i++) {
		struct fbnic_mac_addr *mac_addr = &fbd->mac_addr[i];

		seq_printf(s, "%02d %d %64pb %pm\n",
			   i, mac_addr->state, mac_addr->act_tcam,
			   mac_addr->value.addr8);
		seq_printf(s, " %pm\n",
			   mac_addr->mask.addr8);
	}

	return 0;
}
DEFINE_SHOW_ATTRIBUTE(fbnic_dbg_mac_addr);

static int fbnic_dbg_pcie_stats_show(struct seq_file *s, void *v)
{
	struct fbnic_dev *fbd = s->private;

@ -48,6 +82,8 @@ void fbnic_dbg_fbd_init(struct fbnic_dev *fbd)
	fbd->dbg_fbd = debugfs_create_dir(name, fbnic_dbg_root);
	debugfs_create_file("pcie_stats", 0400, fbd->dbg_fbd, fbd,
			    &fbnic_dbg_pcie_stats_fops);
	debugfs_create_file("mac_addr", 0400, fbd->dbg_fbd, fbd,
			    &fbnic_dbg_mac_addr_fops);
}

void fbnic_dbg_fbd_exit(struct fbnic_dev *fbd)

@ -628,6 +628,8 @@ struct net_device *fbnic_netdev_alloc(struct fbnic_dev *fbd)
	fbnic_rss_key_fill(fbn->rss_key);
	fbnic_rss_init_en_mask(fbn);

	netdev->priv_flags |= IFF_UNICAST_FLT;

	netdev->features |=
		NETIF_F_RXHASH |
		NETIF_F_SG |

@ -3033,7 +3033,7 @@ static void qed_iov_vf_mbx_vport_update(struct qed_hwfn *p_hwfn,
	u16 length;
	int rc;

	/* Valiate PF can send such a request */
	/* Validate PF can send such a request */
	if (!vf->vport_instance) {
		DP_VERBOSE(p_hwfn,
			   QED_MSG_IOV,

@ -3312,7 +3312,7 @@ static void qed_iov_vf_mbx_ucast_filter(struct qed_hwfn *p_hwfn,
		goto out;
	}

	/* Determine if the unicast filtering is acceptible by PF */
	/* Determine if the unicast filtering is acceptable by PF */
	if ((p_bulletin->valid_bitmap & BIT(VLAN_ADDR_FORCED)) &&
	    (params.type == QED_FILTER_VLAN ||
	     params.type == QED_FILTER_MAC_VLAN)) {

@ -3729,7 +3729,7 @@ qed_iov_execute_vf_flr_cleanup(struct qed_hwfn *p_hwfn,

	rc = qed_iov_enable_vf_access(p_hwfn, p_ptt, p_vf);
	if (rc) {
		DP_ERR(p_hwfn, "Failed to re-enable VF[%d] acces\n",
		DP_ERR(p_hwfn, "Failed to re-enable VF[%d] access\n",
		       vfid);
		return rc;
	}

@ -4480,7 +4480,7 @@ int qed_sriov_disable(struct qed_dev *cdev, bool pci_enabled)
		struct qed_ptt *ptt = qed_ptt_acquire(hwfn);

		/* Failure to acquire the ptt in 100g creates an odd error
		 * where the first engine has already relased IOV.
		 * where the first engine has already released IOV.
		 */
		if (!ptt) {
			DP_ERR(hwfn, "Failed to acquire ptt\n");

@ -114,7 +114,8 @@ config R8169
	  will be called r8169. This is recommended.

config R8169_LEDS
	def_bool R8169 && LEDS_TRIGGER_NETDEV
	bool "Support for controlling the NIC LEDs"
	depends on R8169 && LEDS_TRIGGER_NETDEV
	depends on !(R8169=y && LEDS_CLASS=m)
	help
	  Optional support for controlling the NIC LED's with the netdev

@ -5222,6 +5222,7 @@ static int r8169_mdio_register(struct rtl8169_private *tp)
	new_bus->priv = tp;
	new_bus->parent = &pdev->dev;
	new_bus->irq[0] = PHY_MAC_INTERRUPT;
	new_bus->phy_mask = GENMASK(31, 1);
	snprintf(new_bus->id, MII_BUS_ID_SIZE, "r8169-%x-%x",
		 pci_domain_nr(pdev->bus), pci_dev_id(pdev));

@ -13,7 +13,7 @@
 */
const char *phy_speed_to_str(int speed)
{
	BUILD_BUG_ON_MSG(__ETHTOOL_LINK_MODE_MASK_NBITS != 103,
	BUILD_BUG_ON_MSG(__ETHTOOL_LINK_MODE_MASK_NBITS != 121,
		"Enum ethtool_link_mode_bit_indices and phylib are out of sync. "
		"If a speed or mode has been added please update phy_speed_to_str "
		"and the PHY settings array.\n");

@ -169,6 +169,12 @@ static const struct phy_setting settings[] = {
	PHY_SETTING( 800000, FULL, 800000baseDR8_2_Full ),
	PHY_SETTING( 800000, FULL, 800000baseSR8_Full ),
	PHY_SETTING( 800000, FULL, 800000baseVR8_Full ),
	PHY_SETTING( 800000, FULL, 800000baseCR4_Full ),
	PHY_SETTING( 800000, FULL, 800000baseKR4_Full ),
	PHY_SETTING( 800000, FULL, 800000baseDR4_Full ),
	PHY_SETTING( 800000, FULL, 800000baseDR4_2_Full ),
	PHY_SETTING( 800000, FULL, 800000baseSR4_Full ),
	PHY_SETTING( 800000, FULL, 800000baseVR4_Full ),
	/* 400G */
	PHY_SETTING( 400000, FULL, 400000baseCR8_Full ),
	PHY_SETTING( 400000, FULL, 400000baseKR8_Full ),

@ -180,6 +186,12 @@ static const struct phy_setting settings[] = {
	PHY_SETTING( 400000, FULL, 400000baseLR4_ER4_FR4_Full ),
	PHY_SETTING( 400000, FULL, 400000baseDR4_Full ),
	PHY_SETTING( 400000, FULL, 400000baseSR4_Full ),
	PHY_SETTING( 400000, FULL, 400000baseCR2_Full ),
	PHY_SETTING( 400000, FULL, 400000baseKR2_Full ),
	PHY_SETTING( 400000, FULL, 400000baseDR2_Full ),
	PHY_SETTING( 400000, FULL, 400000baseDR2_2_Full ),
	PHY_SETTING( 400000, FULL, 400000baseSR2_Full ),
	PHY_SETTING( 400000, FULL, 400000baseVR2_Full ),
	/* 200G */
	PHY_SETTING( 200000, FULL, 200000baseCR4_Full ),
	PHY_SETTING( 200000, FULL, 200000baseKR4_Full ),

@ -191,6 +203,12 @@ static const struct phy_setting settings[] = {
	PHY_SETTING( 200000, FULL, 200000baseLR2_ER2_FR2_Full ),
	PHY_SETTING( 200000, FULL, 200000baseDR2_Full ),
	PHY_SETTING( 200000, FULL, 200000baseSR2_Full ),
	PHY_SETTING( 200000, FULL, 200000baseCR_Full ),
	PHY_SETTING( 200000, FULL, 200000baseKR_Full ),
	PHY_SETTING( 200000, FULL, 200000baseDR_Full ),
	PHY_SETTING( 200000, FULL, 200000baseDR_2_Full ),
	PHY_SETTING( 200000, FULL, 200000baseSR_Full ),
	PHY_SETTING( 200000, FULL, 200000baseVR_Full ),
	/* 100G */
	PHY_SETTING( 100000, FULL, 100000baseCR4_Full ),
	PHY_SETTING( 100000, FULL, 100000baseKR4_Full ),

@ -4,8 +4,12 @@ config REALTEK_PHY
	help
	  Currently supports RTL821x/RTL822x and fast ethernet PHYs

if REALTEK_PHY

config REALTEK_PHY_HWMON
	def_bool REALTEK_PHY && HWMON
	depends on !(REALTEK_PHY=y && HWMON=m)
	bool "HWMON support for Realtek PHYs"
	depends on HWMON && !(REALTEK_PHY=y && HWMON=m)
	help
	  Optional hwmon support for the temperature sensor

endif # REALTEK_PHY

@ -13,6 +13,7 @@
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/clk.h>
#include <linux/string_choices.h>

#include "realtek.h"

@ -422,11 +423,11 @@ static int rtl8211f_config_init(struct phy_device *phydev)
	} else if (ret) {
		dev_dbg(dev,
			"%s 2ns TX delay (and changing the value from pin-strapping RXD1 or the bootloader)\n",
			val_txdly ? "Enabling" : "Disabling");
			str_enable_disable(val_txdly));
	} else {
		dev_dbg(dev,
			"2ns TX delay was already %s (by pin-strapping RXD1 or bootloader configuration)\n",
			val_txdly ? "enabled" : "disabled");
			str_enabled_disabled(val_txdly));
	}

	ret = phy_modify_paged_changed(phydev, 0xd08, 0x15, RTL8211F_RX_DELAY,

@ -437,11 +438,11 @@ static int rtl8211f_config_init(struct phy_device *phydev)
	} else if (ret) {
		dev_dbg(dev,
			"%s 2ns RX delay (and changing the value from pin-strapping RXD0 or the bootloader)\n",
			val_rxdly ? "Enabling" : "Disabling");
			str_enable_disable(val_rxdly));
	} else {
		dev_dbg(dev,
			"2ns RX delay was already %s (by pin-strapping RXD0 or bootloader configuration)\n",
			val_rxdly ? "enabled" : "disabled");
			str_enabled_disabled(val_rxdly));
	}

	if (priv->has_phycr2) {

@ -227,9 +227,9 @@ static int vxlan_fdb_info(struct sk_buff *skb, struct vxlan_dev *vxlan,
			  be32_to_cpu(fdb->vni)))
		goto nla_put_failure;

	ci.ndm_used = jiffies_to_clock_t(now - fdb->used);
	ci.ndm_used = jiffies_to_clock_t(now - READ_ONCE(fdb->used));
	ci.ndm_confirmed = 0;
	ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated);
	ci.ndm_updated = jiffies_to_clock_t(now - READ_ONCE(fdb->updated));
	ci.ndm_refcnt = 0;

	if (nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))

@ -434,8 +434,12 @@ static struct vxlan_fdb *vxlan_find_mac(struct vxlan_dev *vxlan,
	struct vxlan_fdb *f;

	f = __vxlan_find_mac(vxlan, mac, vni);
	if (f && f->used != jiffies)
		f->used = jiffies;
	if (f) {
		unsigned long now = jiffies;

		if (READ_ONCE(f->used) != now)
			WRITE_ONCE(f->used, now);
	}

	return f;
}

@ -1009,12 +1013,10 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
	    !(f->flags & NTF_VXLAN_ADDED_BY_USER)) {
		if (f->state != state) {
			f->state = state;
			f->updated = jiffies;
			notify = 1;
		}
		if (f->flags != fdb_flags) {
			f->flags = fdb_flags;
			f->updated = jiffies;
			notify = 1;
		}
	}

@ -1048,12 +1050,13 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
	}

	if (ndm_flags & NTF_USE)
		f->used = jiffies;
		WRITE_ONCE(f->updated, jiffies);

	if (notify) {
		if (rd == NULL)
			rd = first_remote_rtnl(f);

		WRITE_ONCE(f->updated, jiffies);
		err = vxlan_fdb_notify(vxlan, f, rd, RTM_NEWNEIGH,
				       swdev_notify, extack);
		if (err)

@ -1292,7 +1295,7 @@ int __vxlan_fdb_delete(struct vxlan_dev *vxlan,
	struct vxlan_fdb *f;
	int err = -ENOENT;

	f = vxlan_find_mac(vxlan, addr, src_vni);
	f = __vxlan_find_mac(vxlan, addr, src_vni);
	if (!f)
		return err;

@ -1459,9 +1462,13 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
	ifindex = src_ifindex;
#endif

	f = vxlan_find_mac(vxlan, src_mac, vni);
	f = __vxlan_find_mac(vxlan, src_mac, vni);
	if (likely(f)) {
		struct vxlan_rdst *rdst = first_remote_rcu(f);
		unsigned long now = jiffies;

		if (READ_ONCE(f->updated) != now)
			WRITE_ONCE(f->updated, now);

		if (likely(vxlan_addr_equal(&rdst->remote_ip, src_ip) &&
			   rdst->remote_ifindex == ifindex))

@ -1481,7 +1488,6 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
			    src_mac, &rdst->remote_ip.sa, &src_ip->sa);

		rdst->remote_ip = *src_ip;
		f->updated = jiffies;
		vxlan_fdb_notify(vxlan, f, rdst, RTM_NEWNEIGH, true, NULL);
	} else {
		u32 hash_index = fdb_head_index(vxlan, src_mac, vni);

@ -2852,7 +2858,7 @@ static void vxlan_cleanup(struct timer_list *t)
			if (f->flags & NTF_EXT_LEARNED)
				continue;

			timeout = f->used + vxlan->cfg.age_interval * HZ;
			timeout = READ_ONCE(f->updated) + vxlan->cfg.age_interval * HZ;
			if (time_before_eq(timeout, jiffies)) {
				netdev_dbg(vxlan->dev,
					   "garbage collect %pM\n",

@ -4768,7 +4774,7 @@ vxlan_fdb_offloaded_set(struct net_device *dev,

	spin_lock_bh(&vxlan->hash_lock[hash_index]);

	f = vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
	f = __vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
	if (!f)
		goto out;

@ -4824,7 +4830,7 @@ vxlan_fdb_external_learn_del(struct net_device *dev,
	hash_index = fdb_head_index(vxlan, fdb_info->eth_addr, fdb_info->vni);
	spin_lock_bh(&vxlan->hash_lock[hash_index]);

	f = vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
	f = __vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
	if (!f)
		err = -ENOENT;
	else if (f->flags & NTF_EXT_LEARNED)
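
The recurring change in the vxlan hunks above is worth spelling out: lockless readers and writers of fdb->used and fdb->updated now go through READ_ONCE()/WRITE_ONCE(), and the timestamp is only stored when it actually changes. A minimal sketch of that pattern follows, assuming only the standard kernel helpers; the struct and function names here are illustrative, not taken from the patch.

#include <linux/compiler.h>
#include <linux/jiffies.h>

struct entry {
	unsigned long ts;	/* written on the fast path, read locklessly */
};

static void entry_touch(struct entry *e)
{
	unsigned long now = jiffies;

	/* Check before storing so an already-fresh entry costs no write and
	 * its cacheline stays clean; READ_ONCE()/WRITE_ONCE() keep the
	 * concurrent, unlocked accesses well-defined and prevent tearing.
	 */
	if (READ_ONCE(e->ts) != now)
		WRITE_ONCE(e->ts, now);
}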

@ -2,15 +2,6 @@
menu "S/390 network device drivers"
	depends on NETDEVICES && S390

config LCS
	def_tristate m
	prompt "Lan Channel Station Interface"
	depends on CCW && NETDEVICES && ETHERNET
	help
	  Select this option if you want to use LCS networking on IBM System z.
	  To compile as a module, choose M. The module name is lcs.
	  If you do not use LCS, choose N.

config CTCM
	def_tristate m
	prompt "CTC and MPC SNA device support"

@ -98,7 +89,7 @@ config QETH_OSX

config CCWGROUP
	tristate
	default (LCS || CTCM || QETH || SMC)
	default (CTCM || QETH || SMC)

config ISM
	tristate "Support for ISM vPCI Adapter"

@ -8,7 +8,6 @@ obj-$(CONFIG_CTCM) += ctcm.o fsm.o
obj-$(CONFIG_NETIUCV) += netiucv.o fsm.o
obj-$(CONFIG_SMSGIUCV) += smsgiucv.o
obj-$(CONFIG_SMSGIUCV_EVENT) += smsgiucv_app.o
obj-$(CONFIG_LCS) += lcs.o
qeth-y += qeth_core_sys.o qeth_core_main.o qeth_core_mpc.o qeth_ethtool.o
obj-$(CONFIG_QETH) += qeth.o
qeth_l2-y += qeth_l2_main.o qeth_l2_sys.o

(File diff suppressed because it is too large.)

@ -1,342 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*lcs.h*/

#include <linux/interrupt.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/workqueue.h>
#include <linux/refcount.h>
#include <asm/ccwdev.h>

#define LCS_DBF_TEXT(level, name, text) \
	do { \
		debug_text_event(lcs_dbf_##name, level, text); \
	} while (0)

#define LCS_DBF_HEX(level,name,addr,len) \
	do { \
		debug_event(lcs_dbf_##name,level,(void*)(addr),len); \
	} while (0)

#define LCS_DBF_TEXT_(level,name,text...) \
	do { \
		if (debug_level_enabled(lcs_dbf_##name, level)) { \
			scnprintf(debug_buffer, sizeof(debug_buffer), text); \
			debug_text_event(lcs_dbf_##name, level, debug_buffer); \
		} \
	} while (0)

/**
 * sysfs related stuff
 */
#define CARD_FROM_DEV(cdev) \
	(struct lcs_card *) dev_get_drvdata( \
		&((struct ccwgroup_device *)dev_get_drvdata(&cdev->dev))->dev);

/**
 * Enum for classifying detected devices.
 */
enum lcs_channel_types {
	/* Device is not a channel */
	lcs_channel_type_none,

	/* Device is a 2216 channel */
	lcs_channel_type_parallel,

	/* Device is a 2216 channel */
	lcs_channel_type_2216,

	/* Device is a OSA2 card */
	lcs_channel_type_osa2
};

/**
 * CCW commands used in this driver
 */
#define LCS_CCW_WRITE 0x01
#define LCS_CCW_READ 0x02
#define LCS_CCW_TRANSFER 0x08

/**
 * LCS device status primitives
 */
#define LCS_CMD_STARTLAN 0x01
#define LCS_CMD_STOPLAN 0x02
#define LCS_CMD_LANSTAT 0x04
#define LCS_CMD_STARTUP 0x07
#define LCS_CMD_SHUTDOWN 0x08
#define LCS_CMD_QIPASSIST 0xb2
#define LCS_CMD_SETIPM 0xb4
#define LCS_CMD_DELIPM 0xb5

#define LCS_INITIATOR_TCPIP 0x00
#define LCS_INITIATOR_LGW 0x01
#define LCS_STD_CMD_SIZE 16
#define LCS_MULTICAST_CMD_SIZE 404

/**
 * LCS IPASSIST MASKS,only used when multicast is switched on
 */
/* Not supported by LCS */
#define LCS_IPASS_ARP_PROCESSING 0x0001
#define LCS_IPASS_IN_CHECKSUM_SUPPORT 0x0002
#define LCS_IPASS_OUT_CHECKSUM_SUPPORT 0x0004
#define LCS_IPASS_IP_FRAG_REASSEMBLY 0x0008
#define LCS_IPASS_IP_FILTERING 0x0010
/* Supported by lcs 3172 */
#define LCS_IPASS_IPV6_SUPPORT 0x0020
#define LCS_IPASS_MULTICAST_SUPPORT 0x0040

/**
 * LCS sense byte definitions
 */
#define LCS_SENSE_BYTE_0 0
#define LCS_SENSE_BYTE_1 1
#define LCS_SENSE_BYTE_2 2
#define LCS_SENSE_BYTE_3 3
#define LCS_SENSE_INTERFACE_DISCONNECT 0x01
#define LCS_SENSE_EQUIPMENT_CHECK 0x10
#define LCS_SENSE_BUS_OUT_CHECK 0x20
#define LCS_SENSE_INTERVENTION_REQUIRED 0x40
#define LCS_SENSE_CMD_REJECT 0x80
#define LCS_SENSE_RESETTING_EVENT 0x80
#define LCS_SENSE_DEVICE_ONLINE 0x20

/**
 * LCS packet type definitions
 */
#define LCS_FRAME_TYPE_CONTROL 0
#define LCS_FRAME_TYPE_ENET 1
#define LCS_FRAME_TYPE_TR 2
#define LCS_FRAME_TYPE_FDDI 7
#define LCS_FRAME_TYPE_AUTO -1

/**
 * some more definitions,we will sort them later
 */
#define LCS_ILLEGAL_OFFSET 0xffff
#define LCS_IOBUFFERSIZE 0x5000
#define LCS_NUM_BUFFS 32 /* needs to be power of 2 */
#define LCS_MAC_LENGTH 6
#define LCS_INVALID_PORT_NO -1
#define LCS_LANCMD_TIMEOUT_DEFAULT 5

/**
 * Multicast state
 */
#define LCS_IPM_STATE_SET_REQUIRED 0
#define LCS_IPM_STATE_DEL_REQUIRED 1
#define LCS_IPM_STATE_ON_CARD 2

/**
 * LCS IP Assist declarations
 * seems to be only used for multicast
 */
#define LCS_IPASS_ARP_PROCESSING 0x0001
#define LCS_IPASS_INBOUND_CSUM_SUPP 0x0002
#define LCS_IPASS_OUTBOUND_CSUM_SUPP 0x0004
#define LCS_IPASS_IP_FRAG_REASSEMBLY 0x0008
#define LCS_IPASS_IP_FILTERING 0x0010
#define LCS_IPASS_IPV6_SUPPORT 0x0020
#define LCS_IPASS_MULTICAST_SUPPORT 0x0040

/**
 * LCS Buffer states
 */
enum lcs_buffer_states {
	LCS_BUF_STATE_EMPTY, /* buffer is empty */
	LCS_BUF_STATE_LOCKED, /* buffer is locked, don't touch */
	LCS_BUF_STATE_READY, /* buffer is ready for read/write */
	LCS_BUF_STATE_PROCESSED,
};

/**
 * LCS Channel State Machine declarations
 */
enum lcs_channel_states {
	LCS_CH_STATE_INIT,
	LCS_CH_STATE_HALTED,
	LCS_CH_STATE_STOPPED,
	LCS_CH_STATE_RUNNING,
	LCS_CH_STATE_SUSPENDED,
	LCS_CH_STATE_CLEARED,
	LCS_CH_STATE_ERROR,
};

/**
 * LCS device state machine
 */
enum lcs_dev_states {
	DEV_STATE_DOWN,
	DEV_STATE_UP,
	DEV_STATE_RECOVER,
};

enum lcs_threads {
	LCS_SET_MC_THREAD = 1,
	LCS_RECOVERY_THREAD = 2,
};

/**
 * LCS struct declarations
 */
struct lcs_header {
	__u16 offset;
	__u8 type;
	__u8 slot;
} __attribute__ ((packed));

struct lcs_ip_mac_pair {
	__be32 ip_addr;
	__u8 mac_addr[LCS_MAC_LENGTH];
	__u8 reserved[2];
} __attribute__ ((packed));

struct lcs_ipm_list {
	struct list_head list;
	struct lcs_ip_mac_pair ipm;
	__u8 ipm_state;
};

struct lcs_cmd {
	__u16 offset;
	__u8 type;
	__u8 slot;
	__u8 cmd_code;
	__u8 initiator;
	__u16 sequence_no;
	__u16 return_code;
	union {
		struct {
			__u8 lan_type;
			__u8 portno;
			__u16 parameter_count;
			__u8 operator_flags[3];
			__u8 reserved[3];
		} lcs_std_cmd;
		struct {
			__u16 unused1;
			__u16 buff_size;
			__u8 unused2[6];
		} lcs_startup;
		struct {
			__u8 lan_type;
			__u8 portno;
			__u8 unused[10];
			__u8 mac_addr[LCS_MAC_LENGTH];
			__u32 num_packets_deblocked;
			__u32 num_packets_blocked;
			__u32 num_packets_tx_on_lan;
			__u32 num_tx_errors_detected;
			__u32 num_tx_packets_disgarded;
			__u32 num_packets_rx_from_lan;
			__u32 num_rx_errors_detected;
			__u32 num_rx_discarded_nobuffs_avail;
			__u32 num_rx_packets_too_large;
		} lcs_lanstat_cmd;
#ifdef CONFIG_IP_MULTICAST
		struct {
			__u8 lan_type;
			__u8 portno;
			__u16 num_ip_pairs;
			__u16 ip_assists_supported;
			__u16 ip_assists_enabled;
			__u16 version;
			struct {
				struct lcs_ip_mac_pair
				ip_mac_pair[32];
				__u32 response_data;
			} lcs_ipass_ctlmsg __attribute ((packed));
		} lcs_qipassist __attribute__ ((packed));
#endif /*CONFIG_IP_MULTICAST */
	} cmd __attribute__ ((packed));
} __attribute__ ((packed));

/**
 * Forward declarations.
 */
struct lcs_card;
struct lcs_channel;

/**
 * Definition of an lcs buffer.
 */
struct lcs_buffer {
	enum lcs_buffer_states state;
	void *data;
	int count;
	/* Callback for completion notification. */
	void (*callback)(struct lcs_channel *, struct lcs_buffer *);
};

struct lcs_reply {
	struct list_head list;
	__u16 sequence_no;
	refcount_t refcnt;
	/* Callback for completion notification. */
	void (*callback)(struct lcs_card *, struct lcs_cmd *);
	wait_queue_head_t wait_q;
	struct lcs_card *card;
	struct timer_list timer;
	int received;
	int rc;
};

/**
 * Definition of an lcs channel
 */
struct lcs_channel {
	enum lcs_channel_states state;
	struct ccw_device *ccwdev;
	struct ccw1 ccws[LCS_NUM_BUFFS + 1];
	wait_queue_head_t wait_q;
	struct tasklet_struct irq_tasklet;
	struct lcs_buffer iob[LCS_NUM_BUFFS];
	int io_idx;
	int buf_idx;
};


/**
 * definition of the lcs card
 */
struct lcs_card {
	spinlock_t lock;
	spinlock_t ipm_lock;
	enum lcs_dev_states state;
	struct net_device *dev;
	struct net_device_stats stats;
	__be16 (*lan_type_trans)(struct sk_buff *skb,
				 struct net_device *dev);
	struct ccwgroup_device *gdev;
	struct lcs_channel read;
	struct lcs_channel write;
	struct lcs_buffer *tx_buffer;
	int tx_emitted;
	struct list_head lancmd_waiters;
	int lancmd_timeout;

	struct work_struct kernel_thread_starter;
	spinlock_t mask_lock;
	unsigned long thread_start_mask;
	unsigned long thread_running_mask;
	unsigned long thread_allowed_mask;
	wait_queue_head_t wait_q;

#ifdef CONFIG_IP_MULTICAST
	struct list_head ipm_list;
#endif
	__u8 mac[LCS_MAC_LENGTH];
	__u16 ip_assists_supported;
	__u16 ip_assists_enabled;
	__s8 lan_type;
	__u32 pkt_seq;
	__u16 sequence_no;
	__s16 portno;
	/* Some info copied from probeinfo */
	u8 device_forced;
	u8 max_port_no;
	u8 hint_port_no;
	s16 port_protocol_no;
} __attribute__ ((aligned(8)));

@ -40,6 +40,8 @@ enum io_uring_cmd_flags {
	IO_URING_F_TASK_DEAD = (1 << 13),
};

struct io_zcrx_ifq;

struct io_wq_work_node {
	struct io_wq_work_node *next;
};

@ -384,6 +386,8 @@ struct io_ring_ctx {
		struct wait_queue_head poll_wq;
		struct io_restriction restrictions;

		struct io_zcrx_ifq *ifq;

		u32 pers_next;
		struct xarray personalities;

@ -436,6 +440,8 @@ struct io_ring_ctx {
	struct io_mapped_region ring_region;
	/* used for optimised request parameter and wait argument passing */
	struct io_mapped_region param_region;
	/* just one zcrx per ring for now, will move to io_zcrx_ifq eventually */
	struct io_mapped_region zcrx_region;
};

/*

@ -1415,7 +1415,6 @@ int mlx4_get_is_vlan_offload_disabled(struct mlx4_dev *dev, u8 port,
				      bool *vlan_offload_disabled);
void mlx4_handle_eth_header_mcast_prio(struct mlx4_net_trans_rule_hw_ctrl *ctrl,
				       struct _rule_hw *eth_header);
int mlx4_find_cached_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *idx);
int mlx4_find_cached_vlan(struct mlx4_dev *dev, u8 port, u16 vid, int *idx);
int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan);

@ -54,7 +54,6 @@
#include <linux/mlx5/doorbell.h>
#include <linux/mlx5/eq.h>
#include <linux/timecounter.h>
#include <linux/ptp_clock_kernel.h>
#include <net/devlink.h>

#define MLX5_ADEV_NAME "mlx5_core"

@ -679,33 +678,8 @@ struct mlx5_rsvd_gids {
	struct ida ida;
};

#define MAX_PIN_NUM 8
struct mlx5_pps {
	u8 pin_caps[MAX_PIN_NUM];
	struct work_struct out_work;
	u64 start[MAX_PIN_NUM];
	u8 enabled;
	u64 min_npps_period;
	u64 min_out_pulse_duration_ns;
};

struct mlx5_timer {
	struct cyclecounter cycles;
	struct timecounter tc;
	u32 nominal_c_mult;
	unsigned long overflow_period;
};

struct mlx5_clock {
	struct mlx5_nb pps_nb;
	seqlock_t lock;
	struct hwtstamp_config hwtstamp_config;
	struct ptp_clock *ptp;
	struct ptp_clock_info ptp_info;
	struct mlx5_pps pps_info;
	struct mlx5_timer timer;
};

struct mlx5_clock;
struct mlx5_clock_dev_state;
struct mlx5_dm;
struct mlx5_fw_tracer;
struct mlx5_vxlan;

@ -789,7 +763,8 @@ struct mlx5_core_dev {
#ifdef CONFIG_MLX5_FPGA
	struct mlx5_fpga_device *fpga;
#endif
	struct mlx5_clock clock;
	struct mlx5_clock *clock;
	struct mlx5_clock_dev_state *clock_state;
	struct mlx5_ib_clock_info *clock_info;
	struct mlx5_fw_tracer *tracer;
	struct mlx5_rsc_dump *rsc_dump;

@ -115,9 +115,12 @@ enum mlx5e_ext_link_mode {
	MLX5E_100GAUI_1_100GBASE_CR_KR = 11,
	MLX5E_200GAUI_4_200GBASE_CR4_KR4 = 12,
	MLX5E_200GAUI_2_200GBASE_CR2_KR2 = 13,
	MLX5E_200GAUI_1_200GBASE_CR1_KR1 = 14,
	MLX5E_400GAUI_8_400GBASE_CR8 = 15,
	MLX5E_400GAUI_4_400GBASE_CR4_KR4 = 16,
	MLX5E_400GAUI_2_400GBASE_CR2_KR2 = 17,
	MLX5E_800GAUI_8_800GBASE_CR8_KR8 = 19,
	MLX5E_800GAUI_4_800GBASE_CR4_KR4 = 20,
	MLX5E_EXT_LINK_MODES_NUMBER,
};

@ -658,6 +658,7 @@ struct netdev_queue {
	struct Qdisc __rcu *qdisc_sleeping;
#ifdef CONFIG_SYSFS
	struct kobject kobj;
	const struct attribute_group **groups;
#endif
	unsigned long tx_maxrate;
	/*

@ -43,6 +43,7 @@ extern void rtnl_lock(void);
extern void rtnl_unlock(void);
extern int rtnl_trylock(void);
extern int rtnl_is_locked(void);
extern int rtnl_lock_interruptible(void);
extern int rtnl_lock_killable(void);
extern bool refcount_dec_and_rtnl_lock(refcount_t *r);

@ -16,6 +16,7 @@ struct netdev_rx_queue {
	struct rps_dev_flow_table __rcu *rps_flow_table;
#endif
	struct kobject kobj;
	const struct attribute_group **groups;
	struct net_device *dev;
	netdevice_tracker dev_tracker;

@ -24,11 +24,20 @@ struct net_iov {
	unsigned long __unused_padding;
	unsigned long pp_magic;
	struct page_pool *pp;
	struct dmabuf_genpool_chunk_owner *owner;
	struct net_iov_area *owner;
	unsigned long dma_addr;
	atomic_long_t pp_ref_count;
};

struct net_iov_area {
	/* Array of net_iovs for this area. */
	struct net_iov *niovs;
	size_t num_niovs;

	/* Offset into the dma-buf where this chunk starts. */
	unsigned long base_virtual;
};

/* These fields in struct page are used by the page_pool and net stack:
 *
 *	struct {

@ -54,6 +63,16 @@ NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
#undef NET_IOV_ASSERT_OFFSET

static inline struct net_iov_area *net_iov_owner(const struct net_iov *niov)
{
	return niov->owner;
}

static inline unsigned int net_iov_idx(const struct net_iov *niov)
{
	return niov - net_iov_owner(niov)->niovs;
}

/* netmem */

/**

@ -0,0 +1,45 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _NET_PAGE_POOL_MEMORY_PROVIDER_H
#define _NET_PAGE_POOL_MEMORY_PROVIDER_H

#include <net/netmem.h>
#include <net/page_pool/types.h>

struct netdev_rx_queue;
struct sk_buff;

struct memory_provider_ops {
	netmem_ref (*alloc_netmems)(struct page_pool *pool, gfp_t gfp);
	bool (*release_netmem)(struct page_pool *pool, netmem_ref netmem);
	int (*init)(struct page_pool *pool);
	void (*destroy)(struct page_pool *pool);
	int (*nl_fill)(void *mp_priv, struct sk_buff *rsp,
		       struct netdev_rx_queue *rxq);
	void (*uninstall)(void *mp_priv, struct netdev_rx_queue *rxq);
};

bool net_mp_niov_set_dma_addr(struct net_iov *niov, dma_addr_t addr);
void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov);
void net_mp_niov_clear_page_pool(struct net_iov *niov);

int net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
		    struct pp_memory_provider_params *p);
void net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
		      struct pp_memory_provider_params *old_p);

/**
 * net_mp_netmem_place_in_cache() - give a netmem to a page pool
 * @pool: the page pool to place the netmem into
 * @netmem: netmem to give
 *
 * Push an accounted netmem into the page pool's allocation cache. The caller
 * must ensure that there is space in the cache. It should only be called off
 * the mp_ops->alloc_netmems() path.
 */
static inline void net_mp_netmem_place_in_cache(struct page_pool *pool,
						netmem_ref netmem)
{
	pool->alloc.cache[pool->alloc.count++] = netmem;
}

#endif
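
For orientation, here is a hedged skeleton of what a provider implementing the ops table above might look like. The my_mp_* names are hypothetical and the bodies are stubs; the real providers in this series (io_uring zcrx, devmem TCP) allocate net_iovs from their own backing areas in alloc_netmems() and reclaim them in release_netmem().

/* Illustrative sketch only, not part of the patch set. */
static int my_mp_init(struct page_pool *pool)
{
	/* pool->mp_priv carries whatever was passed in
	 * pp_memory_provider_params when the queue was opened.
	 */
	return 0;
}

static void my_mp_destroy(struct page_pool *pool)
{
	/* release any per-pool provider state here */
}

static netmem_ref my_mp_alloc_netmems(struct page_pool *pool, gfp_t gfp)
{
	/* hand back a netmem from the provider's area, or 0 on exhaustion */
	return 0;
}

static bool my_mp_release_netmem(struct page_pool *pool, netmem_ref netmem)
{
	/* return true if the provider took the buffer back */
	return false;
}

static const struct memory_provider_ops my_mp_ops = {
	.alloc_netmems	= my_mp_alloc_netmems,
	.release_netmem	= my_mp_release_netmem,
	.init		= my_mp_init,
	.destroy	= my_mp_destroy,
};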

@ -152,8 +152,11 @@ struct page_pool_stats {
 */
#define PAGE_POOL_FRAG_GROUP_ALIGN (4 * sizeof(long))

struct memory_provider_ops;

struct pp_memory_provider_params {
	void *mp_priv;
	const struct memory_provider_ops *mp_ops;
};

struct page_pool {

@ -216,6 +219,7 @@ struct page_pool {
	struct ptr_ring ring;

	void *mp_priv;
	const struct memory_provider_ops *mp_ops;

#ifdef CONFIG_PAGE_POOL_STATS
	/* recycle stats are per-cpu to avoid locking */

@ -2059,6 +2059,24 @@ enum ethtool_link_mode_bit_indices {
	ETHTOOL_LINK_MODE_10baseT1S_Half_BIT = 100,
	ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT = 101,
	ETHTOOL_LINK_MODE_10baseT1BRR_Full_BIT = 102,
	ETHTOOL_LINK_MODE_200000baseCR_Full_BIT = 103,
	ETHTOOL_LINK_MODE_200000baseKR_Full_BIT = 104,
	ETHTOOL_LINK_MODE_200000baseDR_Full_BIT = 105,
	ETHTOOL_LINK_MODE_200000baseDR_2_Full_BIT = 106,
	ETHTOOL_LINK_MODE_200000baseSR_Full_BIT = 107,
	ETHTOOL_LINK_MODE_200000baseVR_Full_BIT = 108,
	ETHTOOL_LINK_MODE_400000baseCR2_Full_BIT = 109,
	ETHTOOL_LINK_MODE_400000baseKR2_Full_BIT = 110,
	ETHTOOL_LINK_MODE_400000baseDR2_Full_BIT = 111,
	ETHTOOL_LINK_MODE_400000baseDR2_2_Full_BIT = 112,
	ETHTOOL_LINK_MODE_400000baseSR2_Full_BIT = 113,
	ETHTOOL_LINK_MODE_400000baseVR2_Full_BIT = 114,
	ETHTOOL_LINK_MODE_800000baseCR4_Full_BIT = 115,
	ETHTOOL_LINK_MODE_800000baseKR4_Full_BIT = 116,
	ETHTOOL_LINK_MODE_800000baseDR4_Full_BIT = 117,
	ETHTOOL_LINK_MODE_800000baseDR4_2_Full_BIT = 118,
	ETHTOOL_LINK_MODE_800000baseSR4_Full_BIT = 119,
	ETHTOOL_LINK_MODE_800000baseVR4_Full_BIT = 120,

	/* must be last entry */
	__ETHTOOL_LINK_MODE_MASK_NBITS

@ -87,6 +87,7 @@ struct io_uring_sqe {
	union {
		__s32 splice_fd_in;
		__u32 file_index;
		__u32 zcrx_ifq_idx;
		__u32 optlen;
		struct {
			__u16 addr_len;

@ -278,6 +279,7 @@ enum io_uring_op {
	IORING_OP_FTRUNCATE,
	IORING_OP_BIND,
	IORING_OP_LISTEN,
	IORING_OP_RECV_ZC,

	/* this goes last, obviously */
	IORING_OP_LAST,

@ -639,7 +641,8 @@ enum io_uring_register_op {
	/* send MSG_RING without having a ring */
	IORING_REGISTER_SEND_MSG_RING = 31,

	/* 32 reserved for zc rx */
	/* register a netdev hw rx queue for zerocopy */
	IORING_REGISTER_ZCRX_IFQ = 32,

	/* resize CQ ring */
	IORING_REGISTER_RESIZE_RINGS = 33,

@ -956,6 +959,55 @@ enum io_uring_socket_op {
	SOCKET_URING_OP_SETSOCKOPT,
};

/* Zero copy receive refill queue entry */
struct io_uring_zcrx_rqe {
	__u64 off;
	__u32 len;
	__u32 __pad;
};

struct io_uring_zcrx_cqe {
	__u64 off;
	__u64 __pad;
};

/* The bit from which area id is encoded into offsets */
#define IORING_ZCRX_AREA_SHIFT 48
#define IORING_ZCRX_AREA_MASK (~(((__u64)1 << IORING_ZCRX_AREA_SHIFT) - 1))

struct io_uring_zcrx_offsets {
	__u32 head;
	__u32 tail;
	__u32 rqes;
	__u32 __resv2;
	__u64 __resv[2];
};

struct io_uring_zcrx_area_reg {
	__u64 addr;
	__u64 len;
	__u64 rq_area_token;
	__u32 flags;
	__u32 __resv1;
	__u64 __resv2[2];
};

/*
 * Argument for IORING_REGISTER_ZCRX_IFQ
 */
struct io_uring_zcrx_ifq_reg {
	__u32 if_idx;
	__u32 if_rxq;
	__u32 rq_entries;
	__u32 flags;

	__u64 area_ptr; /* pointer to struct io_uring_zcrx_area_reg */
	__u64 region_ptr; /* struct io_uring_region_desc * */

	struct io_uring_zcrx_offsets offsets;
	__u64 __resv[4];
};

#ifdef __cplusplus
}
#endif
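
To make the registration flow concrete, a speculative userspace sketch follows. It relies only on the structs above and on the nr_args == 1 check visible in the register.c hunk further down; the rq_entries value and the helper name are assumptions, not part of the series, and error handling is elided.

#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

static int register_zcrx(int ring_fd, __u32 ifindex, __u32 rxq,
			 struct io_uring_zcrx_area_reg *area,
			 struct io_uring_region_desc *region)
{
	struct io_uring_zcrx_ifq_reg reg;

	memset(&reg, 0, sizeof(reg));
	reg.if_idx = ifindex;		/* netdev to attach to */
	reg.if_rxq = rxq;		/* hw rx queue index on that netdev */
	reg.rq_entries = 4096;		/* refill ring size; one possible choice */
	reg.area_ptr = (__u64)(uintptr_t)area;
	reg.region_ptr = (__u64)(uintptr_t)region;

	/* nr_args must be 1, matching the check in __io_uring_register() */
	return syscall(__NR_io_uring_register, ring_fd,
		       IORING_REGISTER_ZCRX_IFQ, &reg, 1);
}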

@ -86,6 +86,11 @@ enum {
	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
};

enum {
	__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
	NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
};

enum {
	NETDEV_A_PAGE_POOL_ID = 1,
	NETDEV_A_PAGE_POOL_IFINDEX,

@ -94,6 +99,7 @@ enum {
	NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
	NETDEV_A_PAGE_POOL_DETACH_TIME,
	NETDEV_A_PAGE_POOL_DMABUF,
	NETDEV_A_PAGE_POOL_IO_URING,

	__NETDEV_A_PAGE_POOL_MAX,
	NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)

@ -136,6 +142,7 @@ enum {
	NETDEV_A_QUEUE_TYPE,
	NETDEV_A_QUEUE_NAPI_ID,
	NETDEV_A_QUEUE_DMABUF,
	NETDEV_A_QUEUE_IO_URING,

	__NETDEV_A_QUEUE_MAX,
	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)

@ -0,0 +1,10 @@
# SPDX-License-Identifier: GPL-2.0-only
#
# io_uring configuration
#

config IO_URING_ZCRX
	def_bool y
	depends on PAGE_POOL
	depends on INET
	depends on NET_RX_BUSY_POLL

@ -14,6 +14,7 @@ obj-$(CONFIG_IO_URING) += io_uring.o opdef.o kbuf.o rsrc.o notif.o \
					epoll.o statx.o timeout.o fdinfo.o \
					cancel.o waitid.o register.o \
					truncate.o memmap.o alloc_cache.o
obj-$(CONFIG_IO_URING_ZCRX) += zcrx.o
obj-$(CONFIG_IO_WQ) += io-wq.o
obj-$(CONFIG_FUTEX) += futex.o
obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o

@ -97,6 +97,7 @@
#include "uring_cmd.h"
#include "msg_ring.h"
#include "memmap.h"
#include "zcrx.h"

#include "timeout.h"
#include "poll.h"

@ -2729,6 +2730,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
	mutex_lock(&ctx->uring_lock);
	io_sqe_buffers_unregister(ctx);
	io_sqe_files_unregister(ctx);
	io_unregister_zcrx_ifqs(ctx);
	io_cqring_overflow_kill(ctx);
	io_eventfd_unregister(ctx);
	io_free_alloc_caches(ctx);

@ -2888,6 +2890,11 @@ static __cold void io_ring_exit_work(struct work_struct *work)
			io_cqring_overflow_kill(ctx);
			mutex_unlock(&ctx->uring_lock);
		}
		if (ctx->ifq) {
			mutex_lock(&ctx->uring_lock);
			io_shutdown_zcrx_ifqs(ctx);
			mutex_unlock(&ctx->uring_lock);
		}

		if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
			io_move_task_work_from_local(ctx);

@ -190,6 +190,16 @@ static inline bool io_get_cqe(struct io_ring_ctx *ctx, struct io_uring_cqe **ret
	return io_get_cqe_overflow(ctx, ret, false);
}

static inline bool io_defer_get_uncommited_cqe(struct io_ring_ctx *ctx,
					       struct io_uring_cqe **cqe_ret)
{
	io_lockdep_assert_cq_locked(ctx);

	ctx->cq_extra++;
	ctx->submit_state.cq_flush = true;
	return io_get_cqe(ctx, cqe_ret);
}

static __always_inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
					    struct io_kiocb *req)
{

@ -271,6 +271,8 @@ static struct io_mapped_region *io_mmap_get_region(struct io_ring_ctx *ctx,
		return io_pbuf_get_region(ctx, bgid);
	case IORING_MAP_OFF_PARAM_REGION:
		return &ctx->param_region;
	case IORING_MAP_OFF_ZCRX_REGION:
		return &ctx->zcrx_region;
	}
	return NULL;
}

@ -2,6 +2,7 @@
#define IO_URING_MEMMAP_H

#define IORING_MAP_OFF_PARAM_REGION	0x20000000ULL
#define IORING_MAP_OFF_ZCRX_REGION	0x30000000ULL

struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages);

@ -16,6 +16,7 @@
#include "net.h"
#include "notif.h"
#include "rsrc.h"
#include "zcrx.h"

#if defined(CONFIG_NET)
struct io_shutdown {

@ -88,6 +89,14 @@ struct io_sr_msg {
 */
#define MULTISHOT_MAX_RETRY 32

struct io_recvzc {
	struct file *file;
	unsigned msg_flags;
	u16 flags;
	u32 len;
	struct io_zcrx_ifq *ifq;
};

int io_shutdown_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
	struct io_shutdown *shutdown = io_kiocb_to_cmd(req, struct io_shutdown);

@ -1200,6 +1209,81 @@ out_free:
	return ret;
}

int io_recvzc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
	struct io_recvzc *zc = io_kiocb_to_cmd(req, struct io_recvzc);
	unsigned ifq_idx;

	if (unlikely(sqe->file_index || sqe->addr2 || sqe->addr ||
		     sqe->addr3))
		return -EINVAL;

	ifq_idx = READ_ONCE(sqe->zcrx_ifq_idx);
	if (ifq_idx != 0)
		return -EINVAL;
	zc->ifq = req->ctx->ifq;
	if (!zc->ifq)
		return -EINVAL;
	zc->len = READ_ONCE(sqe->len);
	zc->flags = READ_ONCE(sqe->ioprio);
	zc->msg_flags = READ_ONCE(sqe->msg_flags);
	if (zc->msg_flags)
		return -EINVAL;
	if (zc->flags & ~(IORING_RECVSEND_POLL_FIRST | IORING_RECV_MULTISHOT))
		return -EINVAL;
	/* multishot required */
	if (!(zc->flags & IORING_RECV_MULTISHOT))
		return -EINVAL;
	/* All data completions are posted as aux CQEs. */
	req->flags |= REQ_F_APOLL_MULTISHOT;

	return 0;
}

int io_recvzc(struct io_kiocb *req, unsigned int issue_flags)
{
	struct io_recvzc *zc = io_kiocb_to_cmd(req, struct io_recvzc);
	struct socket *sock;
	unsigned int len;
	int ret;

	if (!(req->flags & REQ_F_POLLED) &&
	    (zc->flags & IORING_RECVSEND_POLL_FIRST))
		return -EAGAIN;

	sock = sock_from_file(req->file);
	if (unlikely(!sock))
		return -ENOTSOCK;

	len = zc->len;
	ret = io_zcrx_recv(req, zc->ifq, sock, zc->msg_flags | MSG_DONTWAIT,
			   issue_flags, &zc->len);
	if (len && zc->len == 0) {
		io_req_set_res(req, 0, 0);

		if (issue_flags & IO_URING_F_MULTISHOT)
			return IOU_STOP_MULTISHOT;
		return IOU_OK;
	}
	if (unlikely(ret <= 0) && ret != -EAGAIN) {
		if (ret == -ERESTARTSYS)
			ret = -EINTR;
		if (ret == IOU_REQUEUE)
			return IOU_REQUEUE;

		req_set_fail(req);
		io_req_set_res(req, ret, 0);

		if (issue_flags & IO_URING_F_MULTISHOT)
			return IOU_STOP_MULTISHOT;
		return IOU_OK;
	}

	if (issue_flags & IO_URING_F_MULTISHOT)
		return IOU_ISSUE_SKIP_COMPLETE;
	return -EAGAIN;
}

void io_send_zc_cleanup(struct io_kiocb *req)
{
	struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg);
@ -37,6 +37,7 @@
|
|||
#include "waitid.h"
|
||||
#include "futex.h"
|
||||
#include "truncate.h"
|
||||
#include "zcrx.h"
|
||||
|
||||
static int io_no_issue(struct io_kiocb *req, unsigned int issue_flags)
|
||||
{
|
||||
|
|
@ -514,6 +515,18 @@ const struct io_issue_def io_issue_defs[] = {
|
|||
.async_size = sizeof(struct io_async_msghdr),
|
||||
#else
|
||||
.prep = io_eopnotsupp_prep,
|
||||
#endif
|
||||
},
|
||||
[IORING_OP_RECV_ZC] = {
|
||||
.needs_file = 1,
|
||||
.unbound_nonreg_file = 1,
|
||||
.pollin = 1,
|
||||
.ioprio = 1,
|
||||
#if defined(CONFIG_NET)
|
||||
.prep = io_recvzc_prep,
|
||||
.issue = io_recvzc,
|
||||
#else
|
||||
.prep = io_eopnotsupp_prep,
|
||||
#endif
|
||||
},
|
||||
};
|
||||
|
|
@ -745,6 +758,9 @@ const struct io_cold_def io_cold_defs[] = {
|
|||
[IORING_OP_LISTEN] = {
|
||||
.name = "LISTEN",
|
||||
},
|
||||
[IORING_OP_RECV_ZC] = {
|
||||
.name = "RECV_ZC",
|
||||
},
|
||||
};
|
||||
|
||||
const char *io_uring_get_opcode(u8 opcode)
|
||||
|
|
|
|||
|
|
@ -30,6 +30,7 @@
|
|||
#include "eventfd.h"
|
||||
#include "msg_ring.h"
|
||||
#include "memmap.h"
|
||||
#include "zcrx.h"
|
||||
|
||||
#define IORING_MAX_RESTRICTIONS (IORING_RESTRICTION_LAST + \
|
||||
IORING_REGISTER_LAST + IORING_OP_LAST)
|
||||
|
|
@ -813,6 +814,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
|
|||
break;
|
||||
ret = io_register_clone_buffers(ctx, arg);
|
||||
break;
|
||||
case IORING_REGISTER_ZCRX_IFQ:
|
||||
ret = -EINVAL;
|
||||
if (!arg || nr_args != 1)
|
||||
break;
|
||||
ret = io_register_zcrx_ifq(ctx, arg);
|
||||
break;
|
||||
case IORING_REGISTER_RESIZE_RINGS:
|
||||
ret = -EINVAL;
|
||||
if (!arg || nr_args != 1)
|
||||
|
|
|
|||
|
|
@ -80,7 +80,7 @@ static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int io_buffer_validate(struct iovec *iov)
|
||||
int io_buffer_validate(struct iovec *iov)
|
||||
{
|
||||
unsigned long tmp, acct_len = iov->iov_len + (PAGE_SIZE - 1);
|
||||
|
||||
|
|
|
|||
|
|
@ -76,6 +76,7 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
|
|||
unsigned size, unsigned type);
|
||||
int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
|
||||
unsigned int size, unsigned int type);
|
||||
int io_buffer_validate(struct iovec *iov);
|
||||
|
||||
bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
|
||||
struct io_imu_folio_data *data);
|
||||
|
|
|
|||
|
|
@ -0,0 +1,960 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/errno.h>
|
||||
#include <linux/dma-map-ops.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/io_uring.h>
|
||||
#include <linux/netdevice.h>
|
||||
#include <linux/rtnetlink.h>
|
||||
#include <linux/skbuff_ref.h>
|
||||
|
||||
#include <net/page_pool/helpers.h>
|
||||
#include <net/page_pool/memory_provider.h>
|
||||
#include <net/netlink.h>
|
||||
#include <net/netdev_rx_queue.h>
|
||||
#include <net/tcp.h>
|
||||
#include <net/rps.h>
|
||||
|
||||
#include <trace/events/page_pool.h>
|
||||
|
||||
#include <uapi/linux/io_uring.h>
|
||||
|
||||
#include "io_uring.h"
|
||||
#include "kbuf.h"
|
||||
#include "memmap.h"
|
||||
#include "zcrx.h"
|
||||
#include "rsrc.h"
|
||||
|
||||
#define IO_DMA_ATTR (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
|
||||
|
||||
static void __io_zcrx_unmap_area(struct io_zcrx_ifq *ifq,
|
||||
struct io_zcrx_area *area, int nr_mapped)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < nr_mapped; i++) {
|
||||
struct net_iov *niov = &area->nia.niovs[i];
|
||||
dma_addr_t dma;
|
||||
|
||||
dma = page_pool_get_dma_addr_netmem(net_iov_to_netmem(niov));
|
||||
dma_unmap_page_attrs(ifq->dev, dma, PAGE_SIZE,
|
||||
DMA_FROM_DEVICE, IO_DMA_ATTR);
|
||||
net_mp_niov_set_dma_addr(niov, 0);
|
||||
}
|
||||
}
|
||||
|
||||
static void io_zcrx_unmap_area(struct io_zcrx_ifq *ifq, struct io_zcrx_area *area)
|
||||
{
|
||||
if (area->is_mapped)
|
||||
__io_zcrx_unmap_area(ifq, area, area->nia.num_niovs);
|
||||
}
|
||||
|
||||
static int io_zcrx_map_area(struct io_zcrx_ifq *ifq, struct io_zcrx_area *area)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < area->nia.num_niovs; i++) {
|
||||
struct net_iov *niov = &area->nia.niovs[i];
|
||||
dma_addr_t dma;
|
||||
|
||||
dma = dma_map_page_attrs(ifq->dev, area->pages[i], 0, PAGE_SIZE,
|
||||
DMA_FROM_DEVICE, IO_DMA_ATTR);
|
||||
if (dma_mapping_error(ifq->dev, dma))
|
||||
break;
|
||||
if (net_mp_niov_set_dma_addr(niov, dma)) {
|
||||
dma_unmap_page_attrs(ifq->dev, dma, PAGE_SIZE,
|
||||
DMA_FROM_DEVICE, IO_DMA_ATTR);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
if (i != area->nia.num_niovs) {
|
||||
__io_zcrx_unmap_area(ifq, area, i);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
area->is_mapped = true;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void io_zcrx_sync_for_device(const struct page_pool *pool,
|
||||
struct net_iov *niov)
|
||||
{
|
||||
#if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)
|
||||
dma_addr_t dma_addr;
|
||||
|
||||
if (!dma_dev_need_sync(pool->p.dev))
|
||||
return;
|
||||
|
||||
dma_addr = page_pool_get_dma_addr_netmem(net_iov_to_netmem(niov));
|
||||
__dma_sync_single_for_device(pool->p.dev, dma_addr + pool->p.offset,
|
||||
PAGE_SIZE, pool->p.dma_dir);
|
||||
#endif
|
||||
}
|
||||
|
||||
#define IO_RQ_MAX_ENTRIES 32768
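/* Cap the number of skbs processed per tcp_read_sock() invocation so a
 * single request can't monopolise the task; io_zcrx_recv_skb() bails with
 * -EAGAIN past the limit and multishot requests get requeued.
 */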

#define IO_SKBS_PER_CALL_LIMIT		20

struct io_zcrx_args {
	struct io_kiocb		*req;
	struct io_zcrx_ifq	*ifq;
	struct socket		*sock;
	unsigned		nr_skbs;
};

static const struct memory_provider_ops io_uring_pp_zc_ops;

static inline struct io_zcrx_area *io_zcrx_iov_to_area(const struct net_iov *niov)
{
	struct net_iov_area *owner = net_iov_owner(niov);

	return container_of(owner, struct io_zcrx_area, nia);
}

static inline atomic_t *io_get_user_counter(struct net_iov *niov)
{
	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

	return &area->user_refs[net_iov_idx(niov)];
}

static bool io_zcrx_put_niov_uref(struct net_iov *niov)
{
	atomic_t *uref = io_get_user_counter(niov);

	if (unlikely(!atomic_read(uref)))
		return false;
	atomic_dec(uref);
	return true;
}

static void io_zcrx_get_niov_uref(struct net_iov *niov)
{
	atomic_inc(io_get_user_counter(niov));
}

static inline struct page *io_zcrx_iov_page(const struct net_iov *niov)
{
	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

	return area->pages[net_iov_idx(niov)];
}

static int io_allocate_rbuf_ring(struct io_zcrx_ifq *ifq,
				 struct io_uring_zcrx_ifq_reg *reg,
				 struct io_uring_region_desc *rd)
{
	size_t off, size;
	void *ptr;
	int ret;

	off = sizeof(struct io_uring);
	size = off + sizeof(struct io_uring_zcrx_rqe) * reg->rq_entries;
	if (size > rd->size)
		return -EINVAL;

	ret = io_create_region_mmap_safe(ifq->ctx, &ifq->ctx->zcrx_region, rd,
					 IORING_MAP_OFF_ZCRX_REGION);
	if (ret < 0)
		return ret;

	ptr = io_region_get_ptr(&ifq->ctx->zcrx_region);
	ifq->rq_ring = (struct io_uring *)ptr;
	ifq->rqes = (struct io_uring_zcrx_rqe *)(ptr + off);
	return 0;
}

static void io_free_rbuf_ring(struct io_zcrx_ifq *ifq)
{
	io_free_region(ifq->ctx, &ifq->ctx->zcrx_region);
	ifq->rq_ring = NULL;
	ifq->rqes = NULL;
}

static void io_zcrx_free_area(struct io_zcrx_area *area)
{
	io_zcrx_unmap_area(area->ifq, area);

	kvfree(area->freelist);
	kvfree(area->nia.niovs);
	kvfree(area->user_refs);
	if (area->pages) {
		unpin_user_pages(area->pages, area->nia.num_niovs);
		kvfree(area->pages);
	}
	kfree(area);
}

static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
			       struct io_zcrx_area **res,
			       struct io_uring_zcrx_area_reg *area_reg)
{
	struct io_zcrx_area *area;
	int i, ret, nr_pages;
	struct iovec iov;

	if (area_reg->flags || area_reg->rq_area_token)
		return -EINVAL;
	if (area_reg->__resv1 || area_reg->__resv2[0] || area_reg->__resv2[1])
		return -EINVAL;
	if (area_reg->addr & ~PAGE_MASK || area_reg->len & ~PAGE_MASK)
		return -EINVAL;

	iov.iov_base = u64_to_user_ptr(area_reg->addr);
	iov.iov_len = area_reg->len;
	ret = io_buffer_validate(&iov);
	if (ret)
		return ret;

	ret = -ENOMEM;
	area = kzalloc(sizeof(*area), GFP_KERNEL);
	if (!area)
		goto err;

	area->pages = io_pin_pages((unsigned long)area_reg->addr, area_reg->len,
				   &nr_pages);
	if (IS_ERR(area->pages)) {
		ret = PTR_ERR(area->pages);
		area->pages = NULL;
		goto err;
	}
	area->nia.num_niovs = nr_pages;

	area->nia.niovs = kvmalloc_array(nr_pages, sizeof(area->nia.niovs[0]),
					 GFP_KERNEL | __GFP_ZERO);
	if (!area->nia.niovs)
		goto err;

	area->freelist = kvmalloc_array(nr_pages, sizeof(area->freelist[0]),
					GFP_KERNEL | __GFP_ZERO);
	if (!area->freelist)
		goto err;

	for (i = 0; i < nr_pages; i++)
		area->freelist[i] = i;

	area->user_refs = kvmalloc_array(nr_pages, sizeof(area->user_refs[0]),
					 GFP_KERNEL | __GFP_ZERO);
	if (!area->user_refs)
		goto err;

	for (i = 0; i < nr_pages; i++) {
		struct net_iov *niov = &area->nia.niovs[i];

		niov->owner = &area->nia;
		area->freelist[i] = i;
		atomic_set(&area->user_refs[i], 0);
	}

	area->free_count = nr_pages;
	area->ifq = ifq;
	/* we're only supporting one area per ifq for now */
	area->area_id = 0;
	area_reg->rq_area_token = (u64)area->area_id << IORING_ZCRX_AREA_SHIFT;
	spin_lock_init(&area->freelist_lock);
	*res = area;
	return 0;
err:
	if (area)
		io_zcrx_free_area(area);
	return ret;
}

static struct io_zcrx_ifq *io_zcrx_ifq_alloc(struct io_ring_ctx *ctx)
{
	struct io_zcrx_ifq *ifq;

	ifq = kzalloc(sizeof(*ifq), GFP_KERNEL);
	if (!ifq)
		return NULL;

	ifq->if_rxq = -1;
	ifq->ctx = ctx;
	spin_lock_init(&ifq->lock);
	spin_lock_init(&ifq->rq_lock);
	return ifq;
}

static void io_zcrx_drop_netdev(struct io_zcrx_ifq *ifq)
{
	spin_lock(&ifq->lock);
	if (ifq->netdev) {
		netdev_put(ifq->netdev, &ifq->netdev_tracker);
		ifq->netdev = NULL;
	}
	spin_unlock(&ifq->lock);
}

static void io_close_queue(struct io_zcrx_ifq *ifq)
{
	struct net_device *netdev;
	netdevice_tracker netdev_tracker;
	struct pp_memory_provider_params p = {
		.mp_ops = &io_uring_pp_zc_ops,
		.mp_priv = ifq,
	};

	if (ifq->if_rxq == -1)
		return;

	spin_lock(&ifq->lock);
	netdev = ifq->netdev;
	netdev_tracker = ifq->netdev_tracker;
	ifq->netdev = NULL;
	spin_unlock(&ifq->lock);

	if (netdev) {
		net_mp_close_rxq(netdev, ifq->if_rxq, &p);
		netdev_put(netdev, &netdev_tracker);
	}
	ifq->if_rxq = -1;
}

static void io_zcrx_ifq_free(struct io_zcrx_ifq *ifq)
{
	io_close_queue(ifq);
	io_zcrx_drop_netdev(ifq);

	if (ifq->area)
		io_zcrx_free_area(ifq->area);
	if (ifq->dev)
		put_device(ifq->dev);

	io_free_rbuf_ring(ifq);
	kfree(ifq);
}

int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
			 struct io_uring_zcrx_ifq_reg __user *arg)
{
	struct pp_memory_provider_params mp_param = {};
	struct io_uring_zcrx_area_reg area;
	struct io_uring_zcrx_ifq_reg reg;
	struct io_uring_region_desc rd;
	struct io_zcrx_ifq *ifq;
	int ret;

	/*
	 * 1. Interface queue allocation.
	 * 2. It can observe data destined for sockets of other tasks.
	 */
	if (!capable(CAP_NET_ADMIN))
		return -EPERM;

	/* mandatory io_uring features for zc rx */
	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN &&
	      ctx->flags & IORING_SETUP_CQE32))
		return -EINVAL;
	if (ctx->ifq)
		return -EBUSY;
	if (copy_from_user(&reg, arg, sizeof(reg)))
		return -EFAULT;
	if (copy_from_user(&rd, u64_to_user_ptr(reg.region_ptr), sizeof(rd)))
		return -EFAULT;
	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
		return -EINVAL;
	if (reg.if_rxq == -1 || !reg.rq_entries || reg.flags)
		return -EINVAL;
	if (reg.rq_entries > IO_RQ_MAX_ENTRIES) {
		if (!(ctx->flags & IORING_SETUP_CLAMP))
			return -EINVAL;
		reg.rq_entries = IO_RQ_MAX_ENTRIES;
	}
	reg.rq_entries = roundup_pow_of_two(reg.rq_entries);

	if (copy_from_user(&area, u64_to_user_ptr(reg.area_ptr), sizeof(area)))
		return -EFAULT;

	ifq = io_zcrx_ifq_alloc(ctx);
	if (!ifq)
		return -ENOMEM;

	ret = io_allocate_rbuf_ring(ifq, &reg, &rd);
	if (ret)
		goto err;

	ret = io_zcrx_create_area(ifq, &ifq->area, &area);
	if (ret)
		goto err;

	ifq->rq_entries = reg.rq_entries;

	ret = -ENODEV;
	ifq->netdev = netdev_get_by_index(current->nsproxy->net_ns, reg.if_idx,
					  &ifq->netdev_tracker, GFP_KERNEL);
	if (!ifq->netdev)
		goto err;

	ifq->dev = ifq->netdev->dev.parent;
	ret = -EOPNOTSUPP;
	if (!ifq->dev)
		goto err;
	get_device(ifq->dev);

	ret = io_zcrx_map_area(ifq, ifq->area);
	if (ret)
		goto err;

	mp_param.mp_ops = &io_uring_pp_zc_ops;
	mp_param.mp_priv = ifq;
	ret = net_mp_open_rxq(ifq->netdev, reg.if_rxq, &mp_param);
	if (ret)
		goto err;
	ifq->if_rxq = reg.if_rxq;

	reg.offsets.rqes = sizeof(struct io_uring);
	reg.offsets.head = offsetof(struct io_uring, head);
	reg.offsets.tail = offsetof(struct io_uring, tail);

	if (copy_to_user(arg, &reg, sizeof(reg)) ||
	    copy_to_user(u64_to_user_ptr(reg.region_ptr), &rd, sizeof(rd)) ||
	    copy_to_user(u64_to_user_ptr(reg.area_ptr), &area, sizeof(area))) {
		ret = -EFAULT;
		goto err;
	}
	ctx->ifq = ifq;
	return 0;
err:
	io_zcrx_ifq_free(ifq);
	return ret;
}

void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
{
	struct io_zcrx_ifq *ifq = ctx->ifq;

	lockdep_assert_held(&ctx->uring_lock);

	if (!ifq)
		return;

	ctx->ifq = NULL;
	io_zcrx_ifq_free(ifq);
}

static struct net_iov *__io_zcrx_get_free_niov(struct io_zcrx_area *area)
{
	unsigned niov_idx;

	lockdep_assert_held(&area->freelist_lock);

	niov_idx = area->freelist[--area->free_count];
	return &area->nia.niovs[niov_idx];
}

static void io_zcrx_return_niov_freelist(struct net_iov *niov)
{
	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

	spin_lock_bh(&area->freelist_lock);
	area->freelist[area->free_count++] = net_iov_idx(niov);
	spin_unlock_bh(&area->freelist_lock);
}

static void io_zcrx_return_niov(struct net_iov *niov)
{
	netmem_ref netmem = net_iov_to_netmem(niov);

	if (!niov->pp) {
		/* copy fallback allocated niovs */
		io_zcrx_return_niov_freelist(niov);
		return;
	}
	page_pool_put_unrefed_netmem(niov->pp, netmem, -1, false);
}

static void io_zcrx_scrub(struct io_zcrx_ifq *ifq)
{
	struct io_zcrx_area *area = ifq->area;
	int i;

	if (!area)
		return;

	/* Reclaim back all buffers given to the user space. */
	for (i = 0; i < area->nia.num_niovs; i++) {
		struct net_iov *niov = &area->nia.niovs[i];
		int nr;

		if (!atomic_read(io_get_user_counter(niov)))
			continue;
		nr = atomic_xchg(io_get_user_counter(niov), 0);
		if (nr && !page_pool_unref_netmem(net_iov_to_netmem(niov), nr))
			io_zcrx_return_niov(niov);
	}
}

void io_shutdown_zcrx_ifqs(struct io_ring_ctx *ctx)
{
	lockdep_assert_held(&ctx->uring_lock);

	if (!ctx->ifq)
		return;
	io_zcrx_scrub(ctx->ifq);
	io_close_queue(ctx->ifq);
}

static inline u32 io_zcrx_rqring_entries(struct io_zcrx_ifq *ifq)
{
	u32 entries;

	entries = smp_load_acquire(&ifq->rq_ring->tail) - ifq->cached_rq_head;
	return min(entries, ifq->rq_entries);
}

static struct io_uring_zcrx_rqe *io_zcrx_get_rqe(struct io_zcrx_ifq *ifq,
						 unsigned mask)
{
	unsigned int idx = ifq->cached_rq_head++ & mask;

	return &ifq->rqes[idx];
}
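/* Fast-path refill: drain user-posted rqes into the page_pool alloc cache.
 * Each rqe->off encodes an area id and niov index; malformed entries and
 * niovs still carrying extra page_pool references are silently skipped.
 */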

static void io_zcrx_ring_refill(struct page_pool *pp,
				struct io_zcrx_ifq *ifq)
{
	unsigned int mask = ifq->rq_entries - 1;
	unsigned int entries;
	netmem_ref netmem;

	spin_lock_bh(&ifq->rq_lock);

	entries = io_zcrx_rqring_entries(ifq);
	entries = min_t(unsigned, entries, PP_ALLOC_CACHE_REFILL - pp->alloc.count);
	if (unlikely(!entries)) {
		spin_unlock_bh(&ifq->rq_lock);
		return;
	}

	do {
		struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(ifq, mask);
		struct io_zcrx_area *area;
		struct net_iov *niov;
		unsigned niov_idx, area_idx;

		area_idx = rqe->off >> IORING_ZCRX_AREA_SHIFT;
		niov_idx = (rqe->off & ~IORING_ZCRX_AREA_MASK) >> PAGE_SHIFT;

		if (unlikely(rqe->__pad || area_idx))
			continue;
		area = ifq->area;

		if (unlikely(niov_idx >= area->nia.num_niovs))
			continue;
		niov_idx = array_index_nospec(niov_idx, area->nia.num_niovs);

		niov = &area->nia.niovs[niov_idx];
		if (!io_zcrx_put_niov_uref(niov))
			continue;

		netmem = net_iov_to_netmem(niov);
		if (page_pool_unref_netmem(netmem, 1) != 0)
			continue;

		if (unlikely(niov->pp != pp)) {
			io_zcrx_return_niov(niov);
			continue;
		}

		io_zcrx_sync_for_device(pp, niov);
		net_mp_netmem_place_in_cache(pp, netmem);
	} while (--entries);

	smp_store_release(&ifq->rq_ring->head, ifq->cached_rq_head);
	spin_unlock_bh(&ifq->rq_lock);
}

static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
{
	struct io_zcrx_area *area = ifq->area;

	spin_lock_bh(&area->freelist_lock);
	while (area->free_count && pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
		struct net_iov *niov = __io_zcrx_get_free_niov(area);
		netmem_ref netmem = net_iov_to_netmem(niov);

		net_mp_niov_set_page_pool(pp, niov);
		io_zcrx_sync_for_device(pp, niov);
		net_mp_netmem_place_in_cache(pp, netmem);
	}
	spin_unlock_bh(&area->freelist_lock);
}

static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
{
	struct io_zcrx_ifq *ifq = pp->mp_priv;

	/* pp should already be ensuring that */
	if (unlikely(pp->alloc.count))
		goto out_return;

	io_zcrx_ring_refill(pp, ifq);
	if (likely(pp->alloc.count))
		goto out_return;

	io_zcrx_refill_slow(pp, ifq);
	if (!pp->alloc.count)
		return 0;
out_return:
	return pp->alloc.cache[--pp->alloc.count];
}

static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
{
	struct net_iov *niov;

	if (WARN_ON_ONCE(!netmem_is_net_iov(netmem)))
		return false;

	niov = netmem_to_net_iov(netmem);
	net_mp_niov_clear_page_pool(niov);
	io_zcrx_return_niov_freelist(niov);
	return false;
}

static int io_pp_zc_init(struct page_pool *pp)
{
	struct io_zcrx_ifq *ifq = pp->mp_priv;

	if (WARN_ON_ONCE(!ifq))
		return -EINVAL;
	if (WARN_ON_ONCE(ifq->dev != pp->p.dev))
		return -EINVAL;
	if (WARN_ON_ONCE(!pp->dma_map))
		return -EOPNOTSUPP;
	if (pp->p.order != 0)
		return -EOPNOTSUPP;
	if (pp->p.dma_dir != DMA_FROM_DEVICE)
		return -EOPNOTSUPP;

	percpu_ref_get(&ifq->ctx->refs);
	return 0;
}

static void io_pp_zc_destroy(struct page_pool *pp)
{
	struct io_zcrx_ifq *ifq = pp->mp_priv;
	struct io_zcrx_area *area = ifq->area;

	if (WARN_ON_ONCE(area->free_count != area->nia.num_niovs))
		return;
	percpu_ref_put(&ifq->ctx->refs);
}

static int io_pp_nl_fill(void *mp_priv, struct sk_buff *rsp,
			 struct netdev_rx_queue *rxq)
{
	struct nlattr *nest;
	int type;

	type = rxq ? NETDEV_A_QUEUE_IO_URING : NETDEV_A_PAGE_POOL_IO_URING;
	nest = nla_nest_start(rsp, type);
	if (!nest)
		return -EMSGSIZE;
	nla_nest_end(rsp, nest);

	return 0;
}

static void io_pp_uninstall(void *mp_priv, struct netdev_rx_queue *rxq)
{
	struct pp_memory_provider_params *p = &rxq->mp_params;
	struct io_zcrx_ifq *ifq = mp_priv;

	io_zcrx_drop_netdev(ifq);
	p->mp_ops = NULL;
	p->mp_priv = NULL;
}

static const struct memory_provider_ops io_uring_pp_zc_ops = {
	.alloc_netmems		= io_pp_zc_alloc_netmems,
	.release_netmem		= io_pp_zc_release_netmem,
	.init			= io_pp_zc_init,
	.destroy		= io_pp_zc_destroy,
	.nl_fill		= io_pp_nl_fill,
	.uninstall		= io_pp_uninstall,
};
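/* Installed on an RX queue via net_mp_open_rxq() at registration time; the
 * page_pool then allocates from and recycles into the user-provided area
 * through these callbacks instead of the normal page allocator.
 */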

static bool io_zcrx_queue_cqe(struct io_kiocb *req, struct net_iov *niov,
			      struct io_zcrx_ifq *ifq, int off, int len)
{
	struct io_uring_zcrx_cqe *rcqe;
	struct io_zcrx_area *area;
	struct io_uring_cqe *cqe;
	u64 offset;

	if (!io_defer_get_uncommited_cqe(req->ctx, &cqe))
		return false;

	cqe->user_data = req->cqe.user_data;
	cqe->res = len;
	cqe->flags = IORING_CQE_F_MORE;

	area = io_zcrx_iov_to_area(niov);
	offset = off + (net_iov_idx(niov) << PAGE_SHIFT);
	rcqe = (struct io_uring_zcrx_cqe *)(cqe + 1);
	rcqe->off = offset + ((u64)area->area_id << IORING_ZCRX_AREA_SHIFT);
	rcqe->__pad = 0;
	return true;
}

static struct net_iov *io_zcrx_alloc_fallback(struct io_zcrx_area *area)
{
	struct net_iov *niov = NULL;

	spin_lock_bh(&area->freelist_lock);
	if (area->free_count)
		niov = __io_zcrx_get_free_niov(area);
	spin_unlock_bh(&area->freelist_lock);

	if (niov)
		page_pool_fragment_netmem(net_iov_to_netmem(niov), 1);
	return niov;
}
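/*
 * Copy fallback: when data landed in regular kernel pages (e.g. linear skb
 * data or non-provider frags), memcpy it into a free niov from the area so
 * user space still consumes it through the same zcrx CQE flow.
 */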

static ssize_t io_zcrx_copy_chunk(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
				  void *src_base, struct page *src_page,
				  unsigned int src_offset, size_t len)
{
	struct io_zcrx_area *area = ifq->area;
	size_t copied = 0;
	int ret = 0;

	while (len) {
		size_t copy_size = min_t(size_t, PAGE_SIZE, len);
		const int dst_off = 0;
		struct net_iov *niov;
		struct page *dst_page;
		void *dst_addr;

		niov = io_zcrx_alloc_fallback(area);
		if (!niov) {
			ret = -ENOMEM;
			break;
		}

		dst_page = io_zcrx_iov_page(niov);
		dst_addr = kmap_local_page(dst_page);
		if (src_page)
			src_base = kmap_local_page(src_page);

		memcpy(dst_addr, src_base + src_offset, copy_size);

		if (src_page)
			kunmap_local(src_base);
		kunmap_local(dst_addr);

		if (!io_zcrx_queue_cqe(req, niov, ifq, dst_off, copy_size)) {
			io_zcrx_return_niov(niov);
			ret = -ENOSPC;
			break;
		}

		io_zcrx_get_niov_uref(niov);
		src_offset += copy_size;
		len -= copy_size;
		copied += copy_size;
	}

	return copied ? copied : ret;
}

static int io_zcrx_copy_frag(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
			     const skb_frag_t *frag, int off, int len)
{
	struct page *page = skb_frag_page(frag);
	u32 p_off, p_len, t, copied = 0;
	int ret = 0;

	off += skb_frag_off(frag);

	skb_frag_foreach_page(frag, off, len,
			      page, p_off, p_len, t) {
		ret = io_zcrx_copy_chunk(req, ifq, NULL, page, p_off, p_len);
		if (ret < 0)
			return copied ? copied : ret;
		copied += ret;
	}
	return copied;
}

static int io_zcrx_recv_frag(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
			     const skb_frag_t *frag, int off, int len)
{
	struct net_iov *niov;

	if (unlikely(!skb_frag_is_net_iov(frag)))
		return io_zcrx_copy_frag(req, ifq, frag, off, len);

	niov = netmem_to_net_iov(frag->netmem);
	if (niov->pp->mp_ops != &io_uring_pp_zc_ops ||
	    niov->pp->mp_priv != ifq)
		return -EFAULT;

	if (!io_zcrx_queue_cqe(req, niov, ifq, off + skb_frag_off(frag), len))
		return -ENOSPC;

	/*
	 * Prevent it from being recycled while user is accessing it.
	 * It has to be done before grabbing a user reference.
	 */
	page_pool_ref_netmem(net_iov_to_netmem(niov));
	io_zcrx_get_niov_uref(niov);
	return len;
}

static int
io_zcrx_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
		 unsigned int offset, size_t len)
{
	struct io_zcrx_args *args = desc->arg.data;
	struct io_zcrx_ifq *ifq = args->ifq;
	struct io_kiocb *req = args->req;
	struct sk_buff *frag_iter;
	unsigned start, start_off = offset;
	int i, copy, end, off;
	int ret = 0;

	len = min_t(size_t, len, desc->count);
	if (unlikely(args->nr_skbs++ > IO_SKBS_PER_CALL_LIMIT))
		return -EAGAIN;

	if (unlikely(offset < skb_headlen(skb))) {
		ssize_t copied;
		size_t to_copy;

		to_copy = min_t(size_t, skb_headlen(skb) - offset, len);
		copied = io_zcrx_copy_chunk(req, ifq, skb->data, NULL,
					    offset, to_copy);
		if (copied < 0) {
			ret = copied;
			goto out;
		}
		offset += copied;
		len -= copied;
		if (!len)
			goto out;
		if (offset != skb_headlen(skb))
			goto out;
	}

	start = skb_headlen(skb);

	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		const skb_frag_t *frag;

		if (WARN_ON(start > offset + len))
			return -EFAULT;

		frag = &skb_shinfo(skb)->frags[i];
		end = start + skb_frag_size(frag);

		if (offset < end) {
			copy = end - offset;
			if (copy > len)
				copy = len;

			off = offset - start;
			ret = io_zcrx_recv_frag(req, ifq, frag, off, copy);
			if (ret < 0)
				goto out;

			offset += ret;
			len -= ret;
			if (len == 0 || ret != copy)
				goto out;
		}
		start = end;
	}

	skb_walk_frags(skb, frag_iter) {
		if (WARN_ON(start > offset + len))
			return -EFAULT;

		end = start + frag_iter->len;
		if (offset < end) {
			copy = end - offset;
			if (copy > len)
				copy = len;

			off = offset - start;
			ret = io_zcrx_recv_skb(desc, frag_iter, off, copy);
			if (ret < 0)
				goto out;

			offset += ret;
			len -= ret;
			if (len == 0 || ret != copy)
				goto out;
		}
		start = end;
	}

out:
	if (offset == start_off)
		return ret;
	desc->count -= (offset - start_off);
	return offset - start_off;
}

static int io_zcrx_tcp_recvmsg(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
			       struct sock *sk, int flags,
			       unsigned issue_flags, unsigned int *outlen)
{
	unsigned int len = *outlen;
	struct io_zcrx_args args = {
		.req = req,
		.ifq = ifq,
		.sock = sk->sk_socket,
	};
	read_descriptor_t rd_desc = {
		.count = len ? len : UINT_MAX,
		.arg.data = &args,
	};
	int ret;

	lock_sock(sk);
	ret = tcp_read_sock(sk, &rd_desc, io_zcrx_recv_skb);
	if (len && ret > 0)
		*outlen = len - ret;
	if (ret <= 0) {
		if (ret < 0 || sock_flag(sk, SOCK_DONE))
			goto out;
		if (sk->sk_err)
			ret = sock_error(sk);
		else if (sk->sk_shutdown & RCV_SHUTDOWN)
			goto out;
		else if (sk->sk_state == TCP_CLOSE)
			ret = -ENOTCONN;
		else
			ret = -EAGAIN;
	} else if (unlikely(args.nr_skbs > IO_SKBS_PER_CALL_LIMIT) &&
		   (issue_flags & IO_URING_F_MULTISHOT)) {
		ret = IOU_REQUEUE;
	} else if (sock_flag(sk, SOCK_DONE)) {
		/* Make it retry until it finally gets 0. */
		if (issue_flags & IO_URING_F_MULTISHOT)
			ret = IOU_REQUEUE;
		else
			ret = -EAGAIN;
	}
out:
	release_sock(sk);
	return ret;
}

int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
		 struct socket *sock, unsigned int flags,
		 unsigned issue_flags, unsigned int *len)
{
	struct sock *sk = sock->sk;
	const struct proto *prot = READ_ONCE(sk->sk_prot);

	if (prot->recvmsg != tcp_recvmsg)
		return -EPROTONOSUPPORT;

	sock_rps_record_flow(sk);
	return io_zcrx_tcp_recvmsg(req, ifq, sk, flags, issue_flags, len);
}

io_uring/zcrx.h
@@ -0,0 +1,73 @@
// SPDX-License-Identifier: GPL-2.0
#ifndef IOU_ZC_RX_H
#define IOU_ZC_RX_H

#include <linux/io_uring_types.h>
#include <linux/socket.h>
#include <net/page_pool/types.h>
#include <net/net_trackers.h>

struct io_zcrx_area {
	struct net_iov_area	nia;
	struct io_zcrx_ifq	*ifq;
	atomic_t		*user_refs;

	bool			is_mapped;
	u16			area_id;
	struct page		**pages;

	/* freelist */
	spinlock_t		freelist_lock ____cacheline_aligned_in_smp;
	u32			free_count;
	u32			*freelist;
};

struct io_zcrx_ifq {
	struct io_ring_ctx	*ctx;
	struct io_zcrx_area	*area;

	struct io_uring		*rq_ring;
	struct io_uring_zcrx_rqe *rqes;
	u32			rq_entries;
	u32			cached_rq_head;
	spinlock_t		rq_lock;

	u32			if_rxq;
	struct device		*dev;
	struct net_device	*netdev;
	netdevice_tracker	netdev_tracker;
	spinlock_t		lock;
};

#if defined(CONFIG_IO_URING_ZCRX)
int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
			 struct io_uring_zcrx_ifq_reg __user *arg);
void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx);
void io_shutdown_zcrx_ifqs(struct io_ring_ctx *ctx);
int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
		 struct socket *sock, unsigned int flags,
		 unsigned issue_flags, unsigned int *len);
#else
static inline int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
				       struct io_uring_zcrx_ifq_reg __user *arg)
{
	return -EOPNOTSUPP;
}
static inline void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
{
}
static inline void io_shutdown_zcrx_ifqs(struct io_ring_ctx *ctx)
{
}
static inline int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
			       struct socket *sock, unsigned int flags,
			       unsigned issue_flags, unsigned int *len)
{
	return -EOPNOTSUPP;
}
#endif

int io_recvzc(struct io_kiocb *req, unsigned int issue_flags);
int io_recvzc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);

#endif

net/bridge/br_mdb.c
@@ -1040,7 +1040,7 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,

	/* host join */
	if (!port) {
		if (mp->host_joined) {
		if (mp->host_joined && !(cfg->nlflags & NLM_F_REPLACE)) {
			NL_SET_ERR_MSG_MOD(extack, "Group is already joined by host");
			return -EEXIST;
		}

net/core/dev.c
@@ -159,6 +159,7 @@
#include <net/netdev_rx_queue.h>
#include <net/page_pool/types.h>
#include <net/page_pool/helpers.h>
#include <net/page_pool/memory_provider.h>
#include <net/rps.h>
#include <linux/phy_link_topology.h>

@@ -6184,16 +6185,18 @@ EXPORT_SYMBOL(netif_receive_skb_list);
static void flush_backlog(struct work_struct *work)
{
	struct sk_buff *skb, *tmp;
	struct sk_buff_head list;
	struct softnet_data *sd;

	__skb_queue_head_init(&list);
	local_bh_disable();
	sd = this_cpu_ptr(&softnet_data);

	backlog_lock_irq_disable(sd);
	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
		if (READ_ONCE(skb->dev->reg_state) == NETREG_UNREGISTERING) {
			__skb_unlink(skb, &sd->input_pkt_queue);
			dev_kfree_skb_irq(skb);
			__skb_queue_tail(&list, skb);
			rps_input_queue_head_incr(sd);
		}
	}

@@ -6201,14 +6204,16 @@ static void flush_backlog(struct work_struct *work)

	local_lock_nested_bh(&softnet_data.process_queue_bh_lock);
	skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
		if (READ_ONCE(skb->dev->reg_state) == NETREG_UNREGISTERING) {
			__skb_unlink(skb, &sd->process_queue);
			kfree_skb(skb);
			__skb_queue_tail(&list, skb);
			rps_input_queue_head_incr(sd);
		}
	}
	local_unlock_nested_bh(&softnet_data.process_queue_bh_lock);
	local_bh_enable();

	__skb_queue_purge_reason(&list, SKB_DROP_REASON_DEV_READY);
}
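/* Unlinked skbs are now collected on a local list under the backlog locks
 * and freed in one batch afterwards, shortening the irq-disabled sections
 * and tagging the drops with SKB_DROP_REASON_DEV_READY.
 */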

static bool flush_required(int cpu)

@@ -7153,6 +7158,9 @@ void __netif_napi_del_locked(struct napi_struct *napi)
	if (!test_and_clear_bit(NAPI_STATE_LISTED, &napi->state))
		return;

	/* Make sure NAPI is disabled (or was never enabled). */
	WARN_ON(!test_bit(NAPI_STATE_SCHED, &napi->state));

	if (napi->config) {
		napi->index = -1;
		napi->config = NULL;

@@ -11820,6 +11828,19 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head)
}
EXPORT_SYMBOL(unregister_netdevice_queue);
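/* Generalises the old dev_dmabuf_uninstall(): any memory provider bound to
 * an RX queue (devmem dmabuf or io_uring zcrx) gets its ->uninstall()
 * callback before the device goes away.
 */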

static void dev_memory_provider_uninstall(struct net_device *dev)
{
	unsigned int i;

	for (i = 0; i < dev->real_num_rx_queues; i++) {
		struct netdev_rx_queue *rxq = &dev->_rx[i];
		struct pp_memory_provider_params *p = &rxq->mp_params;

		if (p->mp_ops && p->mp_ops->uninstall)
			p->mp_ops->uninstall(rxq->mp_params.mp_priv, rxq);
	}
}

void unregister_netdevice_many_notify(struct list_head *head,
				      u32 portid, const struct nlmsghdr *nlh)
{

@@ -11874,7 +11895,7 @@ void unregister_netdevice_many_notify(struct list_head *head,
		dev_tcx_uninstall(dev);
		dev_xdp_uninstall(dev);
		bpf_dev_bound_netdev_unregister(dev);
		dev_dmabuf_uninstall(dev);
		dev_memory_provider_uninstall(dev);

		netdev_offload_xstats_disable_all(dev);

net/core/devmem.c
@@ -16,6 +16,7 @@
#include <net/netdev_queues.h>
#include <net/netdev_rx_queue.h>
#include <net/page_pool/helpers.h>
#include <net/page_pool/memory_provider.h>
#include <trace/events/page_pool.h>

#include "devmem.h"

@@ -27,20 +28,28 @@
/* Protected by rtnl_lock() */
static DEFINE_XARRAY_FLAGS(net_devmem_dmabuf_bindings, XA_FLAGS_ALLOC1);

static const struct memory_provider_ops dmabuf_devmem_ops;

bool net_is_devmem_iov(struct net_iov *niov)
{
	return niov->pp->mp_ops == &dmabuf_devmem_ops;
}

static void net_devmem_dmabuf_free_chunk_owner(struct gen_pool *genpool,
					       struct gen_pool_chunk *chunk,
					       void *not_used)
{
	struct dmabuf_genpool_chunk_owner *owner = chunk->owner;

	kvfree(owner->niovs);
	kvfree(owner->area.niovs);
	kfree(owner);
}

static dma_addr_t net_devmem_get_dma_addr(const struct net_iov *niov)
{
	struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
	struct dmabuf_genpool_chunk_owner *owner;

	owner = net_devmem_iov_to_chunk_owner(niov);
	return owner->base_dma_addr +
	       ((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
}

@@ -83,7 +92,7 @@ net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)

	offset = dma_addr - owner->base_dma_addr;
	index = offset / PAGE_SIZE;
	niov = &owner->niovs[index];
	niov = &owner->area.niovs[index];

	niov->pp_magic = 0;
	niov->pp = NULL;

@@ -94,7 +103,7 @@ net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)

void net_devmem_free_dmabuf(struct net_iov *niov)
{
	struct net_devmem_dmabuf_binding *binding = net_iov_binding(niov);
	struct net_devmem_dmabuf_binding *binding = net_devmem_iov_binding(niov);
	unsigned long dma_addr = net_devmem_get_dma_addr(niov);

	if (WARN_ON(!gen_pool_has_addr(binding->chunk_pool, dma_addr,

@@ -117,6 +126,7 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
		WARN_ON(rxq->mp_params.mp_priv != binding);

		rxq->mp_params.mp_priv = NULL;
		rxq->mp_params.mp_ops = NULL;

		rxq_idx = get_netdev_rx_queue_index(rxq);

@@ -152,7 +162,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
	}

	rxq = __netif_get_rx_queue(dev, rxq_idx);
	if (rxq->mp_params.mp_priv) {
	if (rxq->mp_params.mp_ops) {
		NL_SET_ERR_MSG(extack, "designated queue already memory provider bound");
		return -EEXIST;
	}

@@ -170,6 +180,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
		return err;

	rxq->mp_params.mp_priv = binding;
	rxq->mp_params.mp_ops = &dmabuf_devmem_ops;

	err = netdev_rx_queue_restart(dev, rxq_idx);
	if (err)

@@ -179,6 +190,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,

err_xa_erase:
	rxq->mp_params.mp_priv = NULL;
	rxq->mp_params.mp_ops = NULL;
	xa_erase(&binding->bound_rxqs, xa_idx);

	return err;

@@ -261,9 +273,9 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
			goto err_free_chunks;
		}

		owner->base_virtual = virtual;
		owner->area.base_virtual = virtual;
		owner->base_dma_addr = dma_addr;
		owner->num_niovs = len / PAGE_SIZE;
		owner->area.num_niovs = len / PAGE_SIZE;
		owner->binding = binding;

		err = gen_pool_add_owner(binding->chunk_pool, dma_addr,

@@ -275,17 +287,17 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
			goto err_free_chunks;
		}

		owner->niovs = kvmalloc_array(owner->num_niovs,
					      sizeof(*owner->niovs),
					      GFP_KERNEL);
		if (!owner->niovs) {
		owner->area.niovs = kvmalloc_array(owner->area.num_niovs,
						   sizeof(*owner->area.niovs),
						   GFP_KERNEL);
		if (!owner->area.niovs) {
			err = -ENOMEM;
			goto err_free_chunks;
		}

		for (i = 0; i < owner->num_niovs; i++) {
			niov = &owner->niovs[i];
			niov->owner = owner;
		for (i = 0; i < owner->area.num_niovs; i++) {
			niov = &owner->area.niovs[i];
			niov->owner = &owner->area;
			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
						      net_devmem_get_dma_addr(niov));
		}

@@ -313,26 +325,6 @@ err_put_dmabuf:
	return ERR_PTR(err);
}

void dev_dmabuf_uninstall(struct net_device *dev)
{
	struct net_devmem_dmabuf_binding *binding;
	struct netdev_rx_queue *rxq;
	unsigned long xa_idx;
	unsigned int i;

	for (i = 0; i < dev->real_num_rx_queues; i++) {
		binding = dev->_rx[i].mp_params.mp_priv;
		if (!binding)
			continue;

		xa_for_each(&binding->bound_rxqs, xa_idx, rxq)
			if (rxq == &dev->_rx[i]) {
				xa_erase(&binding->bound_rxqs, xa_idx);
				break;
			}
	}
}

/*** "Dmabuf devmem memory provider" ***/

int mp_dmabuf_devmem_init(struct page_pool *pool)

@@ -398,3 +390,36 @@ bool mp_dmabuf_devmem_release_page(struct page_pool *pool, netmem_ref netmem)
	/* We don't want the page pool put_page()ing our net_iovs. */
	return false;
}

static int mp_dmabuf_devmem_nl_fill(void *mp_priv, struct sk_buff *rsp,
				    struct netdev_rx_queue *rxq)
{
	const struct net_devmem_dmabuf_binding *binding = mp_priv;
	int type = rxq ? NETDEV_A_QUEUE_DMABUF : NETDEV_A_PAGE_POOL_DMABUF;

	return nla_put_u32(rsp, type, binding->id);
}

static void mp_dmabuf_devmem_uninstall(void *mp_priv,
				       struct netdev_rx_queue *rxq)
{
	struct net_devmem_dmabuf_binding *binding = mp_priv;
	struct netdev_rx_queue *bound_rxq;
	unsigned long xa_idx;

	xa_for_each(&binding->bound_rxqs, xa_idx, bound_rxq) {
		if (bound_rxq == rxq) {
			xa_erase(&binding->bound_rxqs, xa_idx);
			break;
		}
	}
}

static const struct memory_provider_ops dmabuf_devmem_ops = {
	.init			= mp_dmabuf_devmem_init,
	.destroy		= mp_dmabuf_devmem_destroy,
	.alloc_netmems		= mp_dmabuf_devmem_alloc_netmems,
	.release_netmem		= mp_dmabuf_devmem_release_page,
	.nl_fill		= mp_dmabuf_devmem_nl_fill,
	.uninstall		= mp_dmabuf_devmem_uninstall,
};

net/core/devmem.h
@@ -10,6 +10,8 @@
#ifndef _NET_DEVMEM_H
#define _NET_DEVMEM_H

#include <net/netmem.h>

struct netlink_ext_ack;

struct net_devmem_dmabuf_binding {

@@ -51,17 +53,11 @@ struct net_devmem_dmabuf_binding {
 * allocations from this chunk.
 */
struct dmabuf_genpool_chunk_owner {
	/* Offset into the dma-buf where this chunk starts. */
	unsigned long base_virtual;
	struct net_iov_area area;
	struct net_devmem_dmabuf_binding *binding;

	/* dma_addr of the start of the chunk. */
	dma_addr_t base_dma_addr;

	/* Array of net_iovs for this chunk. */
	struct net_iov *niovs;
	size_t num_niovs;

	struct net_devmem_dmabuf_binding *binding;
};
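/* The chunk owner now embeds a generic net_iov_area, letting net_iov_owner()
 * and net_iov_idx() be shared across memory providers; devmem-only state is
 * recovered via container_of() in net_devmem_iov_to_chunk_owner().
 */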

void __net_devmem_dmabuf_binding_free(struct net_devmem_dmabuf_binding *binding);

@@ -72,38 +68,34 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding);
int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
				    struct net_devmem_dmabuf_binding *binding,
				    struct netlink_ext_ack *extack);
void dev_dmabuf_uninstall(struct net_device *dev);

static inline struct dmabuf_genpool_chunk_owner *
net_iov_owner(const struct net_iov *niov)
net_devmem_iov_to_chunk_owner(const struct net_iov *niov)
{
	return niov->owner;
}
	struct net_iov_area *owner = net_iov_owner(niov);

static inline unsigned int net_iov_idx(const struct net_iov *niov)
{
	return niov - net_iov_owner(niov)->niovs;
	return container_of(owner, struct dmabuf_genpool_chunk_owner, area);
}

static inline struct net_devmem_dmabuf_binding *
net_iov_binding(const struct net_iov *niov)
net_devmem_iov_binding(const struct net_iov *niov)
{
	return net_iov_owner(niov)->binding;
	return net_devmem_iov_to_chunk_owner(niov)->binding;
}

static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
{
	return net_devmem_iov_binding(niov)->id;
}

static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
{
	struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
	struct net_iov_area *owner = net_iov_owner(niov);

	return owner->base_virtual +
	       ((unsigned long)net_iov_idx(niov) << PAGE_SHIFT);
}

static inline u32 net_iov_binding_id(const struct net_iov *niov)
{
	return net_iov_owner(niov)->binding->id;
}

static inline void
net_devmem_dmabuf_binding_get(struct net_devmem_dmabuf_binding *binding)
{

@@ -123,6 +115,8 @@ struct net_iov *
net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding);
void net_devmem_free_dmabuf(struct net_iov *ppiov);

bool net_is_devmem_iov(struct net_iov *niov);

#else
struct net_devmem_dmabuf_binding;

@@ -152,10 +146,6 @@ net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
	return -EOPNOTSUPP;
}

static inline void dev_dmabuf_uninstall(struct net_device *dev)
{
}

static inline struct net_iov *
net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
{

@@ -171,10 +161,15 @@ static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
	return 0;
}

static inline u32 net_iov_binding_id(const struct net_iov *niov)
static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
{
	return 0;
}

static inline bool net_is_devmem_iov(struct net_iov *niov)
{
	return false;
}
#endif

#endif /* _NET_DEVMEM_H */

net/core/neighbour.c
@@ -832,12 +832,10 @@ static int pneigh_ifdown_and_unlock(struct neigh_table *tbl,
	return -ENOENT;
}

static void neigh_parms_destroy(struct neigh_parms *parms);

static inline void neigh_parms_put(struct neigh_parms *parms)
{
	if (refcount_dec_and_test(&parms->refcnt))
		neigh_parms_destroy(parms);
		kfree(parms);
}

/*

@@ -1713,11 +1711,6 @@ void neigh_parms_release(struct neigh_table *tbl, struct neigh_parms *parms)
}
EXPORT_SYMBOL(neigh_parms_release);

static void neigh_parms_destroy(struct neigh_parms *parms)
{
	kfree(parms);
}

static struct lock_class_key neigh_table_proxy_queue_class;

static struct neigh_table __rcu *neigh_tables[NEIGH_NR_TABLES] __read_mostly;

net/core/net-sysfs.c
@@ -42,6 +42,87 @@ static inline int dev_isalive(const struct net_device *dev)
	return READ_ONCE(dev->reg_state) <= NETREG_REGISTERED;
}

/* There is a possible ABBA deadlock between rtnl_lock and kernfs_node->active,
 * when unregistering a net device and accessing associated sysfs files. The
 * potential deadlock is as follows:
 *
 * CPU 0                                      CPU 1
 *
 * rtnl_lock                                  vfs_read
 * unregister_netdevice_many                  kernfs_seq_start
 *   device_del / kobject_put                   kernfs_get_active (kn->active++)
 * kernfs_drain                               sysfs_kf_seq_show
 *   wait_event(                                rtnl_lock
 *     kn->active == KN_DEACTIVATED_BIAS)         -> waits on CPU 0 to release
 *       -> waits on CPU 1 to decrease kn->active    the rtnl lock.
 *
 * The historical fix was to use rtnl_trylock with restart_syscall to bail out
 * of sysfs operations when the lock couldn't be taken. This fixed the above
 * issue as it allowed CPU 1 to bail out of the ABBA situation.
 *
 * But it came with performance issues, as syscalls are being restarted in
 * loops when there was contention on the rtnl lock, with huge slowdowns in
 * specific scenarios (e.g. lots of virtual interfaces created and userspace
 * daemons querying their attributes).
 *
 * The idea below is to bail out of the active kernfs_node protection
 * (kn->active) while trying to take the rtnl lock.
 *
 * This replaces rtnl_lock() and still has to be used with rtnl_unlock(). The
 * net device is guaranteed to be alive if this returns successfully.
 */
static int sysfs_rtnl_lock(struct kobject *kobj, struct attribute *attr,
			   struct net_device *ndev)
{
	struct kernfs_node *kn;
	int ret = 0;

	/* First, we hold a reference to the net device as the unregistration
	 * path might run in parallel. This will ensure the net device and the
	 * associated sysfs objects won't be freed while we try to take the rtnl
	 * lock.
	 */
	dev_hold(ndev);
	/* sysfs_break_active_protection was introduced to allow self-removal of
	 * devices and their associated sysfs files by bailing out of the
	 * sysfs/kernfs protection. We do this here to allow the unregistration
	 * path to complete in parallel. The following takes a reference on the
	 * kobject and the kernfs_node being accessed.
	 *
	 * This works because we hold a reference onto the net device and the
	 * unregistration path will wait for us eventually in netdev_run_todo
	 * (outside an rtnl lock section).
	 */
	kn = sysfs_break_active_protection(kobj, attr);
	/* We can now try to take the rtnl lock. This can't deadlock us as the
	 * unregistration path is able to drain sysfs files (kernfs_node) thanks
	 * to the above dance.
	 */
	if (rtnl_lock_interruptible()) {
		ret = -ERESTARTSYS;
		goto unbreak;
	}
	/* Check dismantle on the device hasn't started, otherwise deny the
	 * operation.
	 */
	if (!dev_isalive(ndev)) {
		rtnl_unlock();
		ret = -ENODEV;
		goto unbreak;
	}
	/* We are now sure the device dismantle hasn't started nor that it can
	 * start before we exit the locking section as we hold the rtnl lock.
	 * There's no need to keep unbreaking the sysfs protection nor to hold
	 * a net device reference from that point; that was only needed to take
	 * the rtnl lock.
	 */
unbreak:
	sysfs_unbreak_active_protection(kn);
	dev_put(ndev);

	return ret;
}
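/* The show/store handlers below are converted from rtnl_trylock() +
 * restart_syscall() to sysfs_rtnl_lock(), so contended accesses sleep on
 * the lock instead of looping through syscall restarts.
 */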
|
||||
|
||||
/* use same locking rules as GIF* ioctl's */
|
||||
static ssize_t netdev_show(const struct device *dev,
|
||||
struct device_attribute *attr, char *buf,
|
||||
|
|
@ -95,14 +176,14 @@ static ssize_t netdev_store(struct device *dev, struct device_attribute *attr,
|
|||
if (ret)
|
||||
goto err;
|
||||
|
||||
if (!rtnl_trylock())
|
||||
return restart_syscall();
|
||||
ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
ret = (*set)(netdev, new);
|
||||
if (ret == 0)
|
||||
ret = len;
|
||||
|
||||
if (dev_isalive(netdev)) {
|
||||
ret = (*set)(netdev, new);
|
||||
if (ret == 0)
|
||||
ret = len;
|
||||
}
|
||||
rtnl_unlock();
|
||||
err:
|
||||
return ret;
|
||||
|
|
@ -220,7 +301,7 @@ static ssize_t carrier_store(struct device *dev, struct device_attribute *attr,
|
|||
struct net_device *netdev = to_net_dev(dev);
|
||||
|
||||
/* The check is also done in change_carrier; this helps returning early
|
||||
* without hitting the trylock/restart in netdev_store.
|
||||
* without hitting the locking section in netdev_store.
|
||||
*/
|
||||
if (!netdev->netdev_ops->ndo_change_carrier)
|
||||
return -EOPNOTSUPP;
|
||||
|
|
@ -234,8 +315,9 @@ static ssize_t carrier_show(struct device *dev,
|
|||
struct net_device *netdev = to_net_dev(dev);
|
||||
int ret = -EINVAL;
|
||||
|
||||
if (!rtnl_trylock())
|
||||
return restart_syscall();
|
||||
ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (netif_running(netdev)) {
|
||||
/* Synchronize carrier state with link watch,
|
||||
|
|
@ -245,8 +327,8 @@ static ssize_t carrier_show(struct device *dev,
|
|||
|
||||
ret = sysfs_emit(buf, fmt_dec, !!netif_carrier_ok(netdev));
|
||||
}
|
||||
rtnl_unlock();
|
||||
|
||||
rtnl_unlock();
|
||||
return ret;
|
||||
}
|
||||
static DEVICE_ATTR_RW(carrier);
|
||||
|
|
@ -258,13 +340,14 @@ static ssize_t speed_show(struct device *dev,
|
|||
int ret = -EINVAL;
|
||||
|
||||
/* The check is also done in __ethtool_get_link_ksettings; this helps
|
||||
* returning early without hitting the trylock/restart below.
|
||||
* returning early without hitting the locking section below.
|
||||
*/
|
||||
if (!netdev->ethtool_ops->get_link_ksettings)
|
||||
return ret;
|
||||
|
||||
if (!rtnl_trylock())
|
||||
return restart_syscall();
|
||||
ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (netif_running(netdev)) {
|
||||
struct ethtool_link_ksettings cmd;
|
||||
|
|
@ -284,13 +367,14 @@ static ssize_t duplex_show(struct device *dev,
|
|||
int ret = -EINVAL;
|
||||
|
||||
/* The check is also done in __ethtool_get_link_ksettings; this helps
|
||||
* returning early without hitting the trylock/restart below.
|
||||
* returning early without hitting the locking section below.
|
||||
*/
|
||||
if (!netdev->ethtool_ops->get_link_ksettings)
|
||||
return ret;
|
||||
|
||||
if (!rtnl_trylock())
|
||||
return restart_syscall();
|
||||
ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (netif_running(netdev)) {
|
||||
struct ethtool_link_ksettings cmd;
|
||||
|
|
@@ -490,16 +574,15 @@ static ssize_t ifalias_store(struct device *dev, struct device_attribute *attr,
 	if (len > 0 && buf[len - 1] == '\n')
 		--count;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
+	if (ret)
+		return ret;
 
-	if (dev_isalive(netdev)) {
-		ret = dev_set_alias(netdev, buf, count);
-		if (ret < 0)
-			goto err;
-		ret = len;
-		netdev_state_change(netdev);
-	}
+	ret = dev_set_alias(netdev, buf, count);
+	if (ret < 0)
+		goto err;
+	ret = len;
+	netdev_state_change(netdev);
 err:
 	rtnl_unlock();
 
@@ -551,24 +634,23 @@ static ssize_t phys_port_id_show(struct device *dev,
 					struct device_attribute *attr, char *buf)
 {
 	struct net_device *netdev = to_net_dev(dev);
+	struct netdev_phys_item_id ppid;
 	ssize_t ret = -EINVAL;
 
 	/* The check is also done in dev_get_phys_port_id; this helps returning
-	 * early without hitting the trylock/restart below.
+	 * early without hitting the locking section below.
 	 */
 	if (!netdev->netdev_ops->ndo_get_phys_port_id)
 		return -EOPNOTSUPP;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
+	if (ret)
+		return ret;
 
-	if (dev_isalive(netdev)) {
-		struct netdev_phys_item_id ppid;
-
-		ret = dev_get_phys_port_id(netdev, &ppid);
-		if (!ret)
-			ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id);
-	}
+	ret = dev_get_phys_port_id(netdev, &ppid);
+	if (!ret)
+		ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id);
+
 	rtnl_unlock();
 
 	return ret;
@@ -580,24 +662,23 @@ static ssize_t phys_port_name_show(struct device *dev,
 {
 	struct net_device *netdev = to_net_dev(dev);
 	ssize_t ret = -EINVAL;
+	char name[IFNAMSIZ];
 
 	/* The checks are also done in dev_get_phys_port_name; this helps
-	 * returning early without hitting the trylock/restart below.
+	 * returning early without hitting the locking section below.
 	 */
 	if (!netdev->netdev_ops->ndo_get_phys_port_name &&
 	    !netdev->devlink_port)
 		return -EOPNOTSUPP;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
+	if (ret)
+		return ret;
 
-	if (dev_isalive(netdev)) {
-		char name[IFNAMSIZ];
-
-		ret = dev_get_phys_port_name(netdev, name, sizeof(name));
-		if (!ret)
-			ret = sysfs_emit(buf, "%s\n", name);
-	}
+	ret = dev_get_phys_port_name(netdev, name, sizeof(name));
+	if (!ret)
+		ret = sysfs_emit(buf, "%s\n", name);
+
 	rtnl_unlock();
 
 	return ret;
@@ -608,26 +689,25 @@ static ssize_t phys_switch_id_show(struct device *dev,
 					struct device_attribute *attr, char *buf)
 {
 	struct net_device *netdev = to_net_dev(dev);
+	struct netdev_phys_item_id ppid = { };
 	ssize_t ret = -EINVAL;
 
 	/* The checks are also done in dev_get_phys_port_name; this helps
-	 * returning early without hitting the trylock/restart below. This works
+	 * returning early without hitting the locking section below. This works
 	 * because recurse is false when calling dev_get_port_parent_id.
 	 */
 	if (!netdev->netdev_ops->ndo_get_port_parent_id &&
 	    !netdev->devlink_port)
 		return -EOPNOTSUPP;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(&dev->kobj, &attr->attr, netdev);
+	if (ret)
+		return ret;
 
-	if (dev_isalive(netdev)) {
-		struct netdev_phys_item_id ppid = { };
-
-		ret = dev_get_port_parent_id(netdev, &ppid, false);
-		if (!ret)
-			ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id);
-	}
+	ret = dev_get_port_parent_id(netdev, &ppid, false);
+	if (!ret)
+		ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id);
+
 	rtnl_unlock();
 
 	return ret;
@@ -1108,7 +1188,6 @@ static void rx_queue_get_ownership(const struct kobject *kobj,
 static const struct kobj_type rx_queue_ktype = {
 	.sysfs_ops = &rx_queue_sysfs_ops,
 	.release = rx_queue_release,
-	.default_groups = rx_queue_default_groups,
 	.namespace = rx_queue_namespace,
 	.get_ownership = rx_queue_get_ownership,
 };
@@ -1131,6 +1210,22 @@ static int rx_queue_add_kobject(struct net_device *dev, int index)
 	struct kobject *kobj = &queue->kobj;
 	int error = 0;
 
+	/* Rx queues are cleared in rx_queue_release to allow later
+	 * re-registration. This is triggered when their kobj refcount is
+	 * dropped.
+	 *
+	 * If a queue is removed while both a read (or write) operation and
+	 * the re-addition of the same queue are pending (waiting on rtnl_lock),
+	 * it might happen that the re-addition executes before the read,
+	 * making the initial removal never happen (the queue's kobj refcount
+	 * won't drop enough because of the pending read). In such a rare case,
+	 * return to allow the removal operation to complete.
+	 */
+	if (unlikely(kobj->state_initialized)) {
+		netdev_warn_once(dev, "Cannot re-add rx queues before their removal completed");
+		return -EAGAIN;
+	}
+
 	/* Kobject_put later will trigger rx_queue_release call which
 	 * decreases dev refcount: Take that reference here
 	 */
@@ -1142,20 +1237,27 @@ static int rx_queue_add_kobject(struct net_device *dev, int index)
 	if (error)
 		goto err;
 
+	queue->groups = rx_queue_default_groups;
+	error = sysfs_create_groups(kobj, queue->groups);
+	if (error)
+		goto err;
+
 	if (dev->sysfs_rx_queue_group) {
 		error = sysfs_create_group(kobj, dev->sysfs_rx_queue_group);
 		if (error)
-			goto err;
+			goto err_default_groups;
 	}
 
 	error = rx_queue_default_mask(dev, queue);
 	if (error)
-		goto err;
+		goto err_default_groups;
 
 	kobject_uevent(kobj, KOBJ_ADD);
 
 	return error;
 
+err_default_groups:
+	sysfs_remove_groups(kobj, queue->groups);
 err:
 	kobject_put(kobj);
 	return error;
@@ -1200,12 +1302,14 @@ net_rx_queue_update_kobjects(struct net_device *dev, int old_num, int new_num)
 	}
 
 	while (--i >= new_num) {
-		struct kobject *kobj = &dev->_rx[i].kobj;
+		struct netdev_rx_queue *queue = &dev->_rx[i];
+		struct kobject *kobj = &queue->kobj;
 
 		if (!refcount_read(&dev_net(dev)->ns.count))
 			kobj->uevent_suppress = 1;
 		if (dev->sysfs_rx_queue_group)
 			sysfs_remove_group(kobj, dev->sysfs_rx_queue_group);
+		sysfs_remove_groups(kobj, queue->groups);
 		kobject_put(kobj);
 	}
 
@@ -1244,9 +1348,11 @@ static int net_rx_queue_change_owner(struct net_device *dev, int num,
  */
 struct netdev_queue_attribute {
 	struct attribute attr;
-	ssize_t (*show)(struct netdev_queue *queue, char *buf);
-	ssize_t (*store)(struct netdev_queue *queue,
-			 const char *buf, size_t len);
+	ssize_t (*show)(struct kobject *kobj, struct attribute *attr,
+			struct netdev_queue *queue, char *buf);
+	ssize_t (*store)(struct kobject *kobj, struct attribute *attr,
+			 struct netdev_queue *queue, const char *buf,
+			 size_t len);
 };
 #define to_netdev_queue_attr(_attr) \
 	container_of(_attr, struct netdev_queue_attribute, attr)
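Threading kobj and attr through every show/store handler is what lets each handler call sysfs_rtnl_lock(kobj, attr, ...), which needs to know which sysfs file it is blocking on. A hypothetical attribute written against the new signature (the name and the emitted field are invented for illustration):

	static ssize_t my_stat_show(struct kobject *kobj, struct attribute *attr,
				    struct netdev_queue *queue, char *buf)
	{
		/* handlers that touch no RTNL-protected state can still emit
		 * directly, as tx_timeout_show does below */
		return sysfs_emit(buf, "%lu\n",
				  atomic_long_read(&queue->trans_timeout));
	}

	static struct netdev_queue_attribute my_stat_attribute __ro_after_init =
		__ATTR(my_stat, 0444, my_stat_show, NULL);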
@@ -1263,7 +1369,7 @@ static ssize_t netdev_queue_attr_show(struct kobject *kobj,
 	if (!attribute->show)
 		return -EIO;
 
-	return attribute->show(queue, buf);
+	return attribute->show(kobj, attr, queue, buf);
 }
 
 static ssize_t netdev_queue_attr_store(struct kobject *kobj,
@@ -1277,7 +1383,7 @@ static ssize_t netdev_queue_attr_store(struct kobject *kobj,
 	if (!attribute->store)
 		return -EIO;
 
-	return attribute->store(queue, buf, count);
+	return attribute->store(kobj, attr, queue, buf, count);
 }
 
 static const struct sysfs_ops netdev_queue_sysfs_ops = {
@@ -1285,7 +1391,8 @@ static const struct sysfs_ops netdev_queue_sysfs_ops = {
 	.store = netdev_queue_attr_store,
 };
 
-static ssize_t tx_timeout_show(struct netdev_queue *queue, char *buf)
+static ssize_t tx_timeout_show(struct kobject *kobj, struct attribute *attr,
+			       struct netdev_queue *queue, char *buf)
 {
 	unsigned long trans_timeout = atomic_long_read(&queue->trans_timeout);
 
@@ -1303,18 +1410,18 @@ static unsigned int get_netdev_queue_index(struct netdev_queue *queue)
 	return i;
 }
 
-static ssize_t traffic_class_show(struct netdev_queue *queue,
-				  char *buf)
+static ssize_t traffic_class_show(struct kobject *kobj, struct attribute *attr,
+				  struct netdev_queue *queue, char *buf)
 {
 	struct net_device *dev = queue->dev;
-	int num_tc, tc;
-	int index;
+	int num_tc, tc, index, ret;
 
 	if (!netif_is_multiqueue(dev))
 		return -ENOENT;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(kobj, attr, queue->dev);
+	if (ret)
+		return ret;
 
 	index = get_netdev_queue_index(queue);
 
@@ -1341,24 +1448,25 @@ static ssize_t traffic_class_show(struct netdev_queue *queue,
 }
 
 #ifdef CONFIG_XPS
-static ssize_t tx_maxrate_show(struct netdev_queue *queue,
-			       char *buf)
+static ssize_t tx_maxrate_show(struct kobject *kobj, struct attribute *attr,
+			       struct netdev_queue *queue, char *buf)
 {
 	return sysfs_emit(buf, "%lu\n", queue->tx_maxrate);
 }
 
-static ssize_t tx_maxrate_store(struct netdev_queue *queue,
-				const char *buf, size_t len)
+static ssize_t tx_maxrate_store(struct kobject *kobj, struct attribute *attr,
+				struct netdev_queue *queue, const char *buf,
+				size_t len)
 {
-	struct net_device *dev = queue->dev;
 	int err, index = get_netdev_queue_index(queue);
+	struct net_device *dev = queue->dev;
 	u32 rate = 0;
 
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 
 	/* The check is also done later; this helps returning early without
-	 * hitting the trylock/restart below.
+	 * hitting the locking section below.
 	 */
 	if (!dev->netdev_ops->ndo_set_tx_maxrate)
 		return -EOPNOTSUPP;
@@ -1367,18 +1475,21 @@ static ssize_t tx_maxrate_store(struct netdev_queue *queue,
 	if (err < 0)
 		return err;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	err = sysfs_rtnl_lock(kobj, attr, dev);
+	if (err)
+		return err;
 
 	err = -EOPNOTSUPP;
 	if (dev->netdev_ops->ndo_set_tx_maxrate)
 		err = dev->netdev_ops->ndo_set_tx_maxrate(dev, index, rate);
 
-	rtnl_unlock();
 	if (!err) {
 		queue->tx_maxrate = rate;
+		rtnl_unlock();
 		return len;
 	}
 
+	rtnl_unlock();
 	return err;
 }
 
@@ -1422,16 +1533,17 @@ static ssize_t bql_set(const char *buf, const size_t count,
 	return count;
 }
 
-static ssize_t bql_show_hold_time(struct netdev_queue *queue,
-				  char *buf)
+static ssize_t bql_show_hold_time(struct kobject *kobj, struct attribute *attr,
+				  struct netdev_queue *queue, char *buf)
 {
 	struct dql *dql = &queue->dql;
 
 	return sysfs_emit(buf, "%u\n", jiffies_to_msecs(dql->slack_hold_time));
 }
 
-static ssize_t bql_set_hold_time(struct netdev_queue *queue,
-				 const char *buf, size_t len)
+static ssize_t bql_set_hold_time(struct kobject *kobj, struct attribute *attr,
+				 struct netdev_queue *queue, const char *buf,
+				 size_t len)
 {
 	struct dql *dql = &queue->dql;
 	unsigned int value;
@@ -1450,15 +1562,17 @@ static struct netdev_queue_attribute bql_hold_time_attribute __ro_after_init
 	= __ATTR(hold_time, 0644,
 		 bql_show_hold_time, bql_set_hold_time);
 
-static ssize_t bql_show_stall_thrs(struct netdev_queue *queue, char *buf)
+static ssize_t bql_show_stall_thrs(struct kobject *kobj, struct attribute *attr,
+				   struct netdev_queue *queue, char *buf)
 {
 	struct dql *dql = &queue->dql;
 
 	return sysfs_emit(buf, "%u\n", jiffies_to_msecs(dql->stall_thrs));
 }
 
-static ssize_t bql_set_stall_thrs(struct netdev_queue *queue,
-				  const char *buf, size_t len)
+static ssize_t bql_set_stall_thrs(struct kobject *kobj, struct attribute *attr,
+				  struct netdev_queue *queue, const char *buf,
+				  size_t len)
 {
 	struct dql *dql = &queue->dql;
 	unsigned int value;
@@ -1484,13 +1598,15 @@ static ssize_t bql_set_stall_thrs(struct netdev_queue *queue,
 static struct netdev_queue_attribute bql_stall_thrs_attribute __ro_after_init =
 	__ATTR(stall_thrs, 0644, bql_show_stall_thrs, bql_set_stall_thrs);
 
-static ssize_t bql_show_stall_max(struct netdev_queue *queue, char *buf)
+static ssize_t bql_show_stall_max(struct kobject *kobj, struct attribute *attr,
+				  struct netdev_queue *queue, char *buf)
 {
 	return sysfs_emit(buf, "%u\n", READ_ONCE(queue->dql.stall_max));
 }
 
-static ssize_t bql_set_stall_max(struct netdev_queue *queue,
-				 const char *buf, size_t len)
+static ssize_t bql_set_stall_max(struct kobject *kobj, struct attribute *attr,
+				 struct netdev_queue *queue, const char *buf,
+				 size_t len)
 {
 	WRITE_ONCE(queue->dql.stall_max, 0);
 	return len;
@@ -1499,7 +1615,8 @@ static ssize_t bql_set_stall_max(struct netdev_queue *queue,
 static struct netdev_queue_attribute bql_stall_max_attribute __ro_after_init =
 	__ATTR(stall_max, 0644, bql_show_stall_max, bql_set_stall_max);
 
-static ssize_t bql_show_stall_cnt(struct netdev_queue *queue, char *buf)
+static ssize_t bql_show_stall_cnt(struct kobject *kobj, struct attribute *attr,
+				  struct netdev_queue *queue, char *buf)
 {
 	struct dql *dql = &queue->dql;
 
@@ -1509,8 +1626,8 @@ static ssize_t bql_show_stall_cnt(struct netdev_queue *queue, char *buf)
 static struct netdev_queue_attribute bql_stall_cnt_attribute __ro_after_init =
 	__ATTR(stall_cnt, 0444, bql_show_stall_cnt, NULL);
 
-static ssize_t bql_show_inflight(struct netdev_queue *queue,
-				 char *buf)
+static ssize_t bql_show_inflight(struct kobject *kobj, struct attribute *attr,
+				 struct netdev_queue *queue, char *buf)
 {
 	struct dql *dql = &queue->dql;
 
@@ -1521,13 +1638,16 @@ static struct netdev_queue_attribute bql_inflight_attribute __ro_after_init =
 	__ATTR(inflight, 0444, bql_show_inflight, NULL);
 
 #define BQL_ATTR(NAME, FIELD)						\
-	static ssize_t bql_show_ ## NAME(struct netdev_queue *queue,	\
-					 char *buf)			\
+	static ssize_t bql_show_ ## NAME(struct kobject *kobj,		\
+					 struct attribute *attr,	\
+					 struct netdev_queue *queue, char *buf) \
 	{								\
 		return bql_show(buf, queue->dql.FIELD);			\
 	}								\
 									\
-	static ssize_t bql_set_ ## NAME(struct netdev_queue *queue,	\
+	static ssize_t bql_set_ ## NAME(struct kobject *kobj,		\
+					struct attribute *attr,		\
+					struct netdev_queue *queue,	\
 					const char *buf, size_t len)	\
 	{								\
 		return bql_set(buf, len, &queue->dql.FIELD);		\
@@ -1613,19 +1733,21 @@ out_no_maps:
 	return len < PAGE_SIZE ? len : -EINVAL;
 }
 
-static ssize_t xps_cpus_show(struct netdev_queue *queue, char *buf)
+static ssize_t xps_cpus_show(struct kobject *kobj, struct attribute *attr,
+			     struct netdev_queue *queue, char *buf)
 {
 	struct net_device *dev = queue->dev;
 	unsigned int index;
-	int len, tc;
+	int len, tc, ret;
 
 	if (!netif_is_multiqueue(dev))
 		return -ENOENT;
 
 	index = get_netdev_queue_index(queue);
 
-	if (!rtnl_trylock())
-		return restart_syscall();
+	ret = sysfs_rtnl_lock(kobj, attr, queue->dev);
+	if (ret)
+		return ret;
 
 	/* If queue belongs to subordinate dev use its map */
 	dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
@@ -1636,18 +1758,21 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue, char *buf)
 		return -EINVAL;
 	}
 
-	/* Make sure the subordinate device can't be freed */
-	get_device(&dev->dev);
+	/* Increase the net device refcnt to make sure it won't be freed while
+	 * xps_queue_show is running.
+	 */
+	dev_hold(dev);
 	rtnl_unlock();
 
 	len = xps_queue_show(dev, index, tc, buf, XPS_CPUS);
 
-	put_device(&dev->dev);
+	dev_put(dev);
 	return len;
 }
 
-static ssize_t xps_cpus_store(struct netdev_queue *queue,
-			      const char *buf, size_t len)
+static ssize_t xps_cpus_store(struct kobject *kobj, struct attribute *attr,
+			      struct netdev_queue *queue, const char *buf,
+			      size_t len)
 {
 	struct net_device *dev = queue->dev;
 	unsigned int index;
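Note the refcounting switch in this hunk: get_device() pinned only the embedded struct device, while dev_hold() takes a reference on the net_device itself, which is what must stay alive once RTNL is dropped for the potentially slow xps_queue_show() copy-out. The pattern in isolation (a sketch, not a new API):

	dev_hold(dev);		/* pin the netdev across the unlocked section */
	rtnl_unlock();

	len = xps_queue_show(dev, index, tc, buf, XPS_CPUS);

	dev_put(dev);		/* drop the pin; the device may now go away */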
@@ -1671,9 +1796,10 @@ static ssize_t xps_cpus_store(struct netdev_queue *queue,
 		return err;
 	}
 
-	if (!rtnl_trylock()) {
+	err = sysfs_rtnl_lock(kobj, attr, dev);
+	if (err) {
 		free_cpumask_var(mask);
-		return restart_syscall();
+		return err;
 	}
 
 	err = netif_set_xps_queue(dev, mask, index);
@ -1687,26 +1813,34 @@ static ssize_t xps_cpus_store(struct netdev_queue *queue,
|
|||
static struct netdev_queue_attribute xps_cpus_attribute __ro_after_init
|
||||
= __ATTR_RW(xps_cpus);
|
||||
|
||||
static ssize_t xps_rxqs_show(struct netdev_queue *queue, char *buf)
|
||||
static ssize_t xps_rxqs_show(struct kobject *kobj, struct attribute *attr,
|
||||
struct netdev_queue *queue, char *buf)
|
||||
{
|
||||
struct net_device *dev = queue->dev;
|
||||
unsigned int index;
|
||||
int tc;
|
||||
int tc, ret;
|
||||
|
||||
index = get_netdev_queue_index(queue);
|
||||
|
||||
if (!rtnl_trylock())
|
||||
return restart_syscall();
|
||||
ret = sysfs_rtnl_lock(kobj, attr, dev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
tc = netdev_txq_to_tc(dev, index);
|
||||
rtnl_unlock();
|
||||
if (tc < 0)
|
||||
return -EINVAL;
|
||||
|
||||
return xps_queue_show(dev, index, tc, buf, XPS_RXQS);
|
||||
/* Increase the net device refcnt to make sure it won't be freed while
|
||||
* xps_queue_show is running.
|
||||
*/
|
||||
dev_hold(dev);
|
||||
rtnl_unlock();
|
||||
|
||||
ret = tc >= 0 ? xps_queue_show(dev, index, tc, buf, XPS_RXQS) : -EINVAL;
|
||||
dev_put(dev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static ssize_t xps_rxqs_store(struct netdev_queue *queue, const char *buf,
|
||||
static ssize_t xps_rxqs_store(struct kobject *kobj, struct attribute *attr,
|
||||
struct netdev_queue *queue, const char *buf,
|
||||
size_t len)
|
||||
{
|
||||
struct net_device *dev = queue->dev;
|
||||
|
|
@@ -1730,9 +1864,10 @@ static ssize_t xps_rxqs_store(struct netdev_queue *queue, const char *buf,
 		return err;
 	}
 
-	if (!rtnl_trylock()) {
+	err = sysfs_rtnl_lock(kobj, attr, dev);
+	if (err) {
 		bitmap_free(mask);
-		return restart_syscall();
+		return err;
 	}
 
 	cpus_read_lock();
@@ -1792,7 +1927,6 @@ static void netdev_queue_get_ownership(const struct kobject *kobj,
 static const struct kobj_type netdev_queue_ktype = {
 	.sysfs_ops = &netdev_queue_sysfs_ops,
 	.release = netdev_queue_release,
-	.default_groups = netdev_queue_default_groups,
 	.namespace = netdev_queue_namespace,
 	.get_ownership = netdev_queue_get_ownership,
 };
@@ -1811,6 +1945,22 @@ static int netdev_queue_add_kobject(struct net_device *dev, int index)
 	struct kobject *kobj = &queue->kobj;
 	int error = 0;
 
+	/* Tx queues are cleared in netdev_queue_release to allow later
+	 * re-registration. This is triggered when their kobj refcount is
+	 * dropped.
+	 *
+	 * If a queue is removed while both a read (or write) operation and
+	 * the re-addition of the same queue are pending (waiting on rtnl_lock),
+	 * it might happen that the re-addition executes before the read,
+	 * making the initial removal never happen (the queue's kobj refcount
+	 * won't drop enough because of the pending read). In such a rare case,
+	 * return to allow the removal operation to complete.
+	 */
+	if (unlikely(kobj->state_initialized)) {
+		netdev_warn_once(dev, "Cannot re-add tx queues before their removal completed");
+		return -EAGAIN;
+	}
+
 	/* Kobject_put later will trigger netdev_queue_release call
 	 * which decreases dev refcount: Take that reference here
 	 */
@@ -1822,15 +1972,22 @@ static int netdev_queue_add_kobject(struct net_device *dev, int index)
 	if (error)
 		goto err;
 
+	queue->groups = netdev_queue_default_groups;
+	error = sysfs_create_groups(kobj, queue->groups);
+	if (error)
+		goto err;
+
 	if (netdev_uses_bql(dev)) {
 		error = sysfs_create_group(kobj, &dql_group);
 		if (error)
-			goto err;
+			goto err_default_groups;
 	}
 
 	kobject_uevent(kobj, KOBJ_ADD);
 	return 0;
 
+err_default_groups:
+	sysfs_remove_groups(kobj, queue->groups);
 err:
 	kobject_put(kobj);
 	return error;
@@ -1885,6 +2042,7 @@ netdev_queue_update_kobjects(struct net_device *dev, int old_num, int new_num)
 		if (netdev_uses_bql(dev))
 			sysfs_remove_group(&queue->kobj, &dql_group);
 
+		sysfs_remove_groups(&queue->kobj, queue->groups);
 		kobject_put(&queue->kobj);
 	}
 
@@ -10,6 +10,7 @@
 #include <net/sock.h>
 #include <net/xdp.h>
 #include <net/xdp_sock.h>
+#include <net/page_pool/memory_provider.h>
 
 #include "dev.h"
 #include "devmem.h"
@@ -368,7 +369,7 @@ static int
 netdev_nl_queue_fill_one(struct sk_buff *rsp, struct net_device *netdev,
 			 u32 q_idx, u32 q_type, const struct genl_info *info)
 {
-	struct net_devmem_dmabuf_binding *binding;
+	struct pp_memory_provider_params *params;
 	struct netdev_rx_queue *rxq;
 	struct netdev_queue *txq;
 	void *hdr;
@@ -385,15 +386,15 @@ netdev_nl_queue_fill_one(struct sk_buff *rsp, struct net_device *netdev,
 	switch (q_type) {
 	case NETDEV_QUEUE_TYPE_RX:
 		rxq = __netif_get_rx_queue(netdev, q_idx);
 		if (rxq->napi && nla_put_u32(rsp, NETDEV_A_QUEUE_NAPI_ID,
 					     rxq->napi->napi_id))
 			goto nla_put_failure;
 
-		binding = rxq->mp_params.mp_priv;
-		if (binding &&
-		    nla_put_u32(rsp, NETDEV_A_QUEUE_DMABUF, binding->id))
+		params = &rxq->mp_params;
+		if (params->mp_ops &&
+		    params->mp_ops->nl_fill(params->mp_priv, rsp, rxq))
 			goto nla_put_failure;
 
 		break;
 	case NETDEV_QUEUE_TYPE_TX:
 		txq = netdev_get_tx_queue(netdev, q_idx);
@@ -3,6 +3,7 @@
 #include <linux/netdevice.h>
 #include <net/netdev_queues.h>
 #include <net/netdev_rx_queue.h>
+#include <net/page_pool/memory_provider.h>
 
 #include "page_pool_priv.h"
 
@@ -80,3 +81,71 @@ err_free_new_mem:
 	return err;
 }
 EXPORT_SYMBOL_NS_GPL(netdev_rx_queue_restart, "NETDEV_INTERNAL");
+
+static int __net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
+			     struct pp_memory_provider_params *p)
+{
+	struct netdev_rx_queue *rxq;
+	int ret;
+
+	if (ifq_idx >= dev->real_num_rx_queues)
+		return -EINVAL;
+	ifq_idx = array_index_nospec(ifq_idx, dev->real_num_rx_queues);
+
+	rxq = __netif_get_rx_queue(dev, ifq_idx);
+	if (rxq->mp_params.mp_ops)
+		return -EEXIST;
+
+	rxq->mp_params = *p;
+	ret = netdev_rx_queue_restart(dev, ifq_idx);
+	if (ret) {
+		rxq->mp_params.mp_ops = NULL;
+		rxq->mp_params.mp_priv = NULL;
+	}
+	return ret;
+}
+
+int net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
+		    struct pp_memory_provider_params *p)
+{
+	int ret;
+
+	rtnl_lock();
+	ret = __net_mp_open_rxq(dev, ifq_idx, p);
+	rtnl_unlock();
+	return ret;
+}
+
+static void __net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
+			       struct pp_memory_provider_params *old_p)
+{
+	struct netdev_rx_queue *rxq;
+
+	if (WARN_ON_ONCE(ifq_idx >= dev->real_num_rx_queues))
+		return;
+
+	rxq = __netif_get_rx_queue(dev, ifq_idx);
+
+	/* Callers holding a netdev ref may get here after we already
+	 * went thru shutdown via dev_memory_provider_uninstall().
+	 */
+	if (dev->reg_state > NETREG_REGISTERED &&
+	    !rxq->mp_params.mp_ops)
+		return;
+
+	if (WARN_ON_ONCE(rxq->mp_params.mp_ops != old_p->mp_ops ||
+			 rxq->mp_params.mp_priv != old_p->mp_priv))
+		return;
+
+	rxq->mp_params.mp_ops = NULL;
+	rxq->mp_params.mp_priv = NULL;
+	WARN_ON(netdev_rx_queue_restart(dev, ifq_idx));
+}
+
+void net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
+		      struct pp_memory_provider_params *old_p)
+{
+	rtnl_lock();
+	__net_mp_close_rxq(dev, ifq_idx, old_p);
+	rtnl_unlock();
+}
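net_mp_open_rxq()/net_mp_close_rxq() are the entry points a memory provider such as io_uring zcrx uses to install itself on a single rx queue: the params are copied into the queue and netdev_rx_queue_restart() rebuilds it so the new page pool picks them up. A hypothetical caller, with the provider ops and private data as placeholders (note that page_pool_init() in a later hunk insists mp_ops live in rodata, which a static const satisfies):

	static const struct memory_provider_ops my_mp_ops;	/* defined elsewhere */

	static int my_provider_attach(struct net_device *dev, unsigned int qid,
				      void *priv)
	{
		struct pp_memory_provider_params p = {
			.mp_ops	 = &my_mp_ops,
			.mp_priv = priv,
		};

		/* takes rtnl, returns -EEXIST if the queue already has a
		 * provider, and restarts the queue on success */
		return net_mp_open_rxq(dev, qid, &p);
	}

	static void my_provider_detach(struct net_device *dev, unsigned int qid,
				       void *priv)
	{
		struct pp_memory_provider_params old = {
			.mp_ops	 = &my_mp_ops,
			.mp_priv = priv,
		};

		net_mp_close_rxq(dev, qid, &old);
	}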
@@ -13,6 +13,7 @@
 
 #include <net/netdev_rx_queue.h>
 #include <net/page_pool/helpers.h>
+#include <net/page_pool/memory_provider.h>
 #include <net/xdp.h>
 
 #include <linux/dma-direction.h>
@@ -285,13 +286,19 @@ static int page_pool_init(struct page_pool *pool,
 		rxq = __netif_get_rx_queue(pool->slow.netdev,
 					   pool->slow.queue_idx);
 		pool->mp_priv = rxq->mp_params.mp_priv;
+		pool->mp_ops = rxq->mp_params.mp_ops;
 	}
 
-	if (pool->mp_priv) {
+	if (pool->mp_ops) {
 		if (!pool->dma_map || !pool->dma_sync)
 			return -EOPNOTSUPP;
 
-		err = mp_dmabuf_devmem_init(pool);
+		if (WARN_ON(!is_kernel_rodata((unsigned long)pool->mp_ops))) {
+			err = -EFAULT;
+			goto free_ptr_ring;
+		}
+
+		err = pool->mp_ops->init(pool);
 		if (err) {
 			pr_warn("%s() mem-provider init failed %d\n", __func__,
 				err);
@@ -587,8 +594,8 @@ netmem_ref page_pool_alloc_netmems(struct page_pool *pool, gfp_t gfp)
 		return netmem;
 
 	/* Slow-path: cache empty, do real allocation */
-	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_priv)
-		netmem = mp_dmabuf_devmem_alloc_netmems(pool, gfp);
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		netmem = pool->mp_ops->alloc_netmems(pool, gfp);
 	else
 		netmem = __page_pool_alloc_pages_slow(pool, gfp);
 	return netmem;
@@ -679,8 +686,8 @@ void page_pool_return_page(struct page_pool *pool, netmem_ref netmem)
 	bool put;
 
 	put = true;
-	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_priv)
-		put = mp_dmabuf_devmem_release_page(pool, netmem);
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		put = pool->mp_ops->release_netmem(pool, netmem);
 	else
 		__page_pool_release_page_dma(pool, netmem);
 
@@ -1048,8 +1055,8 @@ static void __page_pool_destroy(struct page_pool *pool)
 	page_pool_unlist(pool);
 	page_pool_uninit(pool);
 
-	if (pool->mp_priv) {
-		mp_dmabuf_devmem_destroy(pool);
+	if (pool->mp_ops) {
+		pool->mp_ops->destroy(pool);
 		static_branch_dec(&page_pool_mem_providers);
 	}
 
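Each mp_ops call in the three hunks above replaces what used to be a hard-coded mp_dmabuf_devmem_*() call, so any provider supplying these hooks now works. A skeleton provider covering the hooks exercised in this file (hook names are taken from the calls above; the full ops structure in this series also carries an nl_fill() callback, and possibly more):

	static int my_mp_init(struct page_pool *pool)
	{
		/* validate pool->mp_priv and set up per-pool state */
		return 0;
	}

	static void my_mp_destroy(struct page_pool *pool)
	{
		/* undo my_mp_init() */
	}

	static netmem_ref my_mp_alloc_netmems(struct page_pool *pool, gfp_t gfp)
	{
		/* hand out a provider-owned buffer as a netmem_ref;
		 * returning 0 signals allocation failure */
		return 0;
	}

	static bool my_mp_release_netmem(struct page_pool *pool, netmem_ref netmem)
	{
		/* reclaim the buffer; the return value tells the pool
		 * whether it still needs to put the netmem itself */
		return false;
	}

	static const struct memory_provider_ops my_mp_ops = {
		.init		= my_mp_init,
		.destroy	= my_mp_destroy,
		.alloc_netmems	= my_mp_alloc_netmems,
		.release_netmem	= my_mp_release_netmem,
	};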
@@ -1190,3 +1197,31 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
 	}
 }
 EXPORT_SYMBOL(page_pool_update_nid);
+
+bool net_mp_niov_set_dma_addr(struct net_iov *niov, dma_addr_t addr)
+{
+	return page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov), addr);
+}
+
+/* Associate a niov with a page pool. Should follow with a matching
+ * net_mp_niov_clear_page_pool()
+ */
+void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov)
+{
+	netmem_ref netmem = net_iov_to_netmem(niov);
+
+	page_pool_set_pp_info(pool, netmem);
+
+	pool->pages_state_hold_cnt++;
+	trace_page_pool_state_hold(pool, netmem, pool->pages_state_hold_cnt);
+}
+
+/* Disassociate a niov from a page pool. Should only be used in the
+ * ->release_netmem() path.
+ */
+void net_mp_niov_clear_page_pool(struct net_iov *niov)
+{
+	netmem_ref netmem = net_iov_to_netmem(niov);
+
+	page_pool_clear_pp_info(netmem);
+}
@@ -8,9 +8,9 @@
 #include <net/netdev_rx_queue.h>
 #include <net/page_pool/helpers.h>
 #include <net/page_pool/types.h>
+#include <net/page_pool/memory_provider.h>
 #include <net/sock.h>
 
-#include "devmem.h"
 #include "page_pool_priv.h"
 #include "netdev-genl-gen.h"
@@ -216,7 +216,6 @@ static int
 page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
 		  const struct genl_info *info)
 {
-	struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
 	size_t inflight, refsz;
 	unsigned int napi_id;
 	void *hdr;
@@ -249,7 +248,7 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
 			  pool->user.detach_time))
 		goto err_cancel;
 
-	if (binding && nla_put_u32(rsp, NETDEV_A_PAGE_POOL_DMABUF, binding->id))
+	if (pool->mp_ops && pool->mp_ops->nl_fill(pool->mp_priv, rsp, NULL))
 		goto err_cancel;
 
 	genlmsg_end(rsp, hdr);
@@ -356,7 +355,7 @@ void page_pool_unlist(struct page_pool *pool)
 int page_pool_check_memory_provider(struct net_device *dev,
 				    struct netdev_rx_queue *rxq)
 {
-	struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv;
+	void *binding = rxq->mp_params.mp_priv;
 	struct page_pool *pool;
 	struct hlist_node *n;
 
@@ -80,6 +80,11 @@ void rtnl_lock(void)
 }
 EXPORT_SYMBOL(rtnl_lock);
 
+int rtnl_lock_interruptible(void)
+{
+	return mutex_lock_interruptible(&rtnl_mutex);
+}
+
 int rtnl_lock_killable(void)
 {
 	return mutex_lock_killable(&rtnl_mutex);
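rtnl_lock_interruptible() simply forwards to mutex_lock_interruptible(), which returns -EINTR if a signal arrives while waiting, so RTNL users can abort instead of blocking uninterruptibly — this is the primitive sysfs_rtnl_lock() builds on. Hypothetical usage:

	static int my_rtnl_op(struct net_device *dev)
	{
		int ret = rtnl_lock_interruptible();

		if (ret)
			return ret;	/* -EINTR: interrupted by a signal */

		/* ... RTNL-protected work on dev ... */

		rtnl_unlock();
		return 0;
	}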
@@ -214,6 +214,24 @@ const char link_mode_names[][ETH_GSTRING_LEN] = {
 	__DEFINE_LINK_MODE_NAME(10, T1S, Half),
 	__DEFINE_LINK_MODE_NAME(10, T1S_P2MP, Half),
 	__DEFINE_LINK_MODE_NAME(10, T1BRR, Full),
+	__DEFINE_LINK_MODE_NAME(200000, CR, Full),
+	__DEFINE_LINK_MODE_NAME(200000, KR, Full),
+	__DEFINE_LINK_MODE_NAME(200000, DR, Full),
+	__DEFINE_LINK_MODE_NAME(200000, DR_2, Full),
+	__DEFINE_LINK_MODE_NAME(200000, SR, Full),
+	__DEFINE_LINK_MODE_NAME(200000, VR, Full),
+	__DEFINE_LINK_MODE_NAME(400000, CR2, Full),
+	__DEFINE_LINK_MODE_NAME(400000, KR2, Full),
+	__DEFINE_LINK_MODE_NAME(400000, DR2, Full),
+	__DEFINE_LINK_MODE_NAME(400000, DR2_2, Full),
+	__DEFINE_LINK_MODE_NAME(400000, SR2, Full),
+	__DEFINE_LINK_MODE_NAME(400000, VR2, Full),
+	__DEFINE_LINK_MODE_NAME(800000, CR4, Full),
+	__DEFINE_LINK_MODE_NAME(800000, KR4, Full),
+	__DEFINE_LINK_MODE_NAME(800000, DR4, Full),
+	__DEFINE_LINK_MODE_NAME(800000, DR4_2, Full),
+	__DEFINE_LINK_MODE_NAME(800000, SR4, Full),
+	__DEFINE_LINK_MODE_NAME(800000, VR4, Full),
 };
 static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
 
@@ -222,8 +240,11 @@ static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
 #define __LINK_MODE_LANES_CR4		4
 #define __LINK_MODE_LANES_CR8		8
 #define __LINK_MODE_LANES_DR		1
+#define __LINK_MODE_LANES_DR_2		1
 #define __LINK_MODE_LANES_DR2		2
+#define __LINK_MODE_LANES_DR2_2		2
 #define __LINK_MODE_LANES_DR4		4
+#define __LINK_MODE_LANES_DR4_2		4
 #define __LINK_MODE_LANES_DR8		8
 #define __LINK_MODE_LANES_KR		1
 #define __LINK_MODE_LANES_KR2		2
@@ -252,6 +273,9 @@ static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
 #define __LINK_MODE_LANES_T1L		1
 #define __LINK_MODE_LANES_T1S		1
 #define __LINK_MODE_LANES_T1S_P2MP	1
+#define __LINK_MODE_LANES_VR		1
+#define __LINK_MODE_LANES_VR2		2
+#define __LINK_MODE_LANES_VR4		4
 #define __LINK_MODE_LANES_VR8		8
 #define __LINK_MODE_LANES_DR8_2		8
 #define __LINK_MODE_LANES_T1BRR		1
@@ -379,6 +403,24 @@ const struct link_mode_info link_mode_params[] = {
 	__DEFINE_LINK_MODE_PARAMS(10, T1S, Half),
 	__DEFINE_LINK_MODE_PARAMS(10, T1S_P2MP, Half),
 	__DEFINE_LINK_MODE_PARAMS(10, T1BRR, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, CR, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, KR, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, DR, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, DR_2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, SR, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, VR, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, CR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, KR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, DR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, DR2_2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, SR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, VR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, CR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, KR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, DR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, DR4_2, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, SR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(800000, VR4, Full),
 };
 static_assert(ARRAY_SIZE(link_mode_params) == __ETHTOOL_LINK_MODE_MASK_NBITS);
 
@@ -141,7 +141,6 @@ static int ipgre_err(struct sk_buff *skb, u32 info,
 	const struct iphdr *iph;
 	const int type = icmp_hdr(skb)->type;
 	const int code = icmp_hdr(skb)->code;
-	unsigned int data_len = 0;
 	struct ip_tunnel *t;
 
 	if (tpi->proto == htons(ETH_P_TEB))
@@ -182,7 +181,6 @@ static int ipgre_err(struct sk_buff *skb, u32 info,
 	case ICMP_TIME_EXCEEDED:
 		if (code != ICMP_EXC_TTL)
 			return 0;
-		data_len = icmp_hdr(skb)->un.reserved[1] * 4; /* RFC 4884 4.1 */
 		break;
 
 	case ICMP_REDIRECT:
@@ -190,10 +188,16 @@ static int ipgre_err(struct sk_buff *skb, u32 info,
 	}
 
 #if IS_ENABLED(CONFIG_IPV6)
-	if (tpi->proto == htons(ETH_P_IPV6) &&
-	    !ip6_err_gen_icmpv6_unreach(skb, iph->ihl * 4 + tpi->hdr_len,
-					type, data_len))
-		return 0;
+	if (tpi->proto == htons(ETH_P_IPV6)) {
+		unsigned int data_len = 0;
+
+		if (type == ICMP_TIME_EXCEEDED)
+			data_len = icmp_hdr(skb)->un.reserved[1] * 4; /* RFC 4884 4.1 */
+
+		if (!ip6_err_gen_icmpv6_unreach(skb, iph->ihl * 4 + tpi->hdr_len,
+						type, data_len))
+			return 0;
+	}
 #endif
 
 	if (t->parms.iph.daddr == 0 ||
@@ -2493,6 +2493,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 			}
 
 			niov = skb_frag_net_iov(frag);
+			if (!net_is_devmem_iov(niov)) {
+				err = -ENODEV;
+				goto out;
+			}
+
 			end = start + skb_frag_size(frag);
 			copy = end - offset;
 
@@ -2511,7 +2516,7 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 
 			/* Will perform the exchange later */
 			dmabuf_cmsg.frag_token = tcp_xa_pool.tokens[tcp_xa_pool.idx];
-			dmabuf_cmsg.dmabuf_id = net_iov_binding_id(niov);
+			dmabuf_cmsg.dmabuf_id = net_devmem_iov_binding_id(niov);
 
 			offset += copy;
 			remaining_len -= copy;
@@ -86,6 +86,11 @@ enum {
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
 };
 
+enum {
+	__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
+	NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
+};
+
 enum {
 	NETDEV_A_PAGE_POOL_ID = 1,
 	NETDEV_A_PAGE_POOL_IFINDEX,
@@ -94,6 +99,7 @@ enum {
 	NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
 	NETDEV_A_PAGE_POOL_DETACH_TIME,
 	NETDEV_A_PAGE_POOL_DMABUF,
+	NETDEV_A_PAGE_POOL_IO_URING,
 
 	__NETDEV_A_PAGE_POOL_MAX,
 	NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
@@ -136,6 +142,7 @@ enum {
 	NETDEV_A_QUEUE_TYPE,
 	NETDEV_A_QUEUE_NAPI_ID,
 	NETDEV_A_QUEUE_DMABUF,
+	NETDEV_A_QUEUE_IO_URING,
 
 	__NETDEV_A_QUEUE_MAX,
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
@@ -17,9 +17,11 @@ get_hdr_inc=-D$(1) -include $(UAPI_PATH)/linux/$(2)
 CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h)
 CFLAGS_dpll:=$(call get_hdr_inc,_LINUX_DPLL_H,dpll.h)
 CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_ETHTOOL_H,ethtool.h) \
-	$(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h)
+	$(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h) \
+	$(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_GENERATED_H,ethtool_netlink_generated.h)
 CFLAGS_handshake:=$(call get_hdr_inc,_LINUX_HANDSHAKE_H,handshake.h)
 CFLAGS_mptcp_pm:=$(call get_hdr_inc,_LINUX_MPTCP_PM_H,mptcp_pm.h)
 CFLAGS_net_shaper:=$(call get_hdr_inc,_LINUX_NET_SHAPER_H,net_shaper.h)
 CFLAGS_netdev:=$(call get_hdr_inc,_LINUX_NETDEV_H,netdev.h)
 CFLAGS_nlctrl:=$(call get_hdr_inc,__LINUX_GENERIC_NETLINK_H,genetlink.h)
 CFLAGS_nfsd:=$(call get_hdr_inc,_LINUX_NFSD_NETLINK_H,nfsd_netlink.h)
@@ -100,7 +100,7 @@ class Type(SpecAttr):
         if isinstance(value, int):
             return value
         if value in self.family.consts:
-            raise Exception("Resolving family constants not implemented, yet")
+            return self.family.consts[value]["value"]
         return limit_to_number(value)
 
     def get_limit_str(self, limit, default=None, suffix=''):
@@ -110,6 +110,9 @@ class Type(SpecAttr):
         if isinstance(value, int):
             return str(value) + suffix
         if value in self.family.consts:
+            const = self.family.consts[value]
+            if const.get('header'):
+                return c_upper(value)
             return c_upper(f"{self.family['name']}-{value}")
         return c_upper(value)
 
@@ -2549,6 +2552,9 @@ def render_uapi(family, cw):
 
     defines = []
     for const in family['definitions']:
+        if const.get('header'):
+            continue
+
         if const['type'] != 'const':
             cw.writes_defines(defines)
             defines = []
@@ -7,6 +7,7 @@ TEST_INCLUDES := $(wildcard lib/py/*.py) \
 
 TEST_PROGS := \
 	netcons_basic.sh \
+	netcons_fragmented_msg.sh \
 	netcons_overflow.sh \
 	ping.py \
 	queues.py \
@@ -1 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+iou-zcrx
 ncdevmem
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0+ OR MIT
 
+TEST_GEN_FILES = iou-zcrx
+
 TEST_PROGS = \
 	csum.py \
 	devlink_port_split.py \
@@ -10,6 +12,7 @@ TEST_PROGS = \
 	ethtool_rmon.sh \
 	hw_stats_l3.sh \
 	hw_stats_l3_gre.sh \
+	iou-zcrx.py \
 	loopback.sh \
 	nic_link_layer.py \
 	nic_performance.py \
@@ -38,3 +41,5 @@ include ../../../lib.mk
 # YNL build
 YNL_GENS := ethtool netdev
 include ../../../net/ynl.mk
+
+$(OUTPUT)/iou-zcrx: LDLIBS += -luring
Some files were not shown because too many files have changed in this diff.