mirror-linux

Commit Graph

Author	SHA1	Message	Date
Jakub Kicinski	377ea33128	Merge tag 'mlx5-next-lag' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-09-28 * tag 'mlx5-next-lag' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: IFC add balance ID and LAG per MP group bits net/mlx5: Add IFC bit for TIR/SQ order capability ==================== Link: https://patch.msgid.link/1759093989-841873-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-29 18:49:59 -07:00
Mark Bloch	137d1a6355	net/mlx5: IFC add balance ID and LAG per MP group bits Add interface definitions for load balance ID and LAG per multiplane group functionality. This patch introduces the hardware capability bits needed to support balance ID in multiplane LAG configurations. The new fields include: - load_balance_id: 4-bit field for balance identifier. - lag_per_mp_group: capability bit for LAG per multiplane group support. These interface additions are prerequisites for implementing balance ID support in the MLX5 driver. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758521191-814350-3-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-28 03:36:36 -04:00
Tariq Toukan	1ddf1636e0	net/mlx5: Add IFC bit for TIR/SQ order capability Before this cap, firmware requested a certain creation order between TIR objects and SQs of the same transport domain to properly support the self loopback prevention feature. If order is not preserved, explicit modify_tir operations are necessary after the opening of the SQs. When set, this cap bit indicates that this firmware requirement / limitation no longer holds. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758521191-814350-2-git-send-email-tariqt@nvidia.com Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-28 03:36:36 -04:00
Jakub Kicinski	203e3beb73	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.17-rc8). Conflicts: drivers/net/can/spi/hi311x.c `6b69680847` ("can: hi311x: fix null pointer dereference when resuming from sleep before interface was enabled") `27ce71e1ce` ("net: WQ_PERCPU added to alloc_workqueue users") https://lore.kernel.org/72ce7599-1b5b-464a-a5de-228ff9724701@kernel.org net/smc/smc_loopback.c drivers/dibs/dibs_loopback.c `a35c04de25` ("net/smc: fix warning in smc_rx_splice() when calling get_page()") `cc21191b58` ("dibs: Move data path to dibs layer") https://lore.kernel.org/74368a5c-48ac-4f8e-a198-40ec1ed3cf5f@kernel.org Adjacent changes: drivers/net/dsa/lantiq/lantiq_gswip.c `c0054b25e2` ("net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()") `7a1eaef0a7` ("net: dsa: lantiq_gswip: support model-specific mac_select_pcs()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-25 11:00:59 -07:00
Moshe Shemesh	6043819e70	net/mlx5: fs, fix UAF in flow counter release Fix a kernel trace [1] caused by releasing an HWS action of a local flow counter in mlx5_cmd_hws_delete_fte(), where the HWS action refcount and mutex were not initialized and the counter struct could already be freed when deleting the rule. Fix it by adding the missing initializations and adding refcount for the local flow counter struct. [1] Kernel log: Call Trace: <TASK> dump_stack_lvl+0x34/0x48 mlx5_fs_put_hws_action.part.0.cold+0x21/0x94 [mlx5_core] mlx5_fc_put_hws_action+0x96/0xad [mlx5_core] mlx5_fs_destroy_fs_actions+0x8b/0x152 [mlx5_core] mlx5_cmd_hws_delete_fte+0x5a/0xa0 [mlx5_core] del_hw_fte+0x1ce/0x260 [mlx5_core] mlx5_del_flow_rules+0x12d/0x240 [mlx5_core] ? ttwu_queue_wakelist+0xf4/0x110 mlx5_ib_destroy_flow+0x103/0x1b0 [mlx5_ib] uverbs_free_flow+0x20/0x50 [ib_uverbs] destroy_hw_idr_uobject+0x1b/0x50 [ib_uverbs] uverbs_destroy_uobject+0x34/0x1a0 [ib_uverbs] uobj_destroy+0x3c/0x80 [ib_uverbs] ib_uverbs_run_method+0x23e/0x360 [ib_uverbs] ? uverbs_finalize_object+0x60/0x60 [ib_uverbs] ib_uverbs_cmd_verbs+0x14f/0x2c0 [ib_uverbs] ? do_tty_write+0x1a9/0x270 ? file_tty_write.constprop.0+0x98/0xc0 ? new_sync_write+0xfc/0x190 ib_uverbs_ioctl+0xd7/0x160 [ib_uverbs] __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x59/0x90 Fixes: `b581f42669` ("net/mlx5: fs, manage flow counters HWS action sharing by refcount") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758525094-816583-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-23 17:17:30 -07:00
Jakub Kicinski	1bcce9ec18	Merge tag 'mlx5-next-counters' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-09-21 * tag 'mlx5-next-counters' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add uar access and odp page fault counters ==================== Link: https://patch.msgid.link/1758443940-708689-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-22 16:30:18 -07:00
Jakub Kicinski	38c5b9c38b	Merge tag 'mlx5-next-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-09-17 This series by Carolina contains cleanups significantly touching shared mlx5 net and rdma headers. * tag 'mlx5-next-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5e: Prevent WQE metadata conflicts between timestamping and offloads net/mlx5: Refactor MACsec WQE metadata shifts net/mlx5: Remove VLAN insertion fields from WQE Ether segment ==================== Link: https://patch.msgid.link/1757574619-604874-1-git-send-email-tariqt@nvidia.com Link: https://patch.msgid.link/1758104780-642426-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 15:47:35 -07:00
Jakub Kicinski	f2cdc4c22b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.17-rc7). No conflicts. Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en/fs.h `9536fbe10c` ("net/mlx5e: Add PSP steering in local NIC RX") `7601a0a462` ("net/mlx5e: Add a miss level for ipsec crypto offload") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 11:26:06 -07:00
Akiva Goldberger	a3d076b056	net/mlx5: Add uar access and odp page fault counters Add bar_uar_access, odp_local_triggered_page_fault, and odp_remote_triggered_page_fault counters to the query_vnic_env command. Additionally, add corresponding capabilities bits to the HCA CAP. Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758115678-643464-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-18 05:32:22 -04:00
Cosmin Ratiu	71fb4832d5	net/mlx5e: Use multiple TX doorbells First, allocate more doorbells in mlx5e_create_mdev_resources: - one doorbell remains 'global' and will be used by all non-channel associated SQs (e.g. ASO, HWS, PTP, ...). - allocate additional 'num_doorbells' doorbells. This defaults to minimum between 8 and max number of channels. mlx5e_channel_pick_doorbell() now spreads out channel SQs across available doorbells. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-17 18:30:44 -07:00
Cosmin Ratiu	a315b723e8	net/mlx5e: Prepare for using different CQ doorbells Completion queues (CQs) in mlx5 use the same global doorbell, which may become contended when accessed concurrently from many cores. This patch prepares the CQ management code for supporting different doorbells per CQ. This will be used in downstream patches to allow separate doorbells to be used by channels CQs. The main change is moving the 'uar' pointer from struct mlx5_core_cq to struct mlx5e_cq, as the uar page to be used is better off stored directly there. Other users of mlx5_core_cq also store the UAR to be used separately and therefore the pointer being removed is dead weight for them. As evidence, in this patch there are two users which set the mcq.uar pointer but didn't use it, Software Steering and old Innova CQ creation code. Instead, they rang the doorbell directly from another pointer. The 'uar' pointer added to struct mlx5e_cq remains in a hot cacheline (as before), because it may get accessed for each packet. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-17 18:30:40 -07:00
Cosmin Ratiu	aa4595d0ad	net/mlx5: Store the global doorbell in mlx5_priv The global doorbell is used for more than just Ethernet resources, so move it out of mlx5e_hw_objs into a common place (mlx5_priv), to avoid non-Ethernet modules (e.g. HWS, ASO) depending on Ethernet structs. Use this opportunity to consolidate it with the 'uar' pointer already there, which was used as an RX doorbell. Underneath the 'uar' pointer is identical to 'bfreg->up', so store a single resource and use that instead. For CQ doorbells, care is taken to always use bfreg->up->index instead of bfreg->index, which may refer to a subsequent UAR page from the same ALLOC_UAR batch on some NICs. This paves the way for cleanly supporting multiple doorbells in the Ethernet driver. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-17 18:30:32 -07:00
Cosmin Ratiu	05dfe654b5	net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg The 'offset' field was introduced in the original commit [1] and never used until commit [2], which added an unnecessary use. Remove the field and refactor the write-combining test to use a local variable instead. [1] commit `a6d51b6861` ("net/mlx5: Introduce blue flame register allocator") [2] commit `d98995b4bf` ("net/mlx5: Reimplement write combining test") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-17 18:30:25 -07:00
Carolina Jubran	2ac207381c	net/mlx5e: Prevent WQE metadata conflicts between timestamping and offloads Update the WQE metadata assignment to avoid overriding existing metadata when setting the sysport timestamp ID. Since timestamp IDs are limited to 256 values, they use only the lower 8 bits of the metadata field. To avoid conflicts, move IPsec and MACsec metadata ID to bits 8 and 9, and shift the MACsec fs_id accordingly. This ensures safe coexistence of timestamping and offload features that use the same metadata field. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757574619-604874-4-git-send-email-tariqt@nvidia.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-17 04:38:10 -04:00
Carolina Jubran	cce65f3244	net/mlx5: Refactor MACsec WQE metadata shifts Introduce MLX5_ETH_WQE_FT_META_SHIFT as a shared base offset for features that use the lower 8 bits of the WQE flow_table_metadata field, currently used for timestamping, IPsec, and MACsec. Define MLX5_ETH_WQE_FT_META_MACSEC_FS_ID_MASK so that fs_id occupies bits 2–5, making it clear that fs_id occupies bits in the metadata. Set MLX5_ETH_WQE_FT_META_MACSEC_MASK as the OR of the MACsec flag and MLX5_ETH_WQE_FT_META_MACSEC_FS_ID_MASK, corresponding to the original 0x3E mask. Update the fs_id macro to right-shift the MACsec flag by MLX5_ETH_WQE_FT_META_SHIFT and update the RoCE modify-header action to use it. Introduce the helper macro MLX5_MACSEC_TX_METADATA(fs_id) to compose the full shifted MACsec metadata value. These changes make it explicit exactly which metadata bits carry MACsec information, simplifying future feature exclusions when multiple features share the WQE flowtable metadata. In addition, drop the incorrect “RX flow steering” comment, since this applies to TX flow steering. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757574619-604874-3-git-send-email-tariqt@nvidia.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-17 04:38:10 -04:00
Carolina Jubran	de2be98541	net/mlx5: Remove VLAN insertion fields from WQE Ether segment Now that the driver no longer uses VLAN TX insertion via the WQE Ethernet segment, the related fields and flags can be removed. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757574619-604874-2-git-send-email-tariqt@nvidia.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-17 04:38:10 -04:00
Jianbo Liu	6b4be64fd9	net/mlx5e: Harden uplink netdev access against device unbind The function mlx5_uplink_netdev_get() gets the uplink netdevice pointer from mdev->mlx5e_res.uplink_netdev. However, the netdevice can be removed and its pointer cleared when unbound from the mlx5_core.eth driver. This results in a NULL pointer, causing a kernel panic. BUG: unable to handle page fault for address: 0000000000001300 at RIP: 0010:mlx5e_vport_rep_load+0x22a/0x270 [mlx5_core] Call Trace: <TASK> mlx5_esw_offloads_rep_load+0x68/0xe0 [mlx5_core] esw_offloads_enable+0x593/0x910 [mlx5_core] mlx5_eswitch_enable_locked+0x341/0x420 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x17e/0x3a0 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x183/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 do_syscall_64+0x53/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Ensure the pointer is valid before use by checking it for NULL. If it is valid, immediately call netdev_hold() to take a reference, and preventing the netdevice from being freed while it is in use. Fixes: `7a9fb35e8c` ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757939074-617281-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-16 17:19:11 -07:00
Saeed Mahameed	bf2da4799f	net/mlx5: Implement cqe_compress_type via devlink params Selects which algorithm should be used by the NIC in order to decide rate of CQE compression dependeng on PCIe bus conditions. Supported values: 1) balanced, merges fewer CQEs, resulting in a moderate compression ratio but maintaining a balance between bandwidth savings and performance 2) aggressive, merges more CQEs into a single entry, achieving a higher compression rate and maximizing performance, particularly under high traffic loads. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250907012953.301746-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:23 -07:00
Carolina Jubran	ff97bc38be	net/mlx5: Add RS FEC histogram infrastructure Define the Ports Phy Histogram Configuration Register (PPHCR) to expose RS-FEC histogram bin ranges, and expose a new counter group in the Ports Performance Counters Register (PPCNT) to report the corresponding histogram values. Co-developed-by: Yael Chemla <ychemla@nvidia.com> Signed-off-by: Yael Chemla <ychemla@nvidia.com> Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1756884600-520195-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-09-09 04:18:19 -04:00
Saeed Mahameed	04a3134f88	net/mlx5: Add PSP capabilities structures and bits Add mlx5_ifc PSP related capabilities structures and HW definitions needed for PSP support in mlx5. Link: https://lore.kernel.org/netdev/20250828162953.2707727-1-daniel.zahka@gmail.com/ Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2025-09-02 23:08:13 -07:00
Saeed Mahameed	40653f280b	{rdma,net}/mlx5: export mlx5_vport_get_vhca_id vhca id is already cached in the vport structure no need to query on every mlx5 layer, use the mlx5_vport_get_vhca_id, where possible. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Alexei Lazar <alazar@nvidia.com> Reviewed-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>	2025-08-15 12:17:47 -07:00
Saeed Mahameed	2335b3f566	net/mlx5: mlx5_ifc, Add hardware definitions needed for adjacent vports Next patches will implement the discovery and creation of adjacent functions vports, this patch introduces the hardware structures definitions needed for the driver implementation. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jack Morgenstein <jackm@nvidia.com> Signed-off-by: Alexei Lazar <alazar@nvidia.com>	2025-08-15 12:17:47 -07:00
Leon Romanovsky	b272fc8972	RDMA support for DMA handle From Yishai: This patch series introduces a new DMA Handle (DMAH) object, along with corresponding APIs for its allocation and deallocation. The DMAH object encapsulates attributes relevant for DMA transactions. While initially intended to support TLP Processing Hints (TPH) [1], the design is extensible to accommodate future features such as PCI multipath for DMA, PCI UIO configurations, traffic class selection, and more. Additionally, we introduce a new ioctl method on the MR object: UVERBS_METHOD_REG_MR. This method consolidates multiple reg_mr variants under a single user-space ioctl interface, supporting: ibv_reg_mr(), ibv_reg_mr_iova(), ibv_reg_mr_iova2() and ibv_reg_dmabuf_mr(). It also enables passing a DMA handle as part of the registration process. Throughout the patch series, the following DMAH-related stuff can also be observed in the IB layer: - Association with a CPU ID and its memory type, for use with Steering Tags [2]. - Inclusion of Processing Hints (PH) data for TPH functionality [3]. - Enforces security by ensuring that only tasks allowed to run on a given CPU may request a DMA handle for it. - Reference counting for DMAH life cycle management and safe usage across memory regions. mlx5 driver implementation: -------------------------- The series includes implementation of the above functionality in the mlx5 driver. In mlx5_core: - Enables TPH over PCIe when both firmware and OS support it. - Manages Steering Tags and corresponding indices by writing tag values to the PCI configuration space. - Exposes APIs to upper layers (e.g., mlx5_ib) to enable the PCIe TPH functionality. In mlx5_ib: - Adds full support for DMAH operations. - Utilizes mlx5_core's Steering Tag APIs to derive tag indices from input. - Stores the resulting index in a mlx5_dmah structure for use during MKEY creation with a DMA handle. - Adds support for allowing MKEYs to be created in conjunction with DMA handles. Additional details are provided in the commit messages. [1] Background, from PCIe specification 6.2. TLP Processing Hints (TPH) -------------------------- TLP Processing Hints is an optional feature that provides hints in Request TLP headers to facilitate optimized processing of Requests that target Memory Space. These Processing Hints enable the system hardware (e.g., the Root Complex and/ or Endpoints) to optimize platform resources such as system and memory interconnect on a per TLP basis. Steering Tags are system-specific values used to identify a processing resource that a Requester explicitly targets. System software discovers and identifies TPH capabilities to determine the Steering Tag allocation for each Function that supports TPH [2] Steering Tags Functions that intend to target a TLP towards a specific processing resource such as a host processor or system cache hierarchy require topological information of the target cache (e.g., which host cache). Steering Tags are system-specific values that provide information about the host or cache structure in the system cache hierarchy. These values are used to associate processing elements within the platform with the processing of Requests. [3] Processing Hints The Requester provides hints to the Root Complex or other targets about the intended use of data and data structures by the host and/or device. The hints are provided by the Requester, which has knowledge of upcoming Request patterns, and which the Completer would not be able to deduce autonomously (with good accuracy) Yishai Signed-off-by: Leon Romanovsky <leon@kernel.org> * mlx5-next: net/mlx5: Add support for device steering tag net/mlx5: Expose IFC bits for TPH PCI/TPH: Expose pcie_tph_get_st_table_size() net/mlx5: Expose cable_length field in PFCC register net/mlx5: Add IFC bits and enums for buf_ownership net/mlx5: Add IFC bits to support RSS for IPSec offload net/mlx5: IFC updates for disabled host PF net/mlx5: Expose disciplined_fr_counter through HCA capabilities in mlx5_ifc	2025-07-23 01:38:56 -04:00
Yishai Hadas	888a7776f4	net/mlx5: Add support for device steering tag Background, from PCIe specification 6.2. TLP Processing Hints (TPH) -------------------------- TLP Processing Hints is an optional feature that provides hints in Request TLP headers to facilitate optimized processing of Requests that target Memory Space. These Processing Hints enable the system hardware (e.g., the Root Complex and/or Endpoints) to optimize platform resources such as system and memory interconnect on a per TLP basis. Steering Tags are system-specific values used to identify a processing resource that a Requester explicitly targets. System software discovers and identifies TPH capabilities to determine the Steering Tag allocation for each Function that supports TPH. This patch adds steering tag support for mlx5 based NICs by: - Enabling the TPH functionality over PCI if both FW and OS support it. - Managing steering tags and their matching steering indexes by writing a ST to an ST index over the PCI configuration space. - Exposing APIs to upper layers (e.g.,mlx5_ib) to allow usage of the PCI TPH infrastructure. Further details: - Upon probing of a device, the feature will be enabled based on both capability detection and OS support. - It will retrieve the appropriate ST for a given CPU ID and memory type using the pcie_tph_get_cpu_st() API. - It will track available ST indices according to the configuration space table size (expected to be 63 entries), reserving index 0 to indicate non-TPH use. - It will assign a free ST index with a ST using the pcie_tph_set_st_entry() API. - It will reuse the same index for identical (CPU ID + memory type) combinations by maintaining a reference count per entry. - It will expose APIs to upper layers (e.g., mlx5_ib) to allow usage of the PCI TPH infrastructure. - SF will use its parent PF stuff. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/de1ae7398e9e34eacd8c10845683df44fc9e32f8.1752752567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-23 01:27:32 -04:00
Yishai Hadas	5f9ec7880e	net/mlx5: Expose IFC bits for TPH Expose IFC bits for the TPH functionality. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Link: https://patch.msgid.link/38ea3a0d56551364214e8edf359c9c77c9a3b71b.1752752567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-23 01:27:32 -04:00
Oren Sidi	9a0048e0ae	net/mlx5: Expose cable_length field in PFCC register Introduce new "cable_length" field in PFCC register and related fields to enhance rx buffer configuration management: 1. cable_length: Shifts cable length handling to fw by storing a manually entered length from user in PFCC.cable_length 2. lane_rate_oper: In a case where PFCC.cable_length is not supported, helps compute a default cable length Signed-off-by: Oren Sidi <osidi@nvidia.com> Reviewed-by: Alex Lazar <alazar@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752734895-257735-4-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-20 03:02:14 -04:00
Oren Sidi	6f09ee0b58	net/mlx5: Add IFC bits and enums for buf_ownership Extend structure layouts and defines buf_ownership. buf_ownership indicates whether the buffer is managed by SW or FW. Signed-off-by: Oren Sidi <osidi@nvidia.com> Reviewed-by: Alex Lazar <alazar@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752734895-257735-3-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-20 03:02:14 -04:00
Jianbo Liu	438794e93f	net/mlx5: Add IFC bits to support RSS for IPSec offload This adds the capabilities, ipsec_next_header and inner/outer l4_type_ext fields to support RSS for the decrypted packets. These fields are specifically for firmware steering. HWS validation logic is updated to correctly handle the changes, ensuring the unsupported fields are not set. Besides, reserved_at_c4 is fixed to reserved_at_d4 to reflect the accurate offset within the structure. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752734895-257735-2-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-20 03:02:14 -04:00
Daniel Jurgens	cd1746cb65	net/mlx5: IFC updates for disabled host PF The port 2 host PF can be disabled, this bit reflects that setting. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752064867-16874-3-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-13 03:17:30 -04:00
Carolina Jubran	cbe080f931	net/mlx5: Expose disciplined_fr_counter through HCA capabilities in mlx5_ifc Introduce the `disciplined_fr_counter` capability bit to indicate that the device’s free-running cycle counter is disciplined to real-time. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752064867-16874-2-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-13 03:17:30 -04:00
Leon Romanovsky	9879bddf5a	Optimize DMABUF mkey page size in mlx5 From Edward: This patch series enables the mlx5 driver to dynamically choose the optimal page size for a DMABUF-based memory key (mkey), rather than always registering with a fixed page size. Previously, DMABUF memory registration used a fixed 4K page size for mkeys which could lead to suboptimal performance when the underlying memory layout may offer better page sizes. This approach did not take advantage of larger page size capabilities advertised by the HCA, and the driver was not setting the proper page size mask in the mkey mask when performing page size changes, potentially leading to invalid registrations when updating to a very large pages. This series improves DMABUF performance by dynamically selecting the best page size for a given memory region (MR) both at creation time and on page fault occurrences, based on the underlying layout and fixing related gaps and bugs. By doing so, we reduce the number of page table entries (and thus MTT/ KSM descriptors) that the HCA must traverse, which in turn reduces cache-line fetches. Thanks * mlx5-next: RDMA/mlx5: Fix UMR modifying of mkey page size net/mlx5: Expose HCA capability bits for mkey max page size Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-13 03:00:21 -04:00
Edward Srouji	c4f96972c3	RDMA/mlx5: Fix UMR modifying of mkey page size When changing the page size on an mkey, the driver needs to set the appropriate bits in the mkey mask to indicate which fields are being modified. The 6th bit of a page size in mlx5 driver is considered an extension, and this bit has a dedicated capability and mask bits. Previously, the driver was not setting this mask in the mkey mask when performing page size changes, regardless of its hardware support, potentially leading to an incorrect page size updates. This fixes the issue by setting the relevant bit in the mkey mask when performing page size changes on an mkey and the 6th bit of this field is supported by the hardware. Fixes: `cef7dde883` ("net/mlx5: Expand mkey page size to support 6 bits") Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/9f43a9c73bf2db6085a99dc836f7137e76579f09.1751979184.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-13 02:57:30 -04:00
Michael Guralnik	8feaf9832b	net/mlx5: Expose HCA capability bits for mkey max page size Expose the HCA capability for maximal page size that can be configured for an mkey. Used for enforcing capabilities when working with highly contiguous memory and using large page sizes. Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/3e4d3fda37934430f65f72601519e22bf396fd05.1751979184.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-07-13 02:57:23 -04:00
Leon Romanovsky	f3b7a65ce5	Merge branch 'mlx5-next' into wip/leon-for-next * mlx5-next: net/mlx5: Check device memory pointer before usage net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow net/mlx5: Add IFC bits for PCIe Congestion Event object net/mlx5: Small refactor for general object capabilities	2025-07-07 06:29:16 -04:00
Mark Bloch	611d08207d	RDMA/mlx5: Allocate IB device with net namespace supplied from core dev Use the new ib_alloc_device_with_net() API to allocate the IB device so that it is properly bound to the network namespace obtained via mlx5_core_net(). This change ensures correct namespace association (e.g., for containerized setups). Additionally, expose mlx5_core_net so that RDMA driver can use it. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2025-06-26 08:10:07 -04:00
Dragos Tatulea	1f6da56679	net/mlx5: Add IFC bits for PCIe Congestion Event object Add definitions for the PCIe Congestion Event object and the relevant FW command structures. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250619113721.60201-3-mbloch@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-06-25 07:34:27 -04:00
Dragos Tatulea	ebf8d47121	net/mlx5: Small refactor for general object capabilities Make enum for capability bits of general object types depend on the type definitions themselves. Make sure that capabilities in the [64,127] bit range are properly calculated (type id - 64). Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250619113721.60201-2-mbloch@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-06-25 07:34:27 -04:00
Patrisious Haddad	52931f5515	net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain RDMA TRANSPORT domains were initially limited to a single priority. This change allows the domains to have multiple priorities, making it possible to add several rules and control the order in which they're evaluated. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/b299cbb4c8678a33da6e6b6988b5bf6145c54b88.1750148083.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-06-25 03:55:53 -04:00
Patrisious Haddad	5d2ea5aebb	RDMA/mlx5: Fix error flow upon firmware failure for RQ destruction Upon RQ destruction if the firmware command fails which is the last resource to be destroyed some SW resources were already cleaned regardless of the failure. Now properly rollback the object to its original state upon such failure. In order to avoid a use-after free in case someone tries to destroy the object again, which results in the following kernel trace: refcount_t: underflow; use-after-free. WARNING: CPU: 0 PID: 37589 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148 Modules linked in: rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) rfkill mlx5_core(OE) mlxdevm(OE) ib_uverbs(OE) ib_core(OE) psample mlxfw(OE) mlx_compat(OE) macsec tls pci_hyperv_intf sunrpc vfat fat virtio_net net_failover failover fuse loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_console virtio_gpu virtio_blk virtio_dma_buf virtio_mmio dm_mirror dm_region_hash dm_log dm_mod xpmem(OE) CPU: 0 UID: 0 PID: 37589 Comm: python3 Kdump: loaded Tainted: G OE ------- --- 6.12.0-54.el10.aarch64 #1 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0xf4/0x148 lr : refcount_warn_saturate+0xf4/0x148 sp : ffff80008b81b7e0 x29: ffff80008b81b7e0 x28: ffff000133d51600 x27: 0000000000000001 x26: 0000000000000000 x25: 00000000ffffffea x24: ffff00010ae80f00 x23: ffff00010ae80f80 x22: ffff0000c66e5d08 x21: 0000000000000000 x20: ffff0000c66e0000 x19: ffff00010ae80340 x18: 0000000000000006 x17: 0000000000000000 x16: 0000000000000020 x15: ffff80008b81b37f x14: 0000000000000000 x13: 2e656572662d7265 x12: ffff80008283ef78 x11: ffff80008257efd0 x10: ffff80008283efd0 x9 : ffff80008021ed90 x8 : 0000000000000001 x7 : 00000000000bffe8 x6 : c0000000ffff7fff x5 : ffff0001fb8e3408 x4 : 0000000000000000 x3 : ffff800179993000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000133d51600 Call trace: refcount_warn_saturate+0xf4/0x148 mlx5_core_put_rsc+0x88/0xa0 [mlx5_ib] mlx5_core_destroy_rq_tracked+0x64/0x98 [mlx5_ib] mlx5_ib_destroy_wq+0x34/0x80 [mlx5_ib] ib_destroy_wq_user+0x30/0xc0 [ib_core] uverbs_free_wq+0x28/0x58 [ib_uverbs] destroy_hw_idr_uobject+0x34/0x78 [ib_uverbs] uverbs_destroy_uobject+0x48/0x240 [ib_uverbs] __uverbs_cleanup_ufile+0xd4/0x1a8 [ib_uverbs] uverbs_destroy_ufile_hw+0x48/0x120 [ib_uverbs] ib_uverbs_close+0x2c/0x100 [ib_uverbs] __fput+0xd8/0x2f0 __fput_sync+0x50/0x70 __arm64_sys_close+0x40/0x90 invoke_syscall.constprop.0+0x74/0xd0 do_el0_svc+0x48/0xe8 el0_svc+0x44/0x1d0 el0t_64_sync_handler+0x120/0x130 el0t_64_sync+0x1a4/0x1a8 Fixes: `e2013b212f` ("net/mlx5_core: Add RQ and SQ event handling") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/3181433ccdd695c63560eeeb3f0c990961732101.1745839855.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-05-05 11:44:37 -04:00
Linus Torvalds	092e335082	RDMA v6.15 merge window pull request - Usual minor updates and fixes for bnxt_re, hfi1, rxe, mana, iser, mlx5, vmw_pvrdma, hns - Make rxe work on tun devices - mana gains more standard verbs as it moves toward supporting in-kernel verbs - DMABUF support for mana - Fix page size calculations when memory registration exceeds 4G - On Demand Paging support for rxe - mlx5 support for RDMA TRANSPORT flow tables and a new ucap mechanism to access control use of them - Optional RDMA_TX/RX counters per QP in mlx5 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZ+ap4gAKCRCFwuHvBreF YaFHAP9wyeZCZIbnWaGcbNdbsmkEgy7aTVILRHf1NA7VSJ211gD9Ha60E+mkwtvA i7IJ49R2BdqzKaO9oTutj2Lw+8rABwQ= =qXhh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: - Usual minor updates and fixes for bnxt_re, hfi1, rxe, mana, iser, mlx5, vmw_pvrdma, hns - Make rxe work on tun devices - mana gains more standard verbs as it moves toward supporting in-kernel verbs - DMABUF support for mana - Fix page size calculations when memory registration exceeds 4G - On Demand Paging support for rxe - mlx5 support for RDMA TRANSPORT flow tables and a new ucap mechanism to access control use of them - Optional RDMA_TX/RX counters per QP in mlx5 * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (73 commits) IB/mad: Check available slots before posting receive WRs RDMA/mana_ib: Fix integer overflow during queue creation RDMA/mlx5: Fix calculation of total invalidated pages RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow RDMA/mlx5: Fix page_size variable overflow RDMA/mlx5: Drop access_flags from _mlx5_mr_cache_alloc() RDMA/mlx5: Fix cache entry update on dereg error RDMA/mlx5: Fix MR cache initialization error flow RDMA/mlx5: Support optional-counters binding for QPs RDMA/mlx5: Compile fs.c regardless of INFINIBAND_USER_ACCESS config RDMA/core: Pass port to counter bind/unbind operations RDMA/core: Add support to optional-counters binding configuration RDMA/core: Create and destroy rdma_counter using rdma_zalloc_drv_obj() RDMA/mlx5: Add optional counters for RDMA_TX/RX_packets/bytes RDMA/core: Fix use-after-free when rename device name RDMA/bnxt_re: Support perf management counters RDMA/rxe: Fix incorrect return value of rxe_odp_atomic_op() RDMA/uverbs: Propagate errors from rdma_lookup_get_uobject() RDMA/mana_ib: Handle net event for pointing to the current netdev net: mana: Change the function signature of mana_get_primary_netdev_rcu ...	2025-03-29 11:12:28 -07:00
Patrisious Haddad	fd24c9ef6c	RDMA/mlx5: Support optional-counters binding for QPs Add support to allow optional-counters binding to a QP, whereas when a bind operation is requested depending on the counter optional-counter binding state the driver will determine if to also add optional-counters to this QP binding. The optional-counter binding is done by simply adding a steering rule for the specific optional-counter condition with the additional match over that QP number. Note that optional-counters per QP rules are handled on an earlier prio than per device counters, and per device counter correctness is maintained by core whereas it is responsible to sum active counters when checking device counter and to add them to history count when they are deallocated. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/2cad1b891a6641ae61fe8d92f867e1059121813a.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-18 06:18:57 -04:00
Patrisious Haddad	d375db42a8	RDMA/mlx5: Add optional counters for RDMA_TX/RX_packets/bytes Add the following optional counters: rdma_tx_packets,rdma_rx_bytes,rdma_rx_packets,rdma_tx_bytes. Which counts all RDMA packets/bytes sent and received per link. Note that since each direction packet and byte counter are shared, the counter is only reset when both counters of that direction are removed. But from user-perspective each can be enabled/disabled separately. The counters can be enabled using: sudo rdma stat set link rocep8s0f0/1 optional-counters rdma_tx_packets And can be seen using: rdma stat -j show link rocep8s0f0/1 Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/9f2753ad636f21704416df64b47395c8991d1123.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-18 06:18:30 -04:00
Moshe Shemesh	82d3639ef7	net/mlx5: fs, add support for flow meters HWS action Add support for HW Steering action of flow meter range. Flow meters range can use one HWS action for the whole range. Thus, share a cached HWS action among rules that use same flow meter object range. Hold refcount for each rule using the cached action. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1741543663-22123-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-17 18:57:17 +01:00
Paolo Abeni	89d75c4c67	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-03-10 The following pull-request contains common mlx5 updates for your net-next tree. Please pull and let me know of any problem. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add IFC bits for PPCNT recovery counters group net/mlx5: fs, add RDMA TRANSPORT steering domain support net/mlx5: Query ADV_RDMA capabilities net/mlx5: Limit non-privileged commands net/mlx5: Allow the throttle mechanism to be more dynamic net/mlx5: Add RDMA_CTRL HW capabilities ==================== Link: https://patch.msgid.link/1741608293-41436-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 12:40:46 +01:00
Yael Chemla	f550694e88	net/mlx5: Add IFC bits for PPCNT recovery counters group Add recovery counters group layout of PPCNT (Ports Performance Counters Register). This group counts recovery events per link. Also add the corresponding bit in PCAM to indicate this group is supported. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741545697-23041-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-10 04:31:15 -04:00
Patrisious Haddad	15b103df80	net/mlx5: fs, add RDMA TRANSPORT steering domain support Add RX and TX RDMA_TRANSPORT flow table namespace, and the ability to create flow tables in those namespaces. The RDMA_TRANSPORT RX and TX are per vport. Packets will traverse through RDMA_TRANSPORT_RX after RDMA_RX and through RDMA_TRANSPORT_TX before RDMA_TX, ensuring proper control and management. RDMA_TRANSPORT domains are managed by the vport group manager. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/a6b550d9859a197eafa804b9a8d76916ca481da9.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:49 -05:00
Patrisious Haddad	ab7d228c7e	net/mlx5: Query ADV_RDMA capabilities Query ADV_RDMA capabilities which provide information for advanced RDMA related features. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/e3e6ede03ea31cd201078dcdd4e407608e4a5a87.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:46 -05:00
Chiara Meiohas	f9deed0980	net/mlx5: Limit non-privileged commands Limit non-privileged UID commands to half of the available command slots when privileged UIDs are present. Privileged throttle commands will not be limited. Use an xarray to store privileged UIDs. Add insert and remove functions for privileged UIDs management. Non-user commands (with uid 0) are not limited. Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/d2f3dd9a0dbad3c9f2b4bb0723837995e4e06de2.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:42 -05:00
Chiara Meiohas	0a34fad1be	net/mlx5: Allow the throttle mechanism to be more dynamic Previously, throttle commands were identified and limited based on opcode. These commands were limited to half the command slots using a semaphore, and callback commands checked the opcode to determine semaphore release. To allow exceptions, we introduce a variable to indicate when the throttle lock is held. This allows scenarios where throttle commands are not limited. Callback functions use this variable to determine if the throttle semaphore needs to be released. This patch contains no functional changes. It's a preparation for the next patch. Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Link: https://patch.msgid.link/055d975edeb816ac4c0fd1e665c6157d11947d26.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:36 -05:00
Chiara Meiohas	f6f425f3d2	net/mlx5: Add RDMA_CTRL HW capabilities Add RDMA_CTRL UCTX capabilities and add the RDMA_CTRL general object type in hca_cap_2. Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://patch.msgid.link/ef7eb24be9a6f247ab52e8b4480350072e5182f5.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:32 -05:00

1 2 3 4 5 ...

1535 Commits (09cfd3c52ea76f43b3cb15e570aeddf633d65e80)