mirror-linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	092e335082	RDMA v6.15 merge window pull request - Usual minor updates and fixes for bnxt_re, hfi1, rxe, mana, iser, mlx5, vmw_pvrdma, hns - Make rxe work on tun devices - mana gains more standard verbs as it moves toward supporting in-kernel verbs - DMABUF support for mana - Fix page size calculations when memory registration exceeds 4G - On Demand Paging support for rxe - mlx5 support for RDMA TRANSPORT flow tables and a new ucap mechanism to access control use of them - Optional RDMA_TX/RX counters per QP in mlx5 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZ+ap4gAKCRCFwuHvBreF YaFHAP9wyeZCZIbnWaGcbNdbsmkEgy7aTVILRHf1NA7VSJ211gD9Ha60E+mkwtvA i7IJ49R2BdqzKaO9oTutj2Lw+8rABwQ= =qXhh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: - Usual minor updates and fixes for bnxt_re, hfi1, rxe, mana, iser, mlx5, vmw_pvrdma, hns - Make rxe work on tun devices - mana gains more standard verbs as it moves toward supporting in-kernel verbs - DMABUF support for mana - Fix page size calculations when memory registration exceeds 4G - On Demand Paging support for rxe - mlx5 support for RDMA TRANSPORT flow tables and a new ucap mechanism to access control use of them - Optional RDMA_TX/RX counters per QP in mlx5 * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (73 commits) IB/mad: Check available slots before posting receive WRs RDMA/mana_ib: Fix integer overflow during queue creation RDMA/mlx5: Fix calculation of total invalidated pages RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow RDMA/mlx5: Fix page_size variable overflow RDMA/mlx5: Drop access_flags from _mlx5_mr_cache_alloc() RDMA/mlx5: Fix cache entry update on dereg error RDMA/mlx5: Fix MR cache initialization error flow RDMA/mlx5: Support optional-counters binding for QPs RDMA/mlx5: Compile fs.c regardless of INFINIBAND_USER_ACCESS config RDMA/core: Pass port to counter bind/unbind operations RDMA/core: Add support to optional-counters binding configuration RDMA/core: Create and destroy rdma_counter using rdma_zalloc_drv_obj() RDMA/mlx5: Add optional counters for RDMA_TX/RX_packets/bytes RDMA/core: Fix use-after-free when rename device name RDMA/bnxt_re: Support perf management counters RDMA/rxe: Fix incorrect return value of rxe_odp_atomic_op() RDMA/uverbs: Propagate errors from rdma_lookup_get_uobject() RDMA/mana_ib: Handle net event for pointing to the current netdev net: mana: Change the function signature of mana_get_primary_netdev_rcu ...	2025-03-29 11:12:28 -07:00
Patrisious Haddad	fd24c9ef6c	RDMA/mlx5: Support optional-counters binding for QPs Add support to allow optional-counters binding to a QP, whereas when a bind operation is requested depending on the counter optional-counter binding state the driver will determine if to also add optional-counters to this QP binding. The optional-counter binding is done by simply adding a steering rule for the specific optional-counter condition with the additional match over that QP number. Note that optional-counters per QP rules are handled on an earlier prio than per device counters, and per device counter correctness is maintained by core whereas it is responsible to sum active counters when checking device counter and to add them to history count when they are deallocated. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/2cad1b891a6641ae61fe8d92f867e1059121813a.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-18 06:18:57 -04:00
Patrisious Haddad	d375db42a8	RDMA/mlx5: Add optional counters for RDMA_TX/RX_packets/bytes Add the following optional counters: rdma_tx_packets,rdma_rx_bytes,rdma_rx_packets,rdma_tx_bytes. Which counts all RDMA packets/bytes sent and received per link. Note that since each direction packet and byte counter are shared, the counter is only reset when both counters of that direction are removed. But from user-perspective each can be enabled/disabled separately. The counters can be enabled using: sudo rdma stat set link rocep8s0f0/1 optional-counters rdma_tx_packets And can be seen using: rdma stat -j show link rocep8s0f0/1 Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/9f2753ad636f21704416df64b47395c8991d1123.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-18 06:18:30 -04:00
Moshe Shemesh	82d3639ef7	net/mlx5: fs, add support for flow meters HWS action Add support for HW Steering action of flow meter range. Flow meters range can use one HWS action for the whole range. Thus, share a cached HWS action among rules that use same flow meter object range. Hold refcount for each rule using the cached action. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1741543663-22123-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-17 18:57:17 +01:00
Paolo Abeni	89d75c4c67	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-03-10 The following pull-request contains common mlx5 updates for your net-next tree. Please pull and let me know of any problem. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add IFC bits for PPCNT recovery counters group net/mlx5: fs, add RDMA TRANSPORT steering domain support net/mlx5: Query ADV_RDMA capabilities net/mlx5: Limit non-privileged commands net/mlx5: Allow the throttle mechanism to be more dynamic net/mlx5: Add RDMA_CTRL HW capabilities ==================== Link: https://patch.msgid.link/1741608293-41436-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 12:40:46 +01:00
Yael Chemla	f550694e88	net/mlx5: Add IFC bits for PPCNT recovery counters group Add recovery counters group layout of PPCNT (Ports Performance Counters Register). This group counts recovery events per link. Also add the corresponding bit in PCAM to indicate this group is supported. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741545697-23041-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-10 04:31:15 -04:00
Patrisious Haddad	15b103df80	net/mlx5: fs, add RDMA TRANSPORT steering domain support Add RX and TX RDMA_TRANSPORT flow table namespace, and the ability to create flow tables in those namespaces. The RDMA_TRANSPORT RX and TX are per vport. Packets will traverse through RDMA_TRANSPORT_RX after RDMA_RX and through RDMA_TRANSPORT_TX before RDMA_TX, ensuring proper control and management. RDMA_TRANSPORT domains are managed by the vport group manager. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/a6b550d9859a197eafa804b9a8d76916ca481da9.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:49 -05:00
Patrisious Haddad	ab7d228c7e	net/mlx5: Query ADV_RDMA capabilities Query ADV_RDMA capabilities which provide information for advanced RDMA related features. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/e3e6ede03ea31cd201078dcdd4e407608e4a5a87.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:46 -05:00
Chiara Meiohas	f9deed0980	net/mlx5: Limit non-privileged commands Limit non-privileged UID commands to half of the available command slots when privileged UIDs are present. Privileged throttle commands will not be limited. Use an xarray to store privileged UIDs. Add insert and remove functions for privileged UIDs management. Non-user commands (with uid 0) are not limited. Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/d2f3dd9a0dbad3c9f2b4bb0723837995e4e06de2.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:42 -05:00
Chiara Meiohas	0a34fad1be	net/mlx5: Allow the throttle mechanism to be more dynamic Previously, throttle commands were identified and limited based on opcode. These commands were limited to half the command slots using a semaphore, and callback commands checked the opcode to determine semaphore release. To allow exceptions, we introduce a variable to indicate when the throttle lock is held. This allows scenarios where throttle commands are not limited. Callback functions use this variable to determine if the throttle semaphore needs to be released. This patch contains no functional changes. It's a preparation for the next patch. Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Link: https://patch.msgid.link/055d975edeb816ac4c0fd1e665c6157d11947d26.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:36 -05:00
Chiara Meiohas	f6f425f3d2	net/mlx5: Add RDMA_CTRL HW capabilities Add RDMA_CTRL UCTX capabilities and add the RDMA_CTRL general object type in hca_cap_2. Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://patch.msgid.link/ef7eb24be9a6f247ab52e8b4480350072e5182f5.1740574103.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-03-08 13:22:32 -05:00
Shahar Shitrit	a2f61f1db8	net/mlx5: Relocate function declarations from port.h to mlx5_core.h The port header is a general file under include, yet it contains declarations for functions that are either not exported or exported but not used outside the mlx5_core driver. To enhance code organization, we move these declarations to mlx5_core.h, where they are more appropriately scoped. This refactor removes unnecessary exported symbols and prevents unexported functions from being inadvertently referenced outside of the mlx5_core driver. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250304160620.417580-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 17:53:34 -08:00
Jakub Kicinski	ef4a47a8ab	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-02-24 The following pull-request contains common mlx5 updates for your net-next tree. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Change POOL_NEXT_SIZE define value and make it global net/mlx5: Add new health syndrome error and crr bit offset ==================== Link: https://patch.msgid.link/20250224212446.523259-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:40:05 -08:00
Jianbo Liu	85e4a808af	net/mlx5e: Add correct match to check IPSec syndromes for switchdev mode In commit `dddb49b63d` ("net/mlx5e: Add IPsec and ASO syndromes check in HW"), IPSec and ASO syndromes checks after decryption for the specified ASO object were added. But they are correct only for eswith in legacy mode. For switchdev mode, metadata register c1 is used to save the mapped id (not ASO object id). So, need to change the match accordingly for the check rules in status table. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-24 18:14:32 -08:00
Patrisious Haddad	80df31f384	net/mlx5: Change POOL_NEXT_SIZE define value and make it global Change POOL_NEXT_SIZE define value from 0 to BIT(30), since this define is used to request the available maximum sized flow table, and zero doesn't make sense for it, whereas some places in the driver use zero explicitly expecting the smallest table size possible but instead due to this define they end up allocating the biggest table size unawarely. In addition move the definition to "include/linux/mlx5/fs.h" to expose the define to IB driver as well, while appropriately renaming it. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219085808.349923-3-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-02-23 04:43:14 -05:00
Shahar Shitrit	531ca2b9a2	net/mlx5: Add new health syndrome error and crr bit offset Add new error value for trust lockdown in health syndrome enum. Also, include the offset for crr bit in the health buffer layout. These changes prepare for downstream patches that update health event handling. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219085808.349923-2-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-02-23 04:42:36 -05:00
Jianbo Liu	ee0a4fc396	net/mlx5: Add support for 200Gbps per lane link modes This patch exposes new link modes using 200Gbps per lane, including 200G, 400G and 800G modes. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-06 10:14:01 +01:00
Jianbo Liu	574998cf3b	net/mlx5: Add devcom component for the clock shared by functions Add new devcom component for hardware clock. When it is running in real time mode, the functions are grouped by the identify they query. According to firmware document, the clock identify size is 64 bits, so it's safe to memcpy to component key, as the key size is also 64 bits. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-06 10:14:01 +01:00
Jianbo Liu	f9beaf4fac	net/mlx5: Change clock in mlx5_core_dev to mlx5_clock pointer Change clock member in mlx5_core_dev to a pointer, so it can point to a clock shared by multiple functions in later patch. For now, each function has its own clock, so mdev in mlx5_clock_priv is the back pointer to the function. Later it points to one (normally the first one) of the multiple functions sharing the same clock. Change mlx5_init_clock() to return error if mlx5_clock is not allocated. Besides, a null clock is defined and used when hardware clock is not supported. So, the clock pointer is always pointing to something valid. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-06 10:14:00 +01:00
Jakub Kicinski	d90e36f836	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-01-14 The following pull-request contains mlx5 IFC updates. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers net/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps net/mlx5: Add support for MRTCQ register net/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes ==================== Link: https://patch.msgid.link/20250114055700.1928736-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-14 11:13:35 -08:00
Moshe Shemesh	aecd9d1020	net/mlx5: fs, add HWS packet reformat API function Add packet reformat alloc and dealloc API functions to provide packet reformat actions for steering rules. Add HWS action pools for each of the following packet reformat types: - decapl3: decapsulate l3 tunnel to l2 - encapl2: encapsulate l2 to tunnel l2 - encapl3: encapsulate l2 to tunnel l3 - insert_hdr: insert header In addition cache remove header action for remove vlan header as this is currently the only use case of remove header action in the driver. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250109160546.1733647-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-13 19:21:08 -08:00
Akiva Goldberger	6ca00ec47b	net/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers Add nic_cap_reg and vhca_icm_ctrl registers interfaces for exposing ICM consumption. Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250109204231.1809851-5-tariqt@nvidia.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-01-12 03:58:00 -05:00
Saeed Mahameed	df75ad562a	net/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps Read and cache SHAMPO specific caps for header data split capabilities. Will be used in downstream patch. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250109204231.1809851-4-tariqt@nvidia.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-01-12 03:58:00 -05:00
Jianbo Liu	e2685ef5f5	net/mlx5: Add support for MRTCQ register Management Real Time Clock Query (MRTCQ) register is used to query hardware clock identity. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250109204231.1809851-3-tariqt@nvidia.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-01-12 03:58:00 -05:00
Jianbo Liu	387bef82d0	net/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes Add FEC admin and override related fields in PPLM, and the bit in PCAM to indicate those fields are supported. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250109204231.1809851-2-tariqt@nvidia.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-01-12 03:58:00 -05:00
Vadim Fedorenko	e61e6c415b	net/mlx5: use do_aux_work for PHC overflow checks The overflow_work is using system wq to do overflow checks and updates for PHC device timecounter, which might be overhelmed by other tasks. But there is dedicated kthread in PTP subsystem designed for such things. This patch changes the work queue to proper align with PTP subsystem and to avoid overloading system work queue. The adjfine() function acts the same way as overflow check worker, we can postpone ptp aux worker till the next overflow period after adjfine() was called. Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250107104812.380225-1-vadfed@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-01-09 12:37:11 +01:00
Jakub Kicinski	385f186aba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.13-rc6). No conflicts. Adjacent changes: include/linux/if_vlan.h `f91a5b8089` ("af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK") `3f330db306` ("net: reformat kdoc return statements") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-03 16:29:29 -08:00
Linus Torvalds	aba74e639f	Nothing major here. Over the last two weeks we gathered only around two-thirds of our normal weekly fix count, but delaying sending these until -rc7 seemed like a really bad idea. AFAIK we have no bugs under investigation. One or two reverts for stuff for which we haven't gotten a proper fix will likely come in the next PR. Including fixes from wireles and netfilter. Current release - fix to a fix: - netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Previous releases - regressions: - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets - mptcp: - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust - fix TCP options overflow - prevent excessive coalescing on receive, fix throughput - net: fix memory leak in tcp_conn_request() if map insertion fails - wifi: cw1200: fix potential NULL dereference after conversion to GPIO descriptors - phy: micrel: dynamically control external clock of KSZ PHY, fix suspend behavior Previous releases - always broken: - af_packet: fix VLAN handling with MSG_PEEK - net: restrict SO_REUSEPORT to inet sockets - netdev-genl: avoid empty messages in NAPI get - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X - eth: gve: XDP fixes around transmit, queue wakeup etc. - eth: ti: icssg-prueth: fix firmware load sequence to prevent time jump which breaks timesync related operations Misc: - netlink: specs: mptcp: add missing attr and improve documentation Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmd4FCgACgkQMUZtbf5S IruB7hAAkppr4+4kywDq/PIqgkisjYzpNIQIrphebUeIHIcAizWbJEL9CiHXnCQD w95+nVhzDmsWXdzE2AvwM/2YAByLP6CpgKEa+7rI3K5BQZ86lbzm+ftOOFYiF3eV 2v5yLfjgMVbzwqzuP3ivkNrtG95IGQFPobLl7tI9Wpgm3yD0+H8BCqtZNAwA4N6g mEdZD1jMOzkdINoJBg35O70B2GDq/WSS1N8dgxtj1F7EPDuQdO1kijJmqFjYT3fw 30/NdjlfGtbFro6zp+8ZY6P9RG3CLNhYXu+nUqdOHJOJw65aXUxOPjWCRp3o6m12 HIIp22imDTvNTSCfn0wlbGJfC+v9HdIgB9gqApAaxNOS02bQgEy5tuxxXNTo2B1I bqH7GJ+6a7axsQ6BnpcOcDhu2fnbtguj1W/zC/MlGQmZZj2g+UNCf4QqPir/XkcG r43/iy2/L8cgZWYXzH1skfUMgbymI5uLytO+n+CLsb6P/N3WLa7wxHNV5Lni+Xng qDyd0TwUccCr2KBfZqjgFeGbvQ7qub+eeQCYx9kUAaFhKUsQB/eZ5i9C7SsfutMa xzwITqCkcvmdyU3gG3v/oABtQkxdTcRqWCfYDGcXm0zb1jzMf5ZGN4E/gABYM7p9 EewL1kCAt4ULmBc8Hqrj+2Erlzh+xsodvvfpgpJandMVXaJW44k= =YKS4 -----END PGP SIGNATURE----- Merge tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireles and netfilter. Nothing major here. Over the last two weeks we gathered only around two-thirds of our normal weekly fix count, but delaying sending these until -rc7 seemed like a really bad idea. AFAIK we have no bugs under investigation. One or two reverts for stuff for which we haven't gotten a proper fix will likely come in the next PR. Current release - fix to a fix: - netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Previous releases - regressions: - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets - mptcp: - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust - fix TCP options overflow - prevent excessive coalescing on receive, fix throughput - net: fix memory leak in tcp_conn_request() if map insertion fails - wifi: cw1200: fix potential NULL dereference after conversion to GPIO descriptors - phy: micrel: dynamically control external clock of KSZ PHY, fix suspend behavior Previous releases - always broken: - af_packet: fix VLAN handling with MSG_PEEK - net: restrict SO_REUSEPORT to inet sockets - netdev-genl: avoid empty messages in NAPI get - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X - eth: - gve: XDP fixes around transmit, queue wakeup etc. - ti: icssg-prueth: fix firmware load sequence to prevent time jump which breaks timesync related operations Misc: - netlink: specs: mptcp: add missing attr and improve documentation" * tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init net: ti: icssg-prueth: Fix firmware load sequence. mptcp: prevent excessive coalescing on receive mptcp: don't always assume copied data in mptcp_cleanup_rbuf() mptcp: fix recvbuffer adjust on sleeping rcvmsg ila: serialize calls to nf_register_net_hooks() af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK af_packet: fix vlan_get_tci() vs MSG_PEEK net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() net: restrict SO_REUSEPORT to inet sockets net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets net: sfc: Correct key_len for efx_tc_ct_zone_ht_params net: wwan: t7xx: Fix FSM command timeout issue sky2: Add device ID 11ab:4373 for Marvell 88E8075 mptcp: fix TCP options overflow. net: mv643xx_eth: fix an OF node reference leak gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup eth: bcmsysport: fix call balance of priv->clk handling routines net: llc: reset skb->transport_header netlink: specs: mptcp: fix missing doc ...	2025-01-03 14:36:54 -08:00
Mark Zhang	45d339fefa	RDMA/mlx5: Enable multiplane mode only when it is supported Driver queries vport_cxt.num_plane and enables multiplane when it is greater then 0, but some old FWs (versions from x.40.1000 till x.42.1000), report vport_cxt.num_plane = 1 unexpectedly. Fix it by querying num_plane only when HCA_CAP2.multiplane bit is set. Fixes: `2a5db20fa5` ("RDMA/mlx5: Add support to multi-plane device and port") Link: https://patch.msgid.link/r/1ef901acdf564716fcf550453cf5e94f343777ec.1734610916.git.leon@kernel.org Cc: stable@vger.kernel.org Reported-by: Francesco Poli <invernomuto@paranoici.org> Closes: https://lore.kernel.org/all/nvs4i2v7o6vn6zhmtq4sgazy2hu5kiulukxcntdelggmznnl7h@so3oul6uwgbl/ Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-01-03 09:17:19 -04:00
Jianbo Liu	2a4f56fbcc	net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only In the cited commit, when changing from switchdev to legacy mode, uplink representor's netdev is kept, and its profile is replaced with nic profile, so netdev is detached from old profile, then attach to new profile. During profile change, the hardware resources allocated by the old profile will be cleaned up. However, the cleanup is relying on the related kernel modules. And they may need to flush themselves first, which is triggered by netdev events, for example, NETDEV_UNREGISTER. However, netdev is kept, or netdev_register is called after the cleanup, which may cause troubles because the resources are still referred by kernel modules. The same process applies to all the caes when uplink is leaving switchdev mode, including devlink eswitch mode set legacy, driver unload and devlink reload. For the first one, it can be blocked and returns failure to users, whenever possible. But it's hard for the others. Besides, the attachment to nic profile is unnecessary as the netdev will be unregistered anyway for such cases. So in this patch, the original behavior is kept only for devlink eswitch set mode legacy. For the others, moves netdev unregistration before the profile change. Fixes: `7a9fb35e8c` ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:02 -08:00
Moshe Shemesh	95f68e06b4	net/mlx5: fs, add counter object to flow destination Currently mlx5_flow_destination includes counter_id which is assigned in case we use flow counter on the flow steering rule. However, counter_id is not enough data in case of using HW Steering. Thus, have mlx5_fc object as part of mlx5_flow_destination instead of counter_id and assign it where needed. In case counter_id is received from user space, create a local counter object to represent it. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:34:45 -08:00
Rongwei Liu	aeb3ec9902	net/mlx5: Add device cap abs_native_port_num When the abs_native_port_num is set, the native_port_num reported by the device may not be continuous and bigger than the num_lag_ports. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241212221329.961628-2-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 02:29:16 -05:00
Cosmin Ratiu	f09ed834a9	net/mlx5: qos: Add ifc support for cross-esw scheduling This adds the capability bit and the vport element fields related to cross-esw scheduling. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241204220931.254964-5-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 04:19:49 -05:00
Carolina Jubran	03713108e0	net/mlx5: Add support for new scheduling elements Introduce new scheduling elements in the E-Switch QoS hierarchy to enhance traffic management capabilities. This patch adds support for: - Rate Limit scheduling elements: Enables bandwidth limitation across multiple nodes without a shared ancestor, providing a mechanism for more granular control of bandwidth allocation. - Traffic Class Transmit Scheduling Arbiter (TSAR): Introduces the infrastructure for creating Traffic Class TSARs, allowing hierarchical arbitration based on traffic classes. - Traffic Class Arbiter TSAR: Adds support for a TSAR capable of managing arbitration between multiple traffic classes, enabling improved bandwidth prioritization and traffic management. No functional changes are introduced in this patch. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241204220931.254964-4-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 04:19:49 -05:00
Yevgeny Kliteynik	e799ac9dd3	net/mlx5: Add ConnectX-8 device to ifc In preparation for ConnectX-8 SWS support, add enum for the new device type. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241204220931.254964-3-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 04:19:49 -05:00
Cosmin Ratiu	f9a5b34f92	net/mlx5: ifc: Reorganize mlx5_ifc_flow_table_context_bits The nested union at the end is not in the same style as the rest of the code, so un-nest it to make the style uniformly applied again. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241204220931.254964-2-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 04:19:49 -05:00
Patrisious Haddad	e05feab22f	RDMA/mlx5: Enforce same type port association for multiport RoCE Different core device types such as PFs and VFs shouldn't be affiliated together since they have different capabilities, fix that by enforcing type check before doing the affiliation. Fixes: `32f69e4be2` ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/88699500f690dff1c1852c1ddb71f8a1cc8b956e.1733233480.git.leonro@nvidia.com Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:35:11 -05:00
Linus Torvalds	2a163a4cea	RDMA v6.13 merge window pull request Seveal fixes scattered across the drivers and a few new features: - Minor updates and bug fixes to hfi1, efa, iopob, bnxt, hns - Force disassociate the userspace FD when hns does an async reset - bnxt new features for optimized modify QP to skip certain stayes, CQ coalescing, better debug dumping - mlx5 new data placement ordering feature - Faster destruction of mlx5 devx HW objects - Improvements to RDMA CM mad handling -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZz4ENwAKCRCFwuHvBreF YQYQAP9R54r5J1Iylg+zqhCc+e/9oveuuZbfLvy/EJiEpmdprQEAgPs1RrB0z7U6 1xrVStUKNPhGd5XeVVZGkIV0zYv6Tw4= =V5xI -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Seveal fixes scattered across the drivers and a few new features: - Minor updates and bug fixes to hfi1, efa, iopob, bnxt, hns - Force disassociate the userspace FD when hns does an async reset - bnxt new features for optimized modify QP to skip certain stayes, CQ coalescing, better debug dumping - mlx5 new data placement ordering feature - Faster destruction of mlx5 devx HW objects - Improvements to RDMA CM mad handling" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (51 commits) RDMA/bnxt_re: Correct the sequence of device suspend RDMA/bnxt_re: Use the default mode of congestion control RDMA/bnxt_re: Support different traffic class IB/cm: Rework sending DREQ when destroying a cm_id IB/cm: Do not hold reference on cm_id unless needed IB/cm: Explicitly mark if a response MAD is a retransmission RDMA/mlx5: Move events notifier registration to be after device registration RDMA/bnxt_re: Cache MSIx info to a local structure RDMA/bnxt_re: Refurbish CQ to NQ hash calculation RDMA/bnxt_re: Refactor NQ allocation RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved RDMA/hns: Fix different dgids mapping to the same dip_idx RDMA/bnxt_re: Add set_func_resources support for P5/P7 adapters RDMA/bnxt_re: Enhance RoCE SRIOV resource configuration design bnxt_en: Add support for RoCE sriov configuration RDMA/hns: Fix NULL pointer derefernce in hns_roce_map_mr_sg() RDMA/hns: Fix out-of-order issue of requester when setting FENCE RDMA/nldev: Add IB device and net device rename events RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation RDMA/core: Move ib_uverbs_file struct to uverbs_types.h ...	2024-11-22 20:03:57 -08:00
Edward Srouji	8ab3138a9b	net/mlx5: Introduce data placement ordering bits Introduce out-of-order (OOO) data placement (DP) IFC related bits to support OOO DP QP. Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/f30e5cbb5459fd02f27f35909bb545cab346b58b.1725362773.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 03:10:16 -05:00
Cosmin Ratiu	016f426a14	net/mlx5: qos: Flesh out element_attributes in mlx5_ifc.h This is used for multiple purposes, depending on the scheduling element created. There are a few helper struct defined a long time ago, but they are not easy to find in the file and they are about to get new members. This commit cleans up this area a bit by: - moving the helper structs closer to where they are relevant. - defining a helper union to include all of them to help discoverability. - making use of it everywhere element_attributes is used. - using a consistent 'attr' name. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	d1c9cffe4b	net/mlx5: hw counters: Remove mlx5_fc_create_ex It no longer serves any purpose and is identical to mlx5_fc_create upon which it was originally based of. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-04 11:33:47 -07:00
Cosmin Ratiu	5acd957a98	net/mlx5: hw counters: Make fc_stats & fc_pool private The mlx5_fc_stats and mlx5_fc_pool structs are only used from fs_counters.c. As such, make them private there. mlx5_fc_pool is not used or referenced at all outside fs_counters. mlx5_fc_stats is referenced from mlx5_core_dev, so instead of having it as a direct member (which requires exporting it from fs_counters), store a pointer to it, allocate it on init and clear it on destroy. One caveat is that a simple container_of to get from a 'work' struct to the outermost mlx5_core_dev struct directly no longer works, so an extra pointer had to be added to mlx5_fc_stats back to the parent dev. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-04 11:33:46 -07:00
Jakub Kicinski	854e9bf5c5	mlx5-fixes-2024-09-25 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmb0b3IACgkQSD+KveBX +j5nQAf/Z9HoBKQfTtw4Ke8OQCmYEiri20D8p47c6XVXB6g7tZ+d2EphvtGmgN8V NifxP5qcovFR7pXX2qaJRzeCmJ9J2hn4mo/cXY0EycNmR02a6kD5QFLvbmCZcFP5 VpkLAtTu9QyjmI4BkrX5wyMeXOzTX/UxG3aDcsuUf9eyuTYfu/6GEGfTFUMX9vpJ P3V2grBQHDU0M+CuigLtpPynBfTHot+/p7zWu+mGRGINFhVYeXkMzqwgeGiFpfya hjfGaKs6gFzlEGxMQoU2wigEtZnIYQr7BgrPBxzw78FgkGCx8YShoh5fCKb33OjS 0j7jwy6gcEAlZwmuuiTPpQ4TOlM27Q== =6+q6 -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2024-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2024-09-25 * tag 'mlx5-fixes-2024-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice net/mlx5e: SHAMPO, Fix overflow of hd_per_wq net/mlx5: HWS, changed E2BIG error to a negative return code net/mlx5: HWS, fixed double-free in error flow of creating SQ net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() net/mlx5: Added cond_resched() to crdump collection net/mlx5: Fix error path in multi-packet WQE transmit ==================== Link: https://patch.msgid.link/20240925202013.45374-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-02 17:14:53 -07:00
Yevgeny Kliteynik	19da17010a	net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc Fixing the wrong size of a field in hca_cap_2. The bug was introduced by adding new fields for HWS and not fixing the reserved field size. Fixes: `34c626c300` ("net/mlx5: Added missing mlx5_ifc definition for HW Steering") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2024-09-25 13:15:45 -07:00
Linus Torvalds	54d7e8190e	RDMA v6.12 merge window Usual collection of small improvements and fixes: - Bug fixes and minor improvments in cxgb4, siw, mlx5, rxe, efa, rts, hfi, erdma, hns, irdma - Code cleanups/typos/etc. Tidy alloc_ordered_workqueue() calls - Multipath PCI for mlx5 - Variable size work queue, SRQ changes, and relaxed ordering for new bnxt HW - New ODP fault resolution FW protocol in mlx5 - New "rdma monitor" netlink mechanism -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZvGZowAKCRCFwuHvBreF YcvbAP9abSxZte3zzG1ZJ/6BShSCGJvu4RMMMQI6wNJWZZiJ5wEA18MdaWzGFS8O BzP48Z/0VGsd2MOfNX4JeyYIs7SNYQA= =FXLo -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Usual collection of small improvements and fixes, nothing especially stands out to me here. The new multipath PCI feature is a sign of things to come, I think we will see more of this in the next 10 years. Broadcom and HNS continue to update their drivers for their new HW generations. Summary: - Bug fixes and minor improvments in cxgb4, siw, mlx5, rxe, efa, rts, hfi, erdma, hns, irdma - Code cleanups/typos/etc. Tidy alloc_ordered_workqueue() calls - Multipath PCI for mlx5 - Variable size work queue, SRQ changes, and relaxed ordering for new bnxt HW - New ODP fault resolution FW protocol in mlx5 - New 'rdma monitor' netlink mechanism" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits) RDMA/bnxt_re: Remove the unused variable en_dev RDMA/nldev: Add missing break in rdma_nl_notify_err_msg() RDMA/irdma: fix error message in irdma_modify_qp_roce() RDMA/cxgb4: Added NULL check for lookup_atid RDMA/hns: Fix ah error counter in sw stat not increasing RDMA/bnxt_re: Recover the device when FW error is detected RDMA/bnxt_re: Group all operations under add_device and remove_device RDMA/bnxt_re: Use the aux device for L2 ULP callbacks RDMA/bnxt_re: Change aux driver data to en_info to hold more information RDMA/nldev: Expose whether RDMA monitoring is supported RDMA/nldev: Add support for RDMA monitoring RDMA/mlx5: Use IB set_netdev and get_netdev functions RDMA/device: Remove optimization in ib_device_get_netdev() RDMA/mlx5: Initialize phys_port_cnt earlier in RDMA device creation RDMA/mlx5: Obtain upper net device only when needed RDMA/mlx5: Check RoCE LAG status before getting netdev RDMA/mlx5: Consider the query_vuid cap for data_direct net/mlx5: Handle memory scheme ODP capabilities RDMA/mlx5: Add implicit MR handling to ODP memory scheme RDMA/mlx5: Add handling for memory scheme page fault events ...	2024-09-24 11:48:00 -07:00
Chiara Meiohas	8d159eb211	RDMA/mlx5: Use IB set_netdev and get_netdev functions The IB layer provides a common interface to store and get net devices associated to an IB device port (ib_device_set_netdev() and ib_device_get_netdev()). Previously, mlx5_ib stored and managed the associated net devices internally. Replace internal net device management in mlx5_ib with ib_device_set_netdev() when attaching/detaching a net device and ib_device_get_netdev() when retrieving the net device. Export ib_device_get_netdev(). For mlx5 representors/PFs/VFs and lag creation we replace the netdev assignments with the IB set/get netdev functions. In active-backup mode lag the active slave net device is stored in the lag itself. To assure the net device stored in a lag bond IB device is the active slave we implement the following: - mlx5_core: when modifying the slave of a bond we send the internal driver event MLX5_DRIVER_EVENT_ACTIVE_BACKUP_LAG_CHANGE_LOWERSTATE. - mlx5_ib: when catching the event call ib_device_set_netdev() This patch also ensures the correct IB events are sent in switchdev lag. While at it, when in multiport eswitch mode, only a single IB device is created for all ports. The said IB device will receive all netdev events of its VFs once loaded, thus to avoid overwriting the mapping of PF IB device to PF netdev, ignore NETDEV_REGISTER events if the ib device has already been mapped to a netdev. Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/20240909173025.30422-6-michaelgur@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-09-13 08:27:40 +03:00
Michael Guralnik	907936b6f4	net/mlx5: Handle memory scheme ODP capabilities When running over new FW that supports the new memory scheme ODP, set the cap in the FW to signal the FW we are working in the new scheme. In the memory scheme ODP the per_transport_service capabilities are RO for the driver so we skip their setting. Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/20240909100504.29797-9-michaelgur@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-09-13 08:27:30 +03:00
Shay Drory	5bd877093f	net/mlx5: Add NOT_READY command return status Add a new command status MLX5_CMD_STAT_NOT_READY to handle cases where the firmware is not ready. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20240911201757.1505453-14-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-09-12 20:50:29 -07:00
Moshe Shemesh	9947204cda	net/mlx5: Add device cap for supporting hot reset in sync reset flow New devices with new FW can support sync reset for firmware activate using hot reset. Add capability for supporting it and add MFRL field to query from FW which type of PCI reset method to use while handling sync reset events. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240911201757.1505453-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-09-12 20:50:29 -07:00
Moshe Shemesh	da2f660b3b	net/mlx5: fs, make get_root_namespace API function As preparation for HW Steering support, where the function get_root_namespace() is needed to get root FDB, make it an API function and rename it to mlx5_get_root_namespace(). Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20240911201757.1505453-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-09-12 20:50:28 -07:00

1 2 3 4 5 ...

1496 Commits (01f95500a162fca88cefab9ed64ceded5afabc12)