net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only

In the cited commit, when changing from switchdev to legacy mode,
uplink representor's netdev is kept, and its profile is replaced with
nic profile, so netdev is detached from old profile, then attach to
new profile.

During profile change, the hardware resources allocated by the old
profile will be cleaned up. However, the cleanup is relying on the
related kernel modules. And they may need to flush themselves first,
which is triggered by netdev events, for example, NETDEV_UNREGISTER.
However, netdev is kept, or netdev_register is called after the
cleanup, which may cause troubles because the resources are still
referred by kernel modules.

The same process applies to all the caes when uplink is leaving
switchdev mode, including devlink eswitch mode set legacy, driver
unload and devlink reload. For the first one, it can be blocked and
returns failure to users, whenever possible. But it's hard for the
others. Besides, the attachment to nic profile is unnecessary as the
netdev will be unregistered anyway for such cases.

So in this patch, the original behavior is kept only for devlink
eswitch set mode legacy. For the others, moves netdev unregistration
before the profile change.

Fixes: 7a9fb35e8c ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
pull/1114/head
Jianbo Liu 2024-12-20 10:15:05 +02:00 committed by Jakub Kicinski
parent 5a03b36856
commit 2a4f56fbcc
4 changed files with 35 additions and 2 deletions

View File

@ -6542,8 +6542,23 @@ static void _mlx5e_remove(struct auxiliary_device *adev)
mlx5_core_uplink_netdev_set(mdev, NULL);
mlx5e_dcbnl_delete_app(priv);
unregister_netdev(priv->netdev);
_mlx5e_suspend(adev, false);
/* When unload driver, the netdev is in registered state
* if it's from legacy mode. If from switchdev mode, it
* is already unregistered before changing to NIC profile.
*/
if (priv->netdev->reg_state == NETREG_REGISTERED) {
unregister_netdev(priv->netdev);
_mlx5e_suspend(adev, false);
} else {
struct mlx5_core_dev *pos;
int i;
if (test_bit(MLX5E_STATE_DESTROYING, &priv->state))
mlx5_sd_for_each_dev(i, mdev, pos)
mlx5e_destroy_mdev_resources(pos);
else
_mlx5e_suspend(adev, true);
}
/* Avoid cleanup if profile rollback failed. */
if (priv->profile)
priv->profile->cleanup(priv);

View File

@ -1509,6 +1509,21 @@ mlx5e_vport_uplink_rep_unload(struct mlx5e_rep_priv *rpriv)
priv = netdev_priv(netdev);
/* This bit is set when using devlink to change eswitch mode from
* switchdev to legacy. As need to keep uplink netdev ifindex, we
* detach uplink representor profile and attach NIC profile only.
* The netdev will be unregistered later when unload NIC auxiliary
* driver for this case.
* We explicitly block devlink eswitch mode change if any IPSec rules
* offloaded, but can't block other cases, such as driver unload
* and devlink reload. We have to unregister netdev before profile
* change for those cases. This is to avoid resource leak because
* the offloaded rules don't have the chance to be unoffloaded before
* cleanup which is triggered by detach uplink representor profile.
*/
if (!(priv->mdev->priv.flags & MLX5_PRIV_FLAGS_SWITCH_LEGACY))
unregister_netdev(netdev);
mlx5e_netdev_attach_nic_profile(priv);
}

View File

@ -3777,6 +3777,8 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
esw->eswitch_operation_in_progress = true;
up_write(&esw->mode_lock);
if (mode == DEVLINK_ESWITCH_MODE_LEGACY)
esw->dev->priv.flags |= MLX5_PRIV_FLAGS_SWITCH_LEGACY;
mlx5_eswitch_disable_locked(esw);
if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV) {
if (mlx5_devlink_trap_get_num_active(esw->dev)) {

View File

@ -524,6 +524,7 @@ enum {
* creation/deletion on drivers rescan. Unset during device attach.
*/
MLX5_PRIV_FLAGS_DETACH = 1 << 2,
MLX5_PRIV_FLAGS_SWITCH_LEGACY = 1 << 3,
};
struct mlx5_adev {