Commit Graph

25734 Commits (6a1636e06625ec0dd7f2b908ab39a8beea24bfd3)

Author SHA1 Message Date
Linus Torvalds 6a1636e066 SCSI misc on 20251214
The only core fix is in doc; all the others are in drivers, with the
 biggest impacts in libsas being the rollback on error handling and in
 ufs coming from a couple of error handling fixes, one causing a crash
 if it's activated before scanning and the other fixing W-LUN
 resumption.
 
 Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 
 iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaT4RWBsUgAAAAAAEAA5t
 YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy
 c2hpcC5jb20ACgkQ50LJTO6YrIVfSgEA23oY0kPjLLkNj4KAjLoWVPmZ/5efKw49
 m6YUwfvzTS8BAIq4eFHVecwAM8FAe9NwuOKgO5d9u6epAy6wzBeP2jly
 =cpHq
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "The only core fix is in doc; all the others are in drivers, with the
  biggest impacts in libsas being the rollback on error handling and in
  ufs coming from a couple of error handling fixes, one causing a crash
  if it's activated before scanning and the other fixing W-LUN
  resumption"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: qcom: Fix confusing cleanup.h syntax
  scsi: libsas: Add rollback handling when an error occurs
  scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
  scsi: ufs: core: Fix a deadlock in the frequency scaling code
  scsi: ufs: core: Fix an error handler crash
  scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
  scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies
  scsi: qla4xxx: Use time conversion macros
  scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
  scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
  scsi: imm: Fix use-after-free bug caused by unfinished delayed work
  scsi: target: sbp: Remove KMSG_COMPONENT macro
  scsi: core: Correct documentation for scsi_device_quiesce()
  scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
  scsi: target: Reset t_task_cdb pointer in error case
  scsi: ufs: core: Fix EH failure after W-LUN resume error
2025-12-14 15:35:35 +12:00
Chaohai Chen 362432e9b9 scsi: libsas: Add rollback handling when an error occurs
In sas_register_phys(), if an error is triggered in the loop process, we
need to roll back the resources that have already been requested.

Add sas_unregister_phys() when an error occurs in sas_register_ha().

[mkp: a few coding style tweaks and address John's comment]

Signed-off-by: Chaohai Chen <wdhh6@aliyun.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20251206060616.69216-1-wdhh6@aliyun.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-12-08 22:08:31 -05:00
Benjamin Marzinski fd81bc5cca scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
If scsi_dh_attached_handler_name() fails to allocate the handler name,
dm-multipath (its only caller) assumes there is no attached device
handler, and sets the device up incorrectly. Return an error pointer
instead, so multipath can distinguish between failure, success where
there is no attached device handler, or when the path device is not a
SCSI device at all.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Link: https://patch.msgid.link/20251206010015.1595225-1-bmarzins@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-12-08 22:04:38 -05:00
Linus Torvalds 4482ebb297 block-6.19-20251208
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmk3KZsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkNYD/91yqAJeehx2Heq3dWj9L8hDuETQelj/g9j
 gtZCiriAPy+bb1/BmWjK+BmvjtBt+g3a4Cwi6tVj4F1zoE46IPeLhO+2iJTEBiBq
 AhRtEf/MFXFK3qUnTpEnS8w3CtsXejOTB81VQ+6BysSu+B708m/1AQHv2HocZ37R
 jivrzfCsEdBr+ISwYw/EG5KcDBVTFo/JdXIhs7k4Z8bBfa3P5ye4EhKjORtgbFNU
 5nXb78SZoWNCZF143YV++9MpZc3M2jzkzrk1CTLsUHhOxWg4T/6wTXfPGZc/W4m8
 UBhs03u/gMJnKHhlZd4kpZWDito1TQZTdY2f5sBsysRQqeT7bwDK/1xiQ1nllZiP
 oYbeD6t65yMAlELwNFXo7y/DNcS2VLBMvChIX6p1gweEzyf23YneoHYyN5agEQlN
 9C4EdcYzZRt0DwtHlIRtKvDk2LZzkJAcLau3D6ahU/DPLOawyWZKmvGiU+sSyJjF
 bEIO5c/+MLqkAgLAGaFgA4twFF1aYH9ssmJerDxprarkf1jtlOBLvUQ391Gtb5Hd
 B1yugmIgEwLbCFzhk9FlCtv2nQcWRCElnaeqv+Lv+xCBVPGCLm2qIHoTqmvHZPCd
 GbN/h0XLdgUboYPCFWVAX72/4K/cv+fQQcb+a7tiq6vMKcgJ/2I1szFGpFqz7azB
 hyiK0v3x2g==
 =r1xa
 -----END PGP SIGNATURE-----

Merge tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block updates from Jens Axboe:
 "Followup set of fixes and updates for block for the 6.19 merge window.

  NVMe had some late minute debates which lead to dropping some patches
  from that tree, which is why the initial PR didn't have NVMe included.
  It's here now. This pull request contains:

   - NVMe pull request via Keith:
       - Subsystem usage cleanups (Max)
       - Endpoint device fixes (Shin'ichiro)
       - Debug statements (Gerd)
       - FC fabrics cleanups and fixes (Daniel)
       - Consistent alloc API usages (Israel)
       - Code comment updates (Chu)
       - Authentication retry fix (Justin)

   - Fix a memory leak in the discard ioctl code, if the task is being
     interrupted by a signal at just the wrong time

   - Zoned write plugging fixes

   - Add ioctls for for persistent reservations

   - Enable per-cpu bio caching by default

   - Various little fixes and tweaks"

* tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (27 commits)
  nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
  nvme-auth: use kvfree() for memory allocated with kvcalloc()
  nvmet-tcp: use kvcalloc for commands array
  nvmet-rdma: use kvcalloc for commands and responses arrays
  nvme: fix typo error in nvme target
  nvmet-fc: use pr_* print macros instead of dev_*
  nvmet-fcloop: remove unused lsdir member.
  nvmet-fcloop: check all request and response have been processed
  nvme-fc: check all request and response have been processed
  block: fix memory leak in __blkdev_issue_zero_pages
  block: fix comment for op_is_zone_mgmt() to include RESET_ALL
  block: Clear BLK_ZONE_WPLUG_PLUGGED when aborting plugged BIOs
  blk-mq: Abort suspend when wakeup events are pending
  blk-mq: add blk_rq_nr_bvec() helper
  block: add IOC_PR_READ_RESERVATION ioctl
  block: add IOC_PR_READ_KEYS ioctl
  nvme: reject invalid pr_read_keys() num_keys values
  scsi: sd: reject invalid pr_read_keys() num_keys values
  block: enable per-cpu bio cache by default
  block: use bio_alloc_bioset for passthru IO by default
  ...
2025-12-09 08:53:24 +09:00
Linus Torvalds 7eb7f5723d SCSI misc on 20251204
Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
 cleanups and fixes including the WQ_PERCPU series.  The biggest core
 change is the new allocation of pseudo-devices which allow the sending
 of internal commands to a given SCSI target.
 
 Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 
 iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaTJf+BsUgAAAAAAEAA5t
 YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy
 c2hpcC5jb20ACgkQ50LJTO6YrIUC9QEA+q+UqGr7hPTs9C4kdLoxjDpG6tmXo11H
 ZkuXKjR5rDABAPPe3HV0Bk9C9jpLVPLp+fGJvgw//rib+XtYMjwxAJ71
 =bQZL
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
  cleanups and fixes including the WQ_PERCPU series.

  The biggest core change is the new allocation of pseudo-devices which
  allow the sending of internal commands to a given SCSI target"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits)
  scsi: MAINTAINERS: Add the UFS include directory
  scsi: scsi_debug: Support injecting unaligned write errors
  scsi: qla2xxx: Fix improper freeing of purex item
  scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB
  scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify
  scsi: ufs: core: Use scsi_device_busy()
  scsi: ufs: core: Fix single doorbell mode support
  scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
  scsi: target: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
  scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
  scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
  scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
  scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users
  scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
  scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users()
  scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
  scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
  scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq
  ...
2025-12-05 19:56:50 -08:00
Linus Torvalds 43dfc13ca9 pci-v6.19-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmkwoyoUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vztdBAAjvZO0NOafCbhn6lUAz/T4VxPY0R7
 L5RqeLci33rQzbQ0yhJYXsd8VemMo6Zk0qmlSwjddlOMPboHoC1jK3i4C16QoP+3
 R5ecab0VoImLl2Ffig0BZoHQpVq01q2kTGQ2YyrryzDCgBCsBG3U10ZD380pGsTW
 ypqEgOCxaQCq2mtqr5CavaCcquq2krrnHkkVQOP1ryWzRq1C3wDXcQXFYNdzXpDP
 Lq8pBIh8WN5pYwrqjrFMrtNhj7BHPmowLEaAbNIWmH8WjGav624XcKq2O+arx3Hl
 BDHFKVjWtiYikrWmAODZhlY3HgEj546h2HtQYwWPOKuSKkgzNVn28LFIDgL3oXDP
 sJ6gWIDWMgRpEI6VzmqzRXJWbTAkIrRfHv3QFzvATSZV7eHk3eUpMZtG/qOi5z3P
 rPwW7NSXKbPg5qi+zKpqC20Im1Wm6fF4qQtim3uNlz2KXlVGvQl5/Ww27nIZMn6B
 Kkv5RGzdodePbKGqd+CADAQoJOpR6kOng5ZaRdZx+6aTTooyM7KSk8bFq7j/CoYM
 PzVtO6IHIvV42di7H/NP8/qtQA3xwSLue3lWpTh+tzYmZvCdldZiIvBo9YXnlsEM
 kCetDIGBxUWBj7OdYK4hC1BPBVw3XtCWM/51ElslDZTPvT543lApBfMnVTv3JmFp
 gzvZ7XfZx443JOg=
 =qSzr
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
     Williams)

   - Switch vmd from custom domain number allocator to the common
     allocator to prevent a potential race with new non-VMD buses (Dan
     Williams)

   - Enable Precision Time Measurement (PTM) only if device advertises
     support for a relevant role, to prevent invalid PTM Requests that
     cause ACS violations that are reported as AER Uncorrectable
     Non-Fatal errors (Mika Westerberg)

  Resource management:

   - Prevent resource tree corruption when BAR resize fails (Ilpo
     Järvinen)

   - Restore BARs to the original size if a BAR resize fails (Ilpo
     Järvinen)

   - Remove BAR release from BAR resize attempts by the xe, i915, and
     amdgpu drivers so the PCI core can restore BARs if the resize fails
     (Ilpo Järvinen)

   - Move Resizable BAR code to rebar.c (Ilpo Järvinen)

   - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
     Järvinen)

   - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
     Järvinen)

  Power management and error handling:

   - For drivers using PCI legacy suspend, save config state at suspend
     so that state (not any earlier state from enumeration, probe, or
     error recovery) will be restored when resuming (Lukas Wunner)

   - For devices with no driver or a driver that lacks power management,
     save config state at hibernate so that state (not any earlier state
     from enumeration, probe, or error recovery) will be restored when
     resuming (Lukas Wunner)

   - Save device config space on device addition, before driver binding,
     so error recovery works more reliably (Lukas Wunner)

   - Drop pci_save_state() from several drivers that no longer need it
     since the PCI core always does it and pci_restore_state() no longer
     invalidates the saved state (Lukas Wunner)

   - Document use of pci_save_state() by drivers to capture the state
     they want restored during error recovery (Lukas Wunner)

  Power control:

   - Add a struct pci_ops.assert_perst() function pointer to
     assert/deassert PCIe PERST# and implement it for the qcom driver
     (Krishna Chaitanya Chundru)

   - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
     switch, which must be held in reset after poweron so the pwrctrl
     driver can configure the switch via I2C before bringing up the
     links (Krishna Chaitanya Chundru)

  Endpoint framework:

   - Convert the endpoint doorbell test to use a threaded IRQ to fix a
     'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)

   - Add endpoint VNTB MSI doorbell support to reduce latency between
     host and endpoint (Frank Li)

  New native PCIe controller drivers:

   - Add CIX Sky1 host controller DT binding and driver (Hans Zhang)

   - Add NXP S32G host controller DT binding and driver (Vincent
     Guittot)

   - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
     Beznea)

   - Add SpacemiT K1 host controller DT binding and driver (Alex Elder)

  Amlogic Meson PCIe controller driver:

   - Update DT binding to name DBI region 'dbi', not 'elbi', and update
     driver to support both (Manivannan Sadhasivam)

  Apple PCIe controller driver:

   - Move struct pci_host_bridge allocation from pci_host_common_init()
     to callers, which significantly simplifies pcie-apple (Marc
     Zyngier)

  Broadcom STB PCIe controller driver:

   - Disable advertising ASPM L0s support correctly (Jim Quinlan)

   - Add a panic/die handler to print diagnostic info in case PCIe
     caused an unrecoverable abort (Jim Quinlan)

  Cadence PCIe controller driver:

   - Add module support for Cadence platform host and endpoint
     controller driver (Manikandan K Pillai)

   - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
     for new CIX Sky1 driver (Manikandan K Pillai)

  MediaTek PCIe controller driver:

   - Convert DT binding to YAML schema (Christian Marangi)

   - Add Airoha AN7583 DT compatible and driver support (Christian
     Marangi)

  Qualcomm PCIe controller driver:

   - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)

   - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
     sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
     schemas (Krzysztof Kozlowski)

   - Look up OPP using both frequency and data rate (not just frequency)
     so RPMh votes can account for both (Krishna Chaitanya Chundru)

  Rockchip DesignWare PCIe controller driver:

   - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)

  STMicroelectronics STM32MP25 PCIe controller driver:

   - Fix a race between link training and endpoint register
     initialization (Christian Bruel)

   - Align endpoint allocations to match the ATU requirements (Christian
     Bruel)

  Synopsys DesignWare PCIe controller driver:

   - Clear L1 PM Substate Capability 'Supported' bits unless glue driver
     says it's supported, which prevents users from enabling non-working
     L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)

   - Remove now-superfluous L1SS disable code from tegra194 (Bjorn
     Helgaas)

   - Configure L1SS support in dw-rockchip when DT says
     'supports-clkreq' (Shawn Lin)

  TI Keystone PCIe controller driver:

   - Fail the probe instead of silently succeeding if ks_pcie_of_data
     didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)

   - Make keystone buildable as a loadable module, except on ARM32 where
     hook_fault_code() is __init (Siddharth Vadapalli)"

* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
  MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
  MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
  PCI: sky1: Add PCIe host support for CIX Sky1
  dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
  PCI: cadence: Add support for High Perf Architecture (HPA) controller
  MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
  PCI: s32g: Add NXP S32G PCIe controller driver (RC)
  PCI: dwc: Add register and bitfield definitions
  dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
  PCI: Add Renesas RZ/G3S host controller driver
  PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
  dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
  PCI: Validate pci_rebar_size_supported() input
  Documentation: PCI: Amend error recovery doc with pci_save_state() rules
  treewide: Drop pci_save_state() after pci_restore_state()
  PCI/ERR: Ensure error recoverability at all times
  PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
  PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
  PCI: dw-rockchip: Configure L1SS support
  PCI: tegra194: Remove unnecessary L1SS disable code
  ...
2025-12-04 17:29:41 -08:00
Stefan Hajnoczi ab4fb1d8f6 scsi: sd: reject invalid pr_read_keys() num_keys values
The pr_read_keys() interface has a u32 num_keys parameter. The SCSI
PERSISTENT RESERVE IN command has a maximum READ KEYS service action
size of 65536 bytes. Reject num_keys values that are too large to fit
into the SCSI command.

This will become important when pr_read_keys() is exposed to untrusted
userspace via an <linux/pr.h> ioctl.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-04 07:19:26 -07:00
Linus Torvalds cc25df3e2e for-6.19/block-20251201
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmktsoMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpuiUD/92eivL+HmOh10o8trvxajB0yuyqfSjHHrL
 g+xUbF4s9bgAg/v+Upx7sTY8jdrTcMjKov+G9T6uPvBMqVmeVdZckA1PSAKQaIX1
 Zb7nS2LnO7F6JKbwpwVrrIaqVbcz8MfGIIMbN4yNNEOMCwdIVMp4fo7trPBknJNx
 WddNSGUFlIF3NqSI8AflSS/pYnGm+McfBHXBpJAKipI3iquKKubHv+FX9kLp7Tn4
 x27ZoCWOHglIBTJXU0mmXCVsLF8b5BA8DQcGtT62azb8+l0cRTkaHY0DFAv5BvhG
 TqcjrKdmR0cGSNt+nEmFrujE3atBRl0G0kiHA80YgA1MTtYzdPaUVOUtM9k/rEem
 gpiGMDpBypdxyJAyijPSaVJdfcg0psOlYbhIR4N2wbj/dq8268h+cWzXlF1spgVt
 /7ygoaCmfMNbTy9rKThTjH+es787AVXUAXXaPHhIFsnCKUj8xQl4pT7XltmgYeWx
 1/XD1NEJeLHHog5upAVlGX3H5tbvP1nIICxbZa9mDOJX1rwxxI7/s/RucPjbNXuY
 AiaKPTfxtB9+Ihd2HrJ/76RVMkckcOBc4GIKoFfwuKDbcdLXQ5FcZCmVRoI1V9SV
 KsH7JBgihLwR9XWKE1vp9+CBNe1Qlu3K4IjG/E7CNLeuDntIBu73ihqGP/DqV6Bq
 RX1Dc0OyAQ==
 =m22w
 -----END PGP SIGNATURE-----

Merge tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block updates from Jens Axboe:

 - Fix head insertion for mq-deadline, a regression from when priority
   support was added

 - Series simplifying and improving the ublk user copy code

 - Various ublk related cleanups

 - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the
   request is punted to a thread for handling

 - Merge and then later revert loop dio nowait support, as it ended up
   causing excessive stack usage for when the inline issue code needs to
   dip back into the full file system code

 - Improve auto integrity code, making it less deadlock prone

 - Speedup polled IO handling, but manually managing the hctx lookups

 - Fixes for blk-throttle for SSD devices

 - Small series with fixes for the S390 dasd driver

 - Add support for caching zones, avoiding unnecessary report zone
   queries

 - MD pull requests via Yu:
      - fix null-ptr-dereference regression for dm-raid0
      - fix IO hang for raid5 when array is broken with IO inflight
      - remove legacy 1s delay to speed up system shutdown
      - change maintainer's email address
      - data can be lost if array is created with different lbs devices,
        fix this problem and record lbs of the array in metadata
      - fix rcu protection for md_thread
      - fix mddev kobject lifetime regression
      - enable atomic writes for md-linear
      - some cleanups

 - bcache updates via Coly
      - remove useless discard and cache device code
      - improve usage of per-cpu workqueues

 - Reorganize the IO scheduler switching code, fixing some lockdep
   reports as well

 - Improve the block layer P2P DMA support

 - Add support to the block tracing code for zoned devices

 - Segment calculation improves, and memory alignment flexibility
   improvements

 - Set of prep and cleanups patches for ublk batching support. The
   actual batching hasn't been added yet, but helps shrink down the
   workload of getting that patchset ready for 6.20

 - Fix for how the ps3 block driver handles segments offsets

 - Improve how block plugging handles batch tag allocations

 - nbd fixes for use-after-free of the configuration on device clear/put

 - Set of improvements and fixes for zloop

 - Add Damien as maintainer of the block zoned device code handling

 - Various other fixes and cleanups

* tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
  block/rnbd: correct all kernel-doc complaints
  blk-mq: use queue_hctx in blk_mq_map_queue_type
  md: remove legacy 1s delay in md_notify_reboot
  md/raid5: fix IO hang when array is broken with IO inflight
  md: warn about updating super block failure
  md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid
  sbitmap: fix all kernel-doc warnings
  ublk: add helper of __ublk_fetch()
  ublk: pass const pointer to ublk_queue_is_zoned()
  ublk: refactor auto buffer register in ublk_dispatch_req()
  ublk: add `union ublk_io_buf` with improved naming
  ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg()
  kfifo: add kfifo_alloc_node() helper for NUMA awareness
  blk-mq: fix potential uaf for 'queue_hw_ctx'
  blk-mq: use array manage hctx map instead of xarray
  ublk: prevent invalid access with DEBUG
  s390/dasd: Use scnprintf() instead of sprintf()
  s390/dasd: Move device name formatting into separate function
  s390/dasd: Remove unnecessary debugfs_create() return checks
  s390/dasd: Fix gendisk parent after copy pair swap
  ...
2025-12-03 19:26:18 -08:00
Linus Torvalds 4d38b88fd1 printk changes for 6.19
-----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmktlbUbFIAAAAAABAAO
 bWFudTIsMi41KzEuMTEsMiwyAAoJEFKgDEdIgJTyevsP/1z98/wfCaSCquIq4H8S
 OTqFGybGgYQt1NmMj2cGPpbAE3LJNYORT0A4tcoqOTy1Z5xbQz63rO3clSI/e7Mf
 n4ZZ7NvkE40i8et1BjqtZa9dSkAv4QLYH73KrtNeuTr5tqvHo1x8FakUH6gQnb1k
 QOOebvbVXnOb+rh89j1GZShrLFcCil0psjp165WHAYE/3PyFBgYGLMCgwLqS+W3H
 re5Q4sl/ySXpMFF/XN1Kww48FWxy/h+YQFCxZwuWlUcXtVjqZ+BN+keb7AqaFQ7R
 dC2exV2W0RBoupEJR/FWHoXrm/bDDLhzqRaMvoggLJrMJ9L6V0WdIhaFA4qzoG63
 paJGFjUfmDX3dpPsAddq7kKeevCz4a2/HwFKhiBqqq4tdHuely7wZgnoFO7ovgmu
 DYDCXHtpJuWZR3WJ5I/V/sJ9i9KFXhhyWcKVf13QTAFiCaA09aeSAcUWNYNaaxbn
 nu6IkUxdIVnWIEBgcYH6jz1DrPGreYLYuD4bVb2gdZoP0r3tnMpG6xfSNIUueSGd
 VFAKW9PJYaj7Id+jgACH6V+gQ22L600xJDdL1bPjRbGE0LD7vlz2F1MZTq3BFJFn
 hUxJeOZplHX+TPophdvH4MO9VLmydWLUyJiDBP1yA8M9XZms/5s7IJJ1RYXqUCcf
 qEB4L7W1+Qy1R/lzf2PU9X4R
 =FnfO
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Allow creaing nbcon console drivers with an unsafe write_atomic()
   callback that can only be called by the final nbcon_atomic_flush_unsafe().
   Otherwise, the driver would rely on the kthread.

   It is going to be used as the-best-effort approach for an
   experimental nbcon netconsole driver, see

     https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org

   Note that a safe .write_atomic() callback is supposed to work in NMI
   context. But some networking drivers are not safe even in IRQ
   context:

     https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35

   In an ideal world, all networking drivers would be fixed first and
   the atomic flush would be blocked only in NMI context. But it brings
   the question how reliable networking drivers are when the system is
   in a bad state. They might block flushing more reliable serial
   consoles which are more suitable for serious debugging anyway.

 - Allow to use the last 4 bytes of the printk ring buffer.

 - Prevent queuing IRQ work and block printk kthreads when consoles are
   suspended. Otherwise, they create non-necessary churn or even block
   the suspend.

 - Release console_lock() between each record in the kthread used for
   legacy consoles on RT. It might significantly speed up the boot.

 - Release nbcon context between each record in the atomic flush. It
   prevents stalls of the related printk kthread after it has lost the
   ownership in the middle of a record

 - Add support for NBCON consoles into KDB

 - Add %ptsP modifier for printing struct timespec64 and use it where
   possible

 - Misc code clean up

* tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits)
  printk: Use console_is_usable on console_unblank
  arch: um: kmsg_dump: Use console_is_usable
  drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT
  lib/vsprintf: Unify FORMAT_STATE_NUM handlers
  printk: Avoid irq_work for printk_deferred() on suspend
  printk: Avoid scheduling irq_work on suspend
  printk: Allow printk_trigger_flush() to flush all types
  tracing: Switch to use %ptSp
  scsi: snic: Switch to use %ptSp
  scsi: fnic: Switch to use %ptSp
  s390/dasd: Switch to use %ptSp
  ptp: ocp: Switch to use %ptSp
  pps: Switch to use %ptSp
  PCI: epf-test: Switch to use %ptSp
  net: dsa: sja1105: Switch to use %ptSp
  mmc: mmc_test: Switch to use %ptSp
  media: av7110: Switch to use %ptSp
  ipmi: Switch to use %ptSp
  igb: Switch to use %ptSp
  e1000e: Switch to use %ptSp
  ...
2025-12-03 12:42:36 -08:00
Xingui Yang 278712d20b scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
This reverts commit ab2068a6fb.

When probing the exp-attached sata device, libsas/libata will issue a
hard reset in sas_probe_sata() -> ata_sas_async_probe(), then a
broadcast event will be received after the disk probe fails, and this
commit causes the probe will be re-executed on the disk, and a faulty
disk may get into an indefinite loop of probe.

Therefore, revert this commit, although it can fix some temporary issues
with disk probe failure.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20251202065627.140361-1-yangxingui@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-12-03 01:36:13 -05:00
Shi Hao 9086cac895 scsi: qla4xxx: Use time conversion macros
Replace the raw use of 500 value in schedule_timeout() function with
msecs_to_jiffies() to ensure intended value across different kernel
configurations regardless of HZ value.

Signed-off-by: Shi Hao <i.shihao.999@gmail.com>
Link: https://patch.msgid.link/20251117200949.42557-1-i.shihao.999@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 16:37:44 -05:00
Wen Xiong eaea513077 scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
A dynamic remove/add storage adapter test hits EEH on PowerPC:

  EEH: [c00000000004f77c] __eeh_send_failure_event+0x7c/0x160
  EEH: [c000000000048464] eeh_dev_check_failure.part.0+0x254/0x660
  EEH: [c000000000934e0c] __pci_read_msi_msg+0x1ac/0x280
  EEH: [c000000000100f68] pseries_msi_compose_msg+0x28/0x40
  EEH: [c00000000020e1cc] irq_chip_compose_msi_msg+0x5c/0x90
  EEH: [c000000000214b1c] msi_domain_set_affinity+0xbc/0x100
  EEH: [c000000000206be4] irq_do_set_affinity+0x214/0x2c0
  EEH: [c000000000206e04] irq_set_affinity_locked+0x174/0x230
  EEH: [c000000000207044] irq_set_affinity+0x64/0xa0
  EEH: [c000000000212890] write_irq_affinity.constprop.0.isra.0+0x130/0x150
  EEH: [c00000000068868c] proc_reg_write+0xfc/0x160
  EEH: [c0000000005adb48] vfs_write+0xf8/0x4e0
  EEH: [c0000000005ae234] ksys_write+0x84/0x140
  EEH: [c00000000002e994] system_call_exception+0x164/0x310
  EEH: [c00000000000bfe8] system_call_vectored_common+0xe8/0x278

The irqbalance daemon kicks in before invoking qla2xxx->slot_reset
during the EEH recovery process.

  irqbalance daemon
  ->irq_set_affinity()
  ->msi_domain_set_affinity()
  ->irq_chip_set_affiinity_parent()
  ->xive_irq_set_affinity()
  ->pseries_msi_compose_ms()
  ->__pci_read_msi_msg()
  ->irq_chip_compose_msi_msg()

In __pci_read_msi_msg(), the first MSI-X vector is set to all F by the
irqbalance daemon.  pci_write_msg_msix: index=0, lo=ffffffff hi=fffffff

IRQ balancing is not required during adapter reset.

Enable "IRQ_NO_BALANCING" bit before starting adapter reset and disable
it calling pci_restore_state(). The irqbalance daemon is disabled for
this short period of time (~2s).

Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Link: https://patch.msgid.link/20251028142427.3969819-3-wenxiong@linux.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 15:56:18 -05:00
Wen Xiong 6ac3484fb1 scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
A dynamic remove/add storage adapter test hits EEH on PowerPC:

  EEH: [c00000000004f75c] __eeh_send_failure_event+0x7c/0x160
  EEH: [c000000000048444] eeh_dev_check_failure.part.0+0x254/0x650
  EEH: [c008000001650678] eeh_readl+0x60/0x90 [ipr]
  EEH: [c00800000166746c] ipr_cancel_op+0x2b8/0x524 [ipr]
  EEH: [c008000001656524] ipr_eh_abort+0x6c/0x130 [ipr]
  EEH: [c000000000ab0d20] scmd_eh_abort_handler+0x140/0x440
  EEH: [c00000000017e558] process_one_work+0x298/0x590
  EEH: [c00000000017eef8] worker_thread+0xa8/0x620
  EEH: [c00000000018be34] kthread+0x124/0x130
  EEH: [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64

A PCIe bus trace reveals that a vector of MSI-X is cleared to 0 by
irqbalance daemon. If we disable irqbalance daemon, we won't see the
issue.

With debug enabled in ipr driver:

  [   44.103071] ipr: Entering __ipr_remove
  [   44.103083] ipr: Entering ipr_initiate_ioa_bringdown
  [   44.103091] ipr: Entering ipr_reset_shutdown_ioa
  [   44.103099] ipr: Leaving ipr_reset_shutdown_ioa
  [   44.103105] ipr: Leaving ipr_initiate_ioa_bringdown
  [   44.149918] ipr: Entering ipr_reset_ucode_download
  [   44.149935] ipr: Entering ipr_reset_alert
  [   44.150032] ipr: Entering ipr_reset_start_timer
  [   44.150038] ipr: Leaving ipr_reset_alert
  [   44.244343] scsi 1:2:3:0: alua: Detached
  [   44.254300] ipr: Entering ipr_reset_start_bist
  [   44.254320] ipr: Entering ipr_reset_start_timer
  [   44.254325] ipr: Leaving ipr_reset_start_bist
  [   44.364329] scsi 1:2:4:0: alua: Detached
  [   45.134341] scsi 1:2:5:0: alua: Detached
  [   45.860949] ipr: Entering ipr_reset_shutdown_ioa
  [   45.860962] ipr: Leaving ipr_reset_shutdown_ioa
  [   45.860966] ipr: Entering ipr_reset_alert
  [   45.861028] ipr: Entering ipr_reset_start_timer
  [   45.861035] ipr: Leaving ipr_reset_alert
  [   45.964302] ipr: Entering ipr_reset_start_bist
  [   45.964309] ipr: Entering ipr_reset_start_timer
  [   45.964313] ipr: Leaving ipr_reset_start_bist
  [   46.264301] ipr: Entering ipr_reset_bist_done
  [   46.264309] ipr: Leaving ipr_reset_bist_done

During adapter reset, ipr device driver blocks config space access but
can't block MMIO access for MSI-X entries.  There is very small window:
irqbalance daemon kicks in during adapter reset before ipr driver calls
pci_restore_state(pdev) to restore MSI-X table.

irqbalance daemon reads back all 0 for that MSI-X vector in
__pci_read_msi_msg().

irqbalance daemon:

  msi_domain_set_affinity()
  ->irq_chip_set_affinity_patent()
  ->xive_irq_set_affinity()
  ->irq_chip_compose_msi_msg()
    ->pseries_msi_compose_msg()
    ->__pci_read_msi_msg(): read all 0 since didn't call pci_restore_state
  ->irq_chip_write_msi_msg()
    -> pci_write_msg_msi(): write 0 to the msix vector entry

When ipr driver calls pci_restore_state(pdev) in
ipr_reset_restore_cfg_space(), the MSI-X vector entry has been cleared
by irqbalance daemon in pci_write_msg_msix().

  pci_restore_state()
  ->__pci_restore_msix_state()

Below is the MSI-X table for ipr adapter after irqbalance daemon kicked
in during adapter reset:

  Dump MSIx table: index=0 address_lo=c800 address_hi=10000000 msg_data=0
  Dump MSIx table: index=1 address_lo=c810 address_hi=10000000 msg_data=0
  Dump MSIx table: index=2 address_lo=c820 address_hi=10000000 msg_data=0
  Dump MSIx table: index=3 address_lo=c830 address_hi=10000000 msg_data=0
  Dump MSIx table: index=4 address_lo=c840 address_hi=10000000 msg_data=0
  Dump MSIx table: index=5 address_lo=c850 address_hi=10000000 msg_data=0
  Dump MSIx table: index=6 address_lo=c860 address_hi=10000000 msg_data=0
  Dump MSIx table: index=7 address_lo=c870 address_hi=10000000 msg_data=0
  Dump MSIx table: index=8 address_lo=0 address_hi=0 msg_data=0
  ---------> Hit EEH since msix vector of index=8 are 0
  Dump MSIx table: index=9 address_lo=c890 address_hi=10000000 msg_data=0
  Dump MSIx table: index=10 address_lo=c8a0 address_hi=10000000 msg_data=0
  Dump MSIx table: index=11 address_lo=c8b0 address_hi=10000000 msg_data=0
  Dump MSIx table: index=12 address_lo=c8c0 address_hi=10000000 msg_data=0
  Dump MSIx table: index=13 address_lo=c8d0 address_hi=10000000 msg_data=0
  Dump MSIx table: index=14 address_lo=c8e0 address_hi=10000000 msg_data=0
  Dump MSIx table: index=15 address_lo=c8f0 address_hi=10000000 msg_data=0

  [   46.264312] ipr: Entering ipr_reset_restore_cfg_space
  [   46.267439] ipr: Entering ipr_fail_all_ops
  [   46.267447] ipr: Leaving ipr_fail_all_ops
  [   46.267451] ipr: Leaving ipr_reset_restore_cfg_space
  [   46.267454] ipr: Entering ipr_ioa_bringdown_done
  [   46.267458] ipr: Leaving ipr_ioa_bringdown_done
  [   46.267467] ipr: Entering ipr_worker_thread
  [   46.267470] ipr: Leaving ipr_worker_thread

IRQ balancing is not required during adapter reset.

Enable "IRQ_NO_BALANCING" flag before starting adapter reset and disable
it after calling pci_restore_state(). The irqbalance daemon is disabled
for this short period of time (~2s).

Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Link: https://patch.msgid.link/20251028142427.3969819-2-wenxiong@linux.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 15:56:18 -05:00
Duoming Zhou ab58153ec6 scsi: imm: Fix use-after-free bug caused by unfinished delayed work
The delayed work item 'imm_tq' is initialized in imm_attach() and
scheduled via imm_queuecommand() for processing SCSI commands.  When the
IMM parallel port SCSI host adapter is detached through imm_detach(),
the imm_struct device instance is deallocated.

However, the delayed work might still be pending or executing
when imm_detach() is called, leading to use-after-free bugs
when the work function imm_interrupt() accesses the already
freed imm_struct memory.

The race condition can occur as follows:

CPU 0(detach thread)   | CPU 1
                       | imm_queuecommand()
                       |   imm_queuecommand_lck()
imm_detach()           |     schedule_delayed_work()
  kfree(dev) //FREE    | imm_interrupt()
                       |   dev = container_of(...) //USE
                           dev-> //USE

Add disable_delayed_work_sync() in imm_detach() to guarantee proper
cancellation of the delayed work item before imm_struct is deallocated.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://patch.msgid.link/20251028100149.40721-1-duoming@zju.edu.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 15:42:17 -05:00
Miao Li c131c9bf98 scsi: core: Correct documentation for scsi_device_quiesce()
If scsi_device_quiesce() returns zero, the function executed successfully.

Signed-off-by: Miao Li <limiao@kylinos.cn>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://patch.msgid.link/20251124075444.32699-1-limiao870622@163.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 15:20:15 -05:00
Suganath Prabu S 4588e65cfd scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
Avoid scanning SAS/SATA devices in channel 1 when SAS transport is
enabled, as the SAS/SATA devices are exposed through channel 0.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/stable/20251120071955.463475-1-suganath-prabu.subramani%40broadcom.com
Link: https://patch.msgid.link/20251120071955.463475-1-suganath-prabu.subramani@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-29 15:08:41 -05:00
Rafael J. Wysocki f086594adb Merge branch 'pm-sleep'
Merge updates related to system suspend and hibernation for 6.19-rc1:

 - Replace snprintf() with scnprintf() in show_trace_dev_match()
   (Kaushlendra Kumar)

 - Fix memory allocation error handling in pm_vt_switch_required()
   (Malaya Kumar Rout)

 - Introduce CALL_PM_OP() macro and use it to simplify code in
   generic PM operations (Kaushlendra Kumar)

 - Add module param to backtrace all CPUs in the device power management
   watchdog (Sergey Senozhatsky)

 - Rework message printing in swsusp_save() (Rafael Wysocki)

 - Make it possible to change the number of hibernation compression
   threads (Xueqin Luo)

 - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)

 - Add document on debugging shutdown hangs to PM documentation and
   correct a mistaken configuration option in it (Mario Limonciello)

 - Shut down wakeup source timer before removing the wakeup source from
   the list (Kaushlendra Kumar, Rafael Wysocki)

 - Introduce new PMSG_POWEROFF event for system shutdown handling with
   the help of PM device callbacks (Mario Limonciello)

 - Make pm_test delay interruptible by wakeup events (Riwen Lu)

 - Clean up kernel-doc comment style usage in the core hibernation
   code and remove unuseful comments from it (Sunday Adelodun, Rafael
   Wysocki)

 - Add support for handling wakeup events and aborting the suspend
   process while it is syncing file systems (Samuel Wu, Rafael Wysocki)

* pm-sleep: (21 commits)
  PM: hibernate: Extra cleanup of comments in swap handling code
  PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
  PM: sleep: Add support for wakeup during filesystem sync
  PM: hibernate: Clean up kernel-doc comment style usage
  PM: suspend: Make pm_test delay interruptible by wakeup events
  usb: sl811-hcd: Add PM_EVENT_POWEROFF into suspend callbacks
  scsi: Add PM_EVENT_POWEROFF into suspend callbacks
  PM: Introduce new PMSG_POWEROFF event
  PM: wakeup: Update after recent wakeup source removal ordering change
  PM: wakeup: Delete timer before removing wakeup source from list
  Documentation: power: Correct a mistaken configuration option
  Documentation: power: Add document on debugging shutdown hangs
  freezer: Clarify that only cgroup1 freezer uses PM freezer
  PM: hibernate: add sysfs interface for hibernate_compression_threads
  PM: hibernate: make compression threads configurable
  PM: hibernate: dynamically allocate crc->unc_len/unc for configurable threads
  PM: hibernate: Rework message printing in swsusp_save()
  PM: dpm_watchdog: add module param to backtrace all CPUs
  PM: sleep: Introduce CALL_PM_OP() macro to simplify code
  PM: console: Fix memory allocation error handling in pm_vt_switch_required()
  ...
2025-11-28 16:01:13 +01:00
Lukas Wunner 383d89699c treewide: Drop pci_save_state() after pci_restore_state()
In 2009, commit c82f63e411 ("PCI: check saved state before restore")
changed the behavior of pci_restore_state() such that it became necessary
to call pci_save_state() afterwards, lest recovery from subsequent PCI
errors fails.

The commit has just been reverted and so all the pci_save_state() after
pci_restore_state() calls that have accumulated in the tree are now
superfluous.  Drop them.

Two drivers chose a different approach to achieve the same result:
drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the
pci_dev's "state_saved" flag to true before calling pci_restore_state().
Drop this as well.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>  # qat
Link: https://patch.msgid.link/c2b28cc4defa1b743cf1dedee23c455be98b397a.1760274044.git.lukas@wunner.de
2025-11-24 16:58:59 -06:00
Martin K. Petersen e54f7b4b81 Merge branch 6.18/scsi-fixes into 6.19/scsi-staging
Pull in fixes branch to resolve UFS merge conflict.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-19 22:59:25 -05:00
Bart Van Assche 90449f2d1e scsi: sg: Do not sleep in atomic context
sg_finish_rem_req() calls blk_rq_unmap_user(). The latter function may
sleep. Hence, call sg_finish_rem_req() with interrupts enabled instead
of disabled.

Reported-by: syzbot+c01f8e6e73f20459912e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-scsi/691560c4.a70a0220.3124cb.001a.GAE@google.com/
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org
Fixes: 97d27b0dd0 ("scsi: sg: close race condition in sg_remove_sfp_usercontext()")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251113181643.1108973-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-19 22:46:36 -05:00
Bart Van Assche 13b77ed9c2 scsi: scsi_debug: Support injecting unaligned write errors
Allow user space software, e.g. a blktests test, to inject unaligned
write errors.

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251113174151.1095574-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-19 22:42:09 -05:00
Zilin Guan 78b1a242fe scsi: qla2xxx: Fix improper freeing of purex item
In qla2xxx_process_purls_iocb(), an item is allocated via
qla27xx_copy_multiple_pkt(), which internally calls
qla24xx_alloc_purex_item().

The qla24xx_alloc_purex_item() function may return a pre-allocated item
from a per-adapter pool for small allocations, instead of dynamically
allocating memory with kzalloc().

An error handling path in qla2xxx_process_purls_iocb() incorrectly uses
kfree() to release the item. If the item was from the pre-allocated
pool, calling kfree() on it is a bug that can lead to memory corruption.

Fix this by using the correct deallocation function,
qla24xx_free_purex_item(), which properly handles both dynamically
allocated and pre-allocated items.

Fixes: 875386b988 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20251113151246.762510-1-zilin@seu.edu.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-19 22:38:27 -05:00
Andy Shevchenko 7b040d4571 scsi: snic: Switch to use %ptSp
Use %ptSp instead of open coded variants to print content of
struct timespec64 in human readable format.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://patch.msgid.link/20251113150217.3030010-21-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-19 12:30:11 +01:00
Andy Shevchenko d710741f83 scsi: fnic: Switch to use %ptSp
Use %ptSp instead of open coded variants to print content of
struct timespec64 in human readable format.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://patch.msgid.link/20251113150217.3030010-20-andriy.shevchenko@linux.intel.com
[pmladek@suse.com: Fixed output ordering and last_read_time update.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-19 12:28:03 +01:00
Rafael J. Wysocki 37d6d92fe0 Merge back earlier material related to system sleep for 6.19 2025-11-17 16:55:55 +01:00
Mario Limonciello (AMD) 988dd0bd91 scsi: Add PM_EVENT_POWEROFF into suspend callbacks
If the PM core uses hibernation callbacks for powering off the
system, drivers will receive PM_EVENT_POWEROFF and should handle
it the same as they previously handled PM_EVENT_HIBERNATE.

Support this case in the scsi driver.  No functional changes.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Tested-by: Eric Naim <dnaim@cachyos.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251112224025.2051702-3-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:05:53 +01:00
Martin K. Petersen e360bb6dc8 Merge patch series "replace old wq(s), added WQ_PERCPU to alloc_workqueue"
Marco Crivellari <marco.crivellari@suse.com> says:

Hi,

=== Current situation: problems ===

Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.

This leads to different scenarios if a work item is scheduled on an
isolated CPU where "delay" value is 0 or greater then 0:
        schedule_delayed_work(, 0);

This will be handled by __queue_work() that will queue the work item on the
current local (isolated) CPU, while:

        schedule_delayed_work(, 1);

Will move the timer on an housekeeping CPU, and schedule the work there.

Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Recent changes to the WQ API ===

The following, address the recent changes in the Workqueue API:

- commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
- commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

The old workqueues will be removed in a future release cycle.

=== Introduced Changes by this series ===

1) [P 1]  Replace uses of system_wq and system_unbound_wq

    system_unbound_wq is to be used when locality is not required.

    Because of that, system_unbound_wq has been replaced with
	system_dfl_wq, to make clear it should be used when locality
	is not required.

2) [P 2-3-4] WQ_PERCPU added to alloc_workqueue()

    This change adds a new WQ_PERCPU flag to explicitly request
    alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

Thanks!

Link: https://patch.msgid.link/20251031095643.74246-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:30:23 -05:00
Marco Crivellari 8d5cad38cf scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This  adds a new WQ_PERCPU flag to explicitly request to alloc_workqueue()
to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107155257.316728-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:27 -05:00
Marco Crivellari f60b8957d8 scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107151618.281250-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari 42312d3acd scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150542.271229-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari e036dadf78 scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150155.267651-3-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari a43a2e48d5 scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150155.267651-2-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari 6184ec8b63 scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistency cannot be addressed
without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107144949.256894-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari 84150ef06f scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
Currently if a user enqueue a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.  This lack of consistentcy cannot be
addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

This patch continues the effort to refactor worqueue APIs, which has
begun with the change introducing new workqueues and a new
alloc_workqueue flag:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251104110808.123424-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari f76e4e1e83 scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users()
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251031095643.74246-5-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari afad6b34de scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251031095643.74246-4-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:26 -05:00
Marco Crivellari 5ca003bb43 scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251031095643.74246-3-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:28:25 -05:00
Marco Crivellari cd87aa2e50 scsi: scsi_transport_iscsi: Replace use of system_unbound_wq with system_dfl_wq
Currently if a user enqueue a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the
API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Adding system_dfl_wq to encourage its use when unbound work should be
used.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251031095643.74246-2-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:15:18 -05:00
Marco Crivellari 49783aca15 scsi: qla2xxx: Replace use of system_unbound_wq with system_dfl_wq
Currently if a user enqueue a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the
API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Adding system_dfl_wq to encourage its use when unbound work should be
used.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251031095643.74246-2-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:15:18 -05:00
Ally Heev 3813d28b2b scsi: scsi_debug: Fix uninitialized pointers with __free attr
Uninitialized pointers with '__free' attribute can cause undefined
behaviour as the memory assigned(randomly) to the pointer is freed
automatically when the pointer goes out of scope

scsi doesn't have any bugs related to this as of now, but it is better
to initialize and assign pointers with '__free' attr in one statement to
ensure proper scope-based cleanup

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aPiG_F5EBQUjZqsl@stanley.mountain/
Signed-off-by: Ally Heev <allyheev@gmail.com>
Link: https://patch.msgid.link/20251105-aheev-uninitialized-free-attr-scsi-v1-1-d28435a0a7ea@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 21:08:05 -05:00
Nuno Sá 1028258914 scsi: pm: Drop unneeded call to pm_runtime_mark_last_busy()
There's no need to explicitly call pm_runtime_mark_last_busy() since
pm_runtime_autosuspend() is now doing it since commit 08071e64cb ("PM:
runtime: Mark last busy stamp in pm_runtime_autosuspend()")

Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20251111-scsi-pm-improv-v2-1-626b8491f4b4@analog.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 20:56:38 -05:00
Haotian Zhang acd194d9b5 scsi: sim710: Fix resource leak by adding missing ioport_unmap() calls
The driver calls ioport_map() to map I/O ports in sim710_probe_common()
but never calls ioport_unmap() to release the mapping. This causes
resource leaks in both the error path when request_irq() fails and in
the normal device removal path via sim710_device_remove().

Add ioport_unmap() calls in the out_release error path and in
sim710_device_remove().

Fixes: 56fece2008 ("[PATCH] finally fix 53c700 to use the generic iomem infrastructure")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20251029032555.1476-1-vulab@iscas.ac.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 20:53:41 -05:00
Junrui Luo f6ab594672 scsi: aic94xx: fix use-after-free in device removal path
The asd_pci_remove() function fails to synchronize with pending tasklets
before freeing the asd_ha structure, leading to a potential
use-after-free vulnerability.

When a device removal is triggered (via hot-unplug or module unload),
race condition can occur.

The fix adds tasklet_kill() before freeing the asd_ha structure,
ensuring all scheduled tasklets complete before cleanup proceeds.

Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reported-by: Junrui Luo <moonafterrain@outlook.com>
Fixes: 2908d778ab ("[SCSI] aic94xx: new driver")
Cc: stable@vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Link: https://patch.msgid.link/ME2PR01MB3156AB7DCACA206C845FC7E8AFFDA@ME2PR01MB3156.ausprd01.prod.outlook.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 20:50:43 -05:00
Martin K. Petersen d204087a59 Merge patch series "qla2xxx target mode improvements"
Tony Battersby <tonyb@cybernetics.com> says:

This patch series improves the qla2xxx FC driver in target mode. I
developed these patches using the out-of-tree SCST target-mode
subsystem (https://scst.sourceforge.net/), although most of the
improvements will also apply to the other target-mode subsystems such
as the in-tree LIO.  Unfortunately qla2xxx+LIO does not pass all of my
tests, but my patches do not make it any worse (results below). These
patches have been well-tested at my employer with qla2xxx+SCST in both
initiator mode and target mode and with a variety of FC HBAs and
initiators. Since SCST is out-of-tree, some of the patches have parts
that apply in-tree and other parts that apply out-of-tree to SCST. The
SCST patches can be found in the v2 posting linked above.

All patches apply to linux 6.17 and SCST 3.10 master branch.

Summary of patches:
- bugfixes
- cleanups
- improve handling of aborts and task management requests
- improve log message
- add back SLER / SRR support (removed in 2017)

Some of these patches improve handling of aborts and task management
requests. This is some of the testing that I did:

Test 1: Use /dev/sg to queue random disk I/O with short timeouts; make
sure cmds are aborted successfully.
Test 2: Queue lots of disk I/O, then use "sg_reset -N -d /dev/sg" on
initiator to reset logical unit.
Test 3: Queue lots of disk I/O, then use "sg_reset -N -t /dev/sg" on
initiator to reset target.
Test 4: Queue lots of disk I/O, then use "sg_reset -N -b /dev/sg" on
initiator to reset bus.
Test 5: Queue lots of disk I/O, then use "sg_reset -N -H /dev/sg" on
initiator to reset host.
Test 6: Use fiber channel attenuator to trigger SRR during
write/read/compare test; check data integrity.

With my patches, SCST passes all of these tests.

Results with in-tree LIO target-mode subsystem:

Test 1: Seems to abort the same cmd multiple times (both
qlt_24xx_retry_term_exchange() and __qlt_send_term_exchange()). But
cmds get aborted, so give it a pass?

Test 2: Seems to work; cmds are aborted.

Test 3: Target reset doesn't seem to abort cmds, instead, a few seconds
later:
qla2xxx [0000:04:00.0]-f058:9: qla_target(0): tag 1314312, op 2a: CTIO
with TIMEOUT status 0xb received (state 1, port 51:40:2e:c0:18:1d:9f:cc,
LUN 0)

Tests 4 and 5: The initiator is unable to log back in to the target; the
following messages are repeated over and over on the target:
qla2xxx [0000:04:00.0]-e01c:9: Sending TERM ELS CTIO (ha=00000000f8811390)
qla2xxx [0000:04:00.0]-f097:9: Linking sess 000000008df5aba8 [0] wwn
51:40:2e:c0:18:1d:9f:cc with PLOGI ACK to wwn 51:40:2e:c0:18:1d:9f:cc
s_id 00:00:01, ref=2 pla 00000000835a9271 link 0

Test 6: passes with my patches; SRR not supported previously.

So qla2xxx+LIO seems a bit flaky when handling exceptions, but my
patches do not make it any worse. Perhaps someone who is more familiar
with LIO can look at the difference between LIO and SCST and figure out
how to improve it.

Tony Battersby
https://www.cybernetics.com/

v1: https://lore.kernel.org/r/f8977250-638c-4d7d-ac0c-65f742b8d535@cybernetics.com/
v2: https://lore.kernel.org/linux-scsi/e95ee7d0-3580-4124-b854-7f73ca3a3a84@cybernetics.com/
Link: https://patch.msgid.link/aaea0ab0-da8b-4153-9369-60db7507ff7a@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:50 -05:00
Tony Battersby 4f5eb50f7c scsi: qla2xxx: target: Improve safety of cmd lookup by handle
The driver associates two different structs with numeric handles and
passes the handles to the hardware.  When the hardware passes the handle
back to the driver, the driver consults a table of void * to convert the
handle back to the struct without checking the type of struct.  This can
lead to type confusion if the HBA firmware misbehaves (and some firmware
versions do).  So verify the type of struct is what is expected before
using it.

But we can also do better than that.  Also verify that the exchange
address of the message sent from the hardware matches the exchange
address of the command being returned.  This adds an extra guard against
buggy HBA firmware that returns duplicate messages multiple times (which
has also been seen) in case the driver has reused the handle for a
different command of the same type.

These problems were seen on a QLE2694L with firmware 9.08.02 when
testing SLER / SRR support.  The SRR caused the HBA to flood the
response queue with hundreds of bogus entries.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/7c7cb574-fe62-42ae-b800-d136d8dd89ca@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00
Tony Battersby c7bd85a7b9 scsi: qla2xxx: target: Add back SRR support
Background: loading qla2xxx with "ql2xtgt_tape_enable=1" enables
Sequence Level Error Recovery (SLER), which is most commonly used for
tape drives.  With SLER enabled, if there is a recoverable I/O error
during a SCSI command, a Sequence Retransmission Request (SRR) will be
used to retry the I/O at a low-level completely within the driver
without propagating the error to the upper levels of the SCSI stack.

SRR support was removed in 2017 by commit 2c39b5ca2a ("qla2xxx: Remove
SRR code"). Add it back, new and improved.

The old removed SRR code used sequence numbers to correlate the SRR
CTIOs with SRR immediate notify messages.  I don't see how that would
work reliably with MSI-X interrupts and multiple queues.  So instead use
the exchange address to find the command associated with the immediate
notify (qlt_srr_to_cmd).

The old removed SRR code had a function qlt_check_srr_debug() to
simulate a SRR, but it didn't work for me.  Instead I just used fiber
optic attenuators attached to the FC cable to reduce the strength of the
signal and induce errors.  Unfortunately this only worked for inducing
SRRs on Data-Out (write) commands, so that is all I was able to test.

The code to build a new scatterlist for a SRR with nonzero offset has
been improved to reduce memory requirements and has been well-tested.
However it does not support protection information.

When a single cmd gets multiple SRRs, the old removed SRR code would
restore the data buffer from the values in cmd->se_cmd before processing
the new SRR.  That might be needed if the offset for the new SRR was
lower than the offset for the previous SRR, but I am not sure if that
can happen.  In my testing, when a single cmd gets multiple SRRs, the
SRR offset always increases or stays the same.  But in case it can
decrease, I added the function qlt_restore_orig_sg().  If this is not
supposed to happen then qlt_restore_orig_sg() can be removed to simplify
the code.

I ran into some HBA firmware bugs with QLE269x, QLE27xx, and QLE28xx
firmware 9.05.xx - 9.08.xx where a SRR would cause the HBA to misbehave
badly.  Since SRRs are rare and therefore difficult to test, I figured
it would be worth checking for the buggy firmware and disabling SLER
with a warning instead of letting others run into the same problem on
the rare occasion that they get a SRR.  This turned out to be difficult
because the firmware version isn't known in the normal NVRAM config
routine, so I added a second NVRAM config routine that is called after
the firmware version is known.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/654b7181-b79e-40ed-a15b-6d6e441a5d5f@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00
Tony Battersby 04957d8c98 scsi: qla2xxx: target: Improve cmd logging
- Add the command tag to various messages so that different messages
   about the same command can be correlated.

 - For CTIO errors (i.e. when the HW reports an error about a cmd),
   print the cmd tag, opcode, state, initiator WWPN, and LUN.  This info
   helps an administrator determine what is going wrong.

 - When a command experiences a transport error, log a message when it
   is freed.  This makes debugging exceptions easier.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/c579987d-5658-41ae-9653-f0e58c9d1880@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00
Tony Battersby f4199d5812 scsi: qla2xxx: target: Add cmd->rsp_sent
Add cmd->rsp_sent to indicate that the SCSI status has been sent
successfully, so that SCST can be informed of any transport errors.
This will also be used for logging in later patches.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/d4b0203f-7817-4517-9789-5866bb24fad7@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00
Tony Battersby 091719c21d scsi: qla2xxx: target: Fix invalid memory access with big CDBs
struct atio7_fcp_cmnd is a variable-length data structure because of
add_cdb_len, but it is embedded in struct atio_from_isp and copied
around like a fixed-length data structure.  For big CDBs > 16 bytes,
get_datalen_for_atio() called on a fixed-length copy of the atio will
access invalid memory.

In some cases this can be fixed by moving the atio to the end of the
data structure and using a variable-length allocation.  In other cases
such as allocating struct qla_tgt_cmd, the fixed-length data structures
are preallocated for speed, so in the case that add_cdb_len != 0,
allocate a separate buffer for the CDB.  Also add memcpy_atio() as a
safeguard against invalid memory accesses.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/306a9d0b-3c89-42fc-a69c-eebca8171347@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00
Tony Battersby 3d56983cc6 scsi: qla2xxx: Fix TMR failure handling
(target mode)

If handle_tmr() fails:

 - The code for QLA_TGT_ABTS results in memory-use-after-free and
   double-free:
	qlt_do_tmr_work()
		qlt_build_abts_resp_iocb()
			qpair->req->outstanding_cmds[h] = (srb_t *)mcmd;
		mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool); FIRST FREE
	qlt_handle_abts_completion()
		mcmd = qlt_ctio_to_cmd()
			cmd = req->outstanding_cmds[h];
			return cmd;
		vha  = mcmd->vha; USE-AFTER-FREE
		ha->tgt.tgt_ops->free_mcmd(mcmd); SECOND FREE

 - qlt_send_busy() makes no sense because it sends a SCSI command
   response instead of a TMR response.

Instead just call qlt_xmit_tm_rsp() to send a TMR failed response, since
that code is well-tested and handles a number of corner cases.  But it
would be incorrect to call ha->tgt.tgt_ops->free_mcmd() after
handle_tmr() failed, so add a flag to mcmd indicating the proper way to
free the mcmd so that qlt_xmit_tm_rsp() can be used for both cases.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/09a1ff3d-6738-4953-a31b-10e89c540462@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12 18:17:28 -05:00