Commit Graph

25550 Commits (09cfd3c52ea76f43b3cb15e570aeddf633d65e80)

Author SHA1 Message Date
Linus Torvalds 8804d970fa Summary of significant series in this pull request:
- The 3 patch series "mm, swap: improve cluster scan strategy" from
   Kairui Song improves performance and reduces the failure rate of swap
   cluster allocation.
 
 - The 4 patch series "support large align and nid in Rust allocators"
   from Vitaly Wool permits Rust allocators to set NUMA node and large
   alignment when perforning slub and vmalloc reallocs.
 
 - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
   Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
   for virtual address spaces for ops-level DAMOS filters.
 
 - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
   from Suren Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps.
 
 - The 2 patch series "mm/mincore: minor clean up for swap cache
   checking" from Kairui Song performs some cleanup in the swap code.
 
 - The 11 patch series "mm: vm_normal_page*() improvements" from David
   Hildenbrand provides code cleanup in the pagemap code.
 
 - The 5 patch series "add persistent huge zero folio support" from
   Pankaj Raghav provides a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero.
 
 - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
   few touchups to the recently added Kexec Handover feature.
 
 - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
   arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap.  To
   end the constant struggle with space shortage on 32-bit conflicting with
   64-bit's needs.
 
 - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
   cleans up some swap code.
 
 - The 7 patch series "selftests/mm: Fix false positives and skip
   unsupported tests" from Donet Tom fixes a few things in our selftests
   code.
 
 - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
   THPs when advised" from David Hildenbrand "allows individual processes
   to opt-out of THP=always into THP=madvise, without affecting other
   workloads on the system".
 
   It's a long story - the [1/N] changelog spells out the considerations.
 
 - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
   gets us started on the memdesc project.  Please see
   https://kernelnewbies.org/MatthewWilcox/Memdescs and
   https://blogs.oracle.com/linux/post/introducing-memdesc.
 
 - The 3 patch series "Tiny optimization for large read operations" from
   Chi Zhiling improves the efficiency of the pagecache read path.
 
 - The 5 patch series "Better split_huge_page_test result check" from Zi
   Yan improves our folio splitting selftest code.
 
 - The 2 patch series "test that rmap behaves as expected" from Wei Yang
   adds some rmap selftests.
 
 - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
   removes that function and converts its two remaining callers.
 
 - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
   fixes some UFFD selftests issues.
 
 - The 3 patch series "introduce kernel file mapped folios" from Boris
   Burkov introduces the concept of "kernel file pages".  Using these
   permits btrfs to account its metadata pages to the root cgroup, rather
   than to the cgroups of random inappropriate tasks.
 
 - The 2 patch series "mm/pageblock: improve readability of some
   pageblock handling" from Wei Yang provides some readability improvements
   to the page allocator code.
 
 - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
   Park teaches DAMON to understand arm32 highmem.
 
 - The 4 patch series "tools: testing: Use existing atomic.h for
   vma/maple tests" from Brendan Jackman performs some code cleanups and
   deduplication under tools/testing/.
 
 - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
   Liam Howlett fixes a couple of 32-bit issues in
   tools/testing/radix-tree.c.
 
 - The 2 patch series "kasan: unify kasan_enabled() and remove
   arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
   arch-specific initialization code into a common arch-neutral
   implementation.
 
 - The 3 patch series "mm: remove zpool" from Johannes Weiner removes
   zspool - an indirection layer which now only redirects to a single thing
   (zsmalloc).
 
 - The 2 patch series "mm: task_stack: Stack handling cleanups" from
   Pasha Tatashin makes a couple of cleanups in the fork code.
 
 - The 37 patch series "mm: remove nth_page()" from David Hildenbrand
   makes rather a lot of adjustments at various nth_page() callsites,
   eventually permitting the removal of that undesirable helper function.
 
 - The 2 patch series "introduce kasan.write_only option in hw-tags" from
   Yeoreum Yun creates a KASAN read-only mode for ARM, using that
   architecture's memory tagging feature.  It is felt that a read-only mode
   KASAN is suitable for use in production systems rather than debug-only.
 
 - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
   from Kefeng Wang does some tidying in the hugetlb folio allocation code.
 
 - The 12 patch series "mm: establish const-correctness for pointer
   parameters" from Max Kellermann makes quite a number of the MM API
   functions more accurate about the constness of their arguments.  This
   was getting in the way of subsystems (in this case CEPH) when they
   attempt to improving their own const/non-const accuracy.
 
 - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
   fixes a number of code sites which were confused over when to use
   free_pages() vs __free_pages().
 
 - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
   Ryhl makes the mapletree code accessible to Rust.  Required by nouveau
   and by its forthcoming successor: the new Rust Nova driver.
 
 - The 2 patch series "selftests/mm: split_huge_page_test:
   split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
   some cleanups to the thp selftesting code.
 
 - The 14 patch series "mm, swap: introduce swap table as swap cache
   (phase I)" from Chris Li and Kairui Song is the first step along the
   path to implementing "swap tables" - a new approach to swap allocation
   and state tracking which is expected to yield speed and space
   improvements.  This patchset itself yields a 5-20% performance benefit
   in some situations.
 
 - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
   the new memdesc layer to clean up the ptdesc code a little.
 
 - The 3 patch series "Fix va_high_addr_switch.sh test failure" from
   Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
 
 - The 2 patch series "Minor fixes for memory allocation profiling" from
   Suren Baghdasaryan addresses a couple of minor issues in relatively new
   memory allocation profiling feature.
 
 - The 3 patch series "Small cleanups" from Matthew Wilcox has a few
   cleanups in preparation for more memdesc work.
 
 - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
   DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
   furtherance of supporting arm highmem.
 
 - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
   warnings" from Muhammad Anjum adds that compiler check to selftests code
   and fixes the fallout, by removing dead code.
 
 - The 10 patch series "Improvements to Victim Process Thawing and OOM
   Reaper Traversal Order" from zhongjinji makes a number of improvements
   in the OOM killer: mainly thawing a more appropriate group of victim
   threads so they can release resources.
 
 - The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
   from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
 
 - The 7 patch series "mm/damon: define and use DAMON initialization
   check function" from SeongJae Park implement reliability and
   maintainability improvements to a recently-added bug fix.
 
 - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
   non-idle ages" from SeongJae Park provides additional transparency to
   userspace clients of the DAMON_STAT information.
 
 - The 2 patch series "Expand scope of khugepaged anonymous collapse"
   from Dev Jain removes some constraints on khubepaged's collapsing of
   anon VMAs.  It also increases the success rate of MADV_COLLAPSE against
   an anon vma.
 
 - The 2 patch series "mm: do not assume file == vma->vm_file in
   compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
   removal of file_operations.mmap().  This patchset concentrates upon
   clearing up the treatment of stacked filesystems.
 
 - The 6 patch series "mm: Improve mlock tracking for large folios" from
   Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
   of large folios.  /proc/meminfo's "Mlocked" field became more accurate.
 
 - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
   during fork" from Donet Tom fixes several user-visible KSM stats
   inaccuracies across forks and adds selftest code to verify these
   counters.
 
 - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
   Yang addresses some potential but presently benign issues in KSM's
   mm_slot handling.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
 jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
 =4mhY
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "mm, swap: improve cluster scan strategy" from Kairui Song improves
   performance and reduces the failure rate of swap cluster allocation

 - "support large align and nid in Rust allocators" from Vitaly Wool
   permits Rust allocators to set NUMA node and large alignment when
   perforning slub and vmalloc reallocs

 - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
   DAMOS_STAT's handling of the DAMON operations sets for virtual
   address spaces for ops-level DAMOS filters

 - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
   Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps

 - "mm/mincore: minor clean up for swap cache checking" from Kairui Song
   performs some cleanup in the swap code

 - "mm: vm_normal_page*() improvements" from David Hildenbrand provides
   code cleanup in the pagemap code

 - "add persistent huge zero folio support" from Pankaj Raghav provides
   a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero

 - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
   the recently added Kexec Handover feature

 - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
   Stoakes turns mm_struct.flags into a bitmap. To end the constant
   struggle with space shortage on 32-bit conflicting with 64-bit's
   needs

 - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
   code

 - "selftests/mm: Fix false positives and skip unsupported tests" from
   Donet Tom fixes a few things in our selftests code

 - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
   from David Hildenbrand "allows individual processes to opt-out of
   THP=always into THP=madvise, without affecting other workloads on the
   system".

   It's a long story - the [1/N] changelog spells out the considerations

 - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
   the memdesc project. Please see

      https://kernelnewbies.org/MatthewWilcox/Memdescs and
      https://blogs.oracle.com/linux/post/introducing-memdesc

 - "Tiny optimization for large read operations" from Chi Zhiling
   improves the efficiency of the pagecache read path

 - "Better split_huge_page_test result check" from Zi Yan improves our
   folio splitting selftest code

 - "test that rmap behaves as expected" from Wei Yang adds some rmap
   selftests

 - "remove write_cache_pages()" from Christoph Hellwig removes that
   function and converts its two remaining callers

 - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
   selftests issues

 - "introduce kernel file mapped folios" from Boris Burkov introduces
   the concept of "kernel file pages". Using these permits btrfs to
   account its metadata pages to the root cgroup, rather than to the
   cgroups of random inappropriate tasks

 - "mm/pageblock: improve readability of some pageblock handling" from
   Wei Yang provides some readability improvements to the page allocator
   code

 - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
   to understand arm32 highmem

 - "tools: testing: Use existing atomic.h for vma/maple tests" from
   Brendan Jackman performs some code cleanups and deduplication under
   tools/testing/

 - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
   a couple of 32-bit issues in tools/testing/radix-tree.c

 - "kasan: unify kasan_enabled() and remove arch-specific
   implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
   initialization code into a common arch-neutral implementation

 - "mm: remove zpool" from Johannes Weiner removes zspool - an
   indirection layer which now only redirects to a single thing
   (zsmalloc)

 - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
   couple of cleanups in the fork code

 - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
   adjustments at various nth_page() callsites, eventually permitting
   the removal of that undesirable helper function

 - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
   creates a KASAN read-only mode for ARM, using that architecture's
   memory tagging feature. It is felt that a read-only mode KASAN is
   suitable for use in production systems rather than debug-only

 - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
   some tidying in the hugetlb folio allocation code

 - "mm: establish const-correctness for pointer parameters" from Max
   Kellermann makes quite a number of the MM API functions more accurate
   about the constness of their arguments. This was getting in the way
   of subsystems (in this case CEPH) when they attempt to improving
   their own const/non-const accuracy

 - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
   code sites which were confused over when to use free_pages() vs
   __free_pages()

 - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
   mapletree code accessible to Rust. Required by nouveau and by its
   forthcoming successor: the new Rust Nova driver

 - "selftests/mm: split_huge_page_test: split_pte_mapped_thp
   improvements" from David Hildenbrand adds a fix and some cleanups to
   the thp selftesting code

 - "mm, swap: introduce swap table as swap cache (phase I)" from Chris
   Li and Kairui Song is the first step along the path to implementing
   "swap tables" - a new approach to swap allocation and state tracking
   which is expected to yield speed and space improvements. This
   patchset itself yields a 5-20% performance benefit in some situations

 - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
   layer to clean up the ptdesc code a little

 - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
   issues in our 5-level pagetable selftesting code

 - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
   addresses a couple of minor issues in relatively new memory
   allocation profiling feature

 - "Small cleanups" from Matthew Wilcox has a few cleanups in
   preparation for more memdesc work

 - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
   Quanmin Yan makes some changes to DAMON in furtherance of supporting
   arm highmem

 - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
   Anjum adds that compiler check to selftests code and fixes the
   fallout, by removing dead code

 - "Improvements to Victim Process Thawing and OOM Reaper Traversal
   Order" from zhongjinji makes a number of improvements in the OOM
   killer: mainly thawing a more appropriate group of victim threads so
   they can release resources

 - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
   is a bunch of small and unrelated fixups for DAMON

 - "mm/damon: define and use DAMON initialization check function" from
   SeongJae Park implement reliability and maintainability improvements
   to a recently-added bug fix

 - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
   SeongJae Park provides additional transparency to userspace clients
   of the DAMON_STAT information

 - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
   some constraints on khubepaged's collapsing of anon VMAs. It also
   increases the success rate of MADV_COLLAPSE against an anon vma

 - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
   from Lorenzo Stoakes moves us further towards removal of
   file_operations.mmap(). This patchset concentrates upon clearing up
   the treatment of stacked filesystems

 - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
   provides some fixes and improvements to mlock's tracking of large
   folios. /proc/meminfo's "Mlocked" field became more accurate

 - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
   Donet Tom fixes several user-visible KSM stats inaccuracies across
   forks and adds selftest code to verify these counters

 - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
   some potential but presently benign issues in KSM's mm_slot handling

* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
  mm: swap: check for stable address space before operating on the VMA
  mm: convert folio_page() back to a macro
  mm/khugepaged: use start_addr/addr for improved readability
  hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
  alloc_tag: fix boot failure due to NULL pointer dereference
  mm: silence data-race in update_hiwater_rss
  mm/memory-failure: don't select MEMORY_ISOLATION
  mm/khugepaged: remove definition of struct khugepaged_mm_slot
  mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
  hugetlb: increase number of reserving hugepages via cmdline
  selftests/mm: add fork inheritance test for ksm_merging_pages counter
  mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
  drivers/base/node: fix double free in register_one_node()
  mm: remove PMD alignment constraint in execmem_vmalloc()
  mm/memory_hotplug: fix typo 'esecially' -> 'especially'
  mm/rmap: improve mlock tracking for large folios
  mm/filemap: map entire large folio faultaround
  mm/fault: try to map the entire file folio in finish_fault()
  mm/rmap: mlock large folios in try_to_unmap_one()
  mm/rmap: fix a mlock race condition in folio_referenced_one()
  ...
2025-10-02 18:18:33 -07:00
Linus Torvalds e1b1d03cee for-6.18/block-20250929
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmjbLCgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpoY0D/9J+11BC88pBxCrLKv/V2TwCNokRMi0dU3L
 r3EUdA46k0oXmvb6ueZqIcfY2e+IX7rdQkaRbh1zRdsNejqHo4548C3ePWGdBAcM
 OdNEGfpehO0aD0td1+mK/NxoJMLhbs5QraPanz+SOkGZOKeF+vGCga5PUDivsr5J
 16T9yb7i+isENLdAc2RJbZVyAphqHQlo5GHi5ZIKOVi5cNt8GU/q2sQl7NYmGvHd
 aq37svvZHFOhLRajP959Fw9WOxEYITewzQ4UYf1FZjUodJUxO+vCnP0ooBQRlyu8
 1B4PYWwSE+Vn3GkQE0Om+mzo9AVPOiLmoAWGxdgJBMyEkZndocr46XEslXOufQ1Z
 T3Gu19G6jCxcyByNVhjVnaajYKmvSQAy1w75m4XlfqTRm4f9Om+LAJavUk3RuaOL
 7lXKQ7Ql1/Tby9Jmf8afjYYXXotNDNku6rz2P3qtOwAA26mNJfgVt0rO+8XGRDe9
 ioLbCkTjslYMc/Oh4jSsbrspsVALbaQMq/Dmah8k0EWb4QAHVgCJyGBoff3hOboI
 jD6B1enaKOQVgcjWcjm/FjOk3jv2h3v4X26YWQZTvEc/1PnSnST78Zi/ePhzDdmt
 sBALUAS37TfTgNMzrhbHl5Zs13k0C0XyANuayuKuo5hlNnC1wbdap+5FZJOmpuOB
 YT+VkYnaOA==
 =kOmc
 -----END PGP SIGNATURE-----

Merge tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
     - FC target fixes (Daniel)
     - Authentication fixes and updates (Martin, Chris)
     - Admin controller handling (Kamaljit)
     - Target lockdep assertions (Max)
     - Keep-alive updates for discovery (Alastair)
     - Suspend quirk (Georg)

 - MD pull request via Yu:
     - Add support for a lockless bitmap.

       A key feature for the new bitmap are that the IO fastpath is
       lockless. If a user issues lots of write IO to the same bitmap
       bit in a short time, only the first write has additional overhead
       to update bitmap bit, no additional overhead for the following
       writes.

       By supporting only resync or recover written data, means in the
       case creating new array or replacing with a new disk, there is no
       need to do a full disk resync/recovery.

 - Switch ->getgeo() and ->bios_param() to using struct gendisk rather
   than struct block_device.

 - Rust block changes via Andreas. This series adds configuration via
   configfs and remote completion to the rnull driver. The series also
   includes a set of changes to the rust block device driver API: a few
   cleanup patches, and a few features supporting the rnull changes.

   The series removes the raw buffer formatting logic from
   `kernel::block` and improves the logic available in `kernel::string`
   to support the same use as the removed logic.

 - floppy arch cleanups

 - Reduce the number of dereferencing needed for ublk commands

 - Restrict supported sockets for nbd. Mostly done to eliminate a class
   of issues perpetually reported by syzbot, by using nonsensical socket
   setups.

 - A few s390 dasd block fixes

 - Fix a few issues around atomic writes

 - Improve DMA interation for integrity requests

 - Improve how iovecs are treated with regards to O_DIRECT aligment
   constraints.

   We used to require each segment to adhere to the constraints, now
   only the request as a whole needs to.

 - Clean up and improve p2p support, enabling use of p2p for metadata
   payloads

 - Improve locking of request lookup, using SRCU where appropriate

 - Use page references properly for brd, avoiding very long RCU sections

 - Fix ordering of recursively submitted IOs

 - Clean up and improve updating nr_requests for a live device

 - Various fixes and cleanups

* tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits)
  s390/dasd: enforce dma_alignment to ensure proper buffer validation
  s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request
  ublk: remove redundant zone op check in ublk_setup_iod()
  nvme: Use non zero KATO for persistent discovery connections
  nvmet: add safety check for subsys lock
  nvme-core: use nvme_is_io_ctrl() for I/O controller check
  nvme-core: do ioccsz/iorcsz validation only for I/O controllers
  nvme-core: add method to check for an I/O controller
  blk-cgroup: fix possible deadlock while configuring policy
  blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path
  blk-mq: Fix more tag iteration function documentation
  selftests: ublk: fix behavior when fio is not installed
  ublk: don't access ublk_queue in ublk_unmap_io()
  ublk: pass ublk_io to __ublk_complete_rq()
  ublk: don't access ublk_queue in ublk_need_complete_req()
  ublk: don't access ublk_queue in ublk_check_commit_and_fetch()
  ublk: don't pass ublk_queue to ublk_fetch()
  ublk: don't access ublk_queue in ublk_config_io_buf()
  ublk: don't access ublk_queue in ublk_check_fetch_buf()
  ublk: pass q_id and tag to __ublk_check_and_get_req()
  ...
2025-10-02 10:16:56 -07:00
Nathan Chancellor c7d3dd9163
Merge patch series "Add generated modalias to modules.builtin.modinfo"
Alexey Gladkov says:

The modules.builtin.modinfo file is used by userspace (kmod to be specific) to
get information about builtin modules. Among other information about the module,
information about module aliases is stored. This is very important to determine
that a particular modalias will be handled by a module that is inside the
kernel.

There are several mechanisms for creating modalias for modules:

The first is to explicitly specify the MODULE_ALIAS of the macro. In this case,
the aliases go into the '.modinfo' section of the module if it is compiled
separately or into vmlinux.o if it is builtin into the kernel.

The second is the use of MODULE_DEVICE_TABLE followed by the use of the
modpost utility. In this case, vmlinux.o no longer has this information and
does not get it into modules.builtin.modinfo.

For example:

$ modinfo pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30
modinfo: ERROR: Module pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 not found.

$ modinfo xhci_pci
name:           xhci_pci
filename:       (builtin)
license:        GPL
file:           drivers/usb/host/xhci-pci
description:    xHCI PCI Host Controller Driver

The builtin module is missing alias "pci:v*d*sv*sd*bc0Csc03i30*" which will be
generated by modpost if the module is built separately.

To fix this it is necessary to add the generated by modpost modalias to
modules.builtin.modinfo. Fortunately modpost already generates .vmlinux.export.c
for exported symbols. It is possible to add `.modinfo` for builtin modules and
modify the build system so that `.modinfo` section is extracted from the
intermediate vmlinux after modpost is executed.

Link: https://patch.msgid.link/cover.1758182101.git.legion@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24 09:10:54 -07:00
Alexey Gladkov b88f88c267
scsi: Always define blogic_pci_tbl structure
The blogic_pci_tbl structure is used by the MODULE_DEVICE_TABLE macro.
There is no longer a need to protect it with the MODULE condition, since
this no longer causes the compiler to warn about an unused variable.

To avoid warnings when -Wunused-const-variable option is used, mark it
as __maybe_unused for such configuration.

Cc: Khalid Aziz <khalid@gonehiking.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://patch.msgid.link/fd8e30de07de79a4923ae967eaee5ba2f2fcef00.1758182101.git.legion@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24 09:10:44 -07:00
David Hildenbrand d66ff3db89 scsi: sg: drop nth_page() usage within SG entry
It's no longer required to use nth_page() when iterating pages within a
single SG entry, so let's drop the nth_page() usage.

Link: https://lkml.kernel.org/r/20250901150359.867252-32-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-21 14:22:08 -07:00
David Hildenbrand 9b6024fa76 scsi: scsi_lib: drop nth_page() usage within SG entry
It's no longer required to use nth_page() when iterating pages within a
single SG entry, so let's drop the nth_page() usage.

Link: https://lkml.kernel.org/r/20250901150359.867252-31-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-21 14:22:08 -07:00
Jens Axboe 4dbe13c784 switching ->getgeo() from struct block_device to struct gendisk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaLifHQAKCRBZ7Krx/gZQ
 64qlAPsGU9cVg8tVcbbuf767MXyuQZkUPeA5AWnSkm0jfQzaKAEAmsF4+KsjOFRR
 EmdjHBlN5kk6a0TWzXcADlieJ/ccNA4=
 =Tr1Q
 -----END PGP SIGNATURE-----

Merge tag 'pull-getgeo' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into for-6.18/block

Pull struct block_device getgeo changes from Al.

"switching ->getgeo() from struct block_device to struct gendisk

 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>"

* tag 'pull-getgeo' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  block: switch ->getgeo() to struct gendisk
  scsi: switch ->bios_param() to passing gendisk
  scsi: switch scsi_bios_ptable() and scsi_partsize() to gendisk
2025-09-03 15:15:43 -06:00
Ming Lei 708e2371f7 scsi: sr: Reinstate rotational media flag
Reinstate the rotational media flag for the CD-ROM driver. The flag has
been cleared since commit bd4a633b6f ("block: move the nonrot flag to
queue_limits") and this breaks some applications.

Move queue limit configuration from get_sectorsize() to
sr_revalidate_disk() and set the rotational flag.

Cc: Christoph Hellwig <hch@lst.de>
Fixes: bd4a633b6f ("block: move the nonrot flag to queue_limits")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250827113550.2614535-1-ming.lei@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-30 21:46:21 -04:00
John Evans 9dba9a45c3 scsi: lpfc: Fix buffer free/clear order in deferred receive path
Fix a use-after-free window by correcting the buffer release sequence in
the deferred receive path. The code freed the RQ buffer first and only
then cleared the context pointer under the lock. Concurrent paths (e.g.,
ABTS and the repost path) also inspect and release the same pointer under
the lock, so the old order could lead to double-free/UAF.

Note that the repost path already uses the correct pattern: detach the
pointer under the lock, then free it after dropping the lock. The
deferred path should do the same.

Fixes: 472e146d1c ("scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall")
Cc: stable@vger.kernel.org
Signed-off-by: John Evans <evans1210144@gmail.com>
Link: https://lore.kernel.org/r/20250828044008.743-1-evans1210144@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-30 21:11:35 -04:00
Dan Carpenter 9dcf111dd3 scsi: qla4xxx: Prevent a potential error pointer dereference
The qla4xxx_get_ep_fwdb() function is supposed to return NULL on error,
but qla4xxx_ep_connect() returns error pointers.  Propagating the error
pointers will lead to an Oops in the caller, so change the error pointers
to NULL.

Fixes: 13483730a1 ("[SCSI] qla4xxx: fix flash/ddb support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/aJwnVKS9tHsw1tEu@stanley.mountain
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-14 23:22:46 -04:00
Christoph Hellwig fad2cf04e9 scsi: fnic: Remove a useless struct mempool forward declaration
struct mempool doesn't currently exist, and thus also isn't used in
fnic.h, remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250812082808.371119-1-hch@lst.de
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-14 22:23:32 -04:00
Al Viro 4fc8728aa3 block: switch ->getgeo() to struct gendisk
Instances are happier that way and it makes more sense anyway -
the only part of the result that is related to partition we are given
is the start sector, and that has been filled in by the caller.

Everything else is a function of the disk.  Only one instance
(DASD) is ever looking at anything other than bdev->bd_disk and
that one is trivial to adjust.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-08-13 02:59:29 -04:00
Al Viro 3eb50369c0 scsi: switch ->bios_param() to passing gendisk
Instances are passed struct block_device *bdev argument; the only thing
it is used for (if it's used in the first place) is bdev->bd_disk.
Might as well pass that in the first place...

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-08-13 02:59:28 -04:00
Al Viro 1fd143c24f scsi: switch scsi_bios_ptable() and scsi_partsize() to gendisk
Both helpers are reading the partition table of the disk specified
by block_device of some partition on it; result depends only upon
the disk in question, so we might as well pass the struct gendisk
instead.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-08-13 02:59:28 -04:00
Martin K. Petersen c6b819e005 Merge branch '6.17/scsi-queue' into 6.17/scsi-fixes
Pull in outstanding commits for 6.17.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-12 21:36:18 -04:00
Linus Torvalds d7edcc7c91 SCSI misc on 20250806
This is mostly fixes and cleanups and code reworks that trickled in
 across the merge window and the weeks leading up.  The only
 substantive update is the Mediatek ufs driver which accounts for the
 bulk of the additions.
 
 Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCaJNGsyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbe3AQCWaCas
 yQj/3S7dK17qdRQa7ooU3xeXt1A1CLlhkJEyWwD/TmmxUSFvbxjm/+Wdu0l+JX15
 EGVmwp+bX/p2ea+s6AE=
 =ZYNP
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is mostly fixes and cleanups and code reworks that trickled in
  across the merge window and the weeks leading up. The only substantive
  update is the Mediatek ufs driver which accounts for the bulk of the
  additions"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (37 commits)
  scsi: libsas: Use a bool for sas_deform_port() second argument
  scsi: libsas: Move declarations of internal functions to sas_internal.h
  scsi: libsas: Make sas_get_ata_info() static
  scsi: libsas: Simplify sas_ata_wait_eh()
  scsi: libsas: Refactor dev_is_sata()
  scsi: sd: Make sd shutdown issue START STOP UNIT appropriately
  scsi: arm64: dts: mediatek: mt8195: Add UFSHCI node
  scsi: dt-bindings: mediatek,ufs: add MT8195 compatible and update clock nodes
  scsi: dt-bindings: mediatek,ufs: Add ufs-disable-mcq flag for UFS host
  scsi: ufs: ufs-mediatek: Add UFS host support for MT8195 SoC
  scsi: ufs: ufs-pci: Remove control of UIC Completion interrupt for Intel MTL
  scsi: ufs: core: Do not write interrupt enable register unnecessarily
  scsi: ufs: core: Set and clear UIC Completion interrupt as needed
  scsi: ufs: core: Remove duplicated code in ufshcd_send_bsg_uic_cmd()
  scsi: ufs: core: Move ufshcd_enable_intr() and ufshcd_disable_intr()
  scsi: ufs: ufs-pci: Remove UFS PCI driver's ->late_init() call back
  scsi: ufs: ufs-pci: Fix default runtime and system PM levels
  scsi: ufs: ufs-pci: Fix hibernate state transition for Intel MTL-like host controllers
  scsi: ufs: host: mediatek: Support FDE (AES) clock scaling
  scsi: ufs: host: mediatek: Support clock scaling with Vcore binding
  ...
2025-08-06 15:44:25 +03:00
Jiasheng Jiang eea6cafb58 scsi: lpfc: Remove redundant assignment to avoid memory leak
Remove the redundant assignment if kzalloc() succeeds to avoid memory
leak.

Fixes: bd2cdd5e40 ("scsi: lpfc: NVME Initiator: Add debugfs support")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Link: https://lore.kernel.org/r/20250801185202.42631-1-jiashengjiangcool@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-05 22:05:40 -04:00
Jean Delvare a59976116a scsi: lpfc: Fix wrong function reference in a comment
Function scsi_host_remove() doesn't exist, the actual function name is
scsi_remove_host().

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Link: https://lore.kernel.org/r/20250731133311.52034cc4@endymion
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-08-05 21:57:13 -04:00
Linus Torvalds 2c8c9aae44 SCSI misc on 20250730
Smaller set of driver updates than usual (ufs, lpfc, mpi3mr).  The
 rest (including the core file changes) are doc updates and some minor
 bug fixes.
 
 Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCaIosYSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQWTAQCfaWMn
 U7rAoU2zEkv4/6kajfw0Nz62IjbX3fLveBOgFwD/ZQqXVPpD+316ksjzwM+5E+O9
 fxYASbF/IOLC8g1z7JU=
 =7x/z
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Smaller set of driver updates than usual (ufs, lpfc, mpi3mr).

  The rest (including the core file changes) are doc updates and some
  minor bug fixes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (49 commits)
  scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated
  scsi: scsi_transport_fc: Add comments to describe added 'rport' parameter
  scsi: bfa: Double-free fix
  scsi: isci: Fix dma_unmap_sg() nents value
  scsi: mvsas: Fix dma_unmap_sg() nents value
  scsi: elx: efct: Fix dma_unmap_sg() nents value
  scsi: scsi_transport_fc: Change to use per-rport devloss_work_q
  scsi: ufs: exynos: Fix programming of HCI_UTRL_NEXUS_TYPE
  scsi: core: Fix kernel doc for scsi_track_queue_full()
  scsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value
  scsi: ibmvscsi_tgt: Fix typo in comment
  scsi: mpi3mr: Update driver version to 8.14.0.5.50
  scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems
  scsi: mpi3mr: Drop unnecessary volatile from __iomem pointers
  scsi: mpi3mr: Fix race between config read submit and interrupt completion
  scsi: ufs: ufs-qcom: Enable QUnipro Internal Clock Gating
  scsi: ufs: core: Add ufshcd_dme_rmw() to modify DME attributes
  scsi: ufs: ufs-qcom: Update esi_vec_mask for HW major version >= 6
  scsi: core: Use scsi_cmd_priv() instead of open-coding it
  scsi: qla2xxx: Remove firmware URL
  ...
2025-07-31 12:13:53 -07:00
Colin Ian King 383cd6d879 scsi: scsi_debug: Make read-only arrays static const
Don't populate the read-only arrays on the stack at run time, instead
make them static const. Also reduces overall size.

before:
   text	   data	    bss	    dec	    hex	filename
 367439	  89582	   5952	 462973	  7107d	drivers/scsi/scsi_debug.o

after:
   text	   data	    bss	    dec	    hex	filename
 365847	  90702	   5952	 462501	  70ea5	drivers/scsi/scsi_debug.o

(gcc 14.2.0, x86-64)

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20250729064930.1659007-1-colin.i.king@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-30 23:26:50 -04:00
Linus Torvalds 22c5696e3f Driver core changes for 6.17-rc1
- DEBUGFS
 
   - Remove unneeded debugfs_file_{get,put}() instances
 
   - Remove last remnants of debugfs_real_fops()
 
   - Allow storing non-const void * in struct debugfs_inode_info::aux
 
 - SYSFS
 
   - Switch back to attribute_group::bin_attrs (treewide)
 
   - Switch back to bin_attribute::read()/write() (treewide)
 
   - Constify internal references to 'struct bin_attribute'
 
 - Support cache-ids for device-tree systems
 
   - Add arch hook arch_compact_of_hwid()
 
   - Use arch_compact_of_hwid() to compact MPIDR values on arm64
 
 - Rust
 
   - Device
 
     - Introduce CoreInternal device context (for bus internal methods)
 
     - Provide generic drvdata accessors for bus devices
 
     - Provide Driver::unbind() callbacks
 
     - Use the infrastructure above for auxiliary, PCI and platform
 
     - Implement Device::as_bound()
 
     - Rename Device::as_ref() to Device::from_raw() (treewide)
 
     - Implement fwnode and device property abstractions
 
       - Implement example usage in the Rust platform sample driver
 
   - Devres
 
     - Remove the inner reference count (Arc) and use pin-init instead
 
     - Replace Devres::new_foreign_owned() with devres::register()
 
     - Require T to be Send in Devres<T>
 
     - Initialize the data kept inside a Devres last
 
     - Provide an accessor for the Devres associated Device
 
   - Device ID
 
     - Add support for ACPI device IDs and driver match tables
 
     - Split up generic device ID infrastructure
 
     - Use generic device ID infrastructure in net::phy
 
   - DMA
 
     - Implement the dma::Device trait
 
     - Add DMA mask accessors to dma::Device
 
     - Implement dma::Device for PCI and platform devices
 
     - Use DMA masks from the DMA sample module
 
   - I/O
 
     - Implement abstraction for resource regions (struct resource)
 
     - Implement resource-based ioremap() abstractions
 
     - Provide platform device accessors for I/O (remap) requests
 
   - Misc
 
     - Support fallible PinInit types in Revocable
 
     - Implement Wrapper<T> for Opaque<T>
 
     - Merge pin-init blanket dependencies (for Devres)
 
 - Misc
 
   - Fix OF node leak in auxiliary_device_create()
 
   - Use util macros in device property iterators
 
   - Improve kobject sample code
 
   - Add device_link_test() for testing device link flags
 
   - Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
 
   - Hint to prefer container_of_const() over container_of()
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaIjkhwAKCRBFlHeO1qrK
 LpXuAP9RWwfD9ZGgQZ9OsMk/0pZ2mDclaK97jcmI9TAeSxeZMgD1FHnOMTY7oSIi
 iG7Muq0yLD+A5gk9HUnMUnFNrngWCg==
 =jgRj
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core

Pull driver core updates from Danilo Krummrich:
 "debugfs:
   - Remove unneeded debugfs_file_{get,put}() instances
   - Remove last remnants of debugfs_real_fops()
   - Allow storing non-const void * in struct debugfs_inode_info::aux

  sysfs:
   - Switch back to attribute_group::bin_attrs (treewide)
   - Switch back to bin_attribute::read()/write() (treewide)
   - Constify internal references to 'struct bin_attribute'

  Support cache-ids for device-tree systems:
   - Add arch hook arch_compact_of_hwid()
   - Use arch_compact_of_hwid() to compact MPIDR values on arm64

  Rust:
   - Device:
       - Introduce CoreInternal device context (for bus internal methods)
       - Provide generic drvdata accessors for bus devices
       - Provide Driver::unbind() callbacks
       - Use the infrastructure above for auxiliary, PCI and platform
       - Implement Device::as_bound()
       - Rename Device::as_ref() to Device::from_raw() (treewide)
       - Implement fwnode and device property abstractions
       - Implement example usage in the Rust platform sample driver
   - Devres:
       - Remove the inner reference count (Arc) and use pin-init instead
       - Replace Devres::new_foreign_owned() with devres::register()
       - Require T to be Send in Devres<T>
       - Initialize the data kept inside a Devres last
       - Provide an accessor for the Devres associated Device
   - Device ID:
       - Add support for ACPI device IDs and driver match tables
       - Split up generic device ID infrastructure
       - Use generic device ID infrastructure in net::phy
   - DMA:
       - Implement the dma::Device trait
       - Add DMA mask accessors to dma::Device
       - Implement dma::Device for PCI and platform devices
       - Use DMA masks from the DMA sample module
   - I/O:
       - Implement abstraction for resource regions (struct resource)
       - Implement resource-based ioremap() abstractions
       - Provide platform device accessors for I/O (remap) requests
   - Misc:
       - Support fallible PinInit types in Revocable
       - Implement Wrapper<T> for Opaque<T>
       - Merge pin-init blanket dependencies (for Devres)

  Misc:
   - Fix OF node leak in auxiliary_device_create()
   - Use util macros in device property iterators
   - Improve kobject sample code
   - Add device_link_test() for testing device link flags
   - Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
   - Hint to prefer container_of_const() over container_of()"

* tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (84 commits)
  rust: io: fix broken intra-doc links to `platform::Device`
  rust: io: fix broken intra-doc link to missing `flags` module
  rust: io: mem: enable IoRequest doc-tests
  rust: platform: add resource accessors
  rust: io: mem: add a generic iomem abstraction
  rust: io: add resource abstraction
  rust: samples: dma: set DMA mask
  rust: platform: implement the `dma::Device` trait
  rust: pci: implement the `dma::Device` trait
  rust: dma: add DMA addressing capabilities
  rust: dma: implement `dma::Device` trait
  rust: net::phy Change module_phy_driver macro to use module_device_table macro
  rust: net::phy represent DeviceId as transparent wrapper over mdio_device_id
  rust: device_id: split out index support into a separate trait
  device: rust: rename Device::as_ref() to Device::from_raw()
  arm64: cacheinfo: Provide helper to compress MPIDR value into u32
  cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
  cacheinfo: Set cache 'id' based on DT data
  container_of: Document container_of() is not to be used in new code
  driver core: auxiliary bus: fix OF node leak
  ...
2025-07-29 12:15:39 -07:00
Damien Le Moal a2f54ff15c scsi: core: sysfs: Correct sysfs attributes access rights
The SCSI sysfs attributes "supported_mode" and "active_mode" do not
define a store method and thus cannot be modified.  Correct the
DEVICE_ATTR() call for these two attributes to not include S_IWUSR to
allow write access as they are read-only.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250728041700.76660-1-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshin <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-28 22:32:48 -04:00
Linus Torvalds ced1b9e039 ata changes for 6.17-rc1
- Replace the ATA_DFLAG_ZAC device flag with the helper function
    ata_dev_is_zac() testing directly the device class and device zoned
    mode (me).
 
  - Some small cleanup of ata_scsi_offline_dev() code (me).
 
  - Improve the description of the link power management (LPM) policies
    in Kconfig and in the comments defining these. Together with this,
    clarify the description of the ahci driver mobile_lpm_policy module
    parameter (me).
 
  - Various code refactoring of libata LPM handling (ata_eh_set_lpm()
    renaming, introduce ata_dev_config_lpm(), LPM related quirk handling,
    and LPM related feature advertizing on device scan) (me).
 
  - Avoid unnecessary device reset when revalidating after an error when
    LPM is used (me).
 
  - Do not allow setting a port/link LPM policy if LPM is not supported,
    either because the controller does not support partial, slumber nor
    devsleep, or when the port is an external port with hotplug
    capability (me).
 
  - Make sure that device initiated power management (DIPM) is not
    enabled if the host (controller) lacks support for this feature (me).
 
  - Improve messages and debug messages related to LPM, in particular,
    reduce the number of messages signaling the lack of LPM support (me).
 
  - Cache in memory a device general purpose log directory to avoid
    having to access this log for every log page access. The intent here
    is to reduce the number of read log commands when scanning or
    revalidating a device (me).
 
  - Change ata_dev_cleanup_cdl_resources() to be a static function (me).
 
  - Rename and simplify the mode setting functions (me).
 
  - Introduce the helper function ata_port_eh_scheduled() to check if EH
    is pending or running for a port (me)
 
  - Improve ata_eh_set_pending() (return bool instead of int) (me).
 
  - Use sysfs_emit() instead of scnprintf() for libata-transport
    attributes (Jonathan).
 
  - Use the existing macro definiton of RDC vendor ID instead of
    hardcoding it in the pata_rdc driver (Andy).
 
  - Rework how EH is called for a port to avoid needing to pass along the
    prereset, softreset, hardreset and postreset operations. The
    driver API documentation for this is also updated  (me).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCaIcTIQAKCRDdoc3SxdoY
 dgtgAP4vgCeecZm0yW4eEEE5AFCqrF0TqTQsubH9UA5WtXs7aAEAtI6EFxp1BYN+
 PFSKPxrVgxPRuMpJTtYt+N/KkjoZjwo=
 =5ppj
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata updates from Damien Le Moal:

 - Replace the ATA_DFLAG_ZAC device flag with the helper function
   ata_dev_is_zac() testing directly the device class and device zoned
   mode (me)

 - Some small cleanup of ata_scsi_offline_dev() code (me)

 - Improve the description of the link power management (LPM) policies
   in Kconfig and in the comments defining these. Together with this,
   clarify the description of the ahci driver mobile_lpm_policy module
   parameter (me)

 - Various code refactoring of libata LPM handling (ata_eh_set_lpm()
   renaming, introduce ata_dev_config_lpm(), LPM related quirk handling,
   and LPM related feature advertizing on device scan) (me)

 - Avoid unnecessary device reset when revalidating after an error when
   LPM is used (me)

 - Do not allow setting a port/link LPM policy if LPM is not supported,
   either because the controller does not support partial, slumber nor
   devsleep, or when the port is an external port with hotplug
   capability (me)

 - Make sure that device initiated power management (DIPM) is not
   enabled if the host (controller) lacks support for this feature (me)

 - Improve messages and debug messages related to LPM, in particular,
   reduce the number of messages signaling the lack of LPM support (me)

 - Cache in memory a device general purpose log directory to avoid
   having to access this log for every log page access. The intent here
   is to reduce the number of read log commands when scanning or
   revalidating a device (me)

 - Change ata_dev_cleanup_cdl_resources() to be a static function (me)

 - Rename and simplify the mode setting functions (me)

 - Introduce the helper function ata_port_eh_scheduled() to check if EH
   is pending or running for a port (me)

 - Improve ata_eh_set_pending() (return bool instead of int) (me)

 - Use sysfs_emit() instead of scnprintf() for libata-transport
   attributes (Jonathan)

 - Use the existing macro definiton of RDC vendor ID instead of
   hardcoding it in the pata_rdc driver (Andy)

 - Rework how EH is called for a port to avoid needing to pass along the
   prereset, softreset, hardreset and postreset operations. The driver
   API documentation for this is also updated (me)

* tag 'ata-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: (28 commits)
  Documentation: driver-api: Update libata error handler information
  ata: libata-eh: Simplify reset operation management
  ata: libata-eh: Remove ata_do_eh()
  ata: pata_rdc: Use registered definition for the RDC vendor
  ata: libata-eh: Make ata_eh_followup_srst_needed() return a bool
  ata: libata-transport: replace scnprintf with sysfs_emit for simple attributes
  ata: libata-eh: use bool for fastdrain in ata_eh_set_pending()
  ata: libata: Introduce ata_port_eh_scheduled()
  ata: libata-core: Rename ata_do_set_mode()
  ata: libata-eh: Rename and make ata_set_mode() static
  ata: libata-core: Make ata_dev_cleanup_cdl_resources() static
  ata: libata-core: Cache the general purpose log directory
  ata: libata_eh: Add debug messages to ata_eh_link_set_lpm()
  ata: libata-core: Reduce the number of messages signaling broken LPM
  ata: ahci: Disallow LPM policy control if not supported
  ata: ahci: Disallow LPM policy control for external ports
  ata: ahci: Disable DIPM if host lacks support
  ata: libata-sata: Disallow changing LPM state if not supported
  ata: libata-eh: Avoid unnecessary resets when revalidating devices
  ata: libata-core: Advertize device support for DIPM and HIPM features
  ...
2025-07-28 17:08:24 -07:00
Linus Torvalds 6e11664f14 for-6.17/block-20250728
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiHdZ8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptRED/9o3dQ1QHL5yNM/AyCCGox0V4zra8qGS/Vc
 cBWpAVrmPGRw0IYlLZENtN9PdwKcbMzJq3l6cxeC7dBnAZP0AxTzP4YYJYUNVsqo
 WtJ3d/k5+cVp0OyOp4uabaqNeMeLoPk9/JXe1Ml2KxtDmHtj5yee0JRh7zlPZmZj
 tsrpIUTeHgAPn6yR1EI+0ybx/mjCb05Mv2Y8gF5hkUPA2PuON+MTFixJmqoy2ySh
 n+22mz/prqlyOSYh/VVv1+9jcQ94wMjcW0JIpg9lM3Kg8BCPU4IetvO1UiX6X33v
 154zEh2aJJDBx+yORS4BM4JMXjRZI7lYea2dkHM8Cajctu1Wpja9bNwnK9ibXvEc
 WtyBwztleLbAZef25fA/W87JE23fGa/r3nwIb2cF4QqkAFslCvhjA93WkOzNJCgQ
 qsWOrlCh3IK2NUu4b1Ncs3ZHOPvc51+zzjMzC6SUr54xhrxDK+gngDPhRy7XDqWJ
 DTMpIlr366o8GdJqnib0/e/CPBrThS6Vl6u0tgLnNbwdpK1svgo/uHW5ksKvDqHX
 kGEIhyRRJJC+4wyl4dsYKXa2twcyFrlWdAE+pZguEC2nZRYqYl9uXftOtvfp1x0y
 /skDX0FIDjvyjRqCLcqF03FSGqwCGS8WuWXZjPhVhcfz47NvbHeFDh1G/jMzsbpj
 S9zrPve/DQ==
 =e86T
 -----END PGP SIGNATURE-----

Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - MD pull request via Yu:
      - call del_gendisk synchronously (Xiao)
      - cleanup unused variable (John)
      - cleanup workqueue flags (Ryo)
      - fix faulty rdev can't be removed during resync (Qixing)

 - NVMe pull request via Christoph:
      - try PCIe function level reset on init failure (Keith Busch)
      - log TLS handshake failures at error level (Maurizio Lombardi)
      - pci-epf: do not complete commands twice if nvmet_req_init()
        fails (Rick Wertenbroek)
      - misc cleanups (Alok Tiwari)

 - Removal of the pktcdvd driver

   This has been more than a decade coming at this point, and some
   recently revealed breakages that had it causing issues even for cases
   where it isn't required made me re-pull the trigger on this one. It's
   known broken and nobody has stepped up to maintain the code

 - Series for ublk supporting batch commands, enabling the use of
   multishot where appropriate

 - Speed up ublk exit handling

 - Fix for the two-stage elevator fixing which could leak data

 - Convert NVMe to use the new IOVA based API

 - Increase default max transfer size to something more reasonable

 - Series fixing write operations on zoned DM devices

 - Add tracepoints for zoned block device operations

 - Prep series working towards improving blk-mq queue management in the
   presence of isolated CPUs

 - Don't allow updating of the block size of a loop device that is
   currently under exclusively ownership/open

 - Set chunk sectors from stacked device stripe size and use it for the
   atomic write size limit

 - Switch to folios in bcache read_super()

 - Fix for CD-ROM MRW exit flush handling

 - Various tweaks, fixes, and cleanups

* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
  block: restore two stage elevator switch while running nr_hw_queue update
  cdrom: Call cdrom_mrw_exit from cdrom_release function
  sunvdc: Balance device refcount in vdc_port_mpgroup_check
  nvme-pci: try function level reset on init failure
  dm: split write BIOs on zone boundaries when zone append is not emulated
  block: use chunk_sectors when evaluating stacked atomic write limits
  dm-stripe: limit chunk_sectors to the stripe size
  md/raid10: set chunk_sectors limit
  md/raid0: set chunk_sectors limit
  block: sanitize chunk_sectors for atomic write limits
  ilog2: add max_pow_of_two_factor()
  nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
  nvme-tcp: log TLS handshake failures at error level
  docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
  nvme: fix typo in status code constant for self-test in progress
  nvmet: remove redundant assignment of error code in nvmet_ns_enable()
  nvme: fix incorrect variable in io cqes error message
  nvme: fix multiple spelling and grammar issues in host drivers
  block: fix blk_zone_append_update_request_bio() kernel-doc
  md/raid10: fix set but not used variable in sync_request_write()
  ...
2025-07-28 16:43:54 -07:00
Linus Torvalds cec40a7c80 vfs-6.17-rc1.integrity
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCngAKCRCRxhvAZXjc
 ogAMAP9LqNHFf7JfDIvF/PJBxzYa0ToWwPsWACERknwkvtBRCwEAhkmscIcIMQ4t
 LPGLGha17dfpaE4RurRhBYgS9x2/1Ao=
 =jSnJ
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs 'protection info' updates from Christian Brauner:
 "This adds the new FS_IOC_GETLBMD_CAP ioctl() to query metadata and
  protection info (PI) capabilities. This ioctl returns information
  about the files integrity profile. This is useful for userspace
  applications to understand a files end-to-end data protection support
  and configure the I/O accordingly.

  For now this interface is only supported by block devices. However the
  design and placement of this ioctl in generic FS ioctl space allows us
  to extend it to work over files as well. This maybe useful when
  filesystems start supporting PI-aware layouts.

  A new structure struct logical_block_metadata_cap is introduced, which
  contains the following fields:

   - lbmd_flags:
     bitmask of logical block metadata capability flags

   - lbmd_interval:
     the amount of data described by each unit of logical block metadata

   - lbmd_size:
     size in bytes of the logical block metadata associated with each
     interval

   - lbmd_opaque_size:
     size in bytes of the opaque block tag associated with each interval

   - lbmd_opaque_offset:
     offset in bytes of the opaque block tag within the logical block
     metadata

   - lbmd_pi_size:
     size in bytes of the T10 PI tuple associated with each interval

   - lbmd_pi_offset:
     offset in bytes of T10 PI tuple within the logical block metadata

   - lbmd_pi_guard_tag_type:
     T10 PI guard tag type

   - lbmd_pi_app_tag_size:
     size in bytes of the T10 PI application tag

   - lbmd_pi_ref_tag_size:
     size in bytes of the T10 PI reference tag

   - lbmd_pi_storage_tag_size:
     size in bytes of the T10 PI storage tag

  The internal logic to fetch the capability is encapsulated in a helper
  function blk_get_meta_cap(), which uses the blk_integrity profile
  associated with the device. The ioctl returns -EOPNOTSUPP, if
  CONFIG_BLK_DEV_INTEGRITY is not enabled"

* tag 'vfs-6.17-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP
  block: fix FS_IOC_GETLBMD_CAP parsing in blkdev_common_ioctl()
  fs: add ioctl to query metadata and protection info capabilities
  nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
  block: introduce pi_tuple_size field in blk_integrity
  block: rename tuple_size field in blk_integrity to metadata_size
2025-07-28 15:12:00 -07:00
Linus Torvalds 278c7d9b5e vfs-6.17-rc1.fallocate
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCeQAKCRCRxhvAZXjc
 otqEAP9bWFExQtnzrNR+1s4UBfPVDAaTJzDnBWj6z0+Idw9oegEAoxF2ifdCPnR4
 t/xWiM4FmSA+9pwvP3U5z3sOReDDsgo=
 =WMMB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fallocate updates from Christian Brauner:
 "fallocate() currently supports creating preallocated files
  efficiently. However, on most filesystems fallocate() will preallocate
  blocks in an unwriten state even if FALLOC_FL_ZERO_RANGE is specified.

  The extent state must later be converted to a written state when the
  user writes data into this range, which can trigger numerous metadata
  changes and journal I/O. This may leads to significant write
  amplification and performance degradation in synchronous write mode.

  At the moment, the only method to avoid this is to create an empty
  file and write zero data into it (for example, using 'dd' with a large
  block size). However, this method is slow and consumes a considerable
  amount of disk bandwidth.

  Now that more and more flash-based storage devices are available it is
  possible to efficiently write zeros to SSDs using the unmap write
  zeroes command if the devices do not write physical zeroes to the
  media.

  For example, if SCSI SSDs support the UMMAP bit or NVMe SSDs support
  the DEAC bit[1], the write zeroes command does not write actual data
  to the device, instead, NVMe converts the zeroed range to a
  deallocated state, which works fast and consumes almost no disk write
  bandwidth.

  This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and
  BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for SCSI, NVMe and
  device-mapper drivers, and add the FALLOC_FL_WRITE_ZEROES and
  STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw bdev devices.

  fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES
  flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a
  way that subsequent writes to that range do not require further
  changes to the file mapping metadata. This flag is beneficial for
  subsequent pure overwriting within this range, as it can save on block
  allocation and, consequently, significant metadata changes"

* tag 'vfs-6.17-rc1.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  ext4: add FALLOC_FL_WRITE_ZEROES support
  block: add FALLOC_FL_WRITE_ZEROES support
  block: factor out common part in blkdev_fallocate()
  fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate
  dm: clear unmap write zeroes limits when disabling write zeroes
  scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP
  nvmet: set WZDS and DRB if device enables unmap write zeroes operation
  nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit
  block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
2025-07-28 13:36:49 -07:00
Martin K. Petersen 7038db7033 Merge patch series "libsas cleanups"
Damien Le Moal <dlemoal@kernel.org> says:

Martin, John,

While debugging an issue with the pm8001 driver, I generated these
cleanup patches. No functional changes overall.

These patches are against the 6.17/scsi-staging branch of the scsi tree.

Link: https://lore.kernel.org/r/20250725015818.171252-1-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:05:23 -04:00
Damien Le Moal 75fe230b9b scsi: libsas: Use a bool for sas_deform_port() second argument
Change the type of the "gone" argument of sas_deform_port() from int to
bool. Simliarly, to be consistent, do the same change to the function
sas_unregister_domain_devices().

No functional changes.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250725015818.171252-6-dlemoal@kernel.org
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:03:57 -04:00
Damien Le Moal 704ed03abf scsi: libsas: Move declarations of internal functions to sas_internal.h
Move the declaration of all functions used only within libsas from
include/scsi/sas_ata.h to drivers/scsi/libsas/sas_internal.h.

No functional changes.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250725015818.171252-5-dlemoal@kernel.org
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:03:57 -04:00
Damien Le Moal bd31394aab scsi: libsas: Make sas_get_ata_info() static
The function sas_get_ata_info() is used only in
drivers/scsi/libsas/sas_ata.c. Remove its definition from
include/scsi/sas_ata.h and make this function static.

No functional changes.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250725015818.171252-4-dlemoal@kernel.org
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:03:56 -04:00
Damien Le Moal 0dd0357051 scsi: libsas: Simplify sas_ata_wait_eh()
Simplify the code of sas_ata_wait_eh(), removing the local variable ap
for the pointer to the device ata_port structure. The test using
dev_is_sata() is also removed as all call sites of this function check if
the device is a SATA one before calling this function.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250725015818.171252-3-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:03:56 -04:00
Salomon Dushimirimana 8e48727c26 scsi: sd: Make sd shutdown issue START STOP UNIT appropriately
Commit aa3998dbeb ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") enabled libata EH to manage device power mode
trasitions for system suspend/resume and removed the flag from
ata_scsi_dev_config. However, since the sd_shutdown() function still
relies on the manage_system_start_stop flag, a spin-down command is not
issued to the disk with command "echo 1 > /sys/block/sdb/device/delete"

sd_shutdown() can be called for both system/runtime start stop
operations, so utilize the manage_run_time_start_stop flag set in the
ata_scsi_dev_config and issue a spin-down command during disk removal
when the system is running. This is in addition to when the system is
powering off and manage_shutdown flag is set. The
manage_system_start_stop flag will still be used for drivers that still
set the flag.

Fixes: aa3998dbeb ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Signed-off-by: Salomon Dushimirimana <salomondush@google.com>
Link: https://lore.kernel.org/r/20250724214520.112927-1-salomondush@google.com
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-25 09:01:21 -04:00
Ranjan Kumar 37c4e72b06 scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans
sas_user_scan() did not fully process wildcard channel scans
(SCAN_WILD_CARD) when a transport-specific user_scan() callback was
present. Only channel 0 would be scanned via user_scan(), while the
remaining channels were skipped, potentially missing devices.

user_scan() invokes updated sas_user_scan() for channel 0, and if
successful, iteratively scans remaining channels (1 to
shost->max_channel) via scsi_scan_host_selected().  This ensures complete
wildcard scanning without affecting transport-specific scanning behavior.

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250624061649.17990-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-24 22:00:43 -04:00
Li Lingfeng 7bdc689214 scsi: Revert "scsi: iscsi: Fix HW conn removal use after free"
This reverts commit c577ab7ba5.

The invocation of iscsi_put_conn() in iscsi_iter_destory_conn_fn() is
used to free the initial reference counter of iscsi_cls_conn.  For
non-qla4xxx cases, the ->destroy_conn() callback (e.g.,
iscsi_conn_teardown) will call iscsi_remove_conn() and iscsi_put_conn()
to remove the connection from the children list of session and free the
connection at last.  However for qla4xxx, it is not the case. The
->destroy_conn() callback of qla4xxx will keep the connection in the
session conn_list and doesn't use iscsi_put_conn() to free the initial
reference counter. Therefore, it seems necessary to keep the
iscsi_put_conn() in the iscsi_iter_destroy_conn_fn(), otherwise, there
will be memory leak problem.

Link: https://lore.kernel.org/all/88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com/
Fixes: c577ab7ba5 ("scsi: iscsi: Fix HW conn removal use after free")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Link: https://lore.kernel.org/r/20250715073926.3529456-1-lilingfeng3@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-24 21:21:00 -04:00
John Garry dafeaf2c03 scsi: aacraid: Stop using PCI_IRQ_AFFINITY
When PCI_IRQ_AFFINITY is set for calling pci_alloc_irq_vectors(), it
means interrupts are spread around the available CPUs. It also means that
the interrupts become managed, which means that an interrupt is shutdown
when all the CPUs in the interrupt affinity mask go offline.

Using managed interrupts in this way means that we should ensure that
completions should not occur on HW queues where the associated interrupt
is shutdown. This is typically achieved by ensuring only CPUs which are
online can generate IO completion traffic to the HW queue which they are
mapped to (so that they can also serve completion interrupts for that HW
queue).

The problem in the driver is that a CPU can generate completions to a HW
queue whose interrupt may be shutdown, as the CPUs in the HW queue
interrupt affinity mask may be offline. This can cause IOs to never
complete and hang the system. The driver maintains its own CPU <-> HW
queue mapping for submissions, see aac_fib_vector_assign(), but this does
not reflect the CPU <-> HW queue interrupt affinity mapping.

Commit 9dc704dcc0 ("scsi: aacraid: Reply queue mapping to CPUs based on
IRQ affinity") tried to remedy this issue may mapping CPUs properly to HW
queue interrupts. However this was later reverted in commit c5becf57dd
("Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ
affinity") - it seems that there were other reports of hangs. I guess
that this was due to some implementation issue in the original commit or
maybe a HW issue.

Fix the very original hang by just not using managed interrupts by not
setting PCI_IRQ_AFFINITY.  In this way, all CPUs will be in each HW queue
affinity mask, so should not create completion problems if any CPUs go
offline.

Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20250715111535.499853-1-john.g.garry@oracle.com
Closes: https://lore.kernel.org/linux-scsi/20250618192427.3845724-1-jmeneghi@redhat.com/
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-24 21:18:00 -04:00
Tomas Henzl 3e90b38781 scsi: mpt3sas: Fix a fw_event memory leak
In _mpt3sas_fw_work() the fw_event reference is removed, it should also
be freed in all cases.

Fixes: 4318c73478 ("scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20250723153018.50518-1-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-24 21:06:38 -04:00
Showrya M N 3ea3a256ed scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated
In case of an ib_fast_reg_mr allocation failure during iSER setup, the
machine hits a panic because iscsi_conn->dd_data is initialized
unconditionally, even when no memory is allocated (dd_size == 0).  This
leads invalid pointer dereference during connection teardown.

Fix by setting iscsi_conn->dd_data only if memory is actually allocated.

Panic trace:
------------
 iser: iser_create_fastreg_desc: Failed to allocate ib_fast_reg_mr err=-12
 iser: iser_alloc_rx_descriptors: failed allocating rx descriptors / data buffers
 BUG: unable to handle page fault for address: fffffffffffffff8
 RIP: 0010:swake_up_locked.part.5+0xa/0x40
 Call Trace:
  complete+0x31/0x40
  iscsi_iser_conn_stop+0x88/0xb0 [ib_iser]
  iscsi_stop_conn+0x66/0xc0 [scsi_transport_iscsi]
  iscsi_if_stop_conn+0x14a/0x150 [scsi_transport_iscsi]
  iscsi_if_rx+0x1135/0x1834 [scsi_transport_iscsi]
  ? netlink_lookup+0x12f/0x1b0
  ? netlink_deliver_tap+0x2c/0x200
  netlink_unicast+0x1ab/0x280
  netlink_sendmsg+0x257/0x4f0
  ? _copy_from_user+0x29/0x60
  sock_sendmsg+0x5f/0x70

Signed-off-by: Showrya M N <showrya@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Link: https://lore.kernel.org/r/20250627112329.19763-1-showrya@chelsio.com
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-21 23:48:36 -04:00
Ewan D. Milne 603e4dbe91 scsi: scsi_transport_fc: Add comments to describe added 'rport' parameter
Note that there is no executable code altered by this patch.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507181446.aAoFiDm5-lkp@intel.com/
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Link: https://lore.kernel.org/r/20250721164652.335716-1-emilne@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-21 23:13:34 -04:00
Damien Le Moal a4daf088a7 ata: libata-eh: Simplify reset operation management
Introduce struct ata_reset_operations to aggregate in a single structure
the definitions of the 4 reset methods (prereset, softreset, hardreset
and postreset) for a port. This new structure is used in struct ata_port
to define the reset methods for a regular port (reset field) and for a
port-multiplier port (pmp_reset field). A pointer to either of these
fields replaces the 4 reset method arguments passed to ata_eh_recover()
and ata_eh_reset().

The definition of the reset methods for all drivers is changed to use
the reset and pmp_reset fields in struct ata_port_operations.

A large number of files is modifed, but no functional changes are
introduced.

Suggested-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20250716020315.235457-3-dlemoal@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
2025-07-16 09:31:43 +02:00
jackysliu add4c48503 scsi: bfa: Double-free fix
When the bfad_im_probe() function fails during initialization, the memory
pointed to by bfad->im is freed without setting bfad->im to NULL.

Subsequently, during driver uninstallation, when the state machine enters
the bfad_sm_stopping state and calls the bfad_im_probe_undo() function,
it attempts to free the memory pointed to by bfad->im again, thereby
triggering a double-free vulnerability.

Set bfad->im to NULL if probing fails.

Signed-off-by: jackysliu <1972843537@qq.com>
Link: https://lore.kernel.org/r/tencent_3BB950D6D2D470976F55FC879206DE0B9A09@qq.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 21:10:30 -04:00
Thomas Fourier 063bec4444 scsi: isci: Fix dma_unmap_sg() nents value
The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: ddcc7e347a ("isci: fix dma_unmap_sg usage")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627142451.241713-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:57:25 -04:00
Thomas Fourier 0141618727 scsi: mvsas: Fix dma_unmap_sg() nents value
The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: b576294826 ("[SCSI] mvsas: Add Marvell 6440 SAS/SATA driver")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627134822.234813-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:57:18 -04:00
Thomas Fourier 3a988d0b65 scsi: elx: efct: Fix dma_unmap_sg() nents value
The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: 692e5d73a8 ("scsi: elx: efct: LIO backend interface routines")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250627114117.188480-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:57:13 -04:00
Ewan D. Milne 25236d4844 scsi: scsi_transport_fc: Change to use per-rport devloss_work_q
Configurations with large numbers of FC rports per host instance are
taking a very long time to complete all devloss work.  Increase potential
parallelism by using a per-rport devloss_work_q for dev_loss_work and
fast_io_fail_work.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Link: https://lore.kernel.org/r/20250707202225.1203189-1-emilne@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:57:08 -04:00
Bagas Sanjaya 6070bd558a scsi: core: Fix kernel doc for scsi_track_queue_full()
Sphinx reports indentation warning on scsi_track_queue_full() return
values:

Documentation/driver-api/scsi:101: ./drivers/scsi/scsi.c:247: ERROR: Unexpected indentation. [docutils]

Fix the warning by making the return values listing a bullet list.

Fixes: eb44820c28 ("[SCSI] Add Documentation and integrate into docbook build")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20250702035822.18072-2-bagasdotme@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:56:55 -04:00
Thomas Fourier 023a293b9c scsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value
The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.

Fixes: 88a678bbc3 ("ibmvscsis: Initial commit of IBM VSCSI Tgt Driver")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250630111803.94389-2-fourier.thomas@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:56:49 -04:00
Ankit Dange 278577d850 scsi: ibmvscsi_tgt: Fix typo in comment
Correct the misspelling of "transitition" to "transition" in a comment
in ibmvscsi_tgt.c for clarity.

Signed-off-by: Ankit Dange <ankitdange37@gmail.com>
Link: https://lore.kernel.org/r/20250628125320.295824-1-ankitdange37@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:56:43 -04:00
Martin K. Petersen ae996aeb0e Merge patch series "mpi3mr: Few minor bug fixes"
Ranjan Kumar <ranjan.kumar@broadcom.com> says:

Few minor fixes of mpi3mr driver.

Link: https://lore.kernel.org/r/20250627194539.48851-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:56:16 -04:00
Ranjan Kumar e1c9a704f2 scsi: mpi3mr: Update driver version to 8.14.0.5.50
Updated driver version to 8.14.0.5.50

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:53:22 -04:00
Ranjan Kumar c91e140c82 scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems
On 32-bit systems, 64-bit BAR writes to admin queue registers are
performed as two 32-bit writes. Without locking, this can cause partial
writes when accessed concurrently.

Updated per-queue spinlocks is used to serialize these writes and prevent
race conditions.

Fixes: 824a156633 ("scsi: mpi3mr: Base driver code")
Cc: stable@vger.kernel.org
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250627194539.48851-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-14 20:53:17 -04:00