Commit Graph

1938 Commits (master)

Author SHA1 Message Date
Linus Torvalds 3381d7b2b3 Updates for the [PCI] MSI subsystem:
- Add interrupt redirection infrastructure
 
     Some PCI controllers use a single demultiplexing interrupt for the MSI
     interrupts of subordinate devices.
 
     This prevents setting the interrupt affinity of device interrupts, which
     causes device interrupts to be delivered to a single CPU. That obviously is
     counterproductive for multi-queue devices and interrupt balancing.
 
     To work around this limitation the new infrastructure installs a dummy
     irq_set_affinity() callback which captures the affinity mask and picks a
     redirection target CPU out of the mask.
 
     When the PCI controller demultiplexes the interrupts it invokes a new
     handling function in the core, which either runs the interrupt handler in
     the context of the target CPU or delegates it to irq_work on the target CPU.
 
   - Utilize the interrupt redirection mechanism in the PCI DWC host controller
     driver.
 
     This allows affinity control for the subordinate device MSI interrupts
     instead of being randomly executed on the CPU which runs the demultiplex
     handler.
 
   - Replace the binary 64-bit MSI flag with a DMA mask
 
     Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but
     implement less than 64 address bits. This breaks on platforms where such a
     device is assigned an MSI address higher than what's supported.
 
     With the binary 64-bit flag there is no other choice than disabling 64-bit
     MSI support which leaves the device disfunctional.
 
     By using a DMA mask the address limit of a device can be described
     correctly which provides support for the above scenario.
 
   - Make use of the DMA mask based address limit in the hda/intel and radeon
     drivers to enable them on affected platforms.
 
   - The usual small cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmmJyPsQHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYocekEADAsS5FlUkFuBy6kODhl5J7b9/oqlL3IEnR
 3CdOrFO716dce+Gej+Wp3T93dJ3XsfD7nCZuy99+LwUkTubmaBJXfjY9S+Ket0ID
 Wc3ltiD6f3GEFB14rXN+fFG/u+OOLkaXdpbQpiTnqL4JAti9qF80D4uon28+FC/o
 wc1MhqVBPbOHU9iM196ngkZuXCNVPLcnZN6PNBgIn0sxx06LcK+daY0bNGxfn5Ua
 LY9SD8hN7tYlkDi42nB/ZXMrexqT9cxSqHObmPX+G/QLfXCRBtD+gyVbs+KVzpRL
 hmFERTlUh9tUdcQFrjgiZP/r4N5ilzsu6w5ZpSOEsGuahFUPZWJWFFC1D8rmq/Ay
 X9HKge1jqXJtbCf0pJM/kdbJKSH5S6aLP3iF37y+PqITIEIX8jIT3oVcvL9hI0BW
 HFxpuJfhAVg63kMegZCO/iROTusLHUZr8iwYOM7pEiCE6fP46jPijsPffVIWvrlJ
 2LVOv/A5wy9q8FW8sF9/M6CW7cdeYQF06Ce3qAyMxjZjEyR3KFBJCVWjhqyMxZJP
 3zFl1XXKXgRO+CDrYKVTPIaXR5D76k/l6MnECQpq81CQyQKm2h6A9PyY+n70FfbZ
 BimakUlBGCd92ZbSxzC9pAOiHo0ZoKtc5BhnsRhKVyBCmEKDazEplDuf49/OSZUE
 p2kaf/PuOw==
 =SCSQ
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI updates from Thomas Gleixner:
 "Updates for the [PCI] MSI subsystem:

   - Add interrupt redirection infrastructure

     Some PCI controllers use a single demultiplexing interrupt for the
     MSI interrupts of subordinate devices.

     This prevents setting the interrupt affinity of device interrupts,
     which causes device interrupts to be delivered to a single CPU.
     That obviously is counterproductive for multi-queue devices and
     interrupt balancing.

     To work around this limitation the new infrastructure installs a
     dummy irq_set_affinity() callback which captures the affinity mask
     and picks a redirection target CPU out of the mask.

     When the PCI controller demultiplexes the interrupts it invokes a
     new handling function in the core, which either runs the interrupt
     handler in the context of the target CPU or delegates it to
     irq_work on the target CPU.

   - Utilize the interrupt redirection mechanism in the PCI DWC host
     controller driver.

     This allows affinity control for the subordinate device MSI
     interrupts instead of being randomly executed on the CPU which runs
     the demultiplex handler.

   - Replace the binary 64-bit MSI flag with a DMA mask

     Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability,
     but implement less than 64 address bits. This breaks on platforms
     where such a device is assigned an MSI address higher than what's
     supported.

     With the binary 64-bit flag there is no other choice than disabling
     64-bit MSI support which leaves the device disfunctional.

     By using a DMA mask the address limit of a device can be described
     correctly which provides support for the above scenario.

   - Make use of the DMA mask based address limit in the hda/intel and
     radeon drivers to enable them on affected platforms

   - The usual small cleanups and improvements"

* tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ALSA: hda/intel: Make MSI address limit based on the device DMA limit
  drm/radeon: Make MSI address limit based on the device DMA limit
  PCI/MSI: Check the device specific address mask in msi_verify_entries()
  PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask
  genirq/redirect: Prevent writing MSI message on affinity change
  PCI/MSI: Unmap MSI-X region on error
  genirq: Update effective affinity for redirected interrupts
  PCI: dwc: Enable MSI affinity support
  PCI: dwc: Code cleanup
  genirq: Add interrupt redirection infrastructure
  genirq/msi: Correct kernel-doc in <linux/msi.h>
2026-02-10 16:30:29 -08:00
Linus Torvalds 66bbe4a8ed Updates for the interrupt core subsystem:
- Remove the interrupt timing infrastructure
 
    This was added seven years ago to be used for power management purposes, but
    that integration never happened.
 
  - Clean up the remaining setup_percpu_irq() users
 
    The memory allocator is available when interrupts can be requested so there
    is not need for static irq_action. Move the remaining users to
    request_percpu_irq() and delete the historical cruft.
 
  - Warn when interrupt flag inconsistencies are detected in request*_irq().
 
    Inconsistent flags can lead to hard to diagnose malfunction. The fallout of
    this new warning has been addressed in next and the fixes are coming in via
    the maintainer trees and the tip irq/cleanup pull requests.
 
  - Invoke affinity notifier when CPU hotplug breaks affinity
 
    Otherwise the code using the notifier misses the affinity change and
    operates on stale information.
 
  - The usual cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmmJwSAQHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYoV1rD/9lxepiLVhIvaaauyr69HkHnLDJyyyIuNQk
 OZ5Rx8JEnmtJIN49DJ/Ps/VjDZA6GYV2RvtKfqCNtpA3AdpHcLCDw2/mZKetHH3v
 0RlTNeu0TKt76Mo+d7vadJZqdcKbBPYEBucaQed4XLGFq9Vb7nuVMFLzkDZe+C5U
 zG8zhtIaPQyY+MUxFGpWJdeKcddnIC/lKRFBmhglmuflQ8ZdrydRlJZzFBxfcnPy
 dcZVnOVZYD4aaW+uryd9eVWg0de5q97C03OobVdhwt077zcmTZpGanRCBM/+7Kop
 UDB/p04Z5FFFKFbd+j0f76gl1l6yQ06K/2Pby8qrLnodsqOEgjEgLs0qbHhkNsVw
 g5W6ry9boLWAA1XhGFOnqZ4Dc/Qb1G7DpnRUzpmAlyL04MTManjocIHNPrGhjlNG
 +jT3iPuraiRUoNZ9K2PBYX9SikGVnKJXFmOwlffkPpOtaJk/oYVUAdMq2W71C4Du
 TZeAbd2RyrSPMYaSh7pnBhM7KBmsYIg+hHU1+RaAi/rFSOT0Se5sUWwmUUUzSfOn
 KvCAzMGX6gIg8rkSmTp7OvLQhXEfJ2kEVrVZACS6MBxS7+PNVBkxYuOqxdhdoVj0
 sZrfB8Qnm4mAcbq7YJtrD6az3+LH5CkrdNyWIAL62leMmqWDoTOgLRMFh2/yEMTl
 BeAOgPO2pw==
 =3hJ2
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "Updates for the interrupt core subsystem:

   - Remove the interrupt timing infrastructure

     This was added seven years ago to be used for power management
     purposes, but that integration never happened.

   - Clean up the remaining setup_percpu_irq() users

     The memory allocator is available when interrupts can be requested
     so there is not need for static irq_action. Move the remaining
     users to request_percpu_irq() and delete the historical cruft.

   - Warn when interrupt flag inconsistencies are detected in
     request*_irq().

     Inconsistent flags can lead to hard to diagnose malfunction. The
     fallout of this new warning has been addressed in next and the
     fixes are coming in via the maintainer trees and the tip
     irq/cleanup pull requests.

   - Invoke affinity notifier when CPU hotplug breaks affinity

     Otherwise the code using the notifier misses the affinity change
     and operates on stale information.

   - The usual cleanups and improvements"

* tag 'irq-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/proc: Replace snprintf with strscpy in register_handler_proc
  genirq/cpuhotplug: Notify about affinity changes breaking the affinity mask
  genirq: Move clear of kstat_irqs to free_desc()
  genirq: Warn about using IRQF_ONESHOT without a threaded handler
  irqdomain: Fix up const problem in irq_domain_set_name()
  genirq: Remove setup_percpu_irq()
  clocksource/drivers/mips-gic-timer: Move GIC timer to request_percpu_irq()
  MIPS: Move IP27 timer to request_percpu_irq()
  MIPS: Move IP30 timer to request_percpu_irq()
  genirq: Remove __request_percpu_irq() helper
  genirq: Remove IRQ timing tracking infrastructure
2026-02-10 13:39:37 -08:00
Linus Torvalds 9b1b3dcd28 Power management updates for 6.20-rc1/7.0-rc1
- Remove the unused omap-cpufreq driver (Andreas Kemnade)
 
  - Optimize error handling code in cpufreq_boost_trigger_state() and
    make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
    supports boost (Lifeng Zheng)
 
  - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
    Dhruva Gole, and Konrad Dybcio)
 
  - Minor improvements to the cpufreq and cpumask rust implementation
    (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)
 
  - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)
 
  - Update arch_freq_scale in the CPPC cpufreq driver's frequency
    invariance engine (FIE) in scheduler ticks if the related CPPC
    registers are not in PCC (Jie Zhan)
 
  - Assorted minor cleanups and improvements in ARM cpufreq drivers (Juan
    Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)
 
  - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
    Gupta)
 
  - Make the scaling_setspeed cpufreq sysfs attribute return the actual
    requested frequency to avoid confusion (Pengjie Zhang)
 
  - Simplify the idle CPU time granularity test in the ondemand cpufreq
    governor (Frederic Weisbecker)
 
  - Enable asym capacity in intel_pstate only when CPU SMT is not
    possible (Yaxiong Tian)
 
  - Update the description of rate_limit_us default value in cpufreq
    documentation (Yaxiong Tian)
 
  - Add a command line option to adjust the C-states table in the
    intel_idle driver, remove the 'preferred_cstates' module parameter
    from it, add C-states validation to it and clean it up (Artem
    Bityutskiy)
 
  - Make the menu cpuidle governor always check the time till the closest
    timer event when the scheduler tick has been stopped to prevent it
    from mistakenly selecting the deepest available idle state (Rafael
    Wysocki)
 
  - Update the teo cpuidle governor to avoid making suboptimal decisions
    in certain corner cases and generally improve idle state selection
    accuracy (Rafael Wysocki)
 
  - Remove an unlikely() annotation on the early-return condition in
    menu_select() that leads to branch misprediction 100% of the time
    on systems with only 1 idle state enabled, like ARM64 servers (Breno
    Leitao)
 
  - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer (Christian
    Loehle)
 
  - Stop flagging the PM runtime workqueue as freezable to avoid system
    suspend and resume deadlocks in subsystems that assume asynchronous
    runtime PM to work during system-wide PM transitions (Rafael Wysocki)
 
  - Drop redundant NULL pointer checks before acomp_request_free() from
    the hibernation code handling image saving (Rafael Wysocki)
 
  - Update wakeup_sources_walk_start() to handle empty lists of wakeup
    sources as appropriate (Samuel Wu)
 
  - Make dev_pm_clear_wake_irq() check the power.wakeirq value under
    power.lock to avoid race conditions (Gui-Dong Han)
 
  - Avoid bit field races related to power.work_in_progress in the core
    device suspend code (Xuewen Yan)
 
  - Make several drivers discard pm_runtime_put() return value in
    preparation for converting that function to a void one (Rafael
    Wysocki)
 
  - Add PL4 support for Ice Lake to the Intel RAPL power capping
    driver (Daniel Tang)
 
  - Replace sprintf() with sysfs_emit() in power capping sysfs show
    functions (Sumeet Pawnikar)
 
  - Make dev_pm_opp_get_level() return value match the documentation
    after a previous update of the latter (Aleks Todorov)
 
  - Use scoped for each OF child loop in the OPP code (Krzysztof
    Kozlowski)
 
  - Fix a bug in an example code snippet and correct typos in the energy
    model management documentation (Patrick Little)
 
  - Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
 
    * idle_monitor: Fix incorrect value logged after stop
    * Fix inverted APERF capability check
    * Use strcspn() to strip trailing newline
    * Reset errno before strtoull()
    * Show C0 in idle-info dump
 
  - Improve cpupower installation procedure by making the systemd step
    optional and allowing users to disable the installation of systemd's
    unit file (João Marcos Costa)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmmDr5ASHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1Q8oH/0KRqdidHzesIQl6gd5WSS/sWdxODRUt
 R9dEGQQ6LXCY0z05RAq29HZQf618fYuRFX4PSrtyCvrcRJK7MJKuzK55MRq0MC3c
 c/2pL1PdpHexjLXUP9pcoxrYjetsr7SnD6Y0M3JfOPg1E/bG8sp1DlnE8cdqrL0W
 lrdB2cEGewT2SVkNhCIQ2n6bwfQwmLlfQl1vXTM8BA7xCjoslePUJlRphAFVAt/J
 5fQxSOH0eSxK5PYQFUDM2D2J3uMAN0pFb6eIjwVYYqjABqV//BPl99Rv2W3ElJq7
 K/SICRWlvzyINCgF15QAUtQHWdINxSb0GzovECVxODHOv0N4mKHdpNU=
 =QlVe
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "By the number of commits, cpufreq is the leading party (again) and the
  most visible change there is the removal of the omap-cpufreq driver
  that has not been used for a long time (good riddance). There are also
  quite a few changes in the cppc_cpufreq driver, mostly related to
  fixing its frequency invariance engine in the case when the CPPC
  registers used by it are not in PCC. In addition to that, support for
  AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev
  list is updated for some platforms. The remaining cpufreq changes are
  assorted fixes and cleanups.

  Next up is cpuidle and the changes there are dominated by intel_idle
  driver updates, mostly related to the new command line facility
  allowing users to adjust the list of C-states used by the driver.
  There are also a few updates of cpuidle governors, including two menu
  governor fixes and some refinements of the teo governor, and a
  MAINTAINERS update adding Christian Loehle as a cpuidle reviewer.
  [Thanks for stepping up Christian!]

  The most significant update related to system suspend and hibernation
  is the one to stop freezing the PM runtime workqueue during system PM
  transitions which allows some deadlocks to be avoided. There is also a
  fix for possible concurrent bit field updates in the core device
  suspend code and a few other minor fixes.

  Apart from the above, several drivers are updated to discard the
  return value of pm_runtime_put() which is going to be converted to a
  void function as soon as everybody stops using its return value, PL4
  support for Ice Lake is added to the Intel RAPL power capping driver,
  and there are assorted cleanups, documentation fixes, and some
  cpupower utility improvements.

  Specifics:

   - Remove the unused omap-cpufreq driver (Andreas Kemnade)

   - Optimize error handling code in cpufreq_boost_trigger_state() and
     make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
     supports boost (Lifeng Zheng)

   - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
     Dhruva Gole, and Konrad Dybcio)

   - Minor improvements to the cpufreq and cpumask rust implementation
     (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)

   - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)

   - Update arch_freq_scale in the CPPC cpufreq driver's frequency
     invariance engine (FIE) in scheduler ticks if the related CPPC
     registers are not in PCC (Jie Zhan)

   - Assorted minor cleanups and improvements in ARM cpufreq drivers
     (Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)

   - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
     Gupta)

   - Make the scaling_setspeed cpufreq sysfs attribute return the actual
     requested frequency to avoid confusion (Pengjie Zhang)

   - Simplify the idle CPU time granularity test in the ondemand cpufreq
     governor (Frederic Weisbecker)

   - Enable asym capacity in intel_pstate only when CPU SMT is not
     possible (Yaxiong Tian)

   - Update the description of rate_limit_us default value in cpufreq
     documentation (Yaxiong Tian)

   - Add a command line option to adjust the C-states table in the
     intel_idle driver, remove the 'preferred_cstates' module parameter
     from it, add C-states validation to it and clean it up (Artem
     Bityutskiy)

   - Make the menu cpuidle governor always check the time till the
     closest timer event when the scheduler tick has been stopped to
     prevent it from mistakenly selecting the deepest available idle
     state (Rafael Wysocki)

   - Update the teo cpuidle governor to avoid making suboptimal
     decisions in certain corner cases and generally improve idle state
     selection accuracy (Rafael Wysocki)

   - Remove an unlikely() annotation on the early-return condition in
     menu_select() that leads to branch misprediction 100% of the time
     on systems with only 1 idle state enabled, like ARM64 servers
     (Breno Leitao)

   - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer
     (Christian Loehle)

   - Stop flagging the PM runtime workqueue as freezable to avoid system
     suspend and resume deadlocks in subsystems that assume asynchronous
     runtime PM to work during system-wide PM transitions (Rafael
     Wysocki)

   - Drop redundant NULL pointer checks before acomp_request_free() from
     the hibernation code handling image saving (Rafael Wysocki)

   - Update wakeup_sources_walk_start() to handle empty lists of wakeup
     sources as appropriate (Samuel Wu)

   - Make dev_pm_clear_wake_irq() check the power.wakeirq value under
     power.lock to avoid race conditions (Gui-Dong Han)

   - Avoid bit field races related to power.work_in_progress in the core
     device suspend code (Xuewen Yan)

   - Make several drivers discard pm_runtime_put() return value in
     preparation for converting that function to a void one (Rafael
     Wysocki)

   - Add PL4 support for Ice Lake to the Intel RAPL power capping driver
     (Daniel Tang)

   - Replace sprintf() with sysfs_emit() in power capping sysfs show
     functions (Sumeet Pawnikar)

   - Make dev_pm_opp_get_level() return value match the documentation
     after a previous update of the latter (Aleks Todorov)

   - Use scoped for each OF child loop in the OPP code (Krzysztof
     Kozlowski)

   - Fix a bug in an example code snippet and correct typos in the
     energy model management documentation (Patrick Little)

   - Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
      * idle_monitor: Fix incorrect value logged after stop
      * Fix inverted APERF capability check
      * Use strcspn() to strip trailing newline
      * Reset errno before strtoull()
      * Show C0 in idle-info dump

   - Improve cpupower installation procedure by making the systemd step
     optional and allowing users to disable the installation of
     systemd's unit file (João Marcos Costa)"

* tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
  PM: sleep: core: Avoid bit field races related to work_in_progress
  PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races
  cpufreq: Documentation: Update description of rate_limit_us default value
  cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible
  PM: wakeup: Handle empty list in wakeup_sources_walk_start()
  PM: EM: Documentation: Fix bug in example code snippet
  Documentation: Fix typos in energy model documentation
  cpuidle: governors: teo: Refine intercepts-based idle state lookup
  cpuidle: governors: teo: Adjust the classification of wakeup events
  cpufreq: ondemand: Simplify idle cputime granularity test
  cpufreq: userspace: make scaling_setspeed return the actual requested frequency
  PM: hibernate: Drop NULL pointer checks before acomp_request_free()
  cpufreq: CPPC: Add generic helpers for sysfs show/store
  cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
  cpufreq: ti-cpufreq: add support for AM62L3 SoC
  cpufreq: dt-platdev: Add ti,am62l3 to blocklist
  cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
  cpufreq: scmi: correct SCMI explanation
  cpufreq: dt-platdev: Block the driver from probing on more QC platforms
  rust: cpumask: rename methods of Cpumask for clarity and consistency
  ...
2026-02-09 19:00:42 -08:00
Rafael J. Wysocki 073dcc0283 Merge branch 'pm-runtime'
Merge updates related to runtime PM for 6.20-rc1/7.0-rc1:

 - Make several drivers discard pm_runtime_put() return value in
   preparation for converting that function to a void one (Rafael
   Wysocki)

* pm-runtime:
  drm: Discard pm_runtime_put() return value
  genirq/chip: Change irq_chip_pm_put() return type to void
  scsi: ufs: core: Discard pm_runtime_put() return values
  platform/chrome: cros_hps_i2c: Discard pm_runtime_put() return value
  coresight: Discard pm_runtime_put() return values
  hwspinlock: omap: Discard pm_runtime_put() return value
  watchdog: rzv2h_wdt: Discard pm_runtime_put() return value
  watchdog: rz: Discard pm_runtime_put() return values
  media: ccs: Discard pm_runtime_put() return value
  drm/imagination: Discard pm_runtime_put() return value
  USB: core: Discard pm_runtime_put() return value
2026-02-04 21:03:18 +01:00
Thorsten Blum 2dfc417414 genirq/proc: Replace snprintf with strscpy in register_handler_proc
Replace snprintf("%s", ...) with the faster and more direct strscpy().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260127224949.441391-2-thorsten.blum@linux.dev
2026-01-30 08:53:53 +01:00
Thomas Gleixner 37f9d5026c genirq/redirect: Prevent writing MSI message on affinity change
The interrupts which are handled by the redirection infrastructure provide
a irq_set_affinity() callback, which solely determines the target CPU for
redirection via irq_work and und updates the effective affinity mask.

Contrary to regular MSI interrupts this affinity setting does not change
the underlying interrupt message as the message is only created at setup
time to deliver to the demultiplexing interrupt.

Therefore the message write in msi_domain_set_affinity() is a pointless
exercise. In principle the write is harmless, but a Tegra system exposes a
full system hang during suspend due to that write.

It's unclear why the check for the PCI device state PCI_D0 in
pci_msi_domain_write_msg(), which prevents the actual hardware access if
a device is in powered down state, fails on this particular system, but
that's a different problem which needs to be investigated by the Tegra
experts.

The irq_set_affinity() callback can advise msi_domain_set_affinity() not to
write the MSI message by returning IRQ_SET_MASK_OK_DONE instead of
IRQ_SET_MASK_OK. Do exactly that.

Just to make it clear again:

This is not a correctness issue of the redirection code as returning
IRQ_SET_MASK_OK in that context is completely correct. From the core
code point of view this is solely a optimization to avoid an redundant
hardware write.

As a byproduct it papers over the underlying problem on the Tegra platform,
which fails to put the PCIe device[s] out of PCI_D0 despite the fact that
the devices and busses have been shut down. The redirect infrastructure
just unearthed the underlying issue, which is prone to happen in quite some
other code paths which use the PCI_D0 check to prevent hardware access to
powered down devices.

This therefore has neither a 'Fixes:' nor a 'Closes:' tag associated as the
underlying problem, which is outside the scope of the interrupt code, is
still unresolved.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/all/4e5b349c-6599-4871-9e3b-e10352ae0ca0@nvidia.com
Link: https://patch.msgid.link/87tsw6aglz.ffs@tglx
2026-01-29 23:49:55 +01:00
Lorenzo Pieralisi 0323897a88 irqdomain: Add parent field to struct irqchip_fwid
The GICv5 driver IRQ domain hierarchy requires adding a parent field to
struct irqchip_fwid so that core code can reference a fwnode_handle parent
for a given fwnode.

Add a parent field to struct irqchip_fwid and update the related kernel API
functions to initialize and handle it.

Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-1-c13a9a150388@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27 15:31:41 +01:00
Rafael J. Wysocki e9df6eba06 genirq/chip: Change irq_chip_pm_put() return type to void
The irq_chip_pm_put() return value is only used in __irq_do_set_handler()
to trigger a WARN_ON() if it is negative, but doing so is not useful
because irq_chip_pm_put() simply passes the pm_runtime_put() return value
to its callers.

Returning an error code from pm_runtime_put() merely means that it has
not queued up a work item to check whether or not the device can be
suspended and there are many perfectly valid situations in which that
can happen, like after writing "on" to the devices' runtime PM "control"
attribute in sysfs for one example.

For this reason, modify irq_chip_pm_put() to discard the pm_runtime_put()
return value, change its return type to void, and drop the WARN_ON()
around the irq_chip_pm_put() invocation from __irq_do_set_handler().
Also update the irq_chip_pm_put() kerneldoc comment to be more accurate.

This will facilitate a planned change of the pm_runtime_put() return
type to void in the future.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/5075294.31r3eYUQgx@rafael.j.wysocki
2026-01-16 20:28:05 +01:00
Imran Khan dd9f6d30c6 genirq/cpuhotplug: Notify about affinity changes breaking the affinity mask
During CPU offlining the interrupts affined to that CPU are moved to other
online CPUs, which might break the original affinity mask if the outgoing
CPU was the last online CPU in that mask. This change is not propagated to
irq_desc::affinity_notify(), which leaves users of the affinity notifier
mechanism with stale information.

Avoid this by scheduling affinity change notification work for interrupts
that were affined to the CPU being offlined, if the new target CPU is not
part of the original affinity mask.

Since irq_set_affinity_locked() uses the same logic to schedule affinity
change notification work, split out this logic into a dedicated function
and use that at both places.

[ tglx: Removed the EXPORT(), removed the !SMP stub, moved the prototype,
  	added a lockdep assert instead of a comment, fixed up coding style
  	and name space. Polished and clarified the change log ]

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260113143727.1041265-1-imran.f.khan@oracle.com
2026-01-13 21:18:16 +01:00
Luigi Rizzo fb11a2493e genirq: Move clear of kstat_irqs to free_desc()
desc_set_defaults() has a loop to clear the per-cpu counters kstats_irq.

This is only needed in free_desc(), which is used with non-sparse IRQs so
that the interrupt descriptor can be recycled. For newly allocated
descriptors, the memory comes from alloc_percpu() and is already zeroed
out.

Move the loop to free_desc() to avoid wasting time unnecessarily.

Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260112083234.2665832-1-lrizzo@google.com
2026-01-13 10:16:29 +01:00
Radu Rendec df439718af genirq: Update effective affinity for redirected interrupts
For redirected interrupts, irq_chip_redirect_set_affinity() does not
update the effective affinity mask, which then triggers the warning in
irq_validate_effective_affinity(). Also, because the effective affinity
mask is empty, the cpumask_test_cpu(smp_processor_id(), m) condition in
demux_redirect_remote() is always false, and the interrupt is always
redirected, even if it's already running on the target CPU.

Set the effective affinity mask to be the same as the requested affinity
mask. It's worth noting that irq_do_set_affinity() filters out offline
CPUs before calling chip->irq_set_affinity() (unless `force` is set), so
the mask passed to irq_chip_redirect_set_affinity() is already filtered.

The solution is not ideal because it may lie about the effective
affinity of the demultiplexed ("child") interrupt. If the requested
affinity mask includes multiple CPUs, the effective affinity, in
reality, is the intersection between the requested mask and the
demultiplexing ("parent") interrupt's effective affinity mask, plus
the first CPU in the requested mask.

Accurately describing the effective affinity of the demultiplexed
interrupt is not trivial because it requires keeping track of the
demultiplexing interrupt's effective affinity. That is tricky in the
context of CPU hot(un)plugging, where interrupt migration ordering is
not guaranteed. The solution in the initial version of the fixed patch,
which stored the first CPU of the demultiplexing interrupt's effective
affinity in the `target_cpu` field, has its own drawbacks and
limitations.

Fixes: fcc1d0dabd ("genirq: Add interrupt redirection infrastructure")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260112211402.2927336-1-rrendec@redhat.com
Closes: https://lore.kernel.org/all/44509520-f29b-4b8a-8986-5eae3e022eb7@nvidia.com/
2026-01-13 09:59:28 +01:00
Sebastian Andrzej Siewior aef30c8d56 genirq: Warn about using IRQF_ONESHOT without a threaded handler
IRQF_ONESHOT disables the interrupt source until after the threaded
handler completed its work. This is needed to allow the threaded handler
to run - otherwise the CPU will get back to the interrupt handler
because the interrupt source remains active and the threaded handler
will not able to do its work.

Specifying IRQF_ONESHOT without a threaded handler does not make sense.
It could be a leftover if the handler _was_ threaded and changed back to
primary and the flag was not removed. This can be problematic in the
`threadirqs' case because the handler is exempt from forced-threading.
This in turn can become a problem on a PREEMPT_RT system if the handler
attempts to acquire sleeping locks.

Warn about missing threaded handlers with the IRQF_ONESHOT flag.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20260112134013.eQWyReHR@linutronix.de
2026-01-13 09:56:25 +01:00
Thomas Gleixner 2e4b28c48f treewide: Update email address
In a vain attempt to consolidate the email zoo switch everything to the
kernel.org account.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-01-11 06:09:11 -10:00
Linus Torvalds 610192c229 Fix IRQ thread affinity flags setup regression.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmlHrpkRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hBGBAAuZG/SpA5k4Aexg3rfisxfenj4s4qeTO3
 HqMXKU5M1zEWOlG2RUx/+Gn0RCb2U/eHzwAXtu85TJjYPQ3tCN+G49/VbbKZm3/J
 wM1gvGvpdX9miSi5S2e6Ete0zf0bYXCP4dJOreekyzzngXd9lqncBA/u+vimt00l
 1axg7osB73puSKjeg/g8t3CIjRvQ6XgD9HC5dAlUebmFQz6KSgM6yZXac/SXbYqn
 nyGN9EIq7HegrBAOLW16viUK5dq5WmmUS02MA728soLEwwxn/yDr44MjPaqPAJlQ
 2cwDHiKvZRtzI9wDlPaVH/Dg2OoXvrEEjrjfrzTWpjHn6p8jjluuUYBqtTIJF+g7
 rUnbtVz4GwOMg7SetMHW9YOMo3R0LXC5ly6RFycRdYqgdnEwTML5Jh4qjxWBiqJD
 98DtPGxUEdGKAUkOE4wLSEJy9EgjSXMm0nc8V6MJMl6afEoKBSU/w73nxc+VVHvw
 x7B/G/qXhJdSqO1aEftHbeZUlpOc2NSMzMXEXqCfE7B/S3vfCbC0PSLQnEq90AjY
 yTT7gI9pI37ULMqqf1b0pfosQuBu0cdkT5r++RFuaZLRl/kBefIyRiWcgwNDAs5C
 /YNeSH+qU043/f55vOoHna8B+W+VSkTGZCQpWcYcRhYCxVIWupdY1WBH9AqBg1R4
 Ar81aRZK2hI=
 =HT4H
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix IRQ thread affinity flags setup regression"

* tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Don't overwrite interrupt thread flags on setup
2025-12-21 14:34:13 -08:00
Greg Kroah-Hartman 90876d9b37 irqdomain: Fix up const problem in irq_domain_set_name()
In irq_domain_set_name() a const pointer is passed in, and then the
const is "lost" when container_of() is called.  Fix this up by properly
preserving the const pointer attribute when container_of() is used to
enforce the fact that this pointer should not have anything at it
changed.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/2025121731-facing-unhitched-63ae@gregkh
2025-12-19 00:39:39 +01:00
Radu Rendec fcc1d0dabd genirq: Add interrupt redirection infrastructure
Add infrastructure to redirect interrupt handler execution to a
different CPU when the current CPU is not part of the interrupt's CPU
affinity mask.

This is primarily aimed at (de)multiplexed interrupts, where the child
interrupt handler runs in the context of the parent interrupt handler,
and therefore CPU affinity control for the child interrupt is typically
not available.

With the new infrastructure, the child interrupt is allowed to freely
change its affinity setting, independently of the parent. If the
interrupt handler happens to be triggered on an "incompatible" CPU (a
CPU that's not part of the child interrupt's affinity mask), the handler
is redirected and runs in IRQ work context on a "compatible" CPU.

No functional change is being made to any existing irqchip driver, and
irqchip drivers must be explicitly modified to use the newly added
infrastructure to support interrupt redirection.

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
Link: https://patch.msgid.link/20251128212055.1409093-2-rrendec@redhat.com
2025-12-15 22:30:48 +01:00
Marc Zyngier dbcc728e18 genirq: Remove setup_percpu_irq()
setup_percpu_irq() was always a bad kludge, and should have never
been there the first place. Now that the last users are gone,
remove it for good.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251210082242.360936-7-maz@kernel.org
2025-12-15 22:20:51 +01:00
Marc Zyngier e9b624ea31 genirq: Remove __request_percpu_irq() helper
With the IRQ timing stuff being gone, there is no need to specify a flag
when requesting a percpu interrupt. Not only IRQF_TIMER was the only flag
(set of flags actually) allowed, but nobody ever passed it.

Get rid of __request_percpu_irq(), which was only getting 0 as flags, and
promote request_percpu_irq_affinity() as its replacement.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20251210082242.360936-3-maz@kernel.org
2025-12-15 22:20:50 +01:00
Marc Zyngier c119e66853 genirq: Remove IRQ timing tracking infrastructure
The IRQ timing tracking infrastructure was merged in 2019, but was never
plumbed in, is not selectable, and is therefore never used.

As Daniel agrees that there is little hope for this infrastructure to be
completed in the near term, drop it altogether.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/87zf7vex6h.wl-maz@kernel.org
Link: https://patch.msgid.link/20251210082242.360936-2-maz@kernel.org
2025-12-15 22:20:50 +01:00
Linus Torvalds db0130185e Misc fixes:
- Fix error code in the irqchip/mchp-eic driver
  - Fix setup_percpu_irq() affinity assumptions
  - Remove the unused irq_domain_add_tree() function
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmk74RgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gSEw/+IwHsSH9a4fOJGwV+vlwljy3zw6Sbmfwh
 D8C+jtVBQwocYzb/snbTvWnHJuhGwyLh6gVcYDDwJj/8IRGXnKVm7S1UKpV/XpRl
 O/dhqo8uogjqVAlo7osqTlYlwSGVLBY++iYCZrt9cY/Hn1bCm0Oz19/wbSJupHqp
 G/aG05iX4SRDHhTI8oWp7oEnnS1Hyox/2CKYHEBQoR2BapJL10exHqnyDpQ48pgL
 60LCQbPeZp5KArQ+NRkc4RbcHBW6tsBJQUPc+m71H5HJTuetxwQ7uNuc3UsaR6jI
 0A/vBryzeS5d88+FO08WxAYaiSENaGBAZ7aAKVbuiSoddwuRBUhvW6OYvWIGNqJG
 gCk4YB9K5li/2slrQdS4BQhq8thdVGkgzcSDFiFLyYmg7NjECPpr4IWggKwFT2zn
 hTF6tjuanBL9jew4Zxw21WcoJa3Kz6FXwSbU+PfWUUb+xc0/zouai17vkP8qwXR6
 xeMqVJCZjb88VXUvL42xqKjIYD4x6UaoUqOtXjfO7xa40XC+2KVmALRInbSdTDpa
 9MsJg6Mi2C+hYPXM1fPCymbktZoBSWnzi8vGUbA9CegyP9J4Zhpl8vmDIn+mXMmB
 Lu7pGAotSWHnXf9oCqAJ4eioaAj4V6HtIPKkSpmQGozifYMvlEI+boIwfkS2sJWK
 lLGe7LH6QeU=
 =/KYE
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:

 - Fix error code in the irqchip/mchp-eic driver

 - Fix setup_percpu_irq() affinity assumptions

 - Remove the unused irq_domain_add_tree() function

* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
  irqdomain: Delete irq_domain_add_tree()
  genirq: Allow NULL affinity for setup_percpu_irq()
2025-12-14 06:07:09 +12:00
Thomas Gleixner fbbd7ce627 genirq: Don't overwrite interrupt thread flags on setup
Chris reported that the recent affinity management changes result in
overwriting the already initialized thread flags.

Use set_bit() to set the affinity bit instead of assigning the bit value to
the flags.

Fixes: 801afdfbfc ("genirq: Fix interrupt threads affinity vs. cpuset isolated partitions")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://patch.msgid.link/87ecp0e4cf.ffs@tglx
Closes: https://lore.kernel.org/all/20251212014848.3509622-1-clm@meta.com
2025-12-13 10:29:33 +09:00
Linus Torvalds 0723a166d1 more s390 updates for 6.19 merge window
- Use the MSI parent domain API instead of the legacy API for setup and
   teardown of PCI MSI IRQs
 
 - Select POSIX_CPU_TIMERS_TASK_WORK now that VIRT_XFER_TO_GUEST_WORK has
   been implemented for s390
 
 - Fix a KVM bug which can lead to guest memory corruption
 
 - Fix KASAN shadow memory mapping for hotplugged memory
 
 - Minor bug fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmk5WtQACgkQIg7DeRsp
 bsKOwRAAg7uNYB0nU5CYl/0GeExumHBxQT/B1Umxg8lCDf4rwAPwwSrV3Dpp04h5
 8FpKH42x8sig1yKuqjrBgHTlRofmSlZOrtDI9xv/dtp8yJWkAaMY0Ec8sf1VwEEO
 dgyEPH567ja+mWu5z74eFmxYSV5dsgcJ707DICTyDXB3DaIzmVH5D5f8yYbMs8eL
 gJ4bAuRkzwV+hp1UXLaWWELDrTX2KpQ3EBPz/2SuS8e3KbmL7MuDRdMbTJ7oEb1+
 fGGnm3qFhHszx5fGWUv/BrGsQidwZTe459JdQtKcQUsPkGqBjsbhmiq9ApomD2+7
 MJFDGDVHt623iBNwDwygAPCrWrKoVUFlw5MaKfztvSiuKt19N6OrwN0R4VA8P46I
 m9mVRSLeXTv4GzZcTmLsRb2cLJ3Z6co2HuDli4X1Jg3TTWBFzx0Jscp9dB1QKmoI
 EXBX44LoL5bUZeh5cihqGC5YBuXeaYyMnnsP56eJhq56QK9r4WDMcvGezBbzbkHE
 I3GjmFwEUr3U9hSx4xmUGh6pry5PLkuDch5So/vh1miAdHjZFFthXHK2s5UI+HUl
 OGjNxI9q2WzX4bb1C090zVljvYAsP6YzKnk7Ks76+0mEBvykTsGx3Ms4oS1L0k5+
 Z8R2J0BLfcBQU7pL0QoYE/fO2J5wo+atp/cBLwc2hd+G9YEvdTs=
 =Nbs5
 -----END PGP SIGNATURE-----

Merge tag 's390-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

 - Use the MSI parent domain API instead of the legacy API for setup and
   teardown of PCI MSI IRQs

 - Select POSIX_CPU_TIMERS_TASK_WORK now that VIRT_XFER_TO_GUEST_WORK
   has been implemented for s390

 - Fix a KVM bug which can lead to guest memory corruption

 - Fix KASAN shadow memory mapping for hotplugged memory

 - Minor bug fixes and improvements

* tag 's390-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/bug: Add missing alignment
  s390/bug: Add missing CONFIG_BUG ifdef again
  KVM: s390: Fix gmap_helper_zap_one_page() again
  s390/pci: Migrate s390 IRQ logic to IRQ domain API
  genirq: Change hwirq parameter to irq_hw_number_t
  s390: Select POSIX_CPU_TIMERS_TASK_WORK
  s390: Unmap early KASAN shadow on memory offlining
  s390/vmem: Support 2G page splitting for KASAN shadow freeing
  s390/boot: Use entire page for PTEs
  s390/vmur: Use scnprintf() instead of sprintf()
2025-12-11 08:19:46 +09:00
Marc Zyngier 89acaa5537 genirq: Allow NULL affinity for setup_percpu_irq()
setup_percpu_irq() was forgotten when the percpu_devid infrastructure was
updated to deal with CPU affinities.

In order to keep ignoring users of this legacy API, provide sensible
defaults by setting the affinity to cpu_online_mask if none was provided by
the caller.

Fixes: bdf4e2ac29 ("genirq: Allow per-cpu interrupt sharing for non-overlapping affinities")
Reported-by: Daniel Thompson <danielt@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251205091814.3944205-1-maz@kernel.org
Closes: https://lore.kernel.org/r/aTFozefMQRg7lYxh@aspen.lan
2025-12-10 09:47:33 +09:00
Tobias Schumacher 455a65260f genirq: Change hwirq parameter to irq_hw_number_t
The irqdomain implementation internally represents hardware IRQs as
irq_hw_number_t, which is defined as unsigned long int. When providing
an irq_hw_number_t to the generic_handle_domain() functions that expect
and unsigned int hwirq, this can lead to a loss of information. Change
the hwirq parameter to irq_hw_number_t to support the full range of
hwirqs.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Tobias Schumacher <ts@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-12-07 16:15:22 +01:00
Linus Torvalds 208eed95fc soc: driver updates for 6.19
This is the first half of the driver changes:
 
  - A treewide interface change to the "syscore" operations for
    power management, as a preparation for future Tegra specific
    changes.
 
  - Reset controller updates with added drivers for LAN969x, eic770
    and RZ/G3S SoCs.
 
  - Protection of system controller registers on Renesas and Google SoCs,
    to prevent trivially triggering a system crash from e.g. debugfs
    access.
 
  - soc_device identification updates on Nvidia, Exynos and Mediatek
 
  - debugfs support in the ST STM32 firewall driver
 
  - Minor updates for SoC drivers on AMD/Xilinx, Renesas,  Allwinner, TI
 
  - Cleanups for memory controller support on Nvidia and Renesas
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmky/8gACgkQmmx57+YA
 GNlqohAApPTLM6Q4gf1cIcsTVaP0uxx9CBgupCGuT5ORrOMKBghVWjTOTSxeEAab
 UQF465QwYUUu602GH34UmRaY9CKW2bMIsfmkgmxNB4Y4Qd7yCgQNJ/h/TnN0rBH+
 qTeEsRH/hax4miSNsh0oOZfVkZkg+23VF02d1VL0CcaX7y4oT45RPBQugrNx/gNS
 fHfVwgIq8vJ8WyrmM1h2nv1i1vgSzEy50B3kY674BBw83FcJTafNLvD7N5DSgD1H
 /I/2xeyEpb+oL1VfeHcXZaX/jf04O+cmvSzBi+MOH1tI3MpdxJib1vEYBdggoOWN
 K/FFGgsOY+DNmJPpSnPTTu8UpzksS8SxGBP7M9Q8roKZwA2c9wLotxySvjki5yv8
 2zvabRdzbrSaoYwsH9QnZdQ2hVkJ9W8MESu8PevD3yMNuFUzledPDWW0N1SbGm78
 0ZdB6NPdaBZYHMNMRdFhN8P275/Mx5e0XWN9oYMQqjPooH7YkyT7hJWz6ao2PCJP
 8mDmnW1RzL+LWf7mJ25ZEtS+YjmKA/PVmogRrGurKCadvdxXqCF09KNljICHhmmu
 t0KB4dqw02OXLPvBk21qCi0zL56w1JDgqtS8suFvDYo9sCceeAbAcmpyoUOFj2N+
 Upn976tb4iqFrr9mFswpmCJWPpqJkU+A+KnKsIRPU7N4kSrP35I=
 =HvlN
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC driver updates from Arnd Bergmann:
 "This is the first half of the driver changes:

   - A treewide interface change to the "syscore" operations for power
     management, as a preparation for future Tegra specific changes

   - Reset controller updates with added drivers for LAN969x, eic770 and
     RZ/G3S SoCs

   - Protection of system controller registers on Renesas and Google
     SoCs, to prevent trivially triggering a system crash from e.g.
     debugfs access

   - soc_device identification updates on Nvidia, Exynos and Mediatek

   - debugfs support in the ST STM32 firewall driver

   - Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI

   - Cleanups for memory controller support on Nvidia and Renesas"

* tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits)
  memory: tegra186-emc: Fix missing put_bpmp
  Documentation: reset: Remove reset_controller_add_lookup()
  reset: fix BIT macro reference
  reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe
  reset: th1520: Support reset controllers in more subsystems
  reset: th1520: Prepare for supporting multiple controllers
  dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
  dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets
  reset: remove legacy reset lookup code
  clk: davinci: psc: drop unused reset lookup
  reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC
  reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY
  dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support
  reset: eswin: Add eic7700 reset driver
  dt-bindings: reset: eswin: Documentation for eic7700 SoC
  reset: sparx5: add LAN969x support
  dt-bindings: reset: microchip: Add LAN969x support
  soc: rockchip: grf: Add select correct PWM implementation on RK3368
  soc/tegra: pmc: Add USB wake events for Tegra234
  amba: tegra-ahb: Fix device leak on SMMU enable
  ...
2025-12-05 17:29:04 -08:00
Linus Torvalds 9ce62ebbb7 Updates for [PCI] MSI related code:
- Remove one variant of PCI/MSI management as all users have been
    converted to use per device domains. That reduces the variants to two:
 
    The modern and the real archaic legacy variant, which keeps the usual
    suspects in the museum category alive.
 
  - Rework the platform MSI device ID detection mechanism in the ARM GIC
    world to address resource leaks, duplicated code and other details. This
    requires a corresponding preparatory step in the PCI/iproc driver.
 
  - Trivial core code cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmkswn0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYpuD/wKT7d6I6AqnJVF/RhiJ+/d6vuX/aFW
 g6E7XAkMLKhmxunSNFfPzXsHy2a0oJroYKmDJH4C8GWGo/gXa+QvmDt2491k9rdV
 zM+CBodBu3/bXWvTW+o1fbyAvG+p2C3+iSRW/gGqzPdcY8gQiRnNOZS1j7zusMjO
 A6pz5SvLSPWQUnVl9PygJBuNX5TFHPnY3AySRpW11CvqB5/8gqGz+O6lT/Q+5hov
 GUC57hskbQd1PsYhTNRaUR4z7VMolPHqscp8DYVCWjOMP/r5quC6dlsn91yxuATU
 8D7oRiW8xkCaTJplY/rA6r/VxUthZ3EgIxzev3rGaWBdPxHcFfftf2oxyFFAf3lf
 3rEdfGBcNgApx+MCcoT5/3mf3KJfn2/bE6bZhwv94+dtbTlHguztyMD3vnGTS73i
 zPWQ5ae4M5sqc8kCNMRaBfU8yQEHEKs3gia67vStZyn5R/uUNVKRo67LBPZKVDcJ
 2511Ylnm62yG6PtdPGIFHY1i75uPpxXuS7F0BJignzM3iPvVvwLPZLDORr3/pR4q
 CmswZTA2obue6+nwz/LUacxzONsZ2Z8pzGY6rrT9sfj0Z4mk6xrfEPfjfmVoMpyk
 Dk4B8lIVYwcR7d/Sw+FIwYst8iw+L77Yn7kN8yCbh4lAOxBUUvtS5KAP6uPGe3D1
 30Q/DbBVlEvg/g==
 =VAtQ
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI updates from Thomas Gleixner:
 "Updates for [PCI] MSI related code:

   - Remove one variant of PCI/MSI management as all users have been
     converted to use per device domains. That reduces the variants to
     two:

     The modern and the real archaic legacy variant, which keeps the
     usual suspects in the museum category alive.

   - Rework the platform MSI device ID detection mechanism in the ARM
     GIC world to address resource leaks, duplicated code and other
     details. This requires a corresponding preparatory step in the
     PCI/iproc driver.

   - Trivial core code cleanups"

* tag 'irq-msi-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-its: Rework platform MSI deviceID detection
  PCI: iproc: Implement MSI controller node detection with of_msi_xlate()
  genirq/msi: Slightly simplify msi_domain_alloc()
  PCI/MSI: Delete pci_msi_create_irq_domain()
2025-12-02 09:35:59 -08:00
Linus Torvalds 6863c8385c Updates for the interrupt core and treewide cleanups:
- Rework of the Per Processor Interrupt (PPI) management on ARM[64].
 
     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.
 
     This rework provides affinity management for PPIs so that they can be
     individually configured in the firmware tables and mops up the related
     drivers all over the place.
 
   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.
 
   - Plug a harmless race in the interrupt affinity proc interface, which
     allows to see a half updated mask
 
   - Adjust the priority of secondary interrupt threads on RT, so that the
     combination of primary and secondary thread emulates the hardware
     interrupt plus thread scenario. Having them at the same priority can
     cause starvation issues in some drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksv3oTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoe5+D/wNnBaX9LRajuLOF+zaYw5WZxkzp6U7
 X4AP3cLny8xynI1kM5V8M1ym3Fspk0hiqxNX2LLXrSZzBR+3O4uGCyCceBXeHKo2
 vW4auUXG4MB+2sZyudQXaBpNK4A2YBubycTUcRECjkjDkBPAWgN7J+Oz2lXUSUcH
 zlitlHNo48hnZQPAJr4PDpi5q9+rChn+8/s+K1d8NlEf9HOXC98qzyMuMq+jHdJE
 AQ6tKoHkA5lHjHAUY3AbWptoHo1Wp+p5PSqsrFr6nbKuPlhUqRNEPXX0Z8q7aUTj
 NgdkvIHJVJ0C+T40FIWCNzUYOUk4gTQXBSPvptwJSHAmf9ovp+Kg2ltVZBzyL2iI
 R0EZSQAQU8iJcRrqjcAYqI36LkmwwVT6RD1zFa98xJT/AjsMpAt/U1pEMDtkoTKe
 Lv7ZQ/hloc+4wV4xS4zEtoV/ukdUfA9aEdXsh5hNH/07tvatpKO2LgortsiI+lCK
 76vAULcGvbMr5Jr63snjICgstahunpNMRn2HmnGAjmdZf4+g+TDvZR4DI6bswtuO
 jp5G6OM30Z9zKheAr1VioV1XAKr6Y4jDKVjfFy/n1k5pDwYaSJopmZxSD35aas4e
 VqWizAzc5dAVCYRlzr6S1lrMQ2JJRg0RpIn+sMS8dhf9SK7hs5ilGSOsgX1fgVat
 1N3WXvYM8vSW+g==
 =zrA1
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "Updates for the interrupt core and treewide cleanups:

   - Rework of the Per Processor Interrupt (PPI) management on ARM[64]

     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.

     This rework provides affinity management for PPIs so that they can
     be individually configured in the firmware tables and mops up the
     related drivers all over the place.

   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.

   - Plug a harmless race in the interrupt affinity proc interface,
     which allows to see a half updated mask

   - Adjust the priority of secondary interrupt threads on RT, so that
     the combination of primary and secondary thread emulates the
     hardware interrupt plus thread scenario. Having them at the same
     priority can cause starvation issues in some drivers"

* tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  genirq: Remove cpumask availability check on kthread affinity setting
  genirq: Fix interrupt threads affinity vs. cpuset isolated partitions
  genirq: Prevent early spurious wake-ups of interrupt threads
  genirq: Use raw_spinlock_irq() in irq_set_affinity_notifier()
  genirq/manage: Reduce priority of forced secondary interrupt handler
  genirq/proc: Fix race in show_irq_affinity()
  genirq: Fix percpu_devid irq affinity documentation
  perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer
  irqdomain: Kill of_node_to_fwnode() helper
  genirq: Kill irq_{g,s}et_percpu_devid_partition()
  irqchip: Kill irq-partition-percpu
  irqchip/apple-aic: Drop support for custom PMU irq partitions
  irqchip/gic-v3: Drop support for custom PPI partitions
  coresight: trbe: Request specific affinities for per CPU interrupts
  perf: arm_spe_pmu: Request specific affinities for per CPU interrupts
  perf: arm_pmu: Request specific affinities for per CPU NMIs/interrupts
  genirq: Add request_percpu_irq_affinity() helper
  genirq: Allow per-cpu interrupt sharing for non-overlapping affinities
  genirq: Update request_percpu_nmi() to take an affinity
  genirq: Add affinity to percpu_devid interrupt requests
  ...
2025-12-02 09:14:26 -08:00
Thomas Gleixner ebb922c920 Linux 6.18-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmj+p+UeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGKsIH/1EFGYZDVJ7pTOcO
 qJY/xfu5YNd4ezZTGMW5SgJK+lAdJwkmbu8PUlcOhXKRVvACG9Tud/+pZzw966C5
 pk9pF9vpCXq2Zz6dk3/XGFARUPUlDA4uJ/jiPTNVA8yy+V18u+Ds55Y+rhv9MkcW
 n/Fi+fiYfjqAaqP328mWH9z51ibRqH3WQfqVdjzClzoSC31BuJUVEZi9s5FZ7C9Q
 OCvRLp8WvTpcQ7ab7WH/wCgznXEKyRM/OxaNtXWztod9GLqOmWoFiHUxWfEQ/gg+
 KzgbgQOeXI6q7U8xJZ/711ZFzGLR9VBEPN0HnqxRNr8fCpzJ9FKFGTFD2HcBgUjy
 F9JH3nk=
 =YBEg
 -----END PGP SIGNATURE-----

Merge tag 'v6.18-rc3' into irq/msi

Pick up OF changes to resolve dependencies
2025-11-22 17:07:57 +01:00
Frederic Weisbecker 3de5e46e50 genirq: Remove cpumask availability check on kthread affinity setting
Failing to allocate the affinity mask of an interrupt descriptor fails the
whole descriptor initialization. It is then guaranteed that the cpumask is
always available whenever the related interrupt objects are alive, such as
the kthread handler.

Therefore remove the superfluous check since it is merely a historical
leftover. Get rid also of the comments above it that are obsolete and
useless.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251121143500.42111-4-frederic@kernel.org
2025-11-22 09:26:18 +01:00
Frederic Weisbecker 801afdfbfc genirq: Fix interrupt threads affinity vs. cpuset isolated partitions
When a cpuset isolated partition is created / updated or destroyed, the
interrupt threads are affined blindly to all the non-isolated CPUs. This
happens without taking into account the interrupt threads initial affinity
that becomes ignored.

For example in a system with 8 CPUs, if an interrupt and its kthread are
initially affine to CPU 5, creating an isolated partition with only CPU 2
inside will eventually end up affining the interrupt kthread to all CPUs
but CPU 2 (that is CPUs 0,1,3-7), losing the kthread preference for CPU 5.

Besides the blind re-affining, this doesn't take care of the actual low
level interrupt which isn't migrated. As of today the only way to isolate
non managed interrupts, along with their kthreads, is to overwrite their
affinity separately, for example through /proc/irq/

To avoid doing that manually, future development should focus on updating
the interrupt's affinity whenever cpuset isolated partitions are updated.

In the meantime, cpuset shouldn't fiddle with interrupt threads directly.
To prevent from that, set the PF_NO_SETAFFINITY flag to them.

This is done through kthread_bind_mask() by affining them initially to all
possible CPUs as at that point the interrupt is not started up which means
the affinity of the hard interrupt is not known. The thread will adjust
that once it reaches the handler, which is guaranteed to happen after the
initial affinity of the hard interrupt is established.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251121143500.42111-3-frederic@kernel.org
2025-11-22 09:26:18 +01:00
Frederic Weisbecker 68775ca79a genirq: Prevent early spurious wake-ups of interrupt threads
During initialization, the interrupt thread is created before the interrupt
is enabled. The interrupt enablement happens before the actual kthread wake
up point. Once the interrupt is enabled the hardware can raise an interrupt
and once setup_irq() drops the descriptor lock a interrupt wake-up can
happen.

Even when such an interrupt can be considered premature, this is not a
problem in general because at the point where the descriptor lock is
dropped and the wakeup can happen, the data which is used by the thread is
fully initialized.

Though from the perspective of least surprise, the initial wakeup really
should be performed by the setup code and not randomly by a premature
interrupt.

Prevent this by performing a wake-up only if the target is in state
TASK_INTERRUPTIBLE, which the thread uses in wait_for_interrupt().

If the thread is still in state TASK_UNINTERRUPTIBLE, the wake-up is not
lost because after the setup code completed the initial wake-up the thread
will observe the IRQTF_RUNTHREAD and proceed with the handling.

[ tglx: Simplified the changes and extended the changelog. ]

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251121143500.42111-2-frederic@kernel.org
2025-11-22 09:26:18 +01:00
Chengkaitao 9d3faec60b genirq: Use raw_spinlock_irq() in irq_set_affinity_notifier()
Since irq_set_affinity_notifier() may sleep, interrupts are enabled. So
raw_spinlock_irqsave() can be replaced with raw_spinlock_irq().

Signed-off-by: Chengkaitao <chengkaitao@kylinos.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251118012754.61805-1-pilgrimtao@gmail.com
2025-11-18 16:19:40 +01:00
Thierry Reding a97fbc3ee3 syscore: Pass context data to callbacks
Several drivers can benefit from registering per-instance data along
with the syscore operations. To achieve this, move the modifiable fields
out of the syscore_ops structure and into a separate struct syscore that
can be registered with the framework. Add a void * driver data field for
drivers to store contextual data that will be passed to the syscore ops.

Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-11-14 10:01:52 +01:00
Lukas Wunner 51d0656959 genirq/manage: Reduce priority of forced secondary interrupt handler
Crystal reports that the PCIe Advanced Error Reporting driver gets stuck
in an infinite loop on PREEMPT_RT:

Both the primary interrupt handler aer_irq() as well as the secondary
handler aer_isr() are forced into threads with identical priority.
Crystal writes that on the ARM system in question, the primary handler
has to clear an error in the Root Error Status register...

   "before the next error happens, or else the hardware will set the
    Multiple ERR_COR Received bit.  If that bit is set, then aer_isr()
    can't rely on the Error Source Identification register, so it scans
    through all devices looking for errors -- and for some reason, on
    this system, accessing the AER registers (or any Config Space above
    0x400, even though there are capabilities located there) generates
    an Unsupported Request Error (but returns valid data).  Since this
    happens more than once, without aer_irq() preempting, it causes
    another multi error and we get stuck in a loop."

The issue does not show on non-PREEMPT_RT because the primary handler
runs in hardirq context and thus can preempt the threaded secondary
handler, clear the Root Error Status register and prevent the secondary
handler from getting stuck.

Emulate the same behavior on PREEMPT_RT by assigning a lower default
priority to the secondary handler if the primary handler is forced into
a thread.

Reported-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Crystal Wood <crwood@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/f6dcdb41be2694886b8dbf4fe7b3ab89e9d5114c.1761569303.git.lukas@wunner.de
Closes: https://lore.kernel.org/r/20250902224441.368483-1-crwood@redhat.com/
2025-11-01 21:30:02 +01:00
Muchun Song 9ea2b810d5 genirq/proc: Fix race in show_irq_affinity()
Reading /proc/irq/N/smp_affinity* races with irq_set_affinity() and
irq_move_masked_irq(), leading to old or torn output for users.

After a user writes a new CPU mask to /proc/irq/N/affinity*, the syscall
returns success, yet a subsequent read of the same file immediately returns
a value different from what was just written.

That's due to a race between show_irq_affinity() and irq_move_masked_irq()
which lets the read observe a transient, inconsistent affinity mask.

Cure it by guarding the read with irq_desc::lock.

[ tglx: Massaged change log ]

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251028090408.76331-1-songmuchun@bytedance.com
2025-10-31 22:30:05 +01:00
Marc Zyngier ee2d50a9f5 genirq: Kill irq_{g,s}et_percpu_devid_partition()
These two helpers do not have any user anymore, and can be removed,
together with the affinity field kept in the irqdesc structure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-25-maz@kernel.org
2025-10-27 17:16:37 +01:00
Marc Zyngier bdf4e2ac29 genirq: Allow per-cpu interrupt sharing for non-overlapping affinities
Interrupt sharing for percpu-devid interrupts is forbidden, and for good
reasons. These are interrupts generated *from* a CPU and handled by itself
(timer, for example). Nobody in their right mind would put two devices on
the same pin (and if they have, they get to keep the pieces...).

But this also prevents more benign cases, where devices are connected
to groups of CPUs, and for which the affinities are not overlapping.
Effectively, the only thing they share is the interrupt number, and
nothing else.

Tweak the definition of IRQF_SHARED applied to percpu_devid interrupts to
allow this particular use case. This results in extra validation at the
point of the interrupt being setup and freed, as well as a tiny bit of
extra complexity for interrupts at handling time (to pick the correct
irqaction).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-17-maz@kernel.org
2025-10-27 17:16:35 +01:00
Marc Zyngier b9c6aa9efc genirq: Update request_percpu_nmi() to take an affinity
Continue spreading the notion of affinity to the per CPU interrupt request
code by updating the call sites that use request_percpu_nmi() (all two of
them) to take an affinity pointer. This pointer is firmly NULL for now.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-16-maz@kernel.org
2025-10-27 17:16:35 +01:00
Marc Zyngier 258e7d28a3 genirq: Add affinity to percpu_devid interrupt requests
Add an affinity field to both the irqaction structure and the interrupt
request primitives. Nothing is making use of it yet, and the only value
used it NULL, which is used as a shorthand for cpu_possible_mask.

This will shortly get used with actual affinities.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-15-maz@kernel.org
2025-10-27 17:16:34 +01:00
Marc Zyngier 9047a39daa genirq: Factor-in percpu irqaction creation
Move the code creating a per-cpu irqaction into its own helper, so that
future changes to this code can be kept localised.

At the same time, fix the documentation which appears to say the wrong
thing when it comes to interrupts being automatically enabled
(percpu_devid interrupts never are).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251020122944.3074811-14-maz@kernel.org
2025-10-27 17:16:34 +01:00
Marc Zyngier 5ff78c8de9 genirq: Kill handle_percpu_devid_fasteoi_nmi()
There is no in-tree user of this flow handler anymore, so simply remove it.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-12-maz@kernel.org
2025-10-27 17:16:34 +01:00
Marc Zyngier 87b0031f7f irqdomain: Add firmware info reporting interface
Add an irqdomain callback to report firmware-provided information that is
otherwise not available in a generic way. This is reported using a new data
structure (struct irq_fwspec_info).

This callback is optional and the only information that can be reported
currently is the affinity of an interrupt. However, the containing
structure is designed to be extensible, allowing other potentially relevant
information to be reported in the future.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251020122944.3074811-2-maz@kernel.org
2025-10-27 17:16:32 +01:00
Charles Keepax ef3330b99c genirq/manage: Add buslock back in to enable_irq()
The locking was changed from a buslock to a plain lock, but the patch
description states there was no functional change. Assuming this was
accidental so reverting to using the buslock.

Fixes: bddd10c554 ("genirq/manage: Rework enable_irq()")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251023154901.1333755-4-ckeepax@opensource.cirrus.com
2025-10-24 11:38:39 +02:00
Charles Keepax 56363e25f7 genirq/manage: Add buslock back in to __disable_irq_nosync()
The locking was changed from a buslock to a plain lock, but the patch
description states there was no functional change. Assuming this was
accidental so reverting to using the buslock.

Fixes: 1b74444467 ("genirq/manage: Rework __disable_irq_nosync()")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251023154901.1333755-3-ckeepax@opensource.cirrus.com
2025-10-24 11:38:39 +02:00
Charles Keepax 5d7e45dd67 genirq/chip: Add buslock back in to irq_set_handler()
The locking was changed from a buslock to a plain lock, but the patch
description states there was no functional change. Assuming this was
accidental so reverting to using the buslock.

Fixes: 5cd05f3e23 ("genirq/chip: Rework irq_set_handler() variants")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251023154901.1333755-2-ckeepax@opensource.cirrus.com
2025-10-24 11:38:39 +02:00
Christophe JAILLET ac646f4495 genirq/msi: Slightly simplify msi_domain_alloc()
The return value of irq_find_mapping() is only tested, not used for
anything else.

Replaced it by irq_resolve_mapping() which is internally used by
irq_find_mapping() and allows a simple boolean decision.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/1ce680114cdb8d40b072c54d7f015696a540e5a6.1760863194.git.christophe.jaillet@wanadoo.fr
2025-10-20 20:18:48 +02:00
Linus Torvalds 3b2074c77d A set of updates for the interrupt core subsystem:
- Introduce irq_chip_[startup|shutdown]_parent() to prepare for
     addressing a few short comings in the PCI/MSI interrupt subsystem.
 
     It allows to utilize the interrupt chip startup/shutdown callbacks for
     initializing the interrupt chip hierarchy properly on certain RISCV
     implementations and provides a mechanism to reduce the overhead of
     masking and unmasking PCI/MSI interrupts during operation when the
     underlying MSI provider can mask the interrupt.
 
     The actual usage comes with the interrupt driver pull request.
 
   - Add generic error handling for devm_request_*_irq()
 
     This allows to remove the zoo of random error printk's all over the
     usage sites.
 
   - Add a mechanism to warn about long-running interrupt handlers
 
     Long running interrupt handlers can introduce latencies and tracking
     them down is a tedious task. The tracking has to be enabled with a
     threshold on the kernel command line and utilizes a static branch to
     remove the overhead when disabled.
 
   - Update and extend the selftests which validate the CPU hotplug
     interrupt migration logic
 
   - Allow dropping the per CPU softirq lock on PREEMPT_RT kernels, which
     causes contention and latencies all over the place.
 
     The serialization requirements have been pushed down into the actual
     affected usage sites already.
 
   - The usual small cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaRdUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZTHD/0anRJCLFFQ8Xtuuu96MDfcvi8m6uhB
 E8gF6lXrljRgEzt33hnqSB4d41PYaOTje4h/Z0nCV2xva8cg6cErzjKlbRU/SMmj
 yEBV5XMHP8Nxog+A5q5OT3lXQ6pT5ZxDZ7l0wven1xX6LbyF63MQFdzFBiLrXnb1
 mOZM7F89FQKhutEMDqVwVLXLmJasNtZK0kBVrdjA/vDBlAk0bYCaPX4BxW1PQIw4
 56KkG1llK221ruT2YbMSv/oFmYulDYbnnM3Aw6stUsR6SPMkQX8Yke2wLQSq+GVV
 RI0f4GTa6XnlIiUivQ77HnV0m0g0Ybkj1PoKeGJIjWDFJ5hqkA0dRQMBRCna68wK
 c3IEWhn2zPl3YxHYMogLwlE4nJAVFUjeEtGPb/Z/mmHaBdd/Y7iqVjTgHJzDvi5Q
 SwgISZEB87vZINpraTuEFNimeIwjCVdBj5ImBHT4T/Av3DFzuvqvxCSP0AlVRdhs
 OdwF5/6jFdPP787+FtytEl/kFYLzf/7R9zVHLcpiCD3eYCU4pHU/ULjUIGxFkSrS
 ltezvfZkGnbqiKTR33nOxWBnM2Fr63FMv0bKMJ97x49TMN6Tw4L8mra/uxuMQSIf
 yRk3jRgUcx0xuigCXcb62dip+2WnsGjnaDj30yOs/d1Ab2eALrJCo0e1tgdK2JDu
 VgCX3xCbvj/NzA==
 =GZty
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "A set of updates for the interrupt core subsystem:

   - Introduce irq_chip_[startup|shutdown]_parent() to prepare for
     addressing a few short comings in the PCI/MSI interrupt subsystem.

     It allows to utilize the interrupt chip startup/shutdown callbacks
     for initializing the interrupt chip hierarchy properly on certain
     RISCV implementations and provides a mechanism to reduce the
     overhead of masking and unmasking PCI/MSI interrupts during
     operation when the underlying MSI provider can mask the interrupt.

     The actual usage comes with the interrupt driver pull request.

   - Add generic error handling for devm_request_*_irq()

     This allows to remove the zoo of random error printk's all over the
     usage sites.

   - Add a mechanism to warn about long-running interrupt handlers

     Long running interrupt handlers can introduce latencies and
     tracking them down is a tedious task. The tracking has to be
     enabled with a threshold on the kernel command line and utilizes a
     static branch to remove the overhead when disabled.

   - Update and extend the selftests which validate the CPU hotplug
     interrupt migration logic

   - Allow dropping the per CPU softirq lock on PREEMPT_RT kernels,
     which causes contention and latencies all over the place.

     The serialization requirements have been pushed down into the
     actual affected usage sites already.

   - The usual small cleanups and improvements"

* tag 'irq-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT
  softirq: Provide a handshake for canceling tasklets via polling
  genirq/test: Ensure CPU 1 is online for hotplug test
  genirq/test: Drop CONFIG_GENERIC_IRQ_MIGRATION assumptions
  genirq/test: Depend on SPARSE_IRQ
  genirq/test: Fail early if interrupt request fails
  genirq/test: Factor out fake-virq setup
  genirq/test: Select IRQ_DOMAIN
  genirq/test: Fix depth tests on architectures with NOREQUEST by default.
  genirq: Add support for warning on long-running interrupt handlers
  genirq/devres: Add error handling in devm_request_*_irq()
  genirq: Add irq_chip_(startup/shutdown)_parent()
  genirq: Remove GENERIC_IRQ_LEGACY
2025-09-30 15:55:25 -07:00
Nam Cao 91daac8a68 genirq/msi: Remove msi_post_free()
The only user of msi_post_free() - powerpc/pseries - has been changed to
use msi_teardown().

Remove this unused callback.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250916061007.964005-1-namcao@linutronix.de
2025-09-23 14:29:51 +05:30
Brian Norris 8ad25ebfa7 genirq/test: Ensure CPU 1 is online for hotplug test
It's possible to run these tests on platforms that think they have a
hotpluggable CPU1, but for whatever reason, CPU1 is not online and can't be
brought online:

    # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:210
    Expected remove_cpu(1) == 0, but
        remove_cpu(1) == 1 (0x1)
CPU1: failed to boot: -38
    # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:214
    Expected add_cpu(1) == 0, but
        add_cpu(1) == -38 (0xffffffffffffffda)

Check that CPU1 is actually online before trying to run the test.

Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20250822190140.2154646-7-briannorris@chromium.org
2025-09-03 17:04:52 +02:00
Brian Norris add03fdb9d genirq/test: Drop CONFIG_GENERIC_IRQ_MIGRATION assumptions
Not all platforms use the generic IRQ migration code, even if they select
GENERIC_IRQ_MIGRATION. (See, for example, powerpc / pseries_cpu_disable().)

If such platforms don't perform managed shutdown the same way, the interrupt
may not actually shut down, and these tests fail:

[    4.357022][  T101]     # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:211
[    4.357022][  T101]     Expected irqd_is_activated(data) to be false, but is true
[    4.358128][  T101]     # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:212
[    4.358128][  T101]     Expected irqd_is_started(data) to be false, but is true
[    4.375558][  T101]     # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:216
[    4.375558][  T101]     Expected irqd_is_activated(data) to be false, but is true
[    4.376088][  T101]     # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:217
[    4.376088][  T101]     Expected irqd_is_started(data) to be false, but is true
[    4.377851][    T1]     # irq_cpuhotplug_test: pass:0 fail:1 skip:0 total:1
[    4.377901][    T1]     not ok 4 irq_cpuhotplug_test
[    4.378073][    T1] # irq_test_cases: pass:3 fail:1 skip:0 total:4

Rather than test that PowerPC performs migration the same way as the
unterrupt core, just drop the state checks. The point of the test was to
ensure that the code kept |depth| balanced, which still can be tested for.

Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20250822190140.2154646-6-briannorris@chromium.org
2025-09-03 17:04:52 +02:00