mirror-linux

Commit Graph

Author	SHA1	Message	Date
Bjorn Helgaas	35dbfcb7dc	Merge branch 'pci/hotplug' - Clean up whitespace in messages (Colin Ian King) * pci/hotplug: PCI: hotplug: Clean up spaces in messages	2025-10-03 12:13:08 -05:00
Bjorn Helgaas	31e04a46ce	Merge branch 'pci/enumeration' - Use PCI_HEADER_TYPE_* defines, not hard-coded values (Ilpo Järvinen) - Clean up early_dump_pci_device() to avoid hard-coded values (Ilpo Järvinen) - Clean up pci_scan_child_bus_extend() loop to avoid hard-coded values (Ilpo Järvinen) - Add a Xeon 6 quirk to disable Extended Tags and limit Max Read Request Size to 128B to avoid a performance issue (Ilpo Järvinen) * pci/enumeration: PCI: Add Extended Tag + MRRS quirk for Xeon 6 PCI: Clean up pci_scan_child_bus_extend() loop PCI: Clean up early_dump_pci_device() PCI: Use header type defines in pci_setup_device()	2025-10-03 12:13:08 -05:00
Bjorn Helgaas	7cc5e1e62b	Merge branch 'pci/aspm' - Enable all ClockPM and ASPM states for devicetree platforms, since there's typically no firmware that enables ASPM (Manivannan Sadhasivam) - Remove the qcom code that enabled ASPM (Manivannan Sadhasivam) * pci/aspm: PCI: qcom: Remove custom ASPM enablement code PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms	2025-10-03 12:13:07 -05:00
Bjorn Helgaas	a0d0cad13f	Merge branch 'pci/aer' - Allow drivers to request a Bus Reset on Non-Fatal Errors (Lukas Wunner) - Send uevents for subordinate devices (not the bridge) on failure to recover from errors on the subordinate devices (Lukas Wunner) - Notify drivers by calling their err_handler.error_detected() callback on failure to recover (Lukas Wunner) - Update device error_state earlier after reset to align AER and EEH error recovery (Lukas Wunner) - Remove obsolete comments about .link_reset(), which was removed long ago (Lukas Wunner) - Emit a uevent for the beginning of error recovery if driver requests a reset (Niklas Schnelle) - Emit error recover uevents on s390 as is done by EEH and AER (Niklas Schnelle) - Include error_detected() result in AER uevent to align with corresponding uevents from EEH and s390 (Niklas Schnelle) - Decode new errors added in PCIe r6.0 (Lukas Wunner) - Print TLP Log for errors introduced since PCIe spec r1.1 (Lukas Wunner) - Check for allocation failure in pci_aer_init() (Vernon Yang) - Update error recovery documentation to match the current code and use consistent nomenclature (Lukas Wunner) - Avoid NULL pointer dereference in aer_ratelimit() when GHES error info points to a device with no AER Capability (Breno Leitao) * pci/aer: PCI/AER: Avoid NULL pointer dereference in aer_ratelimit() Documentation: PCI: Fix typos Documentation: PCI: Tidy error recovery doc's PCIe nomenclature Documentation: PCI: Amend error recovery doc with DPC/AER specifics Documentation: PCI: Sync error recovery doc with code Documentation: PCI: Sync AER doc with code PCI/AER: Fix NULL pointer access by aer_info PCI/AER: Print TLP Log for errors introduced since PCIe r1.1 PCI/AER: Support errors introduced by PCIe r6.0 powerpc/eeh: Use result of error_detected() in uevent s390/pci: Use pci_uevent_ers() in PCI recovery PCI/AER: Fix missing uevent on recovery when a reset is requested PCI/ERR: Remove remnants of .link_reset() callback PCI/ERR: Update device error_state already after reset PCI/ERR: Notify drivers on failure to recover PCI/ERR: Fix uevent on failure to recover PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors	2025-10-03 12:13:07 -05:00
Breno Leitao	deb2f22838	PCI/AER: Avoid NULL pointer dereference in aer_ratelimit() When platform firmware supplies error information to the OS, e.g., via the ACPI APEI GHES mechanism, it may identify an error source device that doesn't advertise an AER Capability and therefore dev->aer_info, which contains AER stats and ratelimiting data, is NULL. pci_dev_aer_stats_incr() already checks dev->aer_info for NULL, but aer_ratelimit() did not, leading to NULL pointer dereferences like this one from the URL below: {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0 {1}[Hardware Error]: event severity: corrected {1}[Hardware Error]: device_id: 0000:00:00.0 {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2020 {1}[Hardware Error]: aer_cor_status: 0x00001000, aer_cor_mask: 0x00002000 BUG: kernel NULL pointer dereference, address: 0000000000000264 RIP: 0010:___ratelimit+0xc/0x1b0 pci_print_aer+0x141/0x360 aer_recover_work_func+0xb5/0x130 [8086:2020] is an Intel "Sky Lake-E DMI3 Registers" device that claims to be a Root Port but does not advertise an AER Capability. Add a NULL check in aer_ratelimit() to avoid the NULL pointer dereference. Note that this also prevents ratelimiting these events from GHES. Fixes: `a57f2bfb4a` ("PCI/AER: Ratelimit correctable and non-fatal error logging") Link: https://lore.kernel.org/r/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/ Signed-off-by: Breno Leitao <leitao@debian.org> [bhelgaas: add crash details to commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250929-aer_crash_2-v1-1-68ec4f81c356@debian.org	2025-10-02 09:35:10 -05:00
Manivannan Sadhasivam	a729c16646	PCI: qcom: Remove custom ASPM enablement code Since the PCI subsystem has started enabling all ASPM states for all devicetree based platforms, the ASPM enablement code from this driver can now be dropped. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250922-pci-dt-aspm-v2-2-2a65cf84e326@oss.qualcomm.com	2025-09-23 18:07:01 -05:00
Manivannan Sadhasivam	f3ac2ff148	PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms So far, the PCI subsystem has honored the ASPM and Clock PM states set by the BIOS (through LNKCTL) during device initialization, if it relies on the default state selected using: * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or * cmdline: "pcie_aspm=off", or * FADT: ACPI_FADT_NO_ASPM This was done conservatively to avoid issues with the buggy devices that advertise ASPM capabilities, but behave erratically if the ASPM states are enabled. So the PCI subsystem ended up trusting the BIOS to enable only the ASPM states that were known to work for the devices. But this turned out to be a problem for devicetree platforms, especially the ARM based devicetree platforms powering Embedded and some Compute devices as they tend to run without any standard BIOS. So the ASPM states on these platforms were left disabled during boot and the PCI subsystem never bothered to enable them, unless the user has forcefully enabled the ASPM states through Kconfig, cmdline, and sysfs or the device drivers themselves, enabling the ASPM states through pci_enable_link_state() APIs. This caused runtime power issues on those platforms. So a couple of approaches were tried to mitigate this BIOS dependency without user intervention by enabling the ASPM states in the PCI controller drivers after device enumeration, and overriding the ASPM/Clock PM states by the PCI controller drivers through an API before enumeration. But it has been concluded that none of these mitigations should really be required and the PCI subsystem should enable the ASPM states advertised by the devices without relying on BIOS or the PCI controller drivers. If any device is found to be misbehaving after enabling ASPM states that they advertised, then those devices should be quirked to disable the problematic ASPM/Clock PM states. In an effort to do so, start by overriding the ASPM and Clock PM states set by the BIOS for devicetree platforms first. Separate helper functions are introduced to override the BIOS set states by enabling all of them if of_have_populated_dt() returns true. To aid debugging, print the overridden ASPM and Clock PM states as well. In the future, these helpers could be extended to allow other platforms like VMD, newer ACPI systems with a cutoff year etc... to follow the path. Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> [bhelgaas: tweak comments and dmesg logs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250922-pci-dt-aspm-v2-1-2a65cf84e326@oss.qualcomm.com	2025-09-23 18:06:33 -05:00
Emilio Perez	8a760c454d	Documentation: PCI: Fix typos The PCIe spec uses "Requester ID", not "requestor ID". Follow the spec to avoid confusion. Signed-off-by: Emilio Perez <emiliopeju@gmail.com> [bhelgaas: capitalize as a hint that the spec defines this] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250818023121.33427-1-emiliopeju@gmail.com	2025-09-18 17:23:22 -05:00
Lukas Wunner	40103d2a42	Documentation: PCI: Tidy error recovery doc's PCIe nomenclature Commit `11502feab4` ("Documentation: PCI: Tidy AER documentation") replaced the terms "PCI-E", "PCI-Express" and "PCI Express" with "PCIe" in the AER documentation. Do the same in the documentation on PCI error recovery. While at it, add a missing period and a missing blank. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/db56b7ef12043f709a04ce67c1d1e102ab5f4e19.1757942121.git.lukas@wunner.de	2025-09-16 10:53:40 -05:00
Lukas Wunner	15dea68d41	Documentation: PCI: Amend error recovery doc with DPC/AER specifics Amend the documentation on PCI error recovery with specifics about Downstream Port Containment and Advanced Error Reporting: * Explain that with DPC, devices are inaccessible upon an error (similar to EEH on powerpc) and do not become accessible until the link is re-enabled. * Explain that with AER, although devices may already be accessible in the ->error_detected() callback, accesses should be deferred to the ->mmio_enabled() callback for compatibility with EEH on powerpc and with s390. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/61d8eeadb20ee71c3a852f44c863bfe0209c454d.1757942121.git.lukas@wunner.de	2025-09-16 10:53:25 -05:00
Lukas Wunner	8e4a13fc61	Documentation: PCI: Sync error recovery doc with code Amend the documentation on PCI error recovery to fix minor inaccuracies vis-à-vis the actual code: * The documentation claims that a missing ->resume() or ->mmio_enabled() callback always leads to recovery through reset. But none of the implementations do this (pcie_do_recovery(), eeh_handle_normal_event(), zpci_event_do_error_state_clear()). Drop the claim to align the documentation with the code. * The documentation does not list PCI_ERS_RESULT_RECOVERED as a valid return value from ->error_detected(). But none of the implementations forbid this and some drivers are returning it, e.g.: drivers/bus/mhi/host/pci_generic.c drivers/infiniband/hw/hfi1/pcie.c Further down in the documentation it is implied that the return value is in fact allowed: "The platform will call the resume() callback on all affected device drivers if all drivers on the segment have returned PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks." The "3 previous callbacks" being ->error_detected(), ->mmio_enabled() and ->slot_reset(). Add it to the valid return values for consistency. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/ed3c3385499775fcc25f1ee66f395e212919f94a.1757942121.git.lukas@wunner.de	2025-09-16 10:53:03 -05:00
Lukas Wunner	01cc0dc9de	Documentation: PCI: Sync AER doc with code The PCIe Advanced Error Reporting driver has evolved over the years but its documentation hasn't. Catch up with past code changes: * The documentation claims that Correctable Errors are logged with KERN_INFO severity, but the code uses KERN_WARN. It had used KERN_WARN from the beginning with commit `6c2b374d74` ("PCI-Express AER implemetation: AER core and aerdriver"). In 2013, commit `2cced2d959` ("aerdrv: Cleanup log output for AER") switched to KERN_ERR, until 2020 when it was reverted back to KERN_WARN by commit `e83e2ca3c3` ("PCI/AER: Log correctable errors as warning, not error"). * An example log message in the documentation uses the term "Uncorrected", but the code uses "Uncorrectable" since commit `02a06f5f1a` ("PCI/AER: Use 'Correctable' and 'Uncorrectable' spec terms for errors"). * The example contains the Requester ID "id=0500", which is omitted since commit `010caed4cc` ("PCI/AER: Decode Error Source Requester ID"). * The example contains the error name "Unsupported Request", which is instead reported as "UnsupReq" since commit `bd237801fe` ("PCI/AER: Adopt lspci names for AER error decoding"). * The example doesn't prepend "0x" to hex values from the TLP Header Log, as introduced by commit `f68ea779d9` ("PCI: Add pcie_print_tlp_log() to print TLP Header and Prefix Log"). * The documentation refers to a reset_link callback which was removed by commit `b6cf1a42f9` ("PCI/ERR: Remove service dependency in pcie_do_recovery()"). * Commit `5790862255` ("PCI/ERR: Recover from RCiEP AER errors") added support to recover Root Complex Integrated Endpoints by applying a Function Level Reset, alternatively to the Secondary Bus Reset which is applied otherwise. * On non-fatal errors, a reset was previously never performed. But the AER driver has just been amended to allow drivers to opt in to a reset. * The documentation claims that a warning message is logged if a driver lacks pci_error_handlers. But the message has been informational (logged with KERN_INFO severity) since its introduction with commit `01daacfb90` ("PCI/AER: Log which device prevents error recovery"). The documentation claims that the message is only logged for fatal errors, which is incorrect. Moreover it refers to "section 3", even though the documentation no longer contains section numbers since commit `4e37f055a9` ("Documentation: PCI: convert pcieaer-howto.txt to reST"). Section 3 is titled "Developer Guide". That's the same section where the reference is located, so it is self-referential and can be dropped. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/7501bfc5b9920193a25998a3cbcf72c47674ec63.1757942121.git.lukas@wunner.de	2025-09-16 10:52:07 -05:00
Vernon Yang	0a27bdb14b	PCI/AER: Fix NULL pointer access by aer_info The kzalloc(GFP_KERNEL) may return NULL, so all accesses to aer_info->xxx will result in kernel panic. Fix it. Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250904182527.67371-1-vernon2gm@gmail.com	2025-09-11 17:55:33 -05:00
Lukas Wunner	c8ab5e888b	PCI/AER: Print TLP Log for errors introduced since PCIe r1.1 When reporting an error, the AER driver prints the TLP Header / Prefix Log only for errors enumerated in the AER_LOG_TLP_MASKS macro. The macro was never amended since its introduction in 2006 with commit `6c2b374d74` ("PCI-Express AER implemetation: AER core and aerdriver"). At the time, PCIe r1.1 was the latest spec revision. Amend the macro with errors defined since then to avoid omitting the TLP Header / Prefix Log for newer errors. The order of the errors in AER_LOG_TLP_MASKS follows PCIe r1.1 sec 6.2.7 rather than 7.10.2, because only the former documents for which errors a TLP Header / Prefix is logged. Retain this order. The section number is still 6.2.7 in today's PCIe r7.0. For Completion Timeouts, the TLP Header / Prefix is only logged if the Completion Timeout Prefix / Header Log Capable bit is set in the AER Capabilities and Control register. Introduce a tlp_header_logged() helper to check whether the TLP Header / Prefix Log is populated and use it in the two places which currently match against AER_LOG_TLP_MASKS directly. For Uncorrectable Internal Errors, logging of the TLP Header / Prefix is optional per PCIe r7.0 sec 6.2.7. If needed, drivers could indicate through a flag whether devices are capable and tlp_header_logged() could then check that flag. pcitools introduced macros for newer errors with commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"): https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b Unfortunately some of those macros are overly long: PCI_ERR_UNC_POISONED_TLP_EGRESS PCI_ERR_UNC_DMWR_REQ_EGRESS_BLOCKED PCI_ERR_UNC_IDE_CHECK PCI_ERR_UNC_MISR_IDE_TLP PCI_ERR_UNC_PCRC_CHECK PCI_ERR_UNC_TLP_XLAT_EGRESS_BLOCKED This seems unsuitable for <linux/pci_regs.h>, so shorten to: PCI_ERR_UNC_POISON_BLK PCI_ERR_UNC_DMWR_BLK PCI_ERR_UNC_IDE_CHECK PCI_ERR_UNC_MISR_IDE PCI_ERR_UNC_PCRC_CHECK PCI_ERR_UNC_XLAT_BLK Note that some of the existing macros in <linux/pci_regs.h> do not match exactly with pcitools (e.g. PCI_ERR_UNC_SDES versus PCI_ERR_UNC_SURPDN), so it does not seem mandatory for them to be identical. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/5f707caf1260bd8f15012bb032f7da9a9b898aba.1756712066.git.lukas@wunner.de	2025-09-04 10:09:05 -05:00
Lukas Wunner	6633875250	PCI/AER: Support errors introduced by PCIe r6.0 PCIe r6.0 defined five additional errors in the Uncorrectable Error Status, Mask and Severity Registers (PCIe r7.0 sec 7.8.4.2ff). lspci has been supporting them since commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"): https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b Amend the AER driver to recognize them as well, instead of logging them as "Unknown Error Bit". Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/21f1875b18d4078c99353378f37dcd6b994f6d4e.1756301211.git.lukas@wunner.de	2025-08-27 16:15:24 -05:00
Niklas Schnelle	704e5dd1c0	powerpc/eeh: Use result of error_detected() in uevent Ever since uevent support was added for AER and EEH with commit `856e1eb9bd` ("PCI/AER: Add uevents in AER and EEH error/resume"), it reported PCI_ERS_RESULT_NONE as uevent when recovery begins. Commit `7b42d97e99` ("PCI/ERR: Always report current recovery status for udev") subsequently amended AER to report the actual return value of error_detected(). Make the same change to EEH to align it with AER and s390. Suggested-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/linux-pci/aIp6LiKJor9KLVpv@wunner.de/ Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Link: https://patch.msgid.link/20250807-add_err_uevents-v5-3-adf85b0620b0@linux.ibm.com	2025-08-14 15:58:11 -05:00
Niklas Schnelle	dab32f2576	s390/pci: Use pci_uevent_ers() in PCI recovery Issue uevents on s390 during PCI recovery using pci_uevent_ers() as done by EEH and AER PCIe recovery routines. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Link: https://patch.msgid.link/20250807-add_err_uevents-v5-2-adf85b0620b0@linux.ibm.com	2025-08-14 15:56:28 -05:00
Niklas Schnelle	bbf7d0468d	PCI/AER: Fix missing uevent on recovery when a reset is requested Since commit `7b42d97e99` ("PCI/ERR: Always report current recovery status for udev") AER uses the result of error_detected() as parameter to pci_uevent_ers(). As pci_uevent_ers() however does not handle PCI_ERS_RESULT_NEED_RESET this results in a missing uevent for the beginning of recovery if drivers request a reset. Fix this by treating PCI_ERS_RESULT_NEED_RESET as beginning recovery. Fixes: `7b42d97e99` ("PCI/ERR: Always report current recovery status for udev") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250807-add_err_uevents-v5-1-adf85b0620b0@linux.ibm.com	2025-08-14 15:55:58 -05:00
Lukas Wunner	cc4a7a21e8	PCI/ERR: Remove remnants of .link_reset() callback Back in 2017, commit `2fd260f03b` ("PCI/AER: Remove unused .link_reset() callback") removed .link_reset() from struct pci_error_handlers, but left a few code comments behind which still mention it. Remove them. The code comments in the SolarFlare Ethernet drivers point out that no .mmio_enabled() callback is needed because the driver's .error_detected() callback always returns PCI_ERS_RESULT_NEED_RESET, which causes pcie_do_recovery() to skip .mmio_enabled(). That's not quite correct because efx_io_error_detected() does return PCI_ERS_RESULT_RECOVERED under certain conditions and then .mmio_enabled() would indeed be called if it were implemented. Remove this misleading portion of the code comment as well. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1d72a891a7f57115e78a73046e776f7e0c8cd68f.1755008151.git.lukas@wunner.de	2025-08-13 13:21:02 -05:00
Lukas Wunner	45bc82563d	PCI/ERR: Update device error_state already after reset After a Fatal Error has been reported by a device and has been recovered through a Secondary Bus Reset, AER updates the device's error_state to pci_channel_io_normal before invoking its driver's ->resume() callback. By contrast, EEH updates the error_state earlier, namely after resetting the device and before invoking its driver's ->slot_reset() callback. Commit `c58dc575f3` ("powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()") explains in great detail that the earlier invocation is necessitated by various drivers checking accessibility of the device with pci_channel_offline() and avoiding accesses if it returns true. It returns true for any other error_state than pci_channel_io_normal. The device should be accessible already after reset, hence the reasoning is that it's safe to update the error_state immediately afterwards. This deviation between AER and EEH seems problematic because drivers behave differently depending on which error recovery mechanism the platform uses. Three drivers have gone so far as to update the error_state themselves, presumably to work around AER's behavior. For consistency, amend AER to update the error_state at the same recovery steps as EEH. Drop the now unnecessary workaround from the three drivers. Keep updating the error_state before ->resume() in case ->error_detected() or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes ->slot_reset() to be skipped. There are drivers doing this even for Fatal Errors, e.g. mhi_pci_error_detected(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de	2025-08-13 13:20:56 -05:00
Lukas Wunner	9011f0667c	PCI/ERR: Notify drivers on failure to recover According to Documentation/PCI/pci-error-recovery.rst, the following shall occur on failure to recover from a PCIe Uncorrectable Error: STEP 6: Permanent Failure ------------------------- A "permanent failure" has occurred, and the platform cannot recover the device. The platform will call error_detected() with a pci_channel_state_t value of pci_channel_io_perm_failure. The device driver should, at this point, assume the worst. It should cancel all pending I/O, refuse all new I/O, returning -EIO to higher layers. The device driver should then clean up all of its memory and remove itself from kernel operations, much as it would during system shutdown. Sathya notes that AER does not call error_detected() on failure and thus deviates from the document (as well as EEH, for which the document was originally added). Most drivers do nothing on permanent failure, but the SCSI drivers and a number of Ethernet drivers do take advantage of the notification to flush queues and give up resources. Amend AER to notify such drivers and align with the documentation and EEH. Link: https://lore.kernel.org/r/f496fc0f-64d7-46a4-8562-dba74e31a956@linux.intel.com/ Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/ec212d4d4f5c65d29349df33acdc9768ff8279d1.1755008151.git.lukas@wunner.de	2025-08-13 13:20:46 -05:00
Lukas Wunner	1cbc5e25fb	PCI/ERR: Fix uevent on failure to recover Upon failure to recover from a PCIe error through AER, DPC or EDR, a uevent is sent to inform user space about disconnection of the bridge whose subordinate devices failed to recover. However the bridge itself is not disconnected. Instead, a uevent should be sent for each of the subordinate devices. Only if the "bridge" happens to be a Root Complex Event Collector or Integrated Endpoint does it make sense to send a uevent for it (because there are no subordinate devices). Right now if there is a mix of subordinate devices with and without pci_error_handlers, a BEGIN_RECOVERY event is sent for those with pci_error_handlers but no FAILED_RECOVERY event is ever sent for them afterwards. Fix it. Fixes: `856e1eb9bd` ("PCI/AER: Add uevents in AER and EEH error/resume") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.16+ Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de	2025-08-13 13:20:35 -05:00
Lukas Wunner	d0a2dee7d4	PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors When Advanced Error Reporting was introduced in September 2006 by commit `6c2b374d74` ("PCI-Express AER implemetation: AER core and aerdriver"), it sought to adhere to the recovery flow and callbacks specified in Documentation/PCI/pci-error-recovery.rst. That document had been added in January 2006, when Enhanced Error Handling (EEH) was introduced for PowerPC with commit `065c635907` ("[PATCH] PCI Error Recovery: documentation"). However the AER driver deviates from the document in that it never performs a Secondary Bus Reset on Non-Fatal Errors, but always on Fatal Errors. By contrast, EEH allows drivers to opt in or out of a Bus Reset regardless of error severity, by returning PCI_ERS_RESULT_NEED_RESET or PCI_ERS_RESULT_CAN_RECOVER from their ->error_detected() callback. If all drivers agree that they can recover without a Bus Reset, EEH skips it. Should one of them request a Bus Reset, it overrides all other drivers. This inconsistency between EEH and AER seems problematic because drivers need to be aware of and cope with it. The file Documentation/PCI/pcieaer-howto.rst hints at a rationale for always performing a Bus Reset on Fatal Errors: "Fatal errors [...] cause the link to be unreliable. [...] This [reset_link] callback is used to reset the PCIe physical link when a fatal error happens. If an error message indicates a fatal error, [...] performing link reset at upstream is necessary." There's no such rationale provided for never performing a Bus Reset on Non-Fatal Errors. The "xe" driver has a need to attempt a reset of local units on graphics cards upon a Non-Fatal Error. If that is insufficient for recovery, the driver wants to opt in to a Bus Reset. Accommodate such use cases and align AER more closely with EEH by performing a Bus Reset in pcie_do_recovery() if drivers request it and the faulting device's channel_state is pci_channel_io_normal. The AER driver sets this channel_state for Non-Fatal Errors. For Fatal Errors, it uses pci_channel_io_frozen. This limits the deviation from Documentation/PCI/pci-error-recovery.rst and EEH to the unconditional Bus Reset on Fatal Errors. pcie_do_recovery() is also invoked by the Downstream Port Containment and Error Disconnect Recover drivers. They both set the channel_state to pci_channel_io_frozen, hence pcie_do_recovery() continues to always invoke the ->reset_subordinates() callback in their case. That is necessary because the callback brings the link back up at the containing Downstream Port. There are two behavioral changes resulting from this commit: First, if channel_state is pci_channel_io_normal and one of the affected drivers returns PCI_ERS_RESULT_NEED_RESET from its ->error_detected() callback, a Bus Reset will now be performed. There are drivers doing this and although it would be possible to avoid a behavioral change by letting them return PCI_ERS_RESULT_CAN_RECOVER instead, the impression I got from examination of all drivers is that they actually expect or want a Bus Reset (cxl_error_detected() is a case in point). In any case, if they can cope with a Bus Reset on Fatal Errors, they shouldn't have issues with a Bus Reset on Non-Fatal Errors. Second, if channel_state is pci_channel_io_frozen and all affected drivers return PCI_ERS_RESULT_CAN_RECOVER from ->error_detected(), their ->mmio_enabled() callback is now invoked prior to performing a Bus Reset, instead of afterwards. This actually makes sense: For example, drivers/scsi/sym53c8xx_2/sym_glue.c dumps debug registers in its ->mmio_enabled() callback. Doing so after reset right now captures the post-reset state instead of the faulting state, which is useless. There is only one other driver which implements ->mmio_enabled() and returns PCI_ERS_RESULT_CAN_RECOVER from ->error_detected() for channel_state pci_channel_io_frozen, drivers/scsi/ipr.c (IBM Power RAID). It appears to only be used on EEH platforms. So the second behavioral change is limited to these two drivers. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/28fd805043bb57af390168d05abb30898cf4fc58.1755008151.git.lukas@wunner.de	2025-08-13 13:20:26 -05:00
Colin Ian King	1d33d9e46c	PCI: hotplug: Clean up spaces in messages Remove extraneous space before newline in messages. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> [bhelgaas: squash into one patch, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250731100017.2165781-1-colin.i.king@gmail.com Link: https://patch.msgid.link/20250731101036.2167812-1-colin.i.king@gmail.com	2025-08-11 15:01:50 -05:00
Ilpo Järvinen	a22250fe93	PCI: Add Extended Tag + MRRS quirk for Xeon 6 When bifurcated to x2, Xeon 6 Root Port performance is sensitive to the configuration of Extended Tags, Max Read Request Size (MRRS), and 10-Bit Tag Requester (note: there is currently no 10-Bit Tag support in the kernel). While those can be configured to the recommended values by FW, kernel may decide to overwrite the initial values. Add a quirk that disallows enabling Extended Tags and setting MRRS larger than 128B for devices under Xeon 6 Root Ports if the Root Port is bifurcated to x2. Use the host bridge's enable_device hook to overwrite MRRS if it's set to >128B for the device to be enabled. The earlier attempts to implement this quirk polluted PCI core code with the checks necessary to support this quirk. Using the enable_device hook keeps the quirk well-contained, away from the PCI core code. Suggested-by: Lukas Wunner <lukas@wunner.de> Link: https://cdrdv2.intel.com/v1/dl/getContent/837176 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: 2x -> x2, rename quirk] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250610114802.7460-1-ilpo.jarvinen@linux.intel.com	2025-08-11 15:00:51 -05:00
Ilpo Järvinen	c763fae8c4	PCI: Clean up pci_scan_child_bus_extend() loop pci_scan_child_bus_extend() open-codes device number iteration in the for loop. Convert to use PCI_DEVFN() and add PCI_MAX_NR_DEVS (there seems to be no pre-existing define for this purpose). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/20250610105820.7126-3-ilpo.jarvinen@linux.intel.com	2025-08-11 15:00:51 -05:00
Ilpo Järvinen	aa84931ba7	PCI: Clean up early_dump_pci_device() Convert 256 to PCI_CFG_SPACE_SIZE and 4 to sizeof(u32) and avoid i / 4 construct by changing the iteration. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/20250610105820.7126-2-ilpo.jarvinen@linux.intel.com	2025-08-11 15:00:51 -05:00
Ilpo Järvinen	24b2d3c452	PCI: Use header type defines in pci_setup_device() Replace literals with PCI_HEADER_TYPE_* defines in pci_setup_device(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/20250610105820.7126-1-ilpo.jarvinen@linux.intel.com	2025-08-11 15:00:51 -05:00
Linus Torvalds	8f5ae30d69	Linux 6.17-rc1	2025-08-10 19:41:16 +03:00
Linus Torvalds	2b38afce25	tools/power turbostat: version 2025.09.09 Probe and display L3 Cache topology Add ability to average an added counter (useful for pre-integrated "counters", such as Watts) Break the limit of 64 built-in counters. Assorted bug fixes and minor feature tweaks -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEE67dNfPFP+XUaA73mB9BFOha3NhcFAmiX9oQUHGxlbi5icm93 bkBpbnRlbC5jb20ACgkQB9BFOha3NhfNYg/+Lh6tMh84N0ziNpX31mWCJChlGJXq 7vR4J8E9GSO7Ixz9HGGQxvPAXp/FNuOCYJ6LNOpyzauoxwtF836MABGhQveBUzfu /0wmkN/DEUb0FTRHPR1LiGDarl42g8CsaQfrCzAvO9WUviOHOicfM6duk5hVItQd 5QotmHuLJtEbwxnYdVZW5FbXFMFU/C1z/8zk3VcoW8H2gV3qR/MXuzyOcp8C2pdU x7/i5FlAOEabhL8liOx1x8OcCo5NGne+eV7tr1SE6Duykg9aIL3o/KdfQhEI7uF0 f8Ya7Sol0d6kFTJNnSOPWa5QNkLOW5ib4iTyDkvaFHY8CeakMLNjIkdcLXKRsYCs yWszWtUMECC/GprDwl5Aq77a54p/2gp6Ntekhn1aWw0/jhBIf/ZTAiFA2OcG3Ikd RWAn3veaVRNxrHA6Ck7US/sJAiE3VNod+eIFA+/4NaQEpVJdtHlkwNwjwgF6pwaM PakryoT4v6ZfC7FfCBH3wMSlWmO5612zNqQM35yWuGMXMBHZKcuSQUliXkN5KfpX /ShjYbBUhiK6MiswZaXsWEdocqjX4t4QHRGAi75mGDpBey7gd62RQH+8R5bcLiVH 258Y/Zg+yvHrcq3rmpK0tF/nTI+WrXlcsv6sGHW/QJ3Fet0iXlI9kLtzRANwkcYk Zfvwnp9BP3Q6FRg= =1iYa -----END PGP SIGNATURE----- Merge tag 'turbostat-2025.09.09' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: "tools/power turbostat: version 2025.09.09 - Probe and display L3 Cache topology - Add ability to average an added counter (useful for pre-integrated "counters", such as Watts) - Break the limit of 64 built-in counters - Assorted bug fixes and minor feature tweaks" * tag 'turbostat-2025.09.09' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: version 2025.09.09 tools/power turbostat: Handle non-root legacy-uncore sysfs permissions tools/power turbostat: standardize PER_THREAD_PARAMS tools/power turbostat: Fix DMR support tools/power turbostat: add format "average" for external attributes tools/power turbostat: delete GET_PKG() tools/power turbostat: probe and display L3 cache topology tools/power turbostat: Support more than 64 built-in-counters tools/power turbostat.8: Document Totl%C0, Any%C0, GFX%C0, CPUGFX% columns tools/power turbostat: Fix bogus SysWatt for forked program tools/power turbostat: Handle cap_get_proc() ENOSYS tools/power turbostat: Fix build with musl tools/power turbostat: verify arguments to params --show and --hide tools/power turbostat: regression fix: --show C1E%	2025-08-10 09:02:36 +03:00
Linus Torvalds	b96ddbc5c8	- Remove an obsolete comment and fix spelling -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXmjUACgkQEsHwGGHe VUohIg/9FA8vG1+KBjC6njb8M4tmKiIpS2/OLfdurdp0MFCnxgsAifHbB4Lfh0Aq RQO/d3kQ7MrdWS8vuR7AHBTq6ceGuKwIpcikJOvloohlTcHslxpxXDedKdTeSx4L CgV88uZbiYppZwFPafC/KOEkMKb+gd/WhnzHQmhIzMk/Cqtv55qaATTdk4Qmai02 rPcLoS1Zhcs+t/4z9OleELZE9dcNTK2ZpOhHe+ygWZjWQizXAA8hJCyr8zc8UPhu +JLj5y2QofFa9XqADk54HzGidR/KKM9or8dsg2X1i4vO8UL26cKHry4NOJOUCblV AJptZzFZ3BU0kOyr0RVgT/dCSupl/ujGyiB9B3IuvxYqPAAt11KTbRrY0SoDd2w1 7XGkm4PQiUikA0VFBKjsoeaJg+portp/TJ4O9r7etRgy+qgguLzZZzaZw1zlGEon juPlcTpdEwZND7XtnjtG9hIdJlih8NuJwKWSlmRuoWtzyCR5eGEnxOaD8pgTflr8 MR5mQlkOgoXQjzjn6D+bmlB4qdst8aP3A8OY3uuj52xfGoaNp7FsP4RL//nGpdqg p/sdAOhGGyhs6c3/XrhRT/8YXog6cEJXelXXL1jzIjuiQX3yfR1//nWl7+ftKDvB EOR30na1iZ6MDGK7xJiS5B2wyc1my7AQj/mi/+5KopP8APMmqhg= =lrRr -----END PGP SIGNATURE----- Merge tag 'smp_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp fixes from Borislav Petkov: - Remove an obsolete comment and fix spelling * tag 'smp_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu: Remove obsolete comment from takedown_cpu() smp: Fix spelling in on_each_cpu_cond_mask()'s doc-comment	2025-08-10 08:51:37 +03:00
Linus Torvalds	7d2fed1f3c	- Fix a wrong ioremap size in mvebu-gicp - Remove yet another compile-test case for a driver which needs an additional dependency - Fix a lock inversion scenario in the IRQ unit test suite - Remove an impossible flag situation in gic-v5 - Do not iounmap resources in gic-v5 which are managed by devm - Make sure stale, left-over interrupts in mvebu-gicp are cleared on driver init - Fix a reference counting mishap in msi-lib - Fix a dereference-before-null-ptr-check case in the riscv-imsic irqchip driver -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXkZYACgkQEsHwGGHe VUoiaw/7B/0wjf6CZdrzFW+gSBgXMA4VghVcEUySQTeE66SL5dhOM5Pw2w8z5ow5 tn2/cMZg7aAyQUt9XlPzWhHwU5Pu06Jtd32LtDX0sqf54byYybQHtoUBnNcPY7FR Anf7jr0WnXNHe+qQcccfVDtEYzF9R/HIkmp63f348wP3aS5RTbFPWk2cOPdAXOAY TeWordoDXjtzix+Ro8zk2WaD0h9oDLdHgg4eww3FUVNBvKiXIhkV7bb70t85f+gA f0eF6LJ0318e6UHwVhU+OgYzdD9uXMNTrZDKeZq/xYYIytVlCfwvKDMWFZC11ltC 89BMyghLSRkI/oV/2/gXGICRmGp7OEn6HzD/vpPv30Zfeyj0e8O/rat2uZCifrbL 9RJ4sXMJCOJUoHD3t/e7i1TDqsmVF9CdTbgwqQt6ANtypJrkVkIBqO4QvcNu8qQ5 c6lt5Y7ob+owpIhUoBmxCUaZz19wZAwRcOIkAZwoWXTrvfYjD28AveQOqHpOBvvQ WQY3pvGkvgY9vmWbIeshWhZzb+kX5Wn5WI4C7Ul5cng2WUfo1pkI6U8u9dmv0D7y LBVjnj/rXTWR0G9OyI3R9WqrGmrCnKOPMpv98lsETctBZxTrSStbOWe4dOBTe1Zh Jq1KRWZ4UE5SauOr8R/59y21E4HulVVH2WK7TUhM8paPkeYbw3o= =VUGL -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a wrong ioremap size in mvebu-gicp - Remove yet another compile-test case for a driver which needs an additional dependency - Fix a lock inversion scenario in the IRQ unit test suite - Remove an impossible flag situation in gic-v5 - Do not iounmap resources in gic-v5 which are managed by devm - Make sure stale, left-over interrupts in mvebu-gicp are cleared on driver init - Fix a reference counting mishap in msi-lib - Fix a dereference-before-null-ptr-check case in the riscv-imsic irqchip driver * tag 'irq_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mvebu-gicp: Use resource_size() for ioremap() irqchip: Build IMX_MU_MSI only on ARM genirq/test: Resolve irq lock inversion warnings irqchip/gic-v5: Remove IRQD_RESEND_WHEN_IN_PROGRESS for ITS IRQs irqchip/gic-v5: iwb: Fix iounmap probe failure path irqchip/mvebu-gicp: Clear pending interrupts on init irqchip/msi-lib: Fix fwnode refcount in msi_lib_irq_domain_select() irqchip/riscv-imsic: Don't dereference before NULL pointer check	2025-08-10 08:46:47 +03:00
Linus Torvalds	acaa21a26f	- Fix an interrupt vector setup race which leads to a non-functioning device - Add new Intel CPU models and a family: 0x12. Finally. Yippie! :-) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXjqoACgkQEsHwGGHe VUrD0Q/9FPOeQ6sNazmk4oC0d7BrgTmjiSR9yHsgbHzLRilx7ncBNLS1DAwVBKNS JPpHUEWdIjY7cURYFxihSPvjG1d+yCoKHkn/HlGXpVNm2bfN8M5ejlqd/yskGwa1 9eNzPCB08+dD0zJhEq6teDn5cAGqeLmdiQa+h3UZ6Wt5BOGv2NAMpF8cgObGC6le 2ednA+lHPlQdIIJCH+HmmHIvVHXAGiuisfdD6eVtec28LU+V/vNPkGnmss616dui 48O2+FPaBnN+0tlNmn/bdisjAUWpYPYA9sto2tk/wKXZM+OW5rIlB/NhzDvvjd1c 1kHX05bzH9aKIQQ4d+UUTnnlqY/biG80UWg1BPgJzW/ko6rtdVP8kZ5xQXQsEJ15 gihYMdcyb3TwDxXkEqWVH0INmAmIC0daCvrDgiXcyOlPPoBdh9IYE0YxMAY+GSF0 RN55ZSMc0l2p6qL70WeWau5JJCplNjT6YyfOX8PeTRpkpoZIUlNVZyJcItaJMVbf cbi1SE9TRzoHgenDOBZZVQX54X1FlZg6WupHEoN8amIBEh6rw3E6biKmQDcXLa8n 9tex9t7OgyFcn8iLFrUVxzssQi+2+kt6tvTeJds4qG4U4dvWm8o1z4Lja3WkQCp2 6It8J2PS1KJYVfffJz0SXFex9SKImsgBCSLn/U6qNJF9mAUG9/4= =pdid -----END PGP SIGNATURE----- Merge tag 'x86_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix an interrupt vector setup race which leads to a non-functioning device - Add new Intel CPU models and a family: 0x12. Finally. Yippie! :-) * tag 'x86_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Plug vector setup race x86/cpu: Add new Intel CPU model numbers for Wildcatlake and Novalake	2025-08-10 08:15:32 +03:00
Linus Torvalds	8e8f6b635f	- Prevent a futex hash leak due to different mm lifetimes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXjOUACgkQEsHwGGHe VUpEog//eQ60cSh6pzFJ6yypmPmp1/Tk7XHQH9s9V4JzrsbFoTCwqm2h3NE27Pfu INNfdiZ76Paf5fRkl/pITGwW11Svn0w42xWwM8BDeZMv/yAq8dXa/QKaABVJa+Hd PujrWUno3H/qck0O50Fq9Y6nbE0lHBczHxKsaGdrARxra91JpAezsgwkN7jVO9Kk 6/2Gb9Hk2buWvG+eLmM4JwNIvaxgbMttw93tfA7DthPyQCI0dCPINmJ22fXhVZLI tVmkid9MqGOjz4789z0AN+pF+VfEcejGSy29zzCk5NrrgfgK0QSoZ0JpvUP5vtUh Opoez017+3sKe3REk+0j+PGdttmE48Zhl7WDgkAIZqOOEwWiVBoqbk9gCIGiJKan x9BRjcP3p1TH1RsS6OsHA+tbf+ZlGhOKQNRNeWmisteiOcDiuRYY8NE7F5Q3/mBQ N0KnlzAo2m2uTwJ4r5uvEOAIcCvB+EtNn2SYBkCxMpTCRzT65/WEQjgqmLDHR6cP LSFOfo91E210TwU/ZospjXxT3NhntoWRQVbvbbO5QS4gr3Sq6MCIofmIrjfJNqq6 AoVnrM+8QAOp+pOaoPwSIcwp68uhI4L6SXAZtP0+xScwv6UUUy1KUv9TMMNbZ4/4 lh9JYYIdfh3rtOlZmdK4+KoGBQ19YZ/qc9tXB8/oqadrQWFBOic= =Xpyt -----END PGP SIGNATURE----- Merge tag 'locking_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: - Prevent a futex hash leak due to different mm lifetimes * tag 'locking_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Move futex cleanup to __mmdrop()	2025-08-10 08:11:39 +03:00
Len Brown	5e98a5e73e	tools/power turbostat: version 2025.09.09 Probe and display L3 Cache topology Add ability to average an added counter (useful for pre-integrated "counters", such as Watts) Break the limit of 64 built-in counters. Assorted bug fixes and minor feature tweaks Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Len Brown	e60a13bcef	tools/power turbostat: Handle non-root legacy-uncore sysfs permissions /sys/devices/system/cpu/intel_uncore_frequency/package_X_die_Y/ may be readable by all, but /sys/devices/system/cpu/intel_uncore_frequency/package_X_die_Y/current_freq_khz may be readable only by root. Non-root turbostat users see complaints in this scenario. Fail probe of the interface if we can't read current_freq_khz. Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Original-patch-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Len Brown	378e901160	tools/power turbostat: standardize PER_THREAD_PARAMS use a macro for PER_THREAD_PARAMS to make adding one later more clear. no functional change Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Zhang Rui	3a088b07c4	tools/power turbostat: Fix DMR support Together with the RAPL MSRs, there are more MSRs gone on DMR, including PLR (Perf Limit Reasons), and IRTL (Package cstate Interrupt Response Time Limit) MSRs. The configurable TDP info should also be retrieved from TPMI based Intel Speed Select Technology feature. Remove the access of these MSRs for DMR. Improve the DMR platform feature table to make it more readable at the same time. Fixes: `83075bd59d` ("tools/power turbostat: Add initial support for DMR") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Michael Hebenstreit	dcd1c379b0	tools/power turbostat: add format "average" for external attributes External atributes with format "raw" are not printed in summary lines for nodes/packages (or with option -S). The new format "average" behaves like "raw" but also adds the summary data Signed-off-by: Michael Hebenstreit <michael.hebenstreit@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Len Brown	a5015d945d	tools/power turbostat: delete GET_PKG() pkg_base[pkg_id] is a simple array of structure pointers, let the compiler treat it that way. Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Len Brown	5f961fb2a7	tools/power turbostat: probe and display L3 cache topology Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:24:46 -04:00
Len Brown	8d14a098b4	tools/power turbostat: Support more than 64 built-in-counters We have out-grown the ability to use a 64-bit memory location to inventory every possible built-in counter. Leverage the the CPU_SET(3) macros to break this barrier. Also, break the Joules & Watts counters into two, since we can no longer 'or' them together... Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 21:23:45 -04:00
Len Brown	d240b441b5	tools/power turbostat.8: Document Totl%C0, Any%C0, GFX%C0, CPUGFX% columns Explain the meaning of the Totl%C0, Any%C0, GFX%C0, CPUGFX% columns. Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-09 11:14:30 -04:00
Linus Torvalds	561c80369d	TTY revert fix for 6.16-rc1 Here is a single revert of one of the previous patches that went in the last tty/serial merge that is breaking userspace on some platforms (specifically powerpc, probably a few others.) It accidentially changed the ioctl values of some tty ioctls, which breaks xorg. The revert has been in linux-next all this week with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaJdfkg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymq2QCgxaxTJGciGevsEi3rcXw+TkS0dq4AniOTgmCb cLQx6kIGVCucA1dOxWr8 =Vzw4 -----END PGP SIGNATURE----- Merge tag 'tty-6.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull TTY fix from Greg KH: "Here is a single revert of one of the previous patches that went in the last tty/serial merge that is breaking userspace on some platforms (specifically powerpc, probably a few others.) It accidentially changed the ioctl values of some tty ioctls, which breaks xorg. The revert has been in linux-next all this week with no reported issues" * tag 'tty-6.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "tty: vt: use _IO() to define ioctl numbers"	2025-08-09 18:12:23 +03:00
Linus Torvalds	402e262d77	EFI updates for v6.17 - Expose the OVMF firmware debug log via sysfs - Lower the default log level for the EFI stub to avoid corrupting any splash screens with unimportant diagnostic output -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCaJbnoAAKCRAwbglWLn0t XIhuAQDB+qEzM0nBPRVnlztrjKDbxu01jzFgCuf4egFePs4OjwD/dgNCXXudn5JB X7VWNi8VQAWBMhijpEKXvDu6f0MXFgw= =UeGX -----END PGP SIGNATURE----- Merge tag 'efi-next-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Expose the OVMF firmware debug log via sysfs - Lower the default log level for the EFI stub to avoid corrupting any splash screens with unimportant diagnostic output * tag 'efi-next-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: add API doc entry for ovmf_debug_log efistub: Lower default log level efi: add ovmf debug log driver	2025-08-09 18:10:01 +03:00
Linus Torvalds	c30a13538d	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiWKasACgkQ6rmadz2v bToTMhAAmAvsOqfaoYR0LfmuwPZj9F0bqg9voRRgHNnf3j1Iz0L2po65pdbJGQPs mf+KivbeWsPic8AD0mnbgPLgPXNYsunxsQ+k5LfShbtECZ+0QqUmT27EAIhsHWTr ghWmMZ6E3kq7bfsTaNAtJNjiovHCr+swqpZoJpAlwz1CxNhESDIjdDzGtarPNxHw s3ioI9gvPuH4x5vq7SblyfgeYJWNk0ZELISmdNEud1nBrdR91Ji2Iqxwkky0qhKI 4kQzN9MmjsIl4dIa1DMlQsMpKkC/sHtoFW2VEUkuHUwP+wa/YIn7iAvD7xc2k3nS 2fLzlpfZnSIdu+2piu9A7RoU4VI/XyC+tud6WmDYGN/xKgA0N7BmoBQjhgptj+uH BitpTGmFPoDuRiQDKHiGPTP6Wc4djt0Ipp6hFr89Q2ywCUCrylRafuJDpOn6xwzG VZH6yK1cMk2kIa8jArGjMtvcWiMbaMn6GwxjQaPC+Syhy6dmAWVmcyEKYXki8GAc N+gl0bBGHEGhcCCuyzxFAOLKx81CDp5C+gfoCzt3lDAXztrd1PeaZV+n9c6jtgVT VPO9TahYqnsfuLPBVDTCGSvVsTP2Fh2fIcdsEJvcF1EISBiwYBw6FGpDi4WlbOfm pisRchdT0YV6vg0V+f2U6mL5UbZr7v5w5PAfh7zXCqvfppoPD8U= =Qt6J -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix memory leak of bpf_scc_info objects (Eduard Zingerman) - Fix a regression in the 'perf' tool caused by moving UID filtering to BPF (Ilya Leoshkevich) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: perf bpf-filter: Enable events manually libbpf: Add the ability to suppress perf event enablement bpf: Fix memory leak of bpf_scc_info objects	2025-08-09 09:03:21 +03:00
Linus Torvalds	2988dfed8a	block-6.17-20250808 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiWLjoQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpvveD/9vbvp3XaF0LagRJLH0fcdhcxL7Z+IHD+7U v5vICMeoeBhhhOtPJ0y+h/9LMLQWFYDFl6drkY0atSSxp/CK6CB25qFhIDsoA6Qk RBM/qZ64z4Uxvlc+VQmCqI2EMc/ZrYtrcr7jsornwORoTSEKXVHdyO5k7Q9002Sw XNWc0bZKIibFlgOk12Wnd8ZS5RWHw1uViUcreojcGVZAVR+BuHNGGoa3xq0bLiHU ERbQXfjaN28R+eo4E1euCtdf++7tW2kFjClrDmLcszdb27E2+MWMA6AKMiSTBE2k 2e2TvJUcGZs1s8atqSIIjBtmwQW3rKws33zODLMONzOP8CIErcaniHxyDSaxJIJr kjsdKnwlziL3xVnwQcpgnVOPvvDSKZ4OKEqx8rAuYTqiknpz3uhbt/7EqumuPLHr e7Rz0MnFolrVN7KZOHQ5CPJIezkEAOAEpItLdfc5cfLS06pbeTN3j+dJZp+tUohi WP/K3l2N3C5pkXA0ilAzshRF20Rwv/09M85BoqWocTLBJY7WqyIKXywCNdX81wkv tpbQvp2MpPkJXUIbAh5484BOfCfx9vkYVm2cam2UxXJhR6VfrQCjYfXIjfpqF4jp q7xxNesUezrOqB2Q/cKxw8dKOaRtO1XzVnmwutBrcKgqqLezMwUTDDjQYe8l6p1Z 40E74tsJwQ== =EQ7g -----END PGP SIGNATURE----- Merge tag 'block-6.17-20250808' of git://git.kernel.dk/linux Pull more block updates from Jens Axboe: - MD pull request via Yu: - mddev null-ptr-dereference fix, by Erkun - md-cluster fail to remove the faulty disk regression fix, by Heming - minor cleanup, by Li Nan and Jinchao - mdadm lifetime regression fix reported by syzkaller, by Yu Kuai - MD pull request via Christoph - add support for getting the FDP featuee in fabrics passthru path (Nitesh Shetty) - add capability to connect to an administrative controller (Kamaljit Singh) - fix a leak on sgl setup error (Keith Busch) - initialize discovery subsys after debugfs is initialized (Mohamed Khalfella) - fix various comment typos (Bjorn Helgaas) - remove unneeded semicolons (Jiapeng Chong) - nvmet debugfs ordering issue fix - Fix UAF in the tag_set in zloop - Ensure sbitmap shallow depth covers entire set - Reduce lock roundtrips in io context lookup - Move scheduler tags alloc/free out of elevator and freeze lock, to fix some lockdep found issues - Improve robustness of queue limits checking - Fix a regression with IO priorities, if no io context exists * tag 'block-6.17-20250808' of git://git.kernel.dk/linux: (26 commits) lib/sbitmap: make sbitmap_get_shallow() internal lib/sbitmap: convert shallow_depth from one word to the whole sbitmap nvmet: exit debugfs after discovery subsystem exits block, bfq: Reorder struct bfq_iocq_bfqq_data md: make rdev_addable usable for rcu mode md/raid1: remove struct pool_info and related code md/raid1: change r1conf->r1bio_pool to a pointer type block: ensure discard_granularity is zero when discard is not supported zloop: fix KASAN use-after-free of tag set block: Fix default IO priority if there is no IO context nvme: fix various comment typos nvme-auth: remove unneeded semicolon nvme-pci: fix leak on sgl setup error nvmet: initialize discovery subsys after debugfs is initialized nvme: add capability to connect to an administrative controller nvmet: add support for FDP in fabrics passthru path md: rename recovery_cp to resync_offset md/md-cluster: handle REMOVE message earlier md: fix create on open mddev lifetime regression block: fix potential deadlock while running nr_hw_queue update ...	2025-08-09 08:47:28 +03:00
Linus Torvalds	24bbfb8920	io_uring-6.17-20250808 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiWLk0QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpjl3D/0eZiD3qTmdIcf7GNAoaR/aD4zgICEhwEUf 2bzVFJ7RAZeplPhT2QLAf4CRoDui6iYLXaJONNsZc8r355FLKsZcMVmU8FFBnM6Q chn0wAmv92Jk/fMQWhw1mL4c19jCibTOefzHzm7s2AMhPyhAzlvcMG/Mr9uiNt38 uBXOo9DOkX1fYlUcmYH9v8KDTQBE6UMXeEcDiLvxGUs+JW1hbBAPmh9K3/KJ5qbs wShNL0UOgq5V/GUZqMr6xRqkW++B/6/v6Y/O+wbJ8lgLV97YQi3xemHu3BCx8Kv4 lbvyctUKCTCgQPgnhluu3KuldDkLn3FA7sQ2g1b1FZWkEpqLhgJIRjgGR53GmeHX c58VcRc+MNXf+Lzuy7ZbrYdI6Gbt52Ns+g4S05TNo6ZZW0NGyaZK5aeb7qQyhmnY rCELejwPQKTZv8YfRgV+Kt14x2Z7OdGa49u31JeYUE/IYu9M2FC+XE3//D7Bdz+3 QhU4ZzOW/LRA7xXq/uRb6XK0qRFt34nA+A7jFeKaZbrh6XvXD8VN5MFKkEttcyw5 JCr3jeYV7RbhWuPyKFwAJ4EBn/HnUUEsSEKA5/Rr0tvHkR95ytx+2l3qZfihrCsU jsjSMCn4PZ+L3t2OQ2EaVEeiJ4oB4zwi37GZdQGiMQ7T46RkSzwBam6sJETFGPhF TCL70jIwbg== =LCD/ -----END PGP SIGNATURE----- Merge tag 'io_uring-6.17-20250808' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Allow vectorized payloads for send/send-zc - like sendmsg, but without the hassle of a msghdr. - Fix for an integer wrap that should go to stable, spotted by syzbot. Nothing alarming here, as you need to be root to hit this. Nevertheless, it should get fixed. FWIW, kudos to the syzbot crew for having much nicer reproducers now, and with nicely annotated source code as well. This is particularly useful as syzbot uses the raw interface rather than liburing, historically it's been difficult to turn a syzbot reproducer into a meaningful test case. With the recent changes, not true anymore! * tag 'io_uring-6.17-20250808' of git://git.kernel.dk/linux: io_uring/memmap: cast nr_pages to size_t before shifting io_uring/net: Allow to do vectorized send	2025-08-09 08:45:08 +03:00
Linus Torvalds	71a076033b	spi: Fixes for v6.17 There's one fix here for an issue with the CS42L43 where we were allocating a single property for client devices as just that property rather than a terminated array of properties like we are supposed to. We also have an update to the MAINTAINERS file for some Renesas devices. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmiV8hQACgkQJNaLcl1U h9DTGQf/ZU+UjQPEAYE5oFngQQzllOEb72io3QtfMXKGjZmEJGmm50vZP1h2psWv CwFmzxzN/g3b2FsPmXGLRRtwSvnH9cPxu3gXxM5VMBppeGyTt9j/St0clvzozL8J vOnYYt5AtPCDyxCNNwGg1H9AMEAEpzE2vk1LLA3lfLGen7R2f9nRrwbyMoMeCTRu pyZngxtLfk8GOrVZR8KoRUqs3ugFMLblRXElXkyAGPtyP0RZ5eDXMP5IDG1ymGxt a0zwRu0XBmgskVACMhFhPPNa7o4t2vHXZCmQ2cFE3e3j1lDgd9JSRbfXnBDl+i4v cTx4u+DMqILdZlaWhqrDz1oeSpXtVA== =iOoB -----END PGP SIGNATURE----- Merge tag 'spi-fix-v6.17-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "There's one fix here for an issue with the CS42L43 where we were allocating a single property for client devices as just that property rather than a terminated array of properties like we are supposed to. We also have an update to the MAINTAINERS file for some Renesas devices" * tag 'spi-fix-v6.17-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: cs42l43: Property entry should be a null-terminated array MAINTAINERS: Add entries for the RZ/V2H(P) RSPI	2025-08-09 08:43:24 +03:00
Linus Torvalds	c5bf33d778	regulator: Fix for v6.17 This fixes an issue with the newly added code for handling large voltage changes on regulators which require that individual voltage changes cover a limited range, the check for convergence was broken. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmiV71cACgkQJNaLcl1U h9BT/wf/RFUziTbJP13+I+Ce+BixwFOfTJ2+aE2xlCWAd8eyylxdOvZwRaxiiU8K 2o8mfyVpGitROyEgsuXDqiel1hFTSVpuFzT8NhGe9tc5RC+UtF1nuygsMvPAuUlR cyrGF4aUSsEGr5R37lZh6dC/kJDHOxoQy+YEk9ngnnfWlddpHJsQMulWU3MIVAaG IZ3KVQeZeEpCNkgKr50IG/TQh9bX8Dct/EPSxKhR4EjaKXvEyKcrMnbQpb7h7dxN dUFpKk99EKpC6uI7jLIwUgIA4CCN8h2g0+B10zo4Kp2AQLltGs/H/gZJSCzKzEv8 3UTeyBz8Uzr4srsKRBiP8woqA6hjoA== =xTjP -----END PGP SIGNATURE----- Merge tag 'regulator-fix-v6.17-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "This fixes an issue with the newly added code for handling large voltage changes on regulators which require that individual voltage changes cover a limited range, the check for convergence was broken" * tag 'regulator-fix-v6.17-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: correct convergence check in regulator_set_voltage()	2025-08-09 08:41:53 +03:00

1 2 3 4 5 ...

1381721 Commits (35dbfcb7dc0d3acad5e44749cbd49f418eaae2fa) All Branches Search

1381721 Commits (35dbfcb7dc0d3acad5e44749cbd49f418eaae2fa)

All Branches