mirror-linux/drivers/acpi/apei
Kai-Heng Feng d7610855b0 ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
Add support for decoding NVIDIA-specific CPER sections delivered via
the APEI GHES vendor record notifier chain. NVIDIA hardware generates
vendor-specific CPER sections containing error signatures and diagnostic
register dumps. This implementation registers a notifier_block with the
GHES vendor record notifier and decodes these sections, printing error
details via dev_info().

The driver binds to ACPI device NVDA2012, present on NVIDIA server
platforms. The NVIDIA CPER section contains a fixed header with error
metadata (signature, error type, severity, socket) followed by
variable-length register address-value pairs for hardware diagnostics.
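
The layout described above can be sketched in userspace C. This is a minimal illustration, not the code in ghes-nvidia.c: the struct names, the nv_cper_decode() and nv_cper_print() helpers, and the exact field widths are assumptions modeled on libcper and on the field names in the example output. It also assumes a little-endian host, since CPER data is little-endian.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One register address-value pair (assumed layout). */
struct nv_cper_reg {
	uint64_t address;
	uint64_t value;
};

/* Fixed header preceding the register pairs (assumed layout). */
struct nv_cper_hdr {
	char     signature[16];   /* ASCII, zero-padded */
	uint16_t error_type;
	uint16_t error_instance;
	uint8_t  severity;
	uint8_t  socket;
	uint8_t  number_regs;     /* count of pairs that follow */
	uint8_t  reserved;
	uint64_t instance_base;
};

static_assert(sizeof(struct nv_cper_hdr) == 32, "header must be 32 bytes");

/*
 * Copy the header and register pairs out of a raw section buffer.
 * Returns the number of pairs decoded, or -1 if the buffer is too
 * short for the advertised count or the caller's array is too small.
 */
int nv_cper_decode(const uint8_t *buf, size_t len, struct nv_cper_hdr *hdr,
		   struct nv_cper_reg *regs, size_t max_regs)
{
	size_t need;

	if (len < sizeof(*hdr))
		return -1;
	memcpy(hdr, buf, sizeof(*hdr));

	need = sizeof(*hdr) + (size_t)hdr->number_regs * sizeof(*regs);
	if (len < need || hdr->number_regs > max_regs)
		return -1;

	memcpy(regs, buf + sizeof(*hdr), (size_t)hdr->number_regs * sizeof(*regs));
	return hdr->number_regs;
}

/* Print the decoded fields in a form similar to the example output. */
void nv_cper_print(FILE *out, const struct nv_cper_hdr *hdr,
		   const struct nv_cper_reg *regs)
{
	int i;

	fprintf(out, "signature: %.16s\n", hdr->signature);
	fprintf(out, "error_type: %u\n", (unsigned)hdr->error_type);
	fprintf(out, "error_instance: %u\n", (unsigned)hdr->error_instance);
	fprintf(out, "severity: %u\n", (unsigned)hdr->severity);
	fprintf(out, "socket: %u\n", (unsigned)hdr->socket);
	fprintf(out, "number_regs: %u\n", (unsigned)hdr->number_regs);
	fprintf(out, "instance_base: 0x%016" PRIx64 "\n", hdr->instance_base);
	for (i = 0; i < hdr->number_regs; i++)
		fprintf(out, "register[%d]: address=0x%016" PRIx64
			" value=0x%016" PRIx64 "\n",
			i, regs[i].address, regs[i].value);
}
```

Under this assumed layout the header is 32 bytes and each pair is 16 bytes, so a section with 32 registers occupies 32 + 32 * 16 = 544 bytes, which is consistent with the error_data_length reported in the example output. The length check before copying the pairs matters because number_regs comes from firmware-provided data.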

This work is based on libcper [1].

Example output:
  nvidia-ghes NVDA2012:00: NVIDIA CPER section, error_data_length: 544
  nvidia-ghes NVDA2012:00: signature: CMET-INFO
  nvidia-ghes NVDA2012:00: error_type: 0
  nvidia-ghes NVDA2012:00: error_instance: 0
  nvidia-ghes NVDA2012:00: severity: 3
  nvidia-ghes NVDA2012:00: socket: 0
  nvidia-ghes NVDA2012:00: number_regs: 32
  nvidia-ghes NVDA2012:00: instance_base: 0x0000000000000000
  nvidia-ghes NVDA2012:00: register[0]: address=0x8000000100000000 value=0x0000000100000000

Link: https://github.com/openbmc/libcper/commit/683e055061ce [1]
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260330094203.38022-4-kaihengf@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-06 16:48:58 +02:00
Kconfig          ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler  2026-04-06 16:48:58 +02:00
Makefile         ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler  2026-04-06 16:48:58 +02:00
apei-base.c      Convert 'alloc_obj' family to use the new default GFP_KERNEL argument  2026-02-21 17:09:51 -08:00
apei-internal.h  ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities  2025-06-18 20:49:31 +02:00
bert.c           ACPI: APEI: mark bert_disable as __initdata  2023-06-12 19:23:25 +02:00
einj-core.c      Convert more 'alloc_obj' cases to default GFP_KERNEL arguments  2026-02-21 20:03:00 -08:00
einj-cxl.c       ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities  2025-06-18 20:49:31 +02:00
erst-dbg.c       ACPI: APEI: Remove redundant assignments in erst_dbg_{ioctl|write}()  2025-09-15 21:21:57 +02:00
erst.c           ACPI: APEI: Use ERST timeout for slow devices  2023-10-24 20:50:17 +02:00
ghes-nvidia.c    ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler  2026-04-06 16:48:58 +02:00
ghes.c           ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()  2026-04-06 16:48:58 +02:00
ghes_helpers.c   ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work struct  2026-01-14 17:09:34 +01:00
hest.c           ACPI: APEI: Skip initialization of GHES_ASSIST structures for Machine Check Architecture  2024-02-29 18:34:40 +01:00