Commit Graph

17039 Commits (38a01c9700b0dcafe97dfa9dc7531bf4a245deff)

Author SHA1 Message Date
Kees Cook 189f164e57 Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch && !(file in "tools") && !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 08:26:33 -08:00
Linus Torvalds 32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds 323bbfcf1e Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook 69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds d4a292c5f8 drm next fixes for 7.0-rc1
pagemap:
 - drm/pagemap: pass pagemap_addr by reference
 
 amdgpu:
 - DML 2.1 fixes
 - Panel replay fixes
 - Display writeback fixes
 - MES 11 old firmware compat fix
 - DC CRC improvements
 - DPIA fixes
 - XGMI fixes
 - ASPM fix
 - SMU feature bit handling fixes
 - DC LUT fixes
 - RAS fixes
 - Misc memory leak in error path fixes
 - SDMA queue reset fixes
 - PG handling fixes
 - 5 level GPUVM page table fix
 - SR-IOV fix
 - Queue reset fix
 - SMU 13.x fixes
 - DC resume lag fix
 - MPO fixes
 - DCN 3.6 fix
 - VSDB fixes
 - HWSS clean up
 - Replay fixes
 - DCE cursor fixes
 - DCN 3.5 SR DDR5 latency fixes
 - HPD fixes
 - Error path unwind fixes
 - SMU13/14 mode1 reset fixes
 - PSP 15 updates
 - SMU 15 updates
 - Sync fix in amdgpu_dma_buf_move_notify()
 - HAINAN fix
 - PSP 13.x fix
 - GPUVM locking fix
 - Fixes for DC analog support
 - DC FAMS fixes
 - DML 2.1 fixes
 - eDP fixes
 - Misc DC fixes
 - Fastboot fix
 - 3DLUT fixes
 - GPUVM fixes
 - 64bpp format fix
 - Fix for MacBooks with switchable gfx
 
 amdkfd:
 - Fix possible double deletion of validate list
 - Event setup fix
 - Device disconnect regression fix
 - APU GTT as VRAM fix
 - Fix piority inversion with MQDs
 - NULL check fix
 
 radeon:
 - HAINAN fix
 
 i915/xe display:
 - Regresion fix for HDR 4k displays (#15503)
 - Fixup for Dell XPS 13 7390 eDP rate limit
 - Memory leak fix on ACPI _DSM handling
 - Add missing slice count check during DP mode validation
 
 xe:
 - drm/xe: Prevent VFs from exposing the CCS mode sysfs file
 - SRIOV related fixes
 - PAT cache fix
 - MMIO read fix
 - W/a fixes
 - Adjust type of xe_modparam.force_vram_bar_size
 - Wedge mode fix
 - HWMon fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmmYyNEACgkQDHTzWXnE
 hr7n0Q/+L4KscEx0ByCOBNfWJollaJKZEpdeWLQ2J9RSa/hPVOTE+k2z/TokCsYw
 jYIHCv8UBU/VvffWWHj0CHCRSRFcfwWo9bZm0J0C6vRSL77iXJPUDZOZ/EhP7Cy3
 oMZyexBcVTwNqzI2cg8PYsGkTMPpdBUxrSNxa9d15kNJbMSm5RdAMJ4pt/9Id9V4
 iCGQccsNZN8d9JdPzYU/Tay2RQOD/75l9xtywq4vb8Ktszc7gWFca+tRleYZxbtG
 WuXvMStqmZAUZJaTbSASbSQFQMEdyJXSsFq/T0cY1Mm5DSUqnJ27YqOzL5sqzQTE
 +Aeh5w5xEgq/GaLzdx9tOSxBr6mK7p251RApSunPn4nwb50iT8dBAqaYh8Zy2e0o
 vQtFgp3PLlnqwvdvEtqPoq6+oG/FtIsULLPLuMqlZtOe3EG7BzQsWeWASj793EtN
 KrTu4a9HufzZrf+t8a3ZLp2CMQJI5sCCBZBJiZ6/ImiixBEH5bJzSLPhdx7VeoMe
 //kVBbhYXD8oi8QMcJHdZg9ERp80D1yYJhhGWes330s1tdl87Gizy4Mu4pSH9Mds
 bw7u6PTS7iI7hoXvz/ITrrHkFTkxLsxVhPI2OhLAfRpIEoH9XOFgi+BCmZOEb9lT
 V5RhIDxrKpjQcgJ9d9dAM2arO11kTPdVTEH4tYpzQ5cHd17mls4=
 =SQjl
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2026-02-21' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "This is the fixes and cleanups for the end of the merge window, it's
  nearly all amdgpu, with some amdkfd, then a pagemap core fix, i915/xe
  display fixes, and some xe driver fixes.

  Nothing seems out of the ordinary, except amdgpu is a little more
  volume than usual.

  pagemap:
   - drm/pagemap: pass pagemap_addr by reference

  amdgpu:
   - DML 2.1 fixes
   - Panel replay fixes
   - Display writeback fixes
   - MES 11 old firmware compat fix
   - DC CRC improvements
   - DPIA fixes
   - XGMI fixes
   - ASPM fix
   - SMU feature bit handling fixes
   - DC LUT fixes
   - RAS fixes
   - Misc memory leak in error path fixes
   - SDMA queue reset fixes
   - PG handling fixes
   - 5 level GPUVM page table fix
   - SR-IOV fix
   - Queue reset fix
   - SMU 13.x fixes
   - DC resume lag fix
   - MPO fixes
   - DCN 3.6 fix
   - VSDB fixes
   - HWSS clean up
   - Replay fixes
   - DCE cursor fixes
   - DCN 3.5 SR DDR5 latency fixes
   - HPD fixes
   - Error path unwind fixes
   - SMU13/14 mode1 reset fixes
   - PSP 15 updates
   - SMU 15 updates
   - Sync fix in amdgpu_dma_buf_move_notify()
   - HAINAN fix
   - PSP 13.x fix
   - GPUVM locking fix
   - Fixes for DC analog support
   - DC FAMS fixes
   - DML 2.1 fixes
   - eDP fixes
   - Misc DC fixes
   - Fastboot fix
   - 3DLUT fixes
   - GPUVM fixes
   - 64bpp format fix
   - Fix for MacBooks with switchable gfx

  amdkfd:
   - Fix possible double deletion of validate list
   - Event setup fix
   - Device disconnect regression fix
   - APU GTT as VRAM fix
   - Fix piority inversion with MQDs
   - NULL check fix

  radeon:
   - HAINAN fix

  i915/xe display:
   - Regresion fix for HDR 4k displays (#15503)
   - Fixup for Dell XPS 13 7390 eDP rate limit
   - Memory leak fix on ACPI _DSM handling
   - Add missing slice count check during DP mode validation

  xe:
   - drm/xe: Prevent VFs from exposing the CCS mode sysfs file
   - SRIOV related fixes
   - PAT cache fix
   - MMIO read fix
   - W/a fixes
   - Adjust type of xe_modparam.force_vram_bar_size
   - Wedge mode fix
   - HWMon fix

* tag 'drm-next-2026-02-21' of https://gitlab.freedesktop.org/drm/kernel: (143 commits)
  drm/amd/display: Remove unneeded DAC link encoder register
  drm/amd/display: Enable DAC in DCE link encoder
  drm/amd/display: Set CRTC source for DAC using registers
  drm/amd/display: Initialize DAC in DCE link encoder using VBIOS
  drm/amd/display: Turn off DAC in DCE link encoder using VBIOS
  drm/amd/display: Don't call find_analog_engine() twice
  drm/amdgpu: fix 4-level paging if GMC supports 57-bit VA v2
  drm/amdgpu: keep vga memory on MacBooks with switchable graphics
  drm/amdgpu: Set atomics to true for xgmi
  drm/amdkfd: Check for NULL return values
  drm/amd/display: Use same max plane scaling limits for all 64 bpp formats
  drm/amdgpu: Set vmid0 PAGE_TABLE_DEPTH for GFX12.1
  drm/amdkfd: Disable MQD queue priority
  drm/amd/display: Remove conditional for shaper 3DLUT power-on
  drm/amd/display: Check return of shaper curve to HW format
  drm/amd/display: Correct logic check error for fastboot
  drm/amd/display: Skip eDP detection when no sink
  Revert "drm/amd/display: Add Gfx Base Case For Linear Tiling Handling"
  Revert "drm/amd/display: Correct hubp GfxVersion verification"
  Revert "drm/amd/display: Add Handling for gfxversion DcGfxBase"
  ...
2026-02-20 15:36:38 -08:00
Christian König aa25c111a7 drm/amdgpu: fix 4-level paging if GMC supports 57-bit VA v2
It turned that using 4 level page tables on GMC generations which support
57bit VAs actually doesn't work at all.

Background is that the GMC actually can't switch between 4 and 5 levels,
but rather just uses a subset of address space when less than 5 levels are
selected.

Philip already removed the automatically switch to 4levels, now fix it as
well should it be enabled by module parameters.

v2: fix AMDGPU_GMC_HOLE_MASK as well, fix off by one issue pointed out
    by Philip

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:12 -05:00
Alex Deucher 096bb75e13 drm/amdgpu: keep vga memory on MacBooks with switchable graphics
On Intel MacBookPros with switchable graphics, when the iGPU
is enabled, the address of VRAM gets put at 0 in the dGPU's
virtual address space.  This is non-standard and seems to cause
issues with the cursor if it ends up at 0.  We have the framework
to reserve memory at 0 in the address space, so enable it here if
the vram start address is 0.

Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4302
Cc: stable@vger.kernel.org
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Harish Kasiviswanathan 23c098b5fc drm/amdgpu: Set atomics to true for xgmi
xgmi support atomics between links. Set them to true. This only set for
GFX12 onwards to avoid regression on older generations

v2: Use correct xgmi flag that indicates CPU connection

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Andrew Martin 09da66f139 drm/amdkfd: Check for NULL return values
This patch fixes issues when the code moves forward with a potential
NULL pointer, without checking.
Removed one redundant NULL check for a function parameter. This check
is already done in the only caller.

Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Harish Kasiviswanathan a1e0a6b552 drm/amdgpu: Set vmid0 PAGE_TABLE_DEPTH for GFX12.1
GFX12.1 uses 2 level gart table. Set the context register appropriately

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Christian König fd1fa48b93 drm/amdgpu: lock both VM and BO in amdgpu_gem_object_open
The VM was not locked in the past since we initially only cleared the
linked list element and not added it to any VM state.

But this has changed quite some time ago, we just never realized this
problem because the VM state lock was masking it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:24:59 -05:00
Mangesh Gadre 5e9aec4ea3 drm/amdgpu:Add psp v13_0_15 ip block
Add support for psp v13_0_15 ip block

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:23:39 -05:00
Alex Deucher 57d00816c6 drm/amdgpu: set family for GC 11.5.4
Set the family for GC 11.5.4

Fixes: 47ae1f938d ("drm/amdgpu: add support for GC IP version 11.5.4")
Cc: Tim Huang <tim.huang@amd.com>
Cc: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:23:09 -05:00
Pierre-Eric Pelloux-Prayer b18fc0ab83 drm/amdgpu: fix sync handling in amdgpu_dma_buf_move_notify
Invalidating a dmabuf will impact other users of the shared BO.
In the scenario where process A moves the BO, it needs to inform
process B about the move and process B will need to update its
page table.

The commit fixes a synchronisation bug caused by the use of the
ticket: it made amdgpu_vm_handle_moved behave as if updating
the page table immediately was correct but in this case it's not.

An example is the following scenario, with 2 GPUs and glxgears
running on GPU0 and Xorg running on GPU1, on a system where P2P
PCI isn't supported:

glxgears:
  export linear buffer from GPU0 and import using GPU1
  submit frame rendering to GPU0
  submit tiled->linear blit
Xorg:
  copy of linear buffer

The sequence of jobs would be:
  drm_sched_job_run                       # GPU0, frame rendering
  drm_sched_job_queue                     # GPU0, blit
  drm_sched_job_done                      # GPU0, frame rendering
  drm_sched_job_run                       # GPU0, blit
  move linear buffer for GPU1 access      #
  amdgpu_dma_buf_move_notify -> update pt # GPU0

It this point the blit job on GPU0 is still running and would
likely produce a page fault.

Cc: stable@vger.kernel.org
Fixes: a448cb003e ("drm/amdgpu: implement amdgpu_gem_prime_move_notify v2")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:22:03 -05:00
Lijo Lazar 9aca641143 drm/amdgpu: Move xgmi status to interface header
These definitions are used by user APIs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:22:00 -05:00
Gangliang Xie 044f8d3b1f drm/amdgpu: return when ras table checksum is error
end the function flow when ras table checksum is error

Signed-off-by: Gangliang Xie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:21:56 -05:00
Ce Sun 3ee1c72606 drm/amdgpu: Adjust usleep_range in fence wait
Tune the sleep interval in the PSP fence wait loop from 10-100us to
60-100us.This adjustment results in an overall wait window of 1.2s
(60us * 20000 iterations) to 2 seconds (100us * 20000 iterations),
which guarantees that we can retrieve the correct fence value

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:21:41 -05:00
Pratik Vishwakarma 33ed922d24 drm/amd: Add CG/PG flags for GC 11.5.4
Enable GFXOff for GC 11.5.4

Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:21:30 -05:00
Pratik Vishwakarma 39a96f126d drm/amdgpu: enable mode2 reset for SMU IP v15.0.0
Set the default reset method to mode2 for SMU 15.0.0.

Signed-off-by: Kanala Ramalingeswara Reddy <Kanala.RamalingeswaraReddy@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:21:25 -05:00
Pratik Vishwakarma e98bb71e24 drm/amdgpu: Load TA ucode for PSP 15_0_0
TOC and TA both are required

Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:20:56 -05:00
Kenneth Feng d76c66c6ad drm/amd/pm: send unload command to smu during modprobe -r amdgpu
Send unload command to smu during modprobe -r amdgpu for smu 13/14.
1. This can fix the high voltage/temperatue issue after driver is unloaded.
2. Reloading driver could fail but with the debug port based mode1 reset
during driver is reloaded, it is good and safe.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:20:07 -05:00
Srinivasan Shanmugam ba03806565 drm/amdgpu: Fix missing unwind in amdgpu_ib_schedule() error path
amdgpu_ib_schedule() returns early after calling amdgpu_ring_undo().
This skips the common free_fence cleanup path.  Other error paths were
already changed to use goto free_fence, but this one was missed.

Change the early return to goto free_fence so all error paths clean up
the same way.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c:232 amdgpu_ib_schedule()
warn: missing unwind goto?

drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
    124 int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned int num_ibs,
    125                        struct amdgpu_ib *ibs, struct amdgpu_job *job,
    126                        struct dma_fence **f)
    127 {

    ...

    224
    225         if (ring->funcs->insert_start)
    226                 ring->funcs->insert_start(ring);
    227
    228         if (job) {
    229                 r = amdgpu_vm_flush(ring, job, need_pipe_sync);
    230                 if (r) {
    231                         amdgpu_ring_undo(ring);
--> 232                         return r;

	The patch changed the other error paths to goto free_fence but
	this one was accidentally skipped.

    233                 }
    234         }
    235
    236         amdgpu_ring_ib_begin(ring);

    ...

    338
    339 free_fence:
    340         if (!job)
    341                 kfree(af);
    342         return r;
    343 }

Fixes: f903b85ed0 ("drm/amdgpu: fix possible fence leaks from job structure")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:17:31 -05:00
Linus Torvalds 136114e0ab mm.git review status for linus..mm-nonmm-stable
Total patches:       107
 Reviews/patch:       1.07
 Reviewed rate:       67%
 
 - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim
   suballocator free bg" from Heming Zhao saves disk space by teaching
   ocfs2 to reclaim suballocator block group space.
 
 - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one
   bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in
   various places.
 
 - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than
   PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if
   VMCOREINFO_BYTES ever exceeds the page size.
 
 - The 7 patch series "kallsyms: Prevent invalid access when showing
   module buildid" from Petr Mladek cleans up kallsyms code related to
   module buildid and fixes an invalid access crash when printing
   backtraces.
 
 - The 3 patch series "Address page fault in
   ima_restore_measurement_list()" from Harshit Mogalapalli fixes a
   kexec-related crash that can occur when booting the second-stage kernel
   on x86.
 
 - The 6 patch series "kho: ABI headers and Documentation updates" from
   Mike Rapoport updates the kexec handover ABI documentation.
 
 - The 4 patch series "Align atomic storage" from Finn Thain adds the
   __aligned attribute to atomic_t and atomic64_t definitions to get
   natural alignment of both types on csky, m68k, microblaze, nios2,
   openrisc and sh.
 
 - The 2 patch series "kho: clean up page initialization logic" from
   Pratyush Yadav simplifies the page initialization logic in
   kho_restore_page().
 
 - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves
   several things out of kernel.h and into more appropriate places.
 
 - The 7 patch series "don't abuse task_struct.group_leader" from Oleg
   Nesterov removes the usage of ->group_leader when it is "obviously
   unnecessary".
 
 - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin
   adds some infrastructure improvements to the live update orchestrator.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA
 jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6
 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww=
 =mmsH
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
   disk space by teaching ocfs2 to reclaim suballocator block group
   space (Heming Zhao)

 - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
   ARRAY_END() macro and uses it in various places (Alejandro Colomar)

 - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
   the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
   page size (Pnina Feder)

 - "kallsyms: Prevent invalid access when showing module buildid" cleans
   up kallsyms code related to module buildid and fixes an invalid
   access crash when printing backtraces (Petr Mladek)

 - "Address page fault in ima_restore_measurement_list()" fixes a
   kexec-related crash that can occur when booting the second-stage
   kernel on x86 (Harshit Mogalapalli)

 - "kho: ABI headers and Documentation updates" updates the kexec
   handover ABI documentation (Mike Rapoport)

 - "Align atomic storage" adds the __aligned attribute to atomic_t and
   atomic64_t definitions to get natural alignment of both types on
   csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)

 - "kho: clean up page initialization logic" simplifies the page
   initialization logic in kho_restore_page() (Pratyush Yadav)

 - "Unload linux/kernel.h" moves several things out of kernel.h and into
   more appropriate places (Yury Norov)

 - "don't abuse task_struct.group_leader" removes the usage of
   ->group_leader when it is "obviously unnecessary" (Oleg Nesterov)

 - "list private v2 & luo flb" adds some infrastructure improvements to
   the live update orchestrator (Pasha Tatashin)

* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
  watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
  procfs: fix missing RCU protection when reading real_parent in do_task_stat()
  watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
  kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
  kho: fix doc for kho_restore_pages()
  tests/liveupdate: add in-kernel liveupdate test
  liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
  liveupdate: luo_file: Use private list
  list: add kunit test for private list primitives
  list: add primitives for private list manipulations
  delayacct: fix uapi timespec64 definition
  panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
  netclassid: use thread_group_leader(p) in update_classid_task()
  RDMA/umem: don't abuse current->group_leader
  drm/pan*: don't abuse current->group_leader
  drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
  drm/amdgpu: don't abuse current->group_leader
  android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
  android/binder: don't abuse current->group_leader
  kho: skip memoryless NUMA nodes when reserving scratch areas
  ...
2026-02-12 12:13:01 -08:00
Linus Torvalds 939faf71cf drm for 7.0-rc1
core:
 - drop kgdb support
 - replace system workqueue with percpu
 - account for property blobs in memcg
 - MAINTAINERS updates for xe + buddy
 
 rust:
 - Fix documentation for Registration constructors.
 - Use pin_init::zeroed() for fops initialization.
 - Annotate DRM helpers with __rust_helper.
 - Improve safety documentation for gem::Object::new().
 - Update AlwaysRefCounted imports.
 - mm: Prevent integer overflow in page_align().
 
 atomic:
 - add drm_device pointer to drm_private_obj
 - introduce gamma/degamma LUT size check
 
 buddy:
 - fix free_trees memory leak
 - prevent BUG_ON
 
 bridge:
 - introduce drm_bridge_unplug/enter/exit
 - add connector argument to .hpd_notify
 - lots of recounting conversions
 - convert rockchip inno hdmi to bridge
 - lontium-lt9611uxc: switch to HDMI audio helpers
 - dw-hdmi-qp: add support for HPD-less setups
 - Algoltek AG6311 support
 
 panels:
 - edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
 - st75751: add SPI support
 - Sitronix ST7920, Samsung LTL106HL02
 - LG LH546WF1-ED01, HannStar HSD156J
 - BOE NV130WUM-T08
 - Innolux G150XGE-L05
 - Anbernic RG-DS
 
 dma-buf:
 - improve sg_table debugging
 - add tracepoints
 - call clear_page instead of memset
 - start to introduce cgroup memory accounting in heaps
 - remove sysfs stats
 
 dma-fence:
 - add new helpers
 
 dp:
 - mst: avoid oob access with vcpi=0
 
 hdmi:
 - limit infoframes exposure to userspace
 
 gem:
 - reduce page table overhead with THP
 - fix leak in drm_gem_get_unmapped_area
 
 gpuvm:
 - API sanitation for rust bindings
 
 sched:
 - introduce new helpers
 
 panic:
 - report invalid panic modes
 - add kunit tests
 
 i915/xe display:
 - Expose sharpness only if num_scalers is >= 2
 - Add initial Xe3P_LPD for NVL
 - BMG FBC support
 - Add MTL+ platforms to support dpll framework
 _ fix DIMM_S DRM decoding on ICL
 - Return to using AUX interrupts
 - PSR/Panel replay refactoring
 - use consolidation HDMI tables
 - Xe3_LPD CD2X dividier changes
 
 xe:
 - vfio: add vfio_pci for intel GPU
 - multi queue support
 - dynamic pagemaps and multi-device SVM
 - expose temp attribs in hwmon
 - NO_COMPRESSION bo flag
 - expose MERT OA unit
 - sysfs survivability refactor
 - SRIOV PF: add MERT support
 - enable SR-IOV VF migration
 - Enable I2C/NVM on Crescent Island
 - Xe3p page reclaimation support
 - introduce SRIOV scheduler groups
 - add SoC remappt support in system controller
 - insert compiler barriers in GuC code
 - define NVL GuC firmware
 - handle GT resume failure
 - fix drm scheduler layering violations
 - enable GSC loading and PXP for PTL
 - disable GuC Power DCC strategy on PTL
 - unregister drm device on probe error
 
 i915:
 - move to kernel standard fault injection
 - bump recommended GuC version for DG2 and MTL
 
 amdgpu:
 - SMUIO 15.x, PSP 15.x support
 - IH 6.1.1/7.1 support
 - MMHUB 3.4/4.2 support
 - GC 11.5.4/12.1 support
 - SDMA 6.1.4/7.1/7.11.4 support
 - JPEG 5.3 support
 - UserQ updates
 - GC 9 gfx queue reset support
 - TTM memory ops parallelization
 - convert legacy logging to new helpers
 - DC analog fixes
 
 amdkfd:
 - GC 11.5.4/12.1 suppport
 - SDMA 6.1.4/7.1 support
 - per context support
 - increase kfd process hash table
 - Reserved SDMA rework
 
 radeon:
 - convert legacy logging to new helpers
 - use devm for i2c adapters
 
 msm:
 - GPU
   - Document a612/RGMU dt bindings
   - UBWC 6.0 support (for A840 / Kaanapali)
   - a225 support
 - DPU:
   - Switched to use virtual planes by default
   - Fixed DSI CMD panels on DPU 3.x
   - Rewrote format handling to remove intermediate representation
   - Fixed watchdog on DPU 8.x+
   - Fixed TE / Vsync source setting on DPU 8.x+
   - Added 3D_Mux on SC7280
   - Kaanapali platform support
   - Fixed UBWC register programming
   - Made RM reserve DSPP-enabled mixers for CRTCs with LMs.
   - Gamma correction support
 - DP:
   - Enabled support for eDP 1.4+ link rate tables
   - Fixed MDSS1 DP indices on SA8775P, making them to work
   - Fixed msm_dp_ctrl_config_msa() to work with LLVM 20
 - DSI:
   - Documented QCS8300 as compatible with SA8775P
   - Kaanapali platform support
 - DSI PHY:
   - switched to divider_determine_rate()
 - MDP5:
   - Dropped support for MSM8998, SDM660 and SDM630 (switched over
     to DPU)
 -  MDSS:
   - Kaanapali platform support
   - Fixed UBWC register programming
 
 nova-core:
 - Prepare for Turing support. This includes parsing and handling
   Turing-specific firmware headers and sections as well as a Turing
   Falcon HAL implementation.
 - Get rid of the Result<impl PinInit<T, E>> anti-pattern.
 - Relocate initializer-specific code into the appropriate initializer.
 - Use CStr::from_bytes_until_nul() to remove custom helpers.
 - Improve handling of unexpected firmware values.
 - Clean up redundant debug prints.
 - Replace c_str!() with native Rust C-string literals.
 - Update nova-core task list.
 
 nova:
 - Align GEM object size to system page size.
 
 tyr:
 - Use generated uAPI bindings for GpuInfo.
 - Replace manual sleeps with read_poll_timeout().
 - Replace c_str!() with native Rust C-string literals.
 - Suppress warnings for unread fields.
 - Fix incorrect register name in print statement.
 
 nouveau:
 - fix big page table support races in PTE management
 - improve reclocking on tegra 186+
 
 amdxdna:
 - fix suspend race conditions
 - improve handling of zero tail pointers
 - fix cu_idx overwritten during command setup
 - enable hardware context priority
 - remove NPU2 support
 - update message buffer allocation requirements
 - update firmware version check
 
 ast:
 - support imported cursor buffers
 - big endian fixes
 
 etnaviv:
 - add PPU flop reset support
 
 imagination:
 - add AM62P support
 - introduce hw version checks
 
 ivpu:
 - implement warm boot flow
 
 panfrost:
 - add bo sync ioctl
 - add GPU_PM_RT support for RZ/G3E SoC
 
 panthor:
 - add bo sync ioctl
 - enable timestamp propagation
 - scheduler robustness improvements
 - VM termination fixes
 - huge page support
 
 rockchip:
 - RK3368 HDMI Support
 - get rid of atomic_check fixups
 - RK3506 support
 - RK3576/RK3588 improved HPD handling
 
 rz-du:
 - RZ/V2H(P) MIPI-DSI Support
 
 v3d:
 - fix DMA segment size
 - convert to new logging helpers
 
 mediatek:
 - move DP training to hotplug thread
 - convert logging to new helpers
 - add support for HS speed DSI
 - Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support
 
 atmel-hlcdc:
 - switch to drmm resource
 - support nomodeset
 - use newer helpers
 
 hisilicon:
 - fix various DP bugs
 
 renesas:
 - fix kernel panic on reboot
 
 exynos:
 - fix vidi_connection_ioctl using wrong device
 - fix vidi_connection deref user ptr
 - fix concurrency regression with vidi_context
 
 vkms:
 - add configfs support for display configuration
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmmMLAcACgkQDHTzWXnE
 hr7aAw//bQ2WLhXeMqWyqrPPe51l2DmWvbdwP6TKUjzMwd9+xvs6wQcSg80Mn230
 0vqSpqKq2aMB6GMmz7wdHG8JgZOvO7qDf2TZodXe5lvBiAAPjzX+UE/0bIQKuhym
 Ufb7tqCIPsj6TpcD3ef/173x3BnVPA6Y7lS11KaaG5l01vUAVlTD1vfWGDQp/L6P
 7g94cC+0+3eYZyKxE1+Rn7FDXdw08u+vtLchIoowcAHobgucZ8K/XtZZoqFFy3sj
 ZZN580AhyZoGcgmn2KhNvU4B+3tBFFMSVZkJm7skOO0IB2AMQGdEr0uVUDzLGc7K
 DrLaxYwM6HfxM4o0r0Ai0WCuoysCAJ95M2Cp58uDuNcew4lRTtIUqz32Sm2OJ8bD
 Z91Rvh/kOcA0Ru11Sb/kQvy9/OJ54CqojKVaUlkFo9VhHyPCPo9hjnPvaDvCt34N
 FmnhuVpZMWqcjjq5yO/192qpDJnm470eQExvkZ4YpgmWkekND0zwaT4PG4763dZJ
 juPlBQ5WtUlIzlUpRxdHE7C7ht1rWRS+HdzSYPM5aHTXDvktJvcA+1b/Jyicc+x4
 QZiZ/1AC0KKlLrZxpVpEcjkPdQj2CiCXHQ+0YjDfO3cHo/55EfKj4iiARzhDzokf
 h7FgKwvVhc9DycSq8KPGAf09AswceGAtvB1rKk+Jh9D/GqbgGtM=
 =RFJ2
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "Highlights:
   - amdgpu support for lots of new IP blocks which means newer GPUs
   - xe has a lot of SR-IOV and SVM improvements
   - lots of intel display refactoring across i915/xe
   - msm has more support for gen8 platforms
   - Given up on kgdb/kms integration, it's too hard on modern hw

  core:
   - drop kgdb support
   - replace system workqueue with percpu
   - account for property blobs in memcg
   - MAINTAINERS updates for xe + buddy

  rust:
   - Fix documentation for Registration constructors
   - Use pin_init::zeroed() for fops initialization
   - Annotate DRM helpers with __rust_helper
   - Improve safety documentation for gem::Object::new()
   - Update AlwaysRefCounted imports
   - mm: Prevent integer overflow in page_align()

  atomic:
   - add drm_device pointer to drm_private_obj
   - introduce gamma/degamma LUT size check

  buddy:
   - fix free_trees memory leak
   - prevent BUG_ON

  bridge:
   - introduce drm_bridge_unplug/enter/exit
   - add connector argument to .hpd_notify
   - lots of recounting conversions
   - convert rockchip inno hdmi to bridge
   - lontium-lt9611uxc: switch to HDMI audio helpers
   - dw-hdmi-qp: add support for HPD-less setups
   - Algoltek AG6311 support

  panels:
   - edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
   - st75751: add SPI support
   - Sitronix ST7920, Samsung LTL106HL02
   - LG LH546WF1-ED01, HannStar HSD156J
   - BOE NV130WUM-T08
   - Innolux G150XGE-L05
   - Anbernic RG-DS

  dma-buf:
   - improve sg_table debugging
   - add tracepoints
   - call clear_page instead of memset
   - start to introduce cgroup memory accounting in heaps
   - remove sysfs stats

  dma-fence:
   - add new helpers

  dp:
   - mst: avoid oob access with vcpi=0

  hdmi:
   - limit infoframes exposure to userspace

  gem:
   - reduce page table overhead with THP
   - fix leak in drm_gem_get_unmapped_area

  gpuvm:
   - API sanitation for rust bindings

  sched:
   - introduce new helpers

  panic:
   - report invalid panic modes
   - add kunit tests

  i915/xe display:
   - Expose sharpness only if num_scalers is >= 2
   - Add initial Xe3P_LPD for NVL
   - BMG FBC support
   - Add MTL+ platforms to support dpll framework
   _ fix DIMM_S DRM decoding on ICL
   - Return to using AUX interrupts
   - PSR/Panel replay refactoring
   - use consolidation HDMI tables
   - Xe3_LPD CD2X dividier changes

  xe:
   - vfio: add vfio_pci for intel GPU
   - multi queue support
   - dynamic pagemaps and multi-device SVM
   - expose temp attribs in hwmon
   - NO_COMPRESSION bo flag
   - expose MERT OA unit
   - sysfs survivability refactor
   - SRIOV PF: add MERT support
   - enable SR-IOV VF migration
   - Enable I2C/NVM on Crescent Island
   - Xe3p page reclaimation support
   - introduce SRIOV scheduler groups
   - add SoC remappt support in system controller
   - insert compiler barriers in GuC code
   - define NVL GuC firmware
   - handle GT resume failure
   - fix drm scheduler layering violations
   - enable GSC loading and PXP for PTL
   - disable GuC Power DCC strategy on PTL
   - unregister drm device on probe error

  i915:
   - move to kernel standard fault injection
   - bump recommended GuC version for DG2 and MTL

  amdgpu:
   - SMUIO 15.x, PSP 15.x support
   - IH 6.1.1/7.1 support
   - MMHUB 3.4/4.2 support
   - GC 11.5.4/12.1 support
   - SDMA 6.1.4/7.1/7.11.4 support
   - JPEG 5.3 support
   - UserQ updates
   - GC 9 gfx queue reset support
   - TTM memory ops parallelization
   - convert legacy logging to new helpers
   - DC analog fixes

  amdkfd:
   - GC 11.5.4/12.1 suppport
   - SDMA 6.1.4/7.1 support
   - per context support
   - increase kfd process hash table
   - Reserved SDMA rework

  radeon:
   - convert legacy logging to new helpers
   - use devm for i2c adapters

  msm:
   - GPU
      - Document a612/RGMU dt bindings
      - UBWC 6.0 support (for A840 / Kaanapali)
      - a225 support
   - DPU:
      - Switch to use virtual planes by default
      - Fix DSI CMD panels on DPU 3.x
      - Rewrite format handling to remove intermediate representation
      - Fix watchdog on DPU 8.x+
      - Fix TE / Vsync source setting on DPU 8.x+
      - Add 3D_Mux on SC7280
      - Kaanapali platform support
      - Fix UBWC register programming
      - Make RM reserve DSPP-enabled mixers for CRTCs with LMs
      - Gamma correction support
   - DP:
      - Enable support for eDP 1.4+ link rate tables
      - Fix MDSS1 DP indices on SA8775P, making them to work
      - Fix msm_dp_ctrl_config_msa() to work with LLVM 20
   - DSI:
      - Document QCS8300 as compatible with SA8775P
      - Kaanapali platform support
   - DSI PHY:
      - switch to divider_determine_rate()
   - MDP5:
      - Drop support for MSM8998, SDM660 and SDM630 (switch over to DPU)
   -  MDSS:
      - Kaanapali platform support
      - Fixed UBWC register programming

  nova-core:
   - Prepare for Turing support. This includes parsing and handling
     Turing-specific firmware headers and sections as well as a Turing
     Falcon HAL implementation
   - Get rid of the Result<impl PinInit<T, E>> anti-pattern
   - Relocate initializer-specific code into the appropriate initializer
   - Use CStr::from_bytes_until_nul() to remove custom helpers
   - Improve handling of unexpected firmware values
   - Clean up redundant debug prints
   - Replace c_str!() with native Rust C-string literals
   - Update nova-core task list

  nova:
   - Align GEM object size to system page size

  tyr:
   - Use generated uAPI bindings for GpuInfo
   - Replace manual sleeps with read_poll_timeout()
   - Replace c_str!() with native Rust C-string literals
   - Suppress warnings for unread fields
   - Fix incorrect register name in print statement

  nouveau:
   - fix big page table support races in PTE management
   - improve reclocking on tegra 186+

  amdxdna:
   - fix suspend race conditions
   - improve handling of zero tail pointers
   - fix cu_idx overwritten during command setup
   - enable hardware context priority
   - remove NPU2 support
   - update message buffer allocation requirements
   - update firmware version check

  ast:
   - support imported cursor buffers
   - big endian fixes

  etnaviv:
   - add PPU flop reset support

  imagination:
   - add AM62P support
   - introduce hw version checks

  ivpu:
   - implement warm boot flow

  panfrost:
   - add bo sync ioctl
   - add GPU_PM_RT support for RZ/G3E SoC

  panthor:
   - add bo sync ioctl
   - enable timestamp propagation
   - scheduler robustness improvements
   - VM termination fixes
   - huge page support

  rockchip:
   - RK3368 HDMI Support
   - get rid of atomic_check fixups
   - RK3506 support
   - RK3576/RK3588 improved HPD handling

  rz-du:
   - RZ/V2H(P) MIPI-DSI Support

  v3d:
   - fix DMA segment size
   - convert to new logging helpers

  mediatek:
   - move DP training to hotplug thread
   - convert logging to new helpers
   - add support for HS speed DSI
   - Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support

  atmel-hlcdc:
   - switch to drmm resource
   - support nomodeset
   - use newer helpers

  hisilicon:
   - fix various DP bugs

  renesas:
   - fix kernel panic on reboot

  exynos:
   - fix vidi_connection_ioctl using wrong device
   - fix vidi_connection deref user ptr
   - fix concurrency regression with vidi_context

  vkms:
   - add configfs support for display configuration

* tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel: (1610 commits)
  drm/xe/pm: Disable D3Cold for BMG only on specific platforms
  drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
  drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
  drm/xe: Fix kerneldoc for xe_migrate_exec_queue
  drm/xe/query: Fix topology query pointer advance
  drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
  drm/xe/guc: Fix CFI violation in debugfs access.
  accel/amdxdna: Move RPM resume into job run function
  accel/amdxdna: Fix incorrect DPM level after suspend/resume
  nouveau/vmm: start tracking if the LPT PTE is valid. (v6)
  nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)
  nouveau/vmm: rewrite pte tracker using a struct and bitfields.
  accel/amdxdna: Fix incorrect error code returned for failed chain command
  accel/amdxdna: Remove hardware context status
  drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()
  drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc
  drm/i915/display: fix the pixel normalization handling for xe3p_lpd
  drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
  drm/exynos: vidi: fix to avoid directly dereferencing user pointer
  drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
  ...
2026-02-11 12:55:44 -08:00
Kent Russell 5028a24aa8 drm/amdgpu: Send applicable RMA CPERs at end of RAS init
Firmware and monitoring tools may not be ready to receive a CPER when we
read the bad pages, so send the CPERs at the end of RAS initialization
to ensure that the FW is ready to receive and process the CPER. This
removes the previous CPER submission that was added during bad page
load, and sends both in-band and out-of-band at the same time.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:28:34 -05:00
Mario Limonciello f7afda7fcd drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()
The commit 6a23e7b433 ("drm/amd: Clean up kfd node on surprise
disconnect") introduced early KFD cleanup when drm_dev_is_unplugged()
returns true. However, this causes hangs during normal module unload
(rmmod amdgpu).

The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove()
for all removal scenarios, not just surprise disconnects. This was done
intentionally in commit 39934d3ed5 ("Revert "drm/amdgpu: TA unload
messages are not actually sent to psp when amdgpu is uninstalled"") to
fix IGT PCI software unplug test failures. As a result,
drm_dev_is_unplugged() returns true even during normal module unload,
triggering the early KFD cleanup inappropriately.

The correct check should distinguish between:
- Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected()
  returns true
- Normal module unload (rmmod): pci_dev_is_disconnected() returns false

Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure
the early cleanup only happens during true hardware disconnect events.

Cc: stable@vger.kernel.org
Reported-by: Cal Peake <cp@absolutedigital.net>
Closes: https://lore.kernel.org/all/b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net/
Fixes: 6a23e7b433 ("drm/amd: Clean up kfd node on surprise disconnect")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:25:57 -05:00
Alex Deucher 6952255ed9 drm/amdgpu: re-add the bad job to the pending list for ring resets
Returning DRM_GPU_SCHED_STAT_NO_HANG causes the scheduler
to add the bad job back the pending list.  We've already
set the errors on the fence and killed the bad job at this point
so it's the correct behavior.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:25:44 -05:00
Victor Zhao 5cc7bbd9f1 drm/amdgpu: avoid sdma ring reset in sriov
sdma ring reset is not supported in SRIOV. kfd driver does not check
reset mask, and could queue sdma ring reset during unmap_queues_cpsch.

Avoid the ring reset for sriov.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:25:37 -05:00
Sunil Khatri f025a2b8d9 drm/amdgpu: clean up the amdgpu_cs_parser_bos
In low memory conditions, kmalloc can fail. In such conditions
unlock the mutex for a clean exit.

We do not need to amdgpu_bo_list_put as it's been handled in the
amdgpu_cs_parser_fini.

Fixes: 737da5363c ("drm/amdgpu: update the functions to use amdgpu version of hmm")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202602030017.7E0xShmH-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:24:33 -05:00
Philip Yang 3b948dd036 drm/amdgpu: Use 5-level paging if gmc support 57-bit VA
Regardless if CPU enable 5-level paging, GPU vm use 5-level paging if
gmc init with 57-bit address space support, because

ARM64 4-level paging support 48-bit VA, x86 and GPU 4-level paging
support 47-bit VA, require 5-level paging on GPU to support ARM64.

NPA address space 52-bit mapping on NPA GPU VM require 5-level paging.

Debugger trap get device snapshot expect LDS and Scratch base, limit
above 57-bit, which is set only for 5-level paging.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.19.x
2026-02-05 17:23:51 -05:00
Yifan Zhang 39fc2bc4da drm/amdgpu: Protect GPU register accesses in powergated state in some paths
Ungate GPU CG/PG in device_fini_hw and device_halt to protect GPU
register accesses, e.g. GC registers are accessed in amdgpu_irq_disable_all()
and amdgpu_fence_driver_hw_fini().

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2026-02-05 17:20:33 -05:00
Alex Deucher 56423871e9 drm/amdgpu/sdma6: enable queue resets unconditionally
There is no firmware version dependency.  This also
enables sdma queue resets on all SDMA 6.x based
chips.

Fixes: 59fd50b866 ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:20:04 -05:00
Alex Deucher 314d30ad50 drm/amdgpu/sdma5.2: enable queue resets unconditionally
There is no firmware version dependency.  This also
enables sdma queue resets on all SDMA 5.2.x based
chips.

Fixes: 59fd50b866 ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:19:57 -05:00
Alex Deucher 46a2cb7d24 drm/amdgpu/sdma5: enable queue resets unconditionally
There is no firmware version dependency.

Fixes: 59fd50b866 ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:19:40 -05:00
Zilin Guan ee41e5b63c drm/amdgpu: Fix memory leak in amdgpu_ras_init()
When amdgpu_nbio_ras_sw_init() fails in amdgpu_ras_init(), the function
returns directly without freeing the allocated con structure, leading
to a memory leak.

Fix this by jumping to the release_con label to properly clean up the
allocated memory before returning the error code.

Compile tested only. Issue found using a prototype static analysis tool
and code review.

Fixes: fdc94d3a8c ("drm/amdgpu: Rework pcie_bif ras sw_init")
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:19:09 -05:00
Zilin Guan 0c44d61945 drm/amdgpu: Use kvfree instead of kfree in amdgpu_gmc_get_nps_memranges()
amdgpu_discovery_get_nps_info() internally allocates memory for ranges
using kvcalloc(), which may use vmalloc() for large allocation. Using
kfree() to release vmalloc memory will lead to a memory corruption.

Use kvfree() to safely handle both kmalloc and vmalloc allocations.

Compile tested only. Issue found using a prototype static analysis tool
and code review.

Fixes: b194d21b9b ("drm/amdgpu: Use NPS ranges from discovery table")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:18:33 -05:00
Zilin Guan c9be63d565 drm/amdgpu: Fix memory leak in amdgpu_acpi_enumerate_xcc()
In amdgpu_acpi_enumerate_xcc(), if amdgpu_acpi_dev_init() returns -ENOMEM,
the function returns directly without releasing the allocated xcc_info,
resulting in a memory leak.

Fix this by ensuring that xcc_info is properly freed in the error paths.

Compile tested only. Issue found using a prototype static analysis tool
and code review.

Fixes: 4d5275ab0b ("drm/amdgpu: Add parsing of acpi xcc objects")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-05 17:17:57 -05:00
Harish Kasiviswanathan 6b61a54e68 drm/amdgpu: Fix double deletion of validate_list
If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is
removed from validate_list, the mem handle still lingers in the KFD idr.
This means when process is terminated,
kfd_process_free_outstanding_kfd_bos() will call
amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double
deletion.

To avoid this -
 (a) Check if list is empty before deleting it
 (b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can
     be safely called again if it returns failure the first time.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6ba60345f4)
2026-02-03 17:24:21 -05:00
Bert Karwatzki 243b467dea Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem"
This reverts commit 7294863a6f.

This commit was erroneously applied again after commit 0ab5d711ec
("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
removed it, leading to very hard to debug crashes, when used with a system with two
AMD GPUs of which only one supports ASPM.

Link: https://lore.kernel.org/linux-acpi/20251006120944.7880-1-spasswolf@web.de/
Link: https://github.com/acpica/acpica/issues/1060
Fixes: 0ab5d711ec ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 97a9689300)
Cc: stable@vger.kernel.org
2026-02-03 17:20:59 -05:00
Mario Limonciello 1478a34470 drm/amd: Set minimum version for set_hw_resource_1 on gfx11 to 0x52
commit f81cd79311 ("drm/amd/amdgpu: Fix MES init sequence") caused
a dependency on new enough MES firmware to use amdgpu.  This was fixed
on most gfx11 and gfx12 hardware with commit 0180e0a5dd
("drm/amdgpu/mes: add compatibility checks for set_hw_resource_1"), but
this left out that GC 11.0.4 had breakage at MES 0x51.

Bump the requirement to 0x52 instead.

Reported-by: danijel@nausys.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4576
Fixes: f81cd79311 ("drm/amd/amdgpu: Fix MES init sequence")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c2d2ccc85f)
Cc: stable@vger.kernel.org
2026-02-03 17:20:38 -05:00
Lijo Lazar 8980be03b3 drm/amdgpu: Skip vcn poison irq release on VF
VF doesn't enable VCN poison irq in VCNv2.5. Skip releasing it and avoid
call trace during deinitialization.

[   71.913601] [drm] clean up the vf2pf work item
[   71.915088] ------------[ cut here ]------------
[   71.915092] WARNING: CPU: 3 PID: 1079 at /tmp/amd.aFkFvSQl/amd/amdgpu/amdgpu_irq.c:641 amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[   71.915355] Modules linked in: amdgpu(OE-) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_display_helper cec rc_core i2c_algo_bit video wmi binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common input_leds joydev serio_raw mac_hid qemu_fw_cfg sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 hid_generic crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel usbhid 8139too sha256_ssse3 sha1_ssse3 hid psmouse bochs i2c_i801 ahci drm_vram_helper libahci i2c_smbus lpc_ich drm_ttm_helper 8139cp mii ttm aesni_intel crypto_simd cryptd
[   71.915484] CPU: 3 PID: 1079 Comm: rmmod Tainted: G           OE      6.8.0-87-generic #88~22.04.1-Ubuntu
[   71.915489] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-2.el9_5.1 04/01/2014
[   71.915492] RIP: 0010:amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[   71.915768] Code: 75 84 b8 ea ff ff ff eb d4 44 89 ea 48 89 de 4c 89 e7 e8 fd fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 55 30 3b c7 <0f> 0b eb d4 b8 fe ff ff ff eb a8 e9 b7 3b 8a 00 66 2e 0f 1f 84 00
[   71.915771] RSP: 0018:ffffcf0800eafa30 EFLAGS: 00010246
[   71.915775] RAX: 0000000000000000 RBX: ffff891bda4b0668 RCX: 0000000000000000
[   71.915777] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   71.915779] RBP: ffffcf0800eafa50 R08: 0000000000000000 R09: 0000000000000000
[   71.915781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff891bda480000
[   71.915782] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[   71.915792] FS:  000070cff87c4c40(0000) GS:ffff893abfb80000(0000) knlGS:0000000000000000
[   71.915795] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.915797] CR2: 00005fa13073e478 CR3: 000000010d634006 CR4: 0000000000770ef0
[   71.915800] PKRU: 55555554
[   71.915802] Call Trace:
[   71.915805]  <TASK>
[   71.915809]  vcn_v2_5_hw_fini+0x19e/0x1e0 [amdgpu]

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:48:12 -05:00
Harish Kasiviswanathan 6ba60345f4 drm/amdgpu: Fix double deletion of validate_list
If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is
removed from validate_list, the mem handle still lingers in the KFD idr.
This means when process is terminated,
kfd_process_free_outstanding_kfd_bos() will call
amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double
deletion.

To avoid this -
 (a) Check if list is empty before deleting it
 (b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can
     be safely called again if it returns failure the first time.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:46:33 -05:00
Andrew Martin 08b1eb621c drm/amdgpu: Ignored various return code
The return code of a non void function should not be ignored. In cases
where we do not care, the code needs to suppress it.

Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:46:31 -05:00
Jinzhou Su c6fa06fc4c drm/amdgpu/psp_v15_0_8: Add get ras capability
Add get ras capability for psp 15.0.8.

v2:Remove APU type check and IP version check.

Signed-off-by: Jinzhou Su <jinzhou.su@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:46:25 -05:00
Bert Karwatzki 97a9689300 Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem"
This reverts commit 7294863a6f.

This commit was erroneously applied again after commit 0ab5d711ec
("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
removed it, leading to very hard to debug crashes, when used with a system with two
AMD GPUs of which only one supports ASPM.

Link: https://lore.kernel.org/linux-acpi/20251006120944.7880-1-spasswolf@web.de/
Link: https://github.com/acpica/acpica/issues/1060
Fixes: 0ab5d711ec ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:43:59 -05:00
Stanley.Yang 500449eb70 drm/amdgpu: statistic xgmi training error count
Report xgmi training error uncorrectable error count.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:43:46 -05:00
Mario Limonciello c2d2ccc85f drm/amd: Set minimum version for set_hw_resource_1 on gfx11 to 0x52
commit f81cd79311 ("drm/amd/amdgpu: Fix MES init sequence") caused
a dependency on new enough MES firmware to use amdgpu.  This was fixed
on most gfx11 and gfx12 hardware with commit 0180e0a5dd
("drm/amdgpu/mes: add compatibility checks for set_hw_resource_1"), but
this left out that GC 11.0.4 had breakage at MES 0x51.

Bump the requirement to 0x52 instead.

Reported-by: danijel@nausys.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4576
Fixes: f81cd79311 ("drm/amd/amdgpu: Fix MES init sequence")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:40:15 -05:00
Perry Yuan 31b153315b drm/amdgpu: ensure no_hw_access is visible before MMIO
Add a full memory barrier after clearing no_hw_access in
amdgpu_device_mode1_reset() so subsequent PCI state restore
access cannot observe stale state on other CPUs.

Fixes: 7edb503fe4 ("drm/amd/pm: Disable MMIO access during SMU Mode 1 reset")
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:23:11 -05:00
Oleg Nesterov a87da7a9fa drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
Nowadays task->group_leader->mm != task->mm is only possible if a) task is
not a group leader and b) task->group_leader->mm == NULL because
task->group_leader has already exited using sys_exit().

I don't think that drm/amd tries to detect/nack this case.

Link: https://lkml.kernel.org/r/aXY_yLVHd63UlWtm@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Christan König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03 08:21:25 -08:00