There are a gazillion files that depend on drm_print.h being indirectly
included via drm_buddy.h, drm_mm.h, or ttm/ttm_resource.h. In
preparation for removing those includes, explicitly include drm_print.h
where needed.
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/5fe67395907be33eb5199ea6d540e29fddee71c8.1761734313.git.jani.nikula@intel.com
When a command times out, mark it as ERT_CMD_STATE_TIMEOUT. Any other
commands that are canceled due to this timeout should be marked as
ERT_CMD_STATE_ABORT.
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251029193423.2430463-1-lizhi.hou@amd.com
During power down, pending DVFS operations may still be in progress
when the NPU reset is asserted after CDYN=0 is set. Since the READY
bit may already be deasserted at this point, checking only the READY
bit is insufficient to ensure all transactions have completed.
Add an explicit check for CDYN de-assertion after the READY bit check
to guarantee no outstanding transactions remain before proceeding.
Fixes: 550f4dd2ce ("accel/ivpu: Add support for Nova Lake's NPU")
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251030091700.293341-1-karol.wachowski@linux.intel.com
OS scheduling mode gets deprecated starting from NPU6 onward.
Print warning and fallback to HW scheduling mode if OS mode is
explicitly selected with sched_mode parameter.
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251029201554.257708-1-karol.wachowski@linux.intel.com
Introduce a new ioctl `drm_ivpu_bo_create_from_userptr` that allows
users to create GEM buffer objects from user pointers to memory regions.
The user pointer must be page-aligned and the memory region must remain
valid for the buffer object's lifetime.
Userptr buffers enable direct use of mmapped files (e.g. inference
weights) in NPU workloads without copying data to NPU buffer objects.
This reduces memory usage and provides better flexibility for NPU
applications.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251029091752.203198-1-karol.wachowski@linux.intel.com
Fix 'Memory manager not clean during takedown' warning that occurs
when ivpu_gem_bo_free() removes the BO from the BOs list before it
gets unmapped. Then file_priv_unbind() triggers a warning in
drm_mm_takedown() during context teardown.
Protect the unmapping sequence with bo_list_lock to ensure the BO is
always fully unmapped when removed from the list. This ensures the BO
is either fully unmapped at context teardown time or present on the
list and unmapped by file_priv_unbind().
Fixes: 48aea7f2a2 ("accel/ivpu: Fix locking in ivpu_bo_remove_all_bos_from_context()")
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251029071451.184243-1-karol.wachowski@linux.intel.com
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_wq should be the per-cpu workqueue, yet in this name nothing makes
that clear, so replace system_wq with system_percpu_wq.
The old wq (system_wq) will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251029165642.364488-3-marco.crivellari@suse.com
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251029165642.364488-2-marco.crivellari@suse.com
pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(),
pm_runtime_autosuspend() and pm_request_autosuspend() now include a call
to pm_runtime_mark_last_busy(). Remove the now-reduntant explicit call to
pm_runtime_mark_last_busy().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Link: https://patch.msgid.link/20251027133956.393375-1-sakari.ailus@linux.intel.com
Rework of imported buffers introduced in the commit
e0c0891cd6 ("accel/ivpu: Rework bind/unbind of imported buffers")
switched the logic of imported buffers by dma mapping/unmapping
them just as the regular buffers.
The commit didn't include removal of skipping dma unmap of imported
buffers which results in them being mapped without unmapping.
Fixes: e0c0891cd6 ("accel/ivpu: Rework bind/unbind of imported buffers")
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Link: https://patch.msgid.link/20251027150933.2384538-1-maciej.falkowski@linux.intel.com
QAIC_MANAGE_EXT_MSG_LENGTH is ambiguous and has been confused with
QAIC_MANAGE_MAX_MSG_LENGTH. Rename it to clarify it's a wire length.
Signed-off-by: Troy Hanson <thanson@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: capitalize subject]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251022141606.3740470-1-youssef.abdulrahman@oss.qualcomm.com
Replace the word "Qranium" with "qaic" in the function parameter
description.
Signed-off-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: adjust word wrapping in commit text]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251022124107.3712466-1-youssef.abdulrahman@oss.qualcomm.com
Update the Sahara image table for the AIC200 to add entries for:
- qupv3fw.elf at id 54
- xbl_config.elf at id 38
- tz_qti_config.mbn at id 76
And move pvs.bin to id 78 to avoid firmware conflict.
Co-developed-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Co-developed-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251017173432.1207656-1-youssef.abdulrahman@oss.qualcomm.com
Add a driver for Arm Ethos-U65/U85 NPUs. The Ethos-U NPU has a
relatively simple interface with single command stream to describe
buffers, operation settings, and network operations. It supports up to 8
memory regions (though no h/w bounds on a region). The Ethos NPUs
are designed to use an SRAM for scratch memory. Region 2 is reserved
for SRAM (like the downstream driver stack and compiler). Userspace
doesn't need access to the SRAM.
The h/w has no MMU nor external IOMMU and is a DMA engine which can
read and write anywhere in memory without h/w bounds checks. The user
submitted command streams must be validated against the bounds of the
GEM BOs. This is similar to the VC4 design which validates shaders.
The job submit is based on the rocket driver for the Rockchip NPU
utilizing the GPU scheduler. It is simpler as there's only 1 core rather
than 3.
Tested on i.MX93 platform (U65) and FVP (U85) with Mesa Teflon
support.
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251020-ethos-v6-2-ecebc383c4b7@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
In aie2_get_hwctx_status() and aie2_query_ctx_status_array(), the
functions could return an uninitialized value in some cases. Update them
to always return 0. The amount of valid results is indicated by the
returned buffer_size, element_size, and num_element fields.
Fixes: 2f509fe6a4 ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251024165503.1548131-1-lizhi.hou@amd.com
When the driver issues the SYNC_DEBUG_BO command, it currently returns 0
even if the firmware fails to execute the command. Update the driver to
return -EINVAL in this case to properly indicate the failure.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/dri-devel/aPsadTBXunUSBByV@stanley.mountain/
Fixes: 7ea0468380 ("accel/amdxdna: Support firmware debug buffer")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251024162608.1544842-1-lizhi.hou@amd.com
Add support for NPU6 generation that will be present on Nova Lake CPUs.
As with previous generations, it maintains compatibility
so no bigger functional changes apart from removing
deprecated call to soc_cpu_drive() function.
Quiescing TOP_MMIO in SOC_CPU_NOC as part of boot procedure is no longer
needed starting from 60XX. Remove soc_cpu_drive() call from NPU6 onward.
The VPU_CPU_NOC_QREQN, VPU_CPU_NOC_QACCEPTN, and VPU_CPU_NOC_QDENY
registers are deprecated and non-functional on 60XX. They will be
removed in future generations.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251022105348.2237273-1-maciej.falkowski@linux.intel.com
drm-misc-next for v6.19:
UAPI Changes:
Cross-subsystem Changes:
- fbcon cleanups.
- Make drivers depend on FB_TILEBLITTING instead of selecting it,
and hide FB_MODE_HELPERS.
Core Changes:
- More preparations for rust.
- Throttle dirty worker with vblank
- Use drm_for_each_bridge_in_chain_scoped in drm's bridge code and
assorted fixes.
- Ensure drm_client_modeset tests are enabled in UML.
- Rename ttm_bo_put to ttm_bo_fini, as a further step in removing the
TTM bo refcount.
- Add POST_LT_ADJ_REQ training sequence.
- Show list of removed but still allocated bridges.
- Add a simulated vblank interrupt for hardware without it,
and add some helpers to use them in vkms and hypervdrm.
Driver Changes:
- Assorted small fixes, cleanups and updates to host1x, tegra,
panthor, amdxdna, gud, vc4, ssd130x, ivpu, panfrost, panthor,
sysfb, bridge/sn65dsi86, solomon, ast, tidss.
- Convert drivers from using .round_rate() to .determine_rate()
- Add support for KD116N3730A07/A12, chromebook mt8189, JT101TM023,
LQ079L1SX01, raspberrypi 5" panels.
- Improve reclocking on tegra186+ with nouveau.
- Improve runtime pm in amdxdna.
- Add support for HTX_PAI in imx.
- Use a helper to calculate dumb buffer sizes in most drivers.
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/b412fb91-8545-466a-8102-d89c0f2758a7@linux.intel.com
To collect firmware debug information, the userspace application allocates
a AMDXDNA_BO_DEV buffer object through DRM_IOCTL_AMDXDNA_CREATE_BO.
Then it associates the buffer with the hardware context through
DRM_IOCTL_AMDXDNA_CONFIG_HWCTX which requests firmware to bind the buffer
through a mailbox command. The firmware then writes the debug data into
this buffer. The buffer can be mapped into userspace so that
applications can retrieve and analyze the firmware debug information.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20251016203016.819441-1-lizhi.hou@amd.com
Use min_t() instead of min() to resolve compiler warnings for mismatched
types.
Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251015153715.184143-1-youssef.abdulrahman@oss.qualcomm.com
Use check_add_overflow instead of size_add in sahara when
64b types are being added to ensure compatibility with 32b
systems. The size_add function parameters are of size_t, so
64b data types may be truncated when cast to size_t on 32b
systems. When using check_add_overflow, no type casts are made,
making it a more portable option.
Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251015165408.213645-1-youssef.abdulrahman@oss.qualcomm.com
Add new parameter DRM_AMDXDNA_HW_LAST_ASYNC_ERR to get array IOCTL. When
hardware reports an error, the driver save the error information and
timestamp. This new get array parameter retrieves the last error.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20251014234119.628453-1-lizhi.hou@amd.com
Fix a race that can occur when multiple jobs submit the same dmabuf.
This could cause the sg_table to be mapped twice, leading to undefined
behavior.
Fixes: e0c0891cd6 ("accel/ivpu: Rework bind/unbind of imported buffers")
Signed-off-by: Wludzik, Jozef <jozef.wludzik@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251014071725.3047287-1-karol.wachowski@linux.intel.com
AIC200 uses the newer "XBL" firmware implementation which changes the
expectations of how READ_DATA is performed. Larger data requests are
supported via streaming the data over the transport instead of requiring
a single transport transfer for everything.
Co-developed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007224045.605374-1-youssef.abdulrahman@oss.qualcomm.com
struct qaic_perf_stats is defined to have a DBC specified in the header,
followed by struct qaic_perf_stats_entry instances, each pointing to a BO
that is associated with the DBC. Currently, qaic_perf_stats_bo_ioctl() does
not check if the entries belong to the DBC specified in the header.
Therefore, add checks to ensure that each entry in the request is sliced
and belongs to hdr.dbc_id.
Co-developed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007221212.559474-1-youssef.abdulrahman@oss.qualcomm.com
Division is an expensive operation. Overflow check functions exist
already. Use existing overflow check functions rather than dividing to
check for overflow.
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007174218.469867-1-youssef.abdulrahman@oss.qualcomm.com
Found via code inspection that when encode_message() fails in the middle
of processing, instead of returning the actual error code, it always
returns -EINVAL. This is because the entire message length has not been
processed, and the error code is set to -EINVAL.
Instead, take the 'out' path on failure to return the actual error code.
Signed-off-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007170130.445878-1-youssef.abdulrahman@oss.qualcomm.com
If msg_xfer() is called and the channel ring does not have enough room
to accommodate the whole message, the function sleeps and tries again.
It uses retry_count to keep track of the number of retrials done. This
variable is not used after the space check succeeds. So, remove the
retry_count = 0 statement used later in the function.
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007161148.422744-1-youssef.abdulrahman@oss.qualcomm.com
Two threads of the same process can potential read and write parallelly to
head and tail pointers of the same DBC request queue. This could lead to a
race condition and corrupt the DBC request queue.
Fixes: ff13be8303 ("accel/qaic: Add datapath")
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
[jhugo: Add fixes tag]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007061837.206132-1-youssef.abdulrahman@oss.qualcomm.com
Include linux/sched/signal.h in qaic_control.c
to avoid implicit inclusion of signal_pending().
Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007154525.415039-1-youssef.abdulrahman@oss.qualcomm.com
Currently, if find_and_map_user_pages() takes a DMA xfer request from the
user with a length field set to 0, or in a rare case, the host receives
QAIC_TRANS_DMA_XFER_CONT from the device where resources->xferred_dma_size
is equal to the requested transaction size, the function will return 0
before allocating an sgt or setting the fields of the dma_xfer struct.
In that case, encode_addr_size_pairs() will try to access the sgt which
will lead to a general protection fault.
Return an EINVAL in case the user provides a zero-sized ALP, or the device
requests continuation after all of the bytes have been transferred.
Fixes: 96d3c1cade ("accel/qaic: Clean up integer overflow checking in map_user_pages()")
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007122320.339654-1-youssef.abdulrahman@oss.qualcomm.com
When a BO is created, qaic will use the page allocator to request the
memory chunks that the BO will be composed of in-memory. The number of
chunks increases when memory is segmented. For example, a 16MB BO can
be composed of four 4MB chunks or 4096 4KB chunks.
A BO is then sliced into a single or multiple slices to be transferred
to the device on the DBC's xfer queue. For that to happen, the slice
needs to encode its memory chunks into DBC requests and keep track of
them in an array, which is allocated using kcalloc(). Knowing that the
BO might be very fragmented, this array can grow so large that the
allocation may fail to find contiguous memory for it.
Replace kcalloc() with kvcalloc() to allocate the DBC requests array
for a slice.
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007121845.337382-1-youssef.abdulrahman@oss.qualcomm.com
As soon as we queue MHI buffers to receive the bootlog from the device,
we could be receiving data. Therefore all the resources needed to
process that data need to be setup prior to queuing the buffers.
We currently initialize some of the resources after queuing the buffers
which creates a race between the probe() and any data that comes back
from the device. If the uninitialized resources are accessed, we could
see page faults.
Fix the init ordering to close the race.
Fixes: 5f8df5c6de ("accel/qaic: Add bootlog debugfs")
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007115750.332169-1-youssef.abdulrahman@oss.qualcomm.com
Add support to export BO as DMABUF to enable userspace to reuse buffers
and reduce number of copy.
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007053853.193608-1-youssef.abdulrahman@oss.qualcomm.com
Currently the driver returns ABORTED for all errors that trigger engine
reset. It is better to distinguish between different error types by
returning the actual error code reported by firmware.
This allows userspace to take different actions based on the error type
and improves debuggability.
Refactor ivpu_job_signal_and_destroy() by extracting engine error
handling logic into a new function ivpu_job_handle_engine_error().
This simplifies engine error handling logic by removing necessity of
calling ivpu_job_singal_and_destroy() multiple times by a single job
changing it's behavior based on job status.
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251008061255.2909794-1-karol.wachowski@linux.intel.com
Trigger engine reset for any status code in the range.
This allows to add additional status codes in the future without
breaking compatibility between the firmware and the driver.
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251007083511.2817021-1-karol.wachowski@linux.intel.com
New API header includes additional status codes and range definitions
for error handling and improved API documentation.
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251007083451.2816990-1-karol.wachowski@linux.intel.com
When the hardware is powered down by auto-suspend, creating or destroying
a hardware context without resuming power will fail.
Call amdxdna_pm_resume_get() before requesting the hardware to create or
destroy a hardware context.
Fixes: 063db45183 ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20251008045324.4171807-1-lizhi.hou@amd.com
Documentation/filesystems/sysfs.rst mentions that show() should only
use sysfs_emit() or sysfs_emit_at() when formatting the value to be
returned to user space. So replace snprintf() with sysfs_emit().
Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: Fix commit text typos]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250822112804.1726592-1-chelsyratnawat2001@gmail.com
Replace kcalloc() followed by copy_from_user() with memdup_array_user()
to improve and simplify both __qaic_execute_bo_ioctl() and
qaic_perf_stats_bo_ioctl().
In __qaic_execute_bo_ioctl(), return early if an error occurs and remove
the obsolete 'free_exec' label.
Since memdup_array_user() already checks for multiplication overflow,
remove the manual check in __qaic_execute_bo_ioctl(). Remove any unused
local variables accordingly.
Since 'ret = copy_from_user()' has been removed, initialize 'ret = 0' to
preserve the same return value on success.
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250917124805.90395-4-thorsten.blum@linux.dev
Replace kzalloc() followed by copy_from_user() with memdup_user() to
improve and simplify qaic_attach_slice_bo_ioctl().
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250917124805.90395-2-thorsten.blum@linux.dev
The pcode MAILBOX STATUS register PARAM2 field expects DCT active
percent in U1.7 value format. Convert percentage value to this
format before writing to the register.
Fixes: a19bffb10c ("accel/ivpu: Implement DCT handling")
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251001104322.1249896-1-karol.wachowski@linux.intel.com
Add additional warnings related to allocation and
deallocation of buffer objects to better track possible
memory leaks and generally the BO's lifecycle.
Introduce checks for handle_count to ensure it is zero
before creating a new handle, and exactly one
after successfully creating a handle.
Introduce also a check to warn if the VMA node is not
empty when freeing the buffer object.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145154.1446427-1-maciej.falkowski@linux.intel.com
Fix doc description of job structure as it is
improperly formatted. Align order of job structure's
fields according to the documentation.
Fixes: 0bf37f45d5 ("accel/ivpu: Add support for user-managed preemption buffer")
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145131.1446323-1-maciej.falkowski@linux.intel.com
Don't add BO to the vdev->bo_list in ivpu_gem_create_object().
When failure happens inside drm_gem_shmem_create(), the BO is not
fully created and ivpu_gem_bo_free() callback will not be called
causing a deleted BO to be left on the list.
Fixes: 8d88e4cdce ("accel/ivpu: Use GEM shmem helper for all buffers")
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145114.1446283-1-maciej.falkowski@linux.intel.com
Ensure that imported buffers are properly mapped and unmapped in
the same way as regular buffers to properly handle buffers during
device's bind and unbind operations to prevent resource leaks and
inconsistent buffer states.
Imported buffers are now dma_mapped before submission and
dma_unmapped in ivpu_bo_unbind(), guaranteeing they are unmapped
when the device is unbound.
Add also imported buffers to vdev->bo_list for consistent unmapping
on device unbind. The bo->ctx_id is set in open() so imported
buffers have a valid context ID.
Debug logs have been updated to match the new code structure.
The function ivpu_bo_pin() has been renamed to ivpu_bo_bind()
to better reflect its purpose, and unbind tests have been refactored
for improved coverage and clarity.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145059.1446243-1-maciej.falkowski@linux.intel.com
Add new boot parameter for NPU5+ that enables
ECC signalling for on-chip memory based on the value
of MSR_INTEGRITY_CAPS register.
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145020.1446208-1-maciej.falkowski@linux.intel.com