Commit Graph

12 Commits (50f09a3dd5877bda888fc25c3d98937dcfb85539)

Author SHA1 Message Date
Oded Gabbay 27a9e35daa habanalabs: ignore f/w status error
In case firmware has a bug and erroneously reports a status error
(e.g. device unusable) during boot, allow the user to tell the driver
to continue the boot regardless of the error status.

This is done via a kernel module parameter that exposes a mask. The
user who loads the driver can decide exactly which status errors to
ignore and which to take into account. The bitmask follows the
defines in hl_boot_if.h.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-05-08 11:21:57 +03:00
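
The masking behaviour described in 27a9e35daa can be sketched in plain C. This is a minimal user-space illustration, not the driver's code: the bit names, both helper functions, and the mask polarity (a set bit means "take this error into account") are assumptions made for the example. The real bit definitions live in hl_boot_if.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status-error bits, for illustration only. */
#define EX_ERR_DEVICE_UNUSABLE (UINT32_C(1) << 0)
#define EX_ERR_DRAM_FAIL       (UINT32_C(1) << 1)
#define EX_ERR_TIMEOUT         (UINT32_C(1) << 2)

/* Assumed semantics: a set bit in error_mask means "honor this error".
 * Errors whose bit is cleared in the mask are dropped. */
static uint32_t fw_errors_to_handle(uint32_t reported, uint32_t error_mask)
{
    return reported & error_mask;
}

/* Boot continues only if no honored error remains after masking. */
static bool boot_should_continue(uint32_t reported, uint32_t error_mask)
{
    return fw_errors_to_handle(reported, error_mask) == 0;
}
```

With a mask of `~EX_ERR_DEVICE_UNUSABLE`, a spurious "device unusable" report no longer blocks boot, while any other reported error still does.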
Ofir Bitton e5042a6fa6 habanalabs/gaudi: derive security status from pci id
As the F/W security indication must be available before the driver
approaches the PCI bus, F/W security should be derived from the PCI id
rather than fetched during the boot handshake with the F/W.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 14:09:25 +03:00
Ofir Bitton 6a2f5d7098 habanalabs: use a single FW loading bringup flag
For simplicity, use a single bringup flag indicating which FW
binaries should be loaded to the device.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 14:09:23 +03:00
Oded Gabbay 17b59dd339 habanalabs: change default CS timeout to 30 seconds
Because our graph contains network operations, we need to account
for delay in the network.

A 5-second timeout per CS is not enough to account for that.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 14:09:22 +03:00
Oded Gabbay fcaebc7354 habanalabs: register to pci shutdown callback
We need to make sure our device is idle when rebooting a virtual
machine. This is done at the driver level.

The firmware will later handle FLR but we want to be extra safe and
stop the devices until the FLR is handled.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-12-28 08:47:39 +02:00
Ofir Bitton d1ddd90551 habanalabs: move HW dirty check to a proper location
The driver must verify whether the HW is dirty before trying to fetch
preboot information. Hence, we move this validation to an earlier stage
of the boot sequence.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Ofir Bitton 66a76401c5 habanalabs: add 'needs reset' state in driver
The new state indicates that the device should be reset in order
to regain functionality.
This unique state can occur if reset_on_lockup is disabled
and an actual lockup has occurred.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
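
The state transition described in 66a76401c5 can be sketched as below. This is an illustration under stated assumptions, not the driver's implementation: the enum, its member names, and `status_after_lockup()` are hypothetical, and only the reset_on_lockup condition comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical device states, for illustration only. */
enum ex_device_status {
    EX_STATUS_OPERATIONAL,
    EX_STATUS_IN_RESET,
    EX_STATUS_NEEDS_RESET,
};

/* Pick the state to enter once an actual lockup has been detected.
 * With reset_on_lockup enabled the device is reset right away;
 * otherwise it is only marked as needing a reset. */
static enum ex_device_status status_after_lockup(bool reset_on_lockup)
{
    return reset_on_lockup ? EX_STATUS_IN_RESET : EX_STATUS_NEEDS_RESET;
}
```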
farah kassabri 03df136bc5 habanalabs/gaudi: scrub all memory upon closing FD
In multi-tenant environments, administrators may want to prevent data
leakage between users running on the same device one after another.

To do that, the driver can scrub the internal memory (both SRAM and
DRAM) after a user finishes using the memory.

Because in GAUDI the driver allows only one application to use the
device at a time, it can scrub the memory when the user app closes the FD.

In future devices where we have MMU on the DRAM, we can scrub the DRAM
memory with a finer granularity (page granularity) when the user
allocates the memory.

This feature is not supported in Goya.

To accommodate users who want to debug their applications, we add a
kernel module parameter to load the driver with this feature disabled.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
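
The scrub-on-close idea from 03df136bc5 can be sketched in user-space C. Everything named here is an assumption for the example: `struct ex_device`, its fields, `ex_release()`, and the `scrub_on_close` flag standing in for the module parameter. Real device memory would not be cleared with `memset()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device model: two plain buffers stand in for SRAM and DRAM. */
struct ex_device {
    unsigned char *sram;
    size_t sram_size;
    unsigned char *dram;
    size_t dram_size;
    bool scrub_on_close; /* stands in for the module parameter */
};

/* Called when the single user application closes its FD. */
static void ex_release(struct ex_device *dev)
{
    if (!dev->scrub_on_close)
        return; /* feature disabled, e.g. for debugging */

    /* Clear both memories so the next user cannot read stale data. */
    memset(dev->sram, 0, dev->sram_size);
    memset(dev->dram, 0, dev->dram_size);
}
```

Scrubbing at close time is only sufficient because one application owns the device at a time, as the commit message notes for GAUDI.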
Oded Gabbay 596553dbf9 habanalabs: support multiple types of firmwares
The driver now loads the firmware in two stages. For debugging purposes
we need to support situations where only the first stage firmware is
loaded.

Therefore, use a bitmask to determine which F/W is loaded.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:27 +02:00
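
A bitmask like the one introduced in 596553dbf9 might look as follows. This is a sketch only: the flag names and `fw_is_loaded()` are hypothetical and were chosen for the example. Only the two-stage split and the use of a bitmask come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags, one bit per firmware stage. */
#define EX_FW_STAGE1 (UINT32_C(1) << 0)
#define EX_FW_STAGE2 (UINT32_C(1) << 1)
#define EX_FW_ALL    (EX_FW_STAGE1 | EX_FW_STAGE2)

/* Return true if the given stage is selected in the loading mask. */
static bool fw_is_loaded(uint32_t fw_loading, uint32_t fw_type)
{
    return (fw_loading & fw_type) != 0;
}
```

Setting the mask to `EX_FW_STAGE1` alone models the debugging case in which only the first-stage firmware is loaded.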
Ofir Bitton 2e5eda4681 habanalabs: PCIe Advanced Error Reporting support
The driver will now be notified of any PCI error that occurs and
will respond according to the severity of the error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Greg Kroah-Hartman 65a9bde6ed Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge 5.8-rc7 into char-misc-next

This should resolve the merge/build issues reported when trying to
create linux-next.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-27 11:49:37 +02:00
Oded Gabbay 70b2f993ea habanalabs: create common folder
For the internal needs of our CI, we need to move all the common code into
a common folder instead of keeping it in the root folder of the driver.

The same applies to the common header files under include/.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-24 20:31:37 +03:00