Commit Graph

1169205 Commits (887185649c7ee8a9cc2d4e94de92bbbae6cd3747)

Author SHA1 Message Date
Robert Richter 623c075133 cxl/mbox: Fix Payload Length check for Get Log command
Commit 2aeaf663b8 introduced strict checking for variable length
payload size validation. The payload length of received data must
match the size of the requested data by the caller except for the case
where the min_out value is set.

The Get Log command does not have a header with a length field set.
The Log size is determined by the Get Supported Logs command (CXL 3.0,
8.2.9.5.1). However, the actual size can be smaller and the number of
valid bytes in the payload output must be determined by reading the
Payload Length field (CXL 3.0, Table 8-36, Note 2).

Two issues arise: The command can successfully complete with a payload
length of zero. And, the valid payload length must then also be
consumed by the caller.

Change cxl_xfer_log() to pass the number of payload bytes back to the
caller to determine the number of log entries. Implement the payload
handling as a special case where mbox_cmd->size_out is consulted when
cxl_internal_send_cmd() returns -EIO. A WARN_ONCE() is added to check
that -EIO is only returned in case of an unexpected output size.

Logs can be bigger than the maximum payload length and multiple Get
Log commands can be issued. If the received payload size is smaller
than the maximum payload size we can assume all valid bytes have been
fetched. Stop sending further Get Log commands then.

While at it, change debug messages to also report the opcodes of
supported commands.

The variable payload commands GET_LSA and SET_LSA are not affected by
this strict check: SET_LSA cannot be broken because SET_LSA does not
return an output payload, and GET_LSA never expects short reads.
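
The short-read handling described above can be sketched in user space as follows. This is a minimal illustration, not the kernel implementation: `struct mock_dev`, `mock_get_log()` and `xfer_log()` are hypothetical names invented here, and the sketch assumes a device that returns at most `payload_max` bytes per command and may return fewer bytes than requested, including zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device whose log may be shorter than advertised. */
struct mock_dev {
	const unsigned char *log; /* valid log bytes held by the device */
	size_t valid;             /* number of valid bytes actually present */
	size_t payload_max;       /* maximum payload size of one command */
	int cmds_sent;            /* number of Get Log commands issued */
};

/* Hypothetical Get Log: returns the number of bytes produced, possibly 0. */
static size_t mock_get_log(struct mock_dev *d, size_t offset, size_t len,
			   unsigned char *out)
{
	size_t avail = offset < d->valid ? d->valid - offset : 0;
	size_t n = len < avail ? len : avail;

	d->cmds_sent++;
	memcpy(out, d->log + offset, n);
	return n;
}

/*
 * Fetch up to *size bytes of the log. On return, *size holds the number
 * of valid bytes received so the caller can count log entries.
 */
static int xfer_log(struct mock_dev *d, unsigned char *out, size_t *size)
{
	size_t remaining = *size;
	size_t offset = 0;

	assert(d->payload_max > 0);

	while (remaining) {
		size_t xfer = remaining < d->payload_max ? remaining : d->payload_max;
		size_t got = mock_get_log(d, offset, xfer, out + offset);

		offset += got;
		remaining -= got;

		/* A short read means all valid bytes have been fetched. */
		if (got < xfer)
			break;
	}

	*size = offset;
	return 0;
}
```

With a 10-byte log, a 4-byte payload limit and a 32-byte request, the loop issues three commands (4, 4 and 2 bytes) and reports 10 valid bytes back to the caller instead of failing on the size mismatch.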

Fixes: 2aeaf663b8 ("cxl/mbox: Add variable output size validation for internal commands")
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230119094934.86067-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-27 13:58:11 -08:00
Linus Torvalds e6f2f6ac50 A bunch of driver fixes with a tiny bit of new IDs

Merge tag 'i2c-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "A bunch of driver fixes with a tiny bit of new IDs"

* tag 'i2c-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: rk3x: fix a bunch of kernel-doc warnings
  i2c: axxia: use 'struct' for kernel-doc notation
  dt-bindings: i2c: renesas,rzv2m: Fix SoC specific string
  i2c: mxs: suppress probe-deferral error message
  i2c: designware-pci: Add new PCI IDs for AMD NAVI GPU
  i2c: designware: Fix unbalanced suspended flag
  i2c: designware: use casting of u64 in clock multiplication to avoid overflow
2023-01-27 13:52:38 -08:00
Linus Torvalds 37d0be6a7d gpio fixes for v6.2-rc6

Merge tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix the -c option in the gpio-event-mon user-space example program

 - fix the irq number translation in gpio-ep93xx and make its irqchip
   immutable

 - add a missing spin_unlock in error path in gpio-mxc

 - fix a suspend breakage on System76 and Lenovo Gen2a introduced in
   GPIO ACPI

* tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  tools: gpio: fix -c option of gpio-event-mon
  gpio: ep93xx: remove unused variable
  gpio: ep93xx: Make irqchip immutable
  gpio: ep93xx: Fix port F hwirq numbers in handler
  gpio: mxc: Unlock on error path in mxc_flip_edge()
  gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
2023-01-27 13:47:40 -08:00
Linus Torvalds 4d1483a99e regulator: Fix for v6.2

Merge tag 'regulator-fix-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "A fix for the DT binding documentation which dropped a property when
  being converted to YAML format causing spurious errors validating
  device trees for platforms using the device"

* tag 'regulator-fix-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: dt-bindings: samsung,s2mps14: add lost samsung,ext-control-gpios
2023-01-27 13:43:46 -08:00
Linus Torvalds 0acffb235f overlayfs fixes for 6.2-rc6

Merge tag 'ovl-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Fix two bugs, a recent one introduced in the last cycle, and an older
  one from v5.11"

* tag 'ovl-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fail on invalid uid/gid mapping at copy up
  ovl: fix tmpfile leak
2023-01-27 13:39:30 -08:00
Andre Przywara beabd511e6 ARM: dts: sun8i: a83t: bananapi-m3: describe SATA disk regulator
The Bananapi-M3 has a SATA connector, driven by a USB-to-SATA bridge
soldered on the board. The power for the SATA device is provided by a
GPIO controlled regulator. Since the SATA device is behind USB, it has
no DT node, so we never described this regulator. Instead U-Boot was
turning this on in a rather hackish way, which we now want to get rid of.
On top of that it seems fragile to leave this GPIO undescribed, as
userland could claim it and turn the disk off.

Add a fixed regulator, controlled by the PD25 GPIO, and mark it as
always-on. This would mimic the current situation, but in a safer way,
and would allow U-Boot to drop the CONFIG_SATAPWR enable hack.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20230120012616.30960-1-andre.przywara@arm.com
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Samuel Holland 2177d4ae97 ARM: dts: sun8i: nanopi-duo2: Fix regulator GPIO reference
The property named in the schema is 'enable-gpios', not 'enable-gpio'.
This makes no difference at runtime, because the regulator is marked as
always-on, but it breaks validation.

Fixes: 4701fc6e5d ("ARM: dts: sun8i: add FriendlyARM NanoPi Duo2")
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221231225854.16320-2-samuel@sholland.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Samuel Holland a5978fb368 ARM: dts: sunxi: Fix GPIO LED node names
These board devicetrees fail to validate because the gpio-leds schema
requires its child nodes to have "led" in the node name.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20221231225854.16320-1-samuel@sholland.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Krzysztof Kozlowski 04961fbe8e ARM: dts: sun8i: h3-beelink-x2: align HDMI CEC node names with dtschema
The bindings expect "cec" for HDMI CEC node.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221204183341.139946-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Samuel Holland 862ee64b3a arm64: dts: allwinner: a64: Add DPHY interrupt
The DPHY has an interrupt line which is shared with the DSI controller.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221114022113.31694-4-samuel@sholland.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Samuel Holland 6c462d7f2e ARM: dts: sun8i: a33: Add DPHY interrupt
The DPHY has an interrupt line which is shared with the DSI controller.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221114022113.31694-3-samuel@sholland.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
2023-01-27 22:34:32 +01:00
Linus Torvalds 76e26e3c6a drm fixes for 6.2-rc6

Merge tag 'drm-fixes-2023-01-27' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fairly small this week as well, i915 has a memory leak fix and some
  minor changes, and amdgpu has some MST fixes, and some other minor
  ones:

  drm:
   - DP MST kref fix
   - fb_helper: check return value

  i915:
   - Fix BSC default context for Meteor Lake
   - Fix selftest-scheduler's modify_type
   - memory leak fix

  amdgpu:
   - GC11.x fixes
   - SMU13.0.0 fix
   - Freesync video fix
   - DP MST fixes
   - build fix"

* tag 'drm-fixes-2023-01-27' of git://anongit.freedesktop.org/drm/drm:
  amdgpu: fix build on non-DCN platforms.
  drm/amd/display: Fix timing not changing when freesync video is enabled
  drm/display/dp_mst: Correct the kref of port.
  drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
  drm/amdgpu/display/mst: limit payload to be updated one by one
  drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
  drm/amdgpu: declare firmware for new MES 11.0.4
  drm/amdgpu: enable imu firmware for GC 11.0.4
  drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0
  drm/amdgpu: remove unconditional trap enable on add gfx11 queues
  drm/fb-helper: Use a per-driver FB deferred I/O handler
  drm/fb-helper: Check fb_deferred_io_init() return value
  drm/i915/selftest: fix intel_selftest_modify_policy argument types
  drm/i915/mtl: Fix bcs default context
  drm/i915: Fix a memory leak with reused mmap_offset
  drm/drm_vma_manager: Add drm_vma_node_allow_once()
2023-01-27 13:18:14 -08:00
Linus Torvalds 04ad927cac ACPI fixes for 6.2-rc6

Merge tag 'acpi-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "Add ACPI backlight handling quirks for 3 machines (Hans de Goede)"

* tag 'acpi-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: video: Add backlight=native DMI quirk for Asus U46E
  ACPI: video: Add backlight=native DMI quirk for HP EliteBook 8460p
  ACPI: video: Add backlight=native DMI quirk for HP Pavilion g6-1d80nr
2023-01-27 13:11:19 -08:00
Stephen Boyd 65b07ecfab clk: renesas: Updates for v6.3 (take two)

Merge tag 'renesas-clk-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into clk-renesas

Pull more Renesas clk driver updates from Geert Uytterhoeven:

 - Add support for USB host/device configuration on RZ/N1
 - Add PLL2 programming support, and CAN-FD clocks on R-Car V4H
 - Miscellaneous fixes and improvements

* tag 'renesas-clk-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers:
  clk: renesas: r8a779g0: Add CAN-FD clocks
  clk: renesas: r8a779g0: Tidy up DMAC name on SYS-DMAC
  clk: renesas: r8a779a0: Tidy up DMAC name on SYS-DMAC
  clk: renesas: r8a779g0: Add custom clock for PLL2
  clk: renesas: cpg-mssr: Remove superfluous check in resume code
  clk: renesas: r9a06g032: Handle h2mode setting based on USBF presence
2023-01-27 13:05:33 -08:00
Linus Torvalds 274d2f8b0c Thermal control fixes for 6.2-rc6

Merge tag 'thermal-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "Add locking to the Intel int340x thermal control driver to prevent its
  thermal zone callbacks from racing with firmware-induced thermal trip
  point updates (Srinivas Pandruvada, Rafael Wysocki)"

* tag 'thermal-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
  thermal: intel: int340x: Protect trip temperature from concurrent updates
2023-01-27 13:01:36 -08:00
John Harrison 583ebae783 drm/i915/guc: Rename GuC register state capture node to be more obvious
The GuC specific register state entry in the error capture object was
just called 'capture', although the companion 'node' entry was called
'guc_capture_node'. Rename the base entry to be 'guc_capture' instead
so that it is a) more consistent and b) more obvious what it is.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
2023-01-27 13:01:26 -08:00
John Harrison e9823f0fc3 drm/i915/guc: Add a debug print on GuC triggered reset
For understanding bug reports, it can be useful to have an explicit
dmesg print when a reset notification is received from GuC. As opposed
to simply inferring that this happened from other messages.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com
2023-01-27 13:01:24 -08:00
John Harrison d907852d29 drm/i915/guc: Look for a guilty context when an engine reset fails
Engine resets are supposed to never fail. But in the case when one
does (due to unknown reasons that normally come down to a missing
w/a), it is useful to get as much information out of the system as
possible. Given that the GuC intentionally dies on such a situation,
it is not possible to get a guilty context notification back. So do a
manual search instead. Given that GuC is dead, this is safe because
GuC won't be changing the engine state asynchronously.

v2: Change comment to be less alarming (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com
2023-01-27 13:01:23 -08:00
John Harrison e7696d6521 drm/i915: Allow error capture of a pending request
A hang situation has been observed where the only requests on the
context were either completed or not yet started according to the
breadcrumbs. However, the register state claimed a batch was (maybe)
in progress. So, allow capture of the pending request on the grounds
that this might be better than nothing.

v2: Reword 'not started' warning message (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-6-John.C.Harrison@Intel.com
2023-01-27 13:01:22 -08:00
John Harrison e8a3319c31 drm/i915: Allow error capture without a request
There was a report of error captures occurring without any hung
context being indicated despite the capture being initiated by a 'hung
context notification' from GuC. The problem was not reproducible.
However, it can happen if the context in question has no
active requests. For example, if the hang was in the context switch
itself then the breadcrumb write would have occurred and the KMD would
see an idle context.

In the interests of attempting to provide as much information as
possible about a hang, it seems wise to include the engine info
regardless of whether a request was found or not. As opposed to just
pretending there was no hang at all.

So update the error capture code to always record engine information
if a context is given. Which means updating record_context() to take a
context instead of a request (which it only ever used to find the
context anyway). And split the request agnostic parts of
intel_engine_coredump_add_request() out into a separate function.

v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null
pointer.
v3: Tidy up request locking code flow (Tvrtko)
v4: Pull in improved info message from next patch and fix up potential
leak of GuC register state (Daniele)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2)
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-5-John.C.Harrison@Intel.com
2023-01-27 13:01:21 -08:00
John Harrison a4be3dca53 drm/i915: Fix up locking around dumping requests lists
The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.

So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.

v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)

Fixes: dc0dad365c ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
2023-01-27 13:01:19 -08:00
John Harrison 3700e35378 drm/i915: Fix request ref counting during error capture & debugfs dump
When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.

The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.

The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updating as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.
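
The rule for the context based search (take the reference inside the same lock that protects the search) can be sketched in user space as follows. This is an illustrative analogy using C11 atomics, not the i915 code: `struct request`, `struct context`, `context_get_active_request()` and `request_put()` are hypothetical names chosen here, and a simple `atomic_flag` spinlock stands in for the driver's locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical request object with a reference count. */
struct request {
	atomic_int refcount;
	int active;
	struct request *next;
};

/* Hypothetical context owning a lock-protected request list. */
struct context {
	atomic_flag lock;
	struct request *requests;
};

static void ctx_lock(struct context *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void ctx_unlock(struct context *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Search under the lock and take the reference before dropping it, so
 * the request cannot be freed between being found and being returned.
 */
static struct request *context_get_active_request(struct context *c)
{
	struct request *rq, *found = NULL;

	ctx_lock(c);
	for (rq = c->requests; rq; rq = rq->next) {
		if (rq->active) {
			atomic_fetch_add(&rq->refcount, 1);
			found = rq;
			break;
		}
	}
	ctx_unlock(c);

	return found;
}

/* Drop a reference obtained from the search; NULL is accepted. */
static void request_put(struct request *rq)
{
	if (!rq)
		return;
	assert(atomic_load(&rq->refcount) > 0);
	atomic_fetch_sub(&rq->refcount, 1);
}
```

Every caller then pairs the search with `request_put()` unconditionally, which mirrors the choice above to always do the get/put rather than branching on the submission model.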

v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)

Fixes: dc0dad365c ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126ae ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
2023-01-27 13:01:17 -08:00
John Harrison d1c3717501 drm/i915/guc: Fix locking when searching for a hung request
intel_guc_find_hung_context() was not acquiring the correct spinlock
before searching the request list. So fix that up. While at it, add
some extra whitespace padding for readability.

Fixes: dc0dad365c ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
2023-01-27 13:01:16 -08:00
Linus Torvalds 0d1e013fd9 arm64 fix for -rc6

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:

 - Fix event counting regression in Arm CMN PMU driver due to broken
   optimisation

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Partially revert "perf/arm-cmn: Optimise DTC counter accesses"
2023-01-27 12:56:45 -08:00
Linus Torvalds db7c4673bb RISC-V Fixes for 6.2-rc6

Merge tag 'riscv-for-linus-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A few DT bindings fixes to more closely align the ISA string
   requirements between the bindings and the ISA manual.

 - A handful of build error/warning fixes.

 - A fix to move init_cpu_topology() later in the boot flow, so it can
   allocate memory.

 - The IRC channel is now in the MAINTAINERS file, so it's easier to
   find.

* tag 'riscv-for-linus-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Move call to init_cpu_topology() to later initialization stage
  riscv/kprobe: Fix instruction simulation of JALR
  riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAIT
  MAINTAINERS: add an IRC entry for RISC-V
  RISC-V: fix compile error from deduplicated __ALTERNATIVE_CFG_2
  dt-bindings: riscv: fix single letter canonical order
  dt-bindings: riscv: fix underscore requirement for multi-letter extensions
2023-01-27 12:52:45 -08:00
Linus Torvalds e5eb2b22f0 ARM fixes for 6.2
- fix nommu assignment build warning
 - fix -Wundef preprocessor warning
 - reduce __thumb2__ definitions for crypto files that require it

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - fix nommu assignment build warning

 - fix -Wundef preprocessor warning

 - reduce __thumb2__ definitions for crypto files that require it

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it
  ARM: 9284/1: include <asm/pgtable.h> from proc-macros.S to fix -Wundef warnings
  ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment
2023-01-27 12:49:00 -08:00
Linus Torvalds 9f4d0bd24e linux-kselftest-fixes-6.2-rc6
This Kselftest fixes update for Linux 6.2-rc6 consists of a single
 fix to an amd-pstate test Makefile bug that deletes source files
 during a make clean run.

Merge tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "A single fix to an amd-pstate test Makefile bug that deletes source
  files during a make clean run"

* tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: amd-pstate: Don't delete source files via Makefile
2023-01-27 12:41:09 -08:00
Andy Shevchenko a8c55407a7 lib/string: Use strchr() in strpbrk()
Use strchr() instead of open coding it, as is done elsewhere in
the same file. Performance will be similar to before, or possibly
better if the architecture implements its own strchr().

Memory-wise, bloat-o-meter on x86_64 shows the following:

  Function           old     new   delta
  strsep             111     102      -9
  Total: Before=2763, After=2754, chg -0.33%
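
As a rough illustration of the change described above (not the kernel's
actual code), strpbrk() can be written on top of strchr() as follows. The
name my_strpbrk is hypothetical and only used for this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of strpbrk() built on strchr(): return a pointer to the first
 * character in cs that also appears in ct, or NULL if there is none.
 * my_strpbrk is a hypothetical name, not the kernel's symbol.
 */
char *my_strpbrk(const char *cs, const char *ct)
{
	const char *sc;

	for (sc = cs; *sc != '\0'; ++sc) {
		/* strchr() replaces the open-coded inner loop over ct. */
		if (strchr(ct, *sc))
			return (char *)sc;
	}
	return NULL;
}
```

The outer loop stops at the terminating NUL, so strchr() is never asked to
look up '\0' (which it would always find).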

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230127155135.27153-1-andriy.shevchenko@linux.intel.com
2023-01-27 11:42:57 -08:00
Kees Cook aa85923a95 crypto: hisilicon: Wipe entire pool on error
To work around a Clang __builtin_object_size bug that shows up under
CONFIG_FORTIFY_SOURCE and UBSAN_BOUNDS, move the per-loop-iteration
mem_block wipe into a single wipe of the entire pool structure after
the loop.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1780
Cc: Weili Qian <qianweili@huawei.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Link: https://lore.kernel.org/r/20230106041945.never.831-kees@kernel.org
2023-01-27 11:42:57 -08:00
Kees Cook 8500689095 net/i40e: Replace 0-length array with flexible array
Zero-length arrays are deprecated[1]. Replace struct i40e_lump_tracking's
"list" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:

In function 'i40e_put_lump',
    inlined from 'i40e_clear_interrupt_scheme' at drivers/net/ethernet/intel/i40e/i40e_main.c:5145:2:
drivers/net/ethernet/intel/i40e/i40e_main.c:278:27: warning: array subscript <unknown> is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Warray-bounds=]
  278 |                 pile->list[i] = 0;
      |                 ~~~~~~~~~~^~~
drivers/net/ethernet/intel/i40e/i40e.h: In function 'i40e_clear_interrupt_scheme':
drivers/net/ethernet/intel/i40e/i40e.h:179:13: note: while referencing 'list'
  179 |         u16 list[0];
      |             ^~~~

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
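
A minimal userspace sketch of the pattern, assuming a simplified stand-in
structure (struct lump_tracking and both helpers below are hypothetical, not
the i40e definitions): the trailing array is declared as a flexible array
member and the allocation accounts for the entries explicitly.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tracking structure with a trailing array. */
struct lump_tracking {
	uint16_t num_entries;
	uint16_t list[];	/* flexible array member, previously list[0] */
};

/* Allocate the header plus n trailing entries in one zeroed block. */
struct lump_tracking *lump_alloc(uint16_t n)
{
	struct lump_tracking *pile;

	pile = calloc(1, sizeof(*pile) + (size_t)n * sizeof(pile->list[0]));
	if (!pile)
		return NULL;
	pile->num_entries = n;
	return pile;
}

/* Clear every entry, staying within the recorded bound. */
void lump_clear(struct lump_tracking *pile)
{
	for (uint16_t i = 0; i < pile->num_entries; i++)
		pile->list[i] = 0;
}
```

In-kernel code would normally compute the allocation size with the
struct_size() helper rather than open-coded arithmetic.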

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230105234557.never.799-kees@kernel.org
2023-01-27 11:42:57 -08:00
Kees Cook 36632d0629 io_uring: Replace 0-length array with flexible array
Zero-length arrays are deprecated[1]. Replace struct io_uring_buf_ring's
"bufs" with a flexible array member. (How is the size of this array
verified?) Detected with GCC 13, using -fstrict-flex-arrays=3:

In function 'io_ring_buffer_select',
    inlined from 'io_buffer_select' at io_uring/kbuf.c:183:10:
io_uring/kbuf.c:141:23: warning: array subscript 255 is outside the bounds of an interior zero-length array 'struct io_uring_buf[0]' [-Wzero-length-bounds]
  141 |                 buf = &br->bufs[head];
      |                       ^~~~~~~~~~~~~~~
In file included from include/linux/io_uring.h:7,
                 from io_uring/kbuf.c:10:
include/uapi/linux/io_uring.h: In function 'io_buffer_select':
include/uapi/linux/io_uring.h:628:41: note: while referencing 'bufs'
  628 |                 struct io_uring_buf     bufs[0];
      |                                         ^~~~

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: stable@vger.kernel.org
Cc: io-uring@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230105190507.gonna.131-kees@kernel.org
2023-01-27 11:42:57 -08:00
Kees Cook 118901ad1f ext4: Fix function prototype mismatch for ext4_feat_ktype
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.

ext4_feat_ktype was setting the "release" handler to "kfree", which
doesn't have a matching function prototype. Add a simple wrapper
with the correct prototype.
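
The idea can be sketched in plain C. The types below are simplified,
hypothetical stand-ins for the kernel's kobject machinery, and free() plays
the role of kfree(); the point is only that the callback slot is filled with
a function whose prototype matches exactly.

```c
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the kernel's kobject types. */
struct kobject { int refcount; };

struct kobj_type {
	void (*release)(struct kobject *kobj);
};

/*
 * free() takes a void *, so its type differs from the release callback's.
 * A thin wrapper gives the indirect call a target with the exact
 * prototype that the caller expects.
 */
static void feat_release(struct kobject *kobj)
{
	free(kobj);
}

static const struct kobj_type feat_ktype = {
	.release = feat_release,	/* not: (void (*)(struct kobject *))free */
};

/* Drop one reference; call the release handler when the count hits zero. */
int kobject_put_sketch(struct kobject *kobj, const struct kobj_type *ktype)
{
	if (--kobj->refcount > 0)
		return 0;
	ktype->release(kobj);
	return 1;
}
```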

This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.

Note that this code is only reached when ext4 is a loadable module and
it is being unloaded:

 CFI failure at kobject_put+0xbb/0x1b0 (target: kfree+0x0/0x180; expected type: 0x7c4aa698)
 ...
 RIP: 0010:kobject_put+0xbb/0x1b0
 ...
 Call Trace:
  <TASK>
  ext4_exit_sysfs+0x14/0x60 [ext4]
  cleanup_module+0x67/0xedb [ext4]

Fixes: b99fee58a2 ("ext4: create ext4_feat kobject dynamically")
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: stable@vger.kernel.org
Build-tested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20230103234616.never.915-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20230104210908.gonna.388-kees@kernel.org
2023-01-27 11:42:56 -08:00
Paulo Miguel Almeida 16a738f2f6 i915/gvt: Replace one-element array with flexible-array member
One-element arrays are deprecated, and we are replacing them with
flexible array members instead. So, replace one-element array with
flexible-array member in struct gvt_firmware_header and refactor the
rest of the code accordingly.

Additionally, the previous implementation was allocating 8 bytes more than
required to represent firmware_header + cfg_space data + mmio data.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

To make reviewing this patch easier, I'm pasting before/after struct
sizes.

pahole -C gvt_firmware_header before/drivers/gpu/drm/i915/gvt/firmware.o
struct gvt_firmware_header {
	u64                        magic;                /*     0     8 */
	u32                        crc32;                /*     8     4 */
	u32                        version;              /*    12     4 */
	u64                        cfg_space_size;       /*    16     8 */
	u64                        cfg_space_offset;     /*    24     8 */
	u64                        mmio_size;            /*    32     8 */
	u64                        mmio_offset;          /*    40     8 */
	unsigned char              data[1];              /*    48     1 */

	/* size: 56, cachelines: 1, members: 8 */
	/* padding: 7 */
	/* last cacheline: 56 bytes */
};

pahole -C gvt_firmware_header after/drivers/gpu/drm/i915/gvt/firmware.o
struct gvt_firmware_header {
	u64                        magic;                /*     0     8 */
	u32                        crc32;                /*     8     4 */
	u32                        version;              /*    12     4 */
	u64                        cfg_space_size;       /*    16     8 */
	u64                        cfg_space_offset;     /*    24     8 */
	u64                        mmio_size;            /*    32     8 */
	u64                        mmio_offset;          /*    40     8 */
	unsigned char              data[];               /*    48     0 */

	/* size: 48, cachelines: 1, members: 8 */
	/* last cacheline: 48 bytes */
};

As you can see, the additional byte of the fake flexible array (data[1])
forced the compiler to pad the struct, but those bytes aren't actually used,
as the first & last bytes (of both cfg_space and mmio) are controlled by the
<>_size and <>_offset members present in the gvt_firmware_header struct.
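
The size difference shown by pahole can be reproduced with a small
standalone sketch. The struct names fw_header_old and fw_header_new and the
helper are hypothetical; the member layout is copied from the pahole output
above, with the kernel's u64/u32 replaced by their stdint equivalents.

```c
#include <stddef.h>
#include <stdint.h>

/* Before: one-element "fake" flexible array. */
struct fw_header_old {
	uint64_t magic;
	uint32_t crc32;
	uint32_t version;
	uint64_t cfg_space_size;
	uint64_t cfg_space_offset;
	uint64_t mmio_size;
	uint64_t mmio_offset;
	unsigned char data[1];
};

/* After: true flexible array member. */
struct fw_header_new {
	uint64_t magic;
	uint32_t crc32;
	uint32_t version;
	uint64_t cfg_space_size;
	uint64_t cfg_space_offset;
	uint64_t mmio_size;
	uint64_t mmio_offset;
	unsigned char data[];
};

/* Bytes that sizeof() over-counts when the old layout is used for sizing. */
size_t fw_header_overcount(void)
{
	return sizeof(struct fw_header_old) - sizeof(struct fw_header_new);
}
```

In both layouts data starts at offset 48; only the old one carries the extra
byte plus trailing padding into sizeof().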

Link: https://github.com/KSPP/linux/issues/79
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/Y6Eu2604cqtryP4g@mail.google.com
2023-01-27 11:42:56 -08:00
Kees Cook 4076ea2419 drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size
Both Coverity and GCC with -Wstringop-overflow noticed that
nvif_outp_acquire_dp() accidentally defined its second argument with 1
additional element:

drivers/gpu/drm/nouveau/dispnv50/disp.c: In function 'nv50_pior_atomic_enable':
drivers/gpu/drm/nouveau/dispnv50/disp.c:1813:17: error: 'nvif_outp_acquire_dp' accessing 16 bytes in a region of size 15 [-Werror=stringop-overflow=]
 1813 |                 nvif_outp_acquire_dp(&nv_encoder->outp, nv_encoder->dp.dpcd, 0, 0, false, false);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/nouveau/dispnv50/disp.c:1813:17: note: referencing argument 2 of type 'u8[16]' {aka 'unsigned char[16]'}
drivers/gpu/drm/nouveau/include/nvif/outp.h:24:5: note: in a call to function 'nvif_outp_acquire_dp'
   24 | int nvif_outp_acquire_dp(struct nvif_outp *, u8 dpcd[16],
      |     ^~~~~~~~~~~~~~~~~~~~

Avoid these warnings by defining the argument size using the matching
define (DP_RECEIVER_CAP_SIZE, 15) instead of having it be a literal
(and incorrect) value (16).

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527269 ("Memory - corruptions")
Addresses-Coverity-ID: 1527268 ("Memory - corruptions")
Link: https://lore.kernel.org/lkml/202211100848.FFBA2432@keescook/
Link: https://lore.kernel.org/lkml/202211100848.F4C2819BB@keescook/
Fixes: 8134437213 ("drm/nouveau/disp: move DP link config into acquire")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221127183036.never.139-kees@kernel.org
2023-01-27 11:42:41 -08:00
Michal Wilczynski 53b9b77dcf ice: Fix broken link in ice NAPI doc
Current link for NAPI documentation in ice driver doesn't work - it
returns 404. Update the link to the working one.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-27 11:32:18 -08:00
Dave Ertman a6a0974aae ice: Prevent set_channel from changing queues while RDMA active
The PF controls the set of queues that the RDMA auxiliary_driver requests
resources from.  The set_channel command will alter that pool and trigger a
reconfiguration of the VSI, which breaks RDMA functionality.

Prevent set_channel from executing when the RDMA driver is bound to the
auxiliary device.

Add a locked variable to pass down the call chain to avoid double
locking the device_lock.

Fixes: 348048e724 ("ice: Implement iidc operations")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-27 11:32:18 -08:00
Ian Rogers 22e06e6825 perf buildid: Avoid copy of uninitialized memory
build_id__init() only copies the buildid data up to size leaving the
rest of the data array uninitialized. Copying the full array during
synthesis means the written event contains uninitialized memory.

Ensure the size is less than the buffer size and only copy the bytes
that were initialized. This was detected by the Clang/LLVM memory
sanitizer.
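
A minimal sketch of the bounded copy described above. The struct layout,
the 20-byte buffer size and the helper name build_id_copy are assumptions
made for illustration and are not taken from the perf sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUILD_ID_SIZE 20	/* assumed buffer size for this sketch */

struct build_id {
	uint8_t data[BUILD_ID_SIZE];
	size_t size;		/* number of valid bytes in data */
};

/*
 * Copy only the initialized bytes of bid into dst (dst_len bytes long)
 * and zero the remainder, so no uninitialized memory reaches the output.
 * Returns the number of build-id bytes copied.
 */
size_t build_id_copy(uint8_t *dst, size_t dst_len, const struct build_id *bid)
{
	size_t len = bid->size;

	/* Never trust size beyond the source buffer or the destination. */
	if (len > sizeof(bid->data))
		len = sizeof(bid->data);
	if (len > dst_len)
		len = dst_len;

	memset(dst, 0, dst_len);
	memcpy(dst, bid->data, len);
	return len;
}
```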

v2. Avoids the potential for copying too much as suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230120185828.43231-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:35 -03:00
James Clark 86569c0ab1 perf mem/c2c: Document that SPE is used for mem and c2c on ARM
Setup is non-trivial so also link to the full SPE docs.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-perf-users@vger.kernel.or
Link: https://lore.kernel.org/r/20230124145929.557891-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
James Clark 6bc75b4c90 perf cs-etm: Improve missing sink warning message
Make the sink error message more similar to the event error message that
reminds about missing kernel support. The available sinks are also
determined by the hardware so mention that too.

Also, usually it's not necessary to specify the sink, so add that as a
hint.

Now the error for a made up sink looks like this:

  $ perf record -e cs_etm/@abc/
  Couldn't find sink "abc" on event cs_etm/@abc/.
  Missing kernel or device support?

  Hint: An appropriate sink will be picked automatically if one isn't is specified.

For any error other than ENOENT, the same message as before is
displayed.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/ec7502e6-b406-3997-c2a5-24f98e5c4854@arm.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230124110220.460551-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
Arnaldo Carvalho de Melo 0b58d89b1e perf tools: Add Ian Rogers to MAINTAINERS as a reviewer
Ian has been reviewing perf tooling patches consistently for a long
time, so let's reflect that in the MAINTAINERS file so that contributors
add him to the CC list in patch submissions.

Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
Jeff Xu 8677e555f1 selftests/landlock: Test ptrace as much as possible with Yama
Update ptrace tests according to all potential Yama security policies.
This is required to make such tests pass even if Yama is enabled.

Tests are not skipped but they now check both Landlock and Yama boundary
restrictions at run time to keep a maximum test coverage (i.e. positive
and negative testing).

Signed-off-by: Jeff Xu <jeffxu@google.com>
Link: https://lore.kernel.org/r/20230114020306.1407195-2-jeffxu@google.com
Cc: stable@vger.kernel.org
[mic: Add curly braces around EXPECT_EQ() to make it build, and improve
commit message]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-01-27 18:53:55 +01:00
Mark Rutland a873bb493f arm64: traps: attempt to dump all instructions
Currently dump_kernel_instr() dumps a few instructions around the
pt_regs::pc value, dumping 4 instructions before the PC before dumping
the instruction at the PC. If an attempt to read an instruction fails,
it gives up and does not attempt to dump any subsequent instructions.

This is unfortunate when the pt_regs::pc value points to the start of a
page with a leading guard page, where the instruction at the PC can be
read, but prior instructions cannot.

This patch makes dump_kernel_instr() attempt to dump each instruction
regardless of whether a prior instruction could be read, which
gives a more useful code dump in such cases. When an instruction cannot
be read, it is reported as "????????", which cannot be confused with a
hex value.

For example, with a `UDF #0` (AKA 0x00000000) early in the kexec control
page, we'll now get the following code dump:

| Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 261 Comm: kexec Not tainted 6.2.0-rc5+ #26
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : 0x48c00000
| lr : machine_kexec+0x190/0x200
| sp : ffff80000d36ba80
| x29: ffff80000d36ba80 x28: ffff000002dfc380 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a9f7858 x22: 000000004c460000 x21: 0000000000000010
| x20: 00000000ad821000 x19: ffff000000aa0000 x18: 0000000000000006
| x17: ffff8000758a2000 x16: ffff800008000000 x15: ffff80000d36b568
| x14: 0000000000000000 x13: ffff80000d36b707 x12: ffff80000a9bf6e0
| x11: 00000000ffffdfff x10: ffff80000aaaf8e0 x9 : ffff80000815eff8
| x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a263008
| x2 : ffff80000a9e20f8 x1 : 0000000048c00000 x0 : ffff000000aa0000
| Call trace:
|  0x48c00000
|  kernel_kexec+0x88/0x138
|  __do_sys_reboot+0x108/0x288
|  __arm64_sys_reboot+0x2c/0x40
|  invoke_syscall+0x78/0x140
|  el0_svc_common.constprop.0+0x4c/0x100
|  do_el0_svc+0x34/0x80
|  el0_svc+0x34/0x140
|  el0t_64_sync_handler+0xf4/0x140
|  el0t_64_sync+0x194/0x1c0
| Code: ???????? ???????? ???????? ???????? (00000000)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| Kernel Offset: disabled
| CPU features: 0x002000,00050108,c8004203
| Memory Limit: none
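
The behaviour behind the "Code:" line above can be sketched in userspace C.
This is an illustration under stated assumptions, not the arm64
implementation: the insn_reader callback, dump_instr_sketch(), DEMO_PAGE and
demo_reader() are all hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reader: returns 0 and fills *insn on success, nonzero on fault. */
typedef int (*insn_reader)(uint64_t addr, uint32_t *insn);

/*
 * Format the four instructions before pc plus the one at pc into buf.
 * A failed read is shown as "????????" and does not stop the loop.
 * Returns the number of characters written (excluding the NUL).
 */
int dump_instr_sketch(char *buf, size_t len, uint64_t pc, insn_reader read_insn)
{
	size_t off = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (int i = -4; i < 1; i++) {
		uint64_t addr = pc + (uint64_t)((int64_t)i * 4);
		char word[9];
		uint32_t insn;
		int n;

		if (read_insn(addr, &insn) == 0)
			snprintf(word, sizeof(word), "%08x", (unsigned int)insn);
		else
			snprintf(word, sizeof(word), "%s", "????????");

		/* The instruction at pc itself is wrapped in parentheses. */
		n = snprintf(buf + off, len - off, i == 0 ? "(%s)" : "%s ", word);
		if (n < 0 || (size_t)n >= len - off)
			break;	/* output truncated: stop formatting */
		off += (size_t)n;
	}
	return (int)off;
}

/* Demo reader: a page of zeroes at DEMO_PAGE, preceded by an unreadable page. */
#define DEMO_PAGE 0x48c00000ULL

int demo_reader(uint64_t addr, uint32_t *insn)
{
	if (addr < DEMO_PAGE)
		return -1;
	*insn = 0;
	return 0;
}
```

Calling dump_instr_sketch() with pc set to DEMO_PAGE and demo_reader as the
callback mirrors the kexec case: the four earlier reads fail, and the
instruction at the PC is still reported.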

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230127121256.2141368-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-27 17:48:34 +00:00
Kirill A. Shutemov 8de62af018 x86/tdx: Disable NOTIFY_ENABLES
== Background ==

There is a class of side-channel attacks against SGX enclaves called
"SGX Step"[1]. These attacks create lots of exceptions inside of
enclaves. Basically, run an in-enclave instruction, cause an exception.
Over and over.

There is a concern that a VMM could attack a TDX guest in the same way
by causing lots of #VE's. The TDX architecture includes new
countermeasures for these attacks. It basically counts the number of
exceptions and can send another *special* exception once the number of
VMM-induced #VE's hits a critical threshold[2].

== Problem ==

But, these special exceptions are independent of any action that the
guest takes. They can occur anywhere that the guest executes. This
includes sensitive areas like the entry code. The (non-paranoid) #VE
handler is incapable of handling exceptions in these areas.

== Solution ==

Fortunately, the special exceptions can be disabled by the guest via
a write to the NOTIFY_ENABLES TDCS field. NOTIFY_ENABLES is disabled by
default, but might be enabled by a bootloader, firmware or an earlier
kernel before the current kernel runs.

Disable the NOTIFY_ENABLES feature explicitly and unconditionally. Any
NOTIFY_ENABLES-based #VE's that occur before this point will end up
in the early #VE exception handler and die due to an unexpected exit
reason.

[1] https://github.com/jovanbulck/sgx-step
[2] https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html#safety-against-ve-in-kernel-code

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-8-kirill.shutemov%40linux.intel.com
2023-01-27 09:46:05 -08:00
Kirill A. Shutemov 47e67cf317 x86/tdx: Relax SEPT_VE_DISABLE check for debug TD
A "SEPT #VE" occurs when a TDX guest touches memory that is not properly
mapped into the "secure EPT".  This can be the result of hypervisor
attacks or bugs, *OR* guest bugs.  Most notably, buggy guests might
touch unaccepted memory because of many kinds of memory safety bugs,
such as buffer overflows.

TDX guests do not want to continue in the face of hypervisor attacks or
hypervisor bugs.  They want to terminate as fast and safely as possible.
SEPT_VE_DISABLE ensures that TDX guests *can't* continue in the face of
these kinds of issues.

But, that causes a problem.  TDX guests that can't continue can't spit
out oopses or other debugging info.  In essence SEPT_VE_DISABLE=1 guests
are not debuggable.

Relax the SEPT_VE_DISABLE check to a warning on debug TDs, and panic() in
the #VE handler on an EPT violation on private memory. This produces a
useful backtrace.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-7-kirill.shutemov%40linux.intel.com
2023-01-27 09:46:05 -08:00
Kirill A. Shutemov 71acdcd7cd x86/tdx: Use ReportFatalError to report missing SEPT_VE_DISABLE
Linux TDX guests require that the SEPT_VE_DISABLE "attribute" be set.
If it is not set, the kernel is theoretically required to handle
exceptions anywhere that kernel memory is accessed, including places
like NMI handlers and in the syscall entry gap.

Rather than even try to handle these exceptions, the kernel refuses to
run if SEPT_VE_DISABLE is unset.

However, the SEPT_VE_DISABLE detection and refusal code happens very
early in boot, even before earlyprintk runs.  Calling panic() will
effectively just hang the system.

Instead, call a TDX-specific panic() function.  This makes a very simple
TDVMCALL which gets a short error string out to the hypervisor without
any console infrastructure.

Use TDG.VP.VMCALL<ReportFatalError> to report the error. The hypercall
can encode a message of up to 64 bytes in eight registers.
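
As a rough sketch of how a short string fits into eight 64-bit registers:
the helper below copies the message bytes into an array of eight register
images. The names FATAL_MSG_REGS, FATAL_MSG_MAX and pack_fatal_msg() are
hypothetical, and the mapping of array slots to actual registers and the
byte order within each are assumptions, not taken from the TDX
specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FATAL_MSG_REGS 8
#define FATAL_MSG_MAX  (FATAL_MSG_REGS * sizeof(uint64_t))	/* 64 bytes */

/*
 * Pack up to 64 bytes of msg into eight 64-bit register images.
 * Longer messages are truncated; unused bytes are zero.
 * Returns the number of message bytes stored.
 */
size_t pack_fatal_msg(uint64_t regs[FATAL_MSG_REGS], const char *msg)
{
	size_t len = strlen(msg);

	if (len > FATAL_MSG_MAX)
		len = FATAL_MSG_MAX;

	memset(regs, 0, FATAL_MSG_MAX);
	memcpy(regs, msg, len);
	return len;
}
```

Because the message travels in registers only, no console or memory-sharing
infrastructure has to be up when the error is reported.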

[ dhansen: tweak comment and remove while loop brackets. ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-6-kirill.shutemov%40linux.intel.com
2023-01-27 09:45:55 -08:00
Mark Rutland dc4824faa2 arm64: avoid executing padding bytes during kexec / hibernation
Currently we rely on the HIBERNATE_TEXT section starting with the entry
point to swsusp_arch_suspend_exit, and the KEXEC_TEXT section starting
with the entry point to arm64_relocate_new_kernel. In both cases we copy
the entire section into a dynamically-allocated page, and then later
branch to the start of this page.

SYM_FUNC_START() will align the function entry points to
CONFIG_FUNCTION_ALIGNMENT, and when the linker later processes the
assembled code it will place padding bytes before the function entry
point if the location counter was not already sufficiently aligned. The
linker happens to use the value zero for these padding bytes.

This padding may end up being applied whenever CONFIG_FUNCTION_ALIGNMENT
is greater than 4, which can be the case with
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y or
CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y.

When such padding is applied, attempting to kexec or resume from
hibernate will result in a crash: the kernel will branch to the padding
bytes as the start of the dynamically-allocated page, and as those bytes
are zero they will decode as UDF #0, which reliably triggers an
UNDEFINED exception. For example:

| # ./kexec --reuse-cmdline -f Image
| [   46.965800] kexec_core: Starting new kernel
| [   47.143641] psci: CPU1 killed (polled 0 ms)
| [   47.233653] psci: CPU2 killed (polled 0 ms)
| [   47.323465] psci: CPU3 killed (polled 0 ms)
| [   47.324776] Bye!
| [   47.327072] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| [   47.328510] Modules linked in:
| [   47.329086] CPU: 0 PID: 259 Comm: kexec Not tainted 6.2.0-rc5+ #3
| [   47.330223] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| [   47.331497] pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| [   47.332782] pc : 0x43a95000
| [   47.333338] lr : machine_kexec+0x190/0x1e0
| [   47.334169] sp : ffff80000d293b70
| [   47.334845] x29: ffff80000d293b70 x28: ffff000002cc0000 x27: 0000000000000000
| [   47.336292] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| [   47.337744] x23: ffff80000a837858 x22: 0000000048ec9000 x21: 0000000000000010
| [   47.339192] x20: 00000000adc83000 x19: ffff000000827000 x18: 0000000000000006
| [   47.340638] x17: ffff800075a61000 x16: ffff800008000000 x15: ffff80000d293658
| [   47.342085] x14: 0000000000000000 x13: ffff80000d2937f7 x12: ffff80000a7ff6e0
| [   47.343530] x11: 00000000ffffdfff x10: ffff80000a8ef8e0 x9 : ffff80000813ef00
| [   47.344976] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| [   47.346431] x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a0a3008
| [   47.347877] x2 : ffff80000a8220f8 x1 : 0000000043a95000 x0 : ffff000000827000
| [   47.349334] Call trace:
| [   47.349834]  0x43a95000
| [   47.350338]  kernel_kexec+0x88/0x100
| [   47.351070]  __do_sys_reboot+0x108/0x268
| [   47.351873]  __arm64_sys_reboot+0x2c/0x40
| [   47.352689]  invoke_syscall+0x78/0x108
| [   47.353458]  el0_svc_common.constprop.0+0x4c/0x100
| [   47.354426]  do_el0_svc+0x34/0x50
| [   47.355102]  el0_svc+0x34/0x108
| [   47.355747]  el0t_64_sync_handler+0xf4/0x120
| [   47.356617]  el0t_64_sync+0x194/0x198
| [   47.357374] Code: bad PC value
| [   47.357999] ---[ end trace 0000000000000000 ]---
| [   47.358937] Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| [   47.360515] Kernel Offset: disabled
| [   47.361230] CPU features: 0x002000,00050108,c8004203
| [   47.362232] Memory Limit: none

Note: Unfortunately the code dump reports "bad PC value" as it attempts
to dump some instructions prior to the UDF (i.e. before the start of the
page), and terminates early upon a fault, obscuring the problem.

This patch fixes this issue by aligning the section start markers to
CONFIG_FUNCTION_ALIGNMENT using the ALIGN_FUNCTION() helper, which
ensures that the linker never needs to place padding bytes within the
section. Assertions are added to verify each section begins with the
function we expect, making our implicit requirement explicit.

In future it might be nice to rework the kexec and hibernation code to
decouple the section start from the entry point, but that involves much
more significant changes that come with a higher risk of error, so I've
tried to keep this fix as simple as possible for now.

Fixes: 47a15aa544 ("arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT")
Reported-by: CKI Project <cki-project@redhat.com>
Link: https://lore.kernel.org/linux-arm-kernel/29992.123012504212600261@us-mta-139.us.mimecast.lan/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-27 17:45:44 +00:00
Kirill A. Shutemov 752d13305c x86/tdx: Expand __tdx_hypercall() to handle more arguments
So far __tdx_hypercall() only handles six arguments for VMCALL.
Expanding it to six more registers would allow it to cover more use cases
like ReportFatalError() and Hyper-V hypercalls.

With all preparations in place, the expansion is pretty straightforward.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-5-kirill.shutemov%40linux.intel.com
2023-01-27 09:42:09 -08:00
Kirill A. Shutemov c30c4b2555 x86/tdx: Refactor __tdx_hypercall() to allow pass down more arguments
RDI is the first argument to __tdx_hypercall() and is used to pass a
pointer to struct tdx_hypercall_args. RSI is the second argument and
contains flags, such as TDX_HCALL_HAS_OUTPUT and TDX_HCALL_ISSUE_STI.

RDI and RSI can also be used as arguments to TDVMCALL leafs. Move RDI to
RAX and RSI to RBP to free them up for the hypercall arguments.

RAX is saved on the stack during TDCALL, as TDCALL returns its status
code in that register.

The RBP value has to be restored before returning from __tdx_hypercall(),
as it is a callee-saved register.

This is a preparatory patch. No functional change.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-4-kirill.shutemov%40linux.intel.com
2023-01-27 09:42:09 -08:00
Kirill A. Shutemov 0da908c291 x86/tdx: Add more registers to struct tdx_hypercall_args
struct tdx_hypercall_args is used to pass down hypercall arguments to
__tdx_hypercall() assembly routine.

Currently __tdx_hypercall() handles up to 6 arguments. In preparation for
changes in __tdx_hypercall(), expand the structure by 6 more registers
and generate asm offsets for them.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-3-kirill.shutemov%40linux.intel.com
2023-01-27 09:42:09 -08:00
Kirill A. Shutemov 3543f8830b x86/tdx: Fix typo in comment in __tdx_hypercall()
The comment in __tdx_hypercall() states that RAX==0 indicates TDVMCALL
failure, which is the opposite of the truth: RAX==0 is success.

Fix the comment. No functional changes.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230126221159.8635-2-kirill.shutemov%40linux.intel.com
2023-01-27 09:42:09 -08:00