Commit Graph

1169205 Commits (887185649c7ee8a9cc2d4e94de92bbbae6cd3747)

Author SHA1 Message Date
Benjamin Berg 8e9cd85139 um: virtio_uml: mark device as unregistered when breaking it
Mark the device as not registered anymore when scheduling the work to
remove it. Otherwise we could end up scheduling the work multiple times
in a row, including scheduling it while it is already running.

Fixes: af9fb41ed3 ("um: virtio_uml: Fix broken device handling in time-travel")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:32 +01:00
Benjamin Berg 8a6ca54364 um: virtio_uml: free command if adding to virtqueue failed
If adding the command fails (i.e. the virtqueue is broken) then free it
again if the function allocated a new buffer for it.

Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:32 +01:00
Masahiro Yamada b99ddbe833 UML: define RUNTIME_DISCARD_EXIT
With CONFIG_VIRTIO_UML=y, GNU ld < 2.36 fails to link UML vmlinux
(w/wo CONFIG_LD_SCRIPT_STATIC).

  `.exit.text' referenced in section `.uml.exitcall.exit' of arch/um/drivers/virtio_uml.o: defined in discarded section `.exit.text' of arch/um/drivers/virtio_uml.o
  collect2: error: ld returned 1 exit status

This fix is similar to the following commits:

- 4b9880dbf3 ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT")
- a494398bde ("s390: define RUNTIME_DISCARD_EXIT to fix link error
  with GNU ld < 2.36")
- c1c551bebf ("sh: define RUNTIME_DISCARD_EXIT")

Fixes: 99cb0d917f ("arch: fix broken BuildID for arm64 and riscv")
Reported-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Vincent Whitchurch 522c532c4f virt-pci: add platform bus support
This driver registers PCI busses, but the underlying virtio protocol
could just as easily be used to provide a platform bus instead.  If the
virtio device node in the devicetree indicates that it's compatible with
simple-bus, register platform devices instead of handling it as a PCI
bus.

Only one platform bus is allowed and the logic MMIO region for the
platform bus is placed at an arbitrarily-chosen address away from the
PCI region.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Vincent Whitchurch 935f8f7a01 um-virt-pci: Make max delay configurable
There is a hard coded maximum time for which the driver busy-loops
waiting for a response from the virtio device.  This default time may be
too short for some systems, so make it a module parameter so that it can
be set via the kernel command line.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Vincent Whitchurch 314a1408b7 um: virt-pci: implement pcibios_get_phb_of_node()
Implement pcibios_get_phb_of_node() as x86 does in order to allow PCI
busses to be associated with devicetree nodes.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Peter Foley 83e913f52a um: Support LTO
Only a handful of changes are necessary to get it to work.

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Peter Foley f09c3fcf67 um: put power options in a menu
Because having them all dumped at top-level is a bit messy.

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:31 +01:00
Peter Foley 6aa56115c7 um: Use CFLAGS_vmlinux
link-vmlinux.sh doesn't use LDFLAGS_vmlinux when linking the kernel for
UML. Move the LDFLAGS_EXESTACK options into CFLAGS_vmlinux so they're
actually respected.

e.g.
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: warning: .tmp_vmlinux.kallsyms3.o: missing .note.GNU-stack section implies executable stack
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: warning: vmlinux has a LOAD segment with RWX permissions

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:20 +01:00
Peter Foley 910dba4123 um: Prevent building modules incompatible with MODVERSIONS
The manual ld invocation in arch/um/drivers doesn't play nicely with
genksyms. Given the problematic modules are deprecated anyway, just
prevent building them when using MODVERSIONS.

e.g.
MODPOST Module.symvers
arch/um/drivers/.pcap.o.cmd: No such file or directory

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:14:16 +01:00
Peter Foley 2c4d3841a8 um: Avoid pcap multiple definition errors
Change the function name in pcap_kern to avoid conflicting with
libpcap.a.

e.g.
ld: /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libpcap.a(pcap.o): in function `pcap_init':
(.text+0x7f0): multiple definition of `pcap_init'; arch/um/drivers/pcap_kern.o:pcap_kern.c:(.text.unlikely+0x0): first defined here

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-13 10:13:48 +01:00
Gustavo A. R. Silva 9a0450ada8 xen: Replace one-element array with flexible-array member
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct xen_page_directory.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

This results in no differences in binary output.

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/255
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/Y9xjN6Wa3VslgXeX@work
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 09:15:45 +01:00
Julien Gomes 85b93bbd1c tpm: add vendor flag to command code validation
Some TPM 2.0 devices have support for additional commands which are not
part of the TPM 2.0 specifications.
These commands are identified with bit 29 of the 32 bits command codes.
Contrarily to other fields of the TPMA_CC spec structure used to list
available commands, the Vendor flag also has to be present in the
command code itself (TPM_CC) when called.

Add this flag to tpm_find_cc() mask to prevent blocking vendor command
codes that can actually be supported by the underlying TPM device.

Signed-off-by: Julien Gomes <julien@arista.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Eddie James 1e2714bb83 tpm: Add reserved memory event log
Some platforms may desire to pass the event log up to Linux in the
form of a reserved memory region. In particular, this is desirable
for embedded systems or baseboard management controllers (BMCs)
booting with U-Boot. IBM OpenBMC BMCs will be the first user.
Add support for the reserved memory in the TPM core to find the
region and map it.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
[jarkko: removed spurious dev_info()'s from tpm_read_log_memory_region()]
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
[yang: return -ENOMEM when devm_memremap() fails]
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Eddie James 441b715272 tpm: Use managed allocation for bios event log
Since the bios event log is freed in the device release function,
let devres handle the deallocation. This will allow other memory
allocation/mapping functions to be used for the bios event log.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Uwe Kleine-König 40078327f6 tpm: tis_i2c: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Uwe Kleine-König 8f3fb73b8b tpm: tpm_i2c_nuvoton: Convert to i2c's .probe_new()
.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Uwe Kleine-König d5ae2f4760 tpm: tpm_i2c_infineon: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Uwe Kleine-König d787c95b56 tpm: tpm_i2c_atmel: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Uwe Kleine-König 376f88f44e tpm: st33zp24: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Denis Kenzior 10de7b5429 KEYS: asymmetric: Fix ECDSA use via keyctl uapi
When support for ECDSA keys was added, constraints for data & signature
sizes were never updated.  This makes it impossible to use such keys via
keyctl API from userspace.

Update constraint on max_data_size to 64 bytes in order to support
SHA512-based signatures. Also update the signature length constraints
per ECDSA signature encoding described in RFC 5480.

Fixes: 299f561a66 ("x509: Add support for parsing x509 certs with ECDSA keys")
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Thomas Weißschuh c95e8f6fd1 certs: don't try to update blacklist keys
When the same key is blacklisted repeatedly logging at pr_err() level is
excessive as no functionality is impaired.
When these duplicates are provided by buggy firmware there is nothing
the user can do to fix the situation.
Instead of spamming the bootlog with errors we use a warning that can
still be seen by OEMs when testing their firmware.

Link: https://lore.kernel.org/all/c8c65713-5cda-43ad-8018-20f2e32e4432@t-8ch.de/
Link: https://lore.kernel.org/all/20221104014704.3469-1-linux@weissschuh.net/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Thomas Weißschuh 6c1976addf KEYS: Add new function key_create()
key_create() works like key_create_or_update() but does not allow
updating an existing key, instead returning ERR_PTR(-EEXIST).

key_create() will be used by the blacklist keyring which should not
create duplicate entries or update existing entries.
Instead a dedicated message with appropriate severity will be logged.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Thomas Weißschuh 06b53b0294 certs: make blacklisted hash available in klog
One common situation triggering this log statement are duplicate hashes
reported by the system firmware.

These duplicates should be removed from the firmware.

Without logging the blacklisted hash triggering the issue however the users
can not report it properly to the firmware vendors and the firmware vendors
can not easily see which specific hash is duplicated.

While changing the log message also use the dedicated ERR_PTR format
placeholder for the returned error value.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:11:20 +02:00
Matthew Garrett 4d27328827 tpm_crb: Add support for CRB devices based on Pluton
Pluton is an integrated security processor present in some recent Ryzen
parts. If it's enabled, it presents two devices - an MSFT0101 ACPI device
that's broadly an implementation of a Command Response Buffer TPM2, and an
MSFT0200 ACPI device whose functionality I haven't examined in detail yet.
This patch only attempts to add support for the TPM device.

There's a few things that need to be handled here. The first is that the
TPM2 ACPI table uses a previously undefined start method identifier. The
table format appears to include 16 bytes of startup data, which corresponds
to one 64-bit address for a start message and one 64-bit address for a
completion response. The second is that the ACPI tables on the Thinkpad Z13
I'm testing this on don't define any memory windows in _CRS (or, more
accurately, there are two empty memory windows). This check doesn't seem
strictly necessary, so I've skipped that.

Finally, it seems like chip needs to be explicitly asked to transition into
ready status on every command. Failing to do this means that if two
commands are sent in succession without an idle/ready transition in
between, everything will appear to work fine but the response is simply the
original command. I'm working without any docs here, so I'm not sure if
this is actually the required behaviour or if I'm missing something
somewhere else, but doing this results in the chip working reliably.

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:10:52 +02:00
Kailang Yang 2bdccfd290 ALSA: hda/realtek - fixed wrong gpio assigned
GPIO2 PIN use for output. Mask Dir and Data need to assign for 0x4. Not 0x3.
This fixed was for Lenovo Desktop(0x17aa1056). GPIO2 use for AMP enable.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/8d02bb9ac8134f878cd08607fdf088fd@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-13 09:10:21 +01:00
Arnd Bergmann 0f5d4a0b99 crypto: certs: fix FIPS selftest dependency
The selftest code is built into the x509_key_parser module, and depends
on the pkcs7_message_parser module, which in turn has a dependency on
the key parser, creating a dependency loop and a resulting link
failure when the pkcs7 code is a loadable module:

ld: crypto/asymmetric_keys/selftest.o: in function `fips_signature_selftest':
crypto/asymmetric_keys/selftest.c:205: undefined reference to `pkcs7_parse_message'
ld: crypto/asymmetric_keys/selftest.c:209: undefined reference to `pkcs7_supply_detached_data'
ld: crypto/asymmetric_keys/selftest.c:211: undefined reference to `pkcs7_verify'
ld: crypto/asymmetric_keys/selftest.c:215: undefined reference to `pkcs7_validate_trust'
ld: crypto/asymmetric_keys/selftest.c:219: undefined reference to `pkcs7_free_message'

Avoid this by only allowing the selftest to be enabled when either
both parts are loadable modules, or both are built-in.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-02-13 10:00:41 +02:00
Kalle Valo b187c70b03 iwlwifi updates towards v6.3, this patch-set contains:
* EHT rate reporting
 * Sniffer support for EHT and a few fixes in the related code
 * A few general cleanups and small bugfixes
 * Bump FW API to 74 for AX devices
 * Fix a compilation error in mei (it still depends on BROKEN)
 * STEP equalizer support - transfer some Phy related parameters from
   the BIOS to the firmware
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEE9cg2NujikJ5EMZusCDCCYA5zdzwFAmPXqs4bHGdyZWdvcnku
 Z3JlZW5tYW5AaW50ZWwuY29tAAoJEAgwgmAOc3c8+DMP/1YmHcT4X6CemXoAm+EK
 HPDiVDQ+cgUhutIw7UrmpWqjiGW3R7QXtZMzdEWE7xU9qw5h9K1lihdrOde6kUFV
 eHbN1nRFn3yOTcbhlKZfenOySJD5cP/T5ce3wx4SGugwPbsGIQtNh8ZSnCgbqgfT
 YXmSPGfXp12aK5E2RbGGTYcFVnJgoaLNnI0iMiLciB3YqeZaPPLDZ4+CJ1Vpwf4p
 h/jIeKFIupgv/9xG5gFSd18qh96NHTDE6JQjZKRh/iyyqJxqBuS32lRM5v4kTkMK
 JzleBMaaLqrWgBbBYemmEyaslCOHHkfp/tqnR0J1X+GES7TkjRuMEuDyUzIr8OfR
 iF0i/32VWyx5PVWEHn1d8IDzNH2EM7Gr11AQz1wDoX7ig0Iv99G9Sdp1hbYNXZIb
 MROZYArZtQ4WIZS0brULU9PY+9/ZRH3PYOgY+tGYhDkCDIcGuWOl/YQRY3yPMkME
 MV0Q+jXnGXXDNM8nhflMYFv4P02/E1HI0dP+qAlEIdU3hv6abJmzGOI7vfN+j09B
 XFxgHsFPvJocXv15zdvSBQVgRjLxSqMISqc42AJ6fQ3ZY1bUCC2+9pi+iN4ko+1u
 fk+qd5/dGTsEKQ/sIfzOGjosC8vOVeZMAECpsB324bq60mElEjK1cQfgoHnSKIOM
 ECz9vZqKtlcQrp4Br5FMXcPa
 =xefb
 -----END PGP SIGNATURE-----

Merge tag 'iwlwifi-next-for-kalle-2023-01-30' of http://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

iwlwifi updates towards v6.3, this patch-set contains:

* EHT rate reporting
* Sniffer support for EHT and a few fixes in the related code
* A few general cleanups and small bugfixes
* Bump FW API to 74 for AX devices
* Fix a compilation error in mei (it still depends on BROKEN)
* STEP equalizer support - transfer some Phy related parameters from
  the BIOS to the firmware
2023-02-13 08:58:37 +02:00
Oleksandr Tyshchenko 2062f9fb64 xen/grant-dma-iommu: Implement a dummy probe_device() callback
Update stub IOMMU driver (which main purpose is to reuse generic
IOMMU device-tree bindings by Xen grant DMA-mapping layer on Arm)
according to the recent changes done in the following
commit 57365a04c9 ("iommu: Move bus setup to IOMMU device registration").

With probe_device() callback being called during IOMMU device registration,
the uninitialized callback just leads to the "kernel NULL pointer
dereference" issue during boot. Fix that by adding a dummy callback.

Looks like the release_device() callback is not mandatory to be
implemented as IOMMU framework makes sure that callback is initialized
before dereferencing.

Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Fixes: 57365a04c9 ("iommu: Move bus setup to IOMMU device registration")
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20230208153649.3604857-1-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 07:22:08 +01:00
Daniel Wagner 5f69f009b7 nvme-pci: add bogus ID quirk for ADATA SX6000PNP
Yet another device which needs a quirk:

 nvme nvme1: globally duplicate IDs for nsid 1
 nvme nvme1: VID:DID 10ec:5763 model:ADATA SX6000PNP firmware:V9002s94

Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1207827
Reported-by: Gustavo Freitas <freitasmgustavo@gmail.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-13 07:04:53 +01:00
Volodymyr Babchuk c70b7741dd xen/pvcalls-back: fix permanently masked event channel
There is a sequence of events that can lead to a permanently masked
event channel, because xen_irq_lateeoi() is newer called. This happens
when a backend receives spurious write event from a frontend. In this
case pvcalls_conn_back_write() returns early and it does not clears the
map->write counter. As map->write > 0, pvcalls_back_ioworker() returns
without calling xen_irq_lateeoi(). This leaves the event channel in
masked state, a backend does not receive any new events from a
frontend and the whole communication stops.

Move atomic_set(&map->write, 0) to the very beginning of
pvcalls_conn_back_write() to fix this issue.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reported-by: Oleksii Moisieiev <oleksii_moisieiev@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230119211037.1234931-1-volodymyr_babchuk@epam.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:20 +01:00
David Woodhouse 3e8cd711c3 xen: Allow platform PCI interrupt to be shared
When we don't use the per-CPU vector callback, we ask Xen to deliver event
channel interrupts as INTx on the PCI platform device. As such, it can be
shared with INTx on other PCI devices.

Set IRQF_SHARED, and make it return IRQ_HANDLED or IRQ_NONE according to
whether the evtchn_upcall_pending flag was actually set. Now I can share
the interrupt:

 11:         82          0   IO-APIC  11-fasteoi   xen-platform-pci, ens4

Drop the IRQF_TRIGGER_RISING. It has no effect when the IRQ is shared,
and besides, the only effect it was having even beforehand was to trigger
a debug message in both I/OAPIC and legacy PIC cases:

[    0.915441] genirq: No set_type function for IRQ 11 (IO-APIC)
[    0.951939] genirq: No set_type function for IRQ 11 (XT-PIC)

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/f9a29a68d05668a3636dd09acd94d970269eaec6.camel@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:20 +01:00
Krister Johansen caea091e48 x86/xen/time: prefer tsc as clocksource when it is invariant
Kvm elects to use tsc instead of kvm-clock when it can detect that the
TSC is invariant.

(As of commit 7539b174ae ("x86: kvmguest: use TSC clocksource if
invariant TSC is exposed")).

Notable cloud vendors[1] and performance engineers[2] recommend that Xen
users preferentially select tsc over xen-clocksource due the performance
penalty incurred by the latter.  These articles are persuasive and
tailored to specific use cases.  In order to understand the tradeoffs
around this choice more fully, this author had to reference the
documented[3] complexities around the Xen configuration, as well as the
kernel's clocksource selection algorithm.  Many users may not attempt
this to correctly configure the right clock source in their guest.

The approach taken in the kvm-clock module spares users this confusion,
where possible.

Both the Intel SDM[4] and the Xen tsc documentation explain that marking
a tsc as invariant means that it should be considered stable by the OS
and is elibile to be used as a wall clock source.

In order to obtain better out-of-the-box performance, and reduce the
need for user tuning, follow kvm's approach and decrease the xen clock
rating so that tsc is preferable, if it is invariant, stable, and the
tsc will never be emulated.

[1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
[2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
[3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
[4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume
    3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Code-reviewed-by: David Reaver <me@davidreaver.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221216162118.GB2633@templeofstupid.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:20 +01:00
Juergen Gross f697cb00af x86/xen: mark xen_pv_play_dead() as __noreturn
Mark xen_pv_play_dead() and related to that xen_cpu_bringup_again()
as "__noreturn".

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221125063248.30256-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:19 +01:00
Juergen Gross 336f560a89 x86/xen: don't let xen_pv_play_dead() return
A function called via the paravirt play_dead() hook should not return
to the caller.

xen_pv_play_dead() has a problem in this regard, as it currently will
return in case an offlined cpu is brought to life again. This can be
changed only by doing basically a longjmp() to cpu_bringup_and_idle(),
as the hypercall for bringing down the cpu will just return when the
cpu is coming up again. Just re-initializing the cpu isn't possible,
as the Xen hypervisor will deny that operation.

So introduce xen_cpu_bringup_again() resetting the stack and calling
cpu_bringup_and_idle(), which can be called after HYPERVISOR_vcpu_op()
in xen_pv_play_dead().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221125063248.30256-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:19 +01:00
Per Bilse 415dab3c17 drivers/xen/hypervisor: Expose Xen SIF flags to userspace
/proc/xen is a legacy pseudo filesystem which predates Xen support
getting merged into Linux.  It has largely been replaced with more
normal locations for data (/sys/hypervisor/ for info, /dev/xen/ for
user devices).  We want to compile xenfs support out of the dom0 kernel.

There is one item which only exists in /proc/xen, namely
/proc/xen/capabilities with "control_d" being the signal of "you're in
the control domain".  This ultimately comes from the SIF flags provided
at VM start.

This patch exposes all SIF flags in /sys/hypervisor/start_flags/ as
boolean files, one for each bit, returning '1' if set, '0' otherwise.
Two known flags, 'privileged' and 'initdomain', are explicitly named,
and all remaining flags can be accessed via generically named files,
as suggested by Andrew Cooper.

Signed-off-by: Per Bilse <per.bilse@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230103130213.2129753-1-per.bilse@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:19 +01:00
Steven Rostedt (Google) 70b5339caf tracing: Make trace_define_field_ext() static
trace_define_field_ext() is not used outside of trace_events.c, it should
be static.

Link: https://lore.kernel.org/oe-kbuild-all/202302130750.679RaRog-lkp@intel.com/

Fixes: b6c7abd1c2 ("tracing: Fix TASK_COMM_LEN in trace event format file")
Reported-by: Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-12 20:14:11 -05:00
Thomas Weißschuh 2b188a2cfc zonefs: make kobj_type structure constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definition to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-02-13 08:03:48 +09:00
Geert Uytterhoeven 5aa52ccf69 m68k: nommu: Fix misspellings of "DragonEngine"
Exys produced the "DragonEngine II" board.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2023-02-13 08:56:39 +10:00
Geert Uytterhoeven 8f60772729 m68k: nommu: Fix misspellings of "uCdimm"
Arcturus Networks produced the "uCsimm" and "uCdimm" modules.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2023-02-13 08:56:39 +10:00
Dave Chinner bd4f5d09cc xfs: refactor the filestreams allocator pick functions
Now that the filestreams allocator is largely rewritten,
restructure the main entry point and pick function to seperate out
the different operations cleanly. The MRU lookup function should not
handle the start AG selection on MRU lookup failure, and nor should
the pick function handle building the association that is inserted
into the MRU.

This leaves the filestreams allocator fairly clean and easy to
understand, returning to the caller with an active perag reference
and a target block to allocate at.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:56 +11:00
Dave Chinner f8f1ed1ab3 xfs: return a referenced perag from filestreams allocator
Now that the filestreams AG selection tracks active perags, we need
to return an active perag to the core allocator code. This is
because the file allocation the filestreams code will run are AG
specific allocations and so need to pin the AG until the allocations
complete.

We cannot rely on the filestreams item reference to do this - the
filestreams association can be torn down at any time, hence we
need to have a separate reference for the allocation process to pin
the AG after it has been selected.

This means there is some perag juggling in allocation failure
fallback paths as they will do all AG scans in the case the AG
specific allocation fails. Hence we need to track the perag
reference that the filestream allocator returned to make sure we
don't leak it on repeated allocation failure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:56 +11:00
Dave Chinner 571e259282 xfs: pass perag to filestreams tracing
Pass perags instead of raw ag numbers, avoiding the need for the
special peek function for the tracing code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:56 +11:00
Dave Chinner eb70aa2d8e xfs: use for_each_perag_wrap in xfs_filestream_pick_ag
xfs_filestream_pick_ag() is now ready to rework to use
for_each_perag_wrap() for iterating the perags during the AG
selection scan.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner 3054face13 xfs: track an active perag reference in filestreams
Rather than just track the agno of the reference, track a referenced
perag pointer instead. This will allow active filestreams to prevent
AGs from going away until the filestreams have been torn down.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner f38b46bbfa xfs: factor out MRU hit case in xfs_filestream_select_ag
Because it now stands out like a sore thumb. Factoring out this case
starts the process of simplifying xfs_filestream_select_ag() again.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner 3e43877a9d xfs: remove xfs_filestream_select_ag() longest extent check
Picking a new AG checks the longest free extent in the AG is valid,
so there's no need to repeat the check in
xfs_filestream_select_ag(). Remove it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner ba34de8def xfs: merge new filestream AG selection into xfs_filestream_select_ag()
This is largely a wrapper around xfs_filestream_pick_ag() that
repeats a lot of the lookups that we just merged back into
xfs_filestream_select_ag() from the lookup code. Merge the
xfs_filestream_new_ag() code back into _select_ag() to get rid
of all the unnecessary logic.

Indeed, this makes it obvious that if we have no parent inode,
the filestreams allocator always selects AG 0 regardless of whether
it is fit for purpose or not.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner a52dc2ad36 xfs: merge filestream AG lookup into xfs_filestream_select_ag()
The lookup currently either returns the cached filestream AG or it
calls xfs_filestreams_select_lengths() to looks up a new AG. This
has verify the AG that is selected, so we end up doing "select a new
AG loop in a couple of places when only one really is needed.  Merge
the initial lookup functionality with the length selection so that
we only need to do a single pick loop on lookup or verification
failure.

This undoes a lot of the factoring that enabled the selection to be
moved over to the filestreams code. It makes
xfs_filestream_select_ag() an awful messier, but it has to be made
worse before it can get better in future patches...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00
Dave Chinner 8f7747ad8c xfs: move xfs_bmap_btalloc_filestreams() to xfs_filestreams.c
xfs_bmap_btalloc_filestreams() calls two filestreams functions to
select the AG to allocate from. Both those functions end up in
the same selection function that iterates all AGs multiple times.
Worst case, xfs_bmap_btalloc_filestreams() can iterate all AGs 4
times just to select the initial AG to allocate in.

Move the AG selection to fs/xfs/xfs_filestreams.c as a single
interface so that the inefficient AG interation is contained
entirely within the filestreams code. This will allow the
implementation to be simplified and made more efficient in future
patches.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-02-13 09:14:55 +11:00