Commit Graph

1106901 Commits (aa40d5a43526cca9439a2b45fcfdcd016594dece)

Author SHA1 Message Date
Jan Kara f950667356 bfq: Relax waker detection for shared queues
Currently we look for waker only if current queue has no requests. This
makes sense for bfq queues with a single process however for shared
queues when there is a larger number of processes the condition that
queue has no requests is difficult to meet because often at least one
process has some request in flight although all the others are waiting
for the waker to do the work and this harms throughput. Relax the "no
queued request for bfq queue" condition to "the current task has no
queued requests yet". For this, we also need to start tracking number of
requests in flight for each task.

This patch (together with the following one) restores the performance
for dbench with 128 clients that regressed with commit c65e6fd460
("bfq: Do not let waker requests skip proper accounting") because
this commit makes requests of wakers properly enter BFQ queues and thus
these queues become ineligible for the old waker detection logic.
Dbench results:

         Vanilla 5.18-rc3        5.18-rc3 + revert      5.18-rc3 patched
Mean     1237.36 (   0.00%)      950.16 *  23.21%*      988.35 *  20.12%*

Numbers are time to complete workload so lower is better.

Fixes: c65e6fd460 ("bfq: Do not let waker requests skip proper accounting")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220519105235.31397-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-19 06:52:33 -06:00
Marijn Suijten 4d8a768ef4 pinctrl: qcom: spmi-gpio: Add pm6125 compatible
The pm6125 has 9 GPIOs with no holes inbetween.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220511220613.1015472-4-marijn.suijten@somainline.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-05-19 14:52:10 +02:00
Marijn Suijten f82a2c212d dt-bindings: pinctrl: qcom-pmic-gpio: Add pm6125 compatible
The pm6125 comes with 9 GPIOs, without holes.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220511220613.1015472-3-marijn.suijten@somainline.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-05-19 14:51:25 +02:00
Luca Miccio 5b3353949e xen: add support for initializing xenstore later as HVM domain
When running as dom0less guest (HVM domain on ARM) the xenstore event
channel is available at domain creation but the shared xenstore
interface page only becomes available later on.

In that case, wait for a notification on the xenstore event channel,
then complete the xenstore initialization later, when the shared page
is actually available.

The xenstore page has few extra field. Add them to the shared struct.
One of the field is "connection", when the connection is ready, it is
zero. If the connection is not-zero, wait for a notification.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220513211938.719341-2-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:44:08 +02:00
Stefano Stabellini 62db0fafa8 xen: sync xs_wire.h header with upstream xen
Sync the xs_wire.h header file in Linux with the one in Xen.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220513211938.719341-1-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:44:05 +02:00
Maximilian Heyne 1591a65f55 x86: xen: remove STACK_FRAME_NON_STANDARD from xen_cpuid
Since commit 4d65adfcd1 ("x86: xen: insn: Decode Xen and KVM
emulate-prefix signature"), objtool is able to correctly parse the
prefixed instruction in xen_cpuid and emit correct orc unwind
information. Hence, marking the function as STACKFRAME_NON_STANDARD is
no longer needed.

This commit is basically a revert of commit 983bb6d254 ("x86/xen: Mark
xen_cpuid() stack frame as non-standard").

Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20220517162425.100567-1-mheyne@amazon.de
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:39:50 +02:00
SeongJae Park 12f112c3e3 xen-blk{back,front}: Update contact points for buffer_squeeze_duration_ms and feature_persistent
SeongJae is currently listed as a contact point for some blk{back,front}
features, but he will not work for XEN for a while.  This commit
therefore updates the contact point to his colleague, Maximilian, who is
understanding the context and actively working with the features now.

Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20220420072734.1692-1-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:38:28 +02:00
Zhihao Cheng 68f4c6eba7 fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
Commit 505a666ee3 ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during wb_writeback, which
may cause a potential ABBA dead lock:

    wb_writeback		fat_file_fsync
blk_start_plug(&plug)
for (;;) {
  iter i-1: some reqs have been added into plug->mq_list  // LOCK A
  iter i:
    progress = __writeback_inodes_wb(wb, work)
    . writeback_sb_inodes // fat's bdev
    .   __writeback_single_inode
    .   . generic_writepages
    .   .   __block_write_full_page
    .   .   . . 	    __generic_file_fsync
    .   .   . . 	      sync_inode_metadata
    .   .   . . 	        writeback_single_inode
    .   .   . . 		  __writeback_single_inode
    .   .   . . 		    fat_write_inode
    .   .   . . 		      __fat_write_inode
    .   .   . . 		        sync_dirty_buffer	// fat's bdev
    .   .   . . 			  lock_buffer(bh)	// LOCK B
    .   .   . . 			    submit_bh
    .   .   . . 			      blk_mq_get_tag	// LOCK A
    .   .   . trylock_buffer(bh)  // LOCK B
    .   .   .   redirty_page_for_writepage
    .   .   .     wbc->pages_skipped++
    .   .   --wbc->nr_to_write
    .   wrote += write_chunk - wbc.nr_to_write  // wrote > 0
    .   requeue_inode
    .     redirty_tail_locked
    if (progress)    // progress > 0
      continue;
  iter i+1:
      queue_io
      // similar process with iter i, infinite for-loop !
}
blk_finish_plug(&plug)   // flush plug won't be called

Above process triggers a hungtask like:
[  399.044861] INFO: task bb:2607 blocked for more than 30 seconds.
[  399.046824]       Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty
[  399.051539] task:bb              state:D stack:    0 pid: 2607 ppid:
2426 flags:0x00004000
[  399.051556] Call Trace:
[  399.051570]  __schedule+0x480/0x1050
[  399.051592]  schedule+0x92/0x1a0
[  399.051602]  io_schedule+0x22/0x50
[  399.051613]  blk_mq_get_tag+0x1d3/0x3c0
[  399.051640]  __blk_mq_alloc_requests+0x21d/0x3f0
[  399.051657]  blk_mq_submit_bio+0x68d/0xca0
[  399.051674]  __submit_bio+0x1b5/0x2d0
[  399.051708]  submit_bio_noacct+0x34e/0x720
[  399.051718]  submit_bio+0x3b/0x150
[  399.051725]  submit_bh_wbc+0x161/0x230
[  399.051734]  __sync_dirty_buffer+0xd1/0x420
[  399.051744]  sync_dirty_buffer+0x17/0x20
[  399.051750]  __fat_write_inode+0x289/0x310
[  399.051766]  fat_write_inode+0x2a/0xa0
[  399.051783]  __writeback_single_inode+0x53c/0x6f0
[  399.051795]  writeback_single_inode+0x145/0x200
[  399.051803]  sync_inode_metadata+0x45/0x70
[  399.051856]  __generic_file_fsync+0xa3/0x150
[  399.051880]  fat_file_fsync+0x1d/0x80
[  399.051895]  vfs_fsync_range+0x40/0xb0
[  399.051929]  __x64_sys_fsync+0x18/0x30

In my test, 'need_resched()' (which is imported by 590dca3a71 "fs-writeback:
unplug before cond_resched in writeback_sb_inodes") in function
'writeback_sb_inodes()' seldom comes true, unless cond_resched() is deleted
from write_cache_pages().

Fix it by correcting wrote number according number of skipped pages
in writeback_sb_inodes().

Goto Link to find a reproducer.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215837
Cc: stable@vger.kernel.org # v4.3
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220510133805.1988292-1-chengzhihao1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-19 06:29:41 -06:00
Juergen Gross 4573240f07 xen/xenbus: eliminate xenbus_grant_ring()
There is no external user of xenbus_grant_ring() left, so merge it into
the only caller xenbus_setup_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:11 +02:00
Juergen Gross 360dc89d12 xen/sndfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify sndfront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:08 +02:00
Juergen Gross 2b3daf083a xen/usbfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify xen-hcd's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:05 +02:00
Juergen Gross caa427d252 xen/scsifront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify scsifront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:03 +02:00
Juergen Gross 0e6b139dbd xen/pcifront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify pcifront's shared page creation and removal via
xenbus_setup_ring() and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:02 +02:00
Juergen Gross ae19265ca3 xen/drmfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify drmfront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:01 +02:00
Juergen Gross 5e0afd8eab xen/tpmfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify tpmfront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring(), which are provided exactly for the use
pattern as seen in this driver.

Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:59 +02:00
Juergen Gross 46e20d43f5 xen/netfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify netfront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:58 +02:00
Juergen Gross 47cbd59833 xen/blkfront: use xenbus_setup_ring() and xenbus_teardown_ring()
Simplify blkfront's ring creation and removal via xenbus_setup_ring()
and xenbus_teardown_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:56 +02:00
Juergen Gross 7050096d07 xen/xenbus: add xenbus_setup_ring() service function
Most PV device frontends share very similar code for setting up shared
ring buffers:

- allocate page(s)
- init the ring admin data
- give the backend access to the ring via grants

Tearing down the ring requires similar actions in all frontends again:

- remove grants
- free the page(s)

Provide service functions xenbus_setup_ring() and xenbus_teardown_ring()
for that purpose.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:55 +02:00
Juergen Gross 6fac592cca xen: update ring.h
Update include/xen/interface/io/ring.h to its newest version.

Switch the two improper use cases of RING_HAS_UNCONSUMED_RESPONSES() to
XEN_RING_NR_UNCONSUMED_RESPONSES() in order to avoid the nasty
XEN_RING_HAS_UNCONSUMED_IS_BOOL #define.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:53 +02:00
Juergen Gross 888fd787f3 xen/shbuf: switch xen-front-pgdir-shbuf to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:51 +02:00
Juergen Gross bd506c7812 xen/dmabuf: switch gntdev-dmabuf to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:48 +02:00
Juergen Gross 297ce02669 xen/sound: switch xen_snd_front to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:46 +02:00
Juergen Gross cb5216319b xen/drm: switch xen_drm_front to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:45 +02:00
Juergen Gross edd81e7caa xen/usb: switch xen-hcd to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:43 +02:00
Juergen Gross 70920be6ff xen/scsifront: remove unused GRANT_INVALID_REF definition
GRANT_INVALID_REF isn't used in scsifront, so remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:42 +02:00
Juergen Gross 145daab239 xen/netfront: switch netfront to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:40 +02:00
Juergen Gross 21b539711a xen/blkfront: switch blkfront to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:39 +02:00
Juergen Gross 8c9eb0e373 xen/grant-table: never put a reserved grant on the free list
Make sure a reserved grant is never put on the free list, as this could
cause hard to debug errors.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:37 +02:00
Juergen Gross 79c22318f8 xen: update grant_table.h
Update include/xen/interface/grant_table.h to its newest version.

This allows to drop some private definitions in grant-table.c and
include/xen/grant_table.h.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:34 +02:00
Juergen Gross 6d1c2f48f3 xen/scsifront: harden driver against malicious backend
Instead of relying on a well behaved PV scsi backend verify all meta
data received from the backend and avoid multiple reads of the same
data from the shared ring page.

In case any illegal data from the backend is detected switch the
PV device to a new "error" state and deactivate it for further use.

Use the "lateeoi" variant for the event channel in order to avoid
event storms blocking the guest.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220428075323.12853-5-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:02:49 +02:00
Juergen Gross a2f6751d5a xen/scsifront: use new command result macros
Add a translation layer for the command result values.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220428075323.12853-4-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:02:47 +02:00
Juergen Gross 54aee68bb6 xen/scsiback: use new command result macros
Instead of using the kernel's values for the result of PV scsi
operations use the values of the interface definition.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220428075323.12853-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:02:43 +02:00
Juergen Gross 5ce9231c5b xen: update vscsiif.h
Update include/xen/interface/io/vscsiif.h to its newest version.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220428075323.12853-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:02:40 +02:00
Kees Cook aeb8441203 x86/boot: Wrap literal addresses in absolute_pointer()
GCC 11 (incorrectly[1]) assumes that literal values cast to (void *)
should be treated like a NULL pointer with an offset, and raises
diagnostics when doing bounds checking under -Warray-bounds. GCC 12
got "smarter" about finding these:

  In function 'rdfs8',
      inlined from 'vga_recalc_vertical' at /srv/code/arch/x86/boot/video-mode.c:124:29,
      inlined from 'set_mode' at /srv/code/arch/x86/boot/video-mode.c:163:3:
  /srv/code/arch/x86/boot/boot.h:114:9: warning: array subscript 0 is outside array bounds of 'u8[0]' {aka 'unsigned char[]'} [-Warray-bounds]
    114 |         asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr));
        |         ^~~

This has been solved in other places[2] already by using the recently
added absolute_pointer() macro. Do the same here.

  [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578
  [2] https://lore.kernel.org/all/20210912160149.2227137-1-linux@roeck-us.net/

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220227195918.705219-1-keescook@chromium.org
2022-05-19 12:47:30 +02:00
Boris Pismenny c1318b39c7 tls: Add opt-in zerocopy mode of sendfile()
TLS device offload copies sendfile data to a bounce buffer before
transmitting. It allows to maintain the valid MAC on TLS records when
the file contents change and a part of TLS record has to be
retransmitted on TCP level.

In many common use cases (like serving static files over HTTPS) the file
contents are not changed on the fly. In many use cases breaking the
connection is totally acceptable if the file is changed during
transmission, because it would be received corrupted in any case.

This commit allows to optimize performance for such use cases to
providing a new optional mode of TLS sendfile(), in which the extra copy
is skipped. Removing this copy improves performance significantly, as
TLS and TCP sendfile perform the same operations, and the only overhead
is TLS header/trailer insertion.

The new mode can only be enabled with the new socket option named
TLS_TX_ZEROCOPY_SENDFILE on per-socket basis. It preserves backwards
compatibility with existing applications that rely on the copying
behavior.

The new mode is safe, meaning that unsolicited modifications of the file
being sent can't break integrity of the kernel. The worst thing that can
happen is sending a corrupted TLS record, which is in any case not
forbidden when using regular TCP sockets.

Sockets other than TLS device offload are not affected by the new socket
option. The actual status of zerocopy sendfile can be queried with
sock_diag.

Performance numbers in a single-core test with 24 HTTPS streams on
nginx, under 100% CPU load:

* non-zerocopy: 33.6 Gbit/s
* zerocopy: 79.92 Gbit/s

CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz

Signed-off-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220518092731.1243494-1-maximmi@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-19 12:14:11 +02:00
Keerthy ffcb2fc86e thermal: k3_j72xx_bandgap: Add the bandgap driver support
Add VTM thermal support. In the Voltage Thermal Management
Module(VTM), K3 J72XX supplies a voltage reference and a temperature
sensor feature that are gathered in the band gap voltage and
temperature sensor (VBGAPTS) module. The band gap provides current and
voltage reference for its internal circuits and other analog IP
blocks. The analog-to-digital converter (ADC) produces an output value
that is proportional to the silicon temperature.

Currently reading temperatures only is supported.  There are no
active/passive cooling agent supported.

J721e SoCs have errata i2128: https://www.ti.com/lit/pdf/sprz455

The VTM Temperature Monitors (TEMPSENSORs) are trimmed during production,
with the resulting values stored in software-readable registers. Software
should use these  register values when translating the Temperature
Monitor output codes to temperature values.

It has an involved workaround. Software needs to read the error codes for
-40C, 30C, 125C from the efuse for each device & derive a new look up table
for adc to temperature conversion. Involved calculating slopes & constants
using 3 different straight line equations with adc refernce codes as the
y-axis & error codes in the x-axis.

-40C to 30C
30C to 125C
125C to 150C

With the above 2 line equations we derive the full look-up table to
workaround the errata i2128 for j721e SoC.

Tested temperature reading on J721e SoC & J7200 SoC.

[daniel.lezcano@linaro.org: Generate look-up tables run-time]

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20220517172920.10857-3-j-keerthy@ti.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:53 +02:00
Keerthy 031c2952d1 dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
Add VTM bindings documentation. In the Voltage Thermal
Management Module(VTM), K3 J72XX supplies a voltage
reference and a temperature sensor feature that are gathered in the band
gap voltage and temperature sensor (VBGAPTS) module. The band
gap provides current and voltage reference for its internal
circuits and other analog IP blocks. The analog-to-digital
converter (ADC) produces an output value that is proportional
to the silicon temperature.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220517172920.10857-2-j-keerthy@ti.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:53 +02:00
Miaoqian Lin 09700c504d thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
of_find_node_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.

Fixes: e20db70dba ("thermal: imx_sc: add i.MX system controller thermal support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220517055121.18092-1-linmq006@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:53 +02:00
Yang Yingliang 98a160e898 thermal/core: Fix memory leak in __thermal_cooling_device_register()
I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff888010080000 (size 264312):
  comm "182", pid 102533, jiffies 4296434960 (age 10.100s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 40 7f 1f b9 ff ff ff ff  ........@.......
  backtrace:
    [<0000000038b2f4fc>] kmalloc_order_trace+0x1d/0x110 mm/slab_common.c:969
    [<00000000ebcb8da5>] __kmalloc+0x373/0x420 include/linux/slab.h:510
    [<0000000084137f13>] thermal_cooling_device_setup_sysfs+0x15d/0x2d0 include/linux/slab.h:586
    [<00000000352b8755>] __thermal_cooling_device_register+0x332/0xa60 drivers/thermal/thermal_core.c:927
    [<00000000fb9f331b>] devm_thermal_of_cooling_device_register+0x6b/0xf0 drivers/thermal/thermal_core.c:1041
    [<000000009b8012d2>] max6650_probe.cold+0x557/0x6aa drivers/hwmon/max6650.c:211
    [<00000000da0b7e04>] i2c_device_probe+0x472/0xac0 drivers/i2c/i2c-core-base.c:561

If device_register() fails, thermal_cooling_device_destroy_sysfs() need be called
to free the memory allocated in thermal_cooling_device_setup_sysfs().

Fixes: 8ea229511e ("thermal: Add cooling device's statistics in sysfs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220511020605.3096734-1-yangyingliang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:53 +02:00
Bjorn Andersson 30988d3b31 dt-bindings: thermal: tsens: Add sc8280xp compatible
The Qualcomm SC8280XP platform has three instances of the tsens block,
add a compatible for these instances.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220503153436.960184-1-bjorn.andersson@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Bjorn Andersson b54d4dafc9 dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
Add compatible for the LMh blocks found in the Qualcomm sc8180x
platform.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220502164504.3972938-2-bjorn.andersson@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Bjorn Andersson ef6673e836 thermal/drivers/qcom/lmh: Add sc8180x compatible
The LMh instances in the Qualcomm SC8180X platform looks to behave
similar to those in SM8150, add additional compatibles to allow
platform specific behavior to be added if needed.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220502164504.3972938-1-bjorn.andersson@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Biju Das 2d37f5c90b thermal/drivers/rz2gl: Fix OTP Calibration Register values
As per the latest RZ/G2L Hardware User's Manual (Rev.1.10 Apr, 2022),
the bit 31 of TSU OTP Calibration Register(OTPTSUTRIM) indicates
whether bit [11:0] of OTPTSUTRIM is valid or invalid.

This patch updates the code to reflect this change.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220428093346.7552-1-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Biju Das e126ce0bcc dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
Document RZ/G2UL TSU bindings. The TSU block on RZ/G2UL is identical to one
found on RZ/G2L SoC. No driver changes are required as generic compatible
string "renesas,rzg2l-tsu" will be used as a fallback.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220501081930.23743-1-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Corentin Labbe 44b965d8c4 thermal: thermal_of: fix typo on __thermal_bind_params
Add a missing s to __thermal_bind_param kernel doc comment.
This fixes the following sparse warnings:
drivers/thermal/thermal_of.c:50: warning: expecting prototype for struct __thermal_bind_param. Prototype was for struct __thermal_bind_params instead

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Link: https://lore.kernel.org/r/20220426064113.3787826-1-clabbe@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Jiapeng Chong cb4487d2b4 tools/thermal: remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/thermal/thermometer/thermometer.c:147:3-4: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220427030619.81556-2-jiapeng.chong@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Jiapeng Chong f21b57eb12 tools/lib/thermal: remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/lib/thermal/commands.c:215:2-3: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220427030619.81556-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Zheng Yongjun e20d136ec7 thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
platform_get_resource() may return NULL, add proper check to
avoid potential NULL dereferencing.

Fixes: 250e211057 ("thermal: broadcom: Add Stingray thermal driver")
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20220425092929.90412-1-zhengyongjun3@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Daniel Lezcano 077df623c8 tools/thermal: Add thermal daemon skeleton
This change provides a simple daemon skeleton. It is an example of how
to use the thermal library which wraps all the complex code related to
the netlink and transforms it into a callback oriented code.

The goal of this skeleton is to give a base brick for anyone
interested in writing its own thermal engine or as an example to rely
on to write its own thermal monitoring implementation.

In the future, it will evolve with more features and hopefully more
logic.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220420160933.347088-5-daniel.lezcano@linaro.org
2022-05-19 12:11:52 +02:00
Daniel Lezcano 110acbc6a4 tools/thermal: Add a temperature capture tool
The 'thermometer' tool allows to capture the temperature of a set of
thermal zones defined in a configuration file at a specified rate.

It is designed to have the lowest possible overhead. It will write the
captured temperature per thermal zone per file so making easier to
write a gnuplot script.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220420160933.347088-4-daniel.lezcano@linaro.org
2022-05-19 12:11:52 +02:00