Commit Graph

948892 Commits (94dea151bf3651c01acb12a38ca75ba9d26ea4da)

Author SHA1 Message Date
Paul Cercueil 48f5dd56cf MIPS: ingenic: Hardcode mem size for qi,lb60 board
Old Device Tree for the qi,lb60 (aka. Ben Nanonote) did not have a
'memory' node. The kernel would then read the memory controller
registers to know how much RAM was available.

Since every other supported board has had a 'memory' node from the
beginning, we can just hardcode a RAM size of 32 MiB when running with
an old Device Tree without the 'memory' node.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:48:57 +02:00
Paul Cercueil 714b649dc7 MIPS: DTS: ingenic/qi,lb60: Add model and memory node
Add a memory node, which was missing until now, and use the retail name
"Ben Nanonote" as the model, as it is way more known under that name
than under the name "LB60".

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:48:36 +02:00
Paul Cercueil 199c5f080e MIPS: ingenic: Use fw_passed_dtb even if CONFIG_BUILTIN_DTB
The fw_passed_dtb is now properly initialized even when
CONFIG_BUILTIN_DTB is used, so there's no need to handle it in any
particular way here.

Note that the behaviour is slightly different, as the previous code used
the built-in Device Tree unconditionally, while now the built-in Device
Tree is only used when the bootloader did not provide one.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:48:19 +02:00
Paul Cercueil 37e5c69ffd MIPS: head.S: Init fw_passed_dtb to builtin DTB
Init the 'fw_passed_dtb' pointer to the buit-in Device Tree blob when it
has been compiled in with CONFIG_BUILTIN_DTB.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:48:02 +02:00
Nicolas Saenz Julienne c3028b951e of: address: Fix parser address/size cells initialization
bus->count_cells() parses cells starting from the node's parent. This is
not good enough for parser_init() which is generally parsing a bus node.

Revert to previous behavior using of_bus_n_*_cells().

Fixes: 2f96593ecc ("of_address: Add bus type match for pci ranges parser")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:43:07 +02:00
Jiaxun Yang 0fc0ead348 of_address: Guard of_bus_pci_get_flags with CONFIG_PCI
After 2f96593ecc ("of_address: Add bus type match for pci ranges parser"),
the last user of of_bus_pci_get_flags when CONFIG_PCI is disabled had gone.

This caused unused function warning when compiling without CONFIG_PCI.
Fix by guarding it with CONFIG_PCI.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 2f96593ecc ("of_address: Add bus type match for pci ranges parser")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-31 17:30:52 +02:00
Jerry Crunchtime 1acf8f90ea libbpf: Fix register in PT_REGS MIPS macros
The o32, n32 and n64 calling conventions require the return
value to be stored in $v0 which maps to $2 register, i.e.,
the register 2.

Fixes: c1932cd ("bpf: Add MIPS support to samples/bpf.")
Signed-off-by: Jerry Crunchtime <jerry.c.t@web.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/43707d31-0210-e8f0-9226-1af140907641@web.de
2020-07-31 17:20:49 +02:00
Jens Axboe d1719f70d0 io_uring: don't touch 'ctx' after installing file descriptor
As soon as we install the file descriptor, we have to assume that it
can get arbitrarily closed. We currently account memory (and note that
we did) after installing the ring fd, which means that it could be a
potential use-after-free condition if the fd is closed right after
being installed, but before we fiddle with the ctx.

In fact, syzbot reported this exact scenario:

BUG: KASAN: use-after-free in io_account_mem fs/io_uring.c:7397 [inline]
BUG: KASAN: use-after-free in io_uring_create fs/io_uring.c:8369 [inline]
BUG: KASAN: use-after-free in io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
Read of size 1 at addr ffff888087a41044 by task syz-executor.5/18145

CPU: 0 PID: 18145 Comm: syz-executor.5 Not tainted 5.8.0-rc7-next-20200729-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 io_account_mem fs/io_uring.c:7397 [inline]
 io_uring_create fs/io_uring.c:8369 [inline]
 io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c429
Code: 8d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8f121d0c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
RAX: ffffffffffffffda RBX: 0000000000008540 RCX: 000000000045c429
RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000196
RBP: 000000000078bf38 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000078bf0c
R13: 00007fff86698cff R14: 00007f8f121d19c0 R15: 000000000078bf0c

Move the accounting of the ring used locked memory before we get and
install the ring file descriptor.

Cc: stable@vger.kernel.org
Reported-by: syzbot+9d46305e76057f30c74e@syzkaller.appspotmail.com
Fixes: 309758254e ("io_uring: report pinned memory usage")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-31 08:25:06 -06:00
Wolfram Sang c7c9e914f9 i2c: rcar: avoid race when unregistering slave
Due to the lockless design of the driver, it is theoretically possible
to access a NULL pointer, if a slave interrupt was running while we were
unregistering the slave. To make this rock solid, disable the interrupt
for a short time while we are clearing the interrupt_enable register.
This patch is purely based on code inspection. The OOPS is super-hard to
trigger because clearing SAR (the address) makes interrupts even more
unlikely to happen as well. While here, reinit SCR to SDBS because this
bit should always be set according to documentation. There is no effect,
though, because the interface is disabled.

Fixes: 7b814d852a ("i2c: rcar: avoid race when unregistering slave client")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-31 15:56:32 +02:00
Wolfram Sang 073d398dc4 Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc7' into i2c/for-5.9
2020-07-31 15:54:27 +02:00
Bernard 1041dee217 drm/msm: use kthread_create_worker instead of kthread_run
Use kthread_create_worker to simplify the code and optimise
the manager struct: msm_drm_thread. With this change, we
could remove struct element (struct task_struct *thread &
struct kthread_worker worker), instead, use one point (struct
kthread_worker *worker).

Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Konrad Dybcio 974b7115a7 drm/msm/mdp5: Add MDP5 configuration for SDM636/660
This commit adds support for the MDP5 IP on Snapdragon
636/660.

Signed-off-by: Konrad Dybcio <konradybcio@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Konrad Dybcio 033f47f7f1 drm/msm/dsi: Add DSI configuration for SDM660
This also applies to sdm630/636 and their SDA
counterparts.

Signed-off-by: Konrad Dybcio <konradybcio@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Konrad Dybcio 75c1437ceb drm/msm/mdp5: Add MDP5 configuration for SDM630
This commit adds support for the MDP5 IP on Snapdragon
630. The configuration is different from SDM660's, as
the latter one has two DSI outputs.

Signed-off-by: Konrad Dybcio <konradybcio@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Konrad Dybcio 694dd304cc drm/msm/dsi: Add phy configuration for SDM630/636/660
These SoCs make use of the 14nm phy, but at different
addresses than other 14nm units.

Signed-off-by: Konrad Dybcio <konradybcio@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Jonathan Marek 66ffb9150b drm/msm/a6xx: add A640/A650 hwcg
Initialize hardware clock-gating registers on A640 and A650 GPUs.

At least for A650, this solves some performance issues.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Jonathan Marek b1c53a2a2d drm/msm/a6xx: hwcg tables in gpulist
This will allow supporting different hwcg tables for a6xx.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Jonathan Marek af776a3e1c drm/msm/dpu: add SM8250 to hw catalog
This brings up basic video mode functionality for SM8250 DPU. Command mode
and dual mixer/intf configurations are not working, future patches will
address this. Scaler functionality and multiple planes is also untested.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Jonathan Marek 386fced3f7 drm/msm/dpu: add SM8150 to hw catalog
This brings up basic video mode functionality for SM8150 DPU. Command mode
and dual mixer/intf configurations are not working, future patches will
address this. Scaler functionality and multiple planes is also untested.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
[fixup max_linewidth warning]
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:17 -07:00
Jonathan Marek fc3a69ec68 drm/msm/dpu: intf timing path for displayport
Calculate the correct timings for displayport, from downstream driver.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek 4376f2e508 drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3
This fixes flushing of INTF_2 and INTF_3 on SM8150 and SM8250 hardware.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek cace3ac4bc drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845
The INTF_INPUT_CTRL feature is not available on sdm845, so don't set it.

This also adds separate feature bits for INTF (based on downstream) instead
of using CTL feature bit for it, and removes the unnecessary NULL check in
the added bind_pingpong_blk function.

Fixes: 73bfb790ac ("msm:disp:dpu1: setup display datapath for SC7180 target")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek 7e9d4cdd65 drm/msm/dpu: move some sspp caps to dpu_caps
This isn't something that ever changes between planes, so move it to
dpu_caps struct. Making this change will allow more re-use in the
"SSPP sub blocks config" part of the catalog, in particular when adding
support for SM8150 and SM8250 which have different max_linewidth.

This also sets max_hdeci_exp/max_vdeci_exp to 0 for sc7180, as decimation
is not supported on the newest DPU versions. (note that decimation is not
implemented, so this changes nothing)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek 544d8b9615 drm/msm/dpu: update UBWC config for sm8150 and sm8250
Update the UBWC registers to the right values for sm8150 and sm8250.

This removes broken dpu_hw_reset_ubwc, which doesn't work because the
"force blk offset to zero to access beginning of register region" hack is
copied from downstream, where mapped region starts 0x1000 below what is
used in the upstream driver.

Also simplifies the overly complicated change that was introduced in
e4f9bbe9f8 to work around dpu_hw_reset_ubwc being broken.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek de321dcc23 drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250
All DPU versions starting from 4.0 use the sdm845 version, so check for
that instead of checking each version individually. This chooses the right
function for sm8150 and sm8250.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek d0bac4e9cd drm/msm/a6xx: set ubwc config for A640 and A650
This is required for A640 and A650 to be able to share UBWC-compressed
images with other HW such as display, which expect this configuration.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Rob Clark b5e02e117b drm/msm/adreno: un-open-code some packets
Small cleanup, lets not open-code bits/bitfields that are properly
defined in the rnndb xml (and therefore have builders in the generated
headers)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Rob Clark c28c82e9db drm/msm: sync generated headers
We haven't sync'd for a while.. pull in updates to get definitions for
some fields in pkt7 payloads.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek 51dd427192 drm/msm/a6xx: add build_bw_table for A640/A650
This sets up bw tables for A640/A650 similar to A618/A630, 0 DDR bandwidth
vote, and the CNOC vote. A640 has the same CNOC addresses as A630 and was
working, but this is required for A650 to work.

Eventually the bw table should be filled by querying the interconnect
driver for each BW in the dts, but use these dummy tables for now.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Jonathan Marek 142639a52a drm/msm/a6xx: fix crashstate capture for A650
A650 has a separate RSCC region, so dump RSCC registers separately, reading
them from the RSCC base. Without this change a GPU hang will cause a system
reset if CONFIG_DEV_COREDUMP is enabled.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:16 -07:00
Eric Anholt 62a35e81c2 drm/msm: Quiet error during failure in optional resource mappings.
We don't expect to find vbif_nrt or regdma on sdm845, but were clogging
up dmesg with errors about it.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Eric Anholt ecf9cd4899 drm/msm: Garbage collect unused resource _len fields.
Nothing was using the lengths of these ioremaps.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Rob Clark b8afe9f87c drm/msm/dpu: fix/enable 6bpc dither with split-lm
If split-lm is used (for ex, on sdm845), we can have multiple ping-
pongs, but only a single phys encoder.  We need to configure dithering
on each of them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Kalyan Thota <kalyan_t@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Akhil P Oommen 57c0bd517c drm: msm: a6xx: fix gpu failure after system resume
On targets where GMU is available, GMU takes over the ownership of GX GDSC
during its initialization. So, move the refcount-get on GX PD before we
initialize the GMU. This ensures that nobody can collapse the GX GDSC
once GMU owns the GX GDSC. This patch fixes some GMU OOB errors seen
during GPU wake up during a system resume.

Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Rajendra Nayak 32d3e0fecc drm/msm: dsi: Use OPP API to set clk/perf state
On SDM845 and SC7180 DSI needs to express a performance state
requirement on a power domain depending on the clock rates.
Use OPP table from DT to register with OPP framework and use
dev_pm_opp_set_rate() to set the clk/perf state.

dev_pm_opp_set_rate() is designed to be equivalent to clk_set_rate()
for devices without an OPP table, hence the change works fine
on devices/platforms which only need to set a clock rate.

Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Rajendra Nayak b0530eb119 drm/msm/dpu: Use OPP API to set clk/perf state
On some qualcomm platforms DPU needs to express a performance state
requirement on a power domain depending on the clock rates.
Use OPP table from DT to register with OPP framework and use
dev_pm_opp_set_rate() to set the clk/perf state.

Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Rob Clark 5e16372b59 drm/msm: ratelimit crtc event overflow error
This can happen a lot when things go pear shaped.  Lets not flood dmesg
when this happens.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Sharat Masetty 1f60d11423 drm: msm: a6xx: send opp instead of a frequency
This patch changes the plumbing to send the devfreq recommended opp rather
than the frequency. Also consolidate and rearrange the code in a6xx to set
the GPU frequency and the icc vote in preparation for the upcoming
changes for GPU->DDR scaling votes.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Sharat Masetty 369c4ef433 dt-bindings: drm/msm/gpu: Document gpu opp table
Update documentation to list the gpu opp table bindings including the
newly added "opp-peak-kBps" needed for GPU-DDR bandwidth scaling.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Akhil P Oommen 3cbdc8d8b7 drm/msm: Fix a null pointer access in msm_gem_shrinker_count()
Adding an msm_gem_object object to the inactive_list before completing
its initialization is a bad idea because shrinker may pick it up from the
inactive_list. Fix this by making sure that the initialization is complete
before moving the msm_obj object to the inactive list.

This patch fixes the below error:
[10027.553044] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000068
[10027.573305] Mem abort info:
[10027.590160]   ESR = 0x96000006
[10027.597905]   EC = 0x25: DABT (current EL), IL = 32 bits
[10027.614430]   SET = 0, FnV = 0
[10027.624427]   EA = 0, S1PTW = 0
[10027.632722] Data abort info:
[10027.638039]   ISV = 0, ISS = 0x00000006
[10027.647459]   CM = 0, WnR = 0
[10027.654345] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001e3a6a000
[10027.672681] [0000000000000068] pgd=0000000198c31003, pud=0000000198c31003, pmd=0000000000000000
[10027.693900] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[10027.738261] CPU: 3 PID: 214 Comm: kswapd0 Tainted: G S                5.4.40 #1
[10027.745766] Hardware name: Qualcomm Technologies, Inc. SC7180 IDP (DT)
[10027.752472] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[10027.757409] pc : mutex_is_locked+0x14/0x2c
[10027.761626] lr : msm_gem_shrinker_count+0x70/0xec
[10027.766454] sp : ffffffc011323ad0
[10027.769867] x29: ffffffc011323ad0 x28: ffffffe677e4b878
[10027.775324] x27: 0000000000000cc0 x26: 0000000000000000
[10027.780783] x25: ffffff817114a708 x24: 0000000000000008
[10027.786242] x23: ffffff8023ab7170 x22: 0000000000000001
[10027.791701] x21: ffffff817114a080 x20: 0000000000000119
[10027.797160] x19: 0000000000000068 x18: 00000000000003bc
[10027.802621] x17: 0000000004a34210 x16: 00000000000000c0
[10027.808083] x15: 0000000000000000 x14: 0000000000000000
[10027.813542] x13: ffffffe677e0a3c0 x12: 0000000000000000
[10027.819000] x11: 0000000000000000 x10: ffffff8174b94340
[10027.824461] x9 : 0000000000000000 x8 : 0000000000000000
[10027.829919] x7 : 00000000000001fc x6 : ffffffc011323c88
[10027.835373] x5 : 0000000000000001 x4 : ffffffc011323d80
[10027.840832] x3 : ffffffff0477b348 x2 : 0000000000000000
[10027.846290] x1 : ffffffc011323b68 x0 : 0000000000000068
[10027.851748] Call trace:
[10027.854264]  mutex_is_locked+0x14/0x2c
[10027.858121]  msm_gem_shrinker_count+0x70/0xec
[10027.862603]  shrink_slab+0xc0/0x4b4
[10027.866187]  shrink_node+0x4a8/0x818
[10027.869860]  kswapd+0x624/0x890
[10027.873097]  kthread+0x11c/0x12c
[10027.876424]  ret_from_fork+0x10/0x18
[10027.880102] Code: f9000bf3 910003fd aa0003f3 d503201f (f9400268)
[10027.886362] ---[ end trace df5849a1a3543251 ]---
[10027.891518] Kernel panic - not syncing: Fatal exception

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Kalyan Thota 3c128638a0 drm/msm/dpu: add support for dither block in display
This change enables dither block for primary interface
in display.

Enabled for 6bpc in the current version.

Changes in v1:
 - Remove redundant error checks (Rob).

Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:46:15 -07:00
Rob Clark 520c651f3b drm/msm/adreno: fix gpu probe if no interconnect-names
If there is no interconnect-names, but there is an interconnects
property, then of_icc_get(dev, "gfx-mem"); would return an error
rather than NULL.

Also, if there is no interconnect-names property, there will never
be a ocmem path.  But of_icc_get(dev, "ocmem") would return -EINVAL
instead of -ENODATA.  Just don't bother trying in this case.

v2: explicity check for interconnect-names property

Fixes: 08af4769c7 ("drm/msm: handle for EPROBE_DEFER for of_icc_get")
Fixes: 00bb9243d3 ("drm/msm/gpu: add support for ocmem interconnect path")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 06:45:56 -07:00
Herbert Xu 075f77324f Bluetooth: Remove CRYPTO_ALG_INTERNAL flag
The flag CRYPTO_ALG_INTERNAL is not meant to be used outside of
the Crypto API.  It isn't needed here anyway.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2020-07-31 16:42:04 +03:00
Marcel Holtmann 79bf118957 Bluetooth: Increment management interface revision
Increment the mgmt revision due to the recently added new commands.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2020-07-31 16:41:09 +03:00
Vaibhav Jain af0870c4e7 powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
We add support for reporting 'fuel-gauge' NVDIMM metric via
PAPR_PDSM_HEALTH pdsm payload. 'fuel-gauge' metric indicates the usage
life remaining of a papr-scm compatible NVDIMM. PHYP exposes this
metric via the H_SCM_PERFORMANCE_STATS.

The metric value is returned from the pdsm by extending the return
payload 'struct nd_papr_pdsm_health' without breaking the ABI. A new
field 'dimm_fuel_gauge' to hold the metric value is introduced at the
end of the payload struct and its presence is indicated by by
extension flag PDSM_DIMM_HEALTH_RUN_GAUGE_VALID.

The patch introduces a new function papr_pdsm_fuel_gauge() that is
called from papr_pdsm_health(). If fetching NVDIMM performance stats
is supported then 'papr_pdsm_fuel_gauge()' allocated an output buffer
large enough to hold the performance stat and passes it to
drc_pmem_query_stats() that issues the HCALL to PHYP. The return value
of the stat is then populated in the 'struct
nd_papr_pdsm_health.dimm_fuel_gauge' field with extension flag
'PDSM_DIMM_HEALTH_RUN_GAUGE_VALID' set in 'struct
nd_papr_pdsm_health.extension_flags'

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200731064153.182203-3-vaibhav@linux.ibm.com
2020-07-31 22:55:28 +10:00
Vaibhav Jain 2d02bf835e powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
Update papr_scm.c to query dimm performance statistics from PHYP via
H_SCM_PERFORMANCE_STATS hcall and export them to user-space as PAPR
specific NVDIMM attribute 'perf_stats' in sysfs. The patch also
provide a sysfs ABI documentation for the stats being reported and
their meanings.

During NVDIMM probe time in papr_scm_nvdimm_init() a special variant
of H_SCM_PERFORMANCE_STATS hcall is issued to check if collection of
performance statistics is supported or not. If successful then a PHYP
returns a maximum possible buffer length needed to read all
performance stats. This returned value is stored in a per-nvdimm
attribute 'stat_buffer_len'.

The layout of request buffer for reading NVDIMM performance stats from
PHYP is defined in 'struct papr_scm_perf_stats' and 'struct
papr_scm_perf_stat'. These structs are used in newly introduced
drc_pmem_query_stats() that issues the H_SCM_PERFORMANCE_STATS hcall.

The sysfs access function perf_stats_show() uses value
'stat_buffer_len' to allocate a buffer large enough to hold all
possible NVDIMM performance stats and passes it to
drc_pmem_query_stats() to populate. Finally statistics reported in the
buffer are formatted into the sysfs access function output buffer.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200731064153.182203-2-vaibhav@linux.ibm.com
2020-07-31 22:55:27 +10:00
Suren Baghdasaryan 3e338d3c95 staging: android: ashmem: Fix lockdep warning for write operation
syzbot report [1] describes a deadlock when write operation against an
ashmem fd executed at the time when ashmem is shrinking its cache results
in the following lock sequence:

Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(fs_reclaim);
                                lock(&sb->s_type->i_mutex_key#13);
                                lock(fs_reclaim);
   lock(&sb->s_type->i_mutex_key#13);

kswapd takes fs_reclaim and then inode_lock while generic_perform_write
takes inode_lock and then fs_reclaim. However ashmem does not support
writing into backing shmem with a write syscall. The only way to change
its content is to mmap it and operate on mapped memory. Therefore the race
that lockdep is warning about is not valid. Resolve this by introducing a
separate lockdep class for the backing shmem inodes.

[1]: https://lkml.kernel.org/lkml/0000000000000b5f9d059aa2037f@google.com/

Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20200730192632.3088194-1-surenb@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31 14:41:43 +02:00
Christian Gromm 97a6f772f3 drivers: most: add USB adapter driver
This patch adds the USB driver source file most_usb.c and
modifies the Makefile and Kconfig accordingly.

Signed-off-by: Christian Gromm <christian.gromm@microchip.com>
Link: https://lore.kernel.org/r/1596198058-26541-1-git-send-email-christian.gromm@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31 14:38:12 +02:00
Crag Wang 46cbd0b057 power: supply: wilco_ec: Add long life charging mode
This is a long life mode set in the factory for extended warranty
battery, the power charging rate is customized so that battery at
work last longer.

Presently switching to a different battery charging mode is through
EC PID 0x0710 to configure the battery firmware, this operation will
be blocked by EC with failure code 0x01 when PLL mode is already
in use.

Signed-off-by: Crag Wang <crag.wang@dell.com>
Reviewed-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-07-31 14:33:56 +02:00
Ian Rogers 7c43b0c1d4 perf bench: Add benchmark of find_next_bit
for_each_set_bit, or similar functions like for_each_cpu, may be hot
within the kernel. If many bits were set then one could imagine on Intel
a "bt" instruction with every bit may be faster than the function call
and word length find_next_bit logic. Add a benchmark to measure this.

This benchmark on AMD rome and Intel skylakex shows "bt" is not a good
option except for very small bitmaps.

Committer testing:

  # perf bench
  Usage:
  	perf bench [<common options>] <collection> <benchmark> [<options>]

          # List of all available benchmark collections:

           sched: Scheduler and IPC benchmarks
         syscall: System call benchmarks
             mem: Memory access benchmarks
            numa: NUMA scheduling and MM benchmarks
           futex: Futex stressing benchmarks
           epoll: Epoll stressing benchmarks
       internals: Perf-internals benchmarks
             all: All benchmarks

  # perf bench mem

          # List of available benchmarks for collection 'mem':

          memcpy: Benchmark for memcpy() functions
          memset: Benchmark for memset() functions
        find_bit: Benchmark for find_bit() functions
             all: Run all memory access benchmarks

  # perf bench mem find_bit
  # Running 'mem/find_bit' benchmark:
  100000 operations 1 bits set of 1 bits
    Average for_each_set_bit took: 730.200 usec (+- 6.468 usec)
    Average test_bit loop took:    366.200 usec (+- 4.652 usec)
  100000 operations 1 bits set of 2 bits
    Average for_each_set_bit took: 781.000 usec (+- 24.247 usec)
    Average test_bit loop took:    550.200 usec (+- 4.152 usec)
  100000 operations 2 bits set of 2 bits
    Average for_each_set_bit took: 1113.400 usec (+- 112.340 usec)
    Average test_bit loop took:    1098.500 usec (+- 182.834 usec)
  100000 operations 1 bits set of 4 bits
    Average for_each_set_bit took: 843.800 usec (+- 8.772 usec)
    Average test_bit loop took:    948.800 usec (+- 10.278 usec)
  100000 operations 2 bits set of 4 bits
    Average for_each_set_bit took: 1185.800 usec (+- 114.345 usec)
    Average test_bit loop took:    1473.200 usec (+- 175.498 usec)
  100000 operations 4 bits set of 4 bits
    Average for_each_set_bit took: 1769.667 usec (+- 233.177 usec)
    Average test_bit loop took:    1864.933 usec (+- 187.470 usec)
  100000 operations 1 bits set of 8 bits
    Average for_each_set_bit took: 898.000 usec (+- 21.755 usec)
    Average test_bit loop took:    1768.400 usec (+- 23.672 usec)
  100000 operations 2 bits set of 8 bits
    Average for_each_set_bit took: 1244.900 usec (+- 116.396 usec)
    Average test_bit loop took:    2201.800 usec (+- 145.398 usec)
  100000 operations 4 bits set of 8 bits
    Average for_each_set_bit took: 1822.533 usec (+- 231.554 usec)
    Average test_bit loop took:    2569.467 usec (+- 168.453 usec)
  100000 operations 8 bits set of 8 bits
    Average for_each_set_bit took: 2845.100 usec (+- 441.365 usec)
    Average test_bit loop took:    3023.300 usec (+- 219.575 usec)
  100000 operations 1 bits set of 16 bits
    Average for_each_set_bit took: 923.400 usec (+- 17.560 usec)
    Average test_bit loop took:    3240.000 usec (+- 16.492 usec)
  100000 operations 2 bits set of 16 bits
    Average for_each_set_bit took: 1264.300 usec (+- 114.034 usec)
    Average test_bit loop took:    3714.400 usec (+- 158.898 usec)
  100000 operations 4 bits set of 16 bits
    Average for_each_set_bit took: 1817.867 usec (+- 222.199 usec)
    Average test_bit loop took:    4015.333 usec (+- 154.162 usec)
  100000 operations 8 bits set of 16 bits
    Average for_each_set_bit took: 2826.350 usec (+- 433.457 usec)
    Average test_bit loop took:    4460.350 usec (+- 210.762 usec)
  100000 operations 16 bits set of 16 bits
    Average for_each_set_bit took: 4615.600 usec (+- 809.350 usec)
    Average test_bit loop took:    5129.960 usec (+- 320.821 usec)
  100000 operations 1 bits set of 32 bits
    Average for_each_set_bit took: 904.400 usec (+- 14.250 usec)
    Average test_bit loop took:    6194.000 usec (+- 29.254 usec)
  100000 operations 2 bits set of 32 bits
    Average for_each_set_bit took: 1252.700 usec (+- 116.432 usec)
    Average test_bit loop took:    6652.400 usec (+- 154.352 usec)
  100000 operations 4 bits set of 32 bits
    Average for_each_set_bit took: 1824.200 usec (+- 229.133 usec)
    Average test_bit loop took:    6961.733 usec (+- 154.682 usec)
  100000 operations 8 bits set of 32 bits
    Average for_each_set_bit took: 2823.950 usec (+- 432.296 usec)
    Average test_bit loop took:    7351.900 usec (+- 193.626 usec)
  100000 operations 16 bits set of 32 bits
    Average for_each_set_bit took: 4552.560 usec (+- 785.141 usec)
    Average test_bit loop took:    7998.360 usec (+- 305.629 usec)
  100000 operations 32 bits set of 32 bits
    Average for_each_set_bit took: 7557.067 usec (+- 1407.702 usec)
    Average test_bit loop took:    9072.400 usec (+- 513.209 usec)
  100000 operations 1 bits set of 64 bits
    Average for_each_set_bit took: 896.800 usec (+- 14.389 usec)
    Average test_bit loop took:    11927.200 usec (+- 68.862 usec)
  100000 operations 2 bits set of 64 bits
    Average for_each_set_bit took: 1230.400 usec (+- 111.731 usec)
    Average test_bit loop took:    12478.600 usec (+- 189.382 usec)
  100000 operations 4 bits set of 64 bits
    Average for_each_set_bit took: 1844.733 usec (+- 244.826 usec)
    Average test_bit loop took:    12911.467 usec (+- 206.246 usec)
  100000 operations 8 bits set of 64 bits
    Average for_each_set_bit took: 2779.300 usec (+- 413.612 usec)
    Average test_bit loop took:    13372.650 usec (+- 239.623 usec)
  100000 operations 16 bits set of 64 bits
    Average for_each_set_bit took: 4423.920 usec (+- 748.240 usec)
    Average test_bit loop took:    13995.800 usec (+- 318.427 usec)
  100000 operations 32 bits set of 64 bits
    Average for_each_set_bit took: 7580.600 usec (+- 1462.407 usec)
    Average test_bit loop took:    15063.067 usec (+- 516.477 usec)
  100000 operations 64 bits set of 64 bits
    Average for_each_set_bit took: 13391.514 usec (+- 2765.371 usec)
    Average test_bit loop took:    16974.914 usec (+- 916.936 usec)
  100000 operations 1 bits set of 128 bits
    Average for_each_set_bit took: 1153.800 usec (+- 124.245 usec)
    Average test_bit loop took:    26959.000 usec (+- 714.047 usec)
  100000 operations 2 bits set of 128 bits
    Average for_each_set_bit took: 1445.200 usec (+- 113.587 usec)
    Average test_bit loop took:    25798.800 usec (+- 512.908 usec)
  100000 operations 4 bits set of 128 bits
    Average for_each_set_bit took: 1990.933 usec (+- 219.362 usec)
    Average test_bit loop took:    25589.400 usec (+- 348.288 usec)
  100000 operations 8 bits set of 128 bits
    Average for_each_set_bit took: 2963.000 usec (+- 419.487 usec)
    Average test_bit loop took:    25690.050 usec (+- 262.025 usec)
  100000 operations 16 bits set of 128 bits
    Average for_each_set_bit took: 4585.200 usec (+- 741.734 usec)
    Average test_bit loop took:    26125.040 usec (+- 274.127 usec)
  100000 operations 32 bits set of 128 bits
    Average for_each_set_bit took: 7626.200 usec (+- 1404.950 usec)
    Average test_bit loop took:    27038.867 usec (+- 442.554 usec)
  100000 operations 64 bits set of 128 bits
    Average for_each_set_bit took: 13343.371 usec (+- 2686.460 usec)
    Average test_bit loop took:    28936.543 usec (+- 883.257 usec)
  100000 operations 128 bits set of 128 bits
    Average for_each_set_bit took: 23442.950 usec (+- 4880.541 usec)
    Average test_bit loop took:    32484.125 usec (+- 1691.931 usec)
  100000 operations 1 bits set of 256 bits
    Average for_each_set_bit took: 1183.000 usec (+- 32.073 usec)
    Average test_bit loop took:    50114.600 usec (+- 198.880 usec)
  100000 operations 2 bits set of 256 bits
    Average for_each_set_bit took: 1550.000 usec (+- 124.550 usec)
    Average test_bit loop took:    50334.200 usec (+- 128.425 usec)
  100000 operations 4 bits set of 256 bits
    Average for_each_set_bit took: 2164.333 usec (+- 246.359 usec)
    Average test_bit loop took:    49959.867 usec (+- 188.035 usec)
  100000 operations 8 bits set of 256 bits
    Average for_each_set_bit took: 3211.200 usec (+- 454.829 usec)
    Average test_bit loop took:    50140.850 usec (+- 176.046 usec)
  100000 operations 16 bits set of 256 bits
    Average for_each_set_bit took: 5181.640 usec (+- 882.726 usec)
    Average test_bit loop took:    51003.160 usec (+- 419.601 usec)
  100000 operations 32 bits set of 256 bits
    Average for_each_set_bit took: 8369.333 usec (+- 1513.150 usec)
    Average test_bit loop took:    52096.700 usec (+- 573.022 usec)
  100000 operations 64 bits set of 256 bits
    Average for_each_set_bit took: 13866.857 usec (+- 2649.393 usec)
    Average test_bit loop took:    53989.600 usec (+- 938.808 usec)
  100000 operations 128 bits set of 256 bits
    Average for_each_set_bit took: 23588.350 usec (+- 4724.222 usec)
    Average test_bit loop took:    57300.625 usec (+- 1625.962 usec)
  100000 operations 256 bits set of 256 bits
    Average for_each_set_bit took: 42752.200 usec (+- 9202.084 usec)
    Average test_bit loop took:    64426.933 usec (+- 3402.326 usec)
  100000 operations 1 bits set of 512 bits
    Average for_each_set_bit took: 1632.000 usec (+- 229.954 usec)
    Average test_bit loop took:    98090.000 usec (+- 1120.435 usec)
  100000 operations 2 bits set of 512 bits
    Average for_each_set_bit took: 1937.700 usec (+- 148.902 usec)
    Average test_bit loop took:    100364.100 usec (+- 1433.219 usec)
  100000 operations 4 bits set of 512 bits
    Average for_each_set_bit took: 2528.000 usec (+- 243.654 usec)
    Average test_bit loop took:    99932.067 usec (+- 955.868 usec)
  100000 operations 8 bits set of 512 bits
    Average for_each_set_bit took: 3734.100 usec (+- 512.359 usec)
    Average test_bit loop took:    98944.750 usec (+- 812.070 usec)
  100000 operations 16 bits set of 512 bits
    Average for_each_set_bit took: 5551.400 usec (+- 846.605 usec)
    Average test_bit loop took:    98691.600 usec (+- 654.753 usec)
  100000 operations 32 bits set of 512 bits
    Average for_each_set_bit took: 8594.500 usec (+- 1446.072 usec)
    Average test_bit loop took:    99176.867 usec (+- 579.990 usec)
  100000 operations 64 bits set of 512 bits
    Average for_each_set_bit took: 13840.743 usec (+- 2527.055 usec)
    Average test_bit loop took:    100758.743 usec (+- 833.865 usec)
  100000 operations 128 bits set of 512 bits
    Average for_each_set_bit took: 23185.925 usec (+- 4532.910 usec)
    Average test_bit loop took:    103786.700 usec (+- 1475.276 usec)
  100000 operations 256 bits set of 512 bits
    Average for_each_set_bit took: 40322.400 usec (+- 8341.802 usec)
    Average test_bit loop took:    109433.378 usec (+- 2742.615 usec)
  100000 operations 512 bits set of 512 bits
    Average for_each_set_bit took: 71804.540 usec (+- 15436.546 usec)
    Average test_bit loop took:    120255.440 usec (+- 5252.777 usec)
  100000 operations 1 bits set of 1024 bits
    Average for_each_set_bit took: 1859.600 usec (+- 27.969 usec)
    Average test_bit loop took:    187676.000 usec (+- 1337.770 usec)
  100000 operations 2 bits set of 1024 bits
    Average for_each_set_bit took: 2273.600 usec (+- 139.420 usec)
    Average test_bit loop took:    188176.000 usec (+- 684.357 usec)
  100000 operations 4 bits set of 1024 bits
    Average for_each_set_bit took: 2940.400 usec (+- 268.213 usec)
    Average test_bit loop took:    189172.600 usec (+- 593.295 usec)
  100000 operations 8 bits set of 1024 bits
    Average for_each_set_bit took: 4224.200 usec (+- 547.933 usec)
    Average test_bit loop took:    190257.250 usec (+- 621.021 usec)
  100000 operations 16 bits set of 1024 bits
    Average for_each_set_bit took: 6090.560 usec (+- 877.975 usec)
    Average test_bit loop took:    190143.880 usec (+- 503.753 usec)
  100000 operations 32 bits set of 1024 bits
    Average for_each_set_bit took: 9178.800 usec (+- 1475.136 usec)
    Average test_bit loop took:    190757.100 usec (+- 494.757 usec)
  100000 operations 64 bits set of 1024 bits
    Average for_each_set_bit took: 14441.457 usec (+- 2545.497 usec)
    Average test_bit loop took:    192299.486 usec (+- 795.251 usec)
  100000 operations 128 bits set of 1024 bits
    Average for_each_set_bit took: 23623.825 usec (+- 4481.182 usec)
    Average test_bit loop took:    194885.550 usec (+- 1300.817 usec)
  100000 operations 256 bits set of 1024 bits
    Average for_each_set_bit took: 40194.956 usec (+- 8109.056 usec)
    Average test_bit loop took:    200259.311 usec (+- 2566.085 usec)
  100000 operations 512 bits set of 1024 bits
    Average for_each_set_bit took: 70983.560 usec (+- 15074.982 usec)
    Average test_bit loop took:    210527.460 usec (+- 4968.980 usec)
  100000 operations 1024 bits set of 1024 bits
    Average for_each_set_bit took: 136530.345 usec (+- 31584.400 usec)
    Average test_bit loop took:    233329.691 usec (+- 10814.036 usec)
  100000 operations 1 bits set of 2048 bits
    Average for_each_set_bit took: 3077.600 usec (+- 76.376 usec)
    Average test_bit loop took:    402154.400 usec (+- 518.571 usec)
  100000 operations 2 bits set of 2048 bits
    Average for_each_set_bit took: 3508.600 usec (+- 148.350 usec)
    Average test_bit loop took:    403814.500 usec (+- 1133.027 usec)
  100000 operations 4 bits set of 2048 bits
    Average for_each_set_bit took: 4219.333 usec (+- 285.844 usec)
    Average test_bit loop took:    404312.533 usec (+- 985.751 usec)
  100000 operations 8 bits set of 2048 bits
    Average for_each_set_bit took: 5670.550 usec (+- 615.238 usec)
    Average test_bit loop took:    405321.800 usec (+- 1038.487 usec)
  100000 operations 16 bits set of 2048 bits
    Average for_each_set_bit took: 7785.080 usec (+- 992.522 usec)
    Average test_bit loop took:    406746.160 usec (+- 1015.478 usec)
  100000 operations 32 bits set of 2048 bits
    Average for_each_set_bit took: 11163.800 usec (+- 1627.320 usec)
    Average test_bit loop took:    406124.267 usec (+- 898.785 usec)
  100000 operations 64 bits set of 2048 bits
    Average for_each_set_bit took: 16964.629 usec (+- 2806.130 usec)
    Average test_bit loop took:    406618.514 usec (+- 798.356 usec)
  100000 operations 128 bits set of 2048 bits
    Average for_each_set_bit took: 27219.625 usec (+- 4988.458 usec)
    Average test_bit loop took:    410149.325 usec (+- 1705.641 usec)
  100000 operations 256 bits set of 2048 bits
    Average for_each_set_bit took: 45138.578 usec (+- 8831.021 usec)
    Average test_bit loop took:    415462.467 usec (+- 2725.418 usec)
  100000 operations 512 bits set of 2048 bits
    Average for_each_set_bit took: 77450.540 usec (+- 15962.238 usec)
    Average test_bit loop took:    426089.180 usec (+- 5171.788 usec)
  100000 operations 1024 bits set of 2048 bits
    Average for_each_set_bit took: 138023.636 usec (+- 29826.959 usec)
    Average test_bit loop took:    446346.636 usec (+- 9904.417 usec)
  100000 operations 2048 bits set of 2048 bits
    Average for_each_set_bit took: 251072.600 usec (+- 55947.692 usec)
    Average test_bit loop took:    484855.983 usec (+- 18970.431 usec)
  #

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200729220034.1337168-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-31 09:32:11 -03:00