Linux kernel source tree
 
 
 
 
 
 
Go to file
David Laight d10bb374c4 lib: mul_u64_u64_div_u64(): optimise the divide code
Replace the bit by bit algorithm with one that generates 16 bits per
iteration on 32bit architectures and 32 bits on 64bit ones.

On my zen 5 this reduces the time for the tests (using the generic code)
from ~3350ns to ~1000ns.

Running the 32bit algorithm on 64bit x86 takes ~1500ns.  It'll be slightly
slower on a real 32bit system, mostly due to register pressure.

The savings for 32bit x86 are much higher (tested in userspace).  The
worst case (lots of bits in the quotient) drops from ~900 clocks to ~130
(pretty much independant of the arguments).  Other 32bit architectures may
see better savings.

It is possibly to optimise for divisors that span less than
__LONG_WIDTH__/2 bits.  However I suspect they don't happen that often and
it doesn't remove any slow cpu divide instructions which dominate the
result.

Typical improvements for 64bit random divides:
               old     new
sandy bridge:  470     150
haswell:       400     144
piledriver:    960     467   I think rdpmc is very slow.
zen5:          244      80
(Timing is 'rdpmc; mul_div(); rdpmc' with the multiply depending on the
first rdpmc and the second rdpmc depending on the quotient.)

Object code (64bit x86 test program): old 0x173 new 0x141.

Link: https://lkml.kernel.org/r/20251105201035.64043-9-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20 14:03:42 -08:00
Documentation dynamic_debug: add support for print stack 2025-11-12 10:00:16 -08:00
LICENSES LICENSES: Replace the obsolete address of the FSF in the GFDL-1.2 2025-07-24 11:15:39 +02:00
arch lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86 2025-11-20 14:03:42 -08:00
block block-6.18-20251031 2025-10-31 12:57:19 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto This push contains the following changes: 2025-10-10 08:56:16 -07:00
drivers i2c-for-6.18-rc5 2025-11-09 09:29:44 -08:00
fs ocfs2: validate cl_bpc in allocator inodes to prevent divide-by-zero 2025-11-20 14:03:41 -08:00
include lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86 2025-11-20 14:03:42 -08:00
init init/main.c: wrap long kernel cmdline when printing to logs 2025-11-12 10:00:16 -08:00
io_uring io_uring: fix regbuf vector size truncation 2025-11-07 17:17:13 -07:00
ipc ipc: create_ipc_ns: drop mqueue mount on sysctl setup failure 2025-11-12 10:00:15 -08:00
kernel kernel/hung_task: unexport sysctl_hung_task_timeout_secs 2025-11-20 14:03:41 -08:00
lib lib: mul_u64_u64_div_u64(): optimise the divide code 2025-11-20 14:03:42 -08:00
mm slab fix for 6.18-rc5 2025-11-07 08:01:58 -08:00
net Including fixes from bluetooth and wireless. 2025-11-06 08:52:30 -08:00
rust uaccess: decouple INLINE_COPY_FROM_USER and CONFIG_RUST 2025-11-12 10:00:16 -08:00
samples samples: fix coding style issues in Kconfig 2025-11-11 16:48:28 -08:00
scripts checkpatch: add IDR to the deprecated list 2025-11-20 14:03:41 -08:00
security integrity-v6.18 2025-10-05 10:48:33 -07:00
sound ASoC: Fixes for v6.18 2025-10-30 13:08:08 +01:00
tools hung_task: panic when there are more than N hung tasks at the same time 2025-11-12 10:00:14 -08:00
usr gen_init_cpio: Ignore fsync() returning EINVAL on pipes 2025-10-07 09:53:05 -07:00
virt KVM x86 fixes for 6.18: 2025-10-18 10:25:43 +02:00
.clang-format memblock: drop for_each_free_mem_pfn_range_in_zone_from() 2025-09-14 08:49:03 +03:00
.clippy.toml rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration 2025-05-07 00:11:47 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore MAINTAINERS: remove Alyssa Rosenzweig 2025-09-18 21:17:31 +02:00
.gitattributes
.gitignore .gitignore: ignore compile_commands.json globally 2025-08-12 15:53:55 -07:00
.mailmap mailmap: add entry for Hao Ge 2025-11-12 10:00:17 -08:00
.pylintrc tools: docs: parse-headers.py: move it from sphinx dir 2025-08-29 15:54:42 -06:00
.rustfmt.toml
COPYING
CREDITS CREDITS: update Martin's information 2025-11-12 10:00:14 -08:00
Kbuild sched: Make migrate_{en,dis}able() inline 2025-09-25 09:57:16 +02:00
Kconfig io_uring: Rename KConfig to Kconfig 2025-02-19 14:53:27 -07:00
MAINTAINERS MAINTAINERS: apply name and email address changes for Martin 2025-11-12 10:00:14 -08:00
Makefile Linux 6.18-rc5 2025-11-09 15:10:19 -08:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.