Commit Graph

1185157 Commits (c38b8400aef99d63be2b1ff131bb993465dcafe1)

Author SHA1 Message Date
Ye Bin 835659598c ext4: fix use-after-free read in ext4_find_extent for bigalloc + inline
Syzbot found the following issue:
loop0: detected capacity change from 0 to 2048
EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none.
==================================================================
BUG: KASAN: use-after-free in ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline]
BUG: KASAN: use-after-free in ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931
Read of size 4 at addr ffff888073644750 by task syz-executor420/5067

CPU: 0 PID: 5067 Comm: syz-executor420 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:306
 print_report+0x107/0x1f0 mm/kasan/report.c:417
 kasan_report+0xcd/0x100 mm/kasan/report.c:517
 ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline]
 ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931
 ext4_clu_mapped+0x117/0x970 fs/ext4/extents.c:5809
 ext4_insert_delayed_block fs/ext4/inode.c:1696 [inline]
 ext4_da_map_blocks fs/ext4/inode.c:1806 [inline]
 ext4_da_get_block_prep+0x9e8/0x13c0 fs/ext4/inode.c:1870
 ext4_block_write_begin+0x6a8/0x2290 fs/ext4/inode.c:1098
 ext4_da_write_begin+0x539/0x760 fs/ext4/inode.c:3082
 generic_perform_write+0x2e4/0x5e0 mm/filemap.c:3772
 ext4_buffered_write_iter+0x122/0x3a0 fs/ext4/file.c:285
 ext4_file_write_iter+0x1d0/0x18f0
 call_write_iter include/linux/fs.h:2186 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x7dc/0xc50 fs/read_write.c:584
 ksys_write+0x177/0x2a0 fs/read_write.c:637
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f4b7a9737b9
RSP: 002b:00007ffc5cac3668 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4b7a9737b9
RDX: 00000000175d9003 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 00007f4b7a933050 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000079f R11: 0000000000000246 R12: 00007f4b7a9330e0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

The above issue happens when the bigalloc and inline data features are both
enabled. Commit 131294c35e fixed a delayed allocation bug in ext4_clu_mapped()
for bigalloc + inline, but it only resolved the case where the inode still
has inline data. If the inline data has been converted to an extent
(ext4_da_convert_inline_data_to_extent) before writepages, the
EXT4_STATE_MAY_INLINE_DATA flag is no longer set, yet i_data still stores
inline data at that point. Looking up an extent then triggers the UAF.
To resolve this, add an "ext4_has_inline_data(inode)" check to
ext4_clu_mapped().

Fixes: 131294c35e ("ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline")
Reported-by: syzbot+bf4bb7731ef73b83a3b4@syzkaller.appspotmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20230406111627.1916759-1-tudor.ambarus@linaro.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 12:56:35 -04:00
Linus Torvalds 22b8cc3e78 Add support for new Linear Address Masking CPU feature. This is similar
to ARM's Top Byte Ignore and allows userspace to store metadata in some
 bits of pointers without masking it out before use.

Merge tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 LAM (Linear Address Masking) support from Dave Hansen:
 "Add support for the new Linear Address Masking CPU feature.

  This is similar to ARM's Top Byte Ignore and allows userspace to store
  metadata in some bits of pointers without masking it out before use"

* tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/iommu/sva: Do not allow to set FORCE_TAGGED_SVA bit from outside
  x86/mm/iommu/sva: Fix error code for LAM enabling failure due to SVA
  selftests/x86/lam: Add test cases for LAM vs thread creation
  selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVA test cases for linear-address masking
  selftests/x86/lam: Add inherit test cases for linear-address masking
  selftests/x86/lam: Add io_uring test cases for linear-address masking
  selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking
  selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking
  x86/mm/iommu/sva: Make LAM and SVA mutually exclusive
  iommu/sva: Replace pasid_valid() helper with mm_valid_pasid()
  mm: Expose untagging mask in /proc/$PID/status
  x86/mm: Provide arch_prctl() interface for LAM
  x86/mm: Reduce untagged_addr() overhead for systems without LAM
  x86/uaccess: Provide untagged_addr() and remove tags before address check
  mm: Introduce untagged_addr_remote()
  x86/mm: Handle LAM on context switch
  x86: CPUID and CR3/CR4 flags for Linear Address Masking
  x86: Allow atomic MM_CONTEXT flags setting
  x86/mm: Rework address range check in get_user() and put_user()
2023-04-28 09:43:49 -07:00
Maxim Korotkov 3e46c89c74 writeback: fix call of incorrect macro
The variable 'history' is of type u16, so using the hweight32 macro on it
is probably a mistake. The hweight16 macro should be used instead.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2a81490811 ("writeback: implement foreign cgroup inode detection")
Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230119104443.3002-1-korotkov.maxim.s@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-28 10:41:32 -06:00
Jens Axboe f40c153afe Merge tag 'md-next-2023-04-28' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.4/block
Pull MD fixes from Song:

"1. Improve raid5 sequential IO performance on spinning disks, which fixes
    a regression since v6.0, by Jan Kara.
 2. Fix bitmap offset types, which fixes an issue introduced in this merge
    window, by Jonathan Derrick."

* tag 'md-next-2023-04-28' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: Fix bitmap offset type in sb writer
  md/raid5: Improve performance for sequential IO
2023-04-28 10:36:27 -06:00
Linus Torvalds 7b664cc38e * Do conditional __tdx_hypercall() 'output' processing via an
assembly macro argument rather than a runtime register.

Merge tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 tdx update from Dave Hansen:
 "The original tdx hypercall assembly code took two flags in %RSI to
  tweak its behavior at runtime. PeterZ recently axed one flag in commit
  e80a48bade ("x86/tdx: Remove TDX_HCALL_ISSUE_STI").

  Kill the other flag too and tweak the 'output' mode with an assembly
  macro instead. This results in elimination of one push/pop pair and
  overall easier to read assembly.

   - Do conditional __tdx_hypercall() 'output' processing via an
     assembly macro argument rather than a runtime register"

* tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tdx: Drop flags from __tdx_hypercall()
2023-04-28 09:36:09 -07:00
Linus Torvalds e54debe657 * Improve AMX documentation along with example code
* Explicitly make some hardware constants part of the uabi

Merge tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Dave Hansen:
 "There's no _actual_ kernel functionality here.

  This expands the documentation around AMX support including some code
  examples. The example code also exposed the fact that hardware
  architecture constants are part of the ABI, but there's no easy place
  that they get defined for apps. Adding them to a uabi header will
  eventually make life easier for consumers of the ABI.

  Summary:

   - Improve AMX documentation along with example code

   - Explicitly make some hardware constants part of the uabi"

* tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Explain the state component permission for guests
  Documentation/x86: Add the AMX enabling example
  x86/arch_prctl: Add AMX feature numbers as ABI constants
  Documentation/x86: Explain the purpose for dynamic features
2023-04-28 09:32:34 -07:00
Linus Torvalds 4980c176a7 Reduce redundant counter reads with resctrl refactoring

Merge tag 'x86_cache_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resctrl update from Dave Hansen:
 "Reduce redundant counter reads with resctrl refactoring"

* tag 'x86_cache_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Avoid redundant counter read in __mon_event_count()
2023-04-28 09:30:51 -07:00
Linus Torvalds 682f7bbad2 - Unify duplicated __pa() and __va() definitions
- Simplify sysctl tables registration
 
 - Remove unused symbols
 
 - Correct function name in comment

Merge tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Borislav Petkov:

 - Unify duplicated __pa() and __va() definitions

 - Simplify sysctl tables registration

 - Remove unused symbols

 - Correct function name in comment

* tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Centralize __pa()/__va() definitions
  x86: Simplify one-level sysctl registration for itmt_kern_table
  x86: Simplify one-level sysctl registration for abi_table2
  x86/platform/intel-mid: Remove unused definitions from intel-mid.h
  x86/uaccess: Remove memcpy_page_flushcache()
  x86/entry: Change stale function name in comment to error_return()
2023-04-28 09:22:30 -07:00
Jonathan Derrick b1211978ec md: Fix bitmap offset type in sb writer
Bitmap offset is allowed to be negative, indicating that bitmap precedes
metadata. Change the type back from sector_t to loff_t to satisfy
conditionals and calculations.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/linux-raid/CAPhsuW6HuaUJ5WcyPajVgUfkQFYp2D_cy1g6qxN4CU_gP2=z7g@mail.gmail.com/
Fixes: 10172f200b ("md: Fix types in sb writer")
Signed-off-by: Jonathan Derrick <jonathan.derrick@linux.dev>
Suggested-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230425011438.71046-1-jonathan.derrick@linux.dev
2023-04-28 09:21:06 -07:00
Jan Kara fc05e06e60 md/raid5: Improve performance for sequential IO
Commit 7e55c60acf ("md/raid5: Pivot raid5_make_request()") changed the
order in which requests for underlying disks are created. For large
sequential IO, adding requests frequently races with the md_raid5 thread
submitting bios to the underlying disks, so the new order changes the IO
pattern: intermediate states of request creation result in more, smaller,
discontiguous requests. For RAID5 on top of three rotational disks, our
performance testing showed that this causes a regression in write
throughput:

iozone -a -s 131072000 -y 4 -q 8 -i 0 -i 1 -R

before 7e55c60acfbb:
              KB  reclen   write rewrite    read    reread
       131072000       4  493670  525964   524575   513384
       131072000       8  540467  532880   512028   513703

after 7e55c60acfbb:
              KB  reclen   write rewrite    read    reread
       131072000       4  421785  456184   531278   509248
       131072000       8  459283  456354   528449   543834

To reduce the amount of discontiguous requests we can start generating
requests with the stripe with the lowest chunk offset as that has the
best chance of being adjacent to IO queued previously. This improves the
performance to:
              KB  reclen   write rewrite    read    reread
       131072000       4  497682  506317   518043   514559
       131072000       8  514048  501886   506453   504319

recovering a large part of the regression.

Fixes: 7e55c60acf ("md/raid5: Pivot raid5_make_request()")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230417171537.17899-1-jack@suse.cz
2023-04-28 09:19:02 -07:00
Randy Dunlap 6d2ed65318 lsm: move hook comments docs to security/security.c
Fix one kernel-doc warning; investigating it led to other kernel-doc
movement (lsm_hooks.h to security.c) that also needs to be fixed.

include/linux/lsm_hooks.h:1: warning: no structured comments found

Fixes: e261301c85 ("lsm: move the remaining LSM hook comments to security/security.c")
Fixes: 1cd2aca64a ("lsm: move the io_uring hook comments to security/security.c")
Fixes: 452b670c72 ("lsm: move the perf hook comments to security/security.c")
Fixes: 55e853201a ("lsm: move the bpf hook comments to security/security.c")
Fixes: b14faf9c94 ("lsm: move the audit hook comments to security/security.c")
Fixes: 1427ddbe5c ("lsm: move the binder hook comments to security/security.c")
Fixes: 43fad28218 ("lsm: move the sysv hook comments to security/security.c")
Fixes: ecc419a445 ("lsm: move the key hook comments to security/security.c")
Fixes: 742b99456e ("lsm: move the xfrm hook comments to security/security.c")
Fixes: ac318aed54 ("lsm: move the Infiniband hook comments to security/security.c")
Fixes: 4a49f592e9 ("lsm: move the SCTP hook comments to security/security.c")
Fixes: 6b6bbe8c02 ("lsm: move the socket hook comments to security/security.c")
Fixes: 2c2442fd46 ("lsm: move the AF_UNIX hook comments to security/security.c")
Fixes: 2bcf51bf2f ("lsm: move the netlink hook comments to security/security.c")
Fixes: 130c53bfee ("lsm: move the task hook comments to security/security.c")
Fixes: a0fd6480de ("lsm: move the file hook comments to security/security.c")
Fixes: 9348944b77 ("lsm: move the kernfs hook comments to security/security.c")
Fixes: 916e32584d ("lsm: move the inode hook comments to security/security.c")
Fixes: 08526a902c ("lsm: move the filesystem hook comments to security/security.c")
Fixes: 36819f1855 ("lsm: move the fs_context hook comments to security/security.c")
Fixes: 1661372c91 ("lsm: move the program execution hook comments to security/security.c")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: KP Singh <kpsingh@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-04-28 11:58:34 -04:00
Zhihao Cheng 1dedde6903 ext4: fix i_disksize exceeding i_size problem in partially written case
It is possible for i_disksize to exceed i_size, triggering a warning.

generic_perform_write
 copied = iov_iter_copy_from_user_atomic(len) // copied < len
 ext4_da_write_end
 | ext4_update_i_disksize
 |  new_i_size = pos + copied;
 |  WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize
 | generic_write_end
 |  copied = block_write_end(copied, len) // copied = 0
 |   if (unlikely(copied < len))
 |    if (!PageUptodate(page))
 |     copied = 0;
 |  if (pos + copied > inode->i_size) // return false
 if (unlikely(copied == 0))
  goto again;
 if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
  status = -EFAULT;
  break;
 }

We get i_disksize greater than i_size here, which could trigger WARNING
check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio:

ext4_dio_write_iter
 iomap_dio_rw
  __iomap_dio_rw // return err, length is not aligned to 512
 ext4_handle_inode_extension
  WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops

 WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319
 CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2
 RIP: 0010:ext4_file_write_iter+0xbc7
 Call Trace:
  vfs_write+0x3b1
  ksys_write+0x77
  do_syscall_64+0x39

Fix it by updating 'copied' value before updating i_disksize just like
ext4_write_inline_data_end() does.

A reproducer can be found in the bugzilla link below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209
Fixes: 64769240bd ("ext4: Add delayed allocation support in data=writeback mode")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230321013721.89818-1-chengzhihao1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 11:07:19 -04:00
Jarkko Sakkinen 0c8862de05 tpm: Re-enable TPM chip bootstrapping non-tpm_tis TPM drivers
TPM chip bootstrapping was removed from tpm_chip_register(), and it
was relocated to tpm_tis_core. This breaks all drivers which are not
based on tpm_tis because the chip will not get properly initialized.

Take the corrective steps:
1. Rename tpm_chip_startup() as tpm_chip_bootstrap() and make it one-shot.
2. Call tpm_chip_bootstrap() in tpm_chip_register(), which restores
   things to the way they used to be.

Cc: Lino Sanfilippo <l.sanfilippo@kunbus.com>
Fixes: 548eb516ec ("tpm, tpm_tis: startup chip before testing for interrupts")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Link: https://lore.kernel.org/all/ZEjqhwHWBnxcaRV5@xpf.sh.intel.com/
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-04-28 13:06:36 +00:00
Olivier Bacon 4140aafcff crypto: engine - fix crypto_queue backlog handling
CRYPTO_TFM_REQ_MAY_BACKLOG tells the crypto driver that it should
internally backlog requests until the crypto hw's queue becomes
full. At that point, crypto_engine backlogs the request and returns
-EBUSY. A calling driver such as dm-crypt then waits until the
complete() function is called with a status of -EINPROGRESS before
sending a new request.

The problem lies in the call to complete() with a value of -EINPROGRESS
that is made when a backlog item is present on the queue. The call is
done before the successful execution of the crypto request. In the case
that do_one_request() returns < 0 and the retry support is available,
the request is put back in the queue. This leads upper drivers to send
a new request even if the queue is still full.

The problem can be reproduced by doing a large dd into a dm-crypt
device. This is pretty easy to see when using the Freescale CAAM
crypto driver and SWIOTLB DMA. Since the number of requests that can
be held in the queue is unlimited, we get I/O errors and DMA
allocation failures.

The fix is to call complete with a value of -EINPROGRESS only if
the request is not enqueued back in crypto_queue. This is done
by calling complete() later in the code. In order to delay the decision,
crypto_queue is modified to correctly set the backlog pointer
when a request is enqueued back.

Fixes: 6a89f492f8 ("crypto: engine - support for parallel requests based on retry mechanism")
Co-developed-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Olivier Bacon <obacon@genetec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-04-28 17:50:43 +08:00
Christophe JAILLET 8fd91151eb crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()
SS_ENCRYPTION is (0 << 7 = 0), so the test can never be true.
Use a direct comparison to SS_ENCRYPTION instead.

The same kind of test is already done the same way in sun8i_ss_run_task().

Fixes: 359e893e8a ("crypto: sun8i-ss - rework handling of IV")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-04-28 17:50:43 +08:00
Oswald Buddenhagen 9d2f38638a ALSA: emu10k1: use more existing defines instead of open-coded numbers
Using the *_MASK defines for "maximal value" is debatable. I got the
idea from FreeBSD, and it sorta makes sense to me.

Some hunks look a bit incomplete, because code that is going to be
subsequently removed is not touched here.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Link: https://lore.kernel.org/r/20230428080732.1697695-1-oswald.buddenhagen@gmx.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-04-28 11:22:51 +02:00
Angelo Dureghello 6686317855 net: dsa: mv88e6xxx: add mv88e6321 rsvd2cpu
Add rsvd2cpu capability for mv88e6321 model, to allow proper bpdu
processing.

Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Fixes: 51c901a775 ("net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:57:32 +01:00
Antoine Tenart dc6456e938 net: ipv6: fix skb hash for some RST packets
The skb hash comes from sk->sk_txhash when using TCP, except for some
IPv6 RST packets. This is because in tcp_v6_send_reset when not in
TIME_WAIT the hash is taken from sk->sk_hash, while it should come from
sk->sk_txhash as those two hashes are not computed the same way.

Packetdrill script to test the above:

   0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
  +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > (flowlabel 0x1) S 0:0(0) <...>

  // Wrong ack seq, trigger a rst.
  +0 < S. 0:0(0) ack 0 win 4000

  // Check the flowlabel matches prior one from SYN.
  +0 > (flowlabel 0x1) R 0:0(0) <...>

Fixes: 9258b8b1be ("ipv6: tcp: send consistent autoflowlabel in RST packets")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:53:43 +01:00
Andrea Mayer 46ef24c60f selftests: srv6: make srv6_end_dt46_l3vpn_test more robust
On some distributions, the rp_filter is automatically set (=1) by
default on a netdev basis (also on VRFs).
In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using
the table associated with the VRF bound to that tunnel. During lookup
operations, the rp_filter can lead to packet loss when activated on the
VRF.
Therefore, we chose to make this selftest more robust by explicitly
disabling the rp_filter during tests (as it is automatically set by some
Linux distributions).

Fixes: 03a0b567a0 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:51:40 +01:00
wuych 042334a8d4 atlantic:hw_atl2:hw_atl2_utils_fw: Remove unnecessary (void*) conversions
Pointer variables of void * type do not require type cast.

Signed-off-by: wuych <yunchuan@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:49:21 +01:00
Cong Wang c88f8d5cd9 sit: update dev->needed_headroom in ipip6_tunnel_bind_dev()
When a tunnel device is bound with the underlying device, its
dev->needed_headroom needs to be updated properly. IPv4 tunnels
already do the same in ip_tunnel_bind_dev(). Otherwise we may
not have enough header room for skb, especially after commit
b17f709a24 ("gue: TX support for using remote checksum offload option").

Fixes: 32b8a8e59c ("sit: add IPv4 over IPv4 support")
Reported-by: Palash Oswal <oswalpalash@gmail.com>
Link: https://lore.kernel.org/netdev/CAGyP=7fDcSPKu6nttbGwt7RXzE3uyYxLjCSE97J64pRxJP8jPA@mail.gmail.com/
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:48:14 +01:00
Vlad Buslov da94a7781f net/sched: cls_api: remove block_cb from driver_list before freeing
Error handler of tcf_block_bind() frees the whole bo->cb_list on error.
However, by that time the flow_block_cb instances are already in the driver
list because driver ndo_setup_tc() callback is called before that up the
call chain in tcf_block_offload_cmd(). This leaves dangling pointers to
freed objects in the list and causes use-after-free[0]. Fix it by also
removing flow_block_cb instances from driver_list before deallocating them.

[0]:
[  279.868433] ==================================================================
[  279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0
[  279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963

[  279.873151] CPU: 6 PID: 2963 Comm: tc Not tainted 6.3.0-rc6+ #4
[  279.874273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  279.876295] Call Trace:
[  279.876882]  <TASK>
[  279.877413]  dump_stack_lvl+0x33/0x50
[  279.878198]  print_report+0xc2/0x610
[  279.878987]  ? flow_block_cb_setup_simple+0x631/0x7c0
[  279.879994]  kasan_report+0xae/0xe0
[  279.880750]  ? flow_block_cb_setup_simple+0x631/0x7c0
[  279.881744]  ? mlx5e_tc_reoffload_flows_work+0x240/0x240 [mlx5_core]
[  279.883047]  flow_block_cb_setup_simple+0x631/0x7c0
[  279.884027]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
[  279.885037]  ? tcf_block_setup+0x6b0/0x6b0
[  279.885901]  ? mutex_lock+0x7d/0xd0
[  279.886669]  ? __mutex_unlock_slowpath.constprop.0+0x2d0/0x2d0
[  279.887844]  ? ingress_init+0x1c0/0x1c0 [sch_ingress]
[  279.888846]  tcf_block_get_ext+0x61c/0x1200
[  279.889711]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.890682]  ? clsact_init+0x2b0/0x2b0 [sch_ingress]
[  279.891701]  qdisc_create+0x401/0xea0
[  279.892485]  ? qdisc_tree_reduce_backlog+0x470/0x470
[  279.893473]  tc_modify_qdisc+0x6f7/0x16d0
[  279.894344]  ? tc_get_qdisc+0xac0/0xac0
[  279.895213]  ? mutex_lock+0x7d/0xd0
[  279.896005]  ? __mutex_lock_slowpath+0x10/0x10
[  279.896910]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.897770]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  279.898672]  ? __sys_sendmsg+0xb5/0x140
[  279.899494]  ? do_syscall_64+0x3d/0x90
[  279.900302]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  279.901337]  ? kasan_save_stack+0x2e/0x40
[  279.902177]  ? kasan_save_stack+0x1e/0x40
[  279.903058]  ? kasan_set_track+0x21/0x30
[  279.903913]  ? kasan_save_free_info+0x2a/0x40
[  279.904836]  ? ____kasan_slab_free+0x11a/0x1b0
[  279.905741]  ? kmem_cache_free+0x179/0x400
[  279.906599]  netlink_rcv_skb+0x12c/0x360
[  279.907450]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  279.908360]  ? netlink_ack+0x1550/0x1550
[  279.909192]  ? rhashtable_walk_peek+0x170/0x170
[  279.910135]  ? kmem_cache_alloc_node+0x1af/0x390
[  279.911086]  ? _copy_from_iter+0x3d6/0xc70
[  279.912031]  netlink_unicast+0x553/0x790
[  279.912864]  ? netlink_attachskb+0x6a0/0x6a0
[  279.913763]  ? netlink_recvmsg+0x416/0xb50
[  279.914627]  netlink_sendmsg+0x7a1/0xcb0
[  279.915473]  ? netlink_unicast+0x790/0x790
[  279.916334]  ? iovec_from_user.part.0+0x4d/0x220
[  279.917293]  ? netlink_unicast+0x790/0x790
[  279.918159]  sock_sendmsg+0xc5/0x190
[  279.918938]  ____sys_sendmsg+0x535/0x6b0
[  279.919813]  ? import_iovec+0x7/0x10
[  279.920601]  ? kernel_sendmsg+0x30/0x30
[  279.921423]  ? __copy_msghdr+0x3c0/0x3c0
[  279.922254]  ? import_iovec+0x7/0x10
[  279.923041]  ___sys_sendmsg+0xeb/0x170
[  279.923854]  ? copy_msghdr_from_user+0x110/0x110
[  279.924797]  ? ___sys_recvmsg+0xd9/0x130
[  279.925630]  ? __perf_event_task_sched_in+0x183/0x470
[  279.926656]  ? ___sys_sendmsg+0x170/0x170
[  279.927529]  ? ctx_sched_in+0x530/0x530
[  279.928369]  ? update_curr+0x283/0x4f0
[  279.929185]  ? perf_event_update_userpage+0x570/0x570
[  279.930201]  ? __fget_light+0x57/0x520
[  279.931023]  ? __switch_to+0x53d/0xe70
[  279.931846]  ? sockfd_lookup_light+0x1a/0x140
[  279.932761]  __sys_sendmsg+0xb5/0x140
[  279.933560]  ? __sys_sendmsg_sock+0x20/0x20
[  279.934436]  ? fpregs_assert_state_consistent+0x1d/0xa0
[  279.935490]  do_syscall_64+0x3d/0x90
[  279.936300]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  279.937311] RIP: 0033:0x7f21c814f887
[  279.938085] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[  279.941448] RSP: 002b:00007fff11efd478 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  279.942964] RAX: ffffffffffffffda RBX: 0000000064401979 RCX: 00007f21c814f887
[  279.944337] RDX: 0000000000000000 RSI: 00007fff11efd4e0 RDI: 0000000000000003
[  279.945660] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[  279.947003] R10: 00007f21c8008708 R11: 0000000000000246 R12: 0000000000000001
[  279.948345] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400
[  279.949690]  </TASK>

[  279.950706] Allocated by task 2960:
[  279.951471]  kasan_save_stack+0x1e/0x40
[  279.952338]  kasan_set_track+0x21/0x30
[  279.953165]  __kasan_kmalloc+0x77/0x90
[  279.954006]  flow_block_cb_setup_simple+0x3dd/0x7c0
[  279.955001]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
[  279.956020]  tcf_block_get_ext+0x61c/0x1200
[  279.956881]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.957873]  qdisc_create+0x401/0xea0
[  279.958656]  tc_modify_qdisc+0x6f7/0x16d0
[  279.959506]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.960392]  netlink_rcv_skb+0x12c/0x360
[  279.961216]  netlink_unicast+0x553/0x790
[  279.962044]  netlink_sendmsg+0x7a1/0xcb0
[  279.962906]  sock_sendmsg+0xc5/0x190
[  279.963702]  ____sys_sendmsg+0x535/0x6b0
[  279.964534]  ___sys_sendmsg+0xeb/0x170
[  279.965343]  __sys_sendmsg+0xb5/0x140
[  279.966132]  do_syscall_64+0x3d/0x90
[  279.966908]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

[  279.968407] Freed by task 2960:
[  279.969114]  kasan_save_stack+0x1e/0x40
[  279.969929]  kasan_set_track+0x21/0x30
[  279.970729]  kasan_save_free_info+0x2a/0x40
[  279.971603]  ____kasan_slab_free+0x11a/0x1b0
[  279.972483]  __kmem_cache_free+0x14d/0x280
[  279.973337]  tcf_block_setup+0x29d/0x6b0
[  279.974173]  tcf_block_offload_cmd.isra.0+0x226/0x2d0
[  279.975186]  tcf_block_get_ext+0x61c/0x1200
[  279.976080]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.977065]  qdisc_create+0x401/0xea0
[  279.977857]  tc_modify_qdisc+0x6f7/0x16d0
[  279.978695]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.979562]  netlink_rcv_skb+0x12c/0x360
[  279.980388]  netlink_unicast+0x553/0x790
[  279.981214]  netlink_sendmsg+0x7a1/0xcb0
[  279.982043]  sock_sendmsg+0xc5/0x190
[  279.982827]  ____sys_sendmsg+0x535/0x6b0
[  279.983703]  ___sys_sendmsg+0xeb/0x170
[  279.984510]  __sys_sendmsg+0xb5/0x140
[  279.985298]  do_syscall_64+0x3d/0x90
[  279.986076]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

[  279.987532] The buggy address belongs to the object at ffff888147e2bf00
                which belongs to the cache kmalloc-192 of size 192
[  279.989747] The buggy address is located 32 bytes inside of
                freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0)

[  279.992367] The buggy address belongs to the physical page:
[  279.993430] page:00000000550f405c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147e2a
[  279.995182] head:00000000550f405c order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  279.996713] anon flags: 0x200000000010200(slab|head|node=0|zone=2)
[  279.997878] raw: 0200000000010200 ffff888100042a00 0000000000000000 dead000000000001
[  279.999384] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[  280.000894] page dumped because: kasan: bad access detected

[  280.002386] Memory state around the buggy address:
[  280.003338]  ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.004781]  ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.007700]                                ^
[  280.008592]  ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  280.010035]  ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.011564] ==================================================================

Fixes: 59094b1e50 ("net: sched: use flow block API")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:44:19 +01:00
Christophe JAILLET e0807c4302 mISDN: Use list_count_nodes()
count_list_member() really looks the same as list_count_nodes(), so use the
latter instead of hand writing it.

The first one returns an int and the other a size_t, but that should be
fine. It is really unlikely that we get so many parties in a conference.
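
As a rough illustration of what such a counting helper does, here is a
minimal userspace sketch. The struct and function names (list_node,
list_init, list_append, count_nodes) are made up for this example and are
not the kernel's definitions; the sketch only mirrors the idea of walking
a circular list once and returning the count as a size_t.

```c
#include <stddef.h>

/* Minimal userspace stand-in for a circular doubly linked list (hypothetical names). */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Link 'entry' in just before 'head', i.e. at the tail of the list. */
static void list_append(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Walk the list once and count every node except the head itself. */
static size_t count_nodes(const struct list_node *head)
{
	const struct list_node *pos;
	size_t count = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		count++;
	return count;
}
```

A caller that previously stored the result in an int would now receive a
size_t, which is the type difference the message refers to.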

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:43:04 +01:00
Eric Dumazet 7e692df393 tcp: fix skb_copy_ubufs() vs BIG TCP
David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy
using hugepages, and skb length bigger than ~68 KB.

skb_copy_ubufs() assumed it could copy all payload using up to
MAX_SKB_FRAGS order-0 pages.

This assumption broke when BIG TCP was able to put up to 512 KB per skb.

We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45
and limit gso_max_size to 180000.

A solution is to use higher order pages if needed.
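
The sizing arithmetic can be sketched in userspace as follows. This is an
illustration under assumptions, not the code from the patch: it assumes a
4 KiB page and a limit of 17 fragments (which matches the ~68 KB figure
above), and the names SKETCH_PAGE_SIZE, SKETCH_MAX_FRAGS, min_page_order()
and pages_needed() are invented for the example.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4 KiB base page */
#define SKETCH_MAX_FRAGS 17u	/* assumed fragment limit (17 * 4 KiB = 68 KiB) */

/*
 * Smallest page order such that SKETCH_MAX_FRAGS pages of
 * (SKETCH_PAGE_SIZE << order) bytes can hold 'len' bytes.
 */
static unsigned int min_page_order(size_t len)
{
	unsigned int order = 0;

	while (((size_t)SKETCH_PAGE_SIZE << order) * SKETCH_MAX_FRAGS < len)
		order++;
	return order;
}

/* Number of pages of the given order needed for 'len' bytes, rounded up. */
static size_t pages_needed(size_t len, unsigned int order)
{
	size_t psize = (size_t)SKETCH_PAGE_SIZE << order;

	return (len + psize - 1) / psize;
}
```

Under these assumptions a 512 KB payload needs order-3 (32 KiB) pages,
and 16 of them are enough, which stays within the fragment limit.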

v2: add missing __GFP_COMP, or we leak memory.

Fixes: 7c4e983c4f ("net: allow gso_max_size to exceed 65536")
Reported-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/netdev/c70000f6-baa4-4a05-46d0-4b3e0dc1ccc8@gmail.com/T/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:40:38 +01:00
Cosmo Chou 6f75cd166a net/ncsi: clear Tx enable mode when handling a Config required AEN
ncsi_channel_is_tx() determines whether a given channel should be
used for Tx or not. However, when reconfiguring the channel by
handling a Configuration Required AEN, there is a misjudgment that
the channel Tx has already been enabled, which results in the Enable
Channel Network Tx command not being sent.

Clear the channel Tx enable flag before reconfiguring the channel to
avoid the misjudgment.

Fixes: 8d951a75d0 ("net/ncsi: Configure multi-package, multi-channel modes with failover")
Signed-off-by: Cosmo Chou <chou.cosmo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:35:33 +01:00
Jeremy Kerr 8c6c78ee3b i3c: ast2600: fix register setting for 545 ohm pullups
The 2k register setting is zero, so OR-ing it in doesn't parallel the 2k
and 750 ohm pullups. We need a separate value for the 545 ohm setting.
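
A small sketch of the reasoning, with assumptions labeled: the only
encoding taken from the message above is that the 2k setting is zero. The
0x2 value for the 750 ohm setting and all names below are hypothetical
and chosen purely for illustration.

```c
/* Hypothetical field encodings: only "2k is zero" comes from the commit message. */
enum sketch_pullup {
	SKETCH_PULLUP_2K  = 0x0,	/* stated in the commit message */
	SKETCH_PULLUP_750 = 0x2,	/* made-up value for illustration */
};

/*
 * Equivalent resistance of two resistors in parallel, in whole ohms
 * (truncated). Both arguments must not be zero at the same time.
 */
static unsigned int parallel_ohms(unsigned int r1, unsigned int r2)
{
	return (unsigned int)(((unsigned long long)r1 * r2) / (r1 + r2));
}
```

With these definitions, (SKETCH_PULLUP_2K | SKETCH_PULLUP_750) is just
SKETCH_PULLUP_750, so the OR cannot select a distinct setting, while
parallel_ohms(2000, 750) truncates 545.45 to 545, the value that needs its
own encoding.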

Reported-by: Lukwinski Zbigniew <zbigniew.lukwinski@linux.intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230428001849.1775559-1-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:52:23 +02:00
Jeremy Kerr f2539c2079 i3c: ast2600: enable IBI support
The ast2600 i3c hardware is capable of IBIs, but we need a workaround
for a hardware issue with the I3C state machine handling IBI payloads
of specific lengths when PEC is not enabled. To avoid this, we need to
unconditionally enable PECs, at the cost of losing a byte of data
when the device does not send a PEC.

Enable IBIs on the ast2600 platform, including an implementation of the
PEC workaround, which prints a warning when triggered.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/ba923b96d6d129024c975e8a0472c5b2fcb3af32.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:49:50 +02:00
Jeremy Kerr f3a3553a51 i3c: dw: Add a platform facility for IBI PEC workarounds
On the AST2600 i3c controller, we'll need to apply a workaround for a
hardware issue with IBI payloads.

Introduce a platform hook to allow dw i3c platform implementations to
modify the DAT entry in IBI enable/disable to allow this workaround in a
future change.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/d5d76a8d2336d2a71886537f42e71d51db184df6.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr e389b1d72a i3c: dw: Add support for in-band interrupts
This change adds support for receiving and dequeueing i3c IBIs.

By setting struct dw_i3c_master->ibi_capable before probe, a platform
implementation can select the IBI-enabled version of the i3c_master_ops,
enabling the global IBI infrastructure for that controller.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/79daeefd7ccb7c935d0c159149df21a6c9a73ffa.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr e2d43101f6 i3c: dw: Turn DAT array entry into a struct
In an upcoming change, we will want to store additional data about the
devices we have in the device address table (DAT).

Change the type of the DAT entries into a struct, which currently just
has the address data.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/9dc0d9e2857e851a0cf04819df48e5d31921f83e.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr 79f42b31c2 i3c: dw: Create a generic fifo read function
In a future change we'll want to read from the IBI FIFO too, so turn
dw_i3c_read_rx_fifo() into a generic read with the FIFO register as a
parameter.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/827204789583dd86addffb47ecaeab9d67cf95d5.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr 7dc2e0a875 i3c: Allow OF-alias-based persistent bus numbering
Parse the /aliases node to assign any fixed bus numbers, as is done with
the i2c subsystem. Numbering for non-aliased busses will start after the
highest fixed bus number.

This allows an alias node such as:

    aliases {
        i3c0 = &bus_a;
        i3c4 = &bus_b;
    };

to set the numbering for a set of i3c controllers:

    /* fixed-numbered bus, assigned "i3c-0" */
    bus_a: i3c-master {
    };

    /* another fixed-numbered bus, assigned "i3c-4" */
    bus_b: i3c-master {
    };

    /* dynamic-numbered bus, likely assigned "i3c-5" */
    bus_c: i3c-master {
    };

If no i3c device aliases are present, the numbering will stay as-is,
starting from 0.
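
The numbering rule described above can be sketched as a standalone
function. This is a simplified illustration, not the driver code: NO_ALIAS
and assign_bus_ids() are hypothetical names, buses are handled in array
order, and the real allocator may behave differently in corner cases
(hence "likely" in the example above).

```c
#include <stddef.h>

#define NO_ALIAS (-1)	/* hypothetical marker: bus has no /aliases entry */

/*
 * alias_ids[i] is the fixed number requested via an alias, or NO_ALIAS.
 * bus_ids[i] receives the number the bus ends up with.
 */
static void assign_bus_ids(const int *alias_ids, int *bus_ids, size_t n)
{
	int next_dynamic = 0;
	size_t i;

	/* Dynamic numbering starts after the highest fixed (aliased) number. */
	for (i = 0; i < n; i++)
		if (alias_ids[i] != NO_ALIAS && alias_ids[i] >= next_dynamic)
			next_dynamic = alias_ids[i] + 1;

	/* Aliased buses keep their number; the rest count up from there. */
	for (i = 0; i < n; i++)
		bus_ids[i] = (alias_ids[i] != NO_ALIAS) ? alias_ids[i]
							: next_dynamic++;
}
```

For the three buses in the example, the inputs {0, 4, NO_ALIAS} produce
{0, 4, 5}; with no aliases at all, numbering simply starts from 0.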

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230405094149.1513209-1-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:19:01 +02:00
Jeremy Kerr 5844564143 i3c: ast2600: Add AST2600 platform-specific driver
Now that we have platform-specific infrastructure for the dw i3c driver,
add platform support for the ASPEED AST2600 SoC.

The AST2600 has a small set of "i3c global" registers, providing
platform-level i3c configuration outside of the i3c core.

For the ast2600, we need a couple of extra setup operations:

 - on probe: find the i3c global register set and parse the SDA pullup
   resistor values

 - on init: set the pullups accordingly, and set the i3c instance IDs

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230331091501.3800299-4-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:19:01 +02:00
Jeremy Kerr 21203e098c dt-bindings: i3c: Add AST2600 i3c controller
Add a devicetree binding for the ast2600 i3c controller hardware. This
is heavily based on the designware i3c core, plus a reset facility
and two platform-specific properties:

 - sda-pullup-ohms: to specify the value of the configurable pullup
   resistors on the SDA line

 - aspeed,global-regs: to reference the (ast2600-specific) i3c global
   register block, and the device index to use within it.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> (on v1)
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230331091501.3800299-3-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:19:01 +02:00
Jeremy Kerr d782188cbb i3c: dw: Add infrastructure for platform-specific implementations
The dw i3c core can be integrated into various SoC devices. Platforms
that use this core may need a little configuration that is specific to
that platform.

Add some infrastructure to allow platform-specific behaviour: common
probe/remove functions, a set of platform hook operations, and a pointer
for platform-specific data in struct dw_i3c_master. Move the common api
into a new (i3c local) header file.

Platforms will provide their own struct platform_driver, which allocates
struct dw_i3c_master, does any platform-specific probe behaviour, and
calls into the common probe.

A future change will add new platform support that uses this
infrastructure.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230331091501.3800299-2-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:18:53 +02:00
Ye Xingchen e99ab4abeb rtc: armada38x: use devm_platform_ioremap_resource_byname()
Convert platform_get_resource_byname() and devm_ioremap_resource() to a single
call to devm_platform_ioremap_resource_byname(), as this is exactly what
this function does.

Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/202303221130316049449@zte.com.cn
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:07:23 +02:00
Ye Xingchen 916890539b rtc: sunplus: use devm_platform_ioremap_resource_byname()
Convert platform_get_resource_byname() and devm_ioremap_resource() to a single
call to devm_platform_ioremap_resource_byname(), as this is exactly what
this function does.

Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/202303221131581039486@zte.com.cn
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:07:23 +02:00
Lars-Peter Clausen c7a639dac8 rtc: jz4740: Make sure clock provider gets removed
The jz4740 RTC driver registers a clock provider, but never removes it.
This leaves a stale clock provider behind that references freed clocks when
the device is unbound.

Use the managed `devm_of_clk_add_hw_provider()` instead of
`of_clk_add_hw_provider()` to make sure the provider gets automatically
removed on unbind.

Fixes: 5ddfa148de ("rtc: jz4740: Register clock provider for the CLK32K pin")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20230409162544.16155-1-lars@metafoo.de
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:07:23 +02:00
Linus Torvalds 33afd4b763 Merge tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Mainly singleton patches all over the place.

  Series of note are:

   - updates to scripts/gdb from Glenn Washburn

   - kexec cleanups from Bjorn Helgaas"

* tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (50 commits)
  mailmap: add entries for Paul Mackerras
  libgcc: add forward declarations for generic library routines
  mailmap: add entry for Oleksandr
  ocfs2: reduce ioctl stack usage
  fs/proc: add Kthread flag to /proc/$pid/status
  ia64: fix an addr to taddr in huge_pte_offset()
  checkpatch: introduce proper bindings license check
  epoll: rename global epmutex
  scripts/gdb: add GDB convenience functions $lx_dentry_name() and $lx_i_dentry()
  scripts/gdb: create linux/vfs.py for VFS related GDB helpers
  uapi/linux/const.h: prefer ISO-friendly __typeof__
  delayacct: track delays from IRQ/SOFTIRQ
  scripts/gdb: timerlist: convert int chunks to str
  scripts/gdb: print interrupts
  scripts/gdb: raise error with reduced debugging information
  scripts/gdb: add a Radix Tree Parser
  lib/rbtree: use '+' instead of '|' for setting color.
  proc/stat: remove arch_idle_time()
  checkpatch: check for misuse of the link tags
  checkpatch: allow Closes tags with links
  ...
2023-04-27 19:57:00 -07:00
Linus Torvalds 7fa8a8ee94 Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the performance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultfd,
   permitting userspace to write-protect unpopulated ptes in anon memory.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen gives userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removed vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also converted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page reclaim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
2023-04-27 19:42:02 -07:00
Eric Blake 952aa344bf docs nbd: userspace NBD now favors github over sourceforge
While the sourceforge site for userspace NBD still exists, the code
repository moved to github several years ago.  Then with a recent
patch[1], the github landing page contains just as much information as
the sourceforge page, so we might as well point to a single location
that also provides the code.

[1] https://lists.debian.org/nbd/2023/03/msg00051.html

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230410180611.1051618-5-eblake@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-27 19:15:11 -06:00
Eric Blake bd9e9916c3 block nbd: use req.cookie instead of req.handle
The NBD spec was recently changed [1] to refer to the opaque client
identifier as a 'cookie' rather than a 'handle', but has for a much
longer time listed it as a 64-bit value, and declares that all values
in the NBD protocol are sent in network byte order (big-endian).

Because the value is opaque to the server, it doesn't usually matter
what endianness we send as the client - as long as we are consistent
that either we byte-swap on both write and read, or on neither, then
we can match server replies back to our requests.  That said, our
internal use of the cookie is as a 64-bit number (well, as two 32-bit
numbers concatenated together), rather than as 8 individual bytes; so
prior to this commit, we ARE leaking the native endianness of our
internals as a client out to the server.  We don't know of any server
that will actually inspect the opaque value and behave differently
depending on whether a little-endian or big-endian client is sending
requests, but since we DO log the cookie value, a wireshark capture of
the network traffic is easier to correlate back to the kernel traffic
of a big-endian host (where the u64 and char[8] representations are
the same) than of a little-endian host (where if wireshark honors the
NBD spec and displays a u64 in network byte order, it is byte-swapped
from what the kernel logged).

The fix in this patch is thus two-part: it now consistently uses
network byte order for the opaque value (no difference to a big-endian
machine, but an extra byteswap on a little-endian machine; probably in
the noise compared to the overhead of network traffic in general), and
now uses a 64-bit integer instead of char[8] as its preferred access
to the opaque value (direct assignment instead of memcpy()).
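
The byte-order point can be illustrated with a small userspace sketch. It
is not the driver code: make_handle(), put_be64() and get_be64() are
hypothetical helpers, and placing the cookie in the upper 32 bits is an
assumption made for the example. In the kernel one would normally reach
for cpu_to_be64() and be64_to_cpu() rather than open-coded shifts.

```c
#include <stdint.h>

/* Pack a 32-bit cookie and a 32-bit tag into one 64-bit value
 * (cookie in the high half is an assumed layout for this sketch). */
static uint64_t make_handle(uint32_t cookie, uint32_t tag)
{
	return ((uint64_t)cookie << 32) | tag;
}

/* Serialize most-significant byte first (network byte order) on any host. */
static void put_be64(uint64_t value, unsigned char out[8])
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(value >> (56 - 8 * i));
}

/* Inverse of put_be64(): rebuild the host integer from wire bytes. */
static uint64_t get_be64(const unsigned char in[8])
{
	uint64_t value = 0;
	int i;

	for (i = 0; i < 8; i++)
		value = (value << 8) | in[i];
	return value;
}
```

Because the wire bytes are produced by shifting rather than by copying
the integer's in-memory representation, a capture shows the same byte
sequence whether the client is little-endian or big-endian.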

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230410180611.1051618-4-eblake@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-27 19:15:11 -06:00
Eric Blake 2686eb845d uapi nbd: add cookie alias to handle
The uapi <linux/nbd.h> header declares a 'char handle[8]' per request;
which is overloaded in English (are you referring to "handle" the
verb, such as handling a signal or writing a callback handler, or
"handle" the noun, the value used in a lookup table to correlate a
response back to the request).  Many user-space NBD implementations
(both servers and clients) have instead used 'uint64_t cookie' or
similar, as it is easier to directly assign an integer than to futz
around with memcpy.  In fact, upstream documentation is now
encouraging this shift in terminology:
https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b

Accomplish this by use of an anonymous union to provide the alias for
anyone getting the definition from the uapi; this does not break
existing clients, while exposing the nicer name for those who prefer
it.  Note that block/nbd.c still uses the term handle (in fact, it
actually combines a 32-bit cookie and a 32-bit tag into the 64-bit
handle), but that internal usage is not changed by the public uapi,
since no compliant NBD server has any reason to inspect or alter the
64 bits sent over the socket.
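
The aliasing technique can be sketched with a stand-in structure. The
struct below is hypothetical (demo_request is not the uapi definition and
its other fields are placeholders); it only shows how an anonymous union
lets two member names refer to the same 8 bytes without changing the
layout seen by existing users.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout, for illustration only. */
struct demo_request {
	uint32_t type;
	union {
		uint64_t cookie;	/* newer, integer-friendly name */
		char handle[8];		/* older name, kept for existing users */
	};
	uint32_t len;
};

/* Both names must start at the same offset and cover the same size. */
_Static_assert(offsetof(struct demo_request, cookie) ==
	       offsetof(struct demo_request, handle),
	       "cookie and handle alias the same storage");
_Static_assert(sizeof(((struct demo_request *)0)->cookie) ==
	       sizeof(((struct demo_request *)0)->handle),
	       "cookie and handle have the same size");

/* Old-style code fills 'handle'; new-style code can read 'cookie'. */
static uint64_t cookie_from_handle_bytes(const char bytes[8])
{
	struct demo_request req = {0};

	memcpy(req.handle, bytes, sizeof(req.handle));
	return req.cookie;
}
```

Code written against the old name keeps compiling unchanged, while new
code can assign the integer directly instead of going through memcpy().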

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230410180611.1051618-3-eblake@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-27 19:15:11 -06:00
Eric Blake daf376a366 uapi nbd: improve doc links to userspace spec
The uapi <linux/nbd.h> header intentionally documents only the NBD
server features that the kernel module will utilize as a client.  But
while it already had one mention of skipped bits due to userspace
extensions, it did not actually direct the reader to the canonical
source to learn about those extensions.

While touching comments, fix an outdated reference that listed only
READ and WRITE as commands.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230410180611.1051618-2-eblake@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-27 19:15:11 -06:00
Linus Torvalds 91ec4b0d11 Merge tag 'mips_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

 - added support for Huawei B593u-12

 - added support for virt board aligned to QEMU MIPS virt board

 - added support for doing DMA coherence on a per-device basis

 - reworked handling of RALINK SoCs

 - cleanup for Loongson64 barriers

 - removed deprecated support for MIPS_CMP SMP handling method

 - removed support for Sibyte CARMEL and CHRINE boards

 - cleanups and fixes

* tag 'mips_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (59 commits)
  MIPS: uprobes: Restore thread.trap_nr
  MIPS: Don't clear _PAGE_SPECIAL in _PAGE_CHG_MASK
  MIPS: Sink body of check_bugs_early() into its only call site
  MIPS: Mark check_bugs() as __init
  Revert "MIPS: generic: Enable all CPUs supported by virt board in Kconfig"
  MIPS: octeon_switch: Remove duplicated labels
  MIPS: loongson2ef: Add missing break in cs5536_isa
  MIPS: Remove set_swbp() in uprobes.c
  MIPS: Use def_bool y for ARCH_SUPPORTS_UPROBES
  MIPS: fw: Allow firmware to pass a empty env
  MIPS: Remove deprecated CONFIG_MIPS_CMP
  MIPS: lantiq: remove unused function declaration
  MIPS: Drop unused positional parameter in local_irq_{dis,en}able
  MIPS: mm: Remove local_cache_flush_page
  MIPS: Remove no longer used ide.h
  MIPS: mm: Remove unused *cache_page_indexed flush functions
  MIPS: generic: Enable all CPUs supported by virt board in Kconfig
  MIPS: Add board config for virt board
  MIPS: Octeon: Disable CVMSEG by default on other platforms
  MIPS: Loongson: Don't select platform features with CPU
  ...
2023-04-27 17:46:52 -07:00
Linus Torvalds 513f17f8d6 sh updates for v6.4
- sh: Use generic GCC library routines
 - sh: sq: Use the bitmap API when applicable
 - sh: sq: Fix incorrect element size for allocating bitmap buffer
 - sh: pci: Remove unused variable in SH-7786 PCI Express code
 - sh: mcount.S: fix build error when PRINTK is not enabled
 - sh: remove sh5/sh64 last fragments
 - sh: math-emu: fix macro redefined warning
 - sh: init: use OF_EARLY_FLATTREE for early init
 - sh: nmi_debug: fix return value of __setup handler
 - sh: SH2007: drop the bad URL info

Merge tag 'sh-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
 "This is a bit larger than my previous one and mainly consists of
  clean-up work in the arch/sh directory by Geert Uytterhoeven and Randy
  Dunlap.

  Additionally, this fixes a bug in the Storage Queue code that was
  discovered while I was reviewing a patch to switch the code to the
  bitmap API by Christophe Jaillet.

  So this contains both a fix for the original bug in the Storage Queue
  code that can be backported later as well as Christophe's patch to
  switch the code to the bitmap API.

  Summary:

   - Use generic GCC library routines

   - sq: Use the bitmap API when applicable

   - sq: Fix incorrect element size for allocating bitmap buffer

   - pci: Remove unused variable in SH-7786 PCI Express code

   - mcount.S: fix build error when PRINTK is not enabled

   - remove sh5/sh64 last fragments

   - math-emu: fix macro redefined warning

   - init: use OF_EARLY_FLATTREE for early init

   - nmi_debug: fix return value of __setup handler

   - SH2007: drop the bad URL info"

* tag 'sh-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: Replace <uapi/asm/types.h> by <asm-generic/int-ll64.h>
  sh: Use generic GCC library routines
  sh: sq: Use the bitmap API when applicable
  sh: sq: Fix incorrect element size for allocating bitmap buffer
  sh: pci: Remove unused variable in SH-7786 PCI Express code
  sh: mcount.S: fix build error when PRINTK is not enabled
  sh: remove sh5/sh64 last fragments
  sh: math-emu: fix macro redefined warning
  sh: init: use OF_EARLY_FLATTREE for early init
  sh: nmi_debug: fix return value of __setup handler
  sh: SH2007: drop the bad URL info
2023-04-27 17:41:23 -07:00
Linus Torvalds 35fab9271b xen: branch for v6.4-rc1

Merge tag 'for-linus-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - some cleanups in the Xen blkback driver

 - fix potential sleeps under lock in various Xen drivers

* tag 'for-linus-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/blkback: move blkif_get_x86_*_req() into blkback.c
  xen/blkback: simplify free_persistent_gnts() interface
  xen/blkback: remove stale prototype
  xen/blkback: fix white space code style issues
  xen/pvcalls: don't call bind_evtchn_to_irqhandler() under lock
  xen/scsiback: don't call scsiback_free_translation_entry() under lock
  xen/pciback: don't call pcistub_device_put() under lock
2023-04-27 17:27:06 -07:00
Linus Torvalds da46b58ff8 hyperv-next for v6.4

Merge tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - PCI passthrough for Hyper-V confidential VMs (Michael Kelley)

 - Hyper-V VTL mode support (Saurabh Sengar)

 - Move panic report initialization code earlier (Long Li)

 - Various improvements and bug fixes (Dexuan Cui and Michael Kelley)

* tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
  PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg
  Drivers: hv: move panic report code from vmbus to hv early init code
  x86/hyperv: VTL support for Hyper-V
  Drivers: hv: Kconfig: Add HYPERV_VTL_MODE
  x86/hyperv: Make hv_get_nmi_reason public
  x86/hyperv: Add VTL specific structs and hypercalls
  x86/init: Make get/set_rtc_noop() public
  x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
  x86/hyperv: Add callback filter to cpumask_to_vpset()
  Drivers: hv: vmbus: Remove the per-CPU post_msg_page
  clocksource: hyper-v: make sure Invariant-TSC is used if it is available
  PCI: hv: Enable PCI pass-thru devices in Confidential VMs
  Drivers: hv: Don't remap addresses that are above shared_gpa_boundary
  hv_netvsc: Remove second mapping of send and recv buffers
  Drivers: hv: vmbus: Remove second way of mapping ring buffers
  Drivers: hv: vmbus: Remove second mapping of VMBus monitor pages
  swiotlb: Remove bounce buffer remapping for Hyper-V
  Driver: VMBus: Add Devicetree support
  dt-bindings: bus: Add Hyper-V VMBus
  Drivers: hv: vmbus: Convert acpi_device to more generic platform_device
  ...
2023-04-27 17:17:12 -07:00
Linus Torvalds 8ccd54fe45 virtio,vhost,vdpa: features, fixes, cleanups
reduction in interrupt rate in virtio
 perf improvement for VDUSE
 scalability for vhost-scsi
 non power of 2 ring support for packed rings
 better management for mlx5 vdpa
 suspend for snet
 VIRTIO_F_NOTIFICATION_DATA
 shared backend with vdpa-sim-blk
 user VA support in vdpa-sim
 better struct packing for virtio
 
 fixes, cleanups all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio,vhost,vdpa: features, fixes, and cleanups:

   - reduction in interrupt rate in virtio

   - perf improvement for VDUSE

   - scalability for vhost-scsi

   - non power of 2 ring support for packed rings

   - better management for mlx5 vdpa

   - suspend for snet

   - VIRTIO_F_NOTIFICATION_DATA

   - shared backend with vdpa-sim-blk

   - user VA support in vdpa-sim

   - better struct packing for virtio

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits)
  vhost_vdpa: fix unmap process in no-batch mode
  MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS
  tools/virtio: fix build caused by virtio_ring changes
  virtio_ring: add a struct device forward declaration
  vdpa_sim_blk: support shared backend
  vdpa_sim: move buffer allocation in the devices
  vdpa/snet: use likely/unlikely macros in hot functions
  vdpa/snet: implement kick_vq_with_data callback
  virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support
  virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
  vdpa/snet: support the suspend vDPA callback
  vdpa/snet: support getting and setting VQ state
  MAINTAINERS: add vringh.h to Virtio Core and Net Drivers
  vringh: address kdoc warnings
  vdpa: address kdoc warnings
  virtio_ring: don't update event idx on get_buf
  vdpa_sim: add support for user VA
  vdpa_sim: replace the spinlock with a mutex to protect the state
  vdpa_sim: use kthread worker
  vdpa_sim: make devices agnostic for work management
  ...
2023-04-27 17:05:34 -07:00
Linus Torvalds 0835b5ee87 pstore update for v6.4-rc1
- Revert pmsg_lock back to a normal mutex (John Stultz)

Merge tag 'pstore-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore update from Kees Cook:

 - Revert pmsg_lock back to a normal mutex (John Stultz)

* tag 'pstore-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore: Revert pmsg_lock back to a normal mutex
2023-04-27 17:03:40 -07:00