Commit Graph

1248722 Commits (d8764d347bd737efec00fae81133ffad0ae084bb)

Author SHA1 Message Date
Michal Simek d8764d347b dt-bindings: firmware: xilinx: Describe soc-nvmem subnode
Describe soc-nvmem subnode as the part of firmware node. The name can't be
pure nvmem because dt-schema already defines it as array property that's
why different name should be used.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/24fe6adbf2424360618e8f5ca541ebfd8bb0723e.1706692641.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-02-06 08:01:32 +01:00
Michal Simek dbcd27526e dt-bindings: soc: xilinx: Add support for KV260 CC
When DT overlay is applied at run time compatible string or model AFAIK is
not updated. But when fdtoverlay tool is used it actually creates full
description for used SOM and carrier card(CC). That's why there is no
reason to use generic SOM name and its compatible strings because they are
not properly reflected in newly created DT.
Composing dt overlays together was introduced by commit 7a4c31ee87
("arm64: zynqmp: Add support for Xilinx Kria SOM board") and later renamed
by commit 45fe0dc4ea ("arm64: xilinx: Use zynqmp prefix for SOM dt
overlays").
DTB selection is done prior booting OS that's why there is no need to do
run time composition for SOM and CC combination. And user space can use
compatible string and all listed revisions to figured it out which SOM and
CC combinations OS is running at.

Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/14c184225cc4f0a61da5f8c98bc0767f8deba0df.1706019781.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-31 10:09:00 +01:00
Michal Simek f935a52d03 dt-bindings: soc: xilinx: Add support for K26 rev2 SOMs
Revision 2 is SW compatible with revision 1 but it is necessary to reflect
it in model and compatible properties which are parsed by user space.
Rev 2 has improved a power on boot reset and MIO34 shutdown glich
improvement done via an additional filter in the GreenPak chip.

Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/90e1a393154c3d87e8ee7dc9eef07fc937c1eaf7.1706019397.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-31 10:08:55 +01:00
Michal Simek 237a1bbc32 arm64: zynqmp: Align usb clock nodes with binding
dwc3-xilinx.yaml defines 2 clocks which are not defined that's why define
them (bus_early clock is moved to bus_clk in glue logic).
With also describing kv260 assigned clock rates with assigned clocks.
Also add missing status property to standard dwc3 core.

Link: https://lore.kernel.org/r/aa4c65a8997c7a65f23da3a3088bb5eb64281307.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:13:01 +01:00
Michal Simek 672aa9abb6 arm64: zynqmp: Comment all smmu entries
SMMU is disabled by default and not all masters can be enabled at the same
time because of limited number of entries. That's why comment all iommu
properties but keep them for reference in DT. In XEN case they should be
added back and Xen should have SMMU enabled by default.
Also add IDs for DP and DPDMA.

Link: https://lore.kernel.org/r/bdb012b1c86abb0d9aa88954196d886d1283e9b1.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:13:00 +01:00
Michal Simek 8258cf0d4a arm64: zynqmp: Rename i2c?-gpio to i2c?-gpio-grp
Anything ending with gpio/gpios is taken as gpio phande/description which
is reported as the issue coming from gpio-consumer.yaml schema.
That's why rename the gpio suffix to gpio-grp to avoid name collision.

Link: https://lore.kernel.org/r/94f633e26b7b16cabddb8c7210c2e79208c364da.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Tejas Bhumkar ea470fe330 arm64: zynqmp: Disable Tri-state for MIO38 Pin
gpio38 is used in SOM's kv260 to reset the Ethernet PHY.
At present, HW reset is not working properly as Tri-state 
is enabled for MIO38, causing inappropriate PHY register reads.

Disabled Tri-state for MIO38 to make HW reset work.

Tri-state disable :
ZynqMP> md 0xFF180208 2
ff180208: 00bfe7a3 00000540

Tri-state enable :
ZynqMP> md 0xFF180208 2
ff180208: 00bfe7e3 00000540

Signed-off-by: Tejas Bhumkar <tejas.arvind.bhumkar@amd.com>
Link: https://lore.kernel.org/r/9f8a0687be407a8ffad610087074e94ebc4f5982.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Michal Simek 24e85ff034 arm64: zynqmp: Remove incorrect comment from kv260s
Remove incorrect comment about required nodes by spec. In past gem3 was the
part of SOM specification but it has been revisit by introducting KR260.

Link: https://lore.kernel.org/r/0bd166345ce78097a1ff8d4307545f5026c92267.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Michal Simek 2385a6d8ed arm64: zynqmp: Introduce u-boot options node with bootscr-address
Add u-boot options node with details about bootscr-address.
c&p description from dtschema/schemas/options/u-boot.yaml:

"Holds the full address of the boot script file. It helps in making
automated flow easier by fetching the 64bit address directly from DT.
Value should be automatically copied to the U-Boot 'scriptaddr' variable.
When it is defined, bootscr-ram-offset property should be ignored.
Actually only one of them should be present in the DT."

Address is generic for all zynqmp boards because all of them have DDR
starting from 0. Custom boards should revisit the location and aligned it
based on their needs.

Link: https://lore.kernel.org/r/4f5978d5a26fe0cd0cc6e54a97da1517bb925c01.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Michal Simek 97fed7ecbb arm64: zynqmp: Fix comment to be aligned with board name.
The board was renamed from zc1275 to zcu1275 but name in comment wasn't
updated.

Fixes: 370b0e900f ("arm64: zynqmp: Change zc1275 board name to zcu1275")
Link: https://lore.kernel.org/r/6e4215807f535d2e2b2b8055aa8574c748fe22e4.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Thippeswamy Havalige 3473622299 arm64: zynqmp: Update ECAM size to discover up to 256 buses
Update ECAM size to discover up to 256 buses.

Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
Link: https://lore.kernel.org/r/4f7621a790f4aa35b3e7f74683d3ae4ffe820667.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Michal Simek 46de36a489 arm64: zynqmp: Describe assigned-clocks for uarts
Describe assigned-clocks for both uarts. SOM is using this functionality.

Link: https://lore.kernel.org/r/21579f273554a19bc95a40f49956793b5261627f.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:10 +01:00
Michal Simek be5df5e0c1 arm64: zynqmp: Setup default si570 frequency to 156.25MHz
All si570 mgt chips have factory default 156.25MHz but DT changed it to
148.5MHz. After tracking it is pretty much c&p fault taken from Zynq
zc702/zc706 boards where 148.5MHz was setup as default because it was
requirement for AD7511 chip available on these boards.
ZynqMP board don't contain this chip that's why factory default frequency
can be used.

Link: https://lore.kernel.org/r/65a53776cbc5e4586f58da57a4b99e4d5c6c26a7.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:09 +01:00
Srinivas Neeli 1993f67646 arm64: zynqmp: Add resets property for CAN nodes
Added resets property for CAN nodes.

Signed-off-by: Srinivas Neeli <srinivas.neeli@amd.com>
Link: https://lore.kernel.org/r/7bf0cc230f3c25010f9545f3f92f6f15a95d21ec.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:09 +01:00
Ilias Apalodimas 06d22ed6b6 arm64: zynqmp: Add an OP-TEE node to the device tree
Since the zynqmp boards can run upstream OP-TEE, and having the DT node
present doesn't cause any side effects add it in case someone tries to
load OP-TEE.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/9ee7e8c263c453a8c9e6bc3b91fad78b0f54edc0.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:09 +01:00
Neal Frager 34e48901e7 arm64: zynqmp: Add output-enable pins to SOMs
Now that the zynqmp pinctrl driver supports the tri-state registers, make
sure that the pins requiring output-enable are configured appropriately for
SOMs.

Without it, all tristate setting for MIOs, which are not related to SOM
itself, are using default configuration which is not correct setting.
It means SDs, USBs, ethernet, etc. are not working properly.

In past it was fixed through calling tristate configuration via bootcmd:
usb_init=mw 0xFF180208 2020
kv260_gem3=mw 0xFF18020C 0xFC0 && gpio toggle gpio@ff0a000038 && \
  gpio toggle gpio@ff0a000038

Signed-off-by: Neal Frager <neal.frager@amd.com>
Link: https://lore.kernel.org/r/9270938b48c8939ac5dca4ac2c59f1c4a8c564d8.1704728353.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:10:09 +01:00
Michal Simek 5710ea6a90 arm64: zynqmp: Rename zynqmp-power node to power-management
Rename zynqmp-power node name to power-management which is more aligned
with generic node name recommendation.

Link: https://lore.kernel.org/r/bf24cde92c2b9e2824847687fab69fc25c533d53.1703161663.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:03:25 +01:00
Michal Simek e83e3c55e4 dt-bindings: firmware: xilinx: Sort node names (clock-controller)
Nodes should be sorted that's why move clock-controller to the top of list.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/ccb6bd5f4d1d28983c73497ada596e893fece499.1703161663.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:03:07 +01:00
Michal Simek 6f9c4e691f dt-bindings: firmware: xilinx: Describe missing child nodes
Firmware node has more than fpga, aes and clock child nodes but also power,
reset, gpio, pinctrl and pcap which are not described yet.
All of them have binding in separate files but there is missing connection
to firmware node that's why describe it.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/1d7988cfadf3554d11f0779f96a670b4fd86ce5a.1703161663.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:03:07 +01:00
Michal Simek 93b7a95f6d dt-bindings: firmware: xilinx: Fix versal-fpga node name
Based on commit 83a368a3fc ("docs: dt-bindings: add DTS Coding Style
document") using underscore ('_') in node name is not recommended that's
why switch to dash ('-').

Acked-by: Xu Yilun <yilun.xu@intel.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/6779af2f9cc21c912f10cf310388d99b980800b2.1702996281.git.michal.simek@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:01:38 +01:00
Jay Buddhabhatti 8e312baacc dt-bindings: firmware: versal: add versal-net compatible string
Add dt-binding documentation for Versal NET platforms.
Versal Net is a new AMD/Xilinx SoC.

The SoC and its architecture is based on the Versal ACAP device.
The Versal Net device includes more security features in the
platform management controller (PMC) and increases the number of
CPUs in the application processing unit (APU) and the real-time
processing unit (RPU).

Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/1705406326-2947516-1-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-01-22 14:00:32 +01:00
Linus Torvalds 6613476e22 Linux 6.8-rc1 2024-01-21 14:11:32 -08:00
Linus Torvalds 35a4474b5c More bcachefs updates for 6.7-rc1
- assorted prep work for disk space accounting rewrite
  - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
    makes our trigger context more explicit
  - A few fixes to avoid excessive transaction restarts on multithreaded
    workloads: fstests (in addition to ktest tests) are now checking
    slowpath counters, and that's shaking out a few bugs
  - Assorted tracepoint improvements
  - Starting to break up bcachefs_format.h and move on disk types so
    they're with the code they belong to; this will make room to start
    documenting the on disk format better.
  - A few minor fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWtjOsACgkQE6szbY3K
 bnbyXRAAsx+yM81TFqsLzRRqf8oocRwf2dj5XzExz9Ig/lYQS5LIVROS2OxwDsAc
 DeaYQSTcph9dkOswCrNR96bBnEgmmZ1ClfVI6WRXvm6vs4rjhSMNbNaVyySrMUVn
 5p/Lsn1/RKl0lWMYlHrdryo+106zRcr6z1Hiv9QCXkXhzdkV8wFYDkfbMveShUsu
 KobC29wvd2EfZr04nqsIXS/y/iRIXhtZqJmFCiAguN70UWrwUwArpELHI5Ve+WPZ
 9VjgFXW6Ka3QxJs/20tX+t24DrC+eDXR44DzQmxwG5mPBBpXkcSk5UgRw/EUag5U
 5+mDZQ5Ei3gvZvUwrilMosVy3pIw0IuvqeqwDGFoFXs1cce01QCMN+NG/dBTQw9i
 KGGxJw5sOrZ8fIiFnypk1M+r9NVtA8MjriLNR5bJjCWPSpWqzkT2HzxFXc6HmTZu
 vsE/AxwC1RLA6B2HZlDEqLOdHE3cofkDiIzWM5ABvb4p118iyk9hE6HhAufk5UdE
 HaG646kGB8pUY/sCxBIOD6K2pgthDFv+fftTM7X+uIazD3bovvPQCEInu48/KAHn
 /KmslSPO0txyjnRFMbXFJvd4Fgfo44GcBCeqGpy3B79aEJ3nroyRZ0qNnnsqj0Gl
 picUWjTn4W561Q1zBXuE/6cLWEp+sfaqYQcM8L3CCitRTVDPaCQ=
 =yd+F
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Some fixes, Some refactoring, some minor features:

   - Assorted prep work for disk space accounting rewrite

   - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
     makes our trigger context more explicit

   - A few fixes to avoid excessive transaction restarts on
     multithreaded workloads: fstests (in addition to ktest tests) are
     now checking slowpath counters, and that's shaking out a few bugs

   - Assorted tracepoint improvements

   - Starting to break up bcachefs_format.h and move on disk types so
     they're with the code they belong to; this will make room to start
     documenting the on disk format better.

   - A few minor fixes"

* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
  bcachefs: Improve inode_to_text()
  bcachefs: logged_ops_format.h
  bcachefs: reflink_format.h
  bcachefs; extents_format.h
  bcachefs: ec_format.h
  bcachefs: subvolume_format.h
  bcachefs: snapshot_format.h
  bcachefs: alloc_background_format.h
  bcachefs: xattr_format.h
  bcachefs: dirent_format.h
  bcachefs: inode_format.h
  bcachefs; quota_format.h
  bcachefs: sb-counters_format.h
  bcachefs: counters.c -> sb-counters.c
  bcachefs: comment bch_subvolume
  bcachefs: bch_snapshot::btime
  bcachefs: add missing __GFP_NOWARN
  bcachefs: opts->compression can now also be applied in the background
  bcachefs: Prep work for variable size btree node buffers
  bcachefs: grab s_umount only if snapshotting
  ...
2024-01-21 14:01:12 -08:00
Linus Torvalds 4fbbed7872 Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs. CPU hotplug.
     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.
 
  - Assorted fixes and improvements for clocksource/event drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmWtTLgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoUXiD/4uN4Ntps8TwxSdg1X11M6++rizg9q9
 EmIfwWcfQQJDM5Ss5FE88ye55NxIOwJ1brYo08+yTAXjnnZ/yNP1BBegHbMNiGil
 NCHye7tYKZle25+hErdgfBB9n6brPz7dPOvV04/wRRWW+9p2ejt/5nEvojkyco9Y
 S9KgBCxkvUqScMbdKKFW1UsThWh2euxwQXRGiWhTPPkbKcVynPvQJjvVyRxn01NS
 eEhTn8YUNcAPT+1YApouGXrSCxo/IzBJ36CxOoCoUfaXcJ6FG1LLeAjNxKZ26Dfs
 Ah0e3Hhyv6KOsBvBNwwabXDwryd6L8rZd8yL2KakI1vIC51uS2wneFy8GCieDVGh
 xmy3U/tfkS0L7pmN+dQW2l4k9PHRNrwvbISKhs0UAHSOgGIMHZcjE6aFbYKru5i4
 1W+dEjiktlceZ94mrEHbLpKmxWH2z5P8m0BzUs4kt3nkaOf6CTUKqa/qdAiU5dv+
 lovKT26L8HBrMXf48I70UpgW/bYzOUGk55sR6hiLTXAelz1z02D1uYHFkshc0NCO
 /O4wvHcgvMM46CtWVbim42AlRcyyWCr+FrY+jvfiG2icOcHPLqc81iHL8EKj7pJl
 IxLgyPHVckgnE5gx+GQ8aDkg/qwCZnj4rFWgub8QMYtjI+pO+9T9kPAYPCxFhP7J
 gmcJxZAB2RnKXA==
 =RD6E
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for time and clocksources:

   - A fix for the idle and iowait time accounting vs CPU hotplug.

     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.

   - Assorted fixes and improvements for clocksource/event drivers"

* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
  clocksource/drivers/ep93xx: Fix error handling during probe
  clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
  clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
  clocksource/timer-riscv: Add riscv_clock_shutdown callback
  dt-bindings: timer: Add StarFive JH8100 clint
  dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
2024-01-21 11:14:40 -08:00
Linus Torvalds 7b297a5cc9 powerpc fixes for 6.8 #2
- 18f14afe28 powerpc/64s: Increase default stack size to 32KB BY: Michael Ellerman
 
 Thanks to:
 Michael Ellerman
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTYs9CDOrDQRwKRmtrJvCLnGrjHVgUCZayxkgAKCRDJvCLnGrjH
 Vv2hAQDwvyYydFw64D7bnaFJDLvOwi3SL02OBaFYV1JTr8rf/QEA8NcTuqXis5o5
 NedFYVE5PhYGWfyPD63aL+JpUKxsXwc=
 =Ud9v
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Aneesh Kumar:

 - Increase default stack size to 32KB for Book3S

Thanks to Michael Ellerman.

* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Increase default stack size to 32KB
2024-01-21 11:04:29 -08:00
Kent Overstreet 249f441f83 bcachefs: Improve inode_to_text()
Add line breaks - inode_to_text() is now much easier to read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet d826cc57c5 bcachefs: logged_ops_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet 8d52ba60c4 bcachefs: reflink_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet b2fa1b633b bcachefs; extents_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet 0560eb9abf bcachefs: ec_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet c6c4ff6507 bcachefs: subvolume_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet 8fed323b14 bcachefs: snapshot_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet d455179fce bcachefs: alloc_background_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 72e0801049 bcachefs: xattr_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 7ffc4daa5f bcachefs: dirent_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet b36425da71 bcachefs: inode_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 82de6207fb bcachefs; quota_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 43314801a4 bcachefs: sb-counters_format.h
bcachefs_format.h has gotten too big; let's do some organizing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 3a58dfbc46 bcachefs: counters.c -> sb-counters.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 12207f49ef bcachefs: comment bch_subvolume
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet d32088f2f2 bcachefs: bch_snapshot::btime
Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 7be0208fc9 bcachefs: add missing __GFP_NOWARN
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet d7e77f53e9 bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet ec4edd7b9d bcachefs: Prep work for variable size btree node buffers
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Su Yue 2acc59dd88 bcachefs: grab s_umount only if snapshotting
When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.

$ cat test.sh
prog=bcachefs

$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots

while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done

$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log

storage:
  dbPath: /mnt/data/

lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
               but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
               which lock already depends on the new lock.

[ 3437.456553]
               the existing dependency chain (in reverse order) is:
[ 3437.457054]
               -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507]        down_read+0x3e/0x170
[ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
[ 3437.458498]        do_syscall_64+0x42/0xf0
[ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
               -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615]        down_read+0x3e/0x170
[ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686]        notify_change+0x1f1/0x4a0
[ 3437.461283]        do_truncate+0x7f/0xd0
[ 3437.461555]        path_openat+0xa57/0xce0
[ 3437.461836]        do_filp_open+0xb4/0x160
[ 3437.462116]        do_sys_openat2+0x91/0xc0
[ 3437.462402]        __x64_sys_openat+0x53/0xa0
[ 3437.462701]        do_syscall_64+0x42/0xf0
[ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
               -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843]        down_write+0x3b/0xc0
[ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493]        vfs_write+0x21b/0x4c0
[ 3437.464653]        ksys_write+0x69/0xf0
[ 3437.464839]        do_syscall_64+0x42/0xf0
[ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
               -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471]        __lock_acquire+0x1455/0x21b0
[ 3437.465656]        lock_acquire+0xc6/0x2b0
[ 3437.465822]        mnt_want_write+0x46/0x1a0
[ 3437.465996]        filename_create+0x62/0x190
[ 3437.466175]        user_path_create+0x2d/0x50
[ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
[ 3437.466791]        do_syscall_64+0x42/0xf0
[ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
               other info that might help us debug this:

[ 3437.469670] 2 locks held by bcachefs/35533:
               other info that might help us debug this:

[ 3437.467507] Chain exists of:
                 sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48

[ 3437.467979]  Possible unsafe locking scenario:

[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434]
                *** DEADLOCK ***

[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
               stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795]  <TASK>
[ 3437.471884]  dump_stack_lvl+0x57/0x90
[ 3437.472035]  check_noncircular+0x132/0x150
[ 3437.472202]  __lock_acquire+0x1455/0x21b0
[ 3437.472369]  lock_acquire+0xc6/0x2b0
[ 3437.472518]  ? filename_create+0x62/0x190
[ 3437.472683]  ? lock_is_held_type+0x97/0x110
[ 3437.472856]  mnt_want_write+0x46/0x1a0
[ 3437.473025]  ? filename_create+0x62/0x190
[ 3437.473204]  filename_create+0x62/0x190
[ 3437.473380]  user_path_create+0x2d/0x50
[ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819]  ? lock_acquire+0xc6/0x2b0
[ 3437.474002]  ? __fget_files+0x2a/0x190
[ 3437.474195]  ? __fget_files+0xbc/0x190
[ 3437.474380]  ? lock_release+0xc5/0x270
[ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
[ 3437.475277]  do_syscall_64+0x42/0xf0
[ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================

In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.

Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().

Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Su Yue 369acf97d6 bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:

[  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
[  406.830676 ] #PF: supervisor read access in kernel mode
[  406.831643 ] #PF: error_code(0x0000) - not-present page
[  406.832487 ] PGD 0 P4D 0
[  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
[  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  406.835796 ] RIP: 0010:kfree+0x62/0x140
[  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
[  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
[  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
[  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
[  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
[  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
[  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
[  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
[  406.841464 ] Call Trace:
[  406.841583 ]  <TASK>
[  406.841682 ]  ? __die+0x1f/0x70
[  406.841828 ]  ? page_fault_oops+0x159/0x470
[  406.842014 ]  ? fixup_exception+0x22/0x310
[  406.842198 ]  ? exc_page_fault+0x1ed/0x200
[  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
[  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
[  406.842842 ]  ? kfree+0x62/0x140
[  406.842988 ]  ? kfree+0x104/0x140
[  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
[  406.843390 ]  kobject_put+0xb7/0x170
[  406.843552 ]  deactivate_locked_super+0x2f/0xa0
[  406.843756 ]  cleanup_mnt+0xba/0x150
[  406.843917 ]  task_work_run+0x59/0xa0
[  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
[  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
[  406.844510 ]  do_syscall_64+0x4e/0xf0
[  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  406.844907 ] RIP: 0033:0x7f0a2664e4fb

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 00fff4dd58 bcachefs: bios must be 512 byte algined
Fixes: 023f9ac9f7 bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Colin Ian King aead3428e8 bcachefs: remove redundant variable tmp
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet b97de45365 bcachefs: Improve trace_trans_restart_relock
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet 46bf2e9cc7 bcachefs: Fix excess transaction restarts in __bchfs_fallocate()
drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00