Bcachefs, an Introduction/Exploration

156 points by marbu 10 months ago

18 comments

amluto 10 months ago

Here’s my pet peeve regarding RAID: no RAID system I’ve ever used gracefully handles disks that come and go. Concretely: start with two disks in RAID1. Remove one. Mount in degraded mode. Write to a file. Unmount. Reconnect the removed disk. Mount again with both disks.

The results range from annoying (need to restore / “resilver” and have no redundancy until it’s done; massively increased risk of data loss while doing so due to heavy IO load without redundancy and pointless loss of the redundancy that already exists) to catastrophic (outright corruption). The corollary is that RAID invariably works poorly with disks connected over an interface that enumerates slowly or unreliably.

Yet most competent active-active database systems have no problems with this scenario!

I would love to see a RAID system that thinks of disks as nodes, properly elects leaders, and can efficiently fast-forward a disk that’s behind. A pile of USB-connected drives would work perfectly, would come up when a quorum was reached, and would behave correctly when only a varying subset of disks is available. Bonus points for also being able to run an array that spans multiple computers efficiently, but that would just be icing on the cake.
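To make the "disks as nodes" idea concrete, here is a toy Python sketch of the behaviour the comment asks for: writes need a quorum, the most up-to-date disk acts as leader, and a disk that was away replays only the writes it missed instead of being fully resilvered. Everything in it (`Disk`, `has_quorum`, `elect_leader`, `fast_forward`, `write`, the in-memory log) is a hypothetical illustration, not how bcachefs, btrfs, ZFS, or md actually work.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical in-memory stand-in for one member of a mirror."""
    name: str
    generation: int = 0                          # last write this disk has applied
    blocks: dict = field(default_factory=dict)   # block number -> data
    log: list = field(default_factory=list)      # (generation, block, data) history

    def apply(self, generation, block, data):
        self.blocks[block] = data
        self.log.append((generation, block, data))
        self.generation = generation


def has_quorum(online, total):
    # A strict majority: any two quorums overlap, so histories cannot diverge.
    return len(online) > total // 2


def elect_leader(online):
    # The disk that has seen the most writes is the source of truth.
    return max(online, key=lambda d: d.generation)


def fast_forward(disk, leader):
    """Replay only the writes `disk` missed; return how many were replayed."""
    missed = [entry for entry in leader.log if entry[0] > disk.generation]
    for generation, block, data in missed:
        disk.apply(generation, block, data)
    return len(missed)


def write(online, total, block, data):
    """Write one block to every online disk, catching up stragglers first."""
    if not has_quorum(online, total):
        raise RuntimeError("no quorum: refusing to write")
    leader = elect_leader(online)
    for disk in online:
        fast_forward(disk, leader)
    generation = leader.generation + 1
    for disk in online:
        disk.apply(generation, block, data)
```

With three disks, writing while one is unplugged and then calling `write` again with all three online brings the returning disk up to date by replaying just the missed entries. A real design would also have to bound the log, persist the generation number safely, and cope with a disk that fails mid-replay, none of which this sketch attempts.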

LeoPanthera 10 months ago

The "why not btrfs" line boils down to "it took a long time to be stable".

That's a weird argument. Even if it's true, it is now stable, and has been for a long time. btrfs has long been my default, and I'd be wary of switching to something newer just because someone was mad that development took a long time.

Tobu 10 months ago

> Error handling on CRC read error
> 2 or more copies of file, CRC on error, read other copy, data returned to userspace, does not correct bad copy

That's been implemented; in Linux 6.11 bcachefs will correct errors on read. See

> - Self healing on read IO/checksum error

in https://lore.kernel.org/linux-bcachefs/73rweeabpoypzqwyxa7hld7tnkskkaotuo3jjfxnpgn6gg47ly@admkywnz4fsp/

This makes it possible to scrub from userspace by walking and reading everything (tar -c /mnt/bcachefs >/dev/null).
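The walk-and-read approach can also be sketched in a few lines of Python, which may be useful because, as far as I know, GNU tar minimizes I/O when it detects that the archive is /dev/null and so may not actually read the file data. The function name `read_everything` and its parameters are made up for this example. It only forces reads through the filesystem; whether a given read reaches the disk (rather than the page cache) or exercises every replica is outside what this sketch controls.

```python
import os


def read_everything(root, chunk_size=1 << 20):
    """Read every regular file under `root` and discard the data.

    Returns (total_bytes_read, paths_that_failed). A read error such as
    EIO is recorded and the walk continues with the next file.
    """
    total = 0
    failed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                failed.append(path)
    return total, failed
```

Calling `read_everything("/mnt/bcachefs")` on a mounted filesystem would return the number of bytes read and the list of files that could not be read in full.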

ajb 10 months ago

Bcachefs was merged into the kernel only months ago, and had an immediate flurry of bug fixes due to the additional testing this brought. (It was in development out of tree for some years before that.) That's the level of maturity that it is at. I think there's a hope that it will become more trustworthy than btrfs due to the developer's success with bcache.

guenthert 10 months ago

Hmmh, under "Why bcachefs?" we find

- Stability

but also

- Constant refactorings

and later

"Disclaimer, my personal data is stored on ZFS"

A bit troubling, I find.

"RAID0 behavior is default when using multiple disks"

Never have I ever had the need for RAID0, nor have I seen a customer using it. I think it was at one time popular with gamers before SSDs became popular and cheap.

"RAID 5/6 (experimental): This is referred to as erasure coding and is listed as “DO NOT USE YET”"

Well, you got to start somewhere, but a comparison with btrfs and ZFS seems premature.

ysleepy 10 months ago

I wonder why ZFS is marked as not having de-dupe (deduplication).

AFAIK ZFS has had deduplication support for a very long time (2009) and now even does opportunistic block cloning with much less overhead.

frankjr 10 months ago

> btrfs Encryption Y

btrfs doesn't have built-in encryption.

> ZFS Encryption Y

I cannot find the discussion right now, but I remember reading that they were considering a warning when enabling encryption because it was not really stable and people were running into crashes.

https://github.com/openzfs/zfs/issues?q=is%3Aissue+label%3A%22Component%3A+Encryption%22+is%3Aopen+label%3A%22Type%3A+Defect%22

linsomniac 10 months ago

I had really high hopes for HAMMER2, including that it would one day be ported to Linux, but it seems to have remained firmly planted in DragonFly BSD and it's not really clear what the status is. https://en.wikipedia.org/wiki/HAMMER2

xxmarkuski 10 months ago

I've been running bcache, with LVM/LUKS and XFS on top, for more than 5 years on my desktop. It has been stable, and partition manipulations such as resizes worked without problems, albeit the tooling is not so well supported.

I bought a new SSD and HDD for my desktop this year and looked into running bcachefs because it offers caching as well as native encryption and COW. I determined that it is not production ready yet for my use case; my file system is the last thing I want to be a beta tester of. I investigated using bcache again, but opted for LVM caching, as it offers better tooling and saves one layer of block devices (with LUKS and btrfs on top). Performance is great and partition manipulations also worked flawlessly.

Hopefully bcachefs gains more traction and becomes ready for production use, as it combines several useful features. My current setup still feels like making compromises.

gigatexal 10 months ago

> ZFS, pioneering COW filesystem, ... commendable, its block-based design diverges from modern extent-based systems due to complexities in implementing extents with snapshots.

Why is this a bad thing?

curt15 10 months ago

How well does bcachefs handle databases and VMs? Those workloads are well-known to be btrfs' kryptonite, whereas ZFS seems to tolerate them pretty well as long as one sets the correct recordsize (example: https://www.enterprisedb.com/blog/postgres-vs-file-systems-performance-comparison).

Liftyee 10 months ago

An interesting analysis. I can't stop my brain from parsing the title as "B C A Chefs".

sevg 10 months ago

I recently tried btrfs on a new USB thumb drive. I immediately got hard freezes of my main (Linux) OS while working with the USB stick.

Never again.

I eagerly await bcachefs reaching maturity!

commandersaki 10 months ago

I’m pretty keen to try bcachefs. Has anyone successfully set it up as the root filesystem on a Raspberry Pi?

whalesalad 10 months ago

Had to do a double-take on the UI of this blog. It looks identical to my notetaking app, Trilium.

tripdout 10 months ago

Does it allow both shrinking and growing the FS? Really wish ZFS allowed shrinking.

jcalvinowens 10 months ago

The idea that a brand new filesystem might be more reliable than good ol' BTRFS, which Facebook runs on basically their entire infrastructure, is downright laughable to me.

Btrfs is also far more reliable than ZFS in my view, because it has far, far more real-world testing, and is also much more actively developed.

Magical perfect elegant code isn't what makes a good filesystem: real-world testing, iteration, and bugfixing are. BTRFS has more of that right now than anything else ever has.

NKosmatos 10 months ago

Mandatory xkcd comic: https://xkcd.com/927 (replace "standards" with "FS")
