The State of ZFS on Linux

201 points by ferrantim over 10 years ago

19 comments

ownedthx over 10 years ago
At a previous job, we built a proof-of-concept Sinatra service (i.e., an HTTP/RESTful service) that would, on a certain API call, clone from a specified snapshot and also create an iSCSI target for that new clone. This was on OpenIndiana initially, then some other variant of that OS as a second attempt.

The client making the HTTP request was iPXE; so, every time the machine booted, you'd get yourself a fresh clone + iSCSI target, we'd mount that iSCSI target in iPXE, which would then hand the iSCSI target off to the OS, and away you'd go.

The fundamental problem we hit was that there was a linear delay for every new clone; the delay seemed to be 'number of clones * 0.05 seconds' or so. This was on extremely fast hardware. It was the ZFS clone command that was going too slowly.

Around 500 clones, we'd notice these 10-20 second delays. The reason that hurt so badly is that, to our understanding, it wasn't safe to run ZFS or iSCSI commands in parallel; the Sinatra service was responsible for serializing all ZFS/iSCSI commands.

So my questions to the author:

1) Does this 'delay per clone' ring familiar to you? Does ZFS on Linux have the same issue? It was a killer for us, and I eventually found a thread that implied it would never get fixed in Solaris-land.

2) Can you execute concurrent ZFS CLI commands on the OS? Or is that dangerous, like we found it to be on Solaris?
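
For context, a minimal sketch of the commands a service like this would serialize; dataset, zvol, and GUID values here are hypothetical, and the iSCSI step assumes OpenIndiana's COMSTAR stack:

    # One-time setup: a golden zvol with a snapshot to clone from
    zfs snapshot tank/golden@base

    # Per boot request: clone the snapshot (the step with the linear slowdown),
    # then expose the clone as an iSCSI LUN via COMSTAR
    zfs clone tank/golden@base tank/clones/node42
    sbdadm create-lu /dev/zvol/rdsk/tank/clones/node42
    stmfadm add-view 600144f0...   # LU GUID as reported by sbdadm
    # (plus itadm create-target and host mappings, omitted here)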

ryao over 10 years ago
I am the author. Feel free to respond with questions. I will be watching for questions throughout the day.

astral303 over 10 years ago
Tried using ZFS in earnest and got spooked; felt it was not production-ready. Wanted to use ZFS for MongoDB on Amazon Linux (primarily for compression, but also for snapshot functionality for backups). Tried 0.6.2.

Ended up running into a situation where a snapshot delete hung and none of my ZFS commands were returning. The snapshot delete was not killable with kill -9. https://github.com/zfsonlinux/zfs/issues/1283

Also, under load I encountered a kernel panic or a hang (I forget which); it turned out to be because the Amazon Linux kernel comes compiled with no preemption. It seems that "voluntary preemption" is the only setting that's reliable. https://github.com/zfsonlinux/zfs/issues/1620

That left a bad taste in my mouth. Might be worth trying again with 0.6.3.

I am still leafing through the issues closed in 0.6.3, but based on what I see, 0.6.2 did not seem production-ready enough for me:

https://github.com/zfsonlinux/zfs/issues?page=2&q=is%3Aissue+milestone%3A0.6.3
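
The preemption setting the second issue hinges on can be checked on any running kernel; the config file path varies by distro, but typically:

    # Show which preemption model the kernel was compiled with
    grep 'CONFIG_PREEMPT' /boot/config-"$(uname -r)"
    # CONFIG_PREEMPT_NONE=y       -> no preemption (the Amazon Linux case above)
    # CONFIG_PREEMPT_VOLUNTARY=y  -> voluntary preemption
    # CONFIG_PREEMPT=y            -> full preemption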

WestCoastJustin over 10 years ago
For anyone interested: over the past couple of weeks I have heavily researched ZFS and created a couple of screencasts about my findings [1, 2].

[1] https://sysadmincasts.com/episodes/35-zfs-on-linux-part-1-of-2

[2] https://sysadmincasts.com/episodes/37-zfs-on-linux-part-2-of-2

agapon over 10 years ago
Great blog post! Something from personal experience: OpenZFS on FreeBSD feels mostly like a port of illumos ZFS, where most of the non-FreeBSD-specific changes happen in illumos and then get ported downstream. On the other hand, OpenZFS on Linux feels like a fork. There is certainly a stream of changes from illumos, but there's a rather non-trivial amount of change to the core code that happens in ZoL.

bussiere over 10 years ago
I may have read the article too fast, but what about cryptography in ZoL? Is there a way to encrypt data on ZoL? Regards, and thanks for the article.

a2743906 over 10 years ago
I'm using ZFS right now, because I need something that cares about data integrity, but the fact that it will never be included in Linux is a very big issue for me. Every time you upgrade your kernel, you have to upgrade the separate modules as well - this is the point where bad things can happen. I will definitely be looking into Btrfs once it is more reliable. For now I'm having a bit of a problem with SSD caching and performance, but I don't care about it enough for it to be relevant; I just use the filesystem to store data safely, and ZFS does an OK job.
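
The module-versus-kernel dance is typically handled by DKMS; assuming a DKMS-packaged ZoL install (the common case on Debian/Ubuntu and EPEL), checking and rebuilding after a kernel upgrade looks roughly like this:

    # List which kernels have spl/zfs modules built for them
    dkms status

    # Build modules for the running kernel if they are missing
    dkms autoinstall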

Andys over 10 years ago
I have used ZFSonLinux on my laptop and workstation for a couple of years now, with Ubuntu, without any major problems. When I tried to use it in production, I didn't get data loss, but I hit problems:

* Upgrading is a crapshoot: twice, it failed to remount the pool after rebooting and needed manual intervention.

* Complete pool lockup: in an earlier version, the pool hung and I had to reboot to get access to it again. If you look through the issues on GitHub, you'll see weird lockups or kernel whoopsies are not uncommon.

* Performance problems with NFS: this is partially due to the Linux NFS server sucking, but ZFS made it worse. It used a lot of CPU compared to Solaris or FreeBSD, and was slow. It's even slow looping back to localhost.

* Slower on SSDs: ZFS does more work than other filesystems, so I found that it used more CPU time and had more latency on pure SSD-backed pools.

* There are alternatives to L2ARC/ZIL on Linux that are built in and work with any filesystem, such as "flashcache" on Ubuntu.

For these reasons, I think ZoL is good for "near-line" and backup storage, where you have a large RAID of HDDs and need stable, checksummed data storage, but not for mission-critical stuff like fileservers or DBs.

ryao over 10 years ago
I have been inundated with feedback from a wide range of channels. If I did not reply to a comment today, I will try to address it tomorrow.

ashayh over 10 years ago
ZFS, and most other file systems, are all about _one_ computer system.

While ZFS's data integrity features may be useful, they don't prevent the wide variety of things that can go wrong on a _single_ computer. You still need site redundancy, multiple physical copies, recovery from user errors, etc.

Large, modern enterprises are better off keeping data on application-layer "filesystems" or databases, since those can more easily aggregate the storage of hundreds or thousands of physical nodes. ZFS doesn't help with anything special here.

For the average home user, ZoL modules are a hassle to maintain. You are better off setting up FreeNAS on a second computer if you really want to use ZFS. Otherwise there is not much it offers over XFS, ext4, or Btrfs.

The 'ssm' set of tools for managing LVM and the other built-in file systems is easier for home users with regular needs.

GlusterFS and others are distributed file systems, but they suffer from additional complexity at the OS and management layers.

mbreese over 10 years ago
I love ZFS, and I love working with Linux, but I can't help but worry about using ZFS on Linux. Without the needed support from the kernel side, I don't see how it can be useful for production. I can see using it on personal workstations, but in any situation where data loss would be critical, you just won't see any uptake. Because of the licensing, ZFS can never be anything more than a second-class citizen on Linux.

That said, I run a FreeBSD ZFS file server just to host NFS that is exported to a Linux cluster. At least on FreeBSD, there is first-class integration of ZFS into the OS. (I used to also maintain a Sun cluster that had a Solaris ZFS storage server exporting NFS to Linux nodes, which is where I first got a taste for ZFS.)

So, I guess my main question is: in what use cases is ZFS on Linux so useful when native FreeBSD/ZFS support exists?

I'm not saying it can't be done - I just don't understand _why_.
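
The FreeBSD-server/Linux-client arrangement is straightforward to replicate; a rough sketch, with the dataset name and export options purely illustrative:

    # FreeBSD: enable NFS services and share a ZFS dataset
    sysrc rpcbind_enable=YES nfs_server_enable=YES mountd_enable=YES
    zfs set sharenfs="-maproot=root -network 192.168.1.0/24" tank/export
    service nfsd start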

leonroy over 10 years ago
I've used ZFS (FreeNAS) for quite a few years and find it pretty flawless. Trust it's not too dumb a question, but what advantage is there to running ZFS on Linux when you can run it on variants of Solaris or BSD just fine?

DiabloD3 over 10 years ago
I've used ZoL since it was created, and zfs-fuse before that. I ran it on my workstation for a few years (managing a 4x750GB RAID-Z (ZFS's RAID-5 implementation), with ext3 on a 2x400GB mdadm RAID 1 root), then swapped to 2x2TB Btrfs native RAID 1 (Btrfs being Oracle's ZFS competitor, which seems largely abandoned, although I see commits in the kernel changelog periodically), and am now back to ZFS on a dedicated file server using 2x128GB Crucial M550 SSDs + 2x2TB HDDs, set up as mdadm RAID 1 + XFS on the first 16GB of the SSDs for root[2], 256MB on each for the ZIL[1], the rest as L2ARC[3], and the 2x2TB as a ZFS mirror. I honestly see no reason to use any other FS for a storage pool, and if I could reliably use ZFS as root on Debian, I wouldn't even need that XFS root in there.

All of this said, I get RAID-0'ed-SSD-like performance with very high data reliability, without having to shell out the money for 2TB of SSD. And before someone says "what about bcache/flashcache/etc.": ZFS had SSD caching before those existed, and ZFS imo does it better due to all the strict data reliability features.

[1]: ZFS treats multiple ZIL devices as round-robin (RAID 0 speed without the failure of one device taking down all your RAID 0'ed devices). You need to write multiple files concurrently to get the full RAID 0-like performance out of that, because ZFS blocks on writing consecutive inodes, allowing no more than one in flight per file at a time. The ZIL is only used for O_SYNC writes, and ZFS writes concurrently to both the ZIL and the storage pool; i.e., the ZIL is not a write-through cache but a true journal.

The failure of a ZIL device is only "fatal" if the machine also dies before ZFS can write to the storage pool, and the mode of the failure cannot leave the filesystem in an inconsistent state. ZFS does not currently support RAID for ZIL devices internally, nor is it recommended to hijack this and use mdadm to force it. The ZIL only exists to make O_SYNC work at SSD speeds.

[2]: /tank and /home are on ZFS; the rest of the OS takes up about 2GB of that 16GB. I oversized it a tad, I think. If I ever rebuild the system, I'm going with 4GB.

[3]: L2ARC is second-level storage for ZFS's in-memory cache, called the ARC. The ARC is a highly advanced caching system designed to increase performance by obsessively caching often-used data rather than being just a blind inode cache like the OS's usual one, and it is independent of the OS's disk cache. L2ARC is sort of like a write-through cache, but more advanced: it is a persistent version of the ARC that survives reboots and is much larger than system memory. L2ARC is implicitly round-robin (like the ZIL described above) and survives the loss of any L2ARC device with zero issues (the device is just disabled; no unwritten data is stored there). L2ARC does not suffer from the non-concurrent-writing issue that the ZIL "suffers" from (by design).
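
A pool shaped like the one described, with mirrored data disks plus SSD partitions for log and cache, would be created along these lines (device names are illustrative):

    # Mirrored 2x2TB data vdev; two SSD partitions each for ZIL (log)
    # and L2ARC (cache). Multiple log/cache devices are used round-robin,
    # as described in the footnotes above.
    zpool create tank mirror /dev/sdc /dev/sdd \
        log /dev/sda2 /dev/sdb2 \
        cache /dev/sda3 /dev/sdb3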

turrini over 10 years ago
I created the script below a while (a year) ago. It (deb)bootstraps a working Debian Wheezy with ZFS on root (rpool) using only 3 partitions: /boot (128M), swap (calculated automatically), and rpool (mirrored or raidz'ed according to the number of disks you have).

All the comments are in Brazilian Portuguese; I haven't had time to translate them to English. Someone could do that and file a pull request.

https://github.com/turrini/scripts/blob/master/debian-zol.sh

Hope you like it.
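
Going by the description above, the end state looks roughly like this (device names and the mirror/raidz choice are illustrative; the script itself automates the partitioning and debootstrap):

    # Per disk: p1 = /boot (128M), p2 = swap (auto-sized), p3 = rpool member
    zpool create rpool mirror /dev/sda3 /dev/sdb3
    # or, with three or more disks:
    # zpool create rpool raidz /dev/sda3 /dev/sdb3 /dev/sdc3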

andikleen over 10 years ago
When swap doesn't work, mmap is unlikely to work correctly either.

Figuring out why that is so is left as an exercise for the poster.

nailer over 10 years ago
Putting production data on a driver maintained outside the mainline Linux kernel is a bad idea.

That isn't a licensing argument - I'm happy to use a proprietary nvidia.ko for gaming tasks, for example, because I won't be screwing up anyone's data if it breaks.

mrmondo over 10 years ago
While I like most parts of ZFS, these days BTRFS is stable, performs well, and has a decent feature set. We moved a good portion of our production servers from ZFS and EXT4 to BTRFS last year - and we haven't looked back.

seoguru over 10 years ago
I have a laptop running Ubuntu with a single SSD. Does it make sense to run it with ZFS to get compression and snapshots? If I add a hard drive, does it again make sense (perhaps using the SSD as cache (ARC?))?
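
Compression and snapshots work fine on a single-disk pool; a sketch of both setups, with pool and device names purely illustrative:

    # Single-SSD pool with compression enabled from the start
    zpool create -O compression=lz4 tank /dev/sda3
    zfs snapshot tank@pre-upgrade

    # With a hard drive added, one option is a pool on the HDD
    # with an SSD partition as L2ARC:
    zpool create bulk /dev/sdb
    zpool add bulk cache /dev/sda4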

awonga over 10 years ago
I've looked into ZFS before for distributions like FreeNAS; is there any solution on the horizon for the massive memory requirements?

For example, needing 8-16 GB of RAM for something like a multi-TB home NAS is high.
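
Much of the quoted figure is the ARC, whose ceiling is tunable on ZoL (by default it can grow to roughly half of RAM); the really large requirements mostly apply when deduplication is enabled, since its tables live in the ARC. A sketch of capping it:

    # Cap the ARC at 2 GiB (value in bytes); takes effect when the module loads
    echo "options zfs zfs_arc_max=2147483648" >> /etc/modprobe.d/zfs.conf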