
Asynchronous filesystem replication with FUSE and rsync

76 points by Lethalman over 10 years ago

13 comments

jewel over 10 years ago
FUSE is definitely the way to go with file synchronization, since the system will never miss a write, and can lazily load data for reads. It means that the entire filesystem doesn't have to fit on every machine that's synced. For a more sophisticated FUSE sync filesystem, be sure to check out orifs:

http://ori.scs.stanford.edu/

The best introduction to orifs is their paper, which is linked from the above site.
pmoriarty over 10 years ago
I wonder if this could be used as a replacement for Vagrant's rsync feature.[1]

From Vagrant's documentation:

    Vagrant can use rsync as a mechanism to sync a folder to the guest
    machine. This synced folder type is useful primarily in situations
    where other synced folder mechanisms are not available, such as when
    NFS or VirtualBox shared folders aren't available in the guest
    machine. The rsync synced folder does a one-time one-way sync from
    the machine running to the machine being started by Vagrant.

The disadvantage of the above is that it's a one-time, one-way sync. SFS would overcome this limitation, if I'm not mistaken.

[1] http://docs.vagrantup.com/v2/synced-folders/rsync.html
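The one-time, one-way sync that Vagrant's rsync folder performs can be sketched in a few lines (a simplified illustration, not Vagrant's actual implementation; `sync_once` is a hypothetical name):

```python
import os
import shutil

def sync_once(src: str, dst: str) -> list[str]:
    """Copy files from src to dst when the source copy is newer.

    A one-time, one-way sync in the spirit of `rsync -a src/ dst/`:
    nothing in dst is ever pushed back to src.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel) if rel != "." else dst
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            # Copy if missing at the destination or if the source is newer.
            if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
                shutil.copy2(s, d)  # copy2 preserves mtime, like rsync -t
                copied.append(os.path.join(rel, name) if rel != "." else name)
    return copied
```

Since `shutil.copy2` preserves mtimes, running it again copies nothing until the source changes, but deletions and guest-side edits are never propagated back, which is exactly the limitation the Vagrant docs describe.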
0x0 over 10 years ago
Wouldn't the inotify API be a better way to detect file writes rather than writing a full FUSE wrapper?
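For reference, the inotify API mentioned here takes only a few lines to exercise, even from Python via ctypes (a Linux-only sketch; the constant and struct layout come from `<sys/inotify.h>`):

```python
import ctypes
import os
import struct

libc = ctypes.CDLL("libc.so.6", use_errno=True)
IN_CLOSE_WRITE = 0x00000008  # a file opened for writing was closed

def watch_dir(path: str) -> int:
    """Return an inotify fd watching `path` for completed writes."""
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    if libc.inotify_add_watch(fd, path.encode(), IN_CLOSE_WRITE) < 0:
        raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
    return fd

def next_event(fd: int) -> str:
    """Block until one event arrives and return the affected file name."""
    buf = os.read(fd, 4096)
    # struct inotify_event: int wd; uint32_t mask, cookie, len; char name[]
    _wd, _mask, _cookie, name_len = struct.unpack_from("iIII", buf)
    return buf[16:16 + name_len].split(b"\0", 1)[0].decode()
```

The trade-off the thread goes on to discuss: inotify only reports that a write happened after the fact, while a FUSE layer sits in the write path itself and can also intercept reads (which is what makes lazy loading possible).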
kyledrake over 10 years ago
This is neat. I like that it's using stable off-the-shelf unix components.

I'm putting together the new https://neocities.org fileserver stuff right now, so I'll definitely be looking into this.

The current plan is to use hourly rsyncs, and then implement this (or some flavor of it): http://code.google.com/p/lsyncd/

RE inotify vs FUSE: the former is event-driven from an API, while the latter I believe hooks in at a lower level. Which one is better here is entirely debatable. Gluster uses a similar approach to this, I believe. I'm not an expert on unix file APIs, so take all of this with a grain of salt.

The biggest reason we can't use Gluster replication is that if you request a file while replicating, it goes and asks all the servers whether the file is on them, instead of just failing because it's not on the local system. That's fine for many things, but it's an instant DDoS if someone runs siege on the server and just blasts you with random bunk 404 requests. You can't cache your way out of that one. Apparently the performance of instant requests for lots of small files can be pretty slow too.

SSHFS (and rsync using SSH) blow S3 out of the water on performance for remote filesystem work. The difference is pretty insane.
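lsyncd's core trick — collect filesystem events for a short quiet window, then flush them as one batched rsync — can be sketched independently of the event source (a simplified illustration; in practice the `flush` callback would shell out to rsync with the batched paths):

```python
import threading

class DebouncedSync:
    """Batch change notifications and fire one sync per quiet window,
    in the spirit of lsyncd's event aggregation in front of rsync."""

    def __init__(self, flush, delay: float = 0.1):
        self.flush = flush   # callback that receives the batched paths
        self.delay = delay   # quiet window before syncing
        self._pending = set()
        self._timer = None
        self._lock = threading.Lock()

    def notify(self, path: str) -> None:
        """Record a changed path and (re)start the flush timer."""
        with self._lock:
            self._pending.add(path)
            if self._timer is not None:
                self._timer.cancel()  # more changes arrived; wait again
            self._timer = threading.Timer(self.delay, self._fire)
            self._timer.start()

    def _fire(self) -> None:
        with self._lock:
            batch, self._pending = self._pending, set()
        self.flush(sorted(batch))  # one sync for the whole batch
```

Batching is what keeps a burst of writes from triggering a burst of rsync processes, which is the main advantage over naive "rsync on every event" scripts.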
andyidsinga over 10 years ago
For newly built applications, why not set up a Ceph or Swift cluster, then use an S3 interface for access to files/objects? Total and minimum replicas can be configured, so you will get something like eventual consistency if you use a smaller minimum than the total.
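The total-vs-minimum-replica setting described above amounts to acknowledging a write once `min` replicas have it, while the remaining replicas catch up asynchronously — hence eventual consistency. A toy sketch of that acknowledgement rule (illustrative only, not Ceph or Swift code):

```python
def write_with_quorum(replicas, key, value, min_acks: int) -> bool:
    """Write to every replica; report success once min_acks succeed.

    With min_acks < len(replicas), a client can see success while some
    replicas are still stale or down -- i.e. eventual consistency.
    """
    acks = 0
    for replica in replicas:
        try:
            replica[key] = value  # each replica is a dict-like store
            acks += 1
        except OSError:
            continue  # a down replica is repaired later by anti-entropy
    return acks >= min_acks
```
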
Lethalman over 10 years ago
I'm an employee at Immobiliare.it, and yesterday we released some internal software to the public on GitHub for the first time. Just wanting to share our work :)
hrez over 10 years ago
Since it's using FUSE, it would be nice if it had lazy initial syncing, i.e. on an empty slave filesystem it would serve/sync accessed files even if they are not synced yet, at least until the full sync is completed. Millions of files or GBs of data can take a long time for the initial sync, and lazy sync would allow serving clients with no delay.
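The lazy initial sync suggested here is essentially a read-through cache: serve from the local replica when the file is present, otherwise fetch it from the master on first access. A minimal sketch (illustrative only; a real FUSE slave would do this inside its `read` handler, and `read_lazy` is a hypothetical name):

```python
import os
import shutil

def read_lazy(master: str, slave: str, rel_path: str) -> bytes:
    """Serve rel_path from the slave, pulling it from the master on a miss.

    This lets an empty slave answer requests immediately while the full
    background sync is still running.
    """
    local = os.path.join(slave, rel_path)
    if not os.path.exists(local):
        # Cache miss: fetch just this file from the master copy.
        remote = os.path.join(master, rel_path)
        os.makedirs(os.path.dirname(local), exist_ok=True)
        shutil.copy2(remote, local)
    with open(local, "rb") as f:
        return f.read()
```
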
minaguib over 10 years ago
This is very interesting indeed for the described use case. AFAIK the other alternative is lsyncd
vtemian over 10 years ago
Did you check https://news.ycombinator.com/item?id=8735937?
patrickg_zill over 10 years ago
I have been looking into this, and my current idea is tending towards using VMs running DragonflyBSD.

VM-1 is a local NFS server that runs DBSD and the Hammer filesystem, with many nice features (auto snapshots, etc.). It will be fast, especially if VM-1 is on the same physical host node as my worker VMs.

VM-2 is remote, and receives the DBSD filesystems I send it. All snapshots etc. from the original FS are retained. If the connection is interrupted, the Hammer sync code will figure it out and restart from the latest successful transaction.
illumen over 10 years ago
Would this work with subfolders being synced from different machines onto a single machine?

Even better, could it grab data as required from other machines?
fit2rule over 10 years ago
Poor name for a filesystem - there are already a couple of filesystems named "SFS" ..
anon4 over 10 years ago
If this could be ported to Windows, Mac, Android and iOS, I'd drop Dropbox in a heartbeat.