FUSE is definitely the way to go with file synchronization, since the system will never miss a write, and can lazily load data for reads. It means that the entire filesystem doesn't have to fit on every machine that's synced. For a more sophisticated FUSE sync filesystem, be sure to check out orifs:<p><a href="http://ori.scs.stanford.edu/" rel="nofollow">http://ori.scs.stanford.edu/</a><p>The best introduction to orifs is their paper, which is linked from the above site.
I wonder if this could be used as a replacement for Vagrant's rsync feature.[1]<p>From Vagrant's documentation:<p><pre><code> Vagrant can use rsync as a mechanism to sync a folder to the
guest machine. This synced folder type is useful primarily in
situations where other synced folder mechanisms are not
available, such as when NFS or VirtualBox shared folders aren't
available in the guest machine.
The rsync synced folder does a one-time one-way sync from the
machine running Vagrant to the machine being started by Vagrant.
</code></pre>
The disadvantage of the above is that it's a one-time, one-way sync. SFS would overcome this limitation, if I'm not mistaken.<p>[1] <a href="http://docs.vagrantup.com/v2/synced-folders/rsync.html" rel="nofollow">http://docs.vagrantup.com/v2/synced-folders/rsync.html</a>
This is neat. I like that it's using stable off-the-shelf unix components.<p>I'm putting together the new <a href="https://neocities.org" rel="nofollow">https://neocities.org</a> fileserver stuff right now, so I'll definitely be looking into this.<p>The current plan is to use hourly rsyncs, and then implement this (or some flavor of it): <a href="http://code.google.com/p/lsyncd/" rel="nofollow">http://code.google.com/p/lsyncd/</a><p>RE inotify vs FUSE: The former is an event-notification API; the latter, I believe, intercepts filesystem operations at a lower level. Which one is better here is entirely debatable. I believe Gluster uses an approach similar to this one. I'm not an expert on unix file APIs, so take all of this with a grain of salt.<p>The biggest reason we can't use Gluster replication is that if you request a file while replicating, it asks all the servers whether they have the file, instead of just failing because it's not on the local system. That's fine for many things, but it's an instant DDoS if someone runs siege on the server and just blasts you with random bunk 404 requests. You can't cache your way out of that one. Apparently performance when requesting lots of small files can be pretty slow too.<p>SSHFS (and rsync using SSH) blow S3 out of the water on performance for remote filesystem work. The difference is pretty insane.
For newly built applications, why not set up a Ceph or Swift cluster, then use an S3 interface for access to files/objects?
The total and minimum number of replicas can be configured, so you get something like eventual consistency if you set the minimum lower than the total.
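To make the total-vs-minimum replica idea concrete, here is a toy Python sketch. It is not the Ceph or Swift API; `ReplicatedStore`, `put`, `get` and `flush` are hypothetical names, and the replicas are just in-memory dicts. A write is acknowledged once the minimum number of replicas have it, and the rest catch up later, which is where the "something like eventual consistency" comes from.

```python
class ReplicatedStore:
    """Toy model: acknowledge a write after `min_replicas` copies exist.

    Hypothetical illustration only; real object stores are far more involved.
    """

    def __init__(self, total_replicas, min_replicas):
        if not 1 <= min_replicas <= total_replicas:
            raise ValueError("need 1 <= min_replicas <= total_replicas")
        self.replicas = [dict() for _ in range(total_replicas)]
        self.min_replicas = min_replicas
        self.pending = []  # (replica_index, key, value) writes not yet applied

    def put(self, key, value):
        # Write synchronously to the first `min_replicas` replicas,
        # and queue the write for the remaining ones.
        for i, replica in enumerate(self.replicas):
            if i < self.min_replicas:
                replica[key] = value
            else:
                self.pending.append((i, key, value))
        return self.min_replicas  # number of copies at acknowledgement time

    def get(self, key, replica_index):
        # Reading from a lagging replica may return None (stale view).
        return self.replicas[replica_index].get(key)

    def flush(self):
        # Background replication catching up: apply queued writes in order.
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()
```

With `ReplicatedStore(3, 2)`, a read from the third replica right after `put` can miss the object until `flush` runs. Setting the minimum equal to the total removes that window at the cost of slower writes.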
I'm an employee at Immobiliare.it, and yesterday we released some of our internal software to the public on GitHub for the first time. Just wanted to share our work :)
Since it's using FUSE, it would be nice if it had lazy initial syncing. I.e., a slave starting from an empty filesystem would fetch and serve accessed files on demand even if they haven't been synced yet, at least until the full sync is completed.
Millions of files or gigabytes of data can take a long time to sync initially. Lazy sync would let the slave start serving clients with no delay.
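A rough Python sketch of the lazy-sync idea, for illustration only. It is not how SFS works, and it skips FUSE entirely: `LazyReplica` is a hypothetical class, and both the "master" and the "replica" are assumed to be plain local directories, where a real setup would fetch over the network from inside the FUSE read handler.

```python
import os
import shutil


class LazyReplica:
    """Serve files from a local replica, pulling from the master on first access."""

    def __init__(self, master_dir, replica_dir):
        self.master_dir = master_dir
        self.replica_dir = replica_dir

    def _fetch(self, rel_path):
        # Copy one file from the master into the replica, creating parent dirs.
        src = os.path.join(self.master_dir, rel_path)
        dst = os.path.join(self.replica_dir, rel_path)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)
        return dst

    def read(self, rel_path):
        # On a miss, sync just this file instead of waiting for the full sync.
        local = os.path.join(self.replica_dir, rel_path)
        if not os.path.exists(local):
            local = self._fetch(rel_path)
        with open(local, "rb") as f:
            return f.read()

    def full_sync(self):
        # Background pass: copy whatever has not been fetched on demand yet.
        for root, _dirs, files in os.walk(self.master_dir):
            for name in files:
                rel = os.path.relpath(os.path.join(root, name), self.master_dir)
                if not os.path.exists(os.path.join(self.replica_dir, rel)):
                    self._fetch(rel)
```

The replica can answer `read()` calls immediately while `full_sync()` runs in the background. This sketch ignores concurrent writes and files that change on the master after being fetched, which any real implementation would have to handle.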
I have been looking into this, and my current idea is leaning towards using VMs running DragonFly BSD.<p>VM-1 is a local NFS server running DBSD with the HAMMER filesystem and its many nice features (auto snapshots, etc.). It will be fast, especially if VM-1 is on the same physical host node as my worker VMs.<p>VM-2 is remote, and receives the DBSD filesystems I send it. All snapshots etc. from the original FS are retained. If the connection is interrupted, the HAMMER sync code will figure it out and restart from the latest successful transaction.
Would this work with subfolders being synced from different machines onto a single machine?<p>Even better, could it grab data as required from other machines?