Their architecture description starts with a strawman:<p>> Usually distributed file systems split each file into chunks; a central master keeps a mapping from filenames and chunk indices to chunk handles, and also tracks which chunks each chunk server has.<p>> The main drawback is that the central master can't handle many small files efficiently, and since all read requests need to go through the chunk master, it might not scale well for many concurrent users.<p>The chunk server architecture was first put into production with the Google File System, AFAIK. And it was designed specifically for large files (what search needed at the time). So no surprise.<p>But that's only one architecture for a DFS. There are also block-based DFS (like GPFS), object-based DFS (Lustre), cluster file systems (OCFS), and other architectures. They exhibit different characteristics.
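For concreteness, the quoted chunk-master design boils down to two lookup tables on a single node. A minimal Go sketch; all type and field names here are hypothetical, not GFS's or SeaweedFS's actual types:

    // Sketch of the metadata a GFS-style central master keeps.
    // All names are hypothetical, for illustration only.
    package sketch

    type ChunkHandle uint64

    type Master struct {
        // filename -> ordered chunk handles (the slice index is
        // the chunk index from the quoted description)
        chunks map[string][]ChunkHandle
        // chunk handle -> chunk servers holding a replica
        locations map[ChunkHandle][]string
    }

    // Lookup resolves one chunk of a file. Every read pays this
    // round-trip to the single master, which is exactly the
    // scaling concern the quote raises.
    func (m *Master) Lookup(file string, idx int) (ChunkHandle, []string) {
        h := m.chunks[file][idx]
        return h, m.locations[h]
    }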
Judging from the architecture and the wiki, it does not seem to be a file system at its core, but an object store with a file translation layer. One of the core problems of this approach is that in-place updates usually mean read-modify-write, if the object store has immutable objects, as most do, with Ceph being a notable exception (see the sketch at the end of this comment).<p>From the replication page:<p>> If one replica is missing, there is no automatic repair right away. This is to prevent over-replication due to transient volume server failures or disconnections. Instead, the volume will just become read-only. For any new writes, just assign a different file id to a different volume.<p>This sounds like the architecture and implementation are still pretty basic. Distributed storage without redundancy (working redundancy!) is not that interesting.<p>Sorry to be so critical (it's great that someone is writing a distributed file system!), but I think it is important to add some context. And the Seaweed author doesn't seem to have a problem with bold statements either...<p>Disclaimer: I also work on a distributed file system (with unified access via S3 ;)
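To illustrate the read-modify-write problem mentioned above: with immutable objects, even a one-byte in-place write forces fetching and rewriting the whole object. A minimal Go sketch; the store interface and names are assumptions for illustration, not SeaweedFS's API:

    // Hypothetical immutable object store: objects are written
    // whole and never patched in place.
    package sketch

    type ObjectStore interface {
        Get(id string) ([]byte, error)
        Put(data []byte) (id string, err error) // new object, new id
    }

    // WriteAt on top of immutable objects degenerates into
    // read-modify-write: fetch the whole object, patch it in
    // memory, write it back as a brand-new object, and repoint
    // the metadata.
    func WriteAt(s ObjectStore, id string, off int64, p []byte) (string, error) {
        old, err := s.Get(id)
        if err != nil {
            return "", err
        }
        n := off + int64(len(p))
        if int64(len(old)) > n {
            n = int64(len(old))
        }
        buf := make([]byte, n)
        copy(buf, old)
        copy(buf[off:], p)
        return s.Put(buf) // caller must update the file->object mapping
    }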
Evercam has used Seaweed for a few years. We have 1344TB of mostly JPEGs and use the filer for folder structure. It's worked well for us, especially with low-cost Hetzner SX boxes. I'd echo other people's positive comments about the maintainer's responsiveness & support. Happy to (try and) answer questions.
If Oracle wins the Supreme Court case against Google, aren't all these "like S3" or S3 API-compatible solutions (whether block storage competitors or file systems) at risk?
We've been running SeaweedFS in production serving images and other small files. We're not using Filer functionality just the underlying volume storage. We wrote our own asynchronous replication on top of the volume servers since we couldn't rely on synchronous replication across datacenters. The maintainer is super responsive and is quick to review our PRs. Happy to answer any questions.
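(A rough sketch of the general shape such a volume-level async replication loop can take, assuming a local write log and a remote put endpoint; this is illustrative only, not the actual implementation described above:)

    package sketch

    import "time"

    // WriteEvent is one locally-acknowledged write waiting to be
    // shipped to the other datacenter.
    type WriteEvent struct {
        FileID string
        Data   []byte
    }

    // Replicate drains the local write log in the background and
    // retries until the remote DC accepts each write; local reads
    // keep being served from the local volume servers meanwhile.
    func Replicate(events <-chan WriteEvent, remotePut func(WriteEvent) error) {
        for ev := range events {
            for remotePut(ev) != nil {
                time.Sleep(time.Second) // naive fixed backoff
            }
        }
    }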
Whenever you introduce a new solution into a problem space that already has plenty of options, you are obligated to state why your (new) solution is needed in the first place, IMO.<p>They did it well:<p>> Most other distributed file systems seem more complicated than necessary.<p>> SeaweedFS is meant to be fast and simple, in both setup and operation. If you do not understand how it works when you reach here, we've failed! Please raise an issue with any questions or update this file with clarifications.<p><a href="https://github.com/chrislusf/seaweedfs#compared-to-other-file-systems" rel="nofollow">https://github.com/chrislusf/seaweedfs#compared-to-other-fil...</a><p>However, since I never had to touch HDFS after installing it in the first place, I wonder what the difficulties in operation are that they tried to overcome here.
This looks almost exactly like the kind of data store I need for an application. I have previously considered using MinIO (too inflexible wrt adding more shards/replicas), a homebrew system based on something like ScyllaDB (needs code on top to act like a blob store), or S3/B2 (too slow and/or expensive wrt transfer costs). Is anyone using this in production and can share a story of how stable and hard to run it is?
The architecture reminds me of 'mogilefs', which has a similar filename-to-file-storage mapping mechanism.<p><a href="https://github.com/mogilefs/mogilefs-docs/blob/master/HighLevelOverview.md" rel="nofollow">https://github.com/mogilefs/mogilefs-docs/blob/master/HighLe...</a><p>It's an old system from the folks @ Danga, but the mailing list still sees random activity now and then...
I really wish this project, or other object storage systems modelled after Haystack, would get more traction. I think it is reasonable to expect that your object storage system should support both small objects (< 10k) and large objects (> 1MB) transparently, but in my experience none of the heavily used open-source object stores (Ceph, Swift) can actually support small objects adequately.
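The Haystack idea that makes small objects cheap, in a nutshell: append blobs into one large volume file and keep only a tiny in-memory index per object, so a read is one seek with no per-file inode or dentry. A rough Go sketch of the idea, not SeaweedFS's actual on-disk format:

    package sketch

    import "os"

    type needle struct {
        off  int64
        size int32
    }

    // Volume packs many small blobs into one big append-only
    // file; per-object metadata is just a few bytes in memory.
    type Volume struct {
        f     *os.File
        index map[uint64]needle
        end   int64
    }

    func (v *Volume) Put(id uint64, data []byte) error {
        if _, err := v.f.WriteAt(data, v.end); err != nil {
            return err
        }
        v.index[id] = needle{off: v.end, size: int32(len(data))}
        v.end += int64(len(data))
        return nil
    }

    func (v *Volume) Get(id uint64) ([]byte, error) {
        n := v.index[id]
        buf := make([]byte, n.size)
        _, err := v.f.ReadAt(buf, n.off) // one seek, no metadata walk
        return buf, err
    }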
Some differentiators that aren't immediately obvious in the comparison:<p>> SeaweedFS Filer metadata store can be any well-known and proven data store, e.g., Cassandra, MongoDB, Redis, Elasticsearch, MySQL, Postgres, MemSQL, TiDB, CockroachDB, etcd, etc., and is easy to customize.<p>I'm not very familiar with other DFSs, but at the very least GlusterFS stores metadata as xattrs on an underlying filesystem and so has no need of an external data store.<p>Also, SeaweedFS has a "master" server (single centralized, with failover to a secondary) and "volume servers" (responsible for data).
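Regarding the pluggable Filer metadata store quoted above: the reason so many backends work is that the filer only needs a small directory-entry CRUD contract from them. A Go sketch of roughly what that boils down to; the method names are illustrative, not SeaweedFS's actual interface:

    package sketch

    import "context"

    // Entry is one filer directory entry: a path plus pointers
    // to the volume-server blobs holding the file's data.
    type Entry struct {
        FullPath string
        Chunks   []string // file ids on the volume servers
    }

    // FilerStore is the kind of contract each backing database
    // (Cassandra, Redis, MySQL, ...) has to satisfy.
    type FilerStore interface {
        InsertEntry(ctx context.Context, e *Entry) error
        FindEntry(ctx context.Context, fullPath string) (*Entry, error)
        DeleteEntry(ctx context.Context, fullPath string) error
        ListEntries(ctx context.Context, dirPath string, limit int) ([]*Entry, error)
    }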
This is interesting. I've been looking for a file system for a non-RAID disk array I want to set up at home, and this seems to have <i>some</i> of the characteristics I'm looking for. The primary drawback for my particular use case seems to be that I want to use parity-based error correction rather than (or, in addition to) replication, because I want the array to be able to survive a failure of any N disks in the array.<p>Is there anything like that out there (other than Unraid, which I kinda don't like)?
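Surviving any N disk failures with less overhead than full replication is what erasure coding gives you. A minimal sketch using the klauspost/reedsolomon Go library: 4 data + 2 parity shards spread over 6 disks tolerate any 2 failures (the shard counts are just example parameters):

    package main

    import (
        "log"

        "github.com/klauspost/reedsolomon"
    )

    func main() {
        // 4 data + 2 parity shards: any 2 of the 6 "disks" can fail.
        enc, err := reedsolomon.New(4, 2)
        if err != nil {
            log.Fatal(err)
        }

        data := make([]byte, 1<<20) // stand-in for a file's bytes
        shards, err := enc.Split(data)
        if err != nil {
            log.Fatal(err)
        }
        if err := enc.Encode(shards); err != nil { // computes parity shards
            log.Fatal(err)
        }

        // Simulate losing two disks, then repair from the survivors.
        shards[0], shards[5] = nil, nil
        if err := enc.Reconstruct(shards); err != nil {
            log.Fatal(err)
        }
        log.Println("reconstructed after 2 shard failures")
    }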
I am not a large user whatsoever, but I've been using SeaweedFS for a few years now.<p>It is archiving and serving more than 40,000 images on a webapp I built for the small team I work with.<p>I run SeaweedFS on two machines and it serves all images I host.<p>I wanted to kick the tires because I was always fascinated by Facebook's Haystack.<p>It has been simple, reliable, and robust. I really like it and hope that if one of my side projects ever takes off, I get to test it with a much bigger load.
This is really cool. The killer feature I see is being able to have a cloud storage tier for warm data that goes off to S3, while keeping the hot storage local. Does anyone know of another option that allows this kind of hybrid local/S3 storage and also has a filesystem interface?
We have been running SeaweedFS successfully in production for a few years. We are serving and storing mostly user-uploaded images (around 100TB). It has been surprisingly stable, and the maintainer is usually responsive when we encounter issues.
If you want something similar that also supports NFS, there's LeoFS: <a href="https://github.com/leo-project/leofs" rel="nofollow">https://github.com/leo-project/leofs</a>
I have been following SeaweedFS since forever. Played with it on my own homelab.<p>But I don't know if there's a major shop that uses it. Does anyone know?
Geez, what's with those weird project names? For a second I expected/hoped this would be some cool hack storing data in actual seaweed. (You know, like pingfs...) No, it's not! It's some S3 k8s ... thing. Nothing wrong with that, but come on, choose a better name!<p>And no, I'm not particularly fond of the name CockroachDB either.
Looking good!<p>Take a look at Gasper (<a href="https://talhof8.github.com/gasper" rel="nofollow">https://talhof8.github.com/gasper</a>).