
We migrated from AWS to GCP with minimal downtime

51 points by levkk 12 months ago

6 comments

sgarland 12 months ago
> To make things harder, zfs send is an all or nothing operation: if interrupted for any reason, e.g. network errors, one would have to start over from scratch.

ZFS absolutely handles resuming transfers [0].

Honestly, articles like this make me doubt companies' ability to handle what they're doing. If you're going to run a DB on ZFS, you'd damn well better know it inside and out. mbuffer is well known to anyone who has used ZFS for a simple NAS. Also, you can't use df to accurately measure a ZFS filesystem: df has no idea about child filesystems, quotas, compression, file metadata...

It's also unclear to me why they didn't just ship the filesystems through nc. Assuming they're encrypted (which, I mean, I would hope so...) it wouldn't be any more risky than unencrypted via SSH.

[0]: https://openzfs.github.io/openzfs-docs/man/master/8/zfs-send.8.html
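The resumable-transfer mechanism the comment refers to works roughly like this; a minimal sketch, with placeholder dataset and host names, assuming a recent OpenZFS on both ends:

```shell
# The receiver must use -s so an interrupted receive saves its state.
zfs snapshot tank/db@migrate
zfs send tank/db@migrate | ssh gcp-host zfs receive -s tank/db

# If the transfer is interrupted, ask the *receiving* side for the
# resume token, then restart the send from that token with -t:
TOKEN=$(ssh gcp-host zfs get -H -o value receive_resume_token tank/db)
zfs send -t "$TOKEN" | ssh gcp-host zfs receive -s tank/db

# mbuffer on both ends, as mentioned above, smooths out bursty I/O:
zfs send tank/db@migrate | mbuffer -s 128k -m 1G \
  | ssh gcp-host 'mbuffer -s 128k -m 1G | zfs receive -s tank/db'
```

The resume token encodes how far the previous receive got, so the restarted send picks up from that offset rather than from scratch.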
shrubble 12 months ago
I wonder what would have happened if they created a ZFS snapshot, transferred it as "tar over ssh" to the remote host, then created hourly snapshots thereafter and synced those across? It seems they were not aware of this method.
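The snapshot-then-incremental approach described here could be sketched as follows (dataset and host names are placeholders; native `zfs send -i` is used for the deltas rather than tar):

```shell
# One full copy up front...
zfs snapshot tank/db@base
zfs send tank/db@base | ssh gcp-host zfs receive tank/db

# ...then, every hour, snapshot and ship only the delta since the
# previous snapshot, which is typically tiny by comparison:
zfs snapshot tank/db@hourly1
zfs send -i tank/db@base tank/db@hourly1 | ssh gcp-host zfs receive tank/db
```

At cutover, only the final small incremental needs to be sent, which keeps downtime short.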
kccqzy 12 months ago
A really cool story. But I have to say, everyone will notice that rewriting a file transfer tool in Rust is a poor use of engineering time without having first understood the cause of the slowness. It's almost like a blind, cult-like trust in Rust.

> So just like with everything else, we decided to write our own, in Rust. After days of digging through Tokio documentation and networking theory blog posts to understand how to move bytes as fast as possible between the filesystem and an HTTP endpoint, we had a pretty basic application that could chunk a byte stream, send it to an object storage service as separate files, download those files as they are being created in real time, re-assemble and pipe them into a ZFS snapshot.

I mean, this sounds like a fun engineering project and I suspect I would enjoy writing it very much. But while this might bring me joy personally, as an organization this is still a failure.
klabb3 12 months ago
Blog says 100 Gbit NICs and 200 MB/s achieved. Big gap!

Since both endpoints are controlled by you, you should be able to tune the TCP buffers. In either case, measuring the RTT and running iperf3 with dozens of concurrent TCP connections would be the first step to establish a baseline for what throughput could be expected.

Author, if you are here, that data would be very interesting to know.
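The baseline measurement suggested above could look something like this; a sketch, assuming iperf3 is installed on both hosts, with illustrative (not recommended) buffer sizes:

```shell
# On the receiving (destination) host:
iperf3 -s

# On the sending host: 32 parallel TCP streams for 30 seconds,
# which usually saturates a high-RTT link far better than one stream.
iperf3 -c gcp-host -P 32 -t 30

# High bandwidth-delay-product links often need larger TCP buffers.
# These 256 MiB caps are illustrative values, not tuned recommendations:
sysctl -w net.core.rmem_max=268435456
sysctl -w net.core.wmem_max=268435456
sysctl -w net.ipv4.tcp_rmem='4096 131072 268435456'
sysctl -w net.ipv4.tcp_wmem='4096 131072 268435456'
```

Comparing single-stream against 32-stream iperf3 numbers quickly shows whether the bottleneck is per-connection TCP windowing or raw link capacity.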
Spooky23 12 months ago
Awesome story! Made me miss my DBA days long ago.
threecheese 12 months ago
I love watching a systems problem being solved, we’re forced to learn so much during outages and problems like this. When I read that you were starting to write a rust tool to integrate zfs with s3 because you thought aws was limiting throughput, I nearly yelled out loud! Guess you learned the same lesson I did, once :)