Forgive me if I'm missing something, but this appears to just back up files, so it would be fine for source code (should be in version control and safe already) and static assets (like user uploads), but it doesn't appear to address things like DB backups, which I feel is the number 1 thing lost if you lose access to your host (followed by user uploads). The problem with DB backups is you can't just back up the data directory (like /var/lib/mysql) unless you've shut down the DB, or you can do a dump (mysqldump), but backing that up hourly is not a good solution IMHO. I guess you could have a replica that you shut down at the top of the hour, back up the data directory, then start back up, but all of this is to say this post is not a silver bullet to "Automatically backup a Linux VPS".

This is NOT a knock against the author, I just wanted to point out that "backups" are much more complicated than "copy files elsewhere". For DBs I'd probably consider running a replica on 1 or more other clouds. IDK the logistics of replication over the internet, but I know for work we do replication from our datacenter down to our local servers, and that's over a relatively slow connection, so I assume it's possible to do it from cloud-to-cloud.
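If you did go the replica route, the hourly stop/copy/start cycle could look something like this rough sketch (the service name, hostnames, and paths here are made up, and rsync is just one way to get the files off-box):

    #!/bin/sh
    # Hypothetical hourly job run on a MySQL *replica*: stop the server so the
    # data directory is consistent on disk, copy it off-box, then start it again.
    set -e
    systemctl stop mysql                      # replica only; the primary keeps serving traffic
    rsync -a --delete /var/lib/mysql/ backup-host:/backups/mysql/$(date +%F-%H)/
    systemctl start mysql                     # replica catches back up from the primary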
Vultr AND Linode.

1) Upload a custom ISO with ZFS (https://github.com/beren12/zfs-iso/)

2) Create a new VPS without an OS and boot into your uploaded ISO.

3) Create a ZFS root pool and bootstrap Debian or another distribution.

4) Enable all the cool features: compression, encryption, etc.

5) rsync your ZFS snapshots from Vultr to Linode and vice versa.

This is how I do it. You can even use the snapshots as templates for new VPSes.

And for backups, Backblaze B2 and Wasabi with a zfs-snapshot-upload script.
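For step 5, instead of rsyncing the snapshot files you could also replicate with zfs send/receive over SSH. A rough sketch, assuming a dataset named tank/data and key-based SSH between the two boxes (names are placeholders, not from my setup above):

    #!/bin/sh
    # Hypothetical incremental replication of a ZFS dataset between two VPSes.
    set -e
    PREV=$(zfs list -H -t snapshot -o name -s creation tank/data | tail -1 | cut -d@ -f2)
    NOW=$(date +%F-%H%M)
    zfs snapshot tank/data@"$NOW"
    # Send only the delta since the previous snapshot to the other VPS.
    zfs send -i "@$PREV" tank/data@"$NOW" | ssh other-vps zfs receive -F tank/data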
I've used rclone for a very similar purpose. Restic, which is used in this post, looks very interesting as well.

It's not the topic of the post, but database backups deserve a special mention. You can't just naively copy the database folder this way in most cases; you have to make sure to back up a consistent snapshot of the database. This is still not hard to do at smaller scales, when you can just add an exported dump of the database to your regular backup, but it is a point that needs some attention if you host the database yourself.
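For the "exported dump" approach, a minimal sketch (database name and output path are invented; --single-transaction gives a consistent snapshot for InnoDB without locking everything):

    # Hypothetical pre-backup step: dump the DB consistently, then let the
    # regular file backup (restic, rclone, etc.) pick up the dump file.
    mysqldump --single-transaction --routines mydb | gzip > /var/backups/mydb.sql.gz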
Are there any reasons to prefer Restic over BorgBackup[1]?

A conclusion from one comparison (2017)[2]:

"Restic's memory requirements makes it unsuitable for backing up a small VPS with limited RAM, and the slow backup verification process makes it impractical on larger servers. But if you are backing up desktop or laptop computers then this may not matter so much, and using Restic means that you don't have to setup your own storage server."

Is this still true?

[1] https://www.borgbackup.org/

[2] https://stickleback.dk/borg-or-restic/
I use Duplicity in a similar way to back my Linode stuff up to Backblaze. It does versioning really well and it's been very reliable.
I'd still have to configure a new server somewhere, etc., but at least I'd have the data.
<a href="http://duplicity.nongnu.org/" rel="nofollow">http://duplicity.nongnu.org/</a>
Related - I've been thinking about how to best back up my S3 buckets (some with 50k+ files) off of Amazon. Sure, I can set up another bucket with that cross-region replication feature, and I have versioning, but I would really prefer a backup off of Amazon (i.e. not sending manually created zips from a Lightsail/EC2 instance or something to Glacier) in case it ever gets hacked or I accidentally nuke the buckets or something like that.

Currently I'm just doing a combination of s3cmd for a local archive (takes forever to download, and then it doesn't seem like incremental syncs are any faster), as well as having Google Console clone my bucket there (but I'm not sure if it's versioned, or as easy as downloading the whole archive).

Never used duplicity -- would it be fast for something like this? Guessing I should just cron it on a remote server instead of running it off a local machine frequently.
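One off-Amazon option (rclone, mentioned elsewhere in this thread) does incremental cloud-to-cloud syncs directly between object stores. A sketch, assuming remotes named s3 and b2 have already been set up with rclone config, and bucket names are placeholders:

    # Hypothetical cloud-to-cloud mirror: only changed objects are transferred,
    # and --transfers raises parallelism for buckets with many small files.
    rclone sync s3:my-bucket b2:my-bucket-mirror --transfers 16 --checksum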
Don't forget about practicing restoration (catastrophe scenarios), so that you know how long it will take to restore and whether anything is missing. Last time I did it, I did not remember the password for the encryption key. Sure, I had it written down on a piece of paper, but the scenario was that the building had burnt down.
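With restic (the tool from the post), a restore drill can be a one-liner into a scratch directory; the repository URL below is a placeholder, and you still need the repo password and storage credentials in the environment, which is exactly what the drill tests:

    # Hypothetical restore drill: pull the latest snapshot into /tmp and time it.
    time restic -r s3:s3.amazonaws.com/my-backup-bucket restore latest --target /tmp/restore-test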
In a dockerized single-VPS environment, where should cron jobs live? Should they be part of the main Docker container that has the app code, a separate container that only runs the cron jobs, or simply on the host?
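Not from the post, but one common pattern for the last option is a host crontab entry that shells into the running app container; a sketch assuming a container named app and a hypothetical backup script inside it:

    # Hypothetical /etc/cron.d/backup entry on the host: run the backup
    # inside the already-running app container every night at 02:00.
    0 2 * * * root docker exec app /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1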
Meh, just another backup solution that requires AWS keys, SSH keys, etc. to be kept on the same server where your data is. What if that server is compromised? The attacker now has all the keys he needs to delete or modify your backups, too.

For maximum peace of mind, always *pull* backups from a separate server that is not exposed to the world. Don't let your primary server *push* arbitrary data to the backup store.

This rule is trickier to follow when your backup store can't run scripts, which is why so many tools designed to work with S3 tell you to keep the keys exposed. But if you really want to, you can use an intermediate host to pull backups before pushing them again to S3.
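A minimal sketch of the pull model (hostnames and paths invented): the backup box holds the only credentials and reaches out to the primary over SSH, so a compromised primary can't touch the backup store:

    # Hypothetical cron job on the *backup* host, not the primary.
    # The primary only needs to accept this host's SSH key, ideally locked down
    # to a forced rsync command in authorized_keys.
    rsync -a --delete backup@primary.example.com:/var/www/ /backups/primary/www/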
My own personal preference is to simply make VMs on each VPS that has some storage space, then enable chrooted SFTP and rsnapshot. Then on the client side, I use LFTP (its SFTP mirror subsystem), which is compatible with chrooted SFTP and behaves like rsync.

Each VPS backs up to the other. rsnapshot makes daily diffs that use hardlinks to avoid taking up space. This also mitigates tampering, as only root has access to the snapshots.

Demo site using anon login for testing: [1]

[1] - https://tinyvpn.org/sftp/#lftp
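For reference, an LFTP mirror over SFTP looks roughly like this (host, user, and paths are placeholders, not my actual setup):

    # Hypothetical pull of a remote directory over chrooted SFTP; mirror only
    # transfers new/changed files, much like rsync.
    lftp sftp://backupuser@vps2.example.com -e "mirror --verbose /data /backups/vps2; quit"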
Does anyone here have experience with backing up ZFS pools to cloud storage like S3, B2, ...?

I have a bunch of snapshots (https://github.com/jakelee8/zfs-auto-snapshot) that I want to back up along with the active tree, but I don't want to keep extra copies of the data.

- Do these services offer snapshotting? ...that can be automated?

- Is there ZFS integration, e.g. `zfs send | b2 receive`?
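To my knowledge there's no provider-side "b2 receive", but you can stream a send to object storage through a tool like rclone; a sketch, with dataset, snapshot, and bucket names invented:

    # Hypothetical upload of a ZFS send stream to B2 via rclone rcat, which
    # reads from stdin. Incremental sends (-i) keep subsequent uploads small.
    zfs send tank/data@snap1 | gzip | rclone rcat b2:my-zfs-backups/tank-data-snap1.zfs.gz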
One idea I had was to create a service with preconfigured images set up for personal use with a VPN, email server, and file sync/backup. It could be sold to privacy-conscious individuals and could compete with ProtonMail.

The technical side could be hidden from less technical users, and it could be sold as isolated servers so the data would be protected.

I don't have the skills or the time to work on this, so I'm happy for others to use the idea.
Does anyone have a recommendation for a backup client that handles millions of tiny files? I'm using rsnapshot right now, which works, but backing up to an NFS share is incredibly slow (most of the time is spent iterating over the filesystem to get a list of changed files, then running the hardlink process from the previous snapshot).
I actually like using cPanel's built-in backup settings on servers where I have cPanel installed. Amazingly simple to set up, really intuitive, and it supports a variety of services. I have used Amazon and SFTP backups so far and they both work really well.
Or use a paid service that handles files and databases for ~$30/year, like https://www.dropmysite.com/

(I'm just a customer that has been generally pleased over the past many years)
I'm surprised that no one has mentioned Duplicacy yet. It's another very solid, reliable and fast alternative. At the moment I use Restic on servers but use Duplicacy on the desktop. It can also be used on servers of course.
Restic looks neat. I've been looking at using duplicity [0] recently, which does a similar job.

Just found a good comparison/benchmark of the two at [1] - tl;dr seems to be that restic is fast and duplicity is small.

[0] - http://duplicity.nongnu.org

[1] - https://github.com/gilbertchen/benchmarking