Show HN: Low-cost backup to S3 Glacier Deep Archive

120 points, by mrich, over 2 years ago
Hi,

Most people (hopefully) have local backups. However, when that backup fails, it is good to have another backup stored somewhere off-site. In the old days you would ship physical drives/tapes, which is cumbersome, costly, and slow. With today's fast upload speeds, you can upload your data to the cloud instead. I have found S3 Glacier Deep Archive to be a great fit:

- It is very cheap ($1/TB/month in US regions)
- It is very reliable (99.999999999% data durability, with data spread over 3 Availability Zones)

However, usability out of the box is not great; I'm not aware of any automated backup solution for Deep Archive. This free project provides one.

Currently ZFS is required, but that might change. Please try it out and provide feedback!
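
For readers who want to see the moving parts, here is a minimal hand-rolled sketch of the same idea, independent of this project's actual pipeline: stream a ZFS snapshot through a compressor straight into a Deep Archive object. The dataset, snapshot label, and bucket names are placeholders, and --expected-size is needed because the CLI cannot size a stdin stream for multipart upload on its own.

    # Full snapshot straight to Deep Archive (sketch; placeholder names)
    zfs snapshot tank/data@offsite-2022-09
    zfs send tank/data@offsite-2022-09 \
      | zstd \
      | aws s3 cp - s3://my-backup-bucket/tank-data-offsite-2022-09.zfs.zst \
          --storage-class DEEP_ARCHIVE --expected-size 1000000000000
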

38 comments

NBJack, over 2 years ago
Note that, in the event of retrieval, it's not just per GB, it's also per request.

Synology has offered Amazon Glacier and S3 as destination options in Hyper Backup for years as part of their NAS offerings. Given the automatic archive feature that can move an existing store to Glacier Deep Archive, budget permitting I'd recommend a NAS over this, for three reasons:

- Initial setup costs aside, the power draw of a two-bay unit like the DS218 (15W at load) would be ~$16/year at peak usage, assuming a cost of $0.12/kWh (see the back-of-envelope check below)
- Uploading/syncing your local files to your NAS should be considerably faster, technically 'free', and can be done as frequently as you like; should you need them, it would also be 'free' to retrieve them locally, barring a catastrophic event
- The remote push of the NAS contents to S3/Glacier storage can be done asynchronously of your PC's state (and, to save money, less frequently if you wish), which as you point out could take days; additionally, you can save money by reducing the number of requests via automatic archiving/compression

Given how unlikely it is that you'd ever retrieve data from Glacier Deep Archive with such a setup, I highly recommend it. You can still rest knowing your data is offsite.

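As a back-of-envelope check on that power figure (assuming the rate is per kWh, which is the reading that makes ~$16/year work out):

    # 15 W continuous, for a year, at $0.12 per kWh
    echo "scale=2; 15 * 8760 / 1000 * 0.12" | bc   # ≈ 15.76 dollars/year
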
contravariant, over 2 years ago
What does 99.999999999% durability mean exactly? Does it mean a probability of 0.000000001% (1 in 100,000,000,000) that your bits will randomly disappear? Is that yearly?

One interpretation is that about 1 bit per 100 GB will randomly flip each year. That, or S3 Glacier is expecting to hit a catastrophic event every 100 billion years (which doesn't seem nearly frequent enough).

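For what it's worth, AWS states the figure per object per year: storage is designed for an average annual expected loss of 0.000000001% of objects, not a per-bit flip rate. A quick back-of-envelope under that definition:

    # Expected annual object loss at eleven nines, for 10 million stored objects
    echo "10000000 * 0.00000000001" | bc   # = .0001, i.e. ~1 object per 10,000 years
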
lathiat, over 2 years ago
I use rclone and Backblaze B2 for this. B2 used to be cheaper, though I'm not sure that still holds against the new Deep Archive, but it is much less fiddly and has no crazy fees at restore time.

rclone is also multi-threaded, so it goes much faster than rsync.

Etheryte, over 2 years ago
Since I'm not that familiar with the pricing model behind this, did I understand correctly that it costs roughly $1/TB/month to store and roughly $95/TB to restore? The price seems steep at first, but compared to regular backup services, where the cost usually adds up to roughly $100 per year, it starts to make sense. I have backups, but I don't think I would ever need them more than once a year; even that would be alarmingly often.

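Using the figures quoted in this thread (storage at $1/TB/month, bulk restore at ~$2.56/TB, egress at ~$92.16/TB), the arithmetic checks out:

    echo "12 * 1" | bc          # storage for a year: $12/TB
    echo "2.56 + 92.16" | bc    # one full restore + download: ~$94.72/TB

So a year of storage plus one worst-case restore comes to roughly $107/TB, against the ~$100/year of a conventional backup service.
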
thrtythreeforty, over 2 years ago
With a little setup (no more than described here), rclone just... does this out of the box.

Specifically, I have an S3 remote configured to use the Deep Archive tier. On top of that I have an encryption remote (pointing to the S3 remote). Then I just rclone my pool to this remote, and all my crap is shipped off to Ireland.

Like in the link, I expect never to need it; restore is so expensive that it's "house burns down" insurance only.

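For anyone wanting to replicate that layering, here is a sketch of what the rclone side can look like. Remote and bucket names are placeholders, and the crypt passwords would be generated with rclone config, which stores them obscured:

    # ~/.config/rclone/rclone.conf (sketch)
    [glacier]
    type = s3
    provider = AWS
    region = eu-west-1
    storage_class = DEEP_ARCHIVE

    [glacier-crypt]
    type = crypt
    remote = glacier:my-backup-bucket/pool
    # password / password2 omitted; set them via rclone config

    # Then ship the pool:
    rclone sync /tank/pool glacier-crypt:
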
knorker, over 2 years ago
I've moved all my stuff off of Glacier because of their ridiculous pricing model and user-hostile metadata handling.

I.e., you have to maintain your own index of files, where they could have just done this for you.

The pricing model for downloads makes it too easy to shoot yourself in the foot. I'd rather pay a tiny bit more to not have bankruptcy traps built into the product. So that's what I do now.

londons_explore, over 2 years ago
> 3 or more AWS Availability Zones, providing 99.999999999% data durability

I think you need to multiply that by the durability of Amazon as a company...

I suspect that in any given year there is perhaps a 1% chance that they shut down AWS with no chance to retrieve data (whether due to world war, civil war in the USA, a change in laws, banning you as a customer, a change in company policy, bankruptcy, etc.).

dewey, over 2 years ago
This kind of service has existed for a very long time. I have been using https://www.arqbackup.com for many years as a fallback for my Time Machine and Backblaze backups. Google also offers a very similar service to Glacier, called Coldline: https://cloud.google.com/storage/docs/storage-classes#coldline

hampereddustbin, over 2 years ago
This often disregards the cost of retrieving said data, which is $90/TB for outbound network traffic, on top of the cost of making the backups available.

loloquwowndueo, over 2 years ago
I use Tarsnap for off-site backups: https://www.tarsnap.com/

It's probably not as cheap as Glacier, but it's cheap enough for my needs, secure and encrypted, and it was very easy to set up.

ogig, over 2 years ago
This recalled a horror story about huge charges when retrieving data. I searched for the link, and it seems pricing has changed since that blog entry [1].

Still a good idea to check the extra charges before reading data back.

[1] https://medium.com/@karppinen/how-i-ended-up-paying-150-for-a-single-60gb-download-from-amazon-glacier-6cb77b288c3e

capableweb, over 2 years ago
> 99.999999999% data durability, data spread over 3 Availability Zones

I'd love to see what source(s) this claim has in practice. How did you arrive at that many 9s? And shouldn't it be spread over 3 Regions rather than Availability Zones? Otherwise it could all end up in the same geographical region while giving the impression it's widely spread.

kidme5, over 2 years ago
Isn't the chance of nuclear armageddon something like 30% this century? That should affect the probabilities.

resoluteteeth, over 2 years ago
Can't you also just use existing tools to back up to S3 and then move the data to Deep Archive?

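One way to do that (a sketch, with placeholder bucket and prefix names) is a lifecycle rule that transitions uploads to Deep Archive shortly after they land, so any S3-capable tool can handle the upload itself:

    aws s3api put-bucket-lifecycle-configuration \
      --bucket my-backup-bucket \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "to-deep-archive",
          "Status": "Enabled",
          "Filter": {"Prefix": "backups/"},
          "Transitions": [{"Days": 1, "StorageClass": "DEEP_ARCHIVE"}]
        }]
      }'
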
agurk, over 2 years ago
I've created similar functionality with just a simple bash script that sends the latest version of ZFS datasets to S3/Glacier, including dealing with incremental changes. I have mentioned this previously on HN and got a few useful changes submitted for it, especially making it more platform-agnostic.

I have some open tickets asking about (script-based) restores. I haven't tried this yet, as it has been a backup of last resort for me, but hopefully posting this again will nudge me into looking at that.

https://github.com/agurk/zfs-to-aws/

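The incremental half of such a script boils down to a delta send between the previous and current snapshots; a minimal sketch (placeholder names, not the linked script's actual code):

    # Only blocks changed between the two snapshots go over the wire
    zfs send -i tank/data@offsite-2022-08 tank/data@offsite-2022-09 \
      | zstd \
      | aws s3 cp - s3://my-backup-bucket/tank-data-2022-08-to-2022-09.zfs.zst \
          --storage-class DEEP_ARCHIVE
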
bzxcvbn, over 2 years ago
> I'm not aware of any automated backup solution for Deep Archive.

What about the well-known rclone? https://rclone.org/s3/

uptown, over 2 years ago
I use Arq with AWS Deep Archive as my off-site. It seems to work well, though admittedly I haven't tested recovery/retrieval yet.

blahgeek, over 2 years ago
I'm curious: has anyone actually experienced data loss on a public storage service like S3? I'm not sure the count of 9s actually matters...

andag, over 2 years ago
https://github.com/andaag/zfs-to-glacier

I built something similar a while back that I've now been using for years.

Something worth noting: there is a minimum cost per file. If you have tons of tiny KB-sized files (incremental snapshots...), it's drastically cheaper to fall back to plain S3 for them.

aborsy, over 2 years ago
You still need to check your backups every once in a while. Glacier is priced such that most people don't check their backups. This could be worse than no backup.

Also, one may frequently add and prune snapshots. The cost of this should be considered too. You may use hot storage, but pruning usually removes old data, which is in cold storage.

Does anyone here check their Glacier backups?

clarkdale, over 2 years ago
Have you considered using plain S3 buckets with Intelligent-Tiering and the two opt-in archive access tiers? You can use the normal S3 APIs to upload; then, after 180 days or so, your objects transition to Glacier Deep Archive. You do pay a penny per 1,000 objects, but the benefit here is using S3 like normal. You still have to wait hours for a restore.

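The two opt-in tiers are enabled per bucket; a sketch of the call (bucket and configuration names are placeholders, and objects must be uploaded in the INTELLIGENT_TIERING storage class for this to apply):

    aws s3api put-bucket-intelligent-tiering-configuration \
      --bucket my-backup-bucket \
      --id archive-tiers \
      --intelligent-tiering-configuration '{
        "Id": "archive-tiers",
        "Status": "Enabled",
        "Tierings": [
          {"Days": 90,  "AccessTier": "ARCHIVE_ACCESS"},
          {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"}
        ]
      }'
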
password4321, over 2 years ago
For onsite backups, is it a valid option to buy a spindle of Blu-ray discs and swap away until you have another copy of everything, with enough par files included to account for a few years of bit rot?

Or is copying everything to a new, larger hard drive every year, and keeping a few years of drives, still the best choice?

Edit: for personal use, for sure! I think BD-R goes up to 100 GB.

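On the par-file idea: the usual tool is par2cmdline, where the amount of recovery data is chosen as a percentage at creation time. A sketch with placeholder filenames:

    # Create ~15% recovery data before burning the set to BD-R
    par2 create -r15 photos-2022.par2 *.jpg
    # Years later: check for rot, then rebuild damaged files from recovery blocks
    par2 verify photos-2022.par2
    par2 repair photos-2022.par2
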
Havoc, over 2 years ago
Hetzner storage boxes are also worth a look, thanks to their colourful range of connection options: Borg, restic, rclone, SFTP, etc.

GRBurst, over 2 years ago
Some questions that weren't answered in the readme:

- What type of encryption is used, exactly? Is it a simple encfs, which leaks some metadata, or is a container created, or something else?
- Is it a full-snapshot backup, or does it work incrementally?

traceroute66, over 2 years ago
Aside from the general concept issues that others are addressing (e.g. "what does 99.999999999% durability mean exactly"), there's also code smell.

I had a quick glance through and couldn't help noticing the stench of assumptions and poor (or non-existent) exception handling.

xani_, over 2 years ago
Is there any good OSS solution that supports multiple servers and modern storage targets?

There is Bareos/Bacula, but that just pretends everything is a tape, and it generally works badly and quirkily because of that.

19h, over 2 years ago
Try Wasabi! It's amazingly affordable at $5.99 per TB/month, WITHOUT fees for egress or API requests. I'm storing all my backups there, and they are no longer single-DC.

kosolam, over 2 years ago
Cheap to archive, but very expensive to get data back out of AWS. OMG.

dolmen, over 2 years ago
How would you preserve an encryption key for such a backup outside of the digital world, for personal use, and also make it possible to unlock the backup after your death?

mattbillenstein, over 2 years ago
Hmm, I just ship the files directly to Glacier using the AWS CLI: aws s3 sync /foo/bar s3://<bucket>/bar/

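Worth noting: aws s3 sync uploads to the Standard storage class by default, so a command like that only lands in Deep Archive if a bucket lifecycle rule moves it there later. The CLI can also target Deep Archive directly:

    aws s3 sync /foo/bar s3://<bucket>/bar/ --storage-class DEEP_ARCHIVE
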
kbumsik, over 2 years ago
What is the maximum size of each .tar.zstd.ssl archive?

We have a 40 TB store with 100 million files; what would the expected number of archive files be?

vlovich123, over 2 years ago
Uploads are not free as described in the project, I think. Unless I'm misreading, AWS's pricing page shows about $0.05 per 1,000 files.

Markoff, over 2 years ago
Why not mention this straight in your post:

"Restore and download is quite costly:

Restore from S3 tape to S3 blob: $0.0025/GiB ($2.56/TiB) for Bulk within 48 hours, or $0.02/GiB ($20.48/TiB) for Standard within 12 hours. Download: the first 100 GiB/month are free, then up to 10 TiB/month at $0.09 per GiB ($92.16/TiB), with discounts beyond that."

TL;DR: if I read it correctly, $2.56 + $92.16 ≈ $95 to get your 1 TiB back home.

Not that bad, but I feel like buying a 1 TB drive for ~$50 every half year and just storing it somewhere outside your home would be the cheaper option. It depends how often you need to perform a backup, though.

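For reference, restoring is a two-step process: first issue a restore request (Bulk being the 48-hour, $2.56/TiB tier quoted above), then download once the object is available. A sketch with placeholder bucket and key names:

    aws s3api restore-object \
      --bucket my-backup-bucket \
      --key backups/tank-data.zfs.zst \
      --restore-request '{"Days": 7, "GlacierJobParameters": {"Tier": "Bulk"}}'
    # Poll until the Restore header reports completion, then download as usual
    aws s3api head-object --bucket my-backup-bucket --key backups/tank-data.zfs.zst
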
Crontab, over 2 years ago
Glacier is nice, but due to the cost of data retrieval, people should treat it as the restore of last resort.

synergy20, over 2 years ago
For deep archiving, the major question for me is whether the tool can do client-side encryption easily. I've never understood server-side encryption: if you put your keys there, the server has both the key and the content, and of course it can in theory decrypt things at will.

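Client-side encryption composes naturally with the streaming approach: encrypt before the bytes leave the machine, so the provider only ever holds ciphertext. A sketch with placeholder names (the gpg recipient is hypothetical):

    zfs send tank/data@offsite-2022-09 \
      | zstd \
      | gpg --encrypt --recipient backup@example.com \
      | aws s3 cp - s3://my-backup-bucket/tank-data.zfs.zst.gpg \
          --storage-class DEEP_ARCHIVE --expected-size 1000000000000
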
croes, over 2 years ago
The question is what the costs will be when you need to retrieve the data.

The prices might be higher next year.

FounderBurr, over 2 years ago
Put all the data you want in; pay ungodly amounts to get it out, though.

crest, over 2 years ago
Low cost, until you need a (partial) restore...