NASA to launch 247 petabytes of data into AWS, but forgot about egress costs

282 points by nobita about 5 years ago

37 comments

slowhand09 about 5 years ago
Wow! I worked on EOSDIS in 93-96. We estimated 16 petabytes; at the time it would have been one of the world&#x27;s largest databases. We changed horses midstream, moving our user interfaces from X-windows Motif to WWW, and built a very early Oracle DB accessible via WWW. There was no cloud then except missions studying atmospheric water vapor. When this was originally designed there were to be several (6-7) DAACs - Distributed Active Archive Centers (<a href="https:&#x2F;&#x2F;earthdata.nasa.gov&#x2F;eosdis&#x2F;daacs" rel="nofollow">https:&#x2F;&#x2F;earthdata.nasa.gov&#x2F;eosdis&#x2F;daacs</a>) to store data near where it was needed or captured. Now they have 12 and are storing on AWS. Amazon didn&#x27;t exist when this was originally built.
anthonylukach about 5 years ago
This article seems short-sighted.<p>1. Using the AWS cost calculator is pointless; an entity the size of NASA would naturally get heavily discounted rates.<p>2. As data volume grows, so does the complexity of working with that data. NASA appears to be embracing cloud computing by adopting a paradigm where scientists push computation to where the data rests rather than downloading it [1], [2], [3], thereby paying egress only on the higher-order data products.<p>3. The report notes that NASA has tooling to rate-limit and throttle access to data. This, in itself, proves that NASA didn&#x27;t &quot;[forget] about eye-watering cloudy egress costs before lift-off&quot;.<p>People may scream about vendor lock-in, which is a fair complaint, but acting like NASA just didn&#x27;t think about egress is misleading.<p>NASA is ultimately a science institution; I think diverting effort away from infrastructure management and towards studying data is likely a wise decision.<p>[1: <a href="https:&#x2F;&#x2F;www.hec.nasa.gov&#x2F;news&#x2F;features&#x2F;2018&#x2F;cloud_computing_services.html" rel="nofollow">https:&#x2F;&#x2F;www.hec.nasa.gov&#x2F;news&#x2F;features&#x2F;2018&#x2F;cloud_computing_...</a>] [2: <a href="https:&#x2F;&#x2F;link.springer.com&#x2F;article&#x2F;10.1007&#x2F;s10712-019-09541-z" rel="nofollow">https:&#x2F;&#x2F;link.springer.com&#x2F;article&#x2F;10.1007&#x2F;s10712-019-09541-z</a>] [3: <a href="https:&#x2F;&#x2F;ui.adsabs.harvard.edu&#x2F;abs&#x2F;2017AGUFMIN21F..02P&#x2F;abstract" rel="nofollow">https:&#x2F;&#x2F;ui.adsabs.harvard.edu&#x2F;abs&#x2F;2017AGUFMIN21F..02P&#x2F;abstra...</a>]
Dunedan about 5 years ago
&gt; “However, when end users download data from Earthdata Cloud, the agency, not the user, will be charged every time data is egressed.<p>Not necessarily, depending on how the users access the data. If users access the data through their own AWS accounts, NASA could leverage S3&#x27;s &quot;Requester Pays&quot; feature [1], to let the user pay for downloading the data.<p>1: <a href="https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonS3&#x2F;latest&#x2F;dev&#x2F;RequesterPaysBuckets.html" rel="nofollow">https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonS3&#x2F;latest&#x2F;dev&#x2F;RequesterPay...</a>
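As a rough illustration of the Requester Pays point above, here is a minimal sketch of how an egress bill splits between the bucket owner and users who download through their own AWS accounts. The flat per-GB price, the `Download` record and the `split_egress_bill` helper are all hypothetical names and numbers chosen for this sketch; real AWS egress pricing is tiered and negotiable, and this does not call any AWS API.

```python
from dataclasses import dataclass

# Assumption: one flat egress price in USD per GB. Real pricing is tiered.
ASSUMED_EGRESS_USD_PER_GB = 0.09


@dataclass
class Download:
    gigabytes: float
    requester_pays: bool  # True if the user fetched the data via their own AWS account


def split_egress_bill(downloads, usd_per_gb=ASSUMED_EGRESS_USD_PER_GB):
    """Return (bucket_owner_usd, requesters_usd) for a list of downloads."""
    owner_usd = 0.0
    requesters_usd = 0.0
    for d in downloads:
        cost = d.gigabytes * usd_per_gb
        if d.requester_pays:
            requesters_usd += cost  # charged to the downloading account
        else:
            owner_usd += cost  # charged to the bucket owner
    return owner_usd, requesters_usd
```

The split only helps for users who have, and are willing to use, their own AWS account, since Requester Pays requests must be authenticated; anonymous downloads would still land on the owner's bill.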
djrogers about 5 years ago
I&#x27;m not saying this won&#x27;t be a financial cluster - it likely will cost many times more than planned - but the headline here is just a flat-out lie.<p>TFA says:<p>&quot;a March audit report [PDF] from NASA&#x27;s Inspector General noticed EOSDIS hadn’t properly modeled what data egress charges would do to its cloudy plan.&quot;<p>&#x27;Hadn&#x27;t properly modeled&#x27; is very different from &#x27;forgot about&#x27;. And if you actually read the linked report, it says things like:<p>&quot;ESDIS officials said they plan to educate end users on accessing data stored in the cloud, including providing tools to enable them to process the data in the cloud to avoid egress charges.&quot; and &quot;To mitigate the challenges associated with potential high egress costs when end-users access data, ESDIS plans to monitor such access and “throttle” back access to the data&quot;<p>Neither of those statements would be <i>in the audit</i> if the entire topic had been a surprise.
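The audit quote above mentions plans to "throttle" access. The report does not say how ESDIS would do that, so the following is only a generic token-bucket sketch of byte-rate throttling, not NASA's mechanism. `TokenBucket`, its parameters and the injectable `clock` are names invented for this example.

```python
import time


class TokenBucket:
    """Allow up to `burst_bytes` at once, refilled at `rate_bytes_per_s`."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self, nbytes):
        """Return True and spend tokens if `nbytes` may be served now."""
        now = self.clock()
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A download front end could keep one bucket per user or per dataset and delay or reject requests when `allow` returns False, which caps the monthly egress bill at a known ceiling.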
unhammer about 5 years ago
<p><pre><code>YOU ARE NOT AFRAID?
&#x27;Not yet. But, er...which way to the egress, please?&#x27;
There was a pause. Then Death said, in a puzzled voice:
ISN&#x27;T THAT A FEMALE EAGLE?
</code></pre> I&#x27;ve been reading A Hat Full of Sky to my daughter these days, and there&#x27;s a running joke that &quot;supposedly intelligent people&quot; don&#x27;t know the meaning of the word &quot;egress&quot;, mixing it up with things like egret, ogress or eagles.<p>(See also the inspiration for the joke: <a href="https:&#x2F;&#x2F;unrealfacts.com&#x2F;pt-barnum-would-trick-people-with-a-this-way-to-egress-sign&#x2F;" rel="nofollow">https:&#x2F;&#x2F;unrealfacts.com&#x2F;pt-barnum-would-trick-people-with-a-...</a> )
ghostpepper about 5 years ago
There&#x27;s a joke around here somewhere about AWS pricing being too difficult even for rocket scientists.
movedx about 5 years ago
It&#x27;s The Register, people. Don&#x27;t take it seriously. It&#x27;s practically The Onion of the IT industry, especially the comments sections.<p>I&#x27;ve written two articles for them and the comments are a joke. They&#x27;re all anti-Cloud, anti-progressive. Try selling them Kubernetes as a solution to their problems: they&#x27;ll think you&#x27;ve come to steal their children. I know, I&#x27;ve tried.<p>In short: this never happened. NASA didn&#x27;t forget anything. It does, however, make for a great eye-catching headline!<p>Sorry to be bitter about this, but publications like The Register serve little purpose these days. They cater to a specific kind of IT personality that can&#x27;t let go of physical tin and thinks public Cloud has no place or use at all. Again, I know, I&#x27;ve tried convincing these people of such things.
pixelbath about 5 years ago
Unless my numbers are <i>way</i> off, I got around $15.5 million per year using Backblaze&#x27;s calculator: <a href="https:&#x2F;&#x2F;www.backblaze.com&#x2F;b2&#x2F;cloud-storage-pricing.html" rel="nofollow">https:&#x2F;&#x2F;www.backblaze.com&#x2F;b2&#x2F;cloud-storage-pricing.html</a><p>Numbers used:<p><pre><code>Initial upload: 258998272 GB (1024*1024*247)
Monthly upload: 100 GB (default)
Monthly delete: 5 GB (default)
Monthly download: 1048576 GB (1 PB)
Period of Time: 12 months (default)</code></pre>
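The estimate above can be roughly reproduced with a few lines of arithmetic. This is a sketch, not Backblaze's calculator: the two rates (0.005 USD per GB per month for storage, 0.01 USD per GB for download) are assumptions to be checked against the linked pricing page, and the small default monthly upload and delete figures are ignored. `annual_cost_usd` is a name made up for this example.

```python
GB_PER_PB = 1024 * 1024  # binary convention, as in the comment above


def annual_cost_usd(stored_gb, monthly_download_gb,
                    storage_usd_per_gb_month, download_usd_per_gb, months=12):
    """Storage plus download cost over `months`, with flat assumed rates."""
    storage = stored_gb * storage_usd_per_gb_month * months
    download = monthly_download_gb * download_usd_per_gb * months
    return storage + download


stored_gb = 247 * GB_PER_PB  # 258998272 GB
estimate = annual_cost_usd(stored_gb, 1 * GB_PER_PB,
                           storage_usd_per_gb_month=0.005,  # assumed rate
                           download_usd_per_gb=0.01)        # assumed rate
print(f"{stored_gb} GB stored, about ${estimate:,.0f} per year")
```

With these assumed rates the result lands in the same range as the roughly $15.5 million quoted above, and storage dominates: 1 PB of monthly download adds comparatively little at this price.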
ackbar03 about 5 years ago
Oh, but AWS didn&#x27;t forget. AWS never forgets.
NikolaeVarius about 5 years ago
Senator Shelby should get AWS to launch a new region in Alabama for NASA at this rate.
OzzyB about 5 years ago
Looks like even the big boys get bitten by the Cloud Meme when forgetting about bandwidth costs; glad I&#x27;m not the only one.
7777fps about 5 years ago
I assume the data accessed is a heavily skewed pareto distribution.<p>Given that, it&#x27;s maybe still cheaper to build their own serving &#x2F; caching layer in front to save egress costs than to have constructed the whole storage solution themselves.
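To get a feel for how much a caching layer could absorb under skewed access, here is a small sketch. It swaps the Pareto assumption in the comment above for a simpler Zipf-like model (request popularity proportional to 1 / rank**skew) and assumes equally sized objects; `cache_hit_fraction` and the `skew` parameter are illustrative names, and real access logs would be needed for a serious estimate.

```python
def cache_hit_fraction(num_objects, cached_objects, skew=1.0):
    """Fraction of requests served by a cache holding the most popular objects.

    Assumes request popularity falls off as 1 / rank**skew (Zipf-like)
    and that every object has the same size.
    """
    if not 0 <= cached_objects <= num_objects:
        raise ValueError("cached_objects must be between 0 and num_objects")
    weights = [1.0 / rank ** skew for rank in range(1, num_objects + 1)]
    return sum(weights[:cached_objects]) / sum(weights)


# Share of requests absorbed by caching the top 10% of 1,000 objects.
print(cache_hit_fraction(1000, 100))
```

Every request that hits such a cache is served without a fresh read from cloud storage, so the higher the skew, the smaller the cache needed to cut most of the egress.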
knorker about 5 years ago
This surely was entirely known to AWS, where they were rubbing their hands at the fact that every user of this data has to process it using EC2 on site.<p>This is Cloud lock-in using data location.
tehalex about 5 years ago
I wonder if this includes Direct Connect, or if they can use it? [1]<p>Cloud data transfers are too expensive; personally, I assume that it costs more to measure and bill for bandwidth than the usage itself...<p>1: <a href="https:&#x2F;&#x2F;aws.amazon.com&#x2F;directconnect&#x2F;" rel="nofollow">https:&#x2F;&#x2F;aws.amazon.com&#x2F;directconnect&#x2F;</a>
toomuchtodo about 5 years ago
Cue the cloud apologists that “it’s better to use the cloud than to build and manage your own infra”.<p>This is why you build and run your own storage, similar to Backblaze (who is almost entirely bootstrapped except for one reasonable round of investment).
yosito about 5 years ago
&gt; You don&#x27;t need to be a rocket scientist to learn about and understand data egress costs. Which left The Register wondering how an agency capable of sending stuff into orbit or making marvelously long-lived Mars rovers could also make such a dumb mistake.<p>I used to work very closely with this department at NASA. Without saying too much, the short answer is &quot;tenured government employees more concerned about job security than the success of the project&quot; is how an agency could make such dumb mistakes.
jka about 5 years ago
What&#x27;s the opposite of AWS Snowmobile[0]?<p>[0] - <a href="https:&#x2F;&#x2F;aws.amazon.com&#x2F;snowmobile&#x2F;" rel="nofollow">https:&#x2F;&#x2F;aws.amazon.com&#x2F;snowmobile&#x2F;</a>
Spooky23 about 5 years ago
Using AWS for this type of use case is dumb for an org as large as NASA, if cost savings is a goal. It&#x27;s cheaper to just land capacity at a datacenter.
julienchastang about 5 years ago
This article is misleading. The entire point is to not move data out of the cloud. Instead bring your computing (analysis, visualization) to the data and pay for compute cycles on AWS. If your workflows are short&#x2F;bursty, you will come out ahead. Moreover, you will be able to do big data-style computations that you cannot do in a local computing environment. This is bad journalism, IMO.
chx about 5 years ago
If you are facing similar problems, you should know traffic via Cloudflare from B2 is free. I am not 100% sure CF would be happy if NASA picked the CF free tier, but their quote would probably be orders of magnitude lower than Amazon&#x27;s.
X6S1x6Okd1st about 5 years ago
&gt; NASA also knows that a torrent of petabytes is on the way.<p>Oh that sounds like a potential solution.<p>&#x2F;s
gigatexal about 5 years ago
might be cheaper to spin up virtual workstations on AWS and use the data there
Havoc about 5 years ago
Can&#x27;t they just use the current DAACs as a caching layer? Seems like the least ugly way out of this mess.<p>Also - can&#x27;t they use torrent tech? I wouldn&#x27;t mind helping out a bit on space &amp; data
CKN23-ARIN about 5 years ago
Putting a dataset into AWS is a lot like putting a satellite into orbit. You still need to pay later to get it down, or to safely destroy it.
Wheaties466 about 5 years ago
At that point, why not just use a P2P-based system?
szczepano about 5 years ago
To sum up: no matter how big the hard drives or data centers we build, we will always have a problem with storage capacity.
pontifier about 5 years ago
Cloud egress costs killed the business I&#x27;m now trying to save. I won&#x27;t fall into that trap.
ralusek about 5 years ago
I wonder why they wouldn&#x27;t use Wasabi:<p><a href="https:&#x2F;&#x2F;wasabi.com&#x2F;cloud-storage-pricing&#x2F;" rel="nofollow">https:&#x2F;&#x2F;wasabi.com&#x2F;cloud-storage-pricing&#x2F;</a><p>Looks like egress is free.<p>Maybe because it&#x27;s comparably untested? Does anyone here have any experience with it?
api about 5 years ago
This is exactly why the costs are set up that way. The first time I saw AWS pricing I chuckled and thought &quot;roach motel.&quot; Data goes in but it doesn&#x27;t come out. It&#x27;s one of many soft lock-in mechanisms cloud hosts use.
tzm about 5 years ago
$5,439,526.92 per month
turdnagel about 5 years ago
Requester pays!
Mave83 about 5 years ago
Just build your own storage and save an incredible amount.<p>You might think it&#x27;s hard, but it&#x27;s not. croit.io provides all you need to deploy a scalable cluster, even across multiple geographic regions.<p>A 1 PB cluster, including everything from rack to hardware to license to labor, comes in below 3€&#x2F;TB&#x2F;month: roughly the Amazon Glacier price tag, but with S3-IA-style access.
oh_hello about 5 years ago
&quot;The audit, meanwhile, suggests an increased cloud spend of around $30m a year by 2025&quot;<p>Isn&#x27;t this a rounding error for NASA?
mensetmanusman about 5 years ago
This seems like a good use of torrenting?
beastman82 about 5 years ago
Torrent FTW
vnchr about 5 years ago
Cloud VERSUS Space. Who will come out on top?
ph2082 about 5 years ago
1 terabyte of hard disk costs ~50 USD.<p>247 petabytes ~ 247,000 terabytes, so roughly 12 million USD in raw disks.<p>Network cards, bandwidth, electricity cost: I can&#x27;t guess.<p>A couple of good engineers (hardware and software ones), which they definitely have.<p>Maybe they could have built their own cloud for &lt; ~10-15 million USD. And that wouldn&#x27;t be a recurring cost.<p>Maybe they missed the article about Bank of America saving ~2 billion USD by building its own cloud.
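The raw-disk figure in the comment above is easy to check. The sketch below uses the comment's own ~50 USD per terabyte and decimal units (1 PB = 1000 TB); the `copies` parameter is an added assumption to show what keeping redundant copies does to the total, and `raw_disk_cost_usd` is a name invented for this example.

```python
def raw_disk_cost_usd(petabytes, usd_per_tb=50.0, tb_per_pb=1000, copies=1):
    """Purchase cost of bare drives only: no servers, power, network or staff."""
    return petabytes * tb_per_pb * usd_per_tb * copies


print(raw_disk_cost_usd(247))            # 12350000.0
print(raw_disk_cost_usd(247, copies=3))  # 37050000.0
```

So the drives alone already sit in the 10-15 million USD range before any redundancy, hardware to host them, or replacement of failed disks is counted.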