
1.7 petabytes and 850M files lost, and how we survived it

92 points by beck5, about 9 years ago

7 comments

zimpenfish, about 9 years ago
"The directory is intended for temporary storage of results before staging them into a more permanent location [...] During the three years that the filesystem has been in operation, it has accumulated 1.7 Petabytes of data in 850 million objects."

There needs to be some law about how temporary directories always end up containing vitally important data.
hga, about 9 years ago
Lots of fun; while backing up the filesystem prior to wiping and rebuilding it, they ran out of IOPS to do it in a reasonable time frame, so after considering other options:

"One obvious solution would be to use a ramdisk, a virtual disk that actually resides in the memory of a node. The problem was that even our biggest system had 1.5TB of memory while we needed at least 3TB.

As a workaround we created ramdisks on a number of Taito cluster compute nodes, mounted them via iSCSI over the high-speed InfiniBand network to a server and pooled them together to make a sufficiently large filesystem for our needs."

A hack they weren't at all sure would work, but it did nicely.
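A rough sketch of the capacity math behind that workaround: the 3 TB requirement and the 1.5 TB single-node ceiling come from the quoted passage, while the per-node ramdisk size and the pooling overhead are purely illustrative assumptions.

```python
import math

# Back-of-the-envelope for pooling iSCSI-exported ramdisks into one volume.
# TARGET_TB and LARGEST_SINGLE_NODE_TB are from the article; the rest is assumed.
TARGET_TB = 3.0               # capacity needed for the copy (from the article)
LARGEST_SINGLE_NODE_TB = 1.5  # biggest single machine's RAM (from the article)
RAMDISK_PER_NODE_TB = 0.25    # assumed usable ramdisk per Taito compute node
POOLING_OVERHEAD = 0.10       # assumed loss to filesystem/striping overhead

usable_per_node = RAMDISK_PER_NODE_TB * (1 - POOLING_OVERHEAD)
nodes_needed = math.ceil(TARGET_TB / usable_per_node)

print(f"No single node is big enough: {LARGEST_SINGLE_NODE_TB} TB < {TARGET_TB} TB")
print(f"Pooling ~{nodes_needed} ramdisks of {RAMDISK_PER_NODE_TB} TB each "
      f"covers the {TARGET_TB} TB target")
```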
ghubbard, about 9 years ago
Current HN title: 1.7 petabytes and 850M files lost, and how we survived it.

Article title: The largest unplanned outage in years and how we survived it

Article overview: A month ago CSC's high-performance computing services suffered the largest unplanned outage in years. In total approximately 1.7 petabytes and 850 million files were recovered.

Although technically correct, the HN title is misleading.
pinewurst, about 9 years ago
It should be noted that this is about a Lustre filesystem hosted on DDN hardware. It's unclear whether the failed controller contributed to the file system corruption, but Lustre is quite capable of accelerating local entropy all by itself. It was designed/spec'ed at LLNL as a huge-file, high-performance, short-term scratch/swap filesystem, and even after 15 years it isn't especially reliable or fit for use outside that domain.
gnufx, about 9 years ago
I'm surprised that the copying bottleneck seems to have been entirely at the target rather than the source. Is that because there were multiple copies of the source?

I've had to employ the horrible hack of iSCSI from compute nodes, raided and re-exported, but it's not what I'd have tried to use first. The article doesn't mention the possibility of just spinning up a parallel filesystem on compute node local disks (assuming they have disks); I wonder if that was ruled out. I don't have a good feeling for the numbers, but I'd have tried OrangeFS on a good number of nodes initially.

By the way, it's been pointed out that RAM disk is relatively slow, at least for data rates rather than metadata: http://mvapich.cse.ohio-state.edu/static/media/publications/slide/rajachan-hpdc13.pdf
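A minimal sketch of why a copy of this dataset can end up limited by metadata operations (IOPS) rather than raw bandwidth: the 1.7 PB and 850 million file figures come from the article, while the bandwidth, metadata IOPS, and ops-per-file numbers are illustrative assumptions.

```python
# Compare a bandwidth-limited vs. a metadata-limited estimate of the copy time.
# Dataset size and file count are from the article; everything else is assumed.
TOTAL_BYTES = 1.7e15   # 1.7 PB
FILE_COUNT = 850e6     # 850 million files

avg_file_mb = TOTAL_BYTES / FILE_COUNT / 1e6
print(f"average file size ≈ {avg_file_mb:.1f} MB")  # ≈ 2 MB, i.e. many small files

ASSUMED_BANDWIDTH_BYTES_S = 10e9   # 10 GB/s sustained copy bandwidth (assumption)
ASSUMED_METADATA_IOPS = 5_000      # open/create/close ops per second (assumption)
OPS_PER_FILE = 3                   # assumed metadata operations per file copied

days_bandwidth = TOTAL_BYTES / ASSUMED_BANDWIDTH_BYTES_S / 86400
days_metadata = FILE_COUNT * OPS_PER_FILE / ASSUMED_METADATA_IOPS / 86400

print(f"bandwidth-limited copy: ~{days_bandwidth:.1f} days")  # ~2 days
print(f"metadata-limited copy:  ~{days_metadata:.1f} days")   # ~6 days
```

With these assumed rates the per-file metadata work, not the data volume, dominates the copy time, which is consistent with the slide deck's point that RAM disk helps metadata far more than bulk data rates.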
ajford, about 9 years ago
Out of curiosity, why weren't they running the metadata drive in a mirrored RAID? If you have petabytes of data, wouldn't it make sense to spend the ~$100 for a second 3TB drive to mirror your metadata?

Or was the inode problem not a local disk problem but a problem in the Lustre fs? I couldn't quite tell from the article.
beezle, about 9 years ago
I bookmarked this for whenever I think I'm having a really bad day...