Cutting cost and power consumption for big data

8 points by tpatke almost 10 years ago

1 comment

jcr almost 10 years ago
The (unmentioned) title of the paper is "BlueDBM: an appliance for big data analytics"

Abstract:

> "Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data and daily twitter feeds where the datasets of interest are 5TB to 20 TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to 256GBs of DRAM, to accommodate all the data in DRAM. On the other hand, such datasets could be stored easily in the flash memory of a rack-sized cluster. Flash storage has much better random access performance than hard disks, which makes it desirable for analytics workloads. In this paper we present BlueDBM, a new system architecture which has flash-based storage with in-store processing capability and a low-latency high-throughput inter-controller network. We show that BlueDBM outperforms a flash-based system without these features by a factor of 10 for some important applications. While the performance of a ram-cloud system falls sharply even if only 5%~10% of the references are to the secondary storage, this sharp performance degradation is not an issue in BlueDBM. BlueDBM presents an attractive point in the cost-performance trade-off for Big Data analytics."

http://people.csail.mit.edu/wjun/papers/ISCA15_Sang-Woo_Jun.pdf
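
For anyone wanting to sanity-check the abstract's sizing claim (100 servers with 128GB to 256GB of DRAM each for a 5TB to 20TB dataset), here is a rough back-of-the-envelope sketch. It is not from the paper: the function names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB) with no allowance for OS overhead, indexes, or replication.

```python
import math

GB_PER_TB = 1000  # assumption: decimal units, as storage sizes are usually quoted


def cluster_dram_tb(servers: int, dram_gb_per_server: int) -> float:
    """Total DRAM across the cluster, in TB (hypothetical helper)."""
    return servers * dram_gb_per_server / GB_PER_TB


def servers_needed(dataset_tb: float, dram_gb_per_server: int) -> int:
    """Minimum number of servers whose combined DRAM holds the dataset.

    Ignores overhead and replication, so real deployments would need more.
    """
    return math.ceil(dataset_tb * GB_PER_TB / dram_gb_per_server)


if __name__ == "__main__":
    # Capacity of the 100-server cluster mentioned in the abstract.
    print(cluster_dram_tb(100, 128))  # 12.8
    print(cluster_dram_tb(100, 256))  # 25.6

    # Servers required for the largest dataset size quoted (20 TB).
    print(servers_needed(20, 128))    # 157
    print(servers_needed(20, 256))    # 79
```

Under these assumptions, 100 servers give roughly 12.8TB to 25.6TB of DRAM, so the "100 servers" figure is the right order of magnitude for the upper end of the quoted 5TB to 20TB range.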