How To Get Experience Working With Large Datasets

36 points by m3mb3r, over 14 years ago

6 comments

drblast, over 14 years ago
US Census data is multiple gigabytes, and well documented. If you want to run a database through its paces beyond the point where all the data fits in memory, that's a good place to start.
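A minimal sketch of that idea, assuming the data arrives as a CSV file with a header row: stream it into SQLite in fixed-size batches so the whole file never has to sit in memory. The function name `load_csv_in_batches`, the table name `records`, and the batch size are illustrative choices, not anything from the Census documentation.

```python
import csv
import io
import sqlite3
from itertools import islice


def load_csv_in_batches(lines, conn, table="records", batch_size=50_000):
    """Stream CSV lines into SQLite one batch at a time.

    `lines` is any iterable of text lines (an open file works).
    Assumes the first row is a header with simple column names.
    Returns the number of data rows inserted.
    """
    reader = csv.reader(lines)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')

    total = 0
    while True:
        # Pull at most batch_size rows; only this batch is held in memory.
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', batch
        )
        conn.commit()
        total += len(batch)
    return total


# Tiny in-memory demonstration with made-up rows (not real Census data).
demo_conn = sqlite3.connect(":memory:")
demo_lines = io.StringIO("region,population\nA,100\nB,200\nC,300\n")
inserted = load_csv_in_batches(demo_lines, demo_conn, batch_size=2)
```

For a real run, pass a file opened with `open(path, newline="")` and a connection to an on-disk database file; the query workload you then throw at it is where the "beyond memory" behavior shows up.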
wrath, over 14 years ago
One nice and free dataset you can play with is the BestBuy open data. You can download the full BestBuy product catalog in JSON and XML format from http://developer.bestbuy.com. Simply register for a key and you'll have access to the data.
andrewjshults, over 14 years ago
Along the same lines, NYC's Big Apps 2.0 competition is going on right now (http://nycbigapps.com/). Not affiliated, but I went to NYTM last year where they demoed the winners, and there are some interesting (and impressively large) datasets to play with. One of my favorites was the mobile app CabSense, which crunched the TLC data to determine the best corners to catch a cab on, depending on the time of day.
fmw, over 14 years ago
They might be relatively small, but http://www.grouplens.org/node/12 has some interesting datasets that can be used to experiment with recommendation systems, e.g. book and movie reviews.
ashtophoenix, over 14 years ago
What a silly article. When it said "how to get experience working with large datasets," I expected it to explain more about storage, scalability, design, and caching issues. There are myriad ways to get (or generate) data to play with...
earl, over 14 years ago
What's with the recent fetishization of Big Data? I'm moving to Dziuba's camp -- it's a developer dick size contest.