Foursquare dataset free to download and analyze

113 points by rsobers, over 11 years ago

12 comments

sneak, over 11 years ago
Obligatory:
https://en.wikipedia.org/wiki/AOL_search_data_leak
http://techcrunch.com/2006/08/06/aol-proudly-releases-massive-amounts-of-user-search-data/
http://www.nytimes.com/2006/08/09/technology/09aol.html?pagewanted=all
TL;DR: It's fairly easy to deanonymize datasets like this, provided they are somewhat complete.
danso, over 11 years ago
Jesus Christ. The bulk scraping in violation of the TOS is egregious enough, but redistributing it with a mandate that the researchers get credit? For what, scraping a generous public API?
nicholassmith, over 11 years ago
It doesn't look like Foursquare handed that over. What's the legality of scraping a service for its data in this way?
galapago, over 11 years ago
http://webcache.googleusercontent.com/search?q=cache:hLI5FqDixY8J:www-users.cs.umn.edu/~sarwat/foursquaredata/+&cd=1&hl=en&ct=clnk
(The direct link is not working, but this confirms that it was freely available.)
boothead, over 11 years ago
No mention of the data format. Is it JSON, CSV, or what? I know you can always head -n the file, but a little hint would be helpful!
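A minimal sketch of the "head -n" check mentioned above, assuming the download unpacks to a plain-text file. The function names `peek` and `guess_format` and the path `checkins.dat` are hypothetical placeholders, not names taken from the dataset, and the format guess is only a rough heuristic based on the first non-empty line.

```python
from itertools import islice


def peek(path, n=5):
    """Return the first n lines of a text file without reading the rest."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


def guess_format(lines):
    """Guess a format label from the first non-empty line (heuristic only)."""
    first = next((line.strip() for line in lines if line.strip()), "")
    if first.startswith(("{", "[")):
        return "json-like"
    # Check common delimiters in a fixed order; the first one found wins.
    for delimiter, label in ((",", "csv-like"), ("\t", "tsv-like"), ("|", "pipe-delimited")):
        if delimiter in first:
            return label
    return "unknown"


# Hypothetical usage, once a file has actually been extracted:
#   sample = peek("checkins.dat", n=5)
#   print(guess_format(sample))
```

Reading only the first few lines keeps the check cheap even when the file is very large.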
interskh, over 11 years ago
> This data set contains 2153471 users, 1143092 venues, 1021970 check-ins, 27098490 social connections, and 2809581 ratings that users assigned to venues
The number of check-ins seems low compared to the other numbers.
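To put that observation in numbers, here is a small sketch that divides the quoted counts by one another. The counts come straight from the quoted description; the dictionary keys and the helper `per` are illustrative names only.

```python
# Counts quoted from the dataset description in the comment above.
counts = {
    "users": 2153471,
    "venues": 1143092,
    "check_ins": 1021970,
    "social_connections": 27098490,
    "ratings": 2809581,
}


def per(numerator, denominator):
    """Average number of `numerator` items per `denominator` item."""
    return counts[numerator] / counts[denominator]


ratios = {
    "check-ins per user": per("check_ins", "users"),
    "check-ins per venue": per("check_ins", "venues"),
    "ratings per user": per("ratings", "users"),
    "social connections per user": per("social_connections", "users"),
}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
```

On these figures there is less than one check-in per user and per venue, while ratings and social connections both average more than one per user, which matches the impression that the check-in count is the odd one out.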
davidmat, over 11 years ago
Could anyone recommend some solid introductory material on data analysis/data visualisation?
I'm thinking this data set seems like a fun way to fill a rainy weekend, going for a dive into these worlds :)
m4tthumphrey, over 11 years ago
Looks like it's been removed. Damn.
Edit: Not removed, just inaccessible. 403.
renownedmedia, over 11 years ago
Looks like the data is only up-to-date as of July 2012 (judging from the zip compression times).
xntrk, over 11 years ago
Sounded too good to be true. I guess we'll have to find it on BitTorrent.
rajbala, over 11 years ago
The data set has been removed?
waynesonfire, over 11 years ago
Why was this not posted as a torrent?