Ask HN: Where to find open data sets to play around with to learn ML?

4 points by imperio59, about 3 years ago
I'm learning ML and looking for more open datasets that I can use, especially in the area of recommender/ranker systems.

I'm already familiar with Kaggle, but I'm wondering what else is out there.

2 comments

JoeyBananas, about 3 years ago
In a few applications of ML that I've worked with, there is no need for an outside dataset because the program generates its own data. For example, the data could come from a simulation of some process.
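
To make the "program generates its own data" idea concrete for the recommender/ranker case asked about, here is a minimal sketch, not something from the comment itself. It assumes a toy latent-factor model in which each user and item gets a random vector and the click probability is a sigmoid of their dot product. The function name `simulate_interactions` and all of its parameters are hypothetical, chosen only for illustration.

```python
import math
import random


def simulate_interactions(n_users=50, n_items=20, n_events=1000, dim=3, seed=0):
    """Generate synthetic (user_id, item_id, clicked) rows from a toy model.

    Assumption: click probability = sigmoid(user_vector . item_vector).
    This is an illustrative simulation, not a model of any real service.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible per seed

    # Hidden "taste" vectors for users and "attribute" vectors for items.
    users = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_users)]
    items = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_items)]

    rows = []
    for _ in range(n_events):
        u = rng.randrange(n_users)
        i = rng.randrange(n_items)
        affinity = sum(a * b for a, b in zip(users[u], items[i]))
        p_click = 1.0 / (1.0 + math.exp(-affinity))
        clicked = 1 if rng.random() < p_click else 0
        rows.append((u, i, clicked))
    return rows


if __name__ == "__main__":
    data = simulate_interactions()
    print(len(data), "rows; first five:", data[:5])
```

Because the hidden vectors are known to the simulation, a model trained on these rows can be checked against the ground truth that produced them, which is harder to do with a real-world dataset.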
mindcrime, about 3 years ago
https://datasets.reddit.com

https://opendata.reddit.com

https://archive.ics.uci.edu/ml/datasets.php

https://lod-cloud.net/

https://www.data.gov

https://data.un.org/

https://data.worldbank.org/

https://fred.stlouisfed.org/

https://data.oecd.org/

https://www.nber.org/research/data?page=1&perPage=50

https://github.com/awesomedata/awesome-public-datasets

https://github.com/datasets

https://opendata.cern.ch/

https://data.nasa.gov/

https://data.world/datasets/machine-learning

https://data.noaa.gov/datasetsearch/

https://www.usgs.gov/products/data

https://www.fema.gov/about/openfema/data-sets

etc...

And of course don't ignore the data you can collect yourself one way or another. With a few cheap Arduino Nano or Raspberry Pi Pico boards and some sensors, you can build quite a variety of distributed data collection systems. Add solar panels for power in remote areas and 4G / cellular data networks, and you can get data from all over the place. You can also use a cheap SDR "dongle" to pull down data from various weather satellites and other sources. And don't forget about the APIs / data export mechanisms for apps you might use like Fitbit, Strava, MapMyRun, etc.