Ask HN: What dataset do you think is useful to you but not readily available

3 points by abhikandoi2000 almost 5 years ago
We are a bootstrapped team building tools for data extraction. We are currently focusing on semi-structured data, which can be extracted with software that does not rely on deep learning. If there is data that you need (and are willing to pay for) but that is not readily available, we might be able to help. We are looking for different types of datasets that are actually useful to people, so that we can work towards a general-purpose extraction tool. If you have such a dataset in mind, do let us know. Also, if you could share a website where we could find the semi-structured version of the dataset you need, it'd be really helpful.

2 comments

ggm almost 5 years ago
Customer counts for ISPs worldwide by ASN. There are approximations for some economies, from filings for federal regulations and stock exchange notices. There are yearly numbers for broadband, which are pretty hazy. I went to a meeting where China declared that 160m extra online users had been found that year.

A huge amount of internet modelling and sampling would improve at scale if we knew this. I've discussed this with researchers in the field. Akamai, Google and Facebook have private information which is their secret sauce.
steerpike almost 5 years ago
A clean dataset of YouTube music videos associated with MusicBrainz tags.