We are a bootstrapped team building tools for data extraction. We are currently focusing on semi-structured data, which can be extracted without deep-learning-based software. So if there is some data that you need (and are willing to pay for) but that is not readily available, we might be able to help. We are looking for different types of datasets that are actually useful to people, so that we can work towards a general-purpose extraction tool. If you have such a dataset in mind, do let us know. Also, if you could share a website where we could find a semi-structured version of the dataset you need, that would be really helpful.
Customer counts for ISPs worldwide, by ASN. For some economies there are approximations from federal regulatory filings and stock exchange notices. There are yearly broadband numbers, but they are pretty hazy. I went to a meeting where China declared that 160m extra online users had been found that year.<p>A huge amount of internet modelling and sampling would improve at scale if we knew this. I've discussed this with researchers in the field. Akamai, Google and Facebook have private information, which is their secret sauce.