Wow, we were just talking about selling shovels in a (ML) gold rush.<p>Incidentally, what is the open source alternative to this? Data is so cheap that it should actually be free, unlike counterfeit Nike shoes.<p>(Does a BitTorrent tracker specifically for research data exist? Edit: there's <a href="http://academictorrents.com/" rel="nofollow">http://academictorrents.com/</a>)
This is nice, but I don't see the pricing after the free trials. The Pitney Bowes data [1] they used as an example in the linked article only shows $0 for the free trial, not what it's going to cost you afterwards. It'd be nice to know the long-term cost before tying this data into your business.<p>[1] <a href="https://aws.amazon.com/marketplace/pp/prodview-bwf7mapyyjzom?qid=1573682854054&sr=0-1&ref_=srh_res_product_title" rel="nofollow">https://aws.amazon.com/marketplace/pp/prodview-bwf7mapyyjzom...</a>
It looks like this is targeted at ML/AI, but I have a tangentially related question: does anyone know of open source or other publicly available lists of US businesses? Just business name and address?<p>I’m building out an app, and we receive documents from all kinds of vendors from all over the country. The app is for our business to manage our client data. I was hoping to find a list of businesses I could throw in the db rather than adding the addresses piecemeal as the documents come in.<p>I looked at some of the data service providers (infoUsa, I think, was one and D&B another), but they were asking $50,000 for a single dataset of just business names and addresses. I think my use case is unusual in that these companies typically sell this data as sales leads, which is definitely not what I need it for (we don’t even sell B2B).<p>Anyone know of anything like this? I suppose I could just scrape phone books, but if I can’t find the data I think we will just resort to one-by-one entry.
There are many free public datasets available on the web.<p>I have an open source project that crawls public datasets and makes them searchable in one place: <a href="https://github.com/findopendata/findopendata" rel="nofollow">https://github.com/findopendata/findopendata</a>.
They're crowd-sourcing valuable information services from third parties to become a market data provider.<p>What could go wrong for information providers when Amazon controls both their market and their infrastructure? They become commoditized "data providers". They are coerced into profit sharing with Amazon. They are eventually replaced by Amazon-provided data.<p>I won't buy from this market because I see where this is heading. It's the same reasoning I apply to not buying many other services and products from Amazon. It offers the customer nothing beyond minor convenience, at a much greater cost to the economy and to providers.<p>Buying local isn't just for produce.
How does a data provider prevent someone from copying the data from their S3 bucket into a new one, then cancelling the subscription and owning the data forever?
Did anyone else find Jeff’s first sentence terribly unoriginal and somewhat wimpy?<p>“We live in a data-intensive, data-driven world!”<p>I know these blog posts are turned out fast, but especially for such a sensitive issue as a world awash in data that no one understands and no one - yet - controls...it seemed like it was “whistling past the graveyard”.