Hi HN, asking here because the SEC is not getting back to me.<p>I recently discovered that the SEC has a RESTful endpoint for company filing data. You can, for free, query extremely fine-grained financial data about any company that files with the SEC[1]. For instance, this link gives you Berkshire Hathaway's basic shares outstanding:<p>https://data.sec.gov/api/xbrl/companyconcept/CIK0001067983/us-gaap/WeightedAverageNumberOfSharesOutstandingBasic.json<p>Unfortunately, as far as I can tell, the data is incomplete. The link above returned JSON that only covers up to the end of 2015, nothing more recent. If you go to the SEC website you can access more recent Berkshire Hathaway data (including basic shares outstanding); their latest filing was last week.<p>So, what gives? Is the API faulty? Does this explain why all the Python and R libs still use HTML traversal for scraping SEC data?<p>[1] https://www.sec.gov/edgar/sec-api-documentation
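For anyone who wants to reproduce the check, here is a minimal Python sketch of how one might build the URL from the post and find the most recent period the endpoint reports. The helper names (`concept_url`, `fetch_concept`, `latest_period_end`) are made up for illustration. It also assumes the response keeps its facts under a top-level `units` key, each with an ISO-formatted `end` date, and that the SEC expects a descriptive `User-Agent` header on automated requests; treat both as assumptions to verify against the documentation in [1].

```python
import json
import urllib.request

# Base path taken from the URL in the post.
BASE = "https://data.sec.gov/api/xbrl/companyconcept"


def concept_url(cik: int, taxonomy: str, tag: str) -> str:
    """Build the companyconcept URL; the CIK is zero-padded to 10 digits."""
    return f"{BASE}/CIK{cik:010d}/{taxonomy}/{tag}.json"


def fetch_concept(cik: int, taxonomy: str, tag: str, user_agent: str) -> dict:
    """Download and decode one concept (hypothetical helper).

    Assumption: a descriptive User-Agent (e.g. name and contact email)
    is needed for automated access.
    """
    req = urllib.request.Request(
        concept_url(cik, taxonomy, tag),
        headers={"User-Agent": user_agent},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def latest_period_end(payload: dict):
    """Return the latest 'end' date across all units, or None if there is none.

    Assumption: facts live under payload["units"][<unit>] as a list of
    dicts with an ISO 'end' date, so string comparison orders them by date.
    """
    ends = [
        fact["end"]
        for facts in payload.get("units", {}).values()
        for fact in facts
        if "end" in fact
    ]
    return max(ends) if ends else None
```

Calling `latest_period_end(fetch_concept(1067983, "us-gaap", "WeightedAverageNumberOfSharesOutstandingBasic", "your-name your-email"))` would then show whether the series really stops at the end of 2015 for this tag.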
I am by no means an expert, but this matches my experience dabbling with XBRL. It seems like a giant time sink that would require too much curation to be useful.