There is a surprising amount of open data, and I think digging through it would make a great hobby. Though maybe use a good VPN and fake creds: people have been murdered for this sort of thing.<p>Source: <a href="https://forbiddenstories.org/case/daphne-project-three-years-after/" rel="nofollow">https://forbiddenstories.org/case/daphne-project-three-years...</a><p>I can't find a link now, but some PhD students in South America (I think?) detected fraud fairly accurately just from certain clauses in business and government contracts.<p>It would also be interesting to take stats on higher incidence of certain diseases and track them against factory production/creation in the area. You could also cross-reference those against EPA fines.<p>You could then send the analysis to Forbidden Stories. Though remember they are journalists and not necessarily ML/GPT-3 experts.<p>Internet sleuths unite!
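For what it's worth, the disease-vs-factories idea above could start as something this small. This is only a rough sketch: the county names, the numbers, and the helper names `pearson` and `correlate` are all made up for illustration, and a real analysis would need actual public health and facility data plus controls for population, age, and so on. A high correlation here would be a lead to look into, not evidence of anything.

```python
from math import sqrt

# Hypothetical per-county figures (all values invented for illustration).
incidence_per_100k = {"County A": 12.0, "County B": 30.0,
                      "County C": 18.0, "County D": 41.0}
factory_count = {"County A": 1, "County B": 4, "County C": 2,
                 "County D": 6, "County E": 3}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)

def correlate(a, b):
    """Join two {region: value} tables on region, then correlate them."""
    shared = sorted(a.keys() & b.keys())  # only regions present in both
    r = pearson([a[k] for k in shared], [b[k] for k in shared])
    return r, shared

r, regions = correlate(incidence_per_100k, factory_count)
print(f"r = {r:.3f} over {len(regions)} regions")
```

The same join could be repeated with a third table of EPA fines per county to see whether the regions that stand out also show up there.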
I had a similar idea to try and track corruption in local government by simply bringing visibility to what different council members vote for, and when. All this stuff is freely available on the internet from official sources; it's just hard to find and navigate. Scraping it into a nice UI that lets you cross-reference stuff seems worthwhile.
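A minimal sketch of what the cross-referencing part might look like once the votes are scraped. Everything here is an assumption: the record fields, the member names, and the helpers `votes_by_member` and `agreement_rate` are hypothetical, and the scraping itself is left out because it depends entirely on how each council publishes its minutes.

```python
from collections import defaultdict

# Hypothetical records, as they might look after scraping meeting minutes.
votes = [
    {"member": "A. Smith", "date": "2021-03-01", "item": "Zoning change 12", "vote": "yes"},
    {"member": "B. Jones", "date": "2021-03-01", "item": "Zoning change 12", "vote": "yes"},
    {"member": "A. Smith", "date": "2021-04-05", "item": "Road contract 7", "vote": "yes"},
    {"member": "B. Jones", "date": "2021-04-05", "item": "Road contract 7", "vote": "no"},
    {"member": "C. Lee", "date": "2021-04-05", "item": "Road contract 7", "vote": "yes"},
]

def votes_by_member(records):
    """Group votes per member, sorted by date: what they voted for, and when."""
    index = defaultdict(list)
    for r in records:
        index[r["member"]].append((r["date"], r["item"], r["vote"]))
    for entries in index.values():
        entries.sort()  # ISO dates sort correctly as strings
    return dict(index)

def agreement_rate(records, a, b):
    """Share of items both members voted on where they voted the same way."""
    by_item = defaultdict(dict)
    for r in records:
        by_item[r["item"]][r["member"]] = r["vote"]
    shared = [v for v in by_item.values() if a in v and b in v]
    if not shared:
        return None  # the two members never voted on the same item
    return sum(1 for v in shared if v[a] == v[b]) / len(shared)

timeline = votes_by_member(votes)
print(timeline["A. Smith"])
print(agreement_rate(votes, "A. Smith", "B. Jones"))
```

A UI would mostly be views over these two indexes: a per-member timeline, and pairwise comparisons to spot members who always vote together on particular kinds of items.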
Here's a more recent Vice article a couple of links in:<p><a href="https://www.vice.com/en/article/5dpxvq/this-transparency-project-is-creating-a-massive-collection-of-police-data" rel="nofollow">https://www.vice.com/en/article/5dpxvq/this-transparency-pro...</a>