Hi, Rafael here from https://curiosity.ai.<p>I was talking today to someone who works in a restrictive IT environment (no internet), and the person mentioned how annoying it is not to have access to Stack Overflow. There is an open source project on GitHub (https://github.com/tools4j/stacked-off) that seems to provide an offline server using the public dumps, but this person found it a bit cumbersome to work with.<p>As we build a desktop search app, we could add an offline Stack Overflow integration. I was curious (!) whether anyone would be interested before we put in the effort to build it.<p>WDYT?
You may be interested in Dash, which provides offline docsets of Stack Overflow across a variety of language and technology categories:<p><a href="https://kapeli.com/dash" rel="nofollow">https://kapeli.com/dash</a>
Another option to consider, which I only discovered yesterday, is Kiwix: <a href="https://www.kiwix.org/en/" rel="nofollow">https://www.kiwix.org/en/</a><p>You can browse the different packages that the community has prepared, including Wikipedia and a ton of different Stack Exchange sites: <a href="https://docs.google.com/spreadsheets/d/1lWXdwy3OIfZ1Ob2cQR707OMHSva3khTcAXZE9MK9ad8/edit#gid=598202886" rel="nofollow">https://docs.google.com/spreadsheets/d/1lWXdwy3OIfZ1Ob2cQR70...</a>
I've mentioned before that this already exists. The NSA maintains mirrors of Stack Overflow and Wikipedia that get synchronized from the Internet to classified airgapped networks once every 24 hours. I have no idea what software they use to do it, but you could arguably file a FOIA request or something, since the code they're using has no reason to be classified. They've open-sourced simple tools and libraries like this in the past.
I'm thinking we could use PostgreSQL as the backend, import all the XML files from the dump as tables, and then build a simple frontend in whatever language people are comfortable with.<p>Then you only need to do a monthly or quarterly update.
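A rough sketch of the import step described above, in Python. To keep it runnable without a database server, it uses the standard library's sqlite3 as a stand-in for PostgreSQL, so treat it as an illustration of the idea rather than the actual backend. The `import_posts` function, the `posts` table, and the chosen subset of columns are assumptions of mine; the `<row>` elements with attributes such as `Id`, `PostTypeId`, and `Body` follow the layout of `Posts.xml` in the public dumps.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Subset of the <row> attributes in Posts.xml that this sketch keeps
# (an assumption; the dump contains more attributes than these).
COLUMNS = ("Id", "PostTypeId", "ParentId", "Title", "Body", "Score")


def import_posts(xml_source, conn):
    """Stream <row> elements from a Posts.xml-style source into a `posts` table.

    xml_source may be a file path or a binary file object.
    Returns the number of rows processed.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "Id INTEGER PRIMARY KEY, PostTypeId INTEGER, ParentId INTEGER, "
        "Title TEXT, Body TEXT, Score INTEGER)"
    )
    count = 0
    # iterparse reads the file incrementally instead of parsing it all at once.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "row":
            # Missing attributes (e.g. ParentId on questions) become NULL.
            conn.execute(
                "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?, ?, ?)",
                [elem.get(name) for name in COLUMNS],
            )
            count += 1
            elem.clear()  # drop the row's attributes once they are stored
    conn.commit()
    return count
```

Calling `import_posts("Posts.xml", sqlite3.connect("so.db"))` would load one file. Because rows are keyed on `Id` and written with `INSERT OR REPLACE`, re-running the import against a newer dump would cover the monthly or quarterly update; with PostgreSQL the rough equivalent would be `INSERT ... ON CONFLICT`.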
One solution is to have an offline virtual machine that is configured not to talk to the public Internet, and do your coding in that. If you need to look something up online, the host machine can do that, and the result can be copied and pasted from the host into the offline VM (VirtualBox allows you to do this).