It’s a cool idea. I’ve wasted a lot of time over the past few months futzing around with BeautifulSoup, Playwright, and others I forget, or cloning entire repos and trying to figure out exactly which incantations for which build tools will get me the built docs I need, all in service of setting them up for retrieval and use by LLMs. Some projects (e.g. Godot, Blender, Django) make it very easy. Others do not (Dagster is giving me headaches at the moment).<p>I would probably prefer unmodified plain-text/Markdown versions (with the heavy lifting done by, e.g., docling or unstructured) over LLM summaries, though, since I’d rather produce my own distillations.<p>I would pay for that kind of thing. I think the intersection between ethical scraping and making things machine-readable is fertile ground. For a lot of companies it’s something that can be of great value, but it is also non-trivial to do well and unlikely to be a core competency in-house.