TechEcho
Ask HN: Could predicting software output be used for synthetic data?

1 point by maxutility about 1 year ago
I’ve been reading a lot about the cliff that AI frontier models face as training data sources dry up. I’ve seen synthetic data mentioned as an option but haven’t seen a lot of details (maybe I haven’t looked hard enough).

I’m curious whether you could create an unlimited resource of synthetic data and improve coding/logic performance by having an LLM generate code and then train on predicting (1) whether it compiles and (2) what outputs it would generate for an unlimited series of generated inputs.
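To make the idea concrete, here is a rough sketch of how such labels might be produced. Everything in it is an assumption for illustration: the hard-coded `candidates` list stands in for LLM-generated code, `make_example` is a hypothetical helper, and Python's built-in `compile()` (a syntax check plus bytecode compilation) stands in for "does it compile". Running generated code with `exec` is unsafe outside a sandbox, so a real pipeline would need isolation.

```python
import random


def make_example(source, func_name, inputs):
    """Build one training record for a candidate program.

    Labels: (1) whether the source compiles, (2) the observed output
    (or error type) for each input. Hypothetical helper, not a real API.
    """
    record = {"source": source, "compiles": False, "outputs": []}
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return record  # label (1): does not compile
    record["compiles"] = True

    namespace = {}
    try:
        # WARNING: executes arbitrary code; sandbox this in practice.
        exec(code, namespace)
    except Exception:
        return record  # compiled, but failed while defining its contents

    func = namespace.get(func_name)
    if not callable(func):
        return record

    for x in inputs:
        try:
            record["outputs"].append((x, repr(func(x))))
        except Exception as exc:
            # Runtime errors are also a prediction target.
            record["outputs"].append((x, f"error: {type(exc).__name__}"))
    return record


# Stand-ins for LLM-generated programs (assumed, for illustration only).
candidates = [
    "def f(x):\n    return x * 2 + 1\n",
    "def f(x):\n    return 10 // x\n",
    "def f(x)\n    return x\n",  # missing colon: should fail to compile
]

rng = random.Random(0)
inputs = [rng.randint(-5, 5) for _ in range(4)]
dataset = [make_example(src, "f", inputs) for src in candidates]

for row in dataset:
    print(row["compiles"], row["outputs"])
```

Each record pairs a program with ground-truth labels obtained by actually running it, so a model could be trained to predict the `compiles` flag and the `outputs` list from the source alone. Whether that would improve coding or logic performance is the open question of the post.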

1 comment

maxutility about 1 year ago
You could call it “bootstrapping is all you need” :)