Ask HN: Could predicting software output be used for synthetic data?

1 point by maxutility, about 1 year ago
I’ve been reading a lot about the cliff that AI frontier models face as training data sources dry up. I’ve seen synthetic data mentioned as an option but haven’t seen a lot of details (maybe I haven’t looked hard enough).

I’m curious whether you could create an unlimited resource of synthetic data and improve coding/logic performance by having an LLM generate code and then train on predicting (1) whether it compiles and (2) what outputs it would generate for an unlimited series of generated inputs.
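To make the idea concrete, here is a minimal sketch of the labeling step in Python. It is only an illustration under stated assumptions: the `CANDIDATES` list stands in for LLM-generated programs, and `label_program` and `build_dataset` are hypothetical helper names, not part of any existing tool. The ground-truth labels come from actually compiling and running each program, so they cost nothing to produce at scale.

```python
import random

# Hypothetical stand-ins for LLM-generated programs (assumption for illustration).
# The second candidate is missing a colon, so it should fail to compile.
CANDIDATES = [
    "def f(x):\n    return x * x + 1\n",
    "def f(x)\n    return x + 1\n",
]


def label_program(source, inputs, func_name="f"):
    """Return (compiles, records), where records pair each input with its real output."""
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return False, []

    namespace = {}
    try:
        # WARNING: untrusted generated code should run in a sandbox in practice.
        exec(code, namespace)
    except Exception:
        return True, []

    func = namespace.get(func_name)
    if not callable(func):
        return True, []

    records = []
    for x in inputs:
        try:
            result = repr(func(x))
        except Exception as exc:
            # Runtime failures are also a label the model could learn to predict.
            result = "error: " + type(exc).__name__
        records.append({"input": x, "output": result})
    return True, records


def build_dataset(candidates, n_inputs=3, seed=0):
    """Label every candidate on randomly generated integer inputs."""
    rng = random.Random(seed)
    dataset = []
    for source in candidates:
        inputs = [rng.randint(-10, 10) for _ in range(n_inputs)]
        compiles, records = label_program(source, inputs)
        dataset.append(
            {"source": source, "compiles": compiles, "examples": records}
        )
    return dataset


if __name__ == "__main__":
    for row in build_dataset(CANDIDATES):
        print(row["compiles"], row["examples"])
```

Each row then gives two prediction targets: the `compiles` flag for task (1) and the input/output pairs for task (2). Whether training on such targets actually improves coding or logic performance is the open question here; the sketch only shows that the labels themselves can be generated without limit.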

1 comment

maxutility, about 1 year ago
You could call it “bootstrapping is all you need” :)