Ask HN: Have you used a good general purpose LLM-powered scraper?

2 points by FrenchDevRemote over 1 year ago
I'm looking to either build or use a model or SaaS that could be used with a prompt like:

Find the 10 biggest realtors in [CITY], then extract all their listings within [BUDGET] in JSON with all the data you can find.

I'm looking for a tool that can use search inputs on any website, click next page buttons or handle infinite scroll.

Is there a tool like that on the market that actually delivers? What I've seen for now doesn't seem good enough.
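
For readers weighing the "build" option, the pagination part of the request can be sketched independently of any LLM. The following is a minimal illustration only, not a tool from this thread: `collect_listings`, `fetch_page`, and the page shape (`{"listings": [...], "next": ...}`) are all hypothetical names. In a real scraper, `fetch_page` would wrap a browser or HTTP client plus whatever extractor (LLM or otherwise) turns a page into structured rows.

```python
import json
from typing import Callable, Optional


def collect_listings(
    start_url: str,
    fetch_page: Callable[[str], dict],
    max_price: float,
    max_pages: int = 50,
) -> list[dict]:
    """Follow 'next' links from start_url and keep listings priced at or under max_price.

    fetch_page is a hypothetical callable returning
    {"listings": [ {...}, ... ], "next": <url or None>}.
    """
    results: list[dict] = []
    seen: set[str] = set()  # guards against pagination loops
    url: Optional[str] = start_url
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch_page(url)
        for listing in page.get("listings", []):
            price = listing.get("price")
            if price is not None and price <= max_price:
                results.append(listing)
        url = page.get("next")
    return results


# Stand-in for a real site: two pages linked by a "next" pointer.
FAKE_SITE = {
    "page1": {
        "listings": [{"id": 1, "price": 300000}, {"id": 2, "price": 900000}],
        "next": "page2",
    },
    "page2": {"listings": [{"id": 3, "price": 450000}], "next": None},
}

listings = collect_listings("page1", FAKE_SITE.__getitem__, max_price=500000)
print(json.dumps(listings))
```

Keeping the fetch step behind a plain callable means the same loop could drive a "next page" button, an infinite-scroll trigger, or a fake site in tests; the hard part the question is really about (getting reliable structured rows out of arbitrary pages) stays inside `fetch_page`.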

2 comments

coderatlarge over 1 year ago
One precursor question would be whether an LLM can extract the data you want from raw HTML even when it is copy-pasted in manually. In my limited experience we're not quite there yet, but I'd be curious to hear if others have had a different experience, or better yet, have actual measurements against a baseline scraper.
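
One way to get the kind of measurement mentioned above is to score both extractors against a small hand-labelled set. The sketch below is an assumption about how such a comparison might look, not something the commenter described: `field_accuracy`, the `id` join key, and the sample rows are all hypothetical.

```python
def field_accuracy(predicted: list[dict], gold: list[dict], key: str = "id") -> float:
    """Fraction of hand-labelled field values that an extractor reproduced exactly.

    Rows are matched on `key`; a gold row with no matching prediction
    counts every one of its fields as a miss.
    """
    pred_by_key = {row[key]: row for row in predicted if key in row}
    total = 0
    correct = 0
    for gold_row in gold:
        pred_row = pred_by_key.get(gold_row[key], {})
        for field, value in gold_row.items():
            if field == key:
                continue  # the join key itself is not scored
            total += 1
            if field in pred_row and pred_row[field] == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical hand-labelled rows and two extractor outputs.
gold = [
    {"id": 1, "price": 300000, "beds": 3},
    {"id": 2, "price": 450000, "beds": 2},
]
baseline_rows = [
    {"id": 1, "price": 300000, "beds": 3},
    {"id": 2, "price": 450000},  # baseline scraper missed "beds"
]
llm_rows = [
    {"id": 1, "price": 300000, "beds": 3},  # LLM output dropped listing 2
]

print("baseline:", field_accuracy(baseline_rows, gold))
print("llm:", field_accuracy(llm_rows, gold))
```

Exact-match scoring is deliberately strict; free-text fields such as descriptions would likely need a looser comparison.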
mdev23 over 1 year ago
I am working on an LLM agents tool that could probably help with this: https://recurai.com

Reach out and I'll try to get you set up: matt@recurai.com