TechEcho (科技回声)

Ask HN: Test those who say LLMs are just for easy things

1 point by twometwo, 5 months ago
Motivation: in the thread [1], "GPT-5 is behind schedule" (wsj.com), user jacobolus posted a comment [2] claiming that LLMs are not useful for tricky or obscure questions.

I am thinking about creating a page for testing such claims. Perhaps something similar already exists. Either way, I think such a page could help determine whether, in general, LLMs are useful for deep questions.

Obviously, the page should use an ensemble of the best models, and there should be limits on the number of models, the time, and the computation budget. That costs real money.

I think the battle between Wikipedia's editors and contributors on one side and LLMs on the other is going to be fierce once LLMs reach the level where they can question the basic assumptions of editors in their respective fields.

Edited: Edited a lot.

[1] GPT-5 is behind schedule (wsj.com): https://www.wsj.com/tech/ai/openai-gpt5-orion-delays-639e7693

[2] Excerpt: "I've never gotten an answer from an LLM to a tricky or obscure question about a subject I already know anything about that seemed remotely competent."

3 comments

not_your_vase, 5 months ago
Sure, we can create a website that says "AI is useful for complex things", but will that actually make it true? People say it is only usable for trivial stuff because of their experience: all AI tools fumble at most marginally complex tasks and questions. Change people's experience, and their opinion will change too.
DemocracyFTW2, 5 months ago
Sometimes I wish people would take the time to structure their thoughts before they post. That would be great.
RadiozRadioz, 5 months ago
What?