Ask HN: Hype aside, what practical issues come up when building LLM-based apps?

3 points by arkmm about 2 years ago
Prototyping an application with an LLM is a pretty straightforward task these days: try out a few prompts, paste in some context, and see if it seems to work.

After trying to build something more substantial (a script that takes a text description and attempts to scrape that info out of a collection of PDFs/websites), I realized there are a number of annoyances with getting this out of the prototype stage:

* Parsing prompt responses to ensure they match my expected schema (e.g. I want XPath selectors, but sometimes the model hallucinates a DOM id)

* Hacks to avoid long context windows (especially if the context isn't easily vector-searchable, e.g. a DOM tree)

* Retry logic

* Measuring how well the system is doing over multiple examples

AI Twitter is full of examples of how LLMs, AutoGPT, etc. are cure-alls, but what are some of the practical issues that actually come up when you try to build on top of these yourselves?
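To make the first and third bullets concrete, here is a minimal sketch of validating a model reply against an expected shape and retrying with feedback when it doesn't match. Everything named here is an assumption for illustration: the `{"selectors": [...]}` JSON shape, the `call_model` callable (a stand-in for whatever client you use), and the "looks like XPath" check, which is a crude prefix heuristic rather than a real XPath parser.

```python
import json
from typing import Callable, List


class SchemaError(ValueError):
    """Raised when a model reply does not match the expected shape."""


def parse_selectors(reply: str) -> List[str]:
    """Parse a reply assumed to look like {"selectors": ["//div[@id='x']", ...]}."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise SchemaError("expected a JSON object")
    selectors = data.get("selectors")
    if not isinstance(selectors, list) or not selectors:
        raise SchemaError("expected a non-empty 'selectors' list")
    for sel in selectors:
        # Crude heuristic (an assumption, not a real XPath parser): accept only
        # strings starting with '/', './' or '('. A bare DOM id such as
        # "main-content" fails this check.
        if not isinstance(sel, str) or not sel.startswith(("/", "./", "(")):
            raise SchemaError(f"not an XPath selector: {sel!r}")
    return selectors


def ask_with_validation(
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw reply out
    prompt: str,
    max_attempts: int = 3,
) -> List[str]:
    """Call the model until its reply validates, feeding back the last error."""
    last_error = None
    for _ in range(max_attempts):
        message = prompt
        if last_error is not None:
            message = f"{prompt}\n\nYour previous reply was rejected: {last_error}"
        try:
            return parse_selectors(call_model(message))
        except SchemaError as exc:
            last_error = exc
    raise SchemaError(f"no valid reply after {max_attempts} attempts: {last_error}")
```

Capping the attempts matters: a model that keeps returning a DOM id will not necessarily fix itself, and each retry costs tokens and latency.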

1 comment

tornato7 about 2 years ago
One practical problem with the OpenAI API is that you'll get 'server busy' responses pretty regularly, making your app less reliable.

Another problem is that it will typically not ask clarifying questions, so it can often make a wrong assumption about ambiguous wording or missing information in your prompt without telling you.

It seems to have an inexplicable penchant for certain outputs. For example, I asked GPT-4 to rate a myriad of things 1-10 for relevance, and a third of its ratings were exactly 7.8.

If you're dealing with vector-based context, there are even more issues: for example, it fails on negations and doesn't know about newer words. E.g. "find me all the competitors to Pinecone" would not give you good results, because the embeddings model doesn't know what Pinecone is and the embeddings aren't similar to an actual competitor like Milvus.
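For the 'server busy' point, one common mitigation is retrying with exponential backoff and jitter. The sketch below is generic and does not use any real client library: `ServerBusyError` is a hypothetical stand-in for whatever transient error your client raises, and the default delays are arbitrary assumptions to tune.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ServerBusyError(Exception):
    """Hypothetical stand-in for a transient 'server busy' error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests run fast
) -> T:
    """Call fn, retrying on ServerBusyError with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ServerBusyError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles each attempt (base, 2*base, 4*base, ...),
            # bounded by max_delay; sleeping a random fraction of it ("full
            # jitter") keeps many clients from retrying in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

Usage would look like `call_with_backoff(lambda: client_call(prompt))`, where `client_call` is your own wrapper that translates the client's transient errors into `ServerBusyError`. Only retry errors that are actually transient; retrying a malformed request just burns quota.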