TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Compromising LLM-integrated applications with indirect prompt injection

43 pointsby greshakealmost 2 years ago

8 comments

haolezalmost 2 years ago
Well, maybe we could limit this by having a list of preset actions that the LLM can take and those actions can contain canned responses based on templates. This way we can make a chat bot with a LLM model that never sends its output to the user. For some applications, this might be enough, since you still get the amazing interpretation abilities of a LLM.
评论 #36038293 未加载
jasonwcfanalmost 2 years ago
If I’m understanding correctly, the technique basically injects malicious instructions in the content that is stored and retrieved?<p>Sounds like an easy fix, if it’s possible to detect direct prompt injection attacks then the same techniques can be applied to the data staged for retrieval.
评论 #36038309 未加载
评论 #36038152 未加载
评论 #36038168 未加载
SkyPuncheralmost 2 years ago
The headline got me, but the paper lost me.<p>Isn&#x27;t this saying what most people already knew - user content should never be trusted?<p>These attacks are no different than old school SQL injection attacks when people didn&#x27;t understand the importance of escaping. Even if a user can&#x27;t do SQL injection directly, they can get data stored that&#x27;s injects into some other system. Much harder to pull off, but the exact same concept.
评论 #36038793 未加载
genewitchalmost 2 years ago
I&#x27;ve managed a few &quot;prompt injections&quot;, nearly all benign. It is funny to me that SEO garbage works on resume&#x2F;CV AI.<p>I wonder how linked &quot;organic search engine results polluted with SEO nonsense&quot; and prompt injection are, as problems.<p>Google can hire me and i&#x27;ll figure it out.
greshakealmost 2 years ago
TLDR: With these vulnerabilities, we show the following is possible:<p>- Remote control of chat LLMs<p>- Persistent compromise across sessions<p>- Spread injections to other LLMs<p>- Compromising LLMs with tiny multi-stage payloads<p>- Leaking&#x2F;exfiltrating user data<p>- Automated Social Engineering<p>- Targeting code completion engines<p>There is also a repo: <a href="https:&#x2F;&#x2F;github.com&#x2F;greshake&#x2F;llm-security">https:&#x2F;&#x2F;github.com&#x2F;greshake&#x2F;llm-security</a> and another site demonstrating the vulnerability against Bing as a real-world example: <a href="https:&#x2F;&#x2F;greshake.github.io&#x2F;" rel="nofollow">https:&#x2F;&#x2F;greshake.github.io&#x2F;</a><p>These issues are not fixed or patched, and apply to most apps or integrations using LLMs. And there is currently no good way to protect against it.
评论 #36038291 未加载
RcouF1uZ4gsCalmost 2 years ago
We keep on having to relearn this principle over and over again: mixing instructions and data on the same channel leads to disaster. For example, phone phreaking were people were able to whistle into the phone and place long distance calls. SQL injection attacks. Buffer overflow code injections. And now LLM prompt injections.<p>We will probably end up with the equivalent of prepared LLM statements like we have for SQL that will separate out the instruction and data channels.
bagelsalmost 2 years ago
Didn&#x27;t read through the whole thing yet, but this seems to be the key idea:<p>&quot;With LLM-integrated applications, adversaries could control the LLM, without direct access, by indirectly injecting it with prompts placed within sources retrieved at inference time.&quot;
cubefoxalmost 2 years ago
My proposal for fixing indirect prompt injection:<p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=35929145" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=35929145</a>