TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Ask HN: Can we solve AI prompt injection attacks with an indented data format?

1 点作者 alexrustic大约 1 年前
Hi HN ! I&#x27;m Alex, a tech enthusiast. I have an idea that I can&#x27;t test and that concerns an area in which I am not an expert. I am making this post to find out to what extent this idea is relevant to the state of the art.<p>From what little I know, raw user inputs are not directly submitted to LLMs. Typically, user input is carefully wrapped in a special format before being sent to the LLM. The format usually has tags, including special tags to tell the AI, for example, which topic is prohibited.<p>As with SQL injection, an attacker can craft malicious user input by introducing special tags. Input sanitization can be seen as a solution, but it seems that it isn&#x27;t enough. Anyway, it doesn&#x27;t seem very intuitive, I think a document intended to be read by an LLM should also be very human-readable. I also wonder what happens when an attacker uses obscure Unicode characters to forge a string that looks like a special tag.<p>Instead of using an XML-like language, my idea is to use a format that seamlessly interweave human-readable structured data with prose within a single document. Also, the format must natively support indentation to remove the need for input sanitization, thereby eliminating an entire class of injection attacks.<p>I am the author of Braq, a data format that seems to be a good candidate.<p>The idea to better structure a prompt is described in this Markdown section: https:&#x2F;&#x2F;github.com&#x2F;pyrustic&#x2F;braq?tab=readme-ov-file#ai-prompts<p>And here, ChatML from OpenAI: https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=34988748<p>As mentioned above, I can&#x27;t test this idea. Therefore, I&#x27;m asking to you: Can we solve AI prompt injection attacks with an indented data format ?

2 条评论

alexrustic大约 1 年前
The backspace escape character (<a href="https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;6792812&#x2F;the-backspace-escape-character-b-unexpected-behavior" rel="nofollow">https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;6792812&#x2F;the-backspace-es...</a>) might be a good candidate for successfully creating a valid section in a document.<p>In a ChatML document, this character can also help destroy the closing tag of an instruction node.<p>But this can only work if the escape character is actually &#x27;executed&#x27;.
wmf大约 1 年前
I don&#x27;t understand how indentation can remove the need for input sanitization since the input can definitely include brackets, spaces, tabs, and newline characters.<p>You might be able to test this by fine-tuning a local LLM to understand your format then breaking it.
评论 #39721345 未加载