IMO the critical concept to explain LLM prompt injection and manipulation like this is that almost all these "assistants" are <i>fictional characters</i> in a document that looks like a theater-play script, along with some tricks that "speaks out" the character so that we humans believe it's a real entity. (Meanwhile, our own inputs invisibly become words "spoken" by another "The User" character.)<p>So the true LLM is a nameless lump tasked with Make Any Document Longer. If for any reason the prior state is "Copilot Says: Sure, " then the LLM is <i>probably</i> going to try to make something that "fits" with that kind of intro.<p>This becomes extra-dangerous when when the generated play-script has stuff like "Copilot opens a terminal and runs the command X", and some human programmers decided to put in special code to recognize and "act out" that stage-direction.<p>> AI assistants like Copilot need strong context-awareness<p>That'll be hard. The LLM is just Making Document Longer, and the document is one undifferentiated string with no ownership. Without core algorithm changes, you're stuck trying to put in flimsy literary guardrails.<p>Really hardening it means getting closer to the "real" AI of sci-fi stories, where the machine (not just an assembled character named The Machine) recognizes multiple entities as existing, recognizes logical propositions, trakcs which entities are <i>asserting</i> those proposition (and not just referencing them), and assigning different trust-levels or authority.