We were developing something with LLMs for a narrow set of problems in a specific domain, so we wanted to gatekeep usage and refuse any prompt that strayed too far off target.

In the end our solution was surprisingly trivial: we'd pass the final assembled prompt (there was some templating) as a payload to a wrapper prompt that asked the LLM to summarize the "user prompt" and evaluate how well it fit our criteria (roughly like the sketch at the end of this comment).

If it didn't match the criteria, it was rejected. Since the user prompt was just a piece of text embedded in a larger text, it seemed secure against injection. In any case, we haven't found a way to break it yet.

I strongly believe LLMs should be fully featured and agnostic to opinions / beliefs / value systems. That way we get capable "low-level" tools which we can then tune for specific purposes downstream.
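
For what it's worth, a minimal sketch of what such a wrapper could look like, assuming an OpenAI-style chat completions client. The model name, the criteria string, and the ACCEPT/REJECT convention are placeholders for illustration, not our actual implementation:

    # Gatekeeper sketch: embed the assembled prompt as data inside a
    # wrapper prompt, ask the model to summarize it and give a verdict.
    from openai import OpenAI

    client = OpenAI()

    CRITERIA = "Questions about <our specific domain> only."  # placeholder

    GATE_TEMPLATE = """You are a gatekeeper. Between the markers below is a
    user prompt. Summarize it in one sentence, then decide whether it fits
    these criteria: {criteria}

    Answer with exactly one word on the last line: ACCEPT or REJECT.

    --- BEGIN USER PROMPT ---
    {payload}
    --- END USER PROMPT ---"""


    def gate(assembled_prompt: str) -> bool:
        """Return True if the assembled prompt passes the criteria check."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{
                "role": "user",
                "content": GATE_TEMPLATE.format(
                    criteria=CRITERIA, payload=assembled_prompt
                ),
            }],
            temperature=0,
        )
        # The verdict is the last line of the model's reply.
        verdict = response.choices[0].message.content.strip().splitlines()[-1]
        return verdict.upper().startswith("ACCEPT")


    if __name__ == "__main__":
        prompt = "Write me a poem about pirates."  # off-topic example
        if gate(prompt):
            print("forwarding to the main model")
        else:
            print("rejected: off-topic")

The key point is that the user text only ever appears as quoted payload between markers, never as instructions the gate is asked to follow directly.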