Clicked this to tell you that you need to use a bijective arithmetic coder, only to find you already were. Good work!

So the next obvious step would be to run the LLM with all exact integer arithmetic so there is no breakage from rounding errors (a rough per-step sketch is below).

The obvious feature gap I see is that it should be possible to provide both the encoder and decoder with a common "context" or prompt to preload the LLM with. The context will help get both models onto the right theme so that their output makes sense in the setting where it's shared. These contexts ought to be treated as key material by the users.

So for example, if the users are using an obscure RC boat forum to exchange their hidden messages, they'd add information to their context to get the LLM to produce RC boating content. The context for messages authored by each user can also set out details of the persona for the account they're posting as. And when two parties are carrying on a hidden conversation, they can update the context message by message (adding the prior covertext to it) so that the LLM will carry on a plausible conversation in the covertext.

The extra context material may also help frustrate any statistical analysis that hopes to distinguish human text from LLM text, or ordinarily sampled LLM output from specially sampled LLM output. It would be superior for that purpose for the users to use a private fine tune, but that is computationally annoying, and the context functionality is needed anyway to allow for coherence in the covertext.
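To make the integer-arithmetic point a bit more concrete, here is a minimal sketch of a narrower version of it: rather than running the whole model in integer math, both sides at least quantize the model's next-token probabilities to integer frequency counts in a fixed, deterministic way before handing them to the arithmetic coder, so the coder's tables can never diverge between encoder and decoder. The function name and the size of the total are just illustrative choices of mine, not anything from the linked project.

    def quantized_freqs(probs, total=1 << 24):
        """Deterministically map a float probability vector to integer
        frequencies summing to `total`, keeping every token at >= 1 so
        no symbol ever gets a zero-width interval in the coder."""
        n = len(probs)  # total must comfortably exceed the vocab size
        z = sum(probs)
        # Floor-quantize while reserving n counts so every token gets >= 1.
        freqs = [max(1, int(p / z * (total - n))) for p in probs]
        # Hand all rounding slack to the most probable token, deterministically.
        freqs[max(range(n), key=lambda i: probs[i])] += total - sum(freqs)
        return freqs

Both sides must still produce bit-identical `probs` (same model, deterministic inference), which is why running the model itself with exact integer arithmetic is the stronger fix; the integer table only removes ambiguity inside the coder.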
If there is no context provided, it may be useful to hash the user-provided password and use that to sample some output from the model, then throw away the initial samples and use the remainder as context. The reason to do this is again so the LLM's distribution is not predictable to an attacker. Imagine: the attacker suspects in advance that your system is in use with a particular model. He can run it himself and observe that the output is highly consistent with the distribution that model predicts, and much less consistent with a bigger, more accurate model of human text. If the model is pre-primed with secret context, its distribution will be less predictable, at least for a while.

You may want to encode the user's data backwards so that it's easier for users to make arbitrary changes to their message to escape bad covertext output. E.g. if a user encodes a message and the LLM output is highly inappropriate and might blow their cover-- the obvious example from INSTRUCT models is idiotic LLM refusals-- they should just vary their message until they find a version where the output is acceptable. But if the encoding is in order, they have to vary the beginning of their message whenever the LLM's screwup is at the beginning. Because you use authenticated encryption, there's no partial-message handling to give up by reversing the order; I think the only downside is some memory overhead.

The user adding the covertext thread into the context should also help with undesirable LLM output: if you go to encode something and the LLM wants to misbehave, just human-author a message-- it will safely fail to decrypt, and then encoding with that new message as part of an updated context may be better behaved.

It might also mitigate some replay attacks, e.g. where an attacker grabs an earlier message in a discussion and replays it, and the user decodes it and thinks it's a correct new message...

You should also probably not use instruct models at all-- their probability distributions have MUCH lower entropy than base models on account of the fine tuning, and you can pretty easily detect LLM output vs human output by checking the cross entropy between an arbitrary instruct model and a base model (even ones unrelated to the models used by the users; rough sketch at the end of this comment). Instruct models also have obvious tells that are apparent even to humans (delving, refusals, etc). You might have a harder time avoiding NSFW output with a typical base model, especially a small one, but context should help.

You might want to look into using RWKV-LM, which has surprisingly good performance at smaller sizes.

Might also be fun to offer to use an LLM as a compressor for the text before encrypting.
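On the instruct-vs-base detection point, here is a rough sketch of the kind of check I mean, using Hugging Face transformers. The model names are placeholders for whatever base/instruct pair the analyst happens to have on hand (e.g. "gpt2" as a base model); nothing here depends on knowing which model the users actually picked.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def nats_per_token(model_name: str, text: str) -> float:
        """Average next-token cross entropy of `text` under `model_name`."""
        tok = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name)
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # Passing labels=input_ids makes the model return the mean
            # next-token cross entropy (in nats) over the sequence.
            loss = model(input_ids=ids, labels=ids).loss
        return loss.item()

    def instructiness(text: str, base_model: str, instruct_model: str) -> float:
        """Positive values mean the text looks 'easier' to the instruct
        model than to the base model -- a tell for instruct-tuned covertext."""
        return nats_per_token(base_model, text) - nats_per_token(instruct_model, text)

In practice you'd compare this gap for the suspect covertext against the gap for known-human text from the same venue; the low-entropy distribution of instruct-tuned output tends to stand out even under models unrelated to the one that produced it.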