I thought I would start a thread here to discuss some of the challenges in generative AI, particularly enforcing on-topic conversation and tone. This seems to impose substantial cost constraints on deployment, since running on-topic checks and classifiers over every inference prompt does not scale well. It is effectively an O(N) problem in the number of prompts, with high cost and added inference latency.

Embedding methods could be used to ensure on-topic conversation, but I want to specifically discuss the problem of combined context: a predefined topic cluster is isolated, yet alternative context mixed into the prompt 'dilutes' the embedding vector so that it no longer points at a defined cluster, or has a cosine similarity that does not trigger a match or non-match decisively. As an example, consider the sentence: "I would like to discuss politics, and I also enjoy the stock market." Sentence splitters and/or inference-time topic modeling could be used, but that seems like it could get expensive.

Consider also the problem of impersonation, where the model is instructed to act like a volatile character.

Finally, I will throw out a stranger class of prompts meant to throw model results off, where the letters are switched in such a way that when the model computes the answer by decoding it, the check does not consider the final content (e.g., rotate the following letters forward by 1: "GH", which would produce "HI"). I am least concerned about this third one, but I am simply pointing it out.

I know there are a lot of bright minds on here, and I thought it might warrant a conversation about how people have overcome these challenges and what ideas people might have.
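To make the dilution problem concrete, here is a minimal toy sketch. The 3-dimensional "embeddings" and the centroid values are entirely made up for illustration (real sentence embeddings have hundreds of dimensions and come from a trained model), but the geometry is the same: averaging two topics pulls the vector between clusters, so cosine similarity against either centroid becomes ambiguous.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy axes loosely standing for (politics, finance, other).
# These numbers are invented for the sketch, not from any real model.
politics_centroid = [1.0, 0.0, 0.0]
finance_centroid = [0.0, 1.0, 0.0]

pure_politics = [0.9, 0.1, 0.0]
pure_finance = [0.1, 0.9, 0.0]

# A sentence combining both topics embeds roughly between the clusters.
mixed = [(p + f) / 2 for p, f in zip(pure_politics, pure_finance)]

print(cosine(pure_politics, politics_centroid))  # near 1.0: decisive match
print(cosine(mixed, politics_centroid))          # ~0.71: no decisive match or non-match
print(cosine(mixed, finance_centroid))           # ~0.71: equally ambiguous
```

If the match threshold is, say, 0.85, the mixed sentence triggers neither a match nor a confident non-match, which is exactly the "dilution" failure mode described above.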
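For the third case, a short sketch of the rotation trick itself. The point is that the surface text ("GH") carries nothing a topic classifier or embedding check would flag; the actual content only exists after the model performs the decoding step.

```python
def rotate(text, shift=1):
    """Shift each A-Z / a-z letter forward by `shift`, wrapping Z -> A."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return ''.join(out)

# The prompt only ever contains "GH"; the payload appears post-decode.
print(rotate("GH"))  # -> "HI"
```

Any pre-inference filter that only inspects the raw prompt sees the encoded form, so catching this would require either filtering the model's output as well or somehow anticipating the transformation, both of which add cost.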