> Getting that user feedback cycle and being able to improve your models, build a trust and safety layer turns out to be an important thing to do.

I have had a theory that Google started with a less-capable but easier-to-host LLM in order to obtain RLHF data. That is where I think they are actually behind: they have access to huge amounts of training data, but without the reinforcement feedback, it isn't going to scale the way they need it to right now.