> The flipside of this approach, however, is that it concentrates more responsibility for aligning the language model in the hands of OpenAI, instead of democratizing it. That poses a problem for red-teamers, the programmers who try to hack AI models to make them safer.

More cynically, could it be that the model is not doing anything remotely close to what we consider "reasoning," and that inquiries into how it does whatever it's doing would expose this fact?