tl;dr:
> We are able to reproduce the model benchmark scores initially claimed and are sharing the eval code.

That is a big deal considering all the accusations flying around at the time, so I hope this forensic update checks out and that everyone who jumped to conclusions a couple of weeks ago takes a step back to reflect.