3 points by kylerush 8 months ago

1 comment

tl;dr:

> We are able to reproduce the model benchmark scores initially claimed and are sharing the eval code.

That is a big deal considering all the accusations flying around at the time, so I hope this forensic update checks out and everyone who jumped to conclusions a couple of weeks ago takes a step back to reflect.

Update on Reflection-70B
