Early Evals for OpenAI O3
30 points by maurycy, 5 months ago, 2 comments
macawfish
5 months ago
Wow, the demo where the user asks for untraceable payments shows some pretty sophisticated reasoning. The word "crafty" comes to mind.
og_kalu
5 months ago
New SOTAs on:
SWE-Bench - 71.7%
Competition Code - 2727
ARC (Semi Private Eval) - 75.7% on low compute, 87.5% on high compute
Frontier Math (previous SOTA was 2%) - 25% on high compute