TechEcho

O3 mini vs. Gemini flash 2.0 in chess

2 points by dinp 3 months ago

1 comment

dinp 3 months ago
Source code: https://github.com/don-dp/simulateagents/

Click on 'Play moves' to watch a replay.

I initially planned to run a chess tournament for LLMs, but they are not good at it: besides obvious mistakes, they output incorrect moves, get stuck in loops by repeating the same moves, and the smaller models frequently fail to output valid JSON. I thought reasoning models like o3 mini might be good, but they are only an incremental improvement in chess.

Feedback and suggestions for other games to explore are welcome.
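
For anyone trying something similar, here is a minimal sketch of how the three failure modes above (invalid JSON, incorrect moves, repeated moves) could be told apart before a move is applied. It is not taken from the linked repository. The reply shape `{"move": "..."}`, the function name `check_reply`, the reason strings, and the idea of passing in a precomputed set of legal moves are all assumptions made for illustration.

```python
import json


def check_reply(raw, legal_moves, recent_moves, max_repeats=3):
    """Classify a model reply. Returns (move, None) if usable, else (None, reason).

    Hypothetical helper: assumes the model was asked to answer with a JSON
    object of the form {"move": "<move string>"}.
    """
    # Failure mode 1: the reply is not valid JSON at all.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, "invalid_json"

    # Valid JSON, but not the object shape we asked for.
    if not isinstance(data, dict) or not isinstance(data.get("move"), str):
        return None, "missing_move"

    move = data["move"].strip()

    # Failure mode 2: the move is not legal in the current position.
    # legal_moves is assumed to come from a separate move generator.
    if move not in legal_moves:
        return None, "illegal_move"

    # Failure mode 3: the model keeps playing the same move (possible loop).
    if recent_moves.count(move) >= max_repeats:
        return None, "repetition"

    return move, None


legal = {"e2e4", "g1f3"}
print(check_reply('{"move": "e2e4"}', legal, []))
print(check_reply("e2e4", legal, []))
print(check_reply('{"move": "e2e5"}', legal, []))
print(check_reply('{"move": "g1f3"}', legal, ["g1f3"] * 3))
```

Returning a reason string instead of raising makes it easy to count each failure type per model, or to re-prompt with the specific problem.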