TechEcho

LLM Hallucination Benchmark: R1, o1, o3-mini, Gemini 2.0 Flash Think Exp 01-21

17 points by zone411 3 months ago

1 comment

jszymborski 3 months ago
Some very odd choices in that first plot. Lower is better, but also the x-axis is inverted such that higher scores go towards the left.