
LLM price vs. performance (Google sheet)

9 points by harlanlewis about 1 year ago

2 comments

harlanlewis about 1 year ago
I created this dense visual comparison to better understand and contextualize the precise relationships between capability, cost, and speed for text LLMs widely available via cloud providers today.

All values are sourced externally from publicly available data.

This sheet is only as good as the data I've found for it. Some values change over time (eg 0-100 normalized index), while others have contradictory sources. For example, OpenAI's self-reported metrics for GPT-4-turbo are quite close but not identical between their simple-evals repo[1] and the charts in the GPT-4o announcement[2]. For others, strong benchmark scores are prominent on marketing pages while weaker scores require some digging.

As a general rule of thumb, I've tried to: a) Include every metric I can find to help mitigate cherry-pick bias. b) Resolve conflicts by selecting what I consider to be either the more current or more trustworthy source. For what it's worth, I haven't come across any evaluation discrepancies with a meaningful margin of difference.

The folks I've shared this with so far have found it useful - I hope you do as well!

[1] https://github.com/openai/simple-evals
[2] https://openai.com/index/hello-gpt-4o/
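
For readers who want to apply the same rule of thumb to their own data, here is a minimal Python sketch of rule (b): keeping one value per model and metric when sources disagree. It is not taken from the sheet. The `Reading` type, the `trust` rank, and the choice to compare trust first and recency second are all assumptions made for illustration, and the sample values are placeholders rather than real benchmark results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Reading:
    """One reported score for a model on a metric, from a single source."""
    model: str
    metric: str
    score: float
    source: str
    published: date
    trust: int  # hypothetical hand-assigned rank; higher means more trusted


def resolve(readings: list[Reading]) -> dict[tuple[str, str], Reading]:
    """Keep one reading per (model, metric): most trusted, then most recent."""
    chosen: dict[tuple[str, str], Reading] = {}
    for r in readings:
        key = (r.model, r.metric)
        best = chosen.get(key)
        # Tuple comparison: trust decides first, publication date breaks ties.
        if best is None or (r.trust, r.published) > (best.trust, best.published):
            chosen[key] = r
    return chosen


# Placeholder values for illustration only, not real benchmark results.
readings = [
    Reading("model-a", "metric-x", 80.0, "repo", date(2024, 4, 1), trust=1),
    Reading("model-a", "metric-x", 81.0, "announcement", date(2024, 5, 1), trust=1),
    Reading("model-a", "metric-y", 60.0, "marketing page", date(2024, 5, 1), trust=0),
    Reading("model-a", "metric-y", 55.0, "independent eval", date(2024, 3, 1), trust=1),
]
resolved = resolve(readings)
```

With equal trust, the newer reading is kept; when trust differs, the more trusted source is kept even if it is older. Swapping the tuple order would make recency the primary criterion instead.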
Sebmono about 1 year ago
Love this!