
Show HN: Text-to-Video Arena

4 points by lambda-research 5 months ago

1 comment

lambda-research 5 months ago
Unlike text generation with LLMs, text-to-video generation brings unique challenges: balancing realism, prompt alignment, and artistic vision is far more nuanced and subjective than evaluating generated code.

But how do we measure the quality of the outputs? Is the choice of color more important than realism, or is it the composition of the scene?

We've launched a Text-to-Video Model Leaderboard to explore these questions, inspired by the LLM leaderboard (lmarena.ai). Our idea: many models exist, but only an unbiased, head-to-head comparison can reveal what users of text-to-video models actually find most important.

Right now, the leaderboard includes five open-source models:
* HunyuanVideo
* Mochi1
* CogVideoX-5b
* Open-Sora 1.2
* PyramidFlow

We plan to expand it to include proprietary models from Kling AI, LumaLabs.ai, and Pika.art. You can check out the current leaderboard here: https://t2vleaderboard.lambdalabs.com/leaderboard/

We're looking for feedback from the HN community:
* How should text-to-video models be evaluated?
* What criteria or benchmarks would you find meaningful?
* Are there other models we should include?

We'd love to hear your thoughts and suggestions!
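For readers curious how an arena-style leaderboard typically turns pairwise votes into a ranking, below is a minimal sketch of an Elo-style update in Python. This is purely illustrative: the post does not describe the leaderboard's actual scoring method, and the vote data and starting ratings here are placeholders.

```python
# Illustrative Elo-style ranking from pairwise votes, a common approach for
# arena-style leaderboards (not necessarily what the Lambda leaderboard uses).

K = 32  # assumed constant update step size


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update(ratings: dict, winner: str, loser: str) -> None:
    """Apply one pairwise vote: `winner` was preferred over `loser`."""
    e_w = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += K * (1 - e_w)
    ratings[loser] -= K * (1 - e_w)


# All models start at the same (placeholder) rating.
ratings = {m: 1000.0 for m in
           ["HunyuanVideo", "Mochi1", "CogVideoX-5b", "Open-Sora 1.2", "PyramidFlow"]}

# Hypothetical votes: (preferred model, rejected model) for the same prompt.
votes = [("HunyuanVideo", "Open-Sora 1.2"),
         ("Mochi1", "PyramidFlow"),
         ("HunyuanVideo", "CogVideoX-5b")]

for winner, loser in votes:
    update(ratings, winner, loser)

for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model:15s} {rating:7.1f}")
```

The appeal of this kind of scheme is that voters only ever answer "which of these two clips is better for this prompt?", so subjective criteria like color, realism, and composition get folded into a single preference signal without anyone having to weight them explicitly.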