Show HN: Kevin-32B – how to do multi-turn RL on writing CUDA kernels

7 points by silasalberti 7 days ago
Hey, we just published a blog post about Kevin-32B = K(ernel D)evin.

It's to our knowledge the first open-source model that's RL-trained on CUDA kernels. Our goal was to demonstrate multi-turn RL using GRPO. We used 180 Python->CUDA conversion tasks from the KernelBench dataset.

The results were surprisingly strong! We were able to outperform top reasoning models like o3 & o4-mini.

We're sharing our training setup and learnings in the blog post. Also, the model is on HuggingFace: https://huggingface.co/cognition-ai/Kevin-32B
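
For readers who haven't met GRPO before, here is a rough sketch of the group-relative advantage step it is generally built around: sample several completions for the same task, score each one, and normalize every reward against its own group's mean and standard deviation. This is not taken from the Kevin-32B training code. The function name, the `eps` value, and the sample rewards are all made up for illustration, and the post itself doesn't say how rewards are computed or aggregated across turns.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt.

    Hypothetical helper: shows the usual GRPO-style normalization,
    advantage_i = (r_i - mean(r)) / (std(r) + eps),
    not the actual Kevin-32B implementation.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population std over the group
    # eps keeps the division finite when every sample gets the same reward
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Made-up rewards for four sampled kernels on one task
    sample_rewards = [0.0, 0.3, 1.2, 0.3]
    print(group_relative_advantages(sample_rewards))
```

Samples that beat their group's average get a positive advantage and the rest get a negative one, so no separate value network is needed to provide a baseline.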

No comments yet.