Show HN: An interactive transformer “debugger” I've been working on in my free time

8 points by robkop, almost 2 years ago
My focus has been shifting towards the ML alignment space recently, and in particular the ability to translate large transformer models into human-understandable circuits and algorithms. This problem potentially isn't solvable, but it is one that some groups have had success with after large amounts of effort.

In attempting to address this issue, I've been developing Transpector, a tool for scaling up, and lowering the barrier to entry of, the techniques these teams have been showing success with: techniques that aim to understand the internal mechanics of the model. Currently this tool is focused on model activations, but, free time willing, I'm planning to expand it into the gradient and weight spaces as well.

If you have some free time of your own, I encourage you to give it a try. I've found it's not only a bit of fun, but it's been a good way to help others build intuition about these models.

2 comments

quickthrower2, almost 2 years ago
Hey, this looks pretty cool. I was about to start researching the tools you use to do things like find hyperparameters, debug the network and so on. Karpathy’s YT series alludes to the need to do such things, but he hasn’t yet dug into that rabbit hole. I hope I get some time to try this out. The visuals look great and make me think this would be worth trying out as a learning (as in me learning!) tool.
segmondy, almost 2 years ago
Looks very neat. How do you use it? Looks like it's just to inspect a personal model and can't be applied to external models, is that right?