
Pushing the Limits of LLM Quantization via the Linearity Theorem

95 points | by felineflock | 24 days ago

2 comments

cs702 | 23 days ago
The OP looks like good work, but it's definitely *not* a quick read. The authors claim theoretical breakthroughs that enable:

* a data-free LLM quantization method which they claim outperforms all prior data-free approaches, including NF4; and

* a method which they claim is optimal for finding non-uniform per-layer quantization levels which match a given compression constraint in the "medium bitwidth" regime.

They demonstrate improved accuracy-compression trade-offs on popular LLMs.

Thank you for sharing this on HN.
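For readers unfamiliar with the kind of data-free quantization the comment mentions, here is a minimal sketch of blockwise 4-bit weight quantization against a fixed codebook, loosely in the spirit of NF4. The codebook, block size, and function names are assumptions made for illustration; this is not the paper's method.

```python
# Illustrative sketch (not the paper's method): blockwise 4-bit quantization of a
# weight tensor against a fixed, data-free codebook, loosely in the spirit of NF4.
import numpy as np

# Hypothetical 16-level codebook on [-1, 1]. NF4 derives its levels from a normal
# distribution; uniform levels are used here only to keep the example short.
CODEBOOK = np.linspace(-1.0, 1.0, 16)

def quantize_blockwise(weights: np.ndarray, block_size: int = 64):
    """Quantize a 1-D weight array to 4-bit codes with one absmax scale per block."""
    w = weights.reshape(-1, block_size)
    scales = np.abs(w).max(axis=1, keepdims=True) + 1e-12   # per-block absmax scale
    normalized = w / scales                                  # map each block into [-1, 1]
    # Nearest codebook entry for every weight: a 4-bit index per weight.
    codes = np.abs(normalized[..., None] - CODEBOOK).argmin(axis=-1).astype(np.uint8)
    return codes, scales

def dequantize_blockwise(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights from 4-bit codes and per-block scales."""
    return (CODEBOOK[codes] * scales).reshape(-1)

# Round-trip example on random weights.
w = np.random.randn(1024).astype(np.float32)
codes, scales = quantize_blockwise(w)
w_hat = dequantize_blockwise(codes, scales)
print("mean squared error:", float(np.mean((w - w_hat) ** 2)))
```

The "data-free" aspect is that the codebook and scales are computed from the weights alone, with no calibration data; the paper's contribution concerns choosing quantization levels and per-layer bitwidths more cleverly than this fixed scheme.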
Scene_Cast2 | 23 days ago

Given our modern understanding of how LLMs work (like the recent Anthropic work), I wonder if that insight can be used to quantize better. For example, we know that LLMs encode concepts through rotations (but not magnitude) of several neurons.

Bringing this up because the abstract (and the mention of rotations) reminded me of recent LLM interpretability posts.
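A toy sketch of the "direction, not magnitude" idea the comment alludes to: comparing activation vectors by cosine similarity measures only the angle between them, so rescaling a vector leaves the comparison unchanged. This is purely an illustration of that interpretability claim, not anything taken from the paper.

```python
# Toy illustration of "concepts are encoded in direction, not magnitude":
# cosine similarity between activation vectors is invariant to rescaling.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Angle-based similarity; unaffected by the vectors' magnitudes."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
activation = rng.standard_normal(512)   # stand-in for a hidden-state vector

scaled = 10.0 * activation              # same direction, very different magnitude
unrelated = rng.standard_normal(512)    # independent direction

print(cosine_similarity(activation, scaled))     # ~1.0: same "concept" direction
print(cosine_similarity(activation, unrelated))  # ~0.0: unrelated direction
```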
Given our modern understanding of how LLMs work (like the recent Anthropic work), I wonder if that insight can be used to quantize better. For example, we know that LLMs encode concepts through rotations (but not magnitude) of several neurons.<p>Bringing this up because the abstract (and the mention of rotations) reminded me of recent LLM interpretability posts.