Bolt: Faster matrix and vector operations that run on compressed data

183 points, by febin, nearly 3 years ago

9 comments

ffast-math, nearly 3 years ago
Author here. Ask me anything--happy to answer questions.

Also, if you like this kind of work, you might like what I've been building for the past year: Composer [1]. It speeds up neural net training by a lot (e.g., 7x faster for ResNet-50) [2] and, in contrast to Bolt/MADDNESS, is polished, documented code you can get working in <5 min.

[1] https://github.com/mosaicml/composer

[2] https://www.mosaicml.com/blog/mosaic-resnet

cgreerrun, nearly 3 years ago
Maddness is their more recent work and yields 100x speedups: https://arxiv.org/pdf/2106.10860.pdf

The code for Maddness is in the same GitHub repo if you search for "Mithral".

SIMD instructions can work wonders in the right context.
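
For readers unfamiliar with the approach, here is a minimal numpy sketch of the lookup-table idea behind Bolt/MADDNESS: split each row into subvectors, replace each subvector with the index of its nearest prototype, and answer A @ x with table lookups and adds instead of per-row multiplies. This is not the library's API; the function names and the crude "sample rows as prototypes" training step are made up for illustration (the real method learns prototypes and quantizes the tables).

```python
# Minimal sketch of the lookup-table idea (illustrative only; not Bolt's API).
import numpy as np

def train_codebooks(A, n_subspaces=4, n_codes=16, seed=0):
    """Pick per-subspace prototypes. A real implementation would run k-means;
    sampling rows is just enough to show the mechanics."""
    rng = np.random.default_rng(seed)
    sub = A.shape[1] // n_subspaces           # assumes the width divides evenly
    return [A[rng.choice(len(A), n_codes, replace=False), s*sub:(s+1)*sub]
            for s in range(n_subspaces)]

def encode(A, codebooks):
    """Replace each subvector of each row by the index of its nearest prototype."""
    sub = codebooks[0].shape[1]
    codes = []
    for s, C in enumerate(codebooks):
        block = A[:, s*sub:(s+1)*sub]
        dists = ((block[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        codes.append(dists.argmin(axis=1))
    return np.stack(codes, axis=1)            # (n_rows, n_subspaces), small ints

def approx_matvec(codes, codebooks, x):
    """Approximate A @ x using only table lookups and adds per row."""
    sub = codebooks[0].shape[1]
    out = np.zeros(codes.shape[0])
    for s, C in enumerate(codebooks):
        table = C @ x[s*sub:(s+1)*sub]        # one small dot product per prototype
        out += table[codes[:, s]]             # gather + accumulate
    return out

rng = np.random.default_rng(1)
A = rng.normal(size=(10000, 64))
x = rng.normal(size=64)
cb = train_codebooks(A)
approx = approx_matvec(encode(A, cb), cb, x)
print(np.corrcoef(A @ x, approx)[0, 1])       # correlation with the exact result
```

With only 16 prototypes per subspace, the per-row work is a handful of small-table gathers and adds, which is the access pattern the SIMD shuffle instructions mentioned above handle very well.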

Iv, nearly 3 years ago
> If you have a large collection of mostly-dense vectors and can tolerate lossy compression, Bolt can probably save you 10-200x space and compute time.

Space. It can save space.

The main limitation of fast ML models nowadays is how many parameters you can fit in your GPU memory, and those are usually matrices.

200x would allow me to run GPT-3 on my old GTX 1050.

Frameworks, please implement this NOW!

Iv, nearly 3 years ago
This is actually from a paper published last year:

https://www.reddit.com/r/MachineLearning/comments/pffoo8/r_multiplying_matrices_without_multiplying/

A few questions:

- Do any ML frameworks implement it already?
- It promises up to 200x compression; is it reasonable to expect it to let us run GPT-3 on smaller mainstream GPUs?

jansan, nearly 3 years ago
This sounds and looks impressive, but this part struck me:

"If you ... and can tolerate lossy compression"

What does this mean? I wouldn't have thought that matrix operations can be lossy. Does anybody know to what extent they are lossy and where this would be acceptable?
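
As a toy illustration of what "lossy" means in this setting, you can quantize the matrix, do the product with the quantized copy, and measure the relative error of the result. The snippet below uses crude uniform 8-bit quantization rather than Bolt's learned vector quantization, so the numbers only make the trade-off concrete; they say nothing about Bolt's actual accuracy.

```python
# Toy illustration of "lossy": quantize A, multiply, and measure the error.
# Uniform 8-bit quantization here -- far cruder than Bolt's learned codebooks.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 256))
x = rng.normal(size=256)

scale = np.abs(A).max() / 127.0               # map A's range onto int8
A_q = np.round(A / scale).astype(np.int8)     # the lossy step

exact = A @ x
approx = (A_q.astype(np.float64) * scale) @ x

rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error: {rel_err:.4%}")       # around 1% for this setup
```

Whether percent-level error in dot products is acceptable depends on the application; uses like approximate nearest-neighbor search or ML inference often tolerate it, while anything needing exact arithmetic does not.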

raxxorraxor, nearly 3 years ago
This looks good. Why do the vectors have to be dense? Just because the overhead would otherwise outweigh the speed gain? Just asking whether you could use it universally for all operations when you don't know the density.

bee_rider, nearly 3 years ago
I guess the naive approach, if we wanted to do a quick lossy matrix multiply, would be to take the truncated SVD and use that. How does this library compare to that boring strategy, I wonder?
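
For reference, the "boring strategy" looks roughly like this (a numpy sketch for comparison, not part of Bolt): replace A with its rank-k truncated SVD and multiply with the factors, which turns an O(n*d*m) product into roughly O(k*m*(n+d)) work once the SVD has been precomputed.

```python
# The "boring strategy": a rank-k truncated SVD of A, reused for the multiply.
import numpy as np

def truncated_svd_matmul(A, B, k):
    """Approximate A @ B via a precomputed rank-k factorization of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # one-time offline cost
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k]
    return Uk @ (sk[:, None] * (Vtk @ B))              # cheap when k is small

rng = np.random.default_rng(0)
A = rng.normal(size=(512, 256))
B = rng.normal(size=(256, 64))
exact = A @ B
approx = truncated_svd_matmul(A, B, k=32)
print(np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```

One relevant difference: low rank only pays off when A's singular values decay quickly (for an unstructured Gaussian A like the one above, the rank-32 error stays large), whereas the quantization approach relies on the rows clustering well, so the two methods degrade under different kinds of structure.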

nynx, nearly 3 years ago
Wow, this is fascinating. I wonder if hardware could be designed to do this really efficiently.

a-dub, nearly 3 years ago
any thoughts on trying to build a sort of vq-blas?