
Multiplying Matrices Without Multiplying

235 points by moinnadeem, over 3 years ago

17 comments

cs702 over 3 years ago
Every element of a matrix multiplication is a dot-product of two vectors. The dot-product of two vectors quantifies their similarity -- in fact, we call it "dot-product similarity" in a nearest-neighbors context: if the dot product > 0, the vectors point in similar directions; if < 0, the two vectors point in dissimilar directions; if = 0, the two vectors are orthogonal.

It's not too hard to imagine that it might be possible to learn representative K-means clusters of training vectors and then, at run time, find similar vectors using an efficient hashing scheme and do the equivalent of a table lookup to get the approximate dot-product similarity scores -- without having to perform any multiplications.

This isn't quite correct (I'm oversimplifying and ignoring many important details), but I think it's a helpful mental image.

Very clever stuff.
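
To make that mental image concrete, here is a rough numpy sketch of the vector-quantization idea (an illustration only, not the MADDNESS algorithm itself: the paper splits columns into subspaces, replaces the nearest-centroid search with cheap learned hash functions, and averages low-bit lookups):

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Rows of A will later be multiplied by a fixed matrix B.
    A_train = rng.normal(size=(2000, 64))
    B = rng.normal(size=(64, 32))

    # Offline: learn K prototype rows and precompute their products with B.
    K = 16
    km = KMeans(n_clusters=K, n_init=4, random_state=0).fit(A_train)
    table = km.cluster_centers_ @ B          # (K, 32) lookup table

    # Online: map each new row to its nearest prototype and look up the answer.
    A_new = rng.normal(size=(100, 64))
    codes = km.predict(A_new)                # prototype index per row
    approx = table[codes]                    # approximate A_new @ B, no fresh matmul with B
    exact = A_new @ B
    print(np.mean(np.abs(approx - exact)))   # approximation error

Note that the encoding step (km.predict) here still does real arithmetic; the trick in the paper is a learned hash function that makes that step multiplication-free as well.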

ffast-math over 3 years ago
Primary author here. Happy to answer questions!

Also, feel free to email me at the address in the paper if you're interested in talking about it in more detail. E.g., I've already heard from some hardware folks looking at expanding on this work.

yangjunpro over 3 years ago
An interesting piece of work, with some open questions:

1. The paper only covers the GEMM part, with small-scale experiments (CIFAR-10/100). It does not cover convolution (though see the im2col sketch below), nor the GEMM in more popular networks such as Transformer/BERT.

2. It is still an approximate method, meaning potential accuracy loss. So I think it is less attractive for training acceleration, and more likely a complementary method for inference acceleration.

3. No results are evaluated on GPUs with Tensor Cores. I am a little curious: since modern AI accelerators (including NVIDIA GPUs) all incorporate Tensor Cores, which support GEMM acceleration by design, what added value does the approximation method in this paper bring?
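
On point 1: convolutions are usually lowered to GEMM via im2col, which is how a GEMM-level approximation like this would typically be hooked into a convnet. A minimal stride-1, no-padding numpy sketch of that standard lowering (my own illustration, not code from the paper):

    import numpy as np

    def im2col(x, kh, kw):
        # Unfold a (C, H, W) input into an (H_out * W_out, C * kh * kw) patch matrix.
        c, h, w = x.shape
        h_out, w_out = h - kh + 1, w - kw + 1
        cols = np.empty((h_out * w_out, c * kh * kw))
        for i in range(h_out):
            for j in range(w_out):
                cols[i * w_out + j] = x[:, i:i + kh, j:j + kw].ravel()
        return cols

    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 8, 8))           # C=3, 8x8 input
    w = rng.normal(size=(8, 3, 5, 5))        # 8 output channels, 5x5 kernels

    patches = im2col(x, 5, 5)                # (16, 75): one row per output position
    out = patches @ w.reshape(8, -1).T       # the convolution is now a single GEMM
    out = out.T.reshape(8, 4, 4)             # back to (C_out, H_out, W_out)

    # Spot-check one output element against the direct definition of convolution.
    assert np.allclose(out[0, 0, 0], np.sum(x[:, 0:5, 0:5] * w[0]))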

esjeon over 3 years ago
> MADDNESS

I think this name fits the concept really well. Replacing matrix multiplication with some hash lookups! (Warning: an overly simplified statement.)

This is a really interesting application of PQ (product quantization), which itself also requires learning (usually K-means). Paper: https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf

Considering that ANNs have survived many approximations (e.g. lower precision, pruning), and that many ANN applications are subjective anyway (e.g. image processing, recommendation), I think this can be highly useful in many cases.

TaupeRanger over 3 years ago
Method 1: computer, use MUL to multiply X and Y.

Method 2: computer, remember this number. When I give you X and Y, tell me what that number was. This requires no multiplication! Let us tell SCIENCE about our discovery!

infogulch over 3 years ago
I wonder how much alternative number representations have been studied for use in matrices. Like, storing the log of values instead so that multiplication can be done by just adding. Or something like Zech's logarithm. Or even, take the log of whole matrices (which is a thing, apparently [0]), then adding them to compute their multiplication. I wonder if the logarithm of a matrix can be computed using ML.

[0]: https://en.wikipedia.org/wiki/Logarithm_of_a_matrix
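
Both ideas are easy to poke at numerically. A minimal numpy/scipy sketch (my own illustration): element-wise, logs do turn multiplication into addition for positive values, but for whole matrices expm(logm(A) + logm(B)) only recovers A @ B when A and B commute, which is one reason the matrix-log route doesn't directly give a general fast matmul.

    import numpy as np
    from scipy.linalg import expm, logm

    rng = np.random.default_rng(0)

    # Element-wise: multiplication becomes addition in log space (positive values only).
    a = rng.uniform(0.1, 1.0, size=4)
    b = rng.uniform(0.1, 1.0, size=4)
    print(np.allclose(np.exp(np.log(a) + np.log(b)), a * b))       # True

    # Whole matrices: expm(logm(A) + logm(B)) == A @ B only if A and B commute.
    A = np.diag([1.0, 2.0, 3.0])
    B = np.diag([4.0, 5.0, 6.0])                                   # diagonal matrices commute
    print(np.allclose(expm(logm(A) + logm(B)), A @ B))             # True

    C = np.array([[2.0, 1.0], [0.0, 1.0]])
    D = np.array([[1.0, 0.0], [1.0, 2.0]])                         # these do not commute
    print(np.allclose(expm(logm(C) + logm(D)), C @ D))             # False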

bawana over 3 years ago
Lots of discussion here about higher-dimensional matrices. The argument is that the dot product is an operation on two arguments, therefore any multidimensional matrix multiplication can be broken down into multiple two-dimensional operations. The image offered is that two vectors define a plane, so any two-dimensional operation is valid. But what about curved space? Suppose I have two vectors in curved space -- they look like two droopy arrows. How does one calculate this dot product? I guess the local curvature has to be taken into account. In spherically shaped space this is a constant, but what about irregularly bent space, like that near an amorphous blob of dark matter? Suppose you are trying to calculate electrostatic potentials or do anything with Maxwell's equations?

Currently, we define the curvature of space with gravity or the deflection of light past massive objects. Is there a way to measure the curvature of space locally? Can the curvature of space alter the piezoelectric potential generated by a crystal, for example, and allow its measurement? I am thinking that a minuscule deformation of a large macroscopic object is multiplied many fold on a property that is distributed at the atomic level and happening in parallel on all the atoms of a crystal.
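
For what it's worth, the standard answer to "how does one calculate this dot product" is a local metric tensor: at a point p the inner product of two tangent vectors is x^T g(p) y, which reduces to the ordinary dot product when g is the identity. A tiny numpy illustration with a made-up metric (my own example, unrelated to the paper):

    import numpy as np

    def inner(x, y, g):
        # Inner product of tangent vectors x, y under a local metric g.
        return x @ g @ y

    x = np.array([1.0, 2.0, 0.5])
    y = np.array([0.3, -1.0, 2.0])

    g_flat = np.eye(3)                        # Euclidean metric: ordinary dot product
    g_curved = np.array([[1.2, 0.1, 0.0],     # made-up symmetric positive-definite metric
                         [0.1, 0.9, 0.0],     # standing in for local curvature
                         [0.0, 0.0, 1.5]])

    print(inner(x, y, g_flat), x @ y)         # identical
    print(inner(x, y, g_curved))              # curvature-weighted value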

alfor over 3 years ago
I wonder how our brain can train billions of neurons without matrix multiplication. What is the biological process that gets a similar result?

trilinearnz over 3 years ago
Clever and logical. It reminds me of when John Carmack used a precomputed table of sine values for fast lookup in Quake, rather than running the actual function on the CPU.
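
A minimal sketch of that precomputed-lookup idea (my own illustration, not Quake's actual table code):

    import numpy as np

    # Precompute sin at N evenly spaced angles once...
    N = 4096
    TABLE = np.sin(np.linspace(0.0, 2.0 * np.pi, N, endpoint=False))

    def fast_sin(theta):
        # ...then approximate sin(theta) by indexing the nearest precomputed entry.
        idx = int(round(theta / (2.0 * np.pi) * N)) % N
        return TABLE[idx]

    print(fast_sin(1.0), np.sin(1.0))         # agree to about the table's resolution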

joe_the_user over 3 years ago
Shouldn't the solution to fast matrix multiplication be logarithms, similarly to fast scalar multiplication?

https://en.wikipedia.org/wiki/Logarithm_of_a_matrix

nynx over 3 years ago
Huh, that's kinda fascinating. Maybe it'd be worth running ML on matrix multiplication that's approximated by ML?

ausbah over 3 years ago
As a total outsider to this sort of stuff, doesn't this have the big issue of error propagation?

sriram_malhar over 3 years ago
Can someone help me understand how approximate it is? What are the error bounds?

antonzabirko over 3 years ago
I had this same exact thought as an undergrad about 3 years ago! I kinda gave up due to the massive barrier and the difficult financial burdens faced by PhD students. It feels nice to know I wasn't crazy.

random314 over 3 years ago
Ha ha! What the hell! This is revolutionary if true!

kazinator over 3 years ago
If machine learning depends on matrix multiplications, but you fib them using machine learning ... do you see the problem?

leecarraher over 3 years ago
For some reason they run all their tests single-threaded, while parallelism seems to be where all computing hardware is inevitably going. I also wish they had runtime comparisons to more recent matrix sketching and multiplication methods such as frequent directions, newer FJLT implementations, and RIP matrices based on hashing.
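
For anyone unfamiliar with matrix sketching, here is the simplest flavor as a numpy sketch (a plain Gaussian random projection, not the frequent-directions or FJLT methods named above): shrink the shared inner dimension with a random matrix S, for which E[S @ S.T] = k * I, so the result is an unbiased estimate of A @ B.

    import numpy as np

    rng = np.random.default_rng(0)
    m, d, n, k = 200, 1000, 150, 300

    A = rng.normal(size=(m, d))
    B = rng.normal(size=(d, n))

    # Sketch the shared dimension d down to k; unbiased since E[S @ S.T] = k * I.
    S = rng.normal(size=(d, k))
    approx = (A @ S) @ (S.T @ B) / k
    exact = A @ B

    rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
    print(rel_err)                            # shrinks as k approaches d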