In a few areas of Rust we're starting to see a convergence on lower-level libraries that can then be shared among the higher-level crates. For example, wgpu is seeing broad use across a range of libraries, from game engines to UI toolkits. That lets the shared library be made more robust, with more pooled resources going into it.<p>Does anyone know how much of this is happening in the matrix/array space in Rust? There are several libraries with overlapping goals: ndarray, nalgebra, etc. How much do they share in terms of underlying code? Do they share data structures, or is anything like that on the horizon?
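As far as I know they don't share data structures today; the practical interop is moving data through contiguous buffers. A minimal sketch, assuming recent ndarray and nalgebra versions (ndarray is row-major by default, nalgebra is column-major, so from_row_slice does the reordering):

  // Copy an ndarray matrix into an nalgebra one via its row-major slice.
  use nalgebra::DMatrix;
  use ndarray::Array2;

  fn main() {
      // 2x3 row-major ndarray matrix.
      let a = Array2::from_shape_vec((2, 3), vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).unwrap();
      // nalgebra stores column-major, so build it from the row-major slice.
      let b = DMatrix::from_row_slice(2, 3, a.as_slice().unwrap());
      assert_eq!(b[(1, 2)], 6.0);
  }

There are conversion helpers floating around, but nothing like a shared core the way wgpu is for graphics, as far as I can tell.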
This does not seem to depend on BLAS/LAPACK.<p>Good to see LU decomposition with full pivoting being implemented here (it's missing from BLAS/LAPACK). It gives a fast, numerically stable way to compute the rank of a matrix, along with bases for the kernel and image spaces. Details: <a href="https://www.heinrichhartmann.com/posts/2021-03-08-rank-decomposition/" rel="nofollow">https://www.heinrichhartmann.com/posts/2021-03-08-rank-decom...</a>.
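For anyone curious how the rank falls out of it: full pivoting gives P·A·Q = L·U, and the rank is the number of non-negligible pivots on the diagonal of U. A sketch of just that idea, using nalgebra's full_piv_lu rather than faer's API (which I haven't looked at), with an arbitrary placeholder tolerance:

  use nalgebra::DMatrix;

  // Count pivots of U until the first negligible one (heuristic tolerance).
  fn rank(a: &DMatrix<f64>, tol: f64) -> usize {
      let lu = a.clone().full_piv_lu(); // P * A * Q = L * U
      let u = lu.u();
      (0..u.nrows().min(u.ncols()))
          .take_while(|&i| u[(i, i)].abs() > tol)
          .count()
  }

  fn main() {
      // Rank-1 example: the second row is 2x the first.
      let a = DMatrix::from_row_slice(2, 3, &[1.0, 2.0, 3.0, 2.0, 4.0, 6.0]);
      assert_eq!(rank(&a, 1e-12), 1);
  }

The linked post covers how the same factorization also hands you bases for the kernel and image.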
Why is Eigen not run in parallel mode with OpenMP?<p>Eigen handles most of these tasks (if not all; I just skimmed the tables) in parallel [0]. Plus, it has hand-tuned SIMD code inside, so it needs "-march=native -mtune=native -O3" to go "full send".<p>Some solvers' speed changes by more than 3x with "-O3", to begin with.<p>This is the Eigen benchmark file [1].<p>[0]: <a href="https://eigen.tuxfamily.org/dox/TopicMultiThreading.html" rel="nofollow">https://eigen.tuxfamily.org/dox/TopicMultiThreading.html</a><p>[1]: <a href="https://github.com/sarah-ek/faer-rs/blob/main/faer-bench/eigen.cpp">https://github.com/sarah-ek/faer-rs/blob/main/faer-bench/eig...</a>
Something looks dubious about the benchmarking here to me.<p>Top-tier numerical linear algebra libraries should all hit the same number (give or take a few percent) for matrix multiply, because they're all achieving the same hardware peak performance.
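Back-of-the-envelope version of that claim (all numbers hypothetical, just to show the arithmetic): a GEMM costs about 2n^3 flops, so you can compare the measured rate against the machine's theoretical peak.

  fn main() {
      let n: f64 = 4096.0;   // hypothetical square matrix dimension
      let seconds = 0.37;    // hypothetical measured time for one GEMM
      let gflops = 2.0 * n * n * n / seconds / 1e9;

      // Hypothetical peak: 8 cores at 3.0 GHz, each doing 2 FMA units
      // x 2 flops x 4 f64 lanes (AVX2) = 16 flops/cycle.
      let peak = 8.0 * 3.0 * 16.0;
      println!("measured: {gflops:.0} GFLOP/s vs peak: {peak:.0} GFLOP/s");
  }

A well-tuned GEMM from any of these libraries should land within a few percent of that peak, which is why big gaps in a matmul benchmark are a red flag.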
Looking at thin matrix SVD, it appears <i>much</i> faster than everyone else. I’m curious what it’s doing differently at a high level and if there’s any tradeoff in accuracy. I also wonder how it compares to MKL, which is typically the winner in all these benchmarks on Intel.
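No idea what faer actually does internally, but the classic trick for tall-and-skinny matrices is to run a QR first and then take the SVD of the small R factor, since A = QR = (Q·U_r)·S·V^T. A sketch of that idea with nalgebra (illustration only, not faer's implementation):

  use nalgebra::DMatrix;

  fn main() {
      // Tall matrix, m >> n.
      let a = DMatrix::from_fn(1000, 20, |i, j| ((i * 31 + j * 7) % 13) as f64);

      // A = Q R, with Q 1000x20 and R 20x20.
      let qr = a.qr();
      let (q, r) = (qr.q(), qr.r());

      // SVD of the tiny R: R = U_r S V^T, so A = (Q U_r) S V^T.
      let svd = r.svd(true, true);
      let u = q * svd.u.unwrap();  // left singular vectors of A
      println!("{} singular values, U is {}x{}",
               svd.singular_values.len(), u.nrows(), u.ncols());
  }

If it's something along those lines, accuracy is usually comparable, but backward-error numbers next to the timings would settle it.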
A bit of a tangent, but the same author also has a Rust counterpart to libdivide (the C++ library): <a href="https://github.com/sarah-ek/fastdiv">https://github.com/sarah-ek/fastdiv</a>. Cool.
How exactly does this dovetail with <a href="https://github.com/rust-or/good_lp">https://github.com/rust-or/good_lp</a> ? Will it be a replacement, an enhancement, or something else?
Nixpkgs has some pluggable BLAS/LAPACK implementation infrastructure. If this offers a shim layer implementing exactly that interface, it would be nice to see it packaged as a new alternative!