The success we're seeing with neural networks is tightly coupled with the ability to scale. The algorithm itself works at scale (more layers), and it also scales well with hardware: neural nets mostly consist of matrix multiplications, and GPUs have specialised matrix multiplication acceleration. AlexNet, one of the most impactful neural network papers, made its mark by showing that NNs could be put on the GPU, scaled, and accelerated to great effect.

It's not clear from the paper how well this algorithm will scale, both in terms of the algorithm itself (does it still train well with more layers?) and its ability to make use of hardware acceleration (e.g. it's not clear to me that the structure, with its per-weight activation functions, can make use of fast matmul acceleration - see the rough sketch below).

It's an interesting idea that seems to work well and have nice properties at a smaller scale, but whether it's a good architecture for ImageNet, LLMs, etc. is not clear at this stage.
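To make the hardware point concrete, here's a rough NumPy sketch. It's not from the paper: the function names, toy shapes, and the piecewise-linear np.interp stand-in for the paper's learned splines are all my own illustration. The contrast is that a standard dense layer is one big matmul plus an elementwise nonlinearity, while a per-edge-activation layer evaluates a separate 1-D function for every input/output pair, which doesn't obviously reduce to a single matrix multiplication:

    import numpy as np

    def dense_layer(x, W, b):
        # Standard layer: one matmul over the whole weight matrix, then a
        # single elementwise nonlinearity -- maps directly onto GPU matmul units.
        return np.maximum(0.0, x @ W + b)      # (batch, d_in) @ (d_in, d_out)

    def per_edge_activation_layer(x, coeffs, grid):
        # Hypothetical sketch of a per-weight-activation layer: each edge (i, j)
        # applies its own learned 1-D function to the scalar x_i before summing.
        # Here each function is a toy piecewise-linear interpolation table
        # (coeffs[i, j]) over a shared grid; the work is d_in * d_out separate
        # 1-D evaluations rather than one big matmul.
        batch, d_in = x.shape
        d_out = coeffs.shape[1]
        out = np.zeros((batch, d_out))
        for i in range(d_in):
            for j in range(d_out):
                out[:, j] += np.interp(x[:, i], grid, coeffs[i, j])
        return out

    # toy usage
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    W, b = rng.normal(size=(8, 16)), np.zeros(16)
    grid = np.linspace(-3.0, 3.0, 11)
    coeffs = rng.normal(size=(8, 16, 11))      # one 1-D function per edge
    print(dense_layer(x, W, b).shape)                          # (4, 16)
    print(per_edge_activation_layer(x, coeffs, grid).shape)    # (4, 16)

Maybe the per-edge evaluations can be batched into something GPU-friendly (e.g. expressing the spline evaluation as a basis expansion followed by a matmul), but that's exactly the kind of question the paper doesn't answer yet.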