I remember learning as part of my CS degree that knowing the distribution of your data lets you sort it in O(N log log N) instead of O(N log N) (which assumes a comparison-based sort). Back then it wasn't given such a fashionable name as "learned sort", and determining the distribution wasn't called "training".
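The idea can be sketched as a distribution-aware bucket sort: if you know (or have estimated) the CDF of the keys, you can use it to place each element near its final rank, leaving only small local fixups. This is an illustrative sketch, not the paper's method; the `cdf` argument here stands in for whatever distribution model you've fitted.

```python
import random

def dist_aware_sort(xs, cdf):
    """Sketch of a distribution-aware sort: use an assumed-known CDF to
    place each key near its final position, then fix up each small bucket.
    With a well-matched CDF the buckets stay tiny, so the expected cost
    is close to linear rather than N log N."""
    n = len(xs)
    buckets = [[] for _ in range(n)]
    for x in xs:
        # The CDF maps a key to its approximate rank fraction in [0, 1).
        i = min(int(cdf(x) * n), n - 1)
        buckets[i].append(x)
    out = []
    for b in buckets:
        out.extend(sorted(b))  # tiny buckets -> cheap local fixup
    return out

# Usage: keys drawn uniformly from [0, 1), so the true CDF is cdf(x) = x.
data = [random.random() for _ in range(1000)]
assert dist_aware_sort(data, lambda x: x) == sorted(data)
```

If the assumed CDF is badly wrong, elements pile into a few buckets and the per-bucket `sorted` calls dominate, which is why robustness to model error matters.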
>The results show that our approach yields an average 3.38x performance improvement over C++ STL sort, which is an optimized Quicksort hybrid, 1.49x improvement over sequential Radix Sort, and 5.54x improvement over a C++ implementation of Timsort, which is the default sorting function for Java and Python.

This is a lot
Can this approach be made more robust so it could be used in critical programs? Is there a way to formally verify performance of an approach like this?
Isn't the paper "NN-sort: Neural Network based Data Distribution-aware Sorting" [1] from a year prior to the cited paper? It seems to discuss the same approach.

[1] NN-sort: Neural Network based Data Distribution-aware Sorting: https://arxiv.org/pdf/1907.08817.pdf