Hello everyone. I'm Giorgio, co-author of the PGM-index paper together with Paolo Ferragina.

First of all, I'd like to thank @hbrundage for sharing our work here, and everyone who has shown interest in it. I'll do my best to answer any questions in this thread.

I'd also like to mention two other related papers:

- "Why are learned indexes so effective?", presented at ICML 2020 and co-authored with Paolo Ferragina and Fabrizio Lillo.

PDF, slides and video: http://pages.di.unipi.it/vinciguerra/publication/learned-indexes-effectiveness/

TL;DR: In the VLDB 2020 paper, we proved the (rather pessimistic) statement that the PGM-index has the same worst-case query and space bounds as B-trees. Here, we show that, under some general assumptions on the input data, the PGM-index actually improves the space bound of B-trees from O(n/B) to O(n/B^2) with high probability, where B is the disk page size.

- "A 'learned' approach to quicken and compress rank/select dictionaries", presented at ALENEX 2021 and co-authored with Antonio Boffa and Paolo Ferragina.

PDF and code: http://pages.di.unipi.it/vinciguerra/publication/learned-rank-select/

TL;DR: You can use piecewise linear approximations to compress not only the index but also the data! We present a compressed bitvector/container supporting efficient rank and select queries, which is competitive with several well-established implementations of succinct data structures.
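If it helps to make the piecewise-linear idea concrete, here's a toy C++ sketch I'm writing off the cuff (it is not the code from the paper, just the gist of it): the positions of the 1-bits form an increasing sequence, so you approximate the map i -> position of the i-th 1-bit with one line per block and store only a small per-element correction; select then evaluates the line and adds the correction back. The real data structure chooses the segments much more carefully and bit-packs the corrections, which is where the compression comes from.

    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    // Toy "learned" select: one line per block of 1-bit positions, plus a
    // small correction per element instead of the full 64-bit position.
    struct ToyLearnedSelect {
        struct Segment { double slope, intercept; };

        size_t block;
        std::vector<Segment> segments;     // one fitted line per block
        std::vector<int32_t> corrections;  // true position - prediction

        ToyLearnedSelect(const std::vector<int64_t>& ones, size_t block_size)
            : block(block_size) {
            for (size_t start = 0; start < ones.size(); start += block) {
                size_t end = std::min(start + block, ones.size());
                // Fit a line through the first and last point of the block
                double slope = end - start > 1
                    ? double(ones[end - 1] - ones[start]) / double(end - 1 - start)
                    : 0.0;
                double intercept = double(ones[start]) - slope * double(start);
                segments.push_back({slope, intercept});
                for (size_t i = start; i < end; ++i) {
                    int64_t pred = std::llround(slope * double(i) + intercept);
                    corrections.push_back(int32_t(ones[i] - pred));  // assumed to fit in 32 bits
                }
            }
        }

        // select(i): position of the i-th 1-bit = line prediction + correction
        int64_t select(size_t i) const {
            const Segment& s = segments[i / block];
            int64_t pred = std::llround(s.slope * double(i) + s.intercept);
            return pred + corrections[i];
        }
    };

    int main() {
        // A mildly irregular but roughly linear set of 1-bit positions
        std::vector<int64_t> ones;
        for (int64_t i = 0; i < 1000; ++i)
            ones.push_back(7 * i + (i % 13));

        ToyLearnedSelect sel(ones, 64);
        std::cout << sel.select(0) << " " << sel.select(500) << " " << sel.select(999) << "\n";
        // Prints the same values as ones[0], ones[500], ones[999]
    }

The point of the toy is only that, when the data is close to linear, the corrections need far fewer bits than the absolute positions they replace.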
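And for anyone who just wants to play with the PGM-index itself, a query looks roughly like this with the C++ library (I'm recalling the header path, the pgm::PGMIndex class and the search()/lo/hi fields from the repo's README, so please check the current docs): the index predicts a position and returns a narrow range guaranteed to contain the key, and a standard binary search finishes the lookup.

    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <vector>
    #include "pgm/pgm_index.hpp"   // header name taken from the library repo

    int main() {
        // Some sorted keys
        std::vector<int> data(1000000);
        std::generate(data.begin(), data.end(), std::rand);
        data.push_back(42);
        std::sort(data.begin(), data.end());

        // Build the index; epsilon trades space for query time
        const int epsilon = 128;
        pgm::PGMIndex<int, epsilon> index(data);

        // search() returns an approximate position plus a range [lo, hi)
        // guaranteed to contain the key; a binary search finishes the lookup
        auto q = 42;
        auto range = index.search(q);
        auto lo = data.begin() + range.lo;
        auto hi = data.begin() + range.hi;
        std::cout << *std::lower_bound(lo, hi, q) << std::endl;
    }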