An interesting variant of space-filling curves + dimensionality reduction is Geohash (<a href="https://en.wikipedia.org/wiki/Geohash" rel="nofollow">https://en.wikipedia.org/wiki/Geohash</a>, <a href="http://geohash.org/" rel="nofollow">http://geohash.org/</a>) which takes a lon/lat and uses a Z-curve approach to produce a hash such as `u4pruydqqvj` representing the location. This hash value is basically "how far along the space-filling curve is the lon/lat located". You're reducing two dimensions (lon/lat) into one dimension (how far along the space-filling curve).<p>There's a unique side-effect to Geohashes in that the value (`u4pruydqqvj`) can have it's end "lopped off" (i.e. cut down to `u4pru`) and it still represents a less precise, but generally accurate representation of the original lon/lat in most cases (when the curve isn't near the edge of the 2d map!). This allows you to index locations (lat/lon) using a string ('u4pru') which opens you up to doing b-tree, range queries, etc. in traditional database, with one field.<p>Just a rad math quirk! I'm not an expert, and it's a very dense book, but if someone really wants to get into this kind of stuff the "Bible" is "Foundations of Multidimensional and Metric Data Structures" by Hanan Samet.
Space filling curves like Hilbert sound esoteric but they’re actually useful in for high performance database indexing.<p>DuckDB has a Hilbert index.<p><a href="https://github.com/rustyconover/duckdb-lindel-extension">https://github.com/rustyconover/duckdb-lindel-extension</a><p>It’s not just DuckDB. SQL server and Postgres support Hilbert too.<p>The big idea is that you can project tuples e.g (lat, long) or just about any other tuple structure (string or floats) in a single integer, ordered in a way that is locality preserving.<p>Because most queries tend to lookup data close to each other, this can lead to performance gains. Parquet statistics can be built to be locality sensitive (locality defined broadly, not just numbers).<p>If you index on Hilbert, you can specific compute a max min of the Hilbert range of your predicate and eliminate looking outside that range, speeding up query performance.
I read an entire book on applications of space filling curves, and none of the applications seemed to actually work as advertised (though perhaps they did at the time the book was written, I am skeptical).<p>All the stuff about database indexing and cache locality seems to have become obsolete with improvements in compilers; the overhead of computing the Hilbert curve indices is just too high. The indexing method that people seem to actually use is the Z-order/Morton order, which just interleaves bits and is mathematically less interesting [1].<p>[1]: <a href="https://en.wikipedia.org/wiki/Z-order_curve" rel="nofollow">https://en.wikipedia.org/wiki/Z-order_curve</a>
I wrote an article about it and S2 some time ago as well for those interested: <a href="https://blog.christianperone.com/2015/08/googles-s2-geometry-on-the-sphere-cells-and-hilbert-curve/" rel="nofollow">https://blog.christianperone.com/2015/08/googles-s2-geometry...</a>
One problem I’ve encountered is the inverse mapping is not necessarily locality preserving.<p>Hilbert maps (x,y) -> h such that nearby h correspond to nearby x,y.<p>But the inverse isn’t true: similar x,y don’t always have similar h.<p>This is a problem for some applications where we are trying to find dense hot spots cheaply by only looking at h’s, but turns out this doesn’t work out so well.
ahhhh...space-filling curves.<p>With some of the recent work done with ClickHouse this has been an area I've had to learn about and, tbh, wish I had found this earlier.<p>Our CTO started using Morton curves for interesting purposes several months ago with initially piqued my interest and started me down the path.<p><a href="https://reversedns.space/" rel="nofollow">https://reversedns.space/</a> (a map of the Internet) - Morton curve is used as a visualization tool;<p><a href="https://adsb.exposed/" rel="nofollow">https://adsb.exposed/</a> (a visualizer of air traffic) - Morton curve is used as a database index;<p>Our most recent release post dove into the usage of Hilbert curves in context of maintaining spatial locality for weather measurements<p><a href="https://clickhouse.com/blog/clickhouse-release-24-06#hilbert-curves" rel="nofollow">https://clickhouse.com/blog/clickhouse-release-24-06#hilbert...</a><p>Quite an interesting area of study and optimisation!
This reminds of <a href="https://binvis.io" rel="nofollow">https://binvis.io</a> a binary file visualisation tool that uses space filling curves for visualisation. Discussed here previously <a href="https://news.ycombinator.com/item?id=39210436">https://news.ycombinator.com/item?id=39210436</a>