TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Hexagons and Hilbert curves – The headaches of distributed spatial indices

89 点作者 max_sendfeld大约 1 年前

12 条评论

spenczar5大约 1 年前
Oh boy, this gives me a chance to talk about one of the gems of astronomy software which deserves to be better known: HEALPixel tesselation!<p>HEALPixels stand for &#x27;Hierarchical Equal-Area Iso-latitudinal Pixels&#x27;. It is a scheme that was developed to analyze signals that cover the entire sky, but with variable density.<p>Like HTM or Hilbert curves, this can be used to organize spatial data.<p>The tesselation looks kind of funny but has many good features - it doesn&#x27;t have discontinuities at poles, and is always equal area. And with the &quot;nested&quot; healpixel formulation, pixels are identified by integers. Pixel IDs are hierarchical based on leading bits - so, for example, pixel 106 (=0110 1010) contains pixel 1709 (=0110 1010 1101). This lets you do some marvelous optimizations in queries if you structure your data appropriately. Nearest neighbor searches can be extremely quick if things are HEALPix-indexed - and so can radius searches, and arbitrary polygon searches.<p>HEALPixels are used today for more than just their original intent. LSST will use them for storing all-sky data and point source catalogs, for example.<p>More here:<p>- Original NASA&#x2F;JPL site: <a href="https:&#x2F;&#x2F;healpix.jpl.nasa.gov&#x2F;" rel="nofollow">https:&#x2F;&#x2F;healpix.jpl.nasa.gov&#x2F;</a><p>- Popular Python implementation: <a href="https:&#x2F;&#x2F;healpy.readthedocs.io&#x2F;en&#x2F;latest&#x2F;" rel="nofollow">https:&#x2F;&#x2F;healpy.readthedocs.io&#x2F;en&#x2F;latest&#x2F;</a><p>- Good PDF primer: <a href="https:&#x2F;&#x2F;healpix.jpl.nasa.gov&#x2F;pdf&#x2F;intro.pdf" rel="nofollow">https:&#x2F;&#x2F;healpix.jpl.nasa.gov&#x2F;pdf&#x2F;intro.pdf</a><p>And an experimental database being built on healpix for extremely large data volumes (certainly many TB, maybe single-digit PB): <a href="https:&#x2F;&#x2F;github.com&#x2F;astronomy-commons&#x2F;hipscat">https:&#x2F;&#x2F;github.com&#x2F;astronomy-commons&#x2F;hipscat</a>
评论 #39793334 未加载
评论 #39795248 未加载
michelpp大约 1 年前
I experimented with geospatial Hilbert Curves as a Postgres extension [0] for PostGIS using the S2 [1] spherical geometry library. S2 uses a scale free cell coverage pattern that is numbered using six interlocking space filling Hilbert Curves [2]. This approach is similar to what is used in this article. S2 doesn&#x27;t use hexagons but a different cell structure but the idea is generally the same.<p>By having both high level (cell) and low level (cell id) geometries it was a very powerful library which allowed projection from the hilbert space into a Postgres spatial index (spgist) including various trees, like noted in this article. It appears to be still quite active in development.<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;michelp&#x2F;pgs2">https:&#x2F;&#x2F;github.com&#x2F;michelp&#x2F;pgs2</a><p>[1] <a href="https:&#x2F;&#x2F;s2geometry.io&#x2F;" rel="nofollow">https:&#x2F;&#x2F;s2geometry.io&#x2F;</a><p>[2] <a href="https:&#x2F;&#x2F;s2geometry.io&#x2F;devguide&#x2F;s2cell_hierarchy" rel="nofollow">https:&#x2F;&#x2F;s2geometry.io&#x2F;devguide&#x2F;s2cell_hierarchy</a>
评论 #39792331 未加载
评论 #39793795 未加载
tromp大约 1 年前
&gt; a vehicle with a Hilbert Curve position of 0.34 is really close to one with 0.35 and really far from one with 0.89<p>But points with a large difference in their single curve coordinate can be either far apart or close together. E.g. on this 16 point Hilbert curve<p><pre><code> __. .__ __| |__ | __ | |__| |__| </code></pre> the &#x27;.&#x27; marked points at 1&#x2F;16th and 15&#x2F;16th along the curve are adjacent.
评论 #39792342 未加载
joe_the_user大约 1 年前
The problem of finding the shortest path on a map (a la google maps etc) is often solved with approaches like &quot;hub networks&quot; and &quot;contraction hierarchies&quot; rather than simple spatial networks. Things like dijkstra&#x27;s algorithm are only efficient if you&#x27;re input is the raw graph and point and you only want one path once.<p>See just for a start: <a href="https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;1504.05140" rel="nofollow">https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;1504.05140</a>
Lichtso大约 1 年前
Here is one of the recent advancements in performing similarity (including nearest neighbor) search on sparse high dimensional data using learned indices: <a href="https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2204.10028" rel="nofollow">https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2204.10028</a><p>The paper basically deals with the most adverse case but the approach should work in lower dimensions and in data with a non-uniform density distribution as well.
zX41ZdbW大约 1 年前
I&#x27;ve recently implemented a map of the Internet using a similar technique: <a href="https:&#x2F;&#x2F;reversedns.space&#x2F;" rel="nofollow">https:&#x2F;&#x2F;reversedns.space&#x2F;</a><p>In ClickHouse, we have almost every mentioned technique: H3 and S2, geohashes, and indexing by space-filling curves.
评论 #39792886 未加载
feverzsj大约 1 年前
The purpose of using space-filling curves is to convert coordinates into 1 dimensional data, thus they can be indexed by ordinary database index, no matter distrusted or not. The problem is it&#x27;s much slower than a &quot;real&quot; spatial index like R-tree, and the supported queries are rather limited.<p>On the other hand, spatial data can be easily partitioned, rendering the aforementioned method less useful. Also, I&#x27;ve never seen anyone needed more than a single instance of postgres to index all the spatial data, unless you are dealing with spatial temporal data, which can grow infinitely large. Then again, spatial temporal data can be easily partitioned.
评论 #39794058 未加载
patches11大约 1 年前
It would be interesting to know what other solutions exist in this space and why they weren’t chosen.<p>I know of at least one, GeoMesa, which seems like it could at least provide the building blocks to achieve what they are trying to do.
评论 #39792896 未加载
scentoni大约 1 年前
Also see &quot;Evenly distributing points on a sphere&quot; <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=17842057">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=17842057</a>
klysm大约 1 年前
I got to use space filling curves in my first ever professional programming experience to build image pyramids in a cache efficient way
xrd大约 1 年前
Makes me want to play more hex.<p><a href="https:&#x2F;&#x2F;www.drking.org.uk&#x2F;hexagons&#x2F;hex&#x2F;templates.html" rel="nofollow">https:&#x2F;&#x2F;www.drking.org.uk&#x2F;hexagons&#x2F;hex&#x2F;templates.html</a>
otoburb大约 1 年前
I knew about hexagons but not about R-trees and Hilbert curves, and then combining the two. Even though that was an inbound content article, it <i>was</i> at least informative.