TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Compressing graphs and indexes with recursive graph bisection (2016)

16 pointsby luu11 months ago

3 comments

ot11 months ago
I&#x27;m one of the authors of the paper, nice to see it on HN.<p>I remember when we first experimented with this, the compression improvement compared to our previous heuristic was massive, but the algorithm took a day to run on a double-digit-node Giraph cluster for a single index shard. I was very skeptical we&#x27;d ever be able to use it in production, given that we had to run it on thousands of index shards every few days.<p>Eventually we reimplemented it in C++, optimized all the data structures to make them fit in memory, and we were able to run it in a couple of hours on a single (beefy) machine. Over the years it has been optimized further.<p>The algorithm has been reproduced externally with an open-source implementation [1], which AFAIR was pretty good when I looked at it.<p>[1] <a href="https:&#x2F;&#x2F;culpepper.io&#x2F;publications&#x2F;mm+19-ecir.pdf" rel="nofollow">https:&#x2F;&#x2F;culpepper.io&#x2F;publications&#x2F;mm+19-ecir.pdf</a>
评论 #40806745 未加载
CrimsonCape11 months ago
I wonder if there&#x27;s any development of this, since this paper is from 2016.<p>This is an example of an NP-hard problem that surprisingly broadly applies to everyday situations, just like the bin-packing problem that has been on the front page recently. Do you know of other NP hard problems with surprising applicability to everyday situations?
评论 #40805907 未加载
aurumque11 months ago
This is such a nice and straightforward result. Encode the vertex ID&#x27;s using locality-preserving integers, and then store the deltas between the ID&#x27;s instead of the ID&#x27;s themselves.