TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Burrows–Wheeler Transform

162 pointsby bshanksover 2 years ago

16 comments

dkbrkover 2 years ago
Honestly, the inverse Burrows-Wheeler transform seems like some sort of Voodoo black magic to me.<p>It reminds be of the 100 prisoners problem [0]. Yes, I understand why it works. I can see how it works mathematically. But it still feels like it shouldn&#x27;t work, that we&#x27;re somehow getting something for free.<p>[0]: <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;100_prisoners_problem" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;100_prisoners_problem</a>
评论 #32963676 未加载
评论 #32965199 未加载
评论 #32961743 未加载
评论 #32961165 未加载
评论 #32969371 未加载
评论 #32966079 未加载
评论 #32969322 未加载
tdidoover 2 years ago
This is used in both bwa [1] and bowtie [2], two of the most popular DNA sequence aligners.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;lh3&#x2F;bwa" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;lh3&#x2F;bwa</a><p>[2] <a href="https:&#x2F;&#x2F;github.com&#x2F;BenLangmead&#x2F;bowtie" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;BenLangmead&#x2F;bowtie</a>
评论 #32962498 未加载
评论 #32961381 未加载
lancefisherover 2 years ago
This is a fun video from a series on compression that explains it well, and features Mike Burrows:<p><a href="https:&#x2F;&#x2F;youtu.be&#x2F;4WRANhDiSHM" rel="nofollow">https:&#x2F;&#x2F;youtu.be&#x2F;4WRANhDiSHM</a><p>He shares the origin of the algorithm as well as a story about how it was first published.<p>The Compressor Head video series is the best introduction to compression that I’ve found.
评论 #32961707 未加载
ur-whaleover 2 years ago
I still remember when I first read the first C implementation of this, which was IIRC by Mike Burrows.<p>Apparently, back in the days, Mike had some sort of philosophical opposition to indenting his C code.<p>I was impressed he managed to write such a complex piece of code without ever indenting anything.
评论 #32962222 未加载
评论 #32980126 未加载
评论 #32965759 未加载
dangover 2 years ago
Related:<p><i>Burrows-Wheeler Transform [video]</i> - <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=10721401" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=10721401</a> - Dec 2015 (5 comments)<p><i>Compression with the Burrows-Wheeler Transform</i> - <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1112845" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1112845</a> - Feb 2010 (6 comments)<p><a href="https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;query=%22burrows-wheeler%22&amp;sort=byDate&amp;type=comment" rel="nofollow">https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;que...</a>
skrebbelover 2 years ago
Can anyone explain to me why this is said to be O(n) when it heavily relies on sorting?
评论 #32961205 未加载
lofatdairyover 2 years ago
For biologists reading this, I recommend the following series from a mathematical biologist in Spain (maybe Toronto by now): <a href="http:&#x2F;&#x2F;blog.thegrandlocus.com&#x2F;tag&#x2F;burrows-wheeler-transform" rel="nofollow">http:&#x2F;&#x2F;blog.thegrandlocus.com&#x2F;tag&#x2F;burrows-wheeler-transform</a>
mcintover 2 years ago
Someone&#x27;s been reading about string matching algorithms, or compression. For biology?
评论 #32960901 未加载
djha-skinover 2 years ago
It&#x27;s fascinating to me that xz&#x27;s algorithm[1] beats bzip2 (Burrows-Wheeler) in both time and space, but it&#x27;s a much simpler algorithm.<p>1: <a href="https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm" rel="nofollow">https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Lempel%E2%80%93Ziv%E2%80%93M...</a>
评论 #32963472 未加载
评论 #32964452 未加载
评论 #32965542 未加载
ducaaleover 2 years ago
Fun fact, burrows-wheeler is one of the assignments in coursera&#x27;s algorthims course.
personjerryover 2 years ago
How is it that this doesn&#x27;t violate the pidgeonhole principle?
tehjokerover 2 years ago
If you were applying a first pass of compression to integer data and then applying gzip, would adding BWT in the middle still be beneficial?
beagle3over 2 years ago
It is an incredibly elegant scheme for bringing Markov context outputs together without actually trying to figure out what those contexts are.
billfruitover 2 years ago
Is there any analogous method for images?
评论 #32965101 未加载
frozencellover 2 years ago
So even the inventors quite misunderstand how they came up with this marvel. (we are lost ^^)
GordonSover 2 years ago
Has anyone here found uses for it, perhaps for domain-specific string compression?
评论 #32962239 未加载