TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Burrows–Wheeler Transform

162 点作者 bshanks超过 2 年前

16 条评论

dkbrk超过 2 年前
Honestly, the inverse Burrows-Wheeler transform seems like some sort of Voodoo black magic to me.<p>It reminds be of the 100 prisoners problem [0]. Yes, I understand why it works. I can see how it works mathematically. But it still feels like it shouldn&#x27;t work, that we&#x27;re somehow getting something for free.<p>[0]: <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;100_prisoners_problem" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;100_prisoners_problem</a>
评论 #32963676 未加载
评论 #32965199 未加载
评论 #32961743 未加载
评论 #32961165 未加载
评论 #32969371 未加载
评论 #32966079 未加载
评论 #32969322 未加载
tdido超过 2 年前
This is used in both bwa [1] and bowtie [2], two of the most popular DNA sequence aligners.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;lh3&#x2F;bwa" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;lh3&#x2F;bwa</a><p>[2] <a href="https:&#x2F;&#x2F;github.com&#x2F;BenLangmead&#x2F;bowtie" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;BenLangmead&#x2F;bowtie</a>
评论 #32962498 未加载
评论 #32961381 未加载
lancefisher超过 2 年前
This is a fun video from a series on compression that explains it well, and features Mike Burrows:<p><a href="https:&#x2F;&#x2F;youtu.be&#x2F;4WRANhDiSHM" rel="nofollow">https:&#x2F;&#x2F;youtu.be&#x2F;4WRANhDiSHM</a><p>He shares the origin of the algorithm as well as a story about how it was first published.<p>The Compressor Head video series is the best introduction to compression that I’ve found.
评论 #32961707 未加载
ur-whale超过 2 年前
I still remember when I first read the first C implementation of this, which was IIRC by Mike Burrows.<p>Apparently, back in the days, Mike had some sort of philosophical opposition to indenting his C code.<p>I was impressed he managed to write such a complex piece of code without ever indenting anything.
评论 #32962222 未加载
评论 #32980126 未加载
评论 #32965759 未加载
dang超过 2 年前
Related:<p><i>Burrows-Wheeler Transform [video]</i> - <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=10721401" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=10721401</a> - Dec 2015 (5 comments)<p><i>Compression with the Burrows-Wheeler Transform</i> - <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1112845" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1112845</a> - Feb 2010 (6 comments)<p><a href="https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;query=%22burrows-wheeler%22&amp;sort=byDate&amp;type=comment" rel="nofollow">https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;que...</a>
skrebbel超过 2 年前
Can anyone explain to me why this is said to be O(n) when it heavily relies on sorting?
评论 #32961205 未加载
lofatdairy超过 2 年前
For biologists reading this, I recommend the following series from a mathematical biologist in Spain (maybe Toronto by now): <a href="http:&#x2F;&#x2F;blog.thegrandlocus.com&#x2F;tag&#x2F;burrows-wheeler-transform" rel="nofollow">http:&#x2F;&#x2F;blog.thegrandlocus.com&#x2F;tag&#x2F;burrows-wheeler-transform</a>
mcint超过 2 年前
Someone&#x27;s been reading about string matching algorithms, or compression. For biology?
评论 #32960901 未加载
djha-skin超过 2 年前
It&#x27;s fascinating to me that xz&#x27;s algorithm[1] beats bzip2 (Burrows-Wheeler) in both time and space, but it&#x27;s a much simpler algorithm.<p>1: <a href="https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm" rel="nofollow">https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Lempel%E2%80%93Ziv%E2%80%93M...</a>
评论 #32963472 未加载
评论 #32964452 未加载
评论 #32965542 未加载
ducaale超过 2 年前
Fun fact, burrows-wheeler is one of the assignments in coursera&#x27;s algorthims course.
personjerry超过 2 年前
How is it that this doesn&#x27;t violate the pidgeonhole principle?
tehjoker超过 2 年前
If you were applying a first pass of compression to integer data and then applying gzip, would adding BWT in the middle still be beneficial?
beagle3超过 2 年前
It is an incredibly elegant scheme for bringing Markov context outputs together without actually trying to figure out what those contexts are.
billfruit超过 2 年前
Is there any analogous method for images?
评论 #32965101 未加载
frozencell超过 2 年前
So even the inventors quite misunderstand how they came up with this marvel. (we are lost ^^)
GordonS超过 2 年前
Has anyone here found uses for it, perhaps for domain-specific string compression?
评论 #32962239 未加载