36 points by silentbicycle over 11 years ago

2 comments

jibsen over 11 years ago

One trick you could try: in find_longest_match, if you already have a match, check whether the byte at offset match_maxlen matches before doing the linear compare of all the bytes up to it.

If that one byte does not match, the candidate has no chance of being longer than the current best (in this simple case).
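A minimal sketch of that early-reject check, assuming a plain brute-force matcher rather than the article's actual code. The signature, parameter names, and the `(size_t)-1` "no match" return value are hypothetical; `best_len` plays the role of match_maxlen, and the caller is assumed to guarantee that `pos + maxlen` stays inside the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical brute-force matcher: find the longest match for the bytes
 * at buf[pos .. pos+maxlen) among candidates starting in [start, pos).
 * Returns the candidate position, or (size_t)-1 if nothing matched.
 * The match length is written to *out_len. */
static size_t find_longest_match(const uint8_t *buf, size_t start,
                                 size_t pos, size_t maxlen,
                                 size_t *out_len)
{
    size_t best_pos = (size_t)-1;
    size_t best_len = 0;

    if (maxlen == 0) {
        *out_len = 0;
        return best_pos;
    }

    for (size_t cand = start; cand < pos; cand++) {
        /* Early reject: to beat best_len, the candidate must agree with
         * the target at offset best_len. If it does not, skip the full
         * linear compare. best_len < maxlen always holds here. */
        if (buf[cand + best_len] != buf[pos + best_len]) {
            continue;
        }

        /* Full linear compare from the start of the candidate. */
        size_t len = 0;
        while (len < maxlen && buf[cand + len] == buf[pos + len]) {
            len++;
        }

        if (len > best_len) {
            best_len = len;
            best_pos = cand;
            if (len == maxlen) {
                break; /* cannot do better than maxlen */
            }
        }
    }

    *out_len = best_len;
    return best_pos;
}
```

With no match yet, `best_len` is 0 and the check reduces to comparing the first byte, so it costs nothing extra in that case.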


ccleve over 11 years ago

A nice trick. It could be used for generalized string search as well as compression. And if you indexed bigrams instead of single characters, it could be even faster.

I especially like the clear, easy-to-understand, well-written presentation along with links to prior art. Wouldn't it be nice if most academic papers were written like this?
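One way the bigram idea might look, as a rough sketch only: this is not the article's index. The function name, the `NO_PREV` sentinel, and the 65536-entry scratch table are all assumptions made for illustration. Each position is linked to the nearest earlier position that starts with the same two bytes, so a search only visits candidates that already agree on two bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_PREV ((size_t)-1)

/* Hypothetical bigram index: prev[i] is the nearest j < i such that
 * buf[j] == buf[i] and buf[j+1] == buf[i+1], or NO_PREV if none exists.
 * prev must have room for len entries.
 * Returns 0 on success, -1 if the scratch table cannot be allocated. */
static int build_bigram_index(const uint8_t *buf, size_t len, size_t *prev)
{
    /* last[key] = most recent position where bigram `key` started. */
    size_t *last = malloc(65536 * sizeof *last);
    if (last == NULL) {
        return -1;
    }
    for (size_t k = 0; k < 65536; k++) {
        last[k] = NO_PREV;
    }
    for (size_t i = 0; i < len; i++) {
        prev[i] = NO_PREV;
    }

    /* The final byte has no following byte, so it starts no bigram. */
    for (size_t i = 0; i + 1 < len; i++) {
        size_t key = ((size_t)buf[i] << 8) | buf[i + 1];
        prev[i] = last[key];
        last[key] = i;
    }

    free(last);
    return 0;
}
```

The scratch table is the trade-off: 65536 entries is far more working memory than a 256-entry table for single bytes, which may matter on small targets.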


Lightweight Indexing for Small Strings
