Lightweight Indexing for Small Strings

36 points by silentbicycle over 11 years ago

2 comments

jibsen over 11 years ago
One trick you could try: in find_longest_match, if you already have a match, check whether the byte at offset match_maxlen matches before doing the linear compare of all the bytes up to it.

If that one byte does not match, the candidate has no chance of being longer than the current best (in this simple case).
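
A minimal sketch of that check, written in Python for brevity rather than taken from the article's code. The function signature, the plain backward scan over the window, and the name best_len (standing in for match_maxlen) are assumptions made for illustration only.

```python
from __future__ import annotations


def find_longest_match(buf: bytes, pos: int, window_start: int,
                       max_len: int) -> tuple[int, int]:
    """Return (match_pos, match_len) for the longest match of buf[pos:]
    that starts somewhere in buf[window_start:pos].

    Hypothetical signature: a plain backward scan, not the indexed
    search from the article. Returns (-1, 0) when nothing matches.
    """
    best_pos, best_len = -1, 0
    # Never compare past the end of the input.
    limit = min(max_len, len(buf) - pos)
    for cand in range(pos - 1, window_start - 1, -1):
        # Early reject: a candidate can only beat the current best if it
        # also agrees at offset best_len, so test that single byte first.
        if best_len > 0 and buf[cand + best_len] != buf[pos + best_len]:
            continue
        # Full linear compare, only for candidates that survive the check.
        n = 0
        while n < limit and buf[cand + n] == buf[pos + n]:
            n += 1
        if n > best_len:
            best_pos, best_len = cand, n
            if n == limit:
                break  # cannot do better than the maximum length
    return best_pos, best_len
```

With the input b"abcabcabcd" and pos=6, the candidate at offset 3 gives a match of length 3, and the remaining candidates are each rejected after a single byte comparison.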
ccleve over 11 years ago
A nice trick. It could be used for generalized string search as well as compression. And if you indexed bigrams instead of single characters, it could be even faster.

I especially like the clear, easy-to-understand, well-written presentation along with links to prior art. Wouldn't it be nice if most academic papers were written like this?
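
A rough sketch of what a bigram index could look like, in Python. It uses a dictionary of position lists, which is an assumption chosen for clarity and not the compact structure the article describes; build_bigram_index and candidates are hypothetical names.

```python
from collections import defaultdict


def build_bigram_index(buf: bytes) -> dict:
    """Map each two-byte sequence in buf to the ascending list of
    offsets where it starts."""
    index = defaultdict(list)
    for i in range(len(buf) - 1):
        index[buf[i:i + 2]].append(i)
    return index


def candidates(index: dict, buf: bytes, pos: int) -> list:
    """Earlier offsets whose first two bytes equal buf[pos:pos + 2].

    Only these offsets can start a match of length >= 2, so a match
    finder keyed on bigrams visits fewer candidates than one keyed on
    a single byte.
    """
    return [i for i in index.get(buf[pos:pos + 2], []) if i < pos]
```

For b"abacabad" at pos=6, a single-byte index on "a" would offer offsets 0, 2 and 4, while the bigram "ad" has no earlier occurrence, so there is nothing to compare. The trade-off is that matches of length 1 are no longer found and the index has more distinct keys.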