TechEcho

High speed Unicode routines using SIMD

126 points by Peter5 over 2 years ago

5 comments

jansan over 2 years ago
I do not know about the real world implications of this, but just reading of a 20x performance increase for standard cases makes me excited.
vanderZwan over 2 years ago
The readme has a link to a technical paper on arXiv that was uploaded last year [0]. Has that perhaps been discussed before?

(Not meant as a complaint that this might have been submitted before already, I'm just curious about what might have already been said about it.)

[0] https://arxiv.org/abs/2109.10433
erk__ over 2 years ago
IBM's z/Architecture mainframes have native support for some of these functions; it would be interesting to see how the speed compares.
dragontamer over 2 years ago
I dunno much about Unicode, but I imagine it is a regular language? (Aka: acceptance / rejection can be determined by regular expressions / finite state machines.)

If so, regular languages are one of those 'surprising things that can be parallelized'. It doesn't seem possible at first thought though.
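
To make the parallelization idea concrete, here is a minimal sketch. It assumes UTF-8 validation is modeled as a 9-state finite state machine built from the well-formed byte ranges in the Unicode Standard. Each chunk is summarized as a mapping from every possible entry state to its exit state, which needs no knowledge of earlier chunks; the mappings are then composed in order. The names `step`, `chunk_map` and `is_valid_utf8` are hypothetical, and this is not claimed to be the technique the linked library uses. Python threads are here only to show that the chunks are independent, not to gain speed.

```python
from concurrent.futures import ThreadPoolExecutor

ACCEPT, REJECT, NUM_STATES = 0, 8, 9

def step(state: int, b: int) -> int:
    """One DFA transition. States 1-3: that many plain continuation bytes
    still needed; 4-7: restricted second byte after E0, ED, F0, F4."""
    if state == ACCEPT:
        if b <= 0x7F: return ACCEPT
        if 0xC2 <= b <= 0xDF: return 1
        if b == 0xE0: return 4
        if b == 0xED: return 5
        if 0xE1 <= b <= 0xEF: return 2   # E0 and ED already handled above
        if b == 0xF0: return 6
        if 0xF1 <= b <= 0xF3: return 3
        if b == 0xF4: return 7
        return REJECT
    if state == REJECT:
        return REJECT
    # (low, high, next_state) for the byte each non-start state accepts
    lo, hi, nxt = {
        1: (0x80, 0xBF, ACCEPT), 2: (0x80, 0xBF, 1), 3: (0x80, 0xBF, 2),
        4: (0xA0, 0xBF, 1),      5: (0x80, 0x9F, 1),
        6: (0x90, 0xBF, 2),      7: (0x80, 0x8F, 2),
    }[state]
    return nxt if lo <= b <= hi else REJECT

def chunk_map(chunk: bytes) -> tuple:
    """Run the chunk from every start state at once: result[s] is the
    state reached when the chunk is entered in state s."""
    states = list(range(NUM_STATES))
    for b in chunk:
        states = [step(s, b) for s in states]
    return tuple(states)

def is_valid_utf8(data: bytes, chunk_size: int = 4) -> bool:
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        maps = list(pool.map(chunk_map, chunks))  # independent per chunk
    state = ACCEPT
    for m in maps:            # cheap sequential composition of the maps
        state = m[state]
    return state == ACCEPT
```

The cost is that every chunk does work for all 9 start states instead of one, so the approach only pays off when the per-chunk work really runs in parallel. Chunks may split a multi-byte character; the composition step handles that because the entry state carries the pending sequence across the boundary.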
Genbox over 2 years ago
I'm at the point where I can no longer see any reason to use UTF-16. UTF-8 is used everywhere today, and constantly converting between the two is not only inefficient but also introduces a risk of bugs/corruption.
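
One concrete form of the corruption risk, as a small sketch using only Python's built-in codecs: UTF-16 data containing an unpaired surrogate has no valid UTF-8 representation, so a converter must either fail or substitute a replacement character. The helper names `utf16le_to_utf8` and `utf8_to_utf16le` are hypothetical.

```python
def utf16le_to_utf8(raw: bytes, errors: str = "strict") -> bytes:
    """Transcode UTF-16LE bytes to UTF-8 via Python's built-in codecs."""
    return raw.decode("utf-16-le", errors=errors).encode("utf-8")

def utf8_to_utf16le(raw: bytes, errors: str = "strict") -> bytes:
    """Transcode UTF-8 bytes to UTF-16LE via Python's built-in codecs."""
    return raw.decode("utf-8", errors=errors).encode("utf-16-le")

# Well-formed input survives a round trip unchanged.
well_formed = "h\u00e9llo \U0001F600".encode("utf-16-le")
assert utf8_to_utf16le(utf16le_to_utf8(well_formed)) == well_formed

# A high surrogate (0xD83D) followed by 'A' instead of a low surrogate.
ill_formed = b"\x3d\xd8\x41\x00"

try:
    utf16le_to_utf8(ill_formed)              # strict mode refuses the input
except UnicodeDecodeError as exc:
    print("strict conversion failed:", exc.reason)

# Lossy mode succeeds, but the original bytes cannot be recovered.
lossy = utf8_to_utf16le(utf16le_to_utf8(ill_formed, errors="replace"))
print(lossy != ill_formed)
```

Which of the two behaviors is right depends on the application; the point is only that the conversion boundary forces a choice that a single-encoding pipeline never has to make.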