Rot8000 – Rot13 for the Unicode generation

47 points by rottytooth over 11 years ago

7 comments

lelf over 11 years ago
It's broken.

Λ̊1 → ⊻∪ά → Λ̊⋌

𝄞 → 뤔뷾 → 駴點

Edit: anyway, even with a correct (a+b)%n it's a plain bad idea.

Unicode is not the English alphabet. Everything outside the Basic Multilingual Plane is broken automatically. And even in the BMP there's going to be a bag of glitches, from hanging combining characters to "oops, someone normalised our string and it's now different" (different for the site, not for the user / Unicode).
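The normalisation hazard mentioned above can be checked with Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `nfc_changes` is made up for illustration, and nothing here reflects how rot8000 itself is implemented.

```python
import unicodedata


def nfc_changes(text: str) -> bool:
    """Return True if NFC normalisation would alter the string."""
    return unicodedata.normalize("NFC", text) != text


# U+212B ANGSTROM SIGN is canonically equivalent to U+00C5, so any
# pipeline that normalises to NFC silently replaces one with the other.
angstrom_sign = "\u212B"
print(nfc_changes(angstrom_sign))

# A base letter plus a combining accent collapses into one code point,
# which changes the code points a per-character rotation would see.
decomposed = "e\u0301"
print(len(decomposed), len(unicodedata.normalize("NFC", decomposed)))
```

If a rotated string contains any character for which `nfc_changes` is true, rotating the normalised copy back will not reproduce the original input.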
mischanix over 11 years ago
Not reciprocal for CJK input, e.g. "한글" takes 5 iterations to reach stability. I believe this has to do with the UTF-16 encoding of codepoints > 0x10000
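To inspect how a given string is laid out in UTF-16, a small sketch like the following can help. The helper names `utf16_units` and `is_surrogate` are hypothetical, and the snippet only shows the encoding; it does not establish what actually goes wrong inside rot8000.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_surrogate(unit: int) -> bool:
    """True if the code unit lies in the surrogate range 0xD800-0xDFFF."""
    return 0xD800 <= unit <= 0xDFFF


for sample in ("한글", "𝄞"):
    units = utf16_units(sample)
    print(sample, [hex(u) for u in units], [is_surrogate(u) for u in units])
```

Code points at or above 0x10000 occupy two code units, both in the surrogate range, so arithmetic on individual code units cannot treat them as single characters. A rotated unit that happens to land in the surrogate range is also not a valid character on its own.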
aculver over 11 years ago
Inputting "こんにちは。元気ですか?" caused an application error:

    [ArgumentException: Error serializing value 'ᄳᅳᅋᅁᅏტ㈣䳷ᅇᄹᄫ�' of type 'System.String.']

After realizing it was "?" that was breaking everything, I ended up with this round trip:

"こんにちは。元気ですか。" → "ᄳᅳᅋᅁᅏტ㈣䳷ᅇᄹᄫტ" → "こんにちは。ጃ⷗ですか。"

It's broken. I suspect Unicode requires more careful manipulation than OP anticipated. :-)
peterwaller over 11 years ago
Copy-pasting the contents of rot8000.com/info in and hitting cypher twice ends up scrambling the contents quite a bit:

    It also bypasses 32 control characters, technically making it rot7968, sometimes with an additional offset.

->

    It also bypasses ⋍2 control characters, technically making it rot⋏⋬68, sometimes with an additional offset.
rottytooth over 11 years ago
I put in a fix for CJK and the result is: nearly everything that's not CJK now rotates into it and back out; CJK is a *huge* section of the Basic Multilingual Plane. The fix invalidates rotations done with rot8000 before the fix, unfortunately.
njharman over 11 years ago
I just realized that 13 was probably chosen for rot13 because that's half the number of letters in the English alphabet.

I miss "obvious" stuff like that all the time.
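The half-alphabet observation is what makes rot13 its own inverse: shifting by n/2 twice is a shift by n, which is the identity modulo n. A minimal sketch follows, assuming an alphabet of even length; `rotate_half` is a made-up helper, not code from rot8000.

```python
import codecs
import string


def rotate_half(text: str, alphabet: str = string.ascii_lowercase) -> str:
    """Shift each letter by half the alphabet length, preserving case.

    Assumes len(alphabet) is even, so applying the function twice
    returns the original text. Characters outside the alphabet pass
    through unchanged.
    """
    n = len(alphabet)
    shift = n // 2
    index = {ch: i for i, ch in enumerate(alphabet)}
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in index:
            rotated = alphabet[(index[lower] + shift) % n]
            out.append(rotated.upper() if ch.isupper() else rotated)
        else:
            out.append(ch)
    return "".join(out)


message = "Hello, World!"
once = rotate_half(message)
print(once)
print(rotate_half(once) == message)
# With the 26-letter default this matches Python's built-in rot_13 codec.
print(once == codecs.encode(message, "rot_13"))
```

The same property fails as soon as the shift is not exactly half of the set being rotated, or when some elements are skipped in one direction but not the other.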
jloughry over 11 years ago
Why not call it Rot8192 or Rot0x7777?