TechEcho
Filenames with Accents (2011)

21 points by FrankSansC about 3 years ago

3 comments

Anthony-G about 3 years ago
In 2009, David A. Wheeler wrote a comprehensive article covering problems with Unix/Linux/POSIX filenames¹. Given that the OS naïvely treats filenames as a simple stream of bytes, he advocated that developers use UTF-8 for encoding filenames. He mentioned the issue of multiple normalisation systems being used to encode characters that have more than one Unicode representation but glossed over it because such problems are “overshadowed by the terrible awful even worse problems caused by filenames all being in random unguessable charsets”.

I’m guessing that, by now, most developers on Unix-like systems would be using UTF-8 for filenames, though a decade after these articles were published, there still doesn’t seem to be any good/universal solution to the problem of characters with multiple Unicode representations.

¹ https://dwheeler.com/essays/fixing-unix-linux-filenames.html
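To make the "more than one Unicode representation" point concrete, here is a minimal Python sketch. The sample name "café" is my own choice for illustration, not taken from the article. It shows two spellings that render identically but encode to different UTF-8 bytes, which a byte-oriented filesystem would treat as two distinct filenames.

```python
import unicodedata

# Two spellings of the same visible text (sample name chosen for illustration).
composed = "caf\u00e9"      # 'é' as a single code point, U+00E9
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# Encoded as UTF-8, the two spellings are different byte sequences.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

print(composed_bytes)
print(decomposed_bytes)

# A plain comparison sees two different strings...
print(composed == decomposed)

# ...but they compare equal once both are brought to the same normalization form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

The comparison only succeeds after normalization, so any code that compares or looks up filenames has to agree on a form first.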
juancn about 3 years ago
You should normalize names on write; fixing this on read is very hard. A perfectly valid, denormalized string can contain codepoints in different normalization forms.

So if you have four possible normalizations (NFD, NFC, NFKD, NFKC) and your string has N ambiguous codepoints, the number of possible strings you need to try is up to 4^N, since each codepoint can independently be in any of the four forms.
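A rough sketch of what "normalize on write" could look like in Python. The helper names `normalize_filename` and `write_normalized` are hypothetical, and NFC is an assumed default, not something the comment prescribes. The idea is that every name passes through one normalization form before it reaches the filesystem, so readers never have to guess among variants.

```python
import unicodedata
from pathlib import Path


def normalize_filename(name: str, form: str = "NFC") -> str:
    """Return `name` in a single normalization form (NFC is an assumed default)."""
    return unicodedata.normalize(form, name)


def write_normalized(directory: Path, name: str, data: bytes) -> Path:
    """Hypothetical helper: normalize the name, then write the file."""
    target = directory / normalize_filename(name)
    target.write_bytes(data)
    return target


# Both spellings of the same visible name collapse to one string before writing.
a = normalize_filename("caf\u00e9.txt")   # precomposed 'é'
b = normalize_filename("cafe\u0301.txt")  # 'e' + combining acute accent
print(a == b)
```

NFC and NFD are the canonical forms. The compatibility forms (NFKC, NFKD) also fold characters such as ligatures into plain letters, so they change more than just the encoding. Some filesystems apply their own normalization on top of this, so treat the sketch as application-level hygiene, not a guarantee about what ends up on disk.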
baal80spam about 3 years ago
Side note: after all these years I still don't feel comfortable using special characters (like ą, ż, ź) and spaces in filenames on Windows. DOS times sit deep in my soul and it just doesn't feel right.