A rant about cross-platform programming with wchar_t

6 points by BudVVeezer, about 15 years ago

2 comments

prodigal_erik, about 15 years ago
ANSI C specified wchar_t in 1989, two years before Unicode 1.0. They couldn't even be sure Unicode was going to win.

Besides, the *whole point* of wchar_t is to not be variable width. UTF-16 in wchar_t is an abomination that dates back to the industry building APIs that take UCS-2 (which the author really ought to cover) before they realized UCS-2 was too narrow to do its job. So now we have a lot of code that appears to support Unicode but may not handle it correctly, depending on whether QA knew they should try surrogate pairs. Almost nobody realizes UTF-16 needs to be searched and spliced as carefully as UTF-8. Each is just a compression scheme for the million or so actual codepoints, and there aren't many reasons to favor one over the other (in memory, at least).

What's the actual problem here? That the team made assumptions that ANSI warned against making? That Apple failed to accept UCS-4 for their API?
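
To make the surrogate-pair point concrete, here is a minimal sketch of what "searching and splicing carefully" can look like for UTF-16. The helper names (`count_code_points`, `safe_split_index`) are hypothetical and not from the article or any library; the sketch uses `std::u16string` rather than wchar_t so the unit width is fixed at 16 bits on every platform.

```cpp
#include <cstddef>
#include <string>

// High (leading) surrogates occupy 0xD800..0xDBFF.
constexpr bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
// Low (trailing) surrogates occupy 0xDC00..0xDFFF.
constexpr bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Count code points rather than 16-bit code units.
// An unpaired surrogate is counted as one (ill-formed) unit.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // A well-formed pair counts once, so skip its low half.
        if (is_high_surrogate(s[i]) && i + 1 < s.size() && is_low_surrogate(s[i + 1])) {
            ++i;
        }
        ++n;
    }
    return n;
}

// If `pos` would fall between the two halves of a surrogate pair,
// move it back one unit so a split never separates them.
std::size_t safe_split_index(const std::u16string& s, std::size_t pos) {
    if (pos > 0 && pos < s.size() &&
        is_high_surrogate(s[pos - 1]) && is_low_surrogate(s[pos])) {
        return pos - 1;
    }
    return pos;
}
```

A string holding one character outside the Basic Multilingual Plane between two ASCII letters has four code units but three code points, and a naive split at index 2 would cut the pair in half. That is the kind of case QA would need to try.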
jheriko, about 15 years ago
This is silly... Not only is the article inaccurate, but the problem described is trivial to solve. As long as you know what encoding wchar_t uses and what encoding your data is stored in, this is not a big problem: use one format internally and convert your data on the way in as appropriate. Trust me, I solved it with no prior knowledge and no formal education in less than a day, as a distraction during my day job... I did not have to rewrite the entire library from scratch.
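
The "one format internally, convert on the way in" approach might look roughly like the sketch below. The comment does not say which encodings were involved, so this assumes UTF-8 input and a UTF-32 internal representation (`std::u32string`); the function name `utf8_to_utf32` is hypothetical. It is deliberately incomplete: it does not reject overlong encodings or encoded surrogates, which a production decoder should.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into UTF-32, one fixed-width unit per code point.
// Truncated or malformed sequences are replaced by U+FFFD.
// Assumption: input is UTF-8; overlong forms and surrogates are NOT rejected.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(0xFFFD);
            ++i;
            continue;
        }
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Every continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : char32_t(0xFFFD));
    }
    return out;
}
```

With a boundary function like this, the rest of the program can index and splice the internal string by code point, and only the input/output edges need to know about the external encoding.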