What can’t the internet handle in 2022? Apostrophes

85 points by lxm, more than 2 years ago

18 comments

puyoxyz, more than 2 years ago
https://archive.ph/K4gV2
colejohnson66, more than 2 years ago
Once again, programmers forget alphabets other than the Latin one exist. And by Latin, I mean the one containing only A through Z, with no “fancies”.

Then there’s the issue of character encodings, which the article does a decent job explaining! Nitpick though: they claim it’s a UTF-8/ASCII thing, but it’s actually a UTF-8/Windows-1252 issue.

In related news, even GNU coreutils fails to support UTF-8 properly despite a claim of support for multibyte character sets: https://catgirl.ai/log/cut-c-harmful/
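
To illustrate the UTF-8/Windows-1252 mix-up mentioned in this comment, here is a minimal Python sketch. It assumes the troublesome character is the typographic apostrophe U+2019, and the sample word is made up for the example. It is not taken from the article.

```python
# Sketch of how a curly apostrophe turns into mojibake when UTF-8 bytes
# are read as Windows-1252. The sample word is an assumption for illustration.
text = "can\u2019t"  # U+2019 RIGHT SINGLE QUOTATION MARK used as an apostrophe

utf8_bytes = text.encode("utf-8")        # the apostrophe becomes bytes E2 80 99
mojibake = utf8_bytes.decode("cp1252")   # each byte is read as one cp1252 character

# Reversing the wrong decode recovers the original, as long as no byte was lost.
repaired = mojibake.encode("cp1252").decode("utf-8")

print(utf8_bytes)
print(mojibake)
print(repaired == text)
```

The three bytes E2 80 99 are shown as three separate Windows-1252 characters, which is the familiar garbage sequence seen in place of an apostrophe.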
Pinus, more than 2 years ago
It only takes a quick glance at sites like Stack Overflow to see that things like character encodings, the difference between the external representation of a string and its actual content, and the different quoting rules in different contexts (CSV versus XML versus JSON versus ...) are a complete mystery to many people trying to work as developers.

In print, code is often represented in a monospace font, materials intended for beginners sometimes indicate stuff that you are actually supposed to type with little keycap symbols, etc. Have there been any programming languages (with associated IDEs) where strings were represented by typography (or colour, or whatever) rather than by surrounding them with some sort of quotation marks?
tapoxi, more than 2 years ago
I have an Irish O' surname, and I've been to too many conferences where it's been changed to O%apos; on my nametag, and in one case a registration failed because there is a second Unicode apostrophe that some text editors use, which means my name won't match what's in their database. Not to mention many sites consider it an illegal character and drop the apostrophe altogether.

It's infuriating. I've seriously considered changing my name because of it.
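
The failed database match described here can be avoided by comparing names through a normalized key. The following is a minimal Python sketch. The function name `name_key` and the list of apostrophe look-alikes are assumptions for illustration, and the list is not exhaustive. Note that Unicode normalization alone does not fold U+2019 into the ASCII apostrophe, so the mapping is explicit.

```python
import unicodedata

# Characters commonly typed or auto-substituted for an apostrophe.
# This list is an assumption for the sketch, not a complete inventory.
APOSTROPHE_LIKE = "\u2019\u2018\u02bc\uff07"
_TO_ASCII_APOSTROPHE = str.maketrans({ch: "'" for ch in APOSTROPHE_LIKE})


def name_key(name: str) -> str:
    """Return a comparison key; the stored display name stays untouched."""
    composed = unicodedata.normalize("NFC", name.strip())
    return composed.translate(_TO_ASCII_APOSTROPHE).casefold()


# The typewriter and typographic spellings now compare equal.
print(name_key("O'Brien") == name_key("O\u2019Brien"))
```

The key is meant only for lookups. The name as the person entered it should still be what gets stored and printed.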
lloydatkinson, more than 2 years ago
The title doesn’t do this justice. As it goes on to say, it’s not just apostrophes; it’s practically everything outside the incredibly limited and non-inclusive range of 26 “ASCII” letters.

It’s bad enough that apparently no one can spell my name correctly, but when my partner and family, with letters such as ö or ł, can almost never type their correct names, it irritates me greatly.

A few months ago at work I had to actually kick off at product managers and QA that these letters are not “special characters” in the same way ? or & are. In the end I had to refuse to implement the regex in the form validation and just said that it was done and that was the end of it. So incredibly dumb that people’s names are classified as “special characters” by some arbitrary set of rules people subscribe to.
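
If a form must validate names at all, a check based on Unicode character properties treats ö and ł as the letters they are. The sketch below is one possible approach in Python, not what the commenter's team used. The function name `is_plausible_name` and the set of allowed punctuation are assumptions.

```python
import unicodedata

# Punctuation accepted inside names; this set is an assumption for the sketch.
ALLOWED_PUNCTUATION = set(" '-.\u2019")


def is_plausible_name(name: str) -> bool:
    """Accept any Unicode letters, combining marks, and a little punctuation."""
    name = name.strip()
    if not name:
        return False
    return all(
        ch.isalpha()                                  # letters in any script
        or unicodedata.category(ch).startswith("M")   # combining accents
        or ch in ALLOWED_PUNCTUATION
        for ch in name
    )


for candidate in ["Zoë", "Łukasz", "O'Brien", "Robert?&"]:
    print(candidate, is_plausible_name(candidate))
```

Only the last candidate is rejected, because ? and & are neither letters nor in the allowed punctuation set.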
leovander, more than 2 years ago
When copy-pasting code shared on Slack, why do the apostrophe characters get converted into a different character that looks like an apostrophe?
dublin, more than 2 years ago
OK, I'll whack the hornets' nest here: base-10 arithmetic should be the standard, with others being available only when wise programmers explicitly choose them because they have a *good* reason.

Every tangle I've ever had with IEEE 754 would have been unnecessary if our computers and languages used the same numbers that humans do! (And we have way more than enough compute power now to make base-10 arithmetic the default. I'll cut the folks in the '60s some slack due to resource limits, but why are we *still* dealing with this brokenness in the 21st century?)
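
The kind of IEEE 754 surprise this comment alludes to, and the opt-in base-10 alternative that already exists in some languages, can be sketched in Python with the standard `decimal` module. The specific values are chosen for illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
binary_sum = 0.1 + 0.2
print(binary_sum)          # 0.30000000000000004
print(binary_sum == 0.3)   # False

# Decimal arithmetic works in base 10, so these values are exact.
# Constructing from strings avoids inheriting the binary rounding error.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)                     # 0.3
print(decimal_sum == Decimal("0.3"))   # True
```

In Python the decimal type is something a programmer chooses explicitly, which is the opposite of the default the comment argues for.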
fanf2, more than 2 years ago
One of the nice little details of email address syntax is that ' is not a special character, so niall.o'reilly@example.ie is just fine.

One of the vexing things about MS Outlook and Exchange is that they insist on using a non-standard syntax in which ' is used as a quote, so they make an appalling mess of mail headers.
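
The point about the apostrophe can be checked against the `atext` character set that RFC 5322 allows in an unquoted local part. The Python sketch below covers only the dot-atom form and ignores quoted strings, comments, and length limits. The function name `is_dot_atom` is an assumption.

```python
import string

# atext from RFC 5322: letters, digits, and these printable specials.
# The apostrophe is among them.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_dot_atom(local_part: str) -> bool:
    """True if local_part is one or more non-empty atext runs joined by dots."""
    return all(part and set(part) <= ATEXT for part in local_part.split("."))


local_part, _, domain = "niall.o'reilly@example.ie".rpartition("@")
print(local_part, is_dot_atom(local_part))
```

A double quote, by contrast, is not in `atext`, which is why it needs the separate quoted-string syntax.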
bloak, more than 2 years ago
It annoys me that apostrophe and English closing single quotation mark are the same character in Unicode. I'm tempted to use ' for apostrophe so that the quotation marks (‘...’) match up properly.
nicbou, more than 2 years ago
Is there a good dataset that can be used to test such things? I'm thinking of a database of all kinds of names, addresses and emails that could be found in the wild.
gdprrrr, more than 2 years ago
One person once sued their bank to correct their name, under the GDPR right of correction. The bank initially refused, because they considered changing their legacy EBCDIC system unreasonable. The bank lost the case.

https://gdprhub.eu/index.php?title=Court_of_Appeal_of_Brussels_-_2019/AR/1006
drooopy, more than 2 years ago
* Cries in Portuguese *
nayuki, more than 2 years ago
If you don't trust developers to use prepared statements, banning input strings with apostrophes can be a blunt way to avoid SQL injection attacks.

Robert'); DROP TABLE Students;-- https://xkcd.com/327/
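
For comparison, here is a minimal sketch of the prepared-statement approach this comment refers to, using Python's standard `sqlite3` module and an in-memory database. The table layout is an assumption borrowed from the xkcd example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT)")

# The ? placeholder sends the value separately from the SQL text,
# so apostrophes in the data are never parsed as SQL.
hostile = "Robert'); DROP TABLE Students;--"
conn.execute("INSERT INTO Students (name) VALUES (?)", (hostile,))
conn.execute("INSERT INTO Students (name) VALUES (?)", ("O'Brien",))

rows = [row[0] for row in conn.execute("SELECT name FROM Students ORDER BY rowid")]
print(rows)
```

Both names are stored verbatim and the table survives, so there is no need to reject apostrophes at the input stage.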
NautilusWave, more than 2 years ago
The US Federal government (and I imagine most/all states) only being able to handle names comprised of unaccented ASCII characters certainly doesn't help anything.
NoGravitas, more than 2 years ago
Further evidence that O'Brien must suffer.
Foobar8568, more than 2 years ago
A solution in Gartner's top leaders quadrant cannot handle a dot (.), a minus (-), and the like. Fun stuff too.
vivegi, more than 2 years ago
Tyranny of the ASCII and EBCDIC.
dublin, more than 2 years ago
With very few exceptions (Gab, et al.), the Internet cannot handle Free Speech in 2022. Especially speech that contravenes the official government narratives: that gets brutally shut down. This will not end well.