This article should be retired because it's harmful.<p>This and other "absolute minimum" articles like it seem to stop before teaching the absolute minimum, probably because the authors don't know it themselves. They teach the encodings and stop there. That's harmful.<p>Consider two incorrect statements Joel makes:<p>> In Unicode, a letter maps to something called a code point<p>and<p>> Every platonic letter in every alphabet is assigned a magic number by the Unicode consortium<p>These statements are incorrect, and Joel does not (or did not) know enough about Unicode to know that he is wrong.<p>Above code points sit "grapheme clusters", "extended grapheme clusters", or "user-perceived characters" ("a basic unit of a writing system for a language"). These, not code points, are what match the "platonic letter" Joel talks about.
A wchar_t can't represent them.<p>An extended grapheme cluster can contain an arbitrary number of code points. If you want to do it right, you need to use unicode-segmentation to cut a Unicode string into smaller strings that each represent one "platonic character".
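<p>To make the gap between code points and "platonic characters" concrete, here is a rough sketch in Python. The helper name approx_clusters is made up for illustration, and it only glues combining marks onto the preceding base character. That is a deliberate simplification: it is not the full extended grapheme cluster algorithm from UAX #29 and not what unicode-segmentation implements (emoji ZWJ sequences, Hangul jamo, regional indicators, and more are ignored).

```python
import unicodedata


def approx_clusters(text: str) -> list[str]:
    """Roughly group code points into user-perceived characters.

    Assumption/simplification: a code point whose general category
    starts with "M" (a combining mark) is attached to the cluster
    before it. Real segmentation (UAX #29) has many more rules.
    """
    clusters: list[str] = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if clusters and is_mark:
            # Combining mark: extend the current cluster.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster.
            clusters.append(ch)
    return clusters


# "g" followed by COMBINING DIAERESIS, then "o":
# three code points, but a reader sees two characters.
word = "g\u0308o"
print(len(word))                   # number of code points
print(len(approx_clusters(word)))  # number of approximate clusters
```

Even in this toy version, a single cluster can hold any number of code points, since marks can keep stacking on one base. That is why no fixed-width type can hold one "letter".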