This is a really good post that shines some light on how the insanity of encodings still isn't fixed today, since so many operating systems still don't use Unicode consistently everywhere.

Some of the reasoning behind why the characters are displayed the way they are is slightly incorrect, though, so here are some corrections.

I'm going to supply each example with some python3 code to reproduce it, using the following definition:

`data = b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"`

First, let's start at the beginning:

> My router just cut the name down to 32 octets though to stay complient
> This was what was being sent according to iw
> `a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd`

If you look at this closely, the last byte in the sequence is `\xcd`, the start of an incomplete UTF-8 sequence. It's missing the final `\x84` that the router cut off (along with the three additional `a` characters).
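You can verify this in python3; strict decoding chokes on exactly that dangling byte:

  # the 31 bytes before it decode fine, but the lone lead byte
  # at position 31 has no continuation byte following it
  data.decode('utf-8')
  # UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd
  # in position 31: unexpected end of data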
> with the raw hex being
> `97ccb6cc81cc93ccbfcc88cc9bcc9bcd90cd98cd86cc90cd9dcc87cc92cc91cd`

Small mistake: the hex of `a` is `61`, not `97` (97 is the same number in decimal), but otherwise correct.
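Quick sanity check in python3 (the `data` above already uses the correct byte):

  hex(ord('a'))   # '0x61' -- 97 is just its decimal representation
  data.hex()[:2]  # '61', the corrected first byte of the hex dump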
> Galaxy S8 running Android 9 with Kernel 4.4.153
> Amazon Firestick

Everything correct, except for a small detail: these two devices render the result of UTF-8 decoding while ignoring bytes that are not valid UTF-8 (in python3: `data.decode('utf-8', 'ignore')`).
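A minimal sketch of what that does to these bytes (the counts are easy to verify against the `data` above):

  zalgo = data.decode('utf-8', 'ignore')
  len(data)   # 32 bytes in
  len(zalgo)  # 16 characters out: 'a' plus 15 combining marks
  # only the incomplete trailing \xcd gets dropped; all the
  # complete two-byte sequences survive, hence the zalgo effect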
> iPhone 6 running iOS 13.5.1
> Apple TV Second Generation

Completely correct. This is definitely Mac OS Roman (in python3: `data.decode('mac_roman')`).

> Windows 10 Pro 10.0.19041

This one is incorrect again: Windows is interpreting the bytes in the "Windows Codepage 1252" (also known as "Western") encoding and ignoring invalid characters (in python3: `data.decode('cp1252', 'ignore')`).

Decoding each byte separately as UTF-8 would fail (any byte that can be a continuation byte of a UTF-8 sequence is not a valid start byte).

Interpreting each byte as a Unicode code point would give something very similar, but not exactly the same: the bytes that Windows decodes as a quote, caret-y thing, angle-bracket-y thing, tilde, dagger, double dagger, and single quotes fall into a control character block at the start of the Unicode "Latin-1 Supplement" block (`\x80` to `\x9f`).

> Chromebook running ChromeOS 83.0.4103.97

Correct.

The Chromebook seems to have rendered the ASCII `a` but replaced all 31 other bytes with question marks.
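I can't verify what ChromeOS does internally, but one way to reproduce that exact output in python3 (the mechanism here is my assumption, not something the post confirms) is per-byte ASCII decoding with replacement:

  # assumption: every non-ASCII byte becomes one placeholder character
  data.decode('ascii', 'replace')
  # 'a' followed by 31 U+FFFD replacement characters,
  # which the UI may then draw as question marks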
> Kindle Paperwhite running Firmware 5.10.2
> Vizio M55-C2 TV

Also correct.

Those two devices seem to opt to display hex instead of falling back to question marks as the Chromebook does.

I hope this comment gave some useful insight into why these devices decoded it this way :)