There is a <i>fundamental</i> difference between Unicode and UTF-8/16/32. Unicode says nothing at all about how strings should be stored in memory; it is just a set of characters and the code points assigned to them. UTF-8/16/32, on the other hand, are encodings that map those code points to bytes. See <a href="http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings" rel="nofollow">http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings</a> for others.<p>Statements like "the first 256 code points in Unicode map to Latin-1" are true only at the code-point level: U+0000 through U+00FF are the same characters as in Latin-1. They say nothing about bytes. In UTF-8, only the first 128 code points (ASCII) are stored as the same single byte; U+0080 through U+00FF take two bytes each. Encodings that use 2 or 4 bytes per character do not match at all. There 'abcd' is not 4 bytes, but 8 or 16.
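<p>A minimal Python sketch of the byte counts, assuming Python 3 and its built-in codecs. The variable names are mine, and the "-le" codec variants are picked only so that no byte-order mark is counted. It also encodes U+00E9 in Latin-1 and in UTF-8 for comparison.

```python
text = "abcd"

# Code points are plain integers; they imply no byte layout.
code_points = [ord(ch) for ch in text]

# Byte length of the same four code points under each encoding.
# The "-le" variants are used so no byte-order mark is prepended.
sizes = {
    enc: len(text.encode(enc))
    for enc in ("utf-8", "utf-16-le", "utf-32-le")
}

# U+00E9 (e with acute accent) lies in the 128..255 range.
e_acute = "\u00e9"
latin1_bytes = e_acute.encode("latin-1")  # one byte
utf8_bytes = e_acute.encode("utf-8")      # two bytes

print(code_points)
print(sizes)
print(latin1_bytes, utf8_bytes)
```

The code points stay the same in every case; only the bytes produced by each encoding differ.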