I agree. Validating UTF-8 wastes processing time and does not work well with non-Unicode text; and, as the article says, often you should not actually care what character encoding (if any) the text uses. Furthermore, it is often more useful to measure length or split by bytes rather than by Unicode code points (a short sketch of this is at the end of this comment).

Unicode string types are just a bad idea, I think. Byte strings are better; you can still add functions to deal with Unicode or other character codes if necessary (and/or add explicit tagging of the character encoding, if that is helpful).

Many programming languages, though, make it difficult to work with byte strings, non-Unicode strings, etc. In my experience this often causes problems unless you are careful.

Unicode string types are a problem especially when used incorrectly: if a library uses them, they can be exposed to every application that calls it, even when the application does not want them and even when the library does not (or should not) really care about the encoding. GOTO is not a problem; it is good, because it does not affect library APIs: even if a library uses it, your program does not have to, and vice versa. Unicode string types do not have that kind of isolation, so they are a much more significant problem, and should be avoided when designing a programming language.

(None of the above means that there is never any reason to deal with UTF-8, although usually there isn't a good one. For example, if a file in an ASCII format can contain commands that produce output in a UTF-16 format, then it makes sense to treat the arguments to those commands as WTF-8 so that they can be converted to UTF-16, since WTF-8 is the "corresponding ASCII-compatible character encoding" for UTF-16. Similarly, if the output file is JIS, then using EUC-JP would be sensible.)
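To make the byte-versus-code-point point above concrete, here is a minimal sketch (Python is just my choice for illustration; the strings are made up). Byte lengths and code-point lengths differ as soon as you leave ASCII, and splitting by bytes works even on data that is not valid UTF-8:

    # Byte string vs. decoded text: lengths differ once you leave ASCII.
    data = "naïve".encode("utf-8")      # b'na\xc3\xafve'
    print(len(data))                     # 6  (bytes)
    print(len(data.decode("utf-8")))     # 5  (code points)

    # A byte string need not be valid UTF-8 at all; splitting by bytes still works.
    raw = b"\xff\xfeabc\x00def"
    print(raw.split(b"\x00"))            # [b'\xff\xfeabc', b'def']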
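And for the WTF-8 to UTF-16 case in the last parenthetical, a rough sketch (again Python, and wtf8_to_utf16le is a hypothetical helper name): the "surrogatepass" error handler is only approximately WTF-8, since it also tolerates paired surrogates that WTF-8 forbids, but it shows the idea of passing lone surrogates through to UTF-16 instead of rejecting them:

    def wtf8_to_utf16le(arg: bytes) -> bytes:
        # Decode roughly-WTF-8 bytes (lone surrogates allowed), re-encode as UTF-16LE.
        return arg.decode("utf-8", "surrogatepass").encode("utf-16-le", "surrogatepass")

    print(wtf8_to_utf16le(b"caf\xc3\xa9"))   # ordinary UTF-8 input: b'c\x00a\x00f\x00\xe9\x00'
    print(wtf8_to_utf16le(b"\xed\xa0\x80"))  # lone surrogate U+D800: b'\x00\xd8'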