Java Unicode escapes are kind of a mess, and the raw string literals proposal only makes it worse.<p>Unicode escapes are processed in Java not just inside string literals, but <i>everywhere</i> in the source code. So, for example, the following program prints "Hello, world!" even though that line of code appears to be commented out (\u000a is a newline, so it ends the comment):<p><pre><code> public class Test {
     public static void main(String[] args) {
         // \u000a System.out.println("Hello, world!");
     }
 }
</code></pre>
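For what it's worth, that early translation step (JLS §3.3) can be approximated in a few lines. This is only a rough sketch: the class and method names are made up for illustration, and unlike javac it copies malformed escapes through instead of reporting an error. The thing to notice is that it runs over raw characters and has no notion of comments or string literals, which is exactly why the "commented" line above gets executed.

```java
final class UnicodeEscapeSketch {

    // ASCII hex digit value, or -1 if c is not a hex digit.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Replaces every eligible Unicode escape in src with the character it names.
    // Runs before any tokenizing, so comments and strings get no special treatment.
    static String translate(String src) {
        StringBuilder out = new StringBuilder();
        int backslashRun = 0; // contiguous backslashes immediately before position i
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            // A backslash can start an escape only after an even number of backslashes.
            if (c == '\\' && backslashRun % 2 == 0
                    && i + 1 < src.length() && src.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < src.length() && src.charAt(j) == 'u') j++; // one or more 'u'
                boolean ok = j + 4 <= src.length();
                int value = 0;
                for (int k = 0; ok && k < 4; k++) {
                    int d = hexValue(src.charAt(j + k));
                    if (d < 0) ok = false; else value = value * 16 + d;
                }
                if (ok) {
                    out.append((char) value);
                    i = j + 4;
                    backslashRun = 0;
                    continue;
                }
                // Malformed escape: a real compiler errors out; the sketch copies it as-is.
            }
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escape as plain text in this demo input.
        String line = "// \\u000a System.out.println(\"Hello, world!\");";
        System.out.println(translate(line));
    }
}
```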
Moreover, a \u000a inside a string literal is translated into an actual newline before the literal is even tokenized, so the compiler rejects it:<p><pre><code> Test2.java:3: error: unclosed string literal
         String s = "\u000a";
                    ^
</code></pre>
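The usual way around this is the string escape \n, which is resolved inside the literal after Unicode escapes have already been translated. Doubling the backslash also compiles, but then you get the six characters of the escape as text rather than a newline. A small illustration (the class and field names are just placeholders):

```java
final class NewlineEscapes {

    // String escape: handled when the literal itself is read, so this is one newline char.
    static final String VIA_STRING_ESCAPE = "\n";

    // Doubled backslash: no Unicode escape is recognized, so this is six characters of text.
    static final String LITERAL_TEXT = "\\u000a";

    public static void main(String[] args) {
        System.out.println(VIA_STRING_ESCAPE.length());
        System.out.println(LITERAL_TEXT.length());
    }
}
```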
But now with the raw string literal proposal, JEP 326 [1], Unicode escape processing is disabled inside raw string literals, and a \u0060 escape (an escaped backtick) doesn't count as a backtick for the purpose of delimiting a raw string literal.<p>So, with this proposal, Unicode escapes end up in a worst-of-both-worlds middle ground:<p>1) They can't be handled uniformly at a low level anymore, so a Java parser can't naively convert escapes while reading the source file, but<p>2) They must <i>still</i> be naively interpreted in unexpected places like comments and normal string literals, as shown in my examples.<p>What a mess.<p>[1] <a href="https://openjdk.java.net/jeps/326" rel="nofollow">https://openjdk.java.net/jeps/326</a>
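<p>To make point 1 concrete, here is a rough sketch of what the escape pass would have to look like once raw string literals exist. It is not taken from the JEP or from javac; the names are invented, and it makes one big simplifying assumption: every literal backtick in the input delimits a raw string literal. A real lexer would also have to know whether a backtick sits inside a comment, a normal string, or a char literal, which is precisely the state a naive pass doesn't have.

```java
final class RawAwareEscapeSketch {

    // ASCII hex digit value, or -1 if c is not a hex digit.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Length of the run of backticks starting at index i.
    static int backtickRun(String s, int i) {
        int n = 0;
        while (i + n < s.length() && s.charAt(i + n) == '`') n++;
        return n;
    }

    // Translates Unicode escapes everywhere except inside backtick-delimited spans.
    // ASSUMPTION: a literal backtick always opens or closes a raw string literal.
    static String translate(String src) {
        StringBuilder out = new StringBuilder();
        int backslashRun = 0;
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '`') {
                // Copy verbatim up to and including a backtick run of the same length.
                int open = backtickRun(src, i);
                int j = i + open;
                int end = src.length(); // unterminated literal: copy the rest
                while (j < src.length()) {
                    if (src.charAt(j) == '`') {
                        int run = backtickRun(src, j);
                        if (run == open) { end = j + run; break; }
                        j += run;
                    } else {
                        j++;
                    }
                }
                out.append(src, i, end);
                i = end;
                backslashRun = 0;
                continue;
            }
            if (c == '\\' && backslashRun % 2 == 0
                    && i + 1 < src.length() && src.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < src.length() && src.charAt(j) == 'u') j++;
                boolean ok = j + 4 <= src.length();
                int value = 0;
                for (int k = 0; ok && k < 4; k++) {
                    int d = hexValue(src.charAt(j + k));
                    if (d < 0) ok = false; else value = value * 16 + d;
                }
                if (ok) {
                    // An escaped backtick is emitted as a character but never opens a raw literal.
                    out.append((char) value);
                    i = j + 4;
                    backslashRun = 0;
                    continue;
                }
            }
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escapes as plain text in this demo input.
        System.out.println(translate("\\u0041 `\\u0041` \\u0060"));
    }
}
```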