"Now, in Python 3, we finally have Unicode (utf-8) strings, and 2 byte classes: byte and bytearrays."<p>No, they are Unicode strings. utf-8 is an encoding, and only comes into play when you want to encode strings into bytes for sending them somewhere, or decode them from bytes when receiving them. The interpreter's internal representation of the string is either UCS-4 or an automatically selected encoding (<a href="http://legacy.python.org/dev/peps/pep-0393/" rel="nofollow">http://legacy.python.org/dev/peps/pep-0393/</a>), but that is an irrelevant implementation detail. Conceptually, the strings are sequences of Unicode characters, and it helps to think of them that way.<p>Here are the really important facts about Unicode handling differences between Python 2 and 3 (aside from the obvious str/unicode -> bytes/str move):<p>- There is no silent Unicode coercion in Python 3. Unlike Python 2, your bytes objects won't be decoded for you to str just because you happened to concatenate them with a Unicode string. Your Unicode strings won't be encoded silently if you write them to a byte stream (instead, Python 3 will fail with the cryptic error "TypeError: 'str' does not support the buffer interface").<p>- The default encoding in Python 3 is utf-8, instead of the insane ascii default in Python 2.<p>- All text I/O methods by default return decoded strings, except if you open a stream in binary mode (open(filename, "b")), which now actually means what you'd expect. See the documentation for the io module (<a href="https://docs.python.org/3.4/library/io.html" rel="nofollow">https://docs.python.org/3.4/library/io.html</a>) for more information. (You can use the io module in Python 2.7 to write code that is more forward-compatible with Python 3.)<p>- The above I/O semantics includes sys.argv, os.environ, and the standard streams (sys.stdin/stdout/stderr). 
The fact that all of these behave differently between Python 2 and 3 with respect to text encoding makes for a lot of fun hair pulling when trying to write code compatible with both.<p>I have built a small library of helper functions to help deal with this stuff in a sane way: <a href="https://github.com/kislyuk/eight" rel="nofollow">https://github.com/kislyuk/eight</a>. Another project that I can recommend that tries to lessen the pain of writing code that's compatible with both 2 and 3 is python-future: <a href="https://github.com/PythonCharmers/python-future" rel="nofollow">https://github.com/PythonCharmers/python-future</a>.
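To make the "no silent coercion" point concrete, here is a minimal sketch, assuming Python 3. The variable names are only illustrative; the calls used (str.encode, bytes.decode, io.BytesIO, io.TextIOWrapper) are standard library.

```python
import io

text = "caf\u00e9"            # a str: a sequence of Unicode code points
data = text.encode("utf-8")   # bytes: the UTF-8 encoding of that text

# Mixing str and bytes raises TypeError in Python 3 instead of coercing.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding back to str is always an explicit step.
round_trip = data.decode("utf-8")

# A binary stream accepts bytes only; writing a str to it fails.
raw = io.BytesIO()
try:
    raw.write(text)
    wrote_str = True
except TypeError:
    wrote_str = False

# A text wrapper around the binary stream does the encoding for you.
wrapper = io.TextIOWrapper(raw, encoding="utf-8")
wrapper.write(text)
wrapper.flush()
encoded = raw.getvalue()
```

Passing the encoding explicitly, rather than relying on a default, is what tends to keep code like this behaving the same on different machines.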
There are other, more subtle differences. In Python 3 this doesn't work, because tuple parameter unpacking was removed from function and lambda signatures (PEP 3113).<p><pre><code> >>> filter(lambda (x, y): x > y, ((1, 2), (4, 3)))
</code></pre>
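Two ways to write that filter so it runs on both 2.7 and 3 (a sketch; the names `pairs`, `by_index` and `by_comprehension` are made up for illustration):

```python
pairs = ((1, 2), (4, 3))

# Option 1: take the tuple as a single argument and index into it.
by_index = list(filter(lambda pair: pair[0] > pair[1], pairs))

# Option 2: unpack in a comprehension, where tuple targets are still allowed.
by_comprehension = [(x, y) for x, y in pairs if x > y]
```

Both give a list containing only `(4, 3)`.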
And 2.7 returns the filter result in the input type when the input is a string or a tuple (otherwise it returns a list).<p><pre><code> >>> filter(lambda x: x in 'ABC', 'ABCDEFA')
'ABCA'
</code></pre>
In 3, filter returns an iterator, so it's an extra step.<p><pre><code> >>> ''.join(filter(lambda x: x in 'ABC', 'ABCDEFA'))
'ABCA'</code></pre>
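A related gotcha, sketched here assuming Python 3: the object filter returns is a one-shot iterator, so it can only be consumed once, and you rebuild whatever container type you want explicitly.

```python
kept = filter(lambda ch: ch in "ABC", "ABCDEFA")

# The result is a lazy iterator, not a str, list or tuple.
is_lazy = not isinstance(kept, (str, list, tuple))

as_text = "".join(kept)       # consumes the iterator
second_pass = "".join(kept)   # nothing left: empty string

# For a tuple input, rebuild the tuple yourself.
as_tuple = tuple(filter(lambda n: n > 2, (1, 2, 3, 4)))
```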
The `xrange` example suggests that in Python 3.x, `range` (which is equivalent to `xrange` in Python 2.x) is slower than `range` in Python 2.x.<p>This is a double patently IMHO.<p><a href="http://sebastianraschka.com/Articles/2014_python_2_3_key_diff.html#xrange" rel="nofollow">http://sebastianraschka.com/Articles/2014_python_2_3_key_dif...</a>
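If you want to check a speed claim like this on your own interpreters, a rough sketch with timeit is below. The helper `sum_range` and the loop sizes are arbitrary choices for illustration, not the article's benchmark; run the same file under each Python version and compare the printed numbers.

```python
import timeit

def sum_range(n):
    """Loop over range(n) and add up the values."""
    total = 0
    for i in range(n):
        total += i
    return total

# Time 100 calls and report the average cost of one call.
elapsed = timeit.timeit(lambda: sum_range(10000), number=100)
per_call_us = elapsed / 100 * 1e6
print("%.1f microseconds per call" % per_call_us)
```

Note that on 2.x this times the list-building `range`; swap in `xrange` there to compare like with like.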
Regarding the Unicode issue, here is another post (by Armin Ronacher/mitsuhiko, creator of the Flask web framework):<p><a href="http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/" rel="nofollow">http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/</a>
> However, it is also possible - in contrast to generators - to iterate over those multiple times if needed, it is aonly not so efficient.<p>You can reuse a generator multiple times via itertools.tee():<p><a href="https://docs.python.org/2/library/itertools.html#itertools.tee" rel="nofollow">https://docs.python.org/2/library/itertools.html#itertools.t...</a>
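A small sketch of the tee() approach (the `squares` generator is just a stand-in). Strictly speaking, tee() does not rewind the generator; it hands back independent iterators that buffer the items between them, and the original generator should not be touched afterwards.

```python
import itertools

def squares(n):
    """Yield the first n square numbers."""
    for i in range(n):
        yield i * i

# Split one generator into two independent iterators.
first, second = itertools.tee(squares(4))

first_pass = list(first)
second_pass = list(second)   # same items, served from tee's internal buffer
```

If one copy is fully consumed before the other starts, tee() ends up buffering everything, so at that point calling list() once on the generator is usually simpler.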
Typo in this sentence: However, it is also possible - in contrast to generators - to iterate over those multiple times if needed, it is aonly not so efficient.
Though many improvements have happened, developers still prefer to use 2.x, and the ecosystem is taking a long time to adopt 3.x. "Who will tie the bell to the cat?"