A tip on the diffing:<p>We have done something similar when diffing HTML, i.e. replacing each HTML tag with a single Unicode character. We then run the diff and get several diff segments (EQUALS, INS, DEL). After that, we scan those segments for tag placeholders and split them out into a segment type of their own.<p>So an insert like INS(something \uE000 else) becomes three <i>changes</i>: INS(something ) INS_TAG(\uE000) INS( else). That way, the INS_TAG is not wrapped in <ins> when converting the result back to HTML.
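<p>A rough sketch of that splitting step in Python, not our actual code. It assumes the diff output is a list of (op, text) tuples and that every tag was mapped to one code point in the Unicode Private Use Area (U+E000 to U+F8FF). The names split_tag_segments, render and the tags lookup table are made up for illustration, and dropping deleted tags in render is just one possible choice.

```python
import html
import re

# Assumption: each HTML tag was replaced by one code point from the
# Unicode Private Use Area (U+E000..U+F8FF) before diffing.
PLACEHOLDER_RUN = re.compile("([\uE000-\uF8FF]+)")


def split_tag_segments(segments):
    """Split (op, text) diff segments so placeholder runs get their own op.

    ("INS", "a \uE000 b") becomes ("INS", "a "), ("INS_TAG", "\uE000"),
    ("INS", " b"). The same applies to EQUALS and DEL segments.
    """
    result = []
    for op, text in segments:
        # re.split with a capturing group keeps the matched runs in the output.
        for part in PLACEHOLDER_RUN.split(text):
            if not part:
                continue  # skip empty strings produced at the edges
            is_tag = PLACEHOLDER_RUN.fullmatch(part) is not None
            result.append((op + "_TAG" if is_tag else op, part))
    return result


def render(segments, tags):
    """Convert split segments back to HTML.

    `tags` is a hypothetical dict mapping each placeholder character back
    to the tag it replaced.
    """
    out = []
    for op, text in segments:
        if op.endswith("_TAG"):
            # Tags are never wrapped in <ins>/<del>.
            # Assumption: deleted tags are simply dropped from the output.
            if op != "DEL_TAG":
                out.append("".join(tags[ch] for ch in text))
        elif op == "INS":
            out.append("<ins>" + html.escape(text) + "</ins>")
        elif op == "DEL":
            out.append("<del>" + html.escape(text) + "</del>")
        else:
            out.append(html.escape(text))
    return "".join(out)


if __name__ == "__main__":
    diff = [("EQUALS", "Hello "), ("INS", "something \uE000 else")]
    print(render(split_tag_segments(diff), {"\uE000": "<br>"}))
```

<p>Keeping the tag segments as a separate type means the renderer can decide per type what to do with them, instead of having to re-parse the text inside each INS or DEL.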