This is pretty damn cool, but I can think of a few things that I'm curious how you're going to address:<p>1) In your C++ example, "namespace 'v8'" is shown as removed and reinserted. That made me scratch my head a bit. Has it been refactored enough that this is kind of a rewrite? If so, there might be room for a third or fourth color here meaning "hey, this part didn't /change/ per se, but everything under it did."<p>2) Your Python example: right at the top we have a new class, PairIterator, with "class" rightly green. However, I'm fighting against years of reading textual diffs, which suggest the whole block is the new thing. Could the block highlight the class as-is (where it changed) and subtly color the rest of the tree (what else changed with it)?<p>3) I get your whole in-the-future-we-store-ASTs argument, but that's certainly not the case today. Today we store text, and I don't see that changing. Could there also be a diff, perhaps even another mode, that deals with formatting rather than structure? I.e., find the ranges that contribute nothing to the AST and then diff those textually.<p>I like the ideas and the paradigm shifting -- and the real answer is likely somewhere in between, because, at the end of the day, programmers are still editing things in text, even if they are manipulating greater structures.<p>As a diff tool, I'd introduce this to my workflow <i>in a heartbeat</i> if I weren't working so hard to interpret what the diff means.
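To make point 3 above concrete, here is a minimal sketch in Python. It is not how ydiff works. It assumes Python sources and uses only the standard `ast` and `difflib` modules. The function name `formatting_only_diff` is hypothetical, and it checks whole files rather than individual ranges, which simplifies the idea in the comment.

```python
import ast
import difflib


def formatting_only_diff(old: str, new: str):
    """Return a textual diff if only formatting changed, else None.

    Hypothetical helper: it compares whole files rather than individual
    ranges, which is a simplification of the idea in the comment.
    """
    # ast.dump omits line/column attributes by default, and the parser
    # discards comments and redundant whitespace, so equal dumps mean the
    # two versions have the same structure.
    if ast.dump(ast.parse(old)) != ast.dump(ast.parse(new)):
        return None  # structural change: leave it to the tree diff
    return list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm=""))
```

A caller could try this mode first and fall back to the structural diff whenever it returns `None`.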
Neat, love new approaches to old problems. At first sight I thought it would be an awesome implementation of an idea I had a few years ago (and probably tons of other people before and after :) to detect cheating in C programming assignments: convert programs to PostScript "drawings". Ifs, fors and certain other functions gave rise to different "movements" of the cursor, eventually drawing paths for different expressions. Since it didn't take into account variable names, malloc-calloc-free order or any other extraneous thing, it could pinpoint cheating with relatively good accuracy (and indeed, we caught 3 cheaters among ~90 submissions, and one of them would have been pretty hard to spot without the tool). If anyone's interested, it was created with lex and basically parsed the expressions I was interested in. I think I wrote about it once in my blog, but I'm not sure if I ever published the code (it was hacky as hell!)
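The "ignore the names, compare the structure" idea above can be sketched in a few lines. This is not the lex/PostScript tool described in the comment. It swaps in Python's standard `ast` module and targets Python rather than C, and the names `shape` and `same_structure` are hypothetical. It only drops identifiers and literal values; statement order still matters.

```python
import ast


def shape(node: ast.AST):
    """Nested tuple of node type names; identifiers and literal values are dropped."""
    return (type(node).__name__,
            tuple(shape(child) for child in ast.iter_child_nodes(node)))


def same_structure(a: str, b: str) -> bool:
    """True if two programs have identical syntax trees once names are ignored."""
    return shape(ast.parse(a)) == shape(ast.parse(b))
```

Two submissions that differ only by renamed variables and functions compare as equal, while swapping a `for` loop for a `while` loop does not.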
I was working on a very similar project myself recently [1], heavily inspired by ydiff. Unfortunately, I didn't implement the tree diff algorithm entirely correctly, and it's been on hiatus for over a year :(.<p>The one interesting addition my project had was merging. The neat trick was that we reused the same tree diff algorithm to find conflicts :P. With a bit of work, we would have some very neat features, including the ability to resolve certain conflicts which physically overlap.<p>[1]: <a href="http://jelv.is/cow" rel="nofollow">http://jelv.is/cow</a>
Lovely to see Racket being used. Given the pure meta-ness of Racket, hopefully ydiff will not only evolve into a full-fledged version control system but a completely configurable one at that, akin to the power of Emacs and its Lisp. Having something like a .yinrc where I could quickly throw around and test out custom functionality would be awesome. I haven't gotten much into git's plumbing yet, but I don't see a culture of hacking it like Emacs and Vim, and I'm not sure why there couldn't be one. There's definitely a need, since you have things like git-flow. With structural versioning I think the possibilities of customization and automation would be greatly expanded, and I'd love to be able to use an .*rc file like I use my .vimrc daily, as a scratchpad for small scripts or for fleshing out rough ideas into later externalized plugins.
We use a similar thing internally at Wolfram to diff Wolfram Language expressions, though I believe it doesn't have the subexpression matcher. But the diffs are much clearer when you have full m-expressions instead of s-expressions.<p>The coder who wrote it (@wmacura) now works at Tumblr.
Really neat. Shall have to play with it.<p>Related to this, I wish there was a way to navigate a programme based on a tree, not just line and character navigation.
This is great. I've wanted to build a similar tool for a couple of years, the distinction being that I want a structural grep utility. I'd pass in a pattern in the form of a code snippet (with pattern operators), and have the thing search for matching code structures. I suspect that this tool does 95% of what would be needed for that. Very nice.
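A rough sketch of what such a structural grep could look like, assuming Python sources and the standard `ast` module. It has nothing to do with ydiff's internals. The `__` wildcard and the names `matches` and `structural_grep` are all hypothetical choices made for illustration.

```python
import ast

WILDCARD = "__"  # hypothetical pattern operator: matches anything in its position


def matches(pattern, node) -> bool:
    """True if `node` has the same structure as `pattern`, modulo wildcards."""
    if isinstance(pattern, ast.Name) and pattern.id == WILDCARD:
        return True
    if isinstance(pattern, list):
        return (isinstance(node, list) and len(pattern) == len(node)
                and all(matches(p, n) for p, n in zip(pattern, node)))
    if isinstance(pattern, ast.AST):
        return (type(pattern) is type(node)
                and all(matches(getattr(pattern, f, None), getattr(node, f, None))
                        for f in pattern._fields))
    return pattern == node  # plain values: identifiers, constants, None


def structural_grep(pattern_src: str, source: str):
    """Return sorted line numbers of expressions in `source` matching the pattern."""
    pattern = ast.parse(pattern_src, mode="eval").body
    return sorted(node.lineno for node in ast.walk(ast.parse(source))
                  if isinstance(node, ast.expr) and matches(pattern, node))
```

For example, `structural_grep("open(__)", source)` finds every one-argument call to `open`, including calls nested inside other expressions, regardless of spacing or what the argument is.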
Cool, but not perfect:<p>at the end of void ArmDebugger::Stop(Instruction* instr)
(rhs of <a href="http://www.cs.indiana.edu/~yw21/demos/simulator-mips-simulator-arm.html" rel="nofollow">http://www.cs.indiana.edu/~yw21/demos/simulator-mips-simulat...</a> )<p>we have an alignment of ' 2 * Instruction::kInstrSize' to some random code at the end
Semantic Merge [1] is a product based around semantic diffs for C#/VB.NET. Their demo is quite impressive, and they're working on Java and C++ next.<p>[1]: <a href="http://www.semanticmerge.com/" rel="nofollow">http://www.semanticmerge.com/</a>
Worth noting this was written up a year ago and there hasn't been a whole lot of progress since. Would love to see this idea taken further; it seems really neat.
An interesting discussion on a similar topic:
<a href="http://programmers.stackexchange.com/questions/119095/why-dont-we-store-the-syntax-tree-instead-of-the-source-code" rel="nofollow">http://programmers.stackexchange.com/questions/119095/why-do...</a>