Ydiff: Structural Comparison of Programs

176 points by roquin almost 12 years ago

15 comments

barakm almost 12 years ago

This is pretty damn cool, but I can think of a few things that I'm curious how you're going to address:

1) Your C++ example: we have "namespace 'v8'", which has been removed and reinserted. That made me scratch my head a bit. Has it been refactored enough that this is kind of a rewrite? If so, there might be a third or fourth color here for "hey, this part didn't /change/ per se, but everything under it did".

2) Your Python example: right at the top we have a new class, PairIterator, with "class" rightly green. However, I'm fighting against years of reading textual diffs that suggest the whole block is really the new thing. Could the block highlight the class as-is (where it changed) and subtly color the rest of the tree (what else changed with it)?

3) I get your whole in-the-future-we-store-ASTs argument, but that's certainly not the case today. Today we store text, and I don't see that changing. Could there also be a diff, perhaps even another mode, that, counter to just dealing with structure, deals with formatting? I.e., find the ranges that contribute nothing to the AST and then diff those textually.

I like the ideas and the paradigm shifting -- and the real answer is likely somewhere in between, because, at the end of the day, programmers are still editing things in text, even if they are manipulating greater structures.

As a diff tool, I'd introduce this to my workflow in a heartbeat if I weren't working so hard to interpret what the diff means.
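To make point 3 concrete, here is a minimal sketch of a "formatting mode", assuming Python source and the standard `ast` and `difflib` modules rather than anything ydiff itself does. The function names `classify_change` and `formatting_diff` are hypothetical. It only decides whether a change touched the AST at all and, if not, falls back to an ordinary textual diff; it does not isolate the exact non-AST ranges.

```python
import ast
import difflib


def classify_change(old: str, new: str) -> str:
    """Label a change as identical, formatting-only, or structural."""
    if old == new:
        return "identical"
    # ast.dump omits line/column attributes by default, so whitespace and
    # comments (which the parser discards) do not affect the comparison.
    if ast.dump(ast.parse(old)) == ast.dump(ast.parse(new)):
        return "formatting-only"
    return "structural"


def formatting_diff(old: str, new: str) -> list[str]:
    """Plain line-based diff, intended for changes the AST cannot see."""
    return list(
        difflib.unified_diff(
            old.splitlines(), new.splitlines(), "old", "new", lineterm=""
        )
    )


old_src = "x = f(1,2)\n"
reformatted = "x = f(1, 2)  # spaced out\n"
edited = "x = f(1, 3)\n"

print(classify_change(old_src, reformatted))
print(classify_change(old_src, edited))
print("\n".join(formatting_diff(old_src, reformatted)))
```

A real tool would presumably run both modes side by side, showing the structural diff for the tree and the textual diff only for spans classified as formatting-only.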

RBerenguel almost 12 years ago

Neat, love new approaches to old problems. At first sight I thought it would be an awesome implementation of an idea I had (and probably tons of other people before and after :) a few years ago to find cheating among C programming assignments: convert programs to PostScript "drawings". Ifs, fors and certain other functions gave rise to different "movements" of the cursor, eventually drawing paths for different expressions. Since it didn't take into account variable names, malloc-calloc-free order or any other extraneous thing, it could pinpoint cheating with relatively good accuracy (and indeed, we caught 3 cheaters among ~90 submissions, and one of them would have been pretty hard to spot without the tool).

If anyone's interested, it was created with lex and basically parsed the expressions I was interested in. I think I wrote about it once in my blog, but I'm not sure if I ever published the code (it was hacky as hell!)
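The core idea here (compare program shape while ignoring names) can be sketched in a few lines. This is not the lex/PostScript tool described above; it swaps in Python's standard `ast` module and `difflib.SequenceMatcher`, and the helper names `shape` and `similarity` are made up for illustration.

```python
import ast
from difflib import SequenceMatcher


def shape(source: str) -> list[str]:
    """Return AST node-type names in ast.walk order.

    Identifiers and literal values are dropped, so renaming variables
    does not change the result.
    """
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how closely the two node-type sequences match."""
    return SequenceMatcher(None, shape(a), shape(b), autojunk=False).ratio()


original = """
def total(xs):
    acc = 0
    for x in xs:
        if x > 0:
            acc += x
    return acc
"""

# Same structure, every identifier renamed.
renamed = """
def sum_positive(values):
    result = 0
    for v in values:
        if v > 0:
            result += v
    return result
"""

# Same behaviour, genuinely different structure.
rewritten = """
def total(xs):
    return sum(x for x in xs if x > 0)
"""

print(similarity(original, renamed))
print(similarity(original, rewritten))
```

The renamed copy scores as an exact structural match, while the rewritten one scores lower. Reordering statements would lower the score too, so a sketch like this would not tolerate the malloc-calloc-free shuffling the original tool ignored.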

tikhonj almost 12 years ago

I was working on a very similar project myself recently [1], heavily inspired by ydiff. Unfortunately, I didn't implement the tree diff algorithm entirely correctly, and it's been on hiatus for over a year :(.

The one interesting addition my project had was merging. The neat trick was that we reused the same tree diff algorithm to find conflicts :P. With a bit of work, we would have some very neat features, including the ability to resolve certain conflicts which physically overlap.

[1]: http://jelv.is/cow

goldfeld almost 12 years ago

Lovely to see Racket being used. Given the pure meta-ness of Racket, hopefully ydiff will not only evolve into a full-fledged version control system but a completely configurable one at that, akin to the power of Emacs and its Lisp. Having something like a .yinrc where I could quickly throw around and test out custom functionality would be awesome. I haven't gotten much into git's plumbing yet, but I don't see a culture of hacking it like Emacs and Vim, and I'm not sure why there couldn't be one. There's definitely a need, since you have things like git-flow. With structural versioning I think the possibilities of customization and automation would be greatly expanded, and I'd love to be able to use an .*rc file like I use my .vimrc daily, as a scratchpad for small scripts or for fleshing out rough ideas into later externalized plugins.

taliesinb almost 12 years ago

We use a similar thing internally at Wolfram to diff Wolfram Language expressions, though I believe it doesn't have the subexpression matcher. But the diffs are much clearer when you have full m-expressions instead of s-expressions.

The coder who wrote it (@wmacura) now works at Tumblr.

n72 almost 12 years ago

Relatedly, see Rich Hickey's http://blog.datomic.com/2012/10/codeq.html

frozenport almost 12 years ago

Wonderful, although I would rather have the AST integrated with Git and not as some other kind of version control.

rmc almost 12 years ago

Really neat. Shall have to play with it.

Related to this, I wish there was a way to navigate a programme based on a tree, not just line and character navigation.

aethertap almost 12 years ago

This is great. I've wanted to build a similar tool for a couple of years, the distinction being that I want a structural grep utility. I'd pass in a pattern in the form of a code snippet (with pattern operators), and have the thing search for matching code structures. I suspect that this tool does 95% of what would be needed for that, very nice.
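A rough sketch of what such a structural grep could look like, assuming Python source and the standard `ast` module (this says nothing about how ydiff works internally). The `__` wildcard, `matches`, and `structural_grep` are all hypothetical names chosen for illustration: the pattern is an ordinary expression, and `__` stands for "any subtree".

```python
import ast

WILDCARD = "__"  # hypothetical pattern operator: matches any subtree


def _match_value(pval, nval) -> bool:
    """Compare one field value of the pattern against the candidate's."""
    if isinstance(pval, ast.AST):
        return isinstance(nval, ast.AST) and matches(pval, nval)
    if isinstance(pval, list):
        return (
            isinstance(nval, list)
            and len(pval) == len(nval)
            and all(_match_value(p, n) for p, n in zip(pval, nval))
        )
    return pval == nval  # plain values: attribute names, constants, None


def matches(pattern: ast.AST, node: ast.AST) -> bool:
    """True if node has the same structure as pattern, modulo wildcards."""
    if isinstance(pattern, ast.Name) and pattern.id == WILDCARD:
        return True
    if type(pattern) is not type(node):
        return False
    return all(
        _match_value(pval, getattr(node, field, None))
        for field, pval in ast.iter_fields(pattern)
    )


def structural_grep(pattern_src: str, source: str) -> list[int]:
    """Return sorted line numbers of expressions matching the pattern."""
    pattern = ast.parse(pattern_src, mode="eval").body
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.expr) and matches(pattern, node)
    )


code = """
import os
path = os.path.join(base, "a")
other = os.path.join("x", compute(y))
three = os.path.join(a, b, c)
name = os.path.basename(path)
"""

print(structural_grep("os.path.join(__, __)", code))  # lines 3 and 4
```

Only the two-argument calls match; the three-argument call and `os.path.basename` do not. A usable tool would also want wildcards that span a variable number of arguments and patterns for statements, which this sketch leaves out.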

jongraehl almost 12 years ago

Cool, but not perfect: at the end of void ArmDebugger::Stop(Instruction* instr) (rhs of http://www.cs.indiana.edu/~yw21/demos/simulator-mips-simulator-arm.html) we have an alignment of ' 2 * Instruction::kInstrSize' to some random code at the end.

Scaevolus almost 12 years ago

Semantic Merge [1] is a product based around semantic diffs for C#/VB.NET. Their demo is quite impressive, and they're working on Java and C++ next.

[1]: http://www.semanticmerge.com/

mcmire almost 12 years ago

Worth noting this was written up a year ago and there hasn't been a whole lot of progress since. Would love to see this idea taken further; it seems really neat.

thesebas almost 12 years ago

An interesting discussion on a similar topic: http://programmers.stackexchange.com/questions/119095/why-dont-we-store-the-syntax-tree-instead-of-the-source-code

pagekicker almost 12 years ago

Can this be pointed at bash scripts?

blacksqr almost 12 years ago

Name too close to "yiff" to take seriously.