Very interesting! I have been thinking about something similar, but rather than learning from examples, my version would use a generic "misaligned" cost function that penalises lines which have similar content in different columns, and would minimise that cost by hill climbing or a similar local search.<p>The difficulty is tying it to a particular language's parser and whitespace rules.
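<p>To make the idea concrete, here is a rough sketch in Python, not a real formatter. Everything in it is an assumption for illustration: lines arrive already split into tokens, "similar content" means an identical token on the adjacent line, the cost is the column distance between such tokens plus a small penalty for padding (the hypothetical `space_weight`), and the only move is widening or narrowing one gap by a single space. The names `columns`, `misalignment_cost`, `hill_climb` and `render` are made up here.

```python
def columns(tokens, gaps):
    """Start column of each token, given the spaces placed before it."""
    cols, col = [], 0
    for tok, gap in zip(tokens, gaps):
        col += gap
        cols.append(col)
        col += len(tok)
    return cols


def misalignment_cost(lines, layout, space_weight=0.1):
    """Column distance between identical tokens on adjacent lines,
    plus a small charge for every space beyond the minimum of one."""
    cost = 0.0
    for i in range(len(lines) - 1):
        a_cols = columns(lines[i], layout[i])
        b_cols = columns(lines[i + 1], layout[i + 1])
        used = set()  # each token on the lower line is matched at most once
        for tok_a, col_a in zip(lines[i], a_cols):
            for j, (tok_b, col_b) in enumerate(zip(lines[i + 1], b_cols)):
                if tok_a == tok_b and j not in used:
                    used.add(j)
                    cost += abs(col_a - col_b)
                    break
    for gaps in layout:
        cost += space_weight * sum(g - 1 for g in gaps[1:])
    return cost


def hill_climb(lines, space_weight=0.1):
    """Greedy local search: change one gap by one space at a time and
    keep the change only if the total cost strictly decreases."""
    # gaps[0] is the indent (left at 0 and never moved); other gaps start at 1.
    layout = [[0] + [1] * (len(toks) - 1) for toks in lines]
    best = misalignment_cost(lines, layout, space_weight)
    improved = True
    while improved:
        improved = False
        for gaps in layout:
            for k in range(1, len(gaps)):
                for delta in (1, -1):
                    if gaps[k] + delta < 1:
                        continue  # tokens must stay separated by a space
                    gaps[k] += delta
                    cost = misalignment_cost(lines, layout, space_weight)
                    if cost < best:
                        best = cost
                        improved = True
                    else:
                        gaps[k] -= delta  # undo the move
    return layout


def render(lines, layout):
    """Rebuild the text lines from tokens and their gaps."""
    return ["".join(" " * g + t for t, g in zip(toks, gaps))
            for toks, gaps in zip(lines, layout)]


# Whitespace splitting stands in for a real tokenizer here.
example_lines = [s.split() for s in ["x = 1", "longer = 2"]]
example_layout = hill_climb(example_lines)
print("\n".join(render(example_lines, example_layout)))
```

On this two-line input the search pads the first line until both `=` signs sit in the same column. The sketch sidesteps exactly the hard part mentioned above: it knows nothing about a language's grammar or about where whitespace is significant, and a plain hill climb like this can also get stuck in local minima on larger inputs.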