This approach is strongly reminiscent of abstract algebra; for example, the Map-Reduce Transformation example seems to be justified exactly when f is a homomorphism with respect to r.

I've been thinking along somewhat similar lines, but from the perspective of introduction and elimination rules, as found in type theory. If your compiler can find a transformation that brings an intro form and an elim form together, then you can apply the computation rule to remove both from the program. My motivation is to remove allocations that aren't really essential. (I do have a functional language compiler that I work on, but it doesn't implement any optimizations yet.)

Another interesting idea that seems related is from a Philip Wadler talk [1] I saw, which discusses how normalization can be used to achieve something like your "hypothetical inverses", which should always be eliminated from the final result.

> These transformations might only be useful in obvious optimization scenarios (cases that no programmer would intentionally create).

I would be surprised if that turned out to be the case.

[1] "Everything Old is New Again: Quoted Domain Specific Languages", https://www.youtube.com/watch?v=DlBwJ4rvz5c
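To make the intro/elim idea concrete, here is a minimal sketch in Haskell (the names are mine, purely illustrative): the pair constructor (,) is an intro form and fst is an elim form, so once a rewrite brings them together, the computation rule fst (a, b) = a deletes both, and the pair is never allocated:

    -- Before: a pair is introduced, then immediately eliminated.
    before :: Int -> Int
    before x = fst (x + 1, expensive x)
      where expensive = (* 1000)  -- hypothetical work; its result is never used

    -- After applying the computation rule  fst (a, b) = a :
    -- the pair allocation and the dead second component both disappear.
    after :: Int -> Int
    after x = x + 1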
This is excellent; I was thinking about a similar optimization using inverses:

If one calculates some reduction over a data structure, and the reduction's operator is a group operator (so it has an inverse, like addition), and the data structure is then updated, it may be cheaper to update the result through the inverse than to recalculate the whole reduction.

For example, if we are calculating the sum of a list and an element is removed, it's easier to just subtract that element from the sum than to calculate the sum again.

Anyway, all in all, I think compiler knowledge of function inverses (and perhaps also of isomorphisms between types) will become very handy in the future, and will enable a huge number of optimizations.
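A minimal sketch of this in Haskell (the names removeElem and cachedSum are mine, just for illustration): instead of re-folding the list after a removal, keep the sum alongside the list and apply the group inverse of (+):

    import Data.List (delete)

    -- Updating the cached sum needs one subtraction, because (+)
    -- has an inverse (negation); re-running the reduction would
    -- cost a full traversal.
    removeElem :: Int -> ([Int], Int) -> ([Int], Int)
    removeElem x (xs, cachedSum) = (delete x xs, cachedSum - x)

(The constant-time claim is for the sum update only; delete itself still walks the list.)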
For the Map-Reduce transformation, you require f(r(a, b)) = r(f(a), f(b)). This is equivalent to requiring f to be a homomorphism with respect to r, I think.

You could probably generalize that condition to situations where f can be lifted over the output of r, so you don't have to require that r's output type be the same as its input type.
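A concrete instance, sketched in Haskell: take r = (+) and f = (*2); then f (r a b) = r (f a) (f b) is just distributivity, 2*(a+b) = 2*a + 2*b, which licenses pulling the map out of the reduce:

    -- Both sides compute the same value whenever
    -- f (r a b) == r (f a) (f b) holds for all a, b.
    lhs, rhs :: [Int] -> Int
    lhs xs = foldr (+) 0 (map (* 2) xs)  -- reduce after map
    rhs xs = 2 * foldr (+) 0 xs          -- apply f once, after the reduce

(Strictly, the empty-list case also needs f to fix the unit of r, which holds here: 2 * 0 == 0.)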
How does this compare to stream fusion? http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.7401
Referenced in the paper panic suggested below ( https://news.ycombinator.com/item?id=11595273 ), and suggested to me directly by @mattmcd on Twitter ( https://twitter.com/mattmcd/status/726342652845809664 ), there is something called the "fusion property of fold," which would explain the transformations I'm discussing here as applied to folds (sided reduces). From the paper ( http://www.cs.nott.ac.uk/~pszgmh/fold.pdf ), with some minimal editorial substitution from me, we see that:

h(fold(g, w, l)) = fold(f, h(w), l)

if

h(g(x, y)) = f(x, h(y))

This is exciting because (1) it suggests the equivalent transformation for sided reduces (folds), and (2) fold is a more general operator than the others. My somewhat silly use of "hypothetical inverses" effectively just creates a goalpost to (1) constructively find the function f, given g (or vice versa), and (2) prove the necessary condition.

Very exciting.
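To see the fusion law fire, here is a small Haskell instantiation (my own example, not from the paper): take h = (*2) and g = (+); then h (g x y) = 2*(x+y) = f x (h y) with f x z = 2*x + z, so the post-processing step h is absorbed into the fold and everything happens in a single pass:

    -- Fusion law:  h . foldr g w  ==  foldr f (h w)
    --   provided   h (g x y) == f x (h y)   for all x, y.
    lhs, rhs :: [Double] -> Double
    lhs = (* 2) . foldr (+) 0                 -- h applied after the fold
    rhs = foldr (\x z -> 2 * x + z) (2 * 0)   -- the fused fold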
> How do we make compilers smart enough to know that “sum(reverse(x))” can be rewritten to “sum(x)”?

That example is pretty easy; you apply a pattern match to the tree which looks for the pattern (sum (reverse ?var)) and rewrites it to (sum ?var).

In Common Lisp, we would use define-compiler-macro on the sum function, which looks for the argument form being (reverse whatever), in which case the macro returns the form (sum whatever).

Not so fast, though; order of summing matters in floating point! And over integers too, in the presence of overflow (in junk languages without bignums). That is to say, if we add together three or more integers, the arithmetic result can be in range of the type, yet some orders of summation might overflow while others do not.
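The same pattern-match-and-rewrite can be expressed in Haskell with GHC's RULES pragma (offered here as an analogue of the Lisp compiler macro; this particular rule is mine, and per the caveat above it is only safe at a type where summation order is irrelevant):

    module SumReverse where

    -- Tell GHC: wherever the expression tree matches `sum (reverse xs)`
    -- at this type, rewrite it to `sum xs`.
    {-# RULES
    "sum/reverse"  forall (xs :: [Integer]).  sum (reverse xs) = sum xs
      #-}

Restricting the rule to Integer, a bignum type with associative and commutative addition, sidesteps both the floating-point ordering problem and fixed-width overflow.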
The premise of the optimizations in this article doesn't always hold, unfortunately. If x is floating point, sum(reverse(x)) is unlikely to equal sum(x) for long enough x, and sorting x beforehand will make the effect even more apparent.
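A tiny counterexample in Haskell, using left-to-right summation of Doubles: 1.0 is absorbed when added to a value of magnitude 1e16, so the order of the cancellation decides whether it survives:

    xs :: [Double]
    xs = [1e16, -1e16, 1.0]

    -- foldl (+) 0 xs            == (1e16 - 1e16) + 1.0 == 1.0
    -- foldl (+) 0 (reverse xs)  == (1.0 - 1e16) + 1e16 == 0.0
    main :: IO ()
    main = print (foldl (+) 0 xs, foldl (+) 0 (reverse xs))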
Interesting article. Not sure it meets the "Show HN" guidelines -- there is nothing for users to try out or play with. If it is still within the edit window, maybe change the title. If not, maybe email the moderators.