This sounds more similar to what people have done with Racket and hint generation for MOOCs. I'm not sure which university it was, but I saw a presentation about how they generate hints for students by mutating the syntax tree and analyzing how it had to be modified to reach a target solution. It was presented at some RacketCon, maybe a decade ago already. Perhaps that knowledge of how to do it can be combined with newer machine learning approaches?<p>EDIT: I found the talk: <a href="https://invidious.baczek.me/watch?v=ijyFC36kVis" rel="nofollow">https://invidious.baczek.me/watch?v=ijyFC36kVis</a>
It's funny: this kind of subtree mutation was explored pretty deeply by Koza and Adami in the 90s under the rubric of Genetic Algorithms, though with a slightly different optimization function.<p>There's one reference in the paper, from 2000, to GAs for fast generation of program trees, but that misses the main show.<p>I hope they're reading this and dig into those guys' work.
Markov Chain Monte Carlo for program synthesis isn't exactly novel. The most immediate reference I thought of is Josh Tenenbaum's [1].<p>There are also a lot of demos in WebPPL (web probabilistic programming language) [2], like [3] for the synthesis of 3D spaceships. I highly recommend their associated books on The Design and Implementation of Probabilistic Programming Languages [4] and Probabilistic Models of Cognition [5].<p>I also highly recommend taking a look at the publications of the MIT Probabilistic Computing Project [6].<p>[1] Human-level concept learning through probabilistic program induction. <a href="https://www.cs.cmu.edu/~rsalakhu/papers/LakeEtAl2015Science.pdf" rel="nofollow">https://www.cs.cmu.edu/~rsalakhu/papers/LakeEtAl2015Science....</a><p>[2] <a href="http://webppl.org/" rel="nofollow">http://webppl.org/</a><p>[3] <a href="https://dritchie.github.io/web-procmod/" rel="nofollow">https://dritchie.github.io/web-procmod/</a><p>[4] <a href="https://dippl.org/" rel="nofollow">https://dippl.org/</a><p>[5] <a href="http://probmods.org/" rel="nofollow">http://probmods.org/</a><p>[6] <a href="http://probcomp.csail.mit.edu/" rel="nofollow">http://probcomp.csail.mit.edu/</a>
I don't understand the "magic" here.<p>In a traditional approach, you would generate random images, calculate some distance metric, and then use an optimization method like simulated annealing to minimize the distance.<p>I get that the difference between the image representations is being optimized here, but how is it possible that changing tokens in a program is differentiable?
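For what it's worth, the traditional loop described above doesn't need differentiability at all, which is part of the answer to the question: token changes can be handled as discrete proposals that are accepted or rejected. Here's a minimal sketch (a toy of my own, not the paper's method) where the "program" is a list of integer tokens, "rendering" is a stand-in function, and simulated annealing minimizes the distance to a target rendering:

```python
import math
import random

def render(tokens):
    # Toy stand-in for executing a graphics program into an image.
    return [t * t for t in tokens]

def distance(img_a, img_b):
    # Sum-of-squares distance between two "images".
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b))

def anneal(target, n_tokens=5, steps=2000, temp0=10.0, seed=0):
    rng = random.Random(seed)
    tokens = [rng.randint(0, 9) for _ in range(n_tokens)]
    cost = distance(render(tokens), target)
    for step in range(steps):
        temp = temp0 * (1 - step / steps) + 1e-9
        # Propose a discrete token mutation -- no gradient involved.
        proposal = tokens[:]
        proposal[rng.randrange(n_tokens)] = rng.randint(0, 9)
        new_cost = distance(render(proposal), target)
        # Metropolis acceptance: take improvements, and occasionally
        # take uphill moves so the search can escape local minima.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            tokens, cost = proposal, new_cost
    # Deterministic coordinate-wise polish so the sketch ends at the
    # per-token optimum rather than wherever the random walk stopped.
    for i in range(n_tokens):
        tokens[i] = min(range(10),
                        key=lambda v: distance(render(tokens[:i] + [v] + tokens[i + 1:]), target))
    return tokens, distance(render(tokens), target)

best, cost = anneal(target=render([3, 1, 4, 1, 5]))
```

MCMC-style program search works the same way in spirit: mutate, score, accept or reject, with no gradients through the program itself.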
I wonder how this would apply to compiler/interpreter optimizations.<p>Could it "dissect" some parts of the execution, perhaps at the assembly level, and come up with optimizations specific to the compiled code that modern compilers have not deterministically come up with, without changing the output (I mean the expected program output, not the emitted binary)?
There's been talk in the past about GitHub adding integrations with common build tools (automatically?). What if you could compile every LLVM-compiled project on GitHub and run a diffusion model over the intermediate representations?
Could diffusion work at the binary level? I mean, could we train a diffusion model to generate the final binary of a program given a prompt? An AST would probably be better, but a binary, I feel, is at least extremely fast to test for whether it works or not. Though there may be a lot of drawbacks, if this is possible I can't wait until we can ask "give me an app that does that" and the diffusion model starts generating all the bytes of the app to do the job. Just wondering.
The PDF is super slow to render, I guess because it contains commands from programmatically generated figures. It gives it a kind of academic-paper feel I miss these days.<p><a href="https://arxiv.org/pdf/2405.20519" rel="nofollow">https://arxiv.org/pdf/2405.20519</a>
The application to inverse graphics tasks reminds me of this paper which was released one week earlier: <a href="https://arxiv.org/abs/2405.15306" rel="nofollow">https://arxiv.org/abs/2405.15306</a>
This is very cool! My first thought is: can this be applied to converting raster graphics to vector graphics (e.g. PNG to SVG)? Seems like a very similar problem, though probably much more computationally expensive.
This is fascinating. I've been trying to envisage how the new language models will play a deeper or lower-level role in software production than simple code generation.
The application to graphics is interesting. It seems to me that current image generation models struggle with stylized pictures ("ligne claire" in comics, geometric shapes, and so on). After all, these kinds of pictures should be easy to encode in vector formats (like SVG), which are basically programming languages.
I would love to see them try this with the Processing/p5.js libraries. That's what the ASTs reminded me of. It could potentially be used to help students trying to figure out how to fix their programs. I used AST-based hints for my ProcessingJS courses on Khan Academy, but I handwrote the AST patterns for those hints.
I had an idea about doing something similar based on the DiffusER paper. One would need to model code edits as an algebra, similar to add char, replace char, or delete char, but with operations like define func, remove func, define var, etc. I am undereducated to do it myself but have a feeling it could work.
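To make the "edit algebra" idea concrete, here's a minimal sketch (names and representation are my own, not from the DiffusER paper): each edit is a small operation on a program state, and an edit sequence is just composition of those operations, which is what would let a model predict edits step by step.

```python
from dataclasses import dataclass, field

@dataclass
class Program:
    # Toy program state: named functions and variables.
    funcs: dict = field(default_factory=dict)
    vars: dict = field(default_factory=dict)

@dataclass
class DefineFunc:
    name: str
    body: str
    def apply(self, p):
        p.funcs[self.name] = self.body
        return p

@dataclass
class RemoveFunc:
    name: str
    def apply(self, p):
        p.funcs.pop(self.name, None)
        return p

@dataclass
class DefineVar:
    name: str
    value: object
    def apply(self, p):
        p.vars[self.name] = self.value
        return p

def apply_edits(program, edits):
    # An edit sequence is function composition over program states,
    # analogous to add/replace/delete char, but lifted to structure.
    for e in edits:
        program = e.apply(program)
    return program

prog = apply_edits(Program(), [
    DefineFunc("area", "return w * h"),
    DefineVar("w", 3),
    DefineFunc("perimeter", "return 2 * (w + h)"),
    RemoveFunc("perimeter"),
])
```

A real version would operate on ASTs rather than name/value dicts, but the algebraic shape is the same.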
The beam search idea is interesting. Curious to know if beam search for reverse diffusion has been done before.<p>Could anyone clarify how they integrate beam search with the reverse diffusion: do they sample m > k nodes from a reverse diffusion step and expand only the top k nodes?
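That sample-then-prune reading can at least be sketched in a toy form (this is my interpretation of the question, not the paper's code): at each reverse step, draw m stochastic continuations per beam node, score all of them, and keep only the top k.

```python
import heapq
import random

def beam_search(init, step_fn, score_fn, n_steps, k=3, m=5, seed=0):
    # Generic beam search over a stochastic step function, standing in
    # for a reverse-diffusion sampler with m > k samples per node.
    rng = random.Random(seed)
    beam = [init]
    for _ in range(n_steps):
        candidates = []
        for node in beam:
            # Sample m stochastic continuations per beam node...
            for _ in range(m):
                candidates.append(step_fn(node, rng))
        # ...then keep only the k best-scoring candidates to expand next.
        beam = heapq.nlargest(k, candidates, key=score_fn)
    return max(beam, key=score_fn)

# Toy problem: "denoise" a number toward a target by noisy increments.
target = 42
best = beam_search(
    init=0,
    step_fn=lambda x, rng: x + rng.randint(0, 10),
    score_fn=lambda x: -abs(x - target),
    n_steps=10,
)
```

In the diffusion setting, `step_fn` would be one stochastic denoising step and `score_fn` the model's value or likelihood estimate; the structure of the loop is unchanged.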
I'm failing to see how this is novel. It looks like they're doing diffusion on a representation system for 2D graphics, which is very different from an actual program (to be fair, they do address this limitation).
Surprised to see Stuart Russell's name on this, as I thought he was fully consumed by the doomsday cult. Although he's the last author, so he's probably only on it because he's the head of the lab.
Here's a GPTo summary of the paper<p><a href="https://www.emergentmind.com/papers/2405.20519" rel="nofollow">https://www.emergentmind.com/papers/2405.20519</a>