Hi HN,<p>One of the things that frustrates me about Copilot is that all tasks posed to it must be in the form of a completion. By writing clever comments you can get it to generate a few lines of code or a short function body, but you never get coherent long-form generations just from mashing the tab key.<p>I’m working on a different approach. Instead of requiring you to specify your code generation task through stilted comments, you can use GPT-3 to fill in what I call “instructional templates”. They’re like f-strings, except the English goes on the inside and the Python goes on the outside. Additionally, each instruction’s location and surrounding context can aid in interpreting it, allowing instructions to be impressively terse.<p>I’ve collected 10 examples of the method in a Twitter thread here. Most code examples are in Python, but I also demonstrate generating CSV, NDJSON, R, Markdown, and HTML: <a href="https://twitter.com/goodside/status/1559801520773898240?s=21&t=-r-dR8pkhZ3lfCpeLOWqvw" rel="nofollow">https://twitter.com/goodside/status/1559801520773898240?s=21...</a><p>I also have a few examples of more creative, non-program output in HTML and Markdown in this thread: <a href="https://twitter.com/goodside/status/1560953991722418177?s=21&t=-r-dR8pkhZ3lfCpeLOWqvw" rel="nofollow">https://twitter.com/goodside/status/1560953991722418177?s=21...</a><p>Interested in any feedback, especially from anyone who’s tried to apply my method to their own problems.
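To make the mechanics concrete, here is roughly what running such a template against the API looks like; this is only a minimal sketch using the OpenAI Python client of this era, and the template contents and model choice are illustrative, not taken from the thread:<p><pre><code>import openai

template = """Replace the $$dollar-quoted expressions$$ with the content described therein.

util.py
```
$$imports needed below$$

def slugify(title):
    $$lowercase the title, replace runs of non-alphanumerics with single hyphens$$
```
"""

# Send the whole annotated document as the prompt; GPT-3 rewrites it
# with the $$...$$ spans filled in.
completion = openai.Completion.create(
    model="text-davinci-002",
    prompt=template,
    max_tokens=256,
    temperature=0,
)
print(completion["choices"][0]["text"])
</code></pre>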
Seems like this also works with C++. This was my first time using GPT-3 and my first attempt at this, with no fine-tuning and no cherry-picking at all:<p><pre><code> Replace the $$dollar-quoted expressions$$ with the content described therein, or follow the directions they contain.
main.cpp
```
$$c++ includes for output$$
$$Vec3 struct with three float members x, y, z. constructor. dot product implementation. $$
$$main method which allocates two vectors, then prints the dot product of them to standard output$$
```
Answer:
main.cpp
```
#include <iostream>

struct Vec3 {
    float x, y, z;
    Vec3(float x, float y, float z) : x(x), y(y), z(z) {}
    float dot(const Vec3& v) const {
        return x * v.x + y * v.y + z * v.z;
    }
};

int main() {
    Vec3 a(1, 2, 3);
    Vec3 b(4, 5, 6);
    std::cout << a.dot(b) << std::endl;
}
```</code></pre>
Wow, I had no idea this could be done!<p>Since we are on the topic of code generation, I had a question. I built this joke script called Rockstar [0] which generates fake git commits, resulting in a fully green GitHub commit graph. In each commit it adds gibberish; in the last commit it adds valid code. I wanted to know if there’s an easy way to generate realistic-looking code which I can use in each commit. I can’t expect users of the script to use OpenAI or any such API service. Something which can be used to generate code locally would be sweet!<p>[0] - <a href="https://github.com/avinassh/rockstar" rel="nofollow">https://github.com/avinassh/rockstar</a>
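For what it’s worth, I was picturing something as simple as a Markov chain trained on a local corpus of real code, which might look plausible at a glance. A toy sketch of what I mean (the function names and tiny corpus are made up for illustration):<p><pre><code>import random
from collections import defaultdict

def train(lines, order=2):
    # Map each run of `order` tokens to the tokens observed after it.
    model = defaultdict(list)
    for line in lines:
        tokens = line.split()
        for i in range(len(tokens) - order):
            model[tuple(tokens[i:i + order])].append(tokens[i + order])
    return model

def generate(model, order=2, max_tokens=30):
    # Start from a random observed state and walk the chain.
    out = list(random.choice(list(model)))
    for _ in range(max_tokens):
        followers = model.get(tuple(out[-order:]))
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = ["x = a + b", "y = a * b", "total = x + y", "a = x * y"]
print(generate(train(corpus)))
</code></pre>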
I find it fascinating that it seems like there's this emerging field of expertise around how to best interact with these gigantic LMs. I have no idea if this is a passing fad during a gawky adolescent phase of the models, or if this is just a new thing that some people will always be at the cutting edge of.<p>I gather that there's some loose precedent for the latter with chess computers: I've read that serious chess players heavily incorporate computers into their training, and even that hybrid human/computer teams often outperform either alone. I'd love it if someone who actually knows chess commented.<p>Generating code via model sampling seems to have different, or at least exaggerated, imperatives around "few-shot" tuning. One might wish to generate natural language for any number of purposes, but there is probably a stronger "better"/"worse" gradient for code, and much like human language, excellent code is rare as a fraction of all code, not only overall but even within the same company or from the same author. So you probably want the overall contours of "this code compiles" from a big corpus, but to tune up the "this vectorizes well" eigen-tensor from Lemire's repo.<p>Crazy times.
I've been following Riley on Twitter and he's a constant source of fantastic GPT-3 tips, recommended: <a href="https://twitter.com/goodside" rel="nofollow">https://twitter.com/goodside</a>
Just as cybersecurity analyst jobs are being reduced to comparing risk-score numbers, maybe programming jobs will be reduced to 'reviewing' machine-generated code in the future.
I just copied and pasted 3 random LeetCode problem prompts into GPT-3. It successfully generated Python code that passed all test cases for 2 out of the 3 (a representative solution is sketched below the list).<p>Problems passed:
- Two Sum
- Text Justification
edit: Newlines
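For reference, the canonical hash-map solution to Two Sum is only a few lines. This sketch is mine rather than the model's verbatim output, but it's the kind of thing GPT-3 reproduces:<p><pre><code>def two_sum(nums, target):
    # Map each value to its index; return the indices of the complementary pair.
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
</code></pre>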
Thanks for sharing this. I've been playing around with GPT-3 for a bit. Have you tried comparing this method to using the `insert` mode in the Playground?<p><a href="https://beta.openai.com/playground?mode=insert" rel="nofollow">https://beta.openai.com/playground?mode=insert</a><p>On a side note, I found that the limit on the number of tokens was often too restrictive to do anything fun with code generation. Have you run into this issue too?
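For anyone who hasn't tried it: insert mode fills in text between a prompt and a suffix, and the API exposes it via the `suffix` parameter. A minimal sketch (the model choice and toy function are mine, purely illustrative):<p><pre><code>import openai

# Text before the blank goes in `prompt`, text after it in `suffix`;
# the completion is the model's guess at what belongs in between.
response = openai.Completion.create(
    model="text-davinci-002",
    prompt="def mean(xs):\n    ",
    suffix="\n    return total / len(xs)\n",
    max_tokens=64,
    temperature=0,
)
print(response["choices"][0]["text"])
</code></pre>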
Is Copilot now paid-only? I used a trial for a long while, at least long enough to miss it in my workflow once it went paid. Is this really the case? Last I checked it was something like 12 dollars a month. Is it possible for it to still be free, or are there free plugins for IntelliJ that leverage it?<p>I loved it for some scaffolding, quick drafts, or exploring language features, but not enough to pay monthly for it (yet!)
If you take a coding interview and answer the question by feeding it into GPT-3, does that mean you pass the interview? It must do really well, since all it takes is memorizing the solutions to a large body of meaningless challenges.<p>The implication here is that if GPT-3 can solve your coding question, you are hiring people who are good at memorizing solutions, not skillful engineers.
GPT-3 also understands code to some degree, so it can "run" code as instructed.<p><a href="https://mayt.substack.com/p/gpt-3-can-run-code" rel="nofollow">https://mayt.substack.com/p/gpt-3-can-run-code</a>
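A prompt in that spirit might look like this (my own paraphrase, not taken from the post; the lines after "Output:" are what you'd hope the model completes):<p><pre><code>Execute the following Python code and show only what it prints.

for i in range(3):
    print(i * i)

Output:
0
1
4
</code></pre>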
This is using standard prompting, I think?
It'd be neat to try it with the full fill-in-the-blank generation technique (where the blank is in the middle of the input) that some LLMs support. It might work even better!