
Demo of an OpenAI language model applied to code generation [video]

283 points, by cjlovett, almost 5 years ago

48 comments

neil_s, almost 5 years ago

I had trouble accessing the relevant video snippet even after going through the conference registration, so here's a summary.

You can view the demo at https://twitter.com/i/broadcasts/1OyKAYWPRrWKb starting around 29:00.

It's Sam Altman demoing a massive OpenAI model that was trained on GitHub OSS repos using a Microsoft supercomputer. It's not IntelliCode, but the host says that they're working on compressing the models to a size that could be feasible in IntelliCode. The code model uses English-language comments, or simply function signatures, to generate entire functions. Pretty cool.

YeGoblynQueenne, almost 5 years ago

So that's basically program synthesis from natural-language(ish) specifications (i.e. the comments).

I can see this being a useful tool [1]. However, I don't expect any ability for innovation. At best this is like having an exceptionally smart autocomplete function that can look up code snippets on SO for you (provided those code snippets are no longer than one line).

That's not to say that it can't write *new* code, that nobody has quite written before in the same way. But in order for a tool like this to be useful it must stick as close as possible to what is expected, or it will slow development down rather than helping it. Which means it can only do what has already been done before.

For instance: don't expect this to come up with a new sorting algorithm out of the blue, or to be able to write good code to solve a certain problem when the majority of code solving that problem on GitHub happens to be pretty bad.

In other words: everyone can relax. This will not take your job. Or mine.

____________

[1] I apologise to the people who know me and who will now be falling off their chairs. OK down there?

tanilama, almost 5 years ago

I mean, it is cool.

But here's the thing: the natural-language description of a function is not always unambiguous.

When you are telling a function to 'compute XYZ', what you are actually doing is 'check whether X.a exists; if so, execute branch 1), else branch 2)'.

If the logic gets really complicated, then describing it accurately in human language isn't necessarily faster than doing it in code directly. Otherwise we wouldn't need to invent programming languages at all; we could just write compilers that interpret and execute human languages.

And I am interested in whether the model itself is conditioned on the type constraints of the class. It is neat that they picked Python in this case. But if it were Java or another statically typed language, would this system condition its generation not only on the natural text, but also on the resulting type system? My bet, per my understanding of the language-modeling approach they use, is that they are not doing this, due to the very high complexity and cost of training and domain adaptation.

Overall, this again is an interesting demo. But I think for code generation from human language to be useful, you really need to be 99% accurate for it to be remotely practical.

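tanilama's ambiguity point can be made concrete with a toy sketch (the `discount` attribute and function name here are hypothetical, invented for illustration): the one-line English description hides a branch that the spec never states.

```python
# A vague natural-language spec: "compute the adjusted total for x".
# The hidden intent (hypothetical, for illustration) involves a branch
# the one-line description never mentions.
class Item:
    def __init__(self, price, discount=None):
        self.price = price
        self.discount = discount

def adjusted_total(x):
    """Compute the adjusted total for x."""
    # Branch 1: an optional 'discount' attribute exists and is set.
    if getattr(x, "discount", None) is not None:
        return x.price * (1 - x.discount)
    # Branch 2: fall back to the raw price.
    return x.price

print(adjusted_total(Item(100.0, 0.25)))  # 75.0
print(adjusted_total(Item(100.0)))        # 100.0
```

A model reading only the docstring has no way to know branch 1 exists; it can only guess from what similar code in the training data tended to do.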
IdiocyInAction, almost 5 years ago

How does this do compared to other models? Is this a totally cutting-edge result? On the surface, it seems quite impressive, but without an environment to try it out in, I cannot be entirely sure. Still, this does make me question whether I chose a safe career, haha.

The thing is, I'd really need to see a live demo to see how good this is. Making mistakes is actually kind of a big issue; as most people know, debugging code is harder than writing it. And a lot of the language models that can write impressive-seeming text also generate masses of garbage. There's no way to know whether this was cherry-picked or not.

The mere fact that it can extract meaning from text like this is already really impressive, though.

parksy, almost 5 years ago

I have thought about this before, but I can see that logical errors are introduced which must be manually tested and reviewed anyway. So what if a more reliable approach could be achieved by training these models on test cases alongside passing code?

That way developers just write unit tests or functional tests, and the AI generates code and retrains itself until the code passes all tests. This could happen silently in the background as the developer defines the tests.

A number of natural-language test frameworks exist; Behat, for example, lets you define tests such as:

    Feature: Multiple site support

      Background:
        Given a global administrator named "Greg"
        And a blog named "Greg's anti-tax rants"
        And a customer named "Wilson"
        And a blog named "Expensive Therapy" owned by "Wilson"

      Scenario: Wilson posts to his own blog
        Given I am logged in as Wilson
        When I try to post to "Expensive Therapy"
        Then I should see "Your article was published."

      Scenario: Greg posts to a client's blog
        Given I am logged in as Greg
        When I try to post to "Expensive Therapy"
        Then I should see "Your article was published."

It could still fit the dream of describing to a computer what kind of program you want and having it figure out the plumbing.

Anyway, interesting work. Very interesting. I remember a few colleagues laughed at me no more than 5 years ago when I suggested that AI would eventually write code. And here it is, in an early version: flawed, surely, but only set to improve.

Edit to add: This subject, while insanely interesting to me, is well out of my wheelhouse. I'm guessing there's possibly semantic structure to the above that the type of model being used in the demo can't deal with? Like, this one use-case has to co-exist in an entire ecosystem of dependencies and related entities... Could the model cope with that, or is it just calculating the likelihood of the next character like other models I've seen, but with insane accuracy when it comes to code?

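The loop parksy describes — sample a program, run the tests, keep going until everything passes — can be sketched in a few lines. This is a minimal sketch, not the demoed system; `candidates` stands in for a hypothetical model's samples, and real systems would sandbox the `exec` call.

```python
def passes(code, tests):
    """Execute a candidate, then run every test against its namespace."""
    namespace = {}
    try:
        exec(code, namespace)  # unsafe outside a sandbox; fine for a sketch
        for test in tests:
            test(namespace)
        return True
    except Exception:
        return False

def synthesize(candidates, tests):
    """Return the first candidate program that satisfies every test."""
    for code in candidates:
        if passes(code, tests):
            return code
    return None

# Hypothetical model samples for the spec "add two numbers":
candidates = [
    "def add(a, b): return a - b",  # buggy sample
    "def add(a, b): return a + b",  # correct sample
]

def check_add(ns):
    assert ns["add"](2, 3) == 5

print(synthesize(candidates, [check_add]))  # def add(a, b): return a + b
```

The "retrains itself" part would replace the fixed candidate list with fresh samples from the model, possibly conditioned on which tests failed.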
Voloskaya, almost 5 years ago

I'm a bit confused: is this built by OpenAI or Microsoft? Microsoft released the paper "IntelliCode Compose: Code Generation Using Transformer" [1] 4 days ago, and there is no attribution to anyone from OpenAI in it.

Are these two entirely separate and yet exactly similar initiatives?

[1]: https://arxiv.org/abs/2005.08025v1

grensley, almost 5 years ago

Wow, this has the potential to be a total game changer. You have to be really observant about the bugs, though; I would have totally missed the one with the price discount without executing it.

swalsh, almost 5 years ago

These are just baby steps, but holy shit is that impressive. It kind of feels like working with offshore devs, but in real time.

corbins, almost 5 years ago

Mirror: https://twitter.com/i/broadcasts/1OyKAYWPRrWKb

gradys, almost 5 years ago

I worked on a project very much like this last summer: a transformer language model applied to code completion.

You'd be surprised how easy it is to get a model that performs as well as what you see in the video. And it's even easier now that people have built great libraries for fine-tuning generative language models.

I encourage you to try it yourself! There are many interesting extensions for people to explore:

- Use bi-directional context (vanilla GPT-2 only sees backward context).

- Integrate with semantic analysis tools.

- Experiment with different context representations. You condition the model on an arbitrary sequence of N tokens. It's not necessarily the case that you should spend that whole budget on the N tokens that came immediately before. What about including the imports at the top of the file? What about the docstrings for functions that were just used? What about the filepath of the current file?

Don't look at something like this as though watching your job be automated away. Look at it as a tool that you can master and use to move up the stack.

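The context-budget experiment gradys suggests can be sketched as a small assembler that mixes file-level signals with the tokens just before the cursor. This is a toy, not his project's code: whitespace splitting stands in for real subword tokenization, and the priority order (imports, then docstrings, then recent code) is one arbitrary choice among many.

```python
def build_context(imports, docstrings, preceding, budget):
    """Assemble a model context from several sources under a token budget.

    Priority: imports first, then docstrings of recently used functions,
    then as much of the code immediately before the cursor as still fits.
    Whitespace splitting stands in for real subword tokenization.
    """
    def tokens(text):
        return text.split()

    context = []
    for source in (imports, docstrings):
        for line in source:
            t = tokens(line)
            if len(context) + len(t) > budget:
                break
            context.extend(t)
    # Spend whatever remains on the most recent tokens before the cursor.
    remainder = budget - len(context)
    if remainder > 0:
        context.extend(tokens(preceding)[-remainder:])
    return context

ctx = build_context(
    imports=["import os", "import json"],
    docstrings=['"""Load config from disk."""'],
    preceding="def load_config(path): with open(path) as f:",
    budget=16,
)
print(len(ctx))  # 14
```

The interesting research question is which mix of sources most improves completions for a fixed budget, which you can only answer by fine-tuning and measuring.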
mring33621, almost 5 years ago

Amazing!

So the developer's role will shift to:

1) writing good enough descriptions of the code to be generated by the AI model

2) fixing any little issues in the generated code

simonhughes22, almost 5 years ago
This is really cool. However, I doubt it can write more than very simple functions. That may be enough to be useful however. It would be nice if they created a demo page where we could try this out. This use case is a little different than the auto-complete one.
jfoster, almost 5 years ago
I wonder if this could be trained on just bug fix commits from GitHub in order to produce a model that could suggest bug fixes for an existing code base.
symplee, almost 5 years ago

Can this freaky AI also generate the corresponding unit tests?

Or, for TDD, generate the unit tests *first* based on the function name and description. Then, if the dev updates any of those tests, or adds more tests, use that information in auto-generating the appropriate code.

Jach, almost 5 years ago

I don't see it replacing (or even much augmenting) professional programming any time soon... My predicted use case for this is mostly with non-programmers. They'll be instructed to write in English what they want done, and behind the scenes this will attempt to generate code, execute it, and give the results. A fun demo would be writing "Download the recipe on this webpage (paste link) and order the ingredients from Safeway". If it could generate its own billing and shipping storage to remember indefinitely after getting the details from the user, then generate the relevant web-scraping / web-driving or API code for various websites, that'd be pretty sweet.

cjlovett, almost 5 years ago

Hey, now we have a reason to write proper unambiguous code comments :-)

f47il, almost 5 years ago

Relevant section: https://youtu.be/fZSFNUT6iY8

rpiguy, almost 5 years ago

Donald Knuth would be proud! (It appears proper commenting is very important to the AI's ability to generate code.)

chrisco255, almost 5 years ago

Is this a demo of the AI 'autocomplete' tech that they've built into Visual Studio and VS Code?

imranq, almost 5 years ago

Where this would be most useful is automated testing suites, just by specifying what you are testing for. A product manager looking to test the portions of a system that absolutely need to work could write code comments and generate thousands of tests this way.

This is a game changer for ensuring the reliability of software. Many more people can be involved in the software development process and inject their domain knowledge into it.

Are there any plans to open-source the model? I would love to play around with it.

Debonnys, almost 5 years ago

Glad to see it learned to use spaces instead of tabs.

In all seriousness, the demo really looks amazing. I'm curious to see more elaborate, real-world examples, though.

raghavgoyal14, almost 5 years ago
Imagine all the Stackoverflow accepted answers funneled into your code just because the answers were repeatedly used multiple times in the training data.
AJRF, almost 5 years ago

Very cool work.

However, I fear this moves software engineering closer to the role of something like plumbing.

I've despaired at the state of most software I've used since as far back as I can remember, except when it comes to tools that have the maturity of something like Linux, git, Emacs, Vim and the Unix tools.

For software to get good, it needs to be deeply understood by at least one person working on it. If you train an army of warrior drones who get full-line autocompletion, first they'll start forgetting what types a method takes as its parameters; then they'll be less likely to explore codebases, instead plugging in the first autocompletion that comes to their editor.

Their bosses will of course want this in the name of "Getting Shit Done". We already have this sort of divide between developers: those who heavily lean on their tools and those who use minimal editor help. Once you are forced to learn a tool because your tool isn't spoon-feeding you, you have a chance to reason better from first principles using the code you have available. I don't think it's a shock that a very high percentage of the very best developers use Emacs or Vim with minimal tooling.

I am aware that this whole comment has subtle tones of superiority and elitism, and I am genuinely sorry for that, but in my experience it's just true that people who lean really hard on their IDEs to do everything for them are less able to develop creative solutions, and you can tell from having conversations with them that they don't really understand what they are doing.

random32840, almost 5 years ago

Is there an example of something like this, but trained on the actual abstract-syntax-tree manipulations that are going on behind the scenes?

That seems like it would be considerably more effective, because you're removing the noise/overhead of parsing the text and giving the AI a much clearer model of what's being manipulated.

yeldarb, almost 5 years ago

I was very surprised how well it did mimicking the StackOverflow archives when I trained GPT-2 on them last year: https://stackroboflow.com (Only the 345M weights were released back then; now I'm curious how much better 1.5B would do.)

Avi-D-coder, almost 5 years ago

GPT-2 is known to be unable to track and bind variables; scaling purely associative models beyond trivial examples is going to be difficult, or more likely impossible.

This will end up being a better TabNine. Models like GPT-2 are still just approximating intelligence; they are not rationally cognizing.

unixhero, almost 5 years ago

Uhm. What if you could use this to produce code to improve ML libraries? Quite recursive, or what?

brenden2, almost 5 years ago

I can't even imagine what it's like to have so much money that you can spend time working on things like this which are so incredibly unlikely to ever become useful. Congrats, and I hope you guys discover a great product some day.

neatze, almost 5 years ago

Not going to get my hopes up, but looking forward to automated test generation.

Bjorkbat, almost 5 years ago
Looks cool. If you want to temper your expectations though, play some AI Dungeon.
woile, almost 5 years ago

Can someone explain to me how this kind of software is shared? Would I need to train it again, or are the trained models usually provided?

Is this one in particular open source?

monkeydust, almost 5 years ago

As a product person, I'm wondering how much more productive this will make my engineers. On the surface it looks impressive.

sabujp, almost 5 years ago

I think having something autogenerate tests would be a good first step.

boolcow, almost 5 years ago

When is OpenAI planning to actually solve a hard problem? They have spent a huge amount of money and time creating useless demos so far.

Creating flashy AI demos is relatively easy. Creating important AI products that actually operate in the real world is the difficult part.

debbiedowner, almost 5 years ago
How can I try it? And what is the compute cost?
mirekrusin, almost 5 years ago
Comment driven development, nice.
master_yoda_1, almost 5 years ago

The title is misleading.

pdeligia, almost 5 years ago
This is super cool!
bobly_today, almost 5 years ago

So are we all going to be out of a job?

darepublic, almost 5 years ago

I don't want to believe.

testeur, almost 5 years ago
def is_even(x):
rauf11, almost 5 years ago
find odd numbers from list
alpb, almost 5 years ago

I tried signing in with my Microsoft account as well; nope, they definitely want you to fill out a registration form for the Build conference (https://register.build.microsoft.com/). Not gonna happen. I hope they learn not to paywall conferences of this kind; their competition just puts it out on YouTube live.

consultutah, almost 5 years ago

Is there any way to get to the video? I'm registered for Build, but the page is all but empty...

cjlovett, almost 5 years ago

Kevin Scott demos a new AI that is writing code in collaboration with a developer... very cool!

datlife, almost 5 years ago

I can't see anything.

ipsum2, almost 5 years ago

Is there a way to watch this without an account?

Vysero, almost 5 years ago

I would much rather have an AI that is capable of interpreting what I say as code. So if I say:

Build me a class which computes the larger of two integers.

the AI is smart enough to write it.

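For reference, a literal reading of Vysero's request is only a few lines; the class and method names below are this editor's choice, not output from the demoed model.

```python
class Comparator:
    """Computes the larger of two integers, per the natural-language spec."""

    def larger(self, a: int, b: int) -> int:
        # Return whichever argument is greater; on a tie either is correct.
        return a if a >= b else b

print(Comparator().larger(3, 7))  # 7
```

Which is the point several commenters above make: for specs this small, writing the English is not much faster than writing the code.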