TechEcho

Competitive Programming with AlphaCode

678 points by yigitdemirag · over 3 years ago

51 comments

FiberBundle · over 3 years ago

It never ceases to amaze me what you can do with these transformer models. They created millions of potential solutions for each problem, used the provided examples for the problems to filter out 99% of incorrect solutions, and then applied some more heuristics and the 10 available submissions to try to find a solution.

All these approaches just seem like brute force: let's just throw our transformer at this problem and see if we can get anything useful out of it.

Whatever it is, you can't deny that these unsupervised models learn some semantic representations, but we have no clue at all what those actually are or how these models learn them. But I'm also very sceptical that you can actually get anywhere close to human (expert) capability in any sufficiently complex domain using this approach.
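The generate-and-filter loop described in this comment can be sketched in a few lines. This is a toy illustration under my own assumptions, not AlphaCode's actual pipeline: `candidates`, `examples`, and the helper names are all invented here, and callables stand in for sampled source programs.

```python
# Toy sketch of filtering sampled programs against a problem's example I/O.
# Each "program" is a callable; the real system samples source code and runs it.

def passes_examples(program, examples):
    # Keep a program only if it reproduces every provided example output.
    return all(program(inp) == out for inp, out in examples)

def filter_candidates(candidates, examples):
    return [p for p in candidates if passes_examples(p, examples)]

# Toy problem: "return the sum of a list of ints", with two example pairs.
examples = [([1, 2, 3], 6), ([10], 10)]
candidates = [sum, len, max]  # stand-ins for millions of sampled programs
survivors = filter_candidates(candidates, examples)  # only `sum` survives
```

As the comment notes, the example pairs prune most wrong programs, but any program that merely agrees on the few provided pairs also survives, which is why further heuristics are needed.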
doctor_eval · over 3 years ago

I sometimes read these and wonder if I need to retrain. At my age, I'll struggle to get a job at a similar level in a new industry.

And then I remember that the thing I bring to the table is the ability to turn domain knowledge into code.

Being able to do competitive coding challenges is impressive, but a very large segment of software engineering is about eliciting what the squishy humans in management actually want, putting it into code, and discovering as quickly as possible that it's not what they really wanted after all.

It's going to take a sufficiently long time for AI to take over management that I don't think oldies like me need to worry too much.
FemmeAndroid · over 3 years ago

This is extremely impressive, but I do think it's worth noting that these two things were provided:

- A very well defined problem. (One of the things I like about competitive programming and the like is just getting to implement a clearly articulated problem, not something I experience on most days.)
- Existing test data.

This is definitely a great accomplishment, but I think those two features of competitive programming are notably different from my experience of daily programming. I don't mean to suggest these will always be limitations of this kind of technology, though.
msoad · over 3 years ago

This seems to have a narrower scope than GitHub Copilot. It generates more lines of code for a more holistic problem, whereas GitHub Copilot works as a "more advanced autocomplete" in code editors. Sure, Copilot can synthesize full functions and classes, but for me it's most useful when it suggests another test case's title or writes repetitive code like this.foo = foo; this.bar = bar, etc.

Having used Copilot, I can assure you that this technology won't replace you as a programmer, but it will make your job easier by doing the things programmers don't like to do as much, like writing tests and comments.
gfd · over 3 years ago

Relevant blog post on codeforces.com (the competitive programming site used): https://codeforces.com/blog/entry/99566

Apparently the bot would have a rating of 1300. Although Elo ratings between sites are not comparable, for some perspective, Mark Zuckerberg had a rating of ~1k on TopCoder when he was in college: https://www.topcoder.com/members/mzuckerberg
ahgamut · over 3 years ago

I find almost every new advance in deep learning is accompanied by contrasting comments: it's either "AI will soon automate programming/<insert task here>", or "let me know when AI can actually do <some-difficult-task>". There are many views on this spectrum, but these two are sure to be present in every comment section.

IIUC, AlphaCode was trained on GitHub code to solve competitive programming challenges on Codeforces, some of which are "difficult for a human to do". Suppose AlphaCode was trained on GitHub code that contains the entire set of solutions on Codeforces: is it actually doing anything "difficult"? I don't believe it would be difficult for a human to solve problems on Codeforces when given access to the entirety of GitHub (indexed and efficiently searchable).

The general question I have been trying to understand is this: is the ML model doing something that we can *quantify* as "difficult to do (given this particular training set)"? I would like to compute a number that measures how difficult it is for a model to do task X given a large training set Y. If X is part of the training set, the difficulty should be *zero*. If X is obtained only by combining elements of the training set, maybe it is harder to do. My efforts to answer this question: https://arxiv.org/abs/2109.12075

In recent literature, the RETRO Transformer (https://arxiv.org/pdf/2112.04426.pdf) talks about "quantifying dataset leakage", which is related to what I mentioned in the above paragraph. If many training samples are also in the test set, what is the model actually learning?

Until deep learning methods provide a measurement of "difficulty", it will be difficult to gauge the prowess of any new model that appears on the scene.
37ef_ced3 · over 3 years ago

The example problem (essentially, is T a subsequence of S with deletions of size N) is a classic problem with no doubt dozens of implementations in AlphaCode's training set.

And yet, what a garbage solution it produces.

To illustrate the difference between intelligence and regurgitation, someone tell me what Copilot generates for this:

```go
// A Go function to swap the sixth bit and seventeenth bit of a 32-bit signed integer.
```

Here is a human solution:

```go
func swap(x int32) int32 {
	const mask = 1 << 5
	var (
		xor1 = (x>>11 ^ x) & mask
		xor2 = xor1 << 11
	)
	return x ^ xor1 ^ xor2
}
```

Copilot cannot reason numerically like this (understand "seventeenth bit" and "sixth bit" and generate the right code for that combination). It needs to understand the size of the gap between the bits, i.e., 11, and that's too hard.
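For what it's worth, the XOR trick in the human solution can be checked exhaustively against a naive bit swap. Below is my own quick port to Python (the function names and test harness are mine, not the commenter's code); it swaps bit 5 and bit 16, i.e. the sixth and seventeenth bits counting from one.

```python
def swap_xor(x):
    # Port of the Go solution: swap bits 5 and 16 (0-indexed) via XOR.
    mask = 1 << 5
    xor1 = ((x >> 11) ^ x) & mask  # 1 at position 5 iff bits 5 and 16 differ
    xor2 = xor1 << 11              # the same difference bit, at position 16
    return x ^ xor1 ^ xor2         # flipping both positions swaps the bits

def swap_naive(x):
    # Obviously-correct reference: extract, clear, and reinsert both bits.
    b5, b16 = (x >> 5) & 1, (x >> 16) & 1
    x &= ~((1 << 5) | (1 << 16))
    return x | (b16 << 5) | (b5 << 16)

# Exhaustive check over all 17-bit values (covers both affected bits).
assert all(swap_xor(x) == swap_naive(x) for x in range(1 << 17))
```

The XOR formulation works because flipping two bits that differ is the same as swapping them, and flipping bits that agree is a no-op.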
jakey_bakey · over 3 years ago

At the risk of sounding relentlessly skeptical: surely by training on GitHub data you're not actually creating an AI that solves problems, but creating an extremely obfuscated database of coding-puzzle solutions?
hmate9 · over 3 years ago

Between this and GitHub Copilot, "programming" will probably start slowly dying. What I mean is: sure, you still have to learn how to program, but our time will be spent much more on the design part and on writing detailed documentation/specs, and then we just have one of these AIs generate the code.

It's the next step: binary code < assembly < C < Python < AlphaCode.

Historically it's always been about abstracting and writing less code to do more.
mirrorlake · over 3 years ago

I've been wondering this for a while:

In the future, code-writing AI could be tasked with generating the most reliable and/or optimized code to pass your unit tests. Human programmers would decide what we want the software to do, make sure we find all the edge cases and define as many unit tests as possible, and let the AI write significant portions of the product. Not only that, but you could include benchmarks that pit the AI against itself to improve runtime or memory performance. Programmers could spend more time thinking about what they want the final product to do, rather than getting mired in mundane details, and be guaranteed that portions of the software will perform extremely well.

Is this a naive fantasy on my part, or actually possible?
algon33 · over 3 years ago

How surprising did you all find this? If I'd been asked to predict things beforehand, I'd have said there was a 20% chance of this performing at the median+ level.
agentultra · over 3 years ago

This is kind of neat. I wonder if it will one day be possible for it to find programs that maintain invariant properties we state in proofs. This would allow us to feel confident that even though it's generating huge programs that do weird things a human might not think of... it's still *correct* for the stated properties we care about, i.e., it's not doing anything underhanded.
qualudeheart · over 3 years ago

Calling it now: if current language models can solve competitive programming at an average human level, we're only a decade or less away from competitive programming being as solved as Go or chess.

DeepMind or OpenAI will do it. If not them, it will be a Chinese research group on par with them.

I'll be considering a new career. It will still be in computer science, but it won't involve writing a lot of code. There'll be several new career paths made possible by this technology, as greater worker productivity makes possible greater specialization.
d0mine · over 3 years ago

It reminds me that the median reputation on StackOverflow is 1. All "AlphaSO" would have to do is register to receive median reputation on SO ;) (Kidding aside, AlphaCode sounds like magic.)

Inventing relational DBs didn't replace programmers; we just write custom DB engines less often. Inventing electronic spreadsheets didn't deprecate programmers; it just means we don't need programmers for the corresponding tasks (where spreadsheets work well).

AI won't replace programmers until it grows to replace humanity as a whole.
londons_explore · over 3 years ago

> AlphaCode placed at about the level of the median competitor

In many programming contests, a large number of people can't solve the problems at all and drop out without submitting anything. Frequently that means the median-scoring submission is a blank file.

Therefore, without further information, this statement shouldn't be taken to be as impressive as it sounds.
aidenn0 · over 3 years ago

> Creating solutions to unforeseen problems is second nature in human intelligence

If this is true, then a lot of the people I know lack human intelligence...
blt · over 3 years ago

I am always surprised by the amount of skepticism towards deep learning on HN. When I joined the field around 10 years ago, image classification was considered a grand-challenge problem (e.g. https://xkcd.com/1425/). 5 years ago, only singularity-enthusiast types were envisioning things like GPT-3 and Copilot in the short term.

I think many people are uncomfortable with the idea that their own "intelligent" behavior is not that different from pattern recognition.

I do not enjoy running deep learning experiments. Doing resource-hungry empirical work is not why I got into CS. But I still believe it is very powerful.
mwattsun · over 3 years ago

Seems to me that this accelerates the trend towards a more declarative style of programming, where you tell the computer what you want to do, not how to do it.
BoardsOfCanada · over 3 years ago

Do I understand correctly that in the end it generated ten solutions, which were then examined by humans and one picked? Still absolutely amazing though.
erwincoumans · over 3 years ago

It would be interesting if a future 'AlphaZeroCode' with access to a compiler and debugger could learn to code, generating data using self-play. Haven't read the paper yet; it seems like an impressive milestone.
mrsuprawsm · over 3 years ago

Does this mean that we can all stop grinding leetcode now?
rabbits77 · over 3 years ago

What I always find missing from these deep learning showcases is an honest comparison to existing work. It isn't like computers haven't been able to generate code before.

Maybe the novelty here is working from an English-language specification, but I am dubious about just how useful that really is. Specifications are themselves hard to write well, too.

And what if the "specification" were some Lisp code testing a certain goal: is this any better than existing genetic programming?

Maybe it is better, but to my mind it is suspicious that no comparison is made.

I love deep learning, but nobody does the field any favors by overpromising and exaggerating results.
EGreg · over 3 years ago

To me, coding in imperative languages is one of the hardest things to produce an AI for with current approaches (CNNs, MCTS, and various kinds of backpropagation). Something like Cyc would seem to be a lot more promising...

And yet, I am starting to see (with GitHub's Copilot, and now this) a sort of "GPT-4 for code". I do see many problems with this, including:

1. It doesn't actually "invent" solutions on its own like AlphaZero; it just uses and remixes from a huge body of work that humans put together.

2. It isn't really ever sure if it solved the problem, unless it can run against a well-defined test suite, because it could have subtle problems in both the test suite and the solution if it generated both.

This is a bit like readyplayer.me trying to find the closest combination of noses and lips to match a photo (do you know any open-source alternatives to that site, btw?)

But this isn't really "solving" anything in an imperative language.

Then again, perhaps human logic is just an approximation built from operations on low-dimensional vectors, able to capture simple "explainable" models, while AI classifiers and adversarial training produce far bigger vectors that help model the "messiness" of the real world and also find simpler patterns as a side effect.

In that case, maybe our goal shouldn't be to get solutions in the form of imperative language or logic, but rather to unleash the computer on "fuzzy" inputs and outputs where things are "mostly correct 99.999% of the time". The only area where this could fail is when some intelligent adversarial network exploits weaknesses in that 0.001% and makes it more common. But for natural phenomena it should be good enough!
timetotea · over 3 years ago

If you want a video explanation: https://youtu.be/Qr_PCqxznB0
errcorrectcode · over 3 years ago

And this is how we reach the technological singularity, and how programmers become as out-of-demand as piano tuners: self-programming systems.

AI will eat any and all knowledge work, because there's very little a human can do that a machine won't eventually be able to do, much faster and better. It won't be tomorrow, but the sands are inevitably shifting this way.
prideout · over 3 years ago

It is obvious to me that computer programming is an interesting AI goal, but at the same time I wonder if I'm biased, because I'm a programmer. The authors of AlphaCode might be biased in this same way.

I guess this makes sense from a practical point of view, though. Verifying correctness would be difficult in other intellectual disciplines like physics and higher mathematics.
udev · over 3 years ago

I am wondering whether this result can create a type of loop that self-optimizes.

We have an AI that generates reasonable code from a text problem description.

Now what if the problem description is to generate such a system in the first place?

Would it be possible to close the loop, so to speak, so that over many iterations:

- the text description is improved
- the output code is improved

Would it be possible to create something that converges to something better?
knowmad · over 3 years ago

I agree with most of the comments I've read in this thread. Writing code to solve a well defined, narrowly scoped problem isn't that hard or valuable. Determining what the problem actually is, and how software could be used to solve it, is what's challenging and valuable.

I would really like to see more effort in the AI/ML code generation space put into things like code review and system observation. It seems significantly more useful to use these tools to augment human software engineers than to tackle the daunting and improbable task of completely replacing them.

*Note: as a human software engineer, I am biased.
thomasahle · over 3 years ago

Next they can train it on Kaggle, and we'll start getting closer to the singularity.
tasubotadas · over 3 years ago

I just hope this shows how useless competitive programming is, given that it can be replaced by a transformer model.

Additionally, people should REALLY rethink their coding interviews if they can be solved by a program.
derelicto · over 3 years ago

Hey, honest question: how does one get into competitive programming? I imagine it goes far beyond just leetcoding, but honestly I don't even know where to start.
throwaway5752 · over 3 years ago

Most people here are programmers (or otherwise involved in the production of software). We shouldn't look at RPA and other job-automation trends dispassionately. SaaS valuations aren't where they are (and accounting doesn't treat engineering salary as cost of goods sold) because investors believe they will require armies of very well paid developers in perpetuity.
a-dub · over 3 years ago

> In our preprint, we detail AlphaCode, which uses transformer-based language models to generate code at an unprecedented scale, and then smartly filters to a small set of promising programs

If you're using a large corpus of code chunks from working programs as symbols in your alphabet, I wonder how much entropy there actually is in the space of syntactically correct solution candidates.
deepbream · over 3 years ago

This result is well worth a meme.

https://opensea.io/assets/0x495f947276749ce646f68ac8c248420045cb7b5e/38800416672363094847602926489336820944788560867702800329357993734324516552705/
nsikorr · over 3 years ago

I suspect these code-generating AIs will bring the singularity at some point in the future. Even if we don't manage to create an artificial general intelligence, they will. I imagine they will learn to code at superhuman levels through self-play, just like AlphaGo and AlphaZero did. This will be awesome.
xibalba · over 3 years ago

Between developments like this and Copilot (is there a generally accepted term for this class of things, e.g. "AI coders"?), and the move toward fully remote work, I predict the mean software engineering salary in the United States will be lower in 10 years (in real dollars) than it is today.
dantodor · over 3 years ago

Great. Now the only thing remaining is POs being able to come up with a clear spec, and I'm out of a job.
thorwwaskeas · over 3 years ago

Since they used the tests, this is not something you can do if you don't have a rich battery of tests.

Perhaps many problems are something like finite automata, and the program discovers the structure of the automaton and also an algorithm for better performance.
YeGoblynQueenne · over 3 years ago

>> AlphaCode ranked within the top 54% in real-world programming competitions, an advancement that demonstrates the potential of deep learning models for tasks that require critical thinking.

Critical thinking? Oh, wow. That sounds amazing!

Let's read further on...

>> At evaluation time, we create a massive amount of C++ and Python programs for each problem, orders of magnitude larger than previous work. Then we filter, cluster, and rerank those solutions to a small set of 10 candidate programs that we submit for external assessment.

Ah. That doesn't sound like "critical thinking", or any thinking. It sounds like massive brute-force guessing.

A quick look at the arXiv preprint linked from the article reveals that the "massive" number of programs generated is in the millions (see Section 4.4). These are "filtered" by testing them against the program input/output (I/O) examples given in the problem descriptions. This "filtering" still leaves a few thousand candidate programs, which are further reduced by clustering to "only" 10 (which are finally submitted).

So it's a generate-and-test approach rather than anything to do with reasoning (as claimed elsewhere in the article), let alone "thinking". But why do such massive numbers of programs need to be generated? And why are there still thousands of candidate programs left after "filtering" on I/O examples?

The reason is that the generation step is constrained by the natural-language problem descriptions, but those are not enough to generate appropriate solutions, because the generating language model doesn't understand what the problem descriptions mean; so the system must generate millions of solutions, hoping to "get lucky". Most of those don't pass the I/O tests, so they must be discarded. But there are only very few I/O tests for each problem, so there are many programs that can pass them and still not satisfy the problem spec. In the end, clustering is needed to reduce the overwhelming number of pretty much randomly generated programs to a small number. This is a method of generating programs that's not much more precise than drawing numbers at random from a hat.

Inevitably, the results don't seem to be particularly accurate, hence the evaluation against programs written by participants in coding competitions, which is not an objective measure of program correctness. Table 10 in the arXiv preprint lists results on a more formal benchmark, the APPS dataset, where it's clear that the results are extremely poor (the best-performing AlphaCode variant solves 20% of the "introductory"-level problems, though it outperforms earlier approaches).

Overall, pretty underwhelming, and a bit surprising to see such lackluster results from DeepMind.
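The clustering step this comment describes can be sketched as grouping the surviving programs by their outputs on extra probe inputs, then submitting one representative per behavioral cluster. This is a toy illustration under my own assumptions (the function names and probe inputs are invented here; as I understand the preprint, the real system generates the extra inputs with a separate model):

```python
from collections import defaultdict

def cluster_by_behavior(programs, probe_inputs):
    """Group programs that behave identically on the probe inputs."""
    clusters = defaultdict(list)
    for prog in programs:
        signature = tuple(prog(x) for x in probe_inputs)
        clusters[signature].append(prog)
    return list(clusters.values())

# Toy candidates for "double the input": the first two are behaviorally
# identical; the third only agrees with them at n = 0 and n = 2.
candidates = [
    lambda n: n * 2,
    lambda n: n + n,
    lambda n: n ** 2,
]
groups = cluster_by_behavior(candidates, probe_inputs=[0, 1, 3])
submissions = [group[0] for group in groups]  # one representative per cluster
```

The point the comment makes survives in the sketch: clustering only deduplicates behavior on the probes; it cannot tell which cluster actually satisfies the problem spec.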
mcast · over 3 years ago

The year is 2025. Google et al. are now conducting technical on-site interviews purely with AI tools and no human bias behind the camera (aside from GPT-3's quirky emotions). The interview starts with an LC hard; you're given 20 minutes. Good luck!
softwaredoug · over 3 years ago

I think Copilot etc. will be revolutionary tools AND I think human coders are needed. Specifically, I love Copilot for the task of "well-specified algorithm to solve a problem with well-defined inputs and outputs": the kind of problem you could describe as a coding challenge.

BUT, our jobs have a lot more complexity:

- Local constraints: we almost always work in a large, complex existing code base with specific constraints.
- Correctness is hard: writing lots of code is usually not the hard part; it's proving it correct against amorphous requirements, communicated in a variety of human social contexts, and bookmarked.
- Precision is extremely important: even if 99% of the time Copilot can spit out a correct solution, the 1% of the time it doesn't creates a bevy of problems.

Are those insurmountable problems? We'll see, I suppose, but we begin to verge on general AI if we can gather and understand half a dozen modalities of social context to build a correct solution.

Not to mention that much of the skill needed in our jobs has much more to do with soft skills, and bridging the technical and the non-technical, and less to do with hardcore heads-down coding.

Exciting times!
jdrc · over 3 years ago

I think it would be interesting to train a system end-to-end with assembly code instead of various programming languages. This would make it a much more generic compiler.
wilde · over 3 years ago

Oh sweet! When can we skip the bullshit puzzle phone screens?
alasdair_ · over 3 years ago

The interesting stuff happens once AlphaCode gets used to improve the code of AlphaCode.
jdrc · over 3 years ago

"And so in 2022 the species programmus programmicus went extinct"
NicoJuicy · over 3 years ago

I would stop programming if all we needed to write was unit tests :p
pedrobtz · over 3 years ago

What about finding bugs and zero-day exploits?
zmmmmm · over 3 years ago

Has nobody yet asked it to write itself?
pretendscholar · over 3 years ago

I am a little bitter that it is trained on stuff I gave away for free and will be used by a billion-dollar company to make more money. I contributed the majority of that code before GitHub was even owned by Microsoft.
ensan · over 3 years ago

Wake me up when an AI creates an operating system with the same level of functionality as early-years Linux.
jonas_kgomo · over 3 years ago

Genuine question: what are the reasons to be a software engineer without much ML knowledge in 2022? Seems like a wake-up call for developers.