What amazes me is how predictable(?) all of the recent issues were.<p>Don't get me wrong, the folks behind Copilot are clearly, without any doubt, smart, creative, and capable. But then... none of these issues (reproducing licensed code verbatim, non-compiling code, getting semantics wrong, and now this) are 0.01% edge cases that take specialized knowledge to spot or trigger. I remember some of them being called out days ago in the initial HN thread by people who didn't even have beta access.<p>I really wonder what this announcement/rollout looked like on the management side of things. Because a) these shortcomings must have been known beforehand, and b) backlash from people who feel their jobs are threatened or their open source work "stolen" was (I guess) foreseeable? I've already read calls to abandon GitHub for competitors; this can hardly have been an acceptable outcome.<p>Nevertheless, Copilot is still one of the most innovative and interesting products I've seen in a while.
Unintentional copyright violations and “leaking” of secrets people accidentally committed to public repos aside, my main issue with Copilot is that I don’t think it actually makes coding easier.<p>Everyone knows it’s usually far easier to write code than to read code. Writing code is a nonlinear process: you don’t start from the first character and write everything out in one single pass. Instead, the logic of the code evolves nonlinearly—add a bit here, remove a bit there, restructure a bit over there. Good code is written such that it can be mostly understood in a single pass, but this is not always possible. For example, understanding function calls requires jumping around the code to where the function is defined (and often deeper down the stack). Understanding a conditional with multiple branches requires first reading all the conditional predicates before reading the code blocks they lead to.<p>Reading, on the other hand, is naturally a linear process. Understanding code requires reconstructing the nonlinear flow through it, and the nonlinear thought process used to write it in the first place. This is why constant communication between partners during pair programming is essential—if too much unexplained code gets dumped on a partner, figuring out how it works takes longer than just writing it themselves.<p>Copilot is like pair programming with a completely incommunicative partner who can’t walk you through the code they just wrote. You therefore still have to review most of it manually, which takes much longer than writing it yourself in the first place.
Can we please stop (mis)using the term "AI"? It just does not live up to most people's expectations.<p>Copilot is a glorified Markov-chain autocomplete sitting on a huge pile of data. It is not aware of constructs such as "licenses" or "secrets", which is what most people would expect from an "AI". To prevent it from spilling secrets everywhere, a developer <i>~~should teach the AI a concept of secrets and the meaning of licenses~~</i> has to implement a filter. A regexp-based one will do, I guess.
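The regexp-based filter alluded to above could look something like this minimal sketch. The patterns are illustrative assumptions (loosely modeled on common key formats), not GitHub's actual scanning rules:

```python
import re

# Illustrative patterns only -- real secret scanners ship many more rules,
# often keyed to provider-specific prefixes.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS-style access key ID
    re.compile(r"(?i)(secret|api)_?key\s*[:=]\s*['\"][A-Za-z0-9]{8,}['\"]"),
]

def redact_secrets(suggestion: str) -> str:
    """Replace anything in a code suggestion that looks like a credential."""
    for pattern in SECRET_PATTERNS:
        suggestion = pattern.sub("<REDACTED>", suggestion)
    return suggestion
```

Such a filter would run over each suggestion before it ever reaches the editor; the obvious weakness is that it only catches formats someone thought to write a pattern for.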
> SendGrid engineer reports API keys generated by the AI are not only valid but still functional.<p>> GitHub CEO acknowledges the issue... still waiting for them to pull the plug<p>I agree this is an issue for Copilot <i>as well</i> - but isn't it really on SendGrid to invalidate keys that are known to be leaked?<p>Yes, that's inconvenient for the affected customers - on the other hand, they won't get billed for other people's usage - or dinged for someone spamming using their keys...
It does not <i>generate</i> secrets. The Twitter conversation does not mention that word.
Most certainly, it regurgitates secrets it has seen on crawled repos. Can the title be adjusted, please?
It's really kind of comical at this point. The more this copilot bs continues to be a thing, the more it's making Github seem irresponsible/careless at best.
I'm kind of astonished that this project got greenlit, given Microsoft's previous experiences with embarrassing AI projects (thinking particularly of Tay and Zo).
It is one thing to accidentally put your API key in your public GitHub repository.<p>And it's another (bigger) issue for Copilot to pick up that API key and put it in someone else's project.
I see this as a problem with the developers who are committing code, not a problem with Copilot. If you make your secrets accessible, then they might be accessed. Rotating your keys regularly would also mitigate these issues. This is a problem of humans failing to follow known security best practices, not of malicious AI doing something insidious.
If Copilot was trained only on public repos like they claim, then shouldn't those API keys already be disabled due to existing secret scanning tools?<p>For example <a href="https://docs.github.com/en/code-security/secret-security/about-secret-scanning" rel="nofollow">https://docs.github.com/en/code-security/secret-security/abo...</a><p>The fact that Copilot recreates API keys that still work makes me wonder if they come from a semi-public place, because SendGrid is usually quite fast at blocking API keys that were accidentally made public.
People put valid secrets in their public repository all the time.<p>Just a quick search:<p><a href="https://grep.app/search?q=%28secret%7Capi%29_%3Fkey%5Cs%3A%3F%3D%20%5B%22%27%5D%5Ba-zA-Z0-9%5D%7B8%2C%7D%22&regexp=true" rel="nofollow">https://grep.app/search?q=%28secret%7Capi%29_%3Fkey%5Cs%3A%3...</a>
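A slightly simplified version of the pattern behind that grep.app search can be sketched in Python, to show the kind of hardcoded-key assignment it flags (the sample lines are hypothetical):

```python
import re

# Simplified form of the linked search: a "secret_key"/"api_key" assignment
# followed by a quoted token of 8+ alphanumeric characters.
pattern = re.compile(r"(secret|api)_?key\s*:?=\s*[\"'][a-zA-Z0-9]{8,}")

lines = [
    'api_key = "abcdef123456"',      # flagged: hardcoded key
    'secret_key := "s3cr3tV4lu3X"',  # flagged (Go-style assignment)
    'api_key = os.environ["KEY"]',   # not flagged: read from environment
]
hits = [line for line in lines if pattern.search(line)]
```

Of course a pattern this loose also matches dummy placeholders, which is part of why automated scanning alone can't tell a live key from a fake one.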
I wish he had tried to track down whether the keys were in a public repo before asking SendGrid about them. If they turned out to exist only in GitHub private repos, that would be new and interesting info.<p>Not that putting keys in a private but third-party-hosted repo is a terrific idea either.
<a href="https://web.archive.org/web/20210705123028/https://twitter.com/alexjc/status/1411966249437995010" rel="nofollow">https://web.archive.org/web/20210705123028/https://twitter.c...</a><p>> COPILOT SECURITY BREACH<p>> SendGrid engineer reports API keys generated by the AI are not only valid but still functional.<p>> GitHub CEO acknowledges the issue... still waiting for them to pull the plug or make a comment. :popcorn:<p>Quoting <a href="https://twitter.com/pkell7/status/1411058236321681414" rel="nofollow">https://twitter.com/pkell7/status/1411058236321681414</a>
I don't consider this a problem. Copilot was trained on public repos, so these secrets had to be checked into public repos. They were already totally public, and should have been invalidated/replaced and redacted. Copilot might result in previously undiscovered published secrets being found, but that's not much worse than anyone finding one under normal circumstances.
Grand source code theft. A permanent stain on GitHub?<p>They should scrap it, and Microsoft should be ordered to sell GitHub because they have a conflict of interest.<p>For example, Microsoft has access to your private repos and can do things like Copilot with your data.
Who knows maybe your code powers Windows 11 now.
The only time I would consider this a valid security issue is if those tokens were previously not public. But that should not be the case, right?
There truly is an XKCD for everything: <a href="https://xkcd.com/2169/" rel="nofollow">https://xkcd.com/2169/</a>
I do feel for the people behind Copilot, even though they'll have known it was coming. They produce something <i>absolutely friggin' amazing</i> that can change the world, and for the next few days all everyone does is pile on and pull it to pieces... Yes, of course these are valid issues, but can we please look at the big picture and appreciate what an achievement this is?
So GitHub Copilot has inherited all the bad practices of many Stack Overflow and GitHub side projects and generates them in front of you as 'assistance'.<p>All the API keys are still working and, who knows, someone might complain about a huge bill right here because they forgot to revoke a key. Only time will tell.<p>I am certainly going to avoid this contraption. No thanks, and most certainly no deal.<p>Downvoters: So are you saying GitHub Copilot DOES NOT do the following:<p><pre><code> Leak working API keys into the editor.
</code></pre>
Generate broken code AND give you the wrong implementation if you add a single typo.
Copy and regurgitate copyrighted code verbatim.
Guess right only 1 out of 10 tries.
Send parts of your code to GitHub as you type in the editor.
</code></pre>
Are you VERY sure?