Seems this article completely misses the benefits of Copilot. It's a massive step forward in productivity. For me, it's about suggesting the proper syntax across the various libraries we use. It really does cut my time by tens of percent.

I don't buy the argument that the risk from a yet-to-be-litigated case against a different company, one that will certainly fight it hard, outweighs the productivity gain of using Copilot.

Additionally, the security argument feels ridiculous to me. We lift code examples from gists and Stack Overflow ALL THE TIME! But a good dev doesn't just paste it in and go; we review the snippet to make sure it's secure. Same thing with Copilot: of course it's going to write buggy/insecure code sometimes, but instead of my going to Stack Overflow for a snippet, it's suggested in my IDE with my current context.
Context: Kolide just launched a "GitHub Copilot Check" which you can get (along with other features) for $7/device/month. The article is marketing, an attempt to induce demand among CTOs for an already-developed product.

That said: I generally agree with the assessment. GitHub should at the very least be telling users when it is generating code that it trained on. Until it does that, it's kind of dangerous to use. The security stuff is IMO more of a red herring.

But the more important point is that you can just wait a year and hire a consultant to build a better product (for you) at pretty low cost. Within a year, any organization with a non-trivial number of developers will have the option of hosting its own model trained on The Stack (all permissively licensed) and fine-tuned on its internal code or its chosen stack. That's probably the best path forward for most organizations. If you can afford $7/dev/month for Slack-integrated nannybots, you can definitely afford to pay a consultant/contractor to set up a custom model and get the best of both worlds: not giving MSFT your company's IP while also improving your devs' productivity and happiness beyond what a generic product could deliver.
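To make the "host your own model" option concrete, here is a minimal sketch of fine-tuning an open code model on an internal codebase with the Hugging Face Trainer API. The base model name, file paths, and hyperparameters are placeholder assumptions, not recommendations:

    # Hypothetical sketch: fine-tune an open code model on internal code.
    # Model name, paths, and hyperparameters are placeholder assumptions.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    BASE_MODEL = "some-org/stack-trained-model"  # placeholder name

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

    # Load internal source files as plain-text training data.
    dataset = load_dataset("text", data_files={"train": "internal_code/**/*.py"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="in-house-copilot",
                               num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=tokenized,
        # mlm=False gives standard next-token (causal) language modeling.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The point isn't that this exact script is production-ready; it's that the moving parts are commodity tooling a contractor can assemble, with serving and editor integration on top.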
Like many folks here, I can write and read a variety of different programming languages. Some I've been using for a long time and know very well, and some I seldom use but retain the basics of.

I don't use Copilot when writing languages I am very comfortable with, because I'd rather write code that I completely understand. Or at least, understand to the best of my ability. I find it easier to consider edge cases and side effects when writing original code, or at least easier than when reading someone else's code that was ripped from a project whose goals you don't even know. For this reason alone, I don't buy that Copilot improves productivity.

I also avoid using Copilot when writing in languages I am unfamiliar with, because I feel like it's robbing me of a learning experience, or robbing me of the repetition that cements how to do various things in the language.

I don't know. Copilot is certainly impressive, but there are too many questions: what I've mentioned, and the legal ones in the OP. But perhaps that is a good thing? It is a new angle on copyright that we're going to have to answer one way or another, in programming and other fields.
People are way too attached to single-function examples. I'm struggling to find any example that actually rises to the "originality, creativity, and fixation" required for copyright to apply.

Just because something looks similar, or is even identical, doesn't mean copyright applies.
Please, don't use Copilot; decide it's not worth the risk for your company. In the great competition that is the labor market, Copilot is giving me a leg up on everyone who isn't using it. It's the biggest single tool-based improvement to my productivity since JetBrains.
1) Starting off, I support AI/ML-based code generation/completion. I would be very happy for the day when I can figuratively wave my hand and get 80-90% of what I need.

2) It might be fair to allow authors to submit repos, along with some sort of 'proof of ownership', to Copilot in order to exclude them from the training set. There might have to be a documented (agreed-upon?) schedule for 'retraining' so that the exclusion list takes effect in a timely manner.

3) Or just allow authors to add a robots.txt-style file to their repos which specifies rules for training (a purely hypothetical sketch follows below).

Just a few thoughts...
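No such standard exists today; this is only an imagined analog of robots.txt for model training, with made-up file and directive names:

    # .ai-training.txt -- hypothetical; no tool actually reads this today
    User-Agent: *              # applies to all training crawlers
    Disallow: /                # no training on anything in this repo

    User-Agent: example-research-bot
    Allow: /docs/              # hypothetical: permit training on docs only
    Attribution: required      # made-up directive: require attribution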
There is a risk, but the legal risk to individual users is yet to be decided.

What I think is more concerning is that Copilot is effectively an extension of automatically copying stuff from Stack Overflow, with even less understanding of what the code does on the part of the prompt writer.

Don't get me wrong, I absolutely see the benefits. But the risk listed in the article seems less material than a further general decline in code quality. "Built by a human" may need to become a thing, the same way "organic" became part of the daily vocabulary.
The structural completions are way more useful than the entire-function completions, even in IntelliJ, where autocomplete is already extremely high quality.

The part that I find unsettling when using Copilot is the risk that credentials or secrets embedded in the code, or being edited in (.gitignore'd) config files, are being sent off to Microsoft for AI-munging and possible human review for improvements to the model.
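One partial mitigation, assuming you use VS Code: the `github.copilot.enable` setting takes per-language flags, so you can switch Copilot off for file types that tend to hold secrets. The exact language identifiers below are assumptions and depend on your installed extensions:

    // settings.json -- disable Copilot for secret-prone file types
    {
      "github.copilot.enable": {
        "*": true,           // on by default everywhere else
        "plaintext": false,  // .env files often open as plaintext
        "yaml": false,       // CI configs, Kubernetes secrets
        "json": false,       // credentials in config files
        "dotenv": false      // id provided by a .env syntax extension
      }
    }

This only limits what leaves the editor going forward; it does nothing about context from other open files or code already sent.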
It's interesting to consider how you might prevent training using a license without being too restrictive.

Here is an example of a license that attempts to directly prohibit training. The problem is that you can imagine such software can't be used in any part of a system that might be used for training or inference (in the OS, for example). Somehow you need to additionally specify that the software is used directly... But how, and what does that mean? This is left as an exercise for the reader, and I hope someone can write something better:

    The No-AI 3-Clause License

This is the BSD 2-Clause License, unmodified except for the addition of a third clause. The intention of the third clause is to prohibit, e.g., use in the training of language models. The intention of the third clause is also to prohibit, e.g., use during language model inference. Such language models are used commercially to aggregate and interpolate intellectual property. This is performed with no acknowledgement of authorship or lineage, no attribution or citation. In effect, the intellectual property used to train such models becomes anonymous common property. The social rewards (e.g., credit, respect) that often motivate open source work are undermined.

    License Text:

https://bugfix-66.com/7a82559a13b39c7fa404320c14f47ce0c304facc51cdacbba3f99654652bf428
I'm really getting tired of lawyers, and collectively of our "inner lawyer", pooh-poohing this merely over licensing and GPL issues, neither of which has any practical implication for anything a software engineer does.

All this "controversy" around Copilot just reeks of a kind of technological "social justice" that most people didn't sign up for but seem happy to sit, watch, and commiserate over.
Reading about FOSS copyright is so exhausting. I find no meaningful distinction between reading code and learning from it versus feeding it into a model. I've heard the "it spits out FOSS code verbatim" argument, and I really don't buy it; I've never seen it happen. AI-assisted software tooling is so powerful that we really should weigh the social benefits ahead of what happens to fit our existing legal framework.
There is some legal risk, but what percentage of the code you write is potentially affected by audits before you sell it?
So as a single developer you're trading a real productivity gain, and as a company lower costs, for a potential "liability" when you eventually sell your company. Looks like a good bet.
A lot of code will be thrown out or never be sold to anyone.
I have been writing my PhD thesis in VSCode with Copilot enabled, and it is absurdly good at suggestions in LaTeX, from generating tables to writing whole paragraphs of text in the discussion.
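For instance (an illustrative reconstruction, not an actual Copilot transcript): write the table environment and header row, and it tends to offer a completion along these lines, with the values below being placeholders:

    % You write the environment, caption, and header row...
    \begin{table}[htbp]
      \centering
      \caption{Mean accuracy by method (illustrative values)}
      \begin{tabular}{lrr}
        \hline
        Method & Accuracy & Std.\ dev. \\
        \hline
        % ...and Copilot typically suggests the remaining rows:
        Baseline   & 0.81 & 0.03 \\
        Our method & 0.89 & 0.02 \\
        \hline
      \end{tabular}
      \label{tab:accuracy}
    \end{table}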
I seem to recall a recent Copyright Office decision [0] where it was decided that an AI does not own the copyright of its own output, because the output is not the effort of an entity that used intellectual effort to create it. Only a human can own copyright, according to that (US-only) Copyright Office decision.

This means the output of an AI isn't even considered a "work" in the eyes of copyright law, if I am understanding correctly. And if the output of Copilot is not a "work", then the output of Copilot cannot be a "derivative work" and cannot violate copyright.

Courts have repeatedly found that only stuff humans create can be copyrighted.

If GitHub Copilot can't produce a work, it can't violate copyright; only human operators can do that.

This makes the recent GitHub announcement about coming Copilot features make much more sense legally: features which show the origin of a suggestion and which let users select which code licenses to accept suggestions from. Previously these seemed like things to appease critics, but they're tools to help paying Copilot users know what code they're actually using. Nice.

IANAL.

Anyway, this lawsuit is gonna fail so effin' hard. lol

[0]: https://www.theverge.com/2022/2/21/22944335/us-copyright-office-reject-ai-generated-art-recent-entrance-to-paradise
Programmer: Uploads code to GitHub for the public to see/use

GitHub: Uses code uploaded by programmers to learn and make other code better

Programmer: NO FAIR! My code can only be used the way I want it to be, and my code is absolutely unique, and no one else has ever coded anything like it
How to make it to the front page of any tech forum:

Step 1: "GitHub Copilot bad... amirite?"

Snark aside, most of these articles miss the mark to the point where the author seems tech-illiterate and is just parroting soundbites from other people's opinions.
This article makes a big mistake: it assumes copyright infringement is extremely bad and would never be worth doing. In practice, when have people been sued over misusing open source software? You most likely won't be caught. And even if you are, you can rewrite the code or add attribution then. Even if you do end up having to pay damages, the productivity increase your company gets from Copilot may be worth the damages.
I'm going to keep using it. You won't stop me. You won't catch me. And I just need to read the next five tokens to know whether it's right.
I would say the risk is minimal. You need to bait Copilot really hard for it to produce anything coherent from existing code, and that's simply not how you use it.

Regardless, the risk needs to be really big for me to stop using it. It's such an essential tool for me now that I'm shocked by how crippled I feel when the internet stops working and I realize how much I depend on it.
"You might get sued if you use this software you paid for" is already covered via an indemnification clause in any reasonable enterprise software license agreement. I'm sure Microsoft/GitHub will be no different in indemnifying their customers who purchase Copilot.
I saw there was some (unofficial) package for Emacs, reusing some Vim Copilot integration. Has anyone here tried Emacs + Copilot yet? Is it working fine? Out of curiosity I'd like to try it and, who knows...

Also: does Copilot work for Clojure, and is it any good for Clojure?
There is a major difference between the help you can get from an IDE or editor with a language server running in the background and GitHub Copilot stealing other people's code.

I sincerely hope Microsoft loses this lawsuit.
You're still on about copyright? What about the fact that it will just add vulns and bugs to your code? Or is the industry so bad at this point that a gimmicky AI tool can do better?
I haven't used it yet. I believe people when they say that it's the future of development and that every dev will have to use it or be left behind, but I can't fathom how people are comfortable sending every iteration of their code to a big tech corporation. I can't wait for the day when we can run such solutions on our personal computers (or personal cloud servers), but I feel that, in 2022, this type of tool is not yet worth the risk. I hope this is just a temporary obstacle on our way to an AI-assisted programming future.
I have not used it, but I don't understand how Copilot could be useful to me. As a game programmer, I don't spend much time actually writing final code. Most of my time is spent working stuff out on paper or writing little tests which I will discard.

In general I want to write as little code as possible, since more code = more problems. The code I do write, I want to put great care and craft into, in order to keep it maintainable. Giving up any of my agency in this critical area seems like a terrible idea to me.

Something that will help me write more code, or write code faster, is of no benefit to me.