Measuring GitHub Copilot's impact on productivity

89 points by explosion-s about 1 year ago

19 comments

nomilk about 1 year ago

I had an interesting experience with Copilot yesterday. I asked it to add a 'deactivate' button to each of a collection of items. It did that, but unexpectedly, it conditionally displayed 'reactivate' if `item.deactivated_at.present?`, with appropriate bootstrap icons (a cross for deactivate and a circular arrow for reactivate).

What surprised me was it knew what I wanted better than I did (for this MVP, I hadn't even considered 'reactivate' functionality - I'd have been happy just with deactivate).

So it didn't just write code I asked for, but suggested (and was right about) something it thought I might like beyond what was actually asked for.

It was my first time using Copilot chat so perhaps it does this a lot, but it was my first time experiencing it. It challenged my thinking and improved it.

mpweiher about 1 year ago

The outcome they are measuring is "perceived productivity", which seems pretty weak sauce to me.

> Here, we investigate whether usage measurements of developer interactions with GitHub Copilot can predict perceived productivity as reported by developers.

And they use the "actual activity" to predict this perceived productivity.

cdme about 1 year ago

It's better than traditional autocomplete, but not transformative. ChatGPT is particularly bad with JavaScript — you're better off writing that yourself.

munk-a about 1 year ago

I think it's too early to tell but my main concern about copilot is code maintainability and security. Copilot is able to barf out helpful expressions that will reduce the amount of code we need to write by hand - I think it's excellent when it comes to reducing boilerplate... but I think a large amount of boilerplate existing points to a bigger issue with the project. The majority of software engineering isn't writing code - copilot may be beneficial as an accessibility aid for developers that have typing impairments but most developers can type faster than they can think - if the level of boilerplate in your project is reasonable then this should mean you're never prevented from thinking because your fingers are still working on recording your previous thought. However, at the end of the day, if you can help reduce carpal tunnel that's still a win.

The problem I can foresee with copilot is that the trade you're agreeing to is that you'll type less but need to read over the code produced more - this is an effort that isn't normally necessary (typos happen but those should take a trivial time to correct) but when copilot is involved you need to proof all the code that is being generated. There is a motivation to skip this step and just accept that the code was written correctly, and that will inevitably lead to security problems - and there is a motivation to not correct or alter the auto-filled code. If there's a multi-dimensional array and you think it semantically makes sense to iterate it over dimension a then dimension b and copilot instead goes with b as the major index then it's more likely to remain in a b major iteration - that may make code less readable or it may cause major issues down the line.

Copilot, IMO, is optimizing the least important part of development right now and it costs us more to correct it than it would to just splat out the correct code _but_ this is a similar argument to longbows vs. crossbows - hand a peasant a crossbow and they can fire a crossbow - train a peasant for 30 years and they can fire a longbow - the longbow is more powerful, but the crossbow is a clear choice in terms of RoI. It may be that today's developers will only benefit from copilot minimally since we've invested the training time in standard development practices but tomorrow's developers will eschew a lot of the algorithmic learning and still be able to deliver the majority of the value.
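
To make the iteration-order point above concrete, here is a minimal Python sketch. It is not from the comment: the `grid` values and the function names `iterate_a_major` / `iterate_b_major` are made up for illustration. It only shows that swapping which dimension is the outer loop visits the same elements in a different order, which is the kind of difference a reviewer has to notice when accepting a suggestion.

```python
# Hypothetical 2x3 grid: dimension a is the row index, dimension b the column index.
grid = [
    [1, 2, 3],
    [4, 5, 6],
]


def iterate_a_major(g):
    """Outer loop over dimension a (rows), inner loop over dimension b (columns)."""
    return [g[a][b] for a in range(len(g)) for b in range(len(g[0]))]


def iterate_b_major(g):
    """Outer loop over dimension b (columns), inner loop over dimension a (rows)."""
    return [g[a][b] for b in range(len(g[0])) for a in range(len(g))]


a_major = iterate_a_major(grid)
b_major = iterate_b_major(grid)

print(a_major)  # [1, 2, 3, 4, 5, 6]
print(b_major)  # [1, 4, 2, 5, 3, 6]
```

Both versions touch every element exactly once, so any logic that depends on visiting order (early exits, running totals compared against a threshold, output ordering) can behave differently even though neither loop looks wrong in isolation.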

bottlepalm about 1 year ago

I'm already an experienced developer, but writing code in new domains is sooo much nicer now. I can learn and get things done way faster than ever before. Copilot has pretty much replaced Google and Stack Overflow for me. I use it all day, every day. The chat feature is great as well for discussing code, questions, ideas, etc. I still use ChatGPT 4 for bigger questions, more complex things, writing entire files, etc.

yodon about 1 year ago

For me, Figure 6 is the most interesting observation: programmers are interacting with Copilot in a statistically different way nights and weekends vs working hours.

I'm guessing (without hard evidence) that this implies day job code-reviewed commits and weekend hobby project/side hustle/startup coding are held to different standards by the developers involved.

dotnet00 about 1 year ago

Copilot has been more impactful for 'unproductive' stuff for me. Not as good at handling the code related to my job, as it's a math heavy beast in an extremely niche field. But great for my hobby, writing little bots to toy around with various things. Lets me skip remembering or looking up a lot of basic implementation details of talking to the service APIs, drastically speeding up the time to get a functional prototype.

As others have mentioned, it's great for speeding up all the little boilerplate and other things too simple or otherwise too unrelated to the main goal.

Sometimes it almost works like rubber-duck development: by seeing what Copilot spits out, I have sometimes realized earlier than otherwise that I missed certain components/checks that I should probably plan ahead for (e.g. remembering to add a convenient way to handle user profiles for a bot).

t_believ-er873 about 1 year ago

Yeap, it definitely makes work more efficient and less time consuming. However, I wonder how safe it is to use.

Here is the article I found about GitHub Copilot: https://gitprotect.io/blog/github-copilot-introduction-an-ai-assisted-coding/

hnthrowaway0328 about 1 year ago

ChatGPT is instrumental for me to maintain pyspark code efficiently. I don't want to learn it as I'm not particularly interested in the projects that use it. It's a lot easier to just learn on the fly and double check.

richardw about 1 year ago

I’m a generalist who has been good to excellent at various technologies and languages over a long time but honestly am not currently sharp at many. I’ve written Python, TS, SQL and Java, with many APIs and domains, in the last few weeks. I don’t remember everything so Copilot and ChatGPT are an excellent way to get me back in the game, although I often still have to identify and fix where it’s going wrong. And in some fairly rare areas (little code online) it produces garbage, so it’ll take you 10% of the way there and then it’s back to you.

Not perfect but invaluable.

skatanski about 1 year ago

I'd be curious to see improvements in "work being done" metrics, something like DORA, where a company has a timeline of metrics from before and at some point they introduce Copilot or other types of "AI" assistants. I suppose we will start seeing these, since it's something companies would like to share with their shareholders.

robinsonrc about 1 year ago

Where it really shone while I was trialling it was its ability to guess exactly what throwaway code I wanted while I was prototyping this and that and wrapping it all up in a println! call.

It was extremely handy from an autocomplete perspective - EXCEPT it insisted on inserting triple backticks into my Rust code four out of five times.

dmix about 1 year ago

I love Copilot, I find it essential these days, but if it's going to significantly impact my productivity it's got to be 2x faster. The autocomplete is scary good at predicting what was in my head, sometimes before I figure it out, but often it trails behind my own speed.

But anyway, the future is exciting.

sagman about 1 year ago

I use it for doing mundane tasks like creating queries following a pattern I defined or creating docs for code. My experience has been great so far. Not sure if it is worth the price for my company, but it encourages me to be a little lazy and saves some time.

hatthew about 1 year ago

For me copilot is mildly helpful 10% of the time when coding, but writing code takes up such a small amount of my time that it doesn't make a measurable difference overall.

joshstrange about 1 year ago

N=1 but I find Copilot to be incredibly valuable to me.

So much so that when there was an outage a few weeks ago (or maybe I had network issues on my side) the loss of it was palpable. I found myself pausing, waiting for Copilot to spit out code only to realize it wasn't going to do it. Once you've used Copilot for a while you get a good sense of what it can and can't do. When to pause and when to just keep typing. I was so used to knowing "this is a thing Copilot will do well" and waiting for it that I kept forgetting it wasn't working right now.

Even before this experience I was convinced of the usefulness of it. I've been writing code for close to 20 years and I think I'm pretty decent at it. I never take Copilot's suggestion without first understanding what it's doing but more often than not the suggestion is almost identical to what I would have written myself. Sometimes it wants to do a `.forEach` and I would prefer a `for()` loop but that's easy to fix and often writing `for(` is enough for it to re-write that part of code in the way I prefer. Those changes are often only stylistic.

In addition, it's great for code I don't write often but need something quick and dirty to test out a POC. It along with ChatGPT feel like cheating. Just yesterday we were looking into an issue where I work. We had some timing data in the logs but nothing was consuming/displaying that data. Yes, we could grep for the lines of data but we didn't have this feeding into Prometheus and the effort to do that was not going to be minor.

Instead I had ChatGPT parse the log lines I had already filtered with grep and spit out CSV data ("Datetime, how many seconds something took") then I had it write an extremely basic HTML/CSS/JS file to graph the data. After checking that it was all working I hooked up the command directly in the PHP file that held the graph so we had "live" graphing (after a reload) of a problem we were investigating. This whole thing took well under 5 minutes.

Now I'm perfectly capable of doing everything I just outlined above but it would have taken me longer than 5 minutes just to look up and use the ChartJS syntax/API. Instead I had a tool displaying near-live data in almost no time at all.

"AI" feels like a superpower. I already know what I want to do and often I even know how to write the code to do it but LLMs let me skip the repetitive boring parts and focus on the things LLMs are not good at: my specific problem space, the specifics of my stack, etc. Only I can do that (for now at least). Let the LLM spit out graphs, loops, awk commands, etc.; I'll glue it all together and make it useful.
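
A rough sketch of the grep-output-to-CSV step described above, written in Python rather than whatever ChatGPT actually produced. The comment doesn't show the real log lines, so the log format, the regex `LINE_RE`, the function name `log_lines_to_csv` and the sample lines are all assumptions for illustration.

```python
import csv
import io
import re

# Assumed (hypothetical) log format: "<timestamp> ... took <seconds>s".
# The real format is not given in the comment.
LINE_RE = re.compile(r"^(?P<ts>\S+)\s.*?took (?P<secs>\d+(?:\.\d+)?)s")


def log_lines_to_csv(lines):
    """Turn matching log lines into CSV text with a datetime and a seconds column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["datetime", "seconds"])
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines without timing data are skipped
            writer.writerow([match.group("ts"), match.group("secs")])
    return out.getvalue()


# Made-up sample input standing in for the grep-filtered log lines.
sample = [
    "2024-03-01T12:00:00Z worker=1 request took 1.5s",
    "2024-03-01T12:00:05Z worker=2 heartbeat",
    "2024-03-01T12:00:09Z worker=1 request took 12s",
]

csv_text = log_lines_to_csv(sample)
print(csv_text)
```

The resulting two-column CSV is the sort of input a simple charting page could read; the graphing side is left out here because the comment gives no detail about it.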

gv83 about 1 year ago

let's also measure the productivity of reviewers and people in general that, at a later point, have to wade through piles of ai generated crap.

last friday i had to review 2 trash PRs that were blatantly made with ai coding assistance. hundreds of code lines for something that, by reading the doc of the library, could have been made in 5 lines. and the fantastic comments like "returns the body" over a body() function.

WirelessGigabit about 1 year ago

What I am most looking forward to is future changes in development speed. I wish the report tracked what happens to accepted suggestions over time. Were they committed as-is (which is a metric of the local quality of the suggestion) and how long do they survive (which is a metric of the global quality of the code)?

hn_throwaway_99 about 1 year ago

> While suggestion correctness is important, the driving factor for these improvements appears to be not correctness as such, but whether the suggestions are useful as a starting point for further development.

I admit I didn't read the whole article, but that bullet I thought was key. I totally agree. When folks yell "BuT HalliciNations!!!", I get it, but that doesn't mean that LLMs can't still be a huge boon if you know how to use them and don't just trust their output blindly and yolo it into production.
