Watching AI drive Microsoft employees insane

1055 балловавтор: laiysb4 дня назад

70 comments

diggan4 дня назад

Interesting that every comment has "Help improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative.> This seems like it's fixing the symptom rather than the underlying issue?This is also my experience when you haven't setup a proper system prompt to address this for everything an LLM does. Funniest PRs are the ones that "resolves" test failures by removing/commenting out the test cases, or change the assertions. Googles and Microsofts models seems more likely to do this than OpenAIs and Anthropics models, I wonder if there is some difference in their internal processes that are leaking through here?The same PR as the quote above continues with 3 more messages before the human seemingly gives up:> please take a look> Your new tests aren't being run because the new file wasn't added to the csproj> Your added tests are failing.I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.Another PR: <a href="https://github.com/dotnet/runtime/pull/115732/files">https://github.com/dotnet/runtime/pull/115732/files</a>How are people reviewing that? 90% of the page height is taken up by "Check failure", can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that say "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.

评论 #44050654 未加载

评论 #44050746 未加载

评论 #44050788 未加载

评论 #44052119 未加载

评论 #44054197 未加载

评论 #44050434 未加载

评论 #44051706 未加载

评论 #44050326 未加载

评论 #44051109 未加载

评论 #44050378 未加载

评论 #44050318 未加载

评论 #44058817 未加载

评论 #44050938 未加载

评论 #44050741 未加载

评论 #44051967 未加载

评论 #44067882 未加载

评论 #44051424 未加载

bramhaag4 дня назад

Seeing Microsoft employees argue with an LLM for hours instead of actually just fixing the problem must be a very encouraging sight for businesses that have built their products on top of .NET.

评论 #44051059 未加载

评论 #44050510 未加载

评论 #44051447 未加载

评论 #44050865 未加载

评论 #44054493 未加载

评论 #44051963 未加载

评论 #44056433 未加载

评论 #44055849 未加载

评论 #44052801 未加载

评论 #44051824 未加载

评论 #44050846 未加载

nirui4 дня назад

I recently, meaning hours ago, had this delightful experience watching the Eric of Google, which everybody love, including he's extra curricular girl friend and wife, talking about AI. He seemed to believe AI is under-hyped after tried it out himself: <a href="https://www.youtube.com/watch?v=id4YRO7G0wE" rel="nofollow">https://www.youtube.com/watch?v=id4YRO7G0wE</a>He also said in the video:> I brought a rocket company because it was like interesting. And it's an area that I'm not an expert in and I wanted to be a expert. So I'm using Deep Research (TM). And these systems are spending 10 minutes writing Deep Papers (TM) that's true for most of them. (Them he starts to talk about computation and "it typically speaks English language", very cohesively, then stopped the thread abruptly) (Timestamp 02:09)Let me quote out the important in what he said: "it's an area that I'm not an expert in".During my use of AI (yeah, I don't hate AI), I found that the current generative (I call them pattern reconstruction) systems has this great ability to Impress An Idiot. If you have no knowledge in the field, you maybe thinking the generated content is smart, until you've gained some depth enough to make you realize the slops hidden in it.If you work at the front line, like those guys from Microsoft, of course you know exactly what should be done, but, the company leadership maybe consists of idiots like Eric who got impressed by AI's ability to choose smart sounding words without actually knowing if the words are correct.I guess maybe one day the generative tech could actually write some code that is correct and optimal, but right now it seems that day is far from now.

评论 #44057178 未加载

评论 #44060463 未加载

评论 #44057307 未加载

评论 #44058943 未加载

评论 #44059559 未加载

kruuuder4 дня назад

A comment on the first pull request provides some context:> The stream of PRs is coming from requests from the maintainers of the repo. We're experimenting to understand the limits of what the tools can do today and preparing for what they'll be able to do tomorrow. Anything that gets merged is the responsibility of the maintainers, as is the case for any PR submitted by anyone to this open source and welcoming repo. Nothing gets merged without it meeting all the same quality bars and with us signing up for all the same maintenance requirements.

评论 #44050964 未加载

评论 #44050977 未加载

rsynnott4 дня назад

Beyond every other absurdity here, well, maybe Microsoft is different, but I would never assign a PR that was _failing CI_ to somebody. That that's happening feels like an admission that the thing doesn't _really_ work at all; if it worked even slightly, it would at least only assign passing PRs, but presumably it's bad enough that if they put in that requirement there would be no PRs.

评论 #44050819 未加载

评论 #44052170 未加载

robotcapital4 дня назад

Replace the AI agent with any other new technology and this is an example of a company:1. Working out in the open2. Dogfooding their own product3. Pushing the state of the artGiven that the negative impact here falls mostly (completely?) on the Microsoft team which opted into this, is there any reason why we shouldn't be supporting progress here?

评论 #44051799 未加载

评论 #44051946 未加载

评论 #44051912 未加载

评论 #44053698 未加载

评论 #44053038 未加载

globalise834 дня назад

Malicious compliance should be the order of the day. Just approve the requests without reviewing them and wait until management blinks when Microsoft's entire tech stack is on fire. Then quit your job and become a troubleshooter on x3 the pay.

评论 #44050748 未加载

评论 #44050365 未加载

评论 #44050587 未加载

评论 #44050723 未加载

评论 #44050983 未加载

balazstorok4 дня назад

At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.Also, trying something new out will most likely have hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.The thing may rapidly evolve if it's being hard-tested on actual code and actual issues. For example it will be probably changed so that it will iterate until tests are actually running (and maybe some static checking can help it, like not deleting tests).Waiting to see what happens. I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.

评论 #44050762 未加载

评论 #44050775 未加载

评论 #44050731 未加载

评论 #44051280 未加载

评论 #44050801 未加载

petetnt4 дня назад

GitHub has spent billions of dollars building an AI that struggles with things like whitespace related linting errors on one of the most mature repositories available. This would be probably okay for a hobbyist experiment, but they are selling this as a groundbreaking product that costs real money.

评论 #44052498 未加载

评论 #44050774 未加载

Quarrelsome4 дня назад

rah, we might be in trouble here. The primary issue at play is that we don't have a reliable means of measuring developer performance, outside of subjective judgement like end of year reviews.This means its probably quite hard to measure the gain or the drag of using these agents. On one side, its a lot cheaper than a junior, but on the other side it pulls time from seniors and doesn't necessarily follow instruction well (i.e. "errr your new tests are failing").This combined with the "cult of the CEO" sets the stage for organisational dissonance where developer complaints can be dismissed as "not wanting to be replaced" and the benefits can be overstated. There will be ways of measuring this, to project it as huge net benefit (which the cult of the CEO will leap upon) and there will be ways of measuring this to project it as a net loss (rabble rousing developers). All because there is no industry standard measure accepted by both parts of the org that can be pointed at which yields the actual truth (whatever that may be).If I might add absurd conjecture: We might see interesting knock-on effects like orgs demanding a lowering of review standards in order to get more AI PRs into the source.

评论 #44058062 未加载

评论 #44051612 未加载

Crosseye_Jack4 дня назад

I do love one bot asking another bot to sign a CLA! - <a href="https://github.com/dotnet/runtime/pull/115732#issuecomment-2891990223">https://github.com/dotnet/runtime/pull/115732#issuecomment-2...</a>

评论 #44050670 未加载

评论 #44050407 未加载

评论 #44050886 未加载

评论 #44051196 未加载

Philpax4 дня назад

Stephen Toub, a Partner Software Engineer at MS, explaining that the maintainers are intentionally requesting these PRs to test Copilot: <a href="https://github.com/dotnet/runtime/pull/115762#issuecomment-2897683991">https://github.com/dotnet/runtime/pull/115762#issuecomment-2...</a>

margorczynski4 дня назад

With how stochastic the process is it makes it basically unusable for any large scale task. What's the plan? To roll the dice until the answer pops up? That would be maybe viable if there was a way to automatically evaluate it 100% but with a human in the loop required it becomes untenable.

评论 #44050291 未加载

评论 #44050696 未加载

评论 #44051948 未加载

评论 #44050321 未加载

le-mark4 дня назад

The real tragedy is the management mandating this have their eyes clearly set on replacing the very same software engineers with this technology. I don’t know what’s more Kafka than Kafka but this situation certainly is!

评论 #44051225 未加载

评论 #44052597 未加载

rchaud4 дня назад

It's remarkable how similar this feels to the offshoring craze of 20 years ago, where the complaints were that experienced developers were essentially having to train "low-skilled, cheap foreign labour" that were replacing them, eating up time and productivity.Considering the ire that H1B related topics attract on HN, I wonder if the same outrage will apply to these multi-billion dollar boondoggles.

automatic61314 дня назад

Satya said "nearly 30% of code written at microsoft is now written by AI" in an interview with Zuckerberg, so underlings had to hurry to make it true. This is the result. Sad!

评论 #44051881 未加载

einrealist4 дня назад

This is one good example of the Sunk Cost Fallacy: generative AI has cost so much money, acknowledging its shortcomings is now becoming more and more impossible.This AI bubble is far worse than the Blockchain hype.Its not yet clear whether productivity gains are real and whether the gains are eaten by a decline in overall quality.

评论 #44054381 未加载

cebert4 дня назад

Do we know for a fact there are Microsoft employees who were told they have to use CoPilot and review its change suggestions on projects?We have the option to use GitHub CoPilot on code reviews and it’s comically bad and unhelpful. There isn’t a single member of my team who find it useful for anything other than identifying typos.

评论 #44050270 未加载

评论 #44050192 未加载

评论 #44050807 未加载

is_true4 дня назад

Today I received the 2nd email about an endpoint in an API we run that doesn't exist but some AI tool told the client it does.

评论 #44051138 未加载

bossyTeacher4 дня назад

Every week, one of Google/OpenAI/Anthropic releases a new model, feature or product and it gets posted here with 3 figure comments mostly praising LLMs as the next best thing since the internet. I see a lot of hype on HN about LLMs for software development and how it is going to revolutionize everything. And then, reality looks like this.I can't help but think that this LLM bubble can't keep growing much longer. The investment to results ratio doesn't look great so far and there is only so many dreams you can sell before institutional investors pull the plug.

vachina4 дня назад

> This seems like it's fixing the symptom rather than the underlying issue?Exactly. LLM does not know how to use a debugger. LLM does not have runtime contexts.For all we know, the LLM could’ve fixed the issue simply by commenting out the assertions or sanity checks and everything seemed fine and dandy until every client’s device catches on fire.

评论 #44050533 未加载

评论 #44050460 未加载

aiono4 дня назад

While I am AI skeptic especially for use cases like "writing fixes" I am happy to see this because it will be a great evidence whether it's really providing increase in productivity. And it's all out in the open.

rvz4 дня назад

After all of that, every PR that Copilot opened still has failing tests and it failed to fix the issue (because it fundamentally cannot reason).No surprises here.It always struggles on non-web projects or on software where it really matters that correctness is first and foremost above everything, such as the dotnet runtime.Either way, a complete disastrous start and what a mess that Copilot has caused.

评论 #44050303 未加载

softwaredoug4 дня назад

I’m all for AI “writing” large swaths of code, vibe coding, etc.But I think it’s better for everyone if human ownership is central to the process. Like I vibe coded it. I will fix it if it breaks. I am on call for it at 3AM.And don’t even get started on the safety issues if you don’t have clear human responsibility. The history of engineering disasters is riddled with unclear lines of responsibility.

评论 #44051216 未加载

评论 #44083240 未加载

skywhopper4 дня назад

Oof. A real nightmare for the folks tasked with shepherding this inattentive failure of a robot colleague. But to see it unleashed on the dotnet runtime? One more reason to avoid dotnet in the future, if this is the quality of current contributions.

Havoc4 дня назад

At least it's clearly labelled as copilot.Much more worried about what this is going to do to the FOSS ecosystem. We've already seen a couple maintainers complain and this trend is definitely just going to increase dramatically.I can see the vision but this is clearly not ready for prime time yet. Especially if done by anonymous drive-by strangers that think they're "helping"

评论 #44050927 未加载

评论 #44058116 未加载

pera4 дня назад

This is all fun and games until it's your CEO who decides to go "AI first" and starts enforcing "vibe coding" by monitoring LLM API usage...

lossolo4 дня назад

This is hilarious. And reading the description on the Copilot account is even more hilarious now: "Delegate issues to Copilot, so you can focus on the creative, complex, and high-impact work that matters most."

ankitml4 дня назад

GitHub is not the place to write code. IDE is the place. Along with pre CI checks, some tests, coverage etc. they should get some PM before making decisions..

评论 #44050314 未加载

评论 #44050392 未加载

baalimago4 дня назад

Well, the coding agent is pretty much a junior dev at the moment. The seniors are teaching it. Give it a 100k PRs with senior developer feedback and it'll improve just like you'd anticipate a junior would. There is no way that FANG aren't using the comments by the seniors as training data for their next version.It's a long-term play to have pricey senior developers argue with an llm

评论 #44050575 未加载

评论 #44050629 未加载

评论 #44050692 未加载

评论 #44050610 未加载

评论 #44050733 未加载

评论 #44051636 未加载

评论 #44057727 未加载

TimPC4 дня назад

I still believe in having humans do PRs. It's far cheaper to have the judgement loop on the AI come before and during coding than after. My general process with AI is to explicitly instruct it not to write code, agree on a correct approach to a problem and if the project has any architectural components a correct architecture then once we've negotiated the correct way of doing things ask it to write code. Usually each step of this process takes multiple iterations of providing additional information or challenging incorrect assumptions of the AI. I can get it much faster than human coding with a similar quality bar assuming I iterate until a high quality solution is presented. In some cases the AI is not good enough and I fall back to human coding but for the most part I think it makes me a faster coder.

GiorgioG4 дня назад

Step 1. Build “AI” (LLM models) that can’t be trusted, doesn’t learn, forgets instructions, and frustrates software engineersStep 2. Automate the use of these LLMs into “agents”Step 3. ???Step 4. Profit

rubyfan4 дня назад

FTPR> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.This is gross, keep your fomo to yourself.

rkagerer4 дня назад

This comment from lloydjatkinson resonated:As an outside observer but developer using .NET, how concerned should I be about AI slop agents being let lose on codebases like this? How much code are we going to be unknowingly running in future .NET versions that was written by AI rather than real people?What are the implications of this around security, licensing, code quality, overall cohesiveness, public APIs, performance? How much of the AI was trained on 15+ year old Stack Overflow answers that no longer represent current patterns or recommended approaches?Will the constant stream of broken PR's wear down the patience of the .NET maintainers?Did anyone actually want this, or was it a corporate mandate to appease shareholders riding the AI hype cycle?Furthermore, two weeks ago someone arbitrarily added a section to the .NET docs to promote using AI simply to rename properties in JSON. That new section of the docs serves no purpose.How much engineering time and mental energy is being allocated to clean up after AI?

评论 #44051806 未加载

ethanol-brain4 дня назад

Are people really doing coding with agents through PRs? This has to be a huge waste of resources.It is normal to preempt things like this when working with agents. That is easy to do in real time, but it must be difficult to see what the agent is attempting when they publish made up bullshit in a PR.It seems very common for an agent to cheat and brute force solutions to get around a non-trivial issue. In my experience, its also common for agents to get stuck in loops of reasoning in these scenarios. I imagine it would be incredibly annoying to try to interpret a PR after an agent went down a rabbit hole.

评论 #44050502 未加载

评论 #44058126 未加载

Traubenfuchs4 дня назад

> These defines do not appear to be defined anywhere in the build system.> @copilot fix the build error on apple platforms> @copilot there is still build error on Apple platformsAre those PRs some kind of software engineer focused comedy project?

carefulfungi4 дня назад

It's mind blowing that a computer program can accomplish this much and yet absurd that it accomplishes so little.

actionfromafar4 дня назад

The funniest is the dotnet-policy-service asking copilot to read and agree to the Contributor License Agreement. :-D

kookamamie4 дня назад

Many here don't seem to get it.The AI agent/programmer corpo push is not about the capabilities and whether they match human or not. It's about being able to externalize a majority of one's workforce without having a lot of people on permanent payroll.Think in terms of an infinitely scalable bunch of consultants you can hire and dismiss at your will - they never argue against your "vision", either.

评论 #44050621 未加载

smartmic4 дня назад

reddit may not have the best reputation, but the comments there are on point! So far much better than what has been posted here by HN users on this topic/thread. Anyway, I hope this is good fodder to show the limits (and they are much narrower than hype-driven AI enthusiasts like to pretend) of AI coding and to be more honest with yourself and others about it.

评论 #44050594 未加载

gizzlon4 дня назад

> @copilot please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.haha

RobKohr4 дня назад

With layoffs driven by a push for more LLM use, this feels like malicious compliance.

octocop4 дня назад

"fix failing tests" does never yield any good results for me either

ncr1004 дня назад

Q: Does Microsoft report its findings or learnings BACK to the open source community?The @stephentoub MS user suggests this is an experiment (<a href="https://github.com/dotnet/runtime/pull/115762#issuecomment-2897683991">https://github.com/dotnet/runtime/pull/115762#issuecomment-2...</a>).If this is using open source developers to learn how to build a better AI coding agent, will MS share their conclusions ASAP?EDIT: And not just MS "marketing" how useful AI tools can be.

esafak4 дня назад

I speculate what is going on is that the agent's context retrieval algorithm is bad, so it does not give the LLM the right context, because today's models should suffice to get the job done.Does anyone know which model in particular was used in these PRs? They support a variety of models: <a href="https://github.blog/ai-and-ml/github-copilot/which-ai-model-should-i-use-with-github-copilot/" rel="nofollow">https://github.blog/ai-and-ml/github-copilot/which-ai-model-...</a>

评论 #44053507 未加载

sensanaty4 дня назад

Related: GitHub Developer Advocate Demo 2025 - <a href="https://www.youtube.com/watch?v=KqWUsKp5tmo&t=403s" rel="nofollow">https://www.youtube.com/watch?v=KqWUsKp5tmo&t=403s</a>The timestamp is the moment where one of these coding agents fails live on stage with what is one of the simplest tasks you could possibly do in React, importing a Modal component and having it get triggered on a button click. Followed by blatant gaslighting and lying by the host - "It stuck to the style and coding standards I wanted it to", when the import doesn't even match the other imports which are path aliases rather than relative imports. Then, the greatest statement ever, "I don't have time to debug, but I am pretty sure it is implemented."Mind you, it's writing React - a framework that is most definitely over-represented in its training data and from which it has a trillion examples to stea- I mean, "borrow inspiration" from.

xyst4 дня назад

llms are already very expensive to run on a per query basis. Now it’s being asked to run on massive codebases and attempt to fix issues.Spending massive amounts of:- energy to process these queries- wasting time of mid-level and senior engineers to vibe code with copilot to ensure train and get it rightWe are facing a climate change crisis and we continue to burn energy at useless initiatives so executives at big corporation can announce in quarterly shareholder meetings: "wE uSe Ai, wE aRe tHe FuTuRe, lAbOr fOrCe rEdUceD"

mark-r3 дня назад

My favorite comment:> But on the other hand I think it won't create terminators. Just some silly roombas.I watched a roomba try to find its way back to base the other day. The base was against a wall. The roomba kept running into the wall about a foot away from the base, because it kept insisting on approaching from a specific angle. Finally gave up after about 3 tries.

zb34 дня назад

I tried to search all PRs submitted by copilot and I came up with this indirect way: <a href="https://github.com/search?q=%22You+can+make+Copilot+smarter+by+setting+up+custom+instructions%22&type=pullrequests">https://github.com/search?q=%22You+can+make+Copilot+smarter+...</a>Is there a more direct way? Filtering PRs in the repo by copilot as the author seems currently broken..

bwfan1234 дня назад

What do you call a code change created by co-pilot ?A Bull Request

insin4 дня назад

Look at this poor dev, an entire workday's worth of hours into babysitting this PR, still having to say "fix whitespace":<a href="https://github.com/dotnet/runtime/pull/115826">https://github.com/dotnet/runtime/pull/115826</a>

评论 #44060426 未加载

nottorp4 дня назад

So, to achieve parity, they should allow humans to also commit code without checking that it at least compiles, right?Or MS already does that?

评论 #44050663 未加载

vbezhenar4 дня назад

Why bot left work when tests are failing? Looks like incomplete implementation. It should work until all tests are green.

caleblloyd3 дня назад

Maybe funny now but once (if?) it can eventually contribute meaningfully to dotnet/runtime, AI will probably be laughing at us because that is the pinnacle of a massive enterprise project.

teleforce4 дня назад

>I can't help enjoying some good schadenfreudeFun facts schadenfreude: the emotional experience of pleasure in response to another’s misfortune, according to Encyclopedia Britannica.Word that's so nasty in meaning that it apparently does not exist except in German language.

评论 #44050690 未加载

评论 #44050602 未加载

评论 #44060145 未加载

评论 #44056246 未加载

snickerbockers4 дня назад

It's pretty cringe and highlights how inept LLMs being shoehorned into positions where they don't belong wastes more company time than it saves, but aren't all the people interjecting themselves into somebody else's github conversations the ones truly being driven insane here? The devs in the issue aren't blinking torture like everybody thinks they are. It's one thing to link to the issue so we can all point and laugh but when you add yourself to a conversation on somebody else's project and derail a bug report it with your own personal belief systems you're doing the same thing the LLM is supposedly doing.Anyways I'm disappointed the LLM has yet to discover the optimal strategy, which is to only ever send in PRs that fix minor mis-spellings and improper or "passive" semantics in the README file so you can pad out your resume with all the "experience" you have "working" as a "developer" pm Linux, Mozilla, LLVM, DOOM (bonus points if you can successfully become a "developer" on a project that has not had any official updates since before you born!), Dolphin, MAME, Apache, MySQL, GNOME, KDE, emacs, OpenSSH, random stranger's implementation of conway's game of life he hasn't updated or thought about since he made it over the course of a single afternoon back during the obama administration, etc.

评论 #44058142 未加载

shultays4 дня назад

<a href="https://github.com/dotnet/runtime/pull/115733">https://github.com/dotnet/runtime/pull/115733</a><pre><code> @copilot please remove all tests and start again writing fresh tests.</code></pre>

-__---____-ZXyw3 дня назад

Have people seen this?<a href="https://noazureforapartheid.com/" rel="nofollow">https://noazureforapartheid.com/</a>

ainiriand4 дня назад

So this is our profession now?

amai3 дня назад

Microsoft is just really following the "fail fast, fail often " paradigm here. Whether they are learning from their mistakes is another story.

OzzyB4 дня назад

_this_ is the Judgement Day we were warned about--not in the nuclear annihilation sense--but the "AI was then let loose on all our codez and the systems went down" sensecrazy times...

rmnclmnt4 дня назад

Again, very « Silicon Valley »-esque, loving it. Thanks Gilfoyle

ramesh314 дня назад

The Github based solutions are missing the mark because we still need a human in the loop no matter what. Things are nowhere near the point of being able to just let something push to production. And if you still need a human in the loop, it is far more efficient to have them giving feedback in realtime, i.e. in an IDE with CLI access and the ability to run tests, where the dev is still ultimately responsible for making the PR. Management class is salivating at the thought of getting rid of engineers, hence all of this nonsense, but it seems they're still stuck with us for now.

whimsicalism4 дня назад

kinda sad to see y'all brigading an OSS project, regardless of what you think of AI

评论 #44051980 未加载

jeswin4 дня назад

I find it amusing that people (even here on HN) are expecting a brand new tool (among the most complex ever) to perform adequetely right off the bat. It will require a period of refinement, just as any other tool or process.

评论 #44050710 未加载

评论 #44050608 未加载

评论 #44050656 未加载

评论 #44051176 未加载

评论 #44050795 未加载

评论 #44054147 未加载

评论 #44050681 未加载

aiinnyc4 дня назад

it feels like the classic solution to this is to have another LLM review the PR and loop until the PR meets a minimum acceptance bar.

blitzar4 дня назад

Needs more bots.

markus_zhang4 дня назад

Clumsy but this might be the future -- humans adjusting to AI workflow, not the other way. Much easier (for AI developers).

wyett4 дня назад

We wanted a future where AIs read boring text and we wrote interesting stuff. Instead, we got…

bonoboTP4 дня назад

Fixing existing bugs left in the codebase by humans will necessarily be harder than writing new code for new features. A bug can be really hairy to untangle, given that even the human engineer got it wrong. So it's not surprising that this proves to be tough for AI.For refactoring and extending good, working code, AI is much more useful.We are at a stage where AI should only be used for giving suggestions to a human in the driver's seat with a UI/UX that allows ergonomically guiding the AI, picking from offered alternatives, giving directions on a fairly micro level that is still above editing the code character by character.They are indeed overpromising and pushing AI beyond its current limits for hype reasons, but this doesn't mean this won't be possible in the future. The progress is real, and I wouldn't bet on it taking a sharp turn and flattening.