The first place I usually go is the terms of service, to see what rights they're granting themselves. I'm not excited about how broad this is: "3.2 License: By using the Services, you hereby grant to Cognition, its affiliates, successors, and assigns a non-exclusive, worldwide, royalty-free, fully paid, sublicensable, transferable license to reproduce, distribute, modify, and otherwise use, display, and perform all acts with respect to the Customer Data as may be necessary for Cognition to provide the Services to you."
No public testing, no benchmarks, no clear information on context window size or restrictions on extensive use, no comparison with the newest Claude 3.5 Sonnet or o1, nothing.<p>What we do get is a price of $500 per month from a company that has been caught lying about this very product [0] and has never allowed independent testing.<p>Cognition, I am sorry to tell you, but there is no reason to trust you. In fact, there are multiple good reasons not to, even if you offered Devin at a fraction of the price.<p>If this were, e.g., Anthropic launching a new beyond-Opus-size model that was still performant and came with "chain-of-thought" capabilities, a far more extensive context window that still fully passes needle-in-a-haystack tests, is absolutely solid at sourcing from provided files, keeps on track even when given large documents, has few or no restrictions on usage, and comes with extensive, verifiable benchmarks showing the offering to be a significant upgrade over other models, maybe such a price could be justified.<p>You know why, Cognition? Because they haven't actively lied. What they did instead was let people use their models and actually test the advantages. Even Claude Instant, way back when, had certain use cases that gave it its own niche and showed Anthropic could execute, before expanding with Claude 2 and the larger context window, then Claude 3 with more applications. You never did any of that; you never gave anyone reason to believe what you claim; you didn't even release benchmarks. See the difference?<p>Seems more like a simple cash grab, attempting to ride the o1 wave. OpenAI has a hard time justifying its Pro pricing; you doubling that makes this an out-of-season April Fools' joke. Waiting for the inevitable reporting that this is just another API wrapper for Claude or ChatGPT with our old faithful RAG.<p>[0] <a href="https://www.youtube.com/watch?v=tNmgmwEtoWE&pp=ygUJZGV2aW4gYWkg" rel="nofollow">https://www.youtube.com/watch?v=tNmgmwEtoWE&pp=ygUJZGV2aW4gY...</a>
From the second video: "We can focus on the things that excite us rather than just the maintenancing [maintenance] work".<p>But these are the kinds of problems that help shape the product. The software architecture should be a compression of a deep and intuitive understanding of the problem space. How can you develop that knowledge if you're just delegating it to a black box that can't operate at a near-human level?<p>I've used AI-based tools to great success, but on an ad-hoc basis, for specific and small functions or modules. Doing the integration part requires an understanding of which abstraction is appropriate where. I don't think these tools are good at that.
Mike from Vesta (first demo video) claims Devin saved "at least a hundred hours" debugging API integrations. That seems crazy to me - API integrations rarely take that long, and any engineer would spot issues like wrong API keys almost immediately. The tool might be more valuable for non-engineers creating initial drafts, but by the time you've written all the detailed specs for Devin, a mid-level engineer could have made significant progress on the task.
Looking for comprehensive benchmarks with Devin vs Cursor + Claude 3.6 vs ChatGPT o1 Pro.<p>In my own experience using Cursor with Claude 3.5 Sonnet (new) and o1-preview, Claude is sufficient for most things, but there are times when Claude gets stumped. Invariably that means I asked it to do too much. But sometimes, maybe 10-20% of the time, o1-preview is able to do what Claude couldn’t.<p>I haven’t signed up for o1 Pro because going from Cursor to copy/pasting from ChatGPT is a big DevX downgrade. But from what I’ve heard o1 Pro can solve harder coding problems that would stump Claude or o1-preview.<p>My solution is just to split the problem into smaller chunks that make it tractable for Claude. I assume this is what Devin’s doing. Or is Devin using custom models or an early version of the o1 (full or pro) API?
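For what it's worth, the "split the problem into smaller chunks" workflow can be sketched roughly as below. This is only an illustration of the idea, assuming a hypothetical `ask_llm` callable; it says nothing about what Devin actually does internally.

```python
# Sketch of manually decomposing a coding task into subtasks and
# feeding them to a model one at a time, carrying prior answers
# forward as context. `ask_llm` is a stand-in for any LLM API call.
def solve_in_chunks(subtasks, ask_llm):
    """Solve each subtask in order, accumulating a running context."""
    context = []
    for task in subtasks:
        prompt = "\n".join(context + [f"Next step: {task}"])
        answer = ask_llm(prompt)
        context.append(f"{task} -> {answer}")
    return context
```

The point is just that each individual prompt stays small enough for the model to handle, even when the overall task would stump it.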
This should have come with a prominent warning on the app site that you're heading toward a $500/month subscription. I'm sure it's mentioned in places I didn't see. Ideally, you would agree to the sub before you even create an account. This could save LOADS of signups from people who aren't your intended users.
I'm curious to see how this plays out when it comes to deploying and maintaining production-grade apps. I know relatively little about infrastructure and DevOps, but that's the stuff that always seems genuinely complicated when going from MVP to production. This question feels particularly important if we're expecting PMs and designers to be primary users.<p>That said, I'm super excited about this space and love seeing smart folks putting energy into this. Even if it's still a bit aspirational, I think the idea of cutting down time spent debugging and refactoring and putting more power in the hands of less technical folks is awesome.
It seems like a lot of the magic is providing LLMs with tools that let them work like a human would. This approach makes more sense to me than the model of expecting an LLM to just emit a giant block of code for a change, given a pile of RAG context.<p>(Removed my pricing question, as I missed that it's $500/month for whole teams. I get why that's the pricing, but it sadly doesn't work for me to try it on side projects.)
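The tool-use pattern is essentially a loop: the model either answers in plain text or asks for a tool to be run, and the tool's output becomes the next observation. A minimal sketch, with a stubbed-out tool and a made-up JSON convention (real agents like Devin are obviously far more elaborate):

```python
# Minimal tool-dispatch step for an LLM agent loop. The JSON message
# format and run_shell stub are illustrative assumptions, not any
# product's real protocol.
import json

def run_shell(cmd: str) -> str:
    """Stub tool: a real agent would run `cmd` in a sandbox."""
    return f"(pretend output of: {cmd})"

TOOLS = {"run_shell": run_shell}

def agent_step(model_reply: str):
    """If the model's reply is a tool request, run it and return the
    observation to feed back; otherwise return None (final answer)."""
    try:
        msg = json.loads(model_reply)
    except json.JSONDecodeError:
        return None  # plain-text reply, loop is done
    tool = TOOLS.get(msg.get("tool"))
    return tool(msg.get("arg", "")) if tool else None
```

An orchestrator would call `agent_step` in a loop, appending each observation to the conversation until the model stops requesting tools.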
Am I the only one who laments this trend of using a common first name as a product name? When I see this, my first reaction is that the company lacks any empathy for people who have the name they're co-opting.<p><a href="https://www.washingtonpost.com/technology/interactive/2021/people-named-alexa-name-change-amazon/" rel="nofollow">https://www.washingtonpost.com/technology/interactive/2021/p...</a><p><a href="https://archive.is/w8r58" rel="nofollow">https://archive.is/w8r58</a>
> Small frontend bugs and edge cases - tag Devin in Slack threads<p>And other points where it should shine. How does it compare to using Cursor? Is it the slack integration?
How does Devin compare to lovable.dev? I've been thoroughly impressed by their ability to build and host functioning apps from very basic prompts.