> People of every background will soon be able to create code to solve their everyday problems and improve their lives using AI, and we’d like to help make this happen<p>Yeah, this is not going to happen. Anyone who has ever tried to gather requirements for software knows that users don't know what they want (clients especially, lmao). The language they use won't be detailed enough to create anything meaningful. Do you know what language would be? Code. Unironically, the best language for software isn't English; it's code. If you specify what you want in enough detail for it to be meaningful, suddenly you're doing something quite peculiar. Where have I heard that before? Oh yeah: you're writing code.<p>These tools are all annoying AF. Developers don't need half-baked hints to write basic statements, and regular people don't have the skills to cobble together whatever permutations these things spit out. Which raises the question: who the hell is the audience for this?
Its HumanEval metrics seem not particularly good (26.89 Pass@1 for it vs. 61.64 for PanGu-Coder2 15B). Is it targeting a very specific response latency? I'd think a quantized 15B model should run fast enough for most use cases. Even phi-1 at 1.3B has better performance, at 50.6.
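For anyone unfamiliar with the metric: Pass@k is the probability that at least one of k generated samples passes a problem's unit tests, and HumanEval results are usually reported with the unbiased estimator from the Codex paper (n samples per problem, c of which pass). A quick sketch in Ruby; the function name is mine:

```ruby
# Unbiased Pass@k estimator from the Codex paper (Chen et al., 2021):
#   pass@k = 1 - C(n - c, k) / C(n, k)
# where n samples were drawn for a problem and c of them passed the tests.
# Computed in product form to avoid huge binomial coefficients.
def pass_at_k(n, c, k)
  return 1.0 if n - c < k # fewer failing samples than k: some sample must pass

  prod = 1.0
  ((n - c + 1)..n).each { |i| prod *= (i - k).to_f / i }
  1.0 - prod
end
```

With k = 1 this reduces to c / n, i.e. the plain fraction of samples that pass; the reported score is this value averaged over all HumanEval problems.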
> People of every background will soon be able to create code to solve their everyday problems and improve their lives using AI, and we’d like to help make this happen<p>Just like every other time people hyping a technology have made this claim (with something other than "AI" in the blank, but otherwise identical): no, it didn't happen last time, it's not happening this time, and there's a pretty good chance it's not happening next time, either.
Is this a "product" that one could install and use or a model that one should expect an OEM to integrate into a product before programmers can use it? I'm asking because I don't see any links that would help me figure out how to try it out.
Is it good at algos?<p>From interviews:<p>Implement queue that supports three methods:<p>* push<p>* pop<p>* peek(i)<p>peek returns element by its index. All three methods should have O(1) complexity [write code in Ruby].<p>ChatGPT wasn't able to solve that last time I tried <a href="https://twitter.com/romanpushkin/status/1617037136364199938" rel="nofollow noreferrer">https://twitter.com/romanpushkin/status/1617037136364199938</a>
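For what it's worth, one standard answer to that interview question is to back the queue with a growable array plus a head index, so pop never shifts elements. A sketch in Ruby (the language the question asks for; the class and method names are my own, and push is O(1) amortized due to occasional array growth):

```ruby
# Queue with push, pop, and peek(i) all in O(1) (push amortized).
# A moving head index means pop never shifts the underlying array.
class IndexedQueue
  def initialize
    @items = []
    @head = 0
  end

  # Append a value to the back of the queue. O(1) amortized.
  def push(value)
    @items << value
    self
  end

  # Remove and return the front element, or nil if empty. O(1).
  def pop
    return nil if empty?
    value = @items[@head]
    @items[@head] = nil # release the reference so the object can be GC'd
    @head += 1
    value
  end

  # Return the element at logical index i (0 = front) without removing it. O(1).
  def peek(i)
    idx = @head + i
    idx < @items.length ? @items[idx] : nil
  end

  def empty?
    @head >= @items.length
  end
end
```

The tradeoff is that popped slots linger at the front of the array; a production version would periodically compact (slice off the consumed prefix and reset @head), which keeps amortized cost O(1).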
I have thought quite a lot about how these tools can be useful. I have a prompt I can feed ChatGPT that will create whole feature "skeletons" following my naming rules and architecture quirks, saving a lot of the time it takes to get started when building something new. But through chat it is still too inconvenient; having something like this integrated into the IDE via a script would be more convenient, though still a very specific use case.<p>I think what I want is this idea of "code completion", but not for writing the methods, which is the easy part. Instead the tool should structure classes, packages, modules, and naming, and suggest better ways to write certain things.
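As a sketch of that "script instead of chat" idea: keep the skeleton prompt in one place and have a small helper build the request body for a chat-style completions endpoint. The request shape below follows the OpenAI chat API; the rules text, function name, and model string are purely illustrative:

```ruby
require "json"

# Illustrative stand-in for a saved "feature skeleton" prompt with
# project-specific naming rules and architecture quirks.
SKELETON_RULES = <<~RULES
  Generate a feature skeleton only: classes, modules, and method stubs.
  Follow my naming conventions: services end in "Service", queries in "Query".
RULES

# Build the JSON body for an OpenAI-style chat completions endpoint.
# An editor script would POST this and write the returned files to disk.
def skeleton_request(feature_description, model: "gpt-4")
  {
    model: model,
    messages: [
      { role: "system", content: SKELETON_RULES },
      { role: "user", content: "Create the skeleton for: #{feature_description}" }
    ]
  }.to_json
end
```

Bound to an editor keybinding, this turns "feature description in, skeleton out" into a one-step action instead of a copy-paste round trip through a chat window.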
If I’m reading this correctly, this could be an open-source model that may compete with the likes of Copilot?<p>That is something I’d be very interested in if they can get the compute requirements down to those of, say, a standard 13B model. Then I could fine-tune (correct term?) it on my offline data and hook it into something like FauxPilot and my IDE.<p>I had a look at some of the recent code models (WizardCoder, StarCoder, etc.), but it seemed that you need a really large model to be any good, and quite a few of them were targeting Python specifically.
AI cannot magically read minds. Having said that, it would be nicer to have complete solutions rather than code hints. Imagine writing a detailed prompt rather than choosing a prediction, something like "Write a React/Node.js app that has authentication and a home page", and the AI model giving you a complete project as the output. It would be great if it generated deterministic output for a given prompt. AI can really help increase the productivity of programmers.
> ~120,000 code instruction/response pairs in Alpaca format were trained on the base model to achieve this result.<p>Very curious where they are getting this data from. In other open source papers, usually this comes from a GPT-4 output, but presumably Stability would not do that?
Either way, the race to zero has been further accelerated.<p>Stability AI, Apple, Meta, etc. are clearly at the finish line, putting pressure on cloud-only AI models, which cannot raise prices or compete with free.
As a user who cares more about the product: how does it compare to GPT-4's coding capability? GPT-4 is good enough for me; if this works better than GPT-4, I would love to try it!