Hello, I would like to take this opportunity to ask for help here with using A.I. on my own codebase.

Context:
I missed [almost] the entire A.I. wave, but I knew that one day I would have to learn about it and/or use it. That day has come. I'm allocated to a team that is migrating to another engine, let's say "engine A → engine B". Working from A's perspective, we map the entries into B's model (inbound), and after B's response is returned, we map it back into A's model (outbound). This is a chore, and much of the work is repetitive, but it comes with edge cases we need to look out for, and unfortunately there isn't a solid foundation of patterns apart from Domain-driven design (DDD). It seemed like a good use case for an A.I. (sketch 1 at the end of this post shows the kind of mapping I mean).

Attempts: I began by asking ChatGPT and Bard questions similar to "how to train LLM on own codebase" and "how to get started with prompt engineering using own codebase".

I concluded that fine-tuning is expensive for large models and unrealistic for my RTX 3060 with 6 GB of VRAM; no surprise there. So I searched here on Hacker News for keywords like "llama", "fine-tuning", "local machine", etc., and found out about ollama and DeepSeek.

I tried both ollama and DeepSeek; the former was slow, but not as slow as the latter, which was *dead slow* using a 13B model. I then tried a 6/7B model (I think it was codellama) and got reasonable results and speed (sketch 2 shows one way to query a local ollama server). After feeding it some data, I was on my way to trying to train on the codebase when a friend of mine suggested that I use Retrieval-Augmented Generation (RAG) instead, with a LangChain + Ollama setup; I have yet to try it (sketch 3).

Any thoughts, suggestions or experiences to share?

I'd appreciate it.
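
P.S. To make the question concrete, here are three sketches of what I have in mind. Sketch 1 is the kind of inbound/outbound mapping we currently do by hand; all type and field names here are hypothetical:

    # Sketch 1: inbound/outbound mapping between two engines.
    # EngineARequest, EngineBRequest, etc. are made-up placeholder models.
    from dataclasses import dataclass

    @dataclass
    class EngineARequest:
        customer_id: str
        amount_cents: int      # engine A stores money as integer cents

    @dataclass
    class EngineBRequest:
        client_ref: str
        amount: float          # engine B expects decimal units

    def map_inbound(a: EngineARequest) -> EngineBRequest:
        # The edge cases live here: renamed fields, unit conversions, defaults.
        return EngineBRequest(client_ref=a.customer_id,
                              amount=a.amount_cents / 100)

    @dataclass
    class EngineBResponse:
        client_ref: str
        status: str            # e.g. "OK" / "DECLINED" in B's vocabulary

    @dataclass
    class EngineAResponse:
        customer_id: str
        approved: bool         # A models the same outcome as a boolean

    def map_outbound(b: EngineBResponse) -> EngineAResponse:
        # Map B's status vocabulary back onto A's model.
        return EngineAResponse(customer_id=b.client_ref,
                               approved=b.status == "OK")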
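
Sketch 2 is one way to query a local model through ollama's HTTP API (it listens on port 11434 by default; the model name is whatever you've pulled with `ollama pull`):

    # Sketch 2: calling a local ollama server via its /api/generate endpoint.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "codellama",  # any model pulled locally
            "prompt": "Write a function that maps EngineARequest to EngineBRequest.",
            "stream": False,       # one JSON object instead of a token stream
        },
        timeout=300,
    )
    print(resp.json()["response"])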
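
Sketch 3 is the LangChain + Ollama RAG setup my friend suggested, as far as I understand it: index the codebase into a local vector store, retrieve the most relevant chunks per question, and stuff them into the prompt. LangChain's APIs move quickly, so treat this as a rough sketch; the path, glob, chunk sizes and k are placeholders:

    # Sketch 3: RAG over a codebase with LangChain + Ollama + Chroma.
    # Requires: pip install langchain langchain-community chromadb
    from langchain_community.document_loaders import DirectoryLoader, TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain_community.llms import Ollama
    from langchain.chains import RetrievalQA

    # 1. Load the codebase and split it into overlapping chunks.
    docs = DirectoryLoader("path/to/codebase", glob="**/*.java",
                           loader_cls=TextLoader).load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100).split_documents(docs)

    # 2. Embed the chunks and index them in a local vector store.
    db = Chroma.from_documents(chunks, OllamaEmbeddings(model="codellama"))

    # 3. Answer questions with the retrieved chunks as context.
    qa = RetrievalQA.from_chain_type(
        llm=Ollama(model="codellama"),
        retriever=db.as_retriever(search_kwargs={"k": 4}),
    )
    print(qa.invoke({"query": "How do we map engine A entries to engine B?"})["result"])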