DeepSeek Coder: Let the Code Write Itself

208 points by fintechie, over 1 year ago

8 comments

rickstanley over 1 year ago
Hello, I would like to take this opportunity to ask for help here about using A.I. with my own codebase.

Context: I missed [almost] the entire A.I. wave, but I knew that one day I would have to learn something about it and/or use it. That day has come. I'm on a team that is migrating to another engine, let's say "engine A → engine B". Looking from the perspective of A, we map the entries for B (inbound), and after the request to B returns, we map back to A's model (outbound). This is a chore, and much of the work is repetitive, but it comes with edge cases we need to look out for, and unfortunately there isn't a solid foundation of patterns apart from the Domain-driven design (DDD) thing. It seemed like a good use case for an A.I.

Attempts: I began by asking ChatGPT and Bard questions like "how to train an LLM on my own codebase" and "how to get started with prompt engineering using my own codebase".

I concluded that fine-tuning large models is expensive and unrealistic for my RTX 3060 with 6 GB of VRAM, no surprise there; so I searched here on Hacker News for keywords like "llama", "fine-tuning", "local machine", etc., and found out about ollama and DeepSeek.

I tried both ollama and DeepSeek; the former was slow, but not as slow as the latter, which was dead slow using a 13B model. I tried a 6/7B model (I think it was codellama) and got reasonable results and speed. After feeding it some data, I was about to try training on the codebase when a friend suggested I use Retrieval-Augmented Generation (RAG) instead, with a LangChain + Ollama setup. I have yet to try it.

Any thoughts, suggestions or experiences to share? I'd appreciate it.
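For reference, a retrieval-augmented setup over a codebase with LangChain + Ollama usually looks something like the sketch below. The model name, file glob, and chunk sizes are illustrative assumptions, not recommendations, and nothing here is specific to the poster's project.

```python
# Minimal RAG sketch over a local codebase with LangChain + Ollama.
# Assumes the ollama daemon is running, a code model has been pulled
# (e.g. `ollama pull codellama:7b`), and chromadb is installed.
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import Ollama
from langchain.chains import RetrievalQA

# 1. Load and chunk the source files (glob pattern and chunk sizes are placeholders).
docs = DirectoryLoader("path/to/codebase", glob="**/*.cs", loader_cls=TextLoader).load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# 2. Embed the chunks and index them in a local vector store.
store = Chroma.from_documents(
    chunks,
    OllamaEmbeddings(model="codellama:7b"),
    persist_directory=".rag_index",
)

# 3. Retrieve the most relevant chunks for a question and let the LLM answer from them.
qa = RetrievalQA.from_chain_type(
    llm=Ollama(model="codellama:7b"),
    retriever=store.as_retriever(search_kwargs={"k": 4}),
)
print(qa.run("How are engine A's entries mapped to engine B's inbound model?"))
```

The appeal over fine-tuning is that nothing is trained: the codebase is only embedded once and retrieved at query time, which fits a 6 GB GPU.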
_boffin_ over 1 year ago
Been using DeepSeek Coder 33B Q8 on my work laptop for a bit now. I like it, but am still finding myself going to GPT-4's API for the more nuanced things.

They just released a v1.5 (https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5), but for some reason they reduced the context length from ~16k to ~4k.
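As an aside, one quick way to compare the advertised context windows is to read each checkpoint's config from Hugging Face. A small sketch, assuming `transformers` is installed; the repo id of the older checkpoint is an assumption and may differ:

```python
from transformers import AutoConfig

# Compare the configured context window of the two checkpoints.
# "deepseek-coder-6.7b-instruct" is assumed to be the pre-v1.5 repo id.
for repo in (
    "deepseek-ai/deepseek-coder-6.7b-instruct",
    "deepseek-ai/deepseek-coder-7b-instruct-v1.5",
):
    cfg = AutoConfig.from_pretrained(repo)
    print(repo, getattr(cfg, "max_position_embeddings", "n/a"))
```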
sestinj over 1 year ago
We've been playing with the 1.3b model for continue.dev's autocomplete and it's quite impressive. One unclear part is whether the license really permits commercial usage, but regardless it's exciting to see the construction of more complex datasets. They mention that training on multiple tasks (FIM + normal completion) improves performance... wonder whether training to output diffs would be equally helpful (this is the holy grail needed to generate changes in O(diff length) time).
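For anyone unfamiliar with the term, fill-in-the-middle (FIM) training builds examples by cutting a hole out of a file and asking the model to reconstruct it from the surrounding code. A rough sketch of the data construction follows; the sentinel strings are placeholders, since each model family defines its own special FIM tokens:

```python
# Sketch of building a fill-in-the-middle (FIM) training example.
# The sentinel strings are placeholders, not any model's real vocabulary.
PREFIX, SUFFIX, MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def make_fim_example(code: str, hole_start: int, hole_end: int) -> dict:
    """Cut [hole_start, hole_end) out of `code` and format a PSM-style prompt."""
    prefix, middle, suffix = code[:hole_start], code[hole_start:hole_end], code[hole_end:]
    # The model sees prefix and suffix, then learns to emit the missing middle.
    prompt = f"{PREFIX}{prefix}{SUFFIX}{suffix}{MIDDLE}"
    return {"prompt": prompt, "completion": middle}

code = "def add(a, b):\n    return a + b\n"
start = code.index("return")
example = make_fim_example(code, start, start + len("return a + b"))
```

Training to emit diffs instead of whole completions would need a similar dataset where the target is a patch rather than the raw middle span.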
elwebmaster over 1 year ago
Mixtral > Codellama > DeepSeek Coder. Very weird model: it writes super long comments on one line, and it's definitely not at the level of Codellama, benchmarks be damned.
Havoc over 1 year ago
I've been using their 7B with TabbyML.

Works well, but it's closer to a very smart code complete than to generating novel blocks of code.
chii over 1 year ago
Just tried it by asking how to create a turn-based game using an ECS system, and how to add a decision tree and a save/load system, in the language Haxe.

It outputs relatively correct Haxe code, but it did hallucinate that there is a library called 'haxe-tiled' to read TMX map files...
hackerlight over 1 year ago
In the benchmarks, are they using base GPT-4, or a GPT like Grimoire that will be better at coding? If they aren't using Grimoire, isn't it unfair to compare their fine-tuned model to base GPT-4?
byyoung3 over 1 year ago
Looks like Code Llama 70B outperforms it on HumanEval, I believe.