I’m self-hosting with TabbyML, running StarCoder 3B on my Nvidia RTX 2080 Super, and I can’t imagine coding without it anymore. It consistently gives me great completions across all the languages I work in.<p>To the people in this thread not having good success: I’d suggest trying again.
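If you want to poke at it outside the editor, here's a minimal sketch of querying a local Tabby server directly. The request shape follows Tabby's documented /v1/completions endpoint at the time of writing, but check your version's API docs; the prefix/suffix payload is just illustrative:

    # Ask a locally running Tabby server (default port 8080) for a completion.
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/completions",
        json={
            "language": "python",
            "segments": {
                "prefix": "def fibonacci(n: int) -> int:\n    ",  # code before the cursor
                "suffix": "\n",                                   # code after the cursor
            },
        },
        timeout=30,
    )
    resp.raise_for_status()
    for choice in resp.json().get("choices", []):
        print(choice["text"])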
Went through this same exercise this week and came to the same conclusion.<p>After trying multiple open models, reconfiguring my setup to use GPT-4o and seeing the speed and quality of its output was illuminating.
I also wanted to try some local LLMs, but gave up and came to the same conclusion:<p>"While the idea of having a personal and private instance of a code assistant is interesting (and can also be the only available option in certain environments), the reality is that achieving the same level of performance as GitHub Copilot is quite challenging."<p>But considering the pace at which AI and the ecosystem advance, things might change soon.
I believe we'll need a purpose-built ASIC with access to 100 GB of good old [G]DDR5 before this becomes viable. Something like what Hailo offers, but without the "product inquiry" barrier.<p>I say that because a single user doesn't need datacenter speeds, but there is no getting around the memory requirements.<p>I don't think it will happen. The market is too niche; people are happy to fork over $5/mo.
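To put rough numbers on the memory point, here's a back-of-envelope sketch; the ~20% overhead factor for KV cache and activations is an assumption, not a measurement:

    # Rough VRAM estimate: weights * bytes-per-weight, plus assumed ~20% overhead.
    def vram_gb(params_billions: float, bytes_per_weight: float, overhead: float = 1.2) -> float:
        # params_billions * bytes cancels out to GB directly (1e9 params / 1e9 bytes-per-GB)
        return params_billions * bytes_per_weight * overhead

    for name, params, width in [
        ("3B model @ fp16", 3, 2.0),     # ~7.2 GB: fits an 8 GB consumer card
        ("3B model @ 4-bit", 3, 0.5),    # ~1.8 GB
        ("70B model @ 4-bit", 70, 0.5),  # ~42 GB: already past any consumer GPU
    ]:
        print(f"{name}: ~{vram_gb(params, width):.1f} GB")

Which is why a 3B completion model runs fine on an 8 GB card, while anything approaching hosted-model quality is where a figure like 100 GB comes in.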
It really depends on the use case, and right now using Ollama for coding just isn’t that useful. I can use gemma2 and phi3 just fine for general summarization and keyword extraction (including most of the stuff I need to do home automation with a “better Siri”—low bar, I know), but generating or autocompleting code is just another level entirely.
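For the summarization side, here's a minimal sketch against Ollama's local REST API (default port 11434); it assumes you've already pulled gemma2, and the model name and prompt are illustrative:

    # Summarize text with a local Ollama model via its REST API.
    import requests

    def summarize(text: str, model: str = "gemma2") -> str:
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={
                "model": model,
                "prompt": f"Summarize this in two sentences:\n\n{text}",
                "stream": False,  # return a single JSON object, not a token stream
            },
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]

    print(summarize("Paste the long article text here..."))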