Show HN: Tabby – A self-hosted GitHub Copilot

627 points by wsxiaoys, about 2 years ago

I would like to introduce Tabby, a self-hosted alternative to GitHub Copilot that you can run on your own hardware. While GitHub Copilot has made coding more efficient and less time-consuming by assisting developers with suggestions and code completion, it raises concerns around privacy and security.

Tabby is in its early stages, and we are excited to receive feedback from the community.

Its GitHub repository is located here: https://github.com/TabbyML/tabby

We have also deployed the latest Docker image to Hugging Face for a live demo: https://huggingface.co/spaces/TabbyML/tabby

Tabby is built on top of the popular Hugging Face Transformers / Triton FasterTransformer backend and is designed to be self-hosted, providing you with complete control over your data and privacy. In Tabby's next feature iteration, you will be able to fine-tune the model to meet your project requirements.

29 comments

DanHulton, about 2 years ago

Copilot has so far been pretty useful to me as a "sometimes smarter intellisense". It'll frequently correctly guess the arguments I want for a function (and their types), and every once in a while I'll type `a.map(` and it'll auto-fill the transformation code I was planning on writing.

The simpler the task I'm trying to do, the better chance it has of being correct, but that's also the part where I feel I get the most benefit from it, because I already thoroughly understand exactly what I'm writing, why I'm writing it, and what it needs to look like, and Copilot sometimes saves me the 5-30s it takes to write it. Over a day, that adds up and I can move marginally faster.

It's definitely not a 100x improvement (or even a 10x improvement), but I'm glad to have it.

If this works as well, locally, to escape the privacy issue, I'll be thrilled. Checking it out.

the42thdoctor, about 2 years ago

If not for the wonderful product itself, take my upvote for the "hire" page. It's the first time I've seen a company present a formula where you can input your data and figure out how much you would get paid. Nice!

https://tabbyml.notion.site/Compensation-Sheet-ad61218889ab47249fe35d85fc3c0586

vikp, about 2 years ago

How does this compare to fauxpilot (https://github.com/fauxpilot/fauxpilot)? Fauxpilot also uses Triton with FasterTransformer and GPT-J style models (CodeGen).

uglycoyote, about 2 years ago

I'm confused about the premise here. The power of self-hosting such a thing is presumably that you would be able to train it on your own company's codebase as a corpus of examples to help other people in the company know how to navigate the specifics of your codebase.

But there's nothing in the introductory materials about how to train this thing.

funnyfoobar, about 2 years ago

I think that this is a really cool project.

Personally, I work at a regulated financial institution, and the legal and cyber teams are still evaluating whether any problems could arise with Copilot.

Given the way Europe is heading towards AI regulation (Italy has already banned ChatGPT), it seems likely that Copilot will come under threat in Europe as well.

So these kinds of solutions would make a ton of sense wrt organisations adopting this.

However, as a developer who has used Copilot before, here are my 2 cents:

Copilot makes a lot of sense wrt auto-completion and code generation because it understands the current context of the code.

It would create friction for developers if we expect them to use the user interface this project comes with.

It would be great for this project to go in the following direction:

- Ability for developers to train the model on our custom projects, so it can give suggestions wrt our style of coding
- Extensions for popular editors like Vim, VS Code, IntelliJ, etc.

Happy to share more info if needed :)

mska, about 2 years ago

Out of curiosity, how do companies offering self-hosted/on-prem solutions monetize their offerings?

Do they rely on legal contracts to prevent customers from using the software for free or modifying it for their own purposes?

nmstoker, about 2 years ago

Is this serious? There's very little to go on in the repo to establish how well thought through this is.

I don't want to mark them down for poor language skills, but the style of the comments on the TabbyML GitHub profile suggests a rather casual approach, and when combined with a lack of any serious documentation or even basic details beyond a sketched architecture diagram, I kind of wonder... Is there any particular context others can point to that I may be overlooking?

Nic0, about 2 years ago

Just in case you didn't know, a project called Tabby already exists (it was formerly called Terminus). It's a terminal (yet another one, you could say). It's not my project, I'm just a user.

https://tabby.sh/

nathancahill, about 2 years ago

This looks exactly like what I need for a project I've been working on. How do you get your own code into the model? Or is that the future fine-tuning step you're talking about?

simonw, about 2 years ago

Anyone know of a quick workaround for this?

    % docker run \
        -it --rm \
        -v ./data:/data \
        -v ./data/hf_cache:/home/app/.cache/huggingface \
        -p 5000:5000 \
        -e MODEL_NAME=TabbyML/J-350M \
        tabbyml/tabby
    Unable to find image 'tabbyml/tabby:latest' locally
    latest: Pulling from tabbyml/tabby
    docker: no matching manifest for linux/arm64/v8 in the manifest list entries.
    See 'docker run --help'.

I have an M2 Mac. I believe Docker is capable of running images compiled for different architectures using QEMU-style workarounds, but is that something I can do with a one-liner, or would I need to build a new image from scratch?

Previous experiments with Docker and QEMU: https://til.simonwillison.net/docker/emulate-s390x-with-qemu

shaunxcode, about 2 years ago

Do any of the authors of these models disclose a clear list of codebases/discussion sites these models are trained on? Are they at the least avoiding stuff clearly GNU (or such) licensed? Is the hope that there is just no way to figure it out?

wongarsu, about 2 years ago

Trying the demo, I got "value is not a valid enumeration member; permitted: 'unknown', 'python', 'javascript'". Trying some clearly identifiable Rust code with language set to 'unknown', I got a completion in Java back. The completion made sense, was properly indented and syntactically correct; it was just in the wrong programming language.

Is this a limitation of the hosted demo or the chosen model, or do I simply have to wait a bit until my favorite niche language is supported?

syntaxing, about 2 years ago

What sort of resources do we need to run this, particularly VRAM? Also, how does this compare to Fauxpilot?

myin, about 2 years ago

Great to have a self-hosted solution, both for data privacy and for the potential to improve quality.

akrymski, about 2 years ago

What LLM is this using? Or did you train your own?

GartzenDeHaes, about 2 years ago

After using GitHub Copilot for a couple of weeks, it doesn't seem to do much other than Stack Overflow/blog copy-paste. Does anyone get much else from it when writing non-boilerplate, non-tutorial, and non-Stack Overflow types of code?

jslakro, about 2 years ago

Considering it's an alpha version, I think a VS Code extension is the missing part.

boringuser2, about 2 years ago

It's kinda funny that the same people knocking GPT-3 Copilot are going to be losing their absolute minds when Copilot X drops.

GPT-4 can autogenerate most code a business needs; you just need a lone engineer to keep it in check.

ROFISH, about 2 years ago

Does this come pre-trained, or does it require training? What sources of data are you training on if so?

I restrict all usage of AI tools trained on publicly sourced data because of unknown copyright restrictions, general unease, and lawsuits; however, if this can be trained solely on my own codebases that are of clean provenance, I can be 100% guaranteed against potential lawsuits.

Copilot is a cool tool, but super scary from a legal perspective. And even more heavily regulated industries (that I'm not in) would absolutely need their own firewalled version.

hddqsb, about 2 years ago

I'm glad to see projects like this.

I had a look at the demo (https://huggingface.co/spaces/TabbyML/tabby) and wasn't too impressed with the generated code for the default sample prompt (binary search) -- it recurses infinitely if the item is missing. It would be interesting to compare with Copilot's output. No idea how one would go about fixing this (other than manually adding a correct binary search implementation to the training data, which feels like cheating).

Request to https://tabbyml-tabby.hf.space/v1/completions:

    {
      "language": "python",
      "prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
    }

Response:

    {
      "id": "cmpl-...",
      "created": 1680867355,
      "choices": [
        {
          "index": 0,
          "text": " right) >> 1\n if x < arr[mid]:\n return binarySearch(arr, left, mid - 1, x)\n elif x > arr[mid]:\n return binarySearch(arr, mid + 1, right, x)\n else:\n return mid"
        }
      ]
    }

Formatted code:

    def binarySearch(arr, left, right, x):
        mid = (left + right) >> 1
        if x < arr[mid]:
            return binarySearch(arr, left, mid - 1, x)
        elif x > arr[mid]:
            return binarySearch(arr, mid + 1, right, x)
        else:
            return mid

Manually written test cases:

    arr = [1, 3, 5, 7]
    print(binarySearch(arr, 0, len(arr), 5))  # 2 (correct)
    print(binarySearch(arr, 0, len(arr), 4))  # RecursionError

Runnable demo: https://tio.run/##lY/BCoJAEIbvPsUPXVw0yCyCKG@9QB3Fg@maC7bKuII@vc1qdJGg5rIL83/fzDSDKWsdjmMuC9yVTmm4yZSy0k2JfFSyMD5IPUp@enF0wPVUOc5wbQ/e3BSIIgRTVxXocQLjMQeTGbFF0nSkvw6x1jUCO2ZCZDWZon9M1uFZx2fjt6qVC5qzjsMUnxIzEfrY@zgkTkNKG3fp3thFtf0LjgqBFbZws5pIZkb8hu0m7CqzjlpV6wtRTeP4Ag
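For reference, here is a minimal sketch of one way to handle the missing-item case described above: add an explicit empty-range base case. This is an illustration only, not output from Tabby or Copilot. It assumes inclusive bounds, so the initial call passes `len(arr) - 1` instead of `len(arr)`; the name `binary_search` and the `-1` "not found" return value are choices made for this example.

```python
def binary_search(arr, left, right, x):
    """Return the index of x in sorted arr[left..right] (inclusive), or -1."""
    # Base case that the generated code lacks: an empty range means x is absent.
    if left > right:
        return -1
    mid = (left + right) >> 1
    if x < arr[mid]:
        return binary_search(arr, left, mid - 1, x)
    elif x > arr[mid]:
        return binary_search(arr, mid + 1, right, x)
    else:
        return mid


arr = [1, 3, 5, 7]
print(binary_search(arr, 0, len(arr) - 1, 5))  # 2
print(binary_search(arr, 0, len(arr) - 1, 4))  # -1
```

With inclusive bounds, every recursive call shrinks the range by at least one element, so the recursion ends once `left` passes `right`.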

covi, about 2 years ago

This is awesome, and glad to see that SkyPilot is useful in distributing Tabby on any cloud! https://github.com/TabbyML/tabby/blob/main/deployment/skypilot/README.md

unosama, about 2 years ago

I like to use my brain as my copilot

moonchrome, about 2 years ago

Anyone dare to guess how fast GH Copilot would be if it ran locally?

My main problem with Copilot is latency/speed - I would shell out for a 4090 if it meant I could use a local Copilot model that's super fast/low latency/explores deep suggestions.

arco1991, about 2 years ago

Very interesting to have a self-hosted version of Copilot. Will definitely try this out.

xvilka, about 2 years ago

I hope someone comes up with AI-powered refactoring. I think it has more practical applications than Copilot, at least judging from my test runs of it in C and C++.

grudg3, about 2 years ago

Hi, will this work on AMD GPUs? I have plenty of VRAM available.

accelbred, about 2 years ago

Is AMD GPU support on the roadmap?

I'm interested in this stuff but I won't be buying Nvidia cards.

zoba, about 2 years ago

My assumption is that this would not be fast enough for practical use on M1/M2 MacBooks. Is that correct?

lfkdev, about 2 years ago

This exact name is already used by a big open-source project, https://github.com/eugeny/tabby (50k stars). Maybe consider changing the name for better SEO.
