Simon, I opened an issue on your TIL repo with the pip incantation that I think will get the GPU working.

https://github.com/simonw/til/issues/69

I ran into that previously.
I read "Paperspace" as "paper space" so it reminded of this great article: <a href="http://www.righto.com/2014/09/mining-bitcoin-with-pencil-and-paper.html" rel="nofollow">http://www.righto.com/2014/09/mining-bitcoin-with-pencil-and...</a><p>Could someone do the same with some LLM to demonstrate a very simple example?
We'd love to help you all deploy this!

1. We just released a couple of models that are much smaller (https://huggingface.co/databricks/dolly-v2-6-9b), and these should be much easier to run on commodity hardware in a reasonable amount of time.

2. Regarding this particular issue, I suspect something is wrong with the setup. The example is generating a little over 100 words, which is probably something like 250 tokens. 12 minutes makes no sense for that if you're running on a modern GPU. I'd love to see details on which GPU was selected - I'm not aware of a modern GPU with 30GB of memory (the A10 is 24GB, the T4 is 16GB, and the A100 is 40/80GB). Are you sure you're using a version of PyTorch that installs CUDA correctly?

3. We have seen single-GPU inference work in 8-bit on the A10, so I'd suggest that as a follow-up.
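On point 2, a quick sanity check that the installed PyTorch build actually sees the GPU might look like this (a minimal sketch; the wheel index URL in the comment is just a common example, not necessarily the right one for Paperspace):

    import torch

    # False here usually means a CPU-only torch build was installed.
    # Reinstalling from a CUDA wheel index often fixes it, e.g.
    #   pip install torch --index-url https://download.pytorch.org/whl/cu118
    # (the exact index depends on the machine's CUDA version).
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))
        total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
        print(f"Memory: {total_gb:.1f} GB")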
I wrote a small POC of getting this model working on my box (I felt inspired after reading this). If anybody else wants to try this out, give it a shot here!

https://github.com/lunabrain-ai/dolly-v2-12b-8bit-example

(It's garbage code and should really just be used as a starting POC. I hope it helps!)
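For anyone who just wants the rough shape of the 8-bit route before digging into the repo, a sketch of loading dolly-v2-12b with bitsandbytes quantization might look like the following (this assumes transformers, accelerate, and bitsandbytes are installed; the prompt and generation parameters are illustrative, not taken from the linked repo):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "databricks/dolly-v2-12b"
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # load_in_8bit quantizes the weights via bitsandbytes, so the 12B model
    # fits in roughly half the GPU memory an fp16 load would need.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        load_in_8bit=True,
        torch_dtype=torch.float16,
    )

    prompt = "Explain the difference between nuclear fission and fusion."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))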