
Running Dolly 2.0 on Paperspace

73 points by l2dy about 2 years ago

5 comments

UtahDave about 2 years ago
Simon, I opened an issue on your TIL repo with the pip incantation that I think will get the GPU working.

https://github.com/simonw/til/issues/69

I ran into that previously.
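The actual incantation lives in the linked issue; as a hedged sketch, a common fix for this class of problem is replacing a CPU-only PyTorch build with a CUDA-enabled wheel (the `cu118` index below is an illustrative example, not the command from the issue):

```shell
# Remove a possibly CPU-only PyTorch build and reinstall from the
# CUDA wheel index. Match the cuXXX suffix to your installed driver.
pip uninstall -y torch
pip install torch --index-url https://download.pytorch.org/whl/cu118

# Verify that PyTorch can actually see the GPU.
python -c "import torch; print(torch.cuda.is_available())"
```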
speps about 2 years ago
I read "Paperspace" as "paper space", which reminded me of this great article: http://www.righto.com/2014/09/mining-bitcoin-with-pencil-and-paper.html

Could someone do the same with some LLM to demonstrate a very simple example?
ankitmathur about 2 years ago
We'd love to help you all deploy this!

1. We just released a couple of models that are much smaller (https://huggingface.co/databricks/dolly-v2-6-9b), and these should be much easier to run on commodity hardware in a reasonable amount of time.

2. Regarding this particular issue, I suspect something is wrong with the setup. The example is generating a little over 100 words, which is probably something like 250 tokens. 12 minutes makes no sense for that if you're running on a modern GPU. I'd love to see details on which GPU was selected; I'm unfamiliar with any modern GPU that has 30GB of memory (the A10 is 24GB, the T4 is 16GB, and the A100 is 40/80GB). Are you sure you're using a version of PyTorch that installs CUDA correctly?

3. We have seen single-GPU inference work in 8-bit on the A10, so I'd suggest that as a follow-up.
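The back-of-the-envelope behind point 2 is easy to check. The token count and duration below come from the comment itself; the "healthy" GPU rate is an illustrative assumption, not a benchmark:

```python
# Throughput implied by the reported run: ~250 tokens in 12 minutes.
tokens = 250
seconds = 12 * 60

observed_rate = tokens / seconds  # tokens per second
print(f"observed: {observed_rate:.2f} tok/s")

# A modern GPU serving a model this size typically manages on the order
# of 10+ tok/s (assumed figure), i.e. this prompt should finish in well
# under a minute rather than 12 minutes.
assumed_gpu_rate = 10  # tok/s, illustrative
print(f"expected at {assumed_gpu_rate} tok/s: {tokens / assumed_gpu_rate:.0f} s")
```

At roughly 0.35 tok/s, the reported run is more consistent with CPU inference, which supports the suspicion that CUDA wasn't actually in use.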
[Comment #35561732 not loaded]
freeqaz about 2 years ago
I wrote a small POC of getting this model working on my box (I felt inspired after reading this). If anybody else wants to try this out, give it a shot here!

https://github.com/lunabrain-ai/dolly-v2-12b-8bit-example

(It's garbage code and should really just be used as a starting POC. I hope it helps!)
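A rough memory estimate shows why the 8-bit route in that POC matters for the 12B model (the parameter count is approximate, and real usage adds activation and framework overhead on top of the weights):

```python
# Approximate weight memory for a ~12B-parameter model at each precision.
params = 12e9  # dolly-v2-12b, roughly

for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB of weights")

# fp16 (~24 GB) barely fits a 24 GB A10 with no headroom for activations;
# int8 (~12 GB) leaves room to spare, which is why 8-bit inference is the
# practical path on that card.
```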
yawnxyz about 2 years ago
Could someone give a breakdown of why Dolly 2 is so much more difficult to run than llama.cpp?
[Comment #35551854 not loaded]
[Comment #35559301 not loaded]
[Comment #35554888 not loaded]