
GPT-J 6B on GPUs (through HuggingFace PR)

4 points | by paul-nai | almost 4 years ago
Hi all,

I've created a Jupyter notebook with everything you need to convert GPT-J from Jax and run it with the new HuggingFace PR for GPT-J. I've also got the model working on our production environment, which you can play around with / use in production here:

https://hub.getneuro.ai/model/nlp/gpt-j-6B-text-generation

Average inference speed is surprisingly fast running on our T4s: around 5s for 50 tokens. I'll be trying a V100 and a Quadro 8000 (full-precision model) tomorrow. To fit the model on GPUs with less than ~24 GB of memory, the model in the demo and notebook is half precision (float16) in torch. This was kinda painful to get working, so hopefully you find it useful.

https://github.com/paulcjh/gpt-j-6b/blob/main/gpt-j-t4.ipynb

Cheers
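A minimal sketch of the half-precision trick the post describes: converting model weights from float32 to torch.float16 halves their memory footprint, which is what lets a ~24 GB fp32 model like GPT-J 6B fit on a 16 GB T4. A small `nn.Linear` stands in for GPT-J here so the demo stays light; the same `.half()` call applies to a full model.

```python
import torch
import torch.nn as nn

# Small stand-in model (fp32 by default); GPT-J 6B would work the same way.
model = nn.Linear(1024, 1024)
fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

# Convert all parameters to float16 -- roughly halves the memory footprint.
model = model.half()
fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

print(fp32_bytes, fp16_bytes)  # fp16 buffers are half the size of fp32
```

For the real model, the HuggingFace `transformers` GPT-J integration exposes a pre-converted fp16 checkpoint, loadable as roughly `AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16", torch_dtype=torch.float16)` (checkpoint name and `revision` as documented by HuggingFace; verify against the current docs).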

no comments
