Hi all,<p>I've created a Jupyter notebook with everything you need to convert GPT-J from JAX to run with the new HuggingFace PR for GPT-J. I've also got the model running in our production environment, which you can play around with/use in production here:<p>https://hub.getneuro.ai/model/nlp/gpt-j-6B-text-generation<p>Average inference speed is surprisingly fast on our T4s: around 5s for 50 tokens. I'll be trying a V100 and a Quadro 8000 (full-precision model) tomorrow. To fit the model on GPUs with less than ~24GB of VRAM, the model in the demo and notebook is half precision in torch. This was kinda painful to get working, so hopefully you find it useful.<p>https://github.com/paulcjh/gpt-j-6b/blob/main/gpt-j-t4.ipynb<p>Cheers
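<p>Edit: for anyone wondering why half precision matters here — GPT-J has ~6B parameters, so at float32 (4 bytes/param) the weights alone are ~24GB, while float16 (2 bytes/param) brings that down to ~12GB, which fits on a T4 (16GB). A toy sketch of the conversion (a small stand-in torch module, not GPT-J itself — the actual notebook does this on the converted checkpoint):

```python
import torch

# Small stand-in module (NOT GPT-J) to show the effect of .half():
# converting parameters to float16 halves the memory footprint,
# which is what lets the ~6B-param model fit on a sub-24GB GPU.
model = torch.nn.Linear(1024, 1024)

fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
model.half()  # convert all parameters to float16 in place
fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

print(fp32_bytes, fp16_bytes)  # float16 uses exactly half the bytes
```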