bloomz.cpp allows running inference of BLOOM-like models in pure C/C++ (inspired by llama.cpp). It supports all BLOOM models that can be loaded in transformers.

As an example, you can run BLOOMZ-7B1 on your Mac or Pixel! On an M1 Pro, you can achieve 16 tokens/sec.
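
To illustrate which checkpoints qualify, here is a minimal sketch (assuming the publicly available bigscience/bloomz-560m checkpoint as an example) of loading a BLOOM-family model through the transformers API; any checkpoint that loads this way is the kind of model bloomz.cpp targets, after conversion to its own weight format:

    # Minimal sketch: verify a BLOOM-family checkpoint loads in transformers.
    # The model name "bigscience/bloomz-560m" is just an illustrative choice.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    name = "bigscience/bloomz-560m"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Quick generation check to confirm the weights work end to end.
    inputs = tokenizer("Translate to English: Je t'aime.", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(output[0]))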