I have a stockpile of 22 servers, each with 8 AMD MI50 GPUs (see notes about the MI50 below). I've gotten PyTorch working on these GPUs and have run inference for several large language models. I originally wanted to use them to serve LLMs, but vLLM's CUDA kernels don't work out of the box on the MI50, and llama.cpp has a bug where it only supports up to 4 AMD GPUs at once.

TL;DR: I don't want these servers sitting around, so if anybody has creative, useful ideas for them, I'm happy to grant SSH access to piddle around. A minimal PyTorch sanity check is sketched below the specs.

MI50 specs:
- 16 GB VRAM
- ~1 TB/s VRAM bandwidth
- ~25 TFLOPS (FP16)
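
For anyone who wants to poke at the boxes, here's a minimal sketch of the kind of sanity check that runs on them, assuming a ROCm build of PyTorch (ROCm surfaces AMD GPUs through the regular torch.cuda API, so no code changes are needed); the matrix size is just illustrative:

    import torch

    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API,
    # so the usual CUDA-flavored calls work on the MI50s.
    assert torch.cuda.is_available(), "no ROCm/HIP devices visible"
    print(torch.cuda.device_count())  # should print 8 on one of these servers
    for i in range(torch.cuda.device_count()):
        print(i, torch.cuda.get_device_name(i))

    # Quick FP16 matmul smoke test on GPU 0 to exercise the compute path.
    x = torch.randn(8192, 8192, device="cuda:0", dtype=torch.float16)
    y = x @ x
    torch.cuda.synchronize()
    print("fp16 matmul OK:", y.shape)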