
TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.


© 2025 TechEcho. All rights reserved.

How do you run fine-tuned models in a multi-tenant/shared GPU setup?

1 point by iamzycon, 8 months ago
I'm considering setting up a fine-tuning and inference platform for Llama that would let customers host their fine-tuned models. Would it be necessary to allocate dedicated infrastructure for each fine-tuned model, or could shared infrastructure work? Are there any existing solutions for this?
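The shared-infrastructure option the question asks about is essentially what multi-LoRA serving systems (e.g. vLLM's LoRA adapter support, or LoRAX) provide: one copy of the base model's weights is loaded on the GPU and serves every tenant, while each tenant contributes only a small adapter that is applied per request. A purely illustrative sketch of that routing idea, with all names hypothetical and no real model loading:

```python
# Hypothetical sketch of multi-tenant LoRA serving on shared infrastructure.
# The key economics: base weights (GBs) are loaded once per server, while
# each tenant's adapter is small (MBs), so many fine-tunes share one GPU.

class SharedLoraServer:
    """One server holds one base model plus many tenants' adapters."""

    def __init__(self, base_model: str):
        self.base_model = base_model   # loaded once, shared by all tenants
        self.adapters = {}             # tenant_id -> adapter weights (small)

    def register_adapter(self, tenant_id: str, adapter: dict) -> None:
        self.adapters[tenant_id] = adapter

    def generate(self, tenant_id: str, prompt: str) -> str:
        if tenant_id not in self.adapters:
            raise KeyError(f"no adapter registered for {tenant_id}")
        # A real system would apply the adapter inside the forward pass
        # (vLLM batches requests for different adapters together); here we
        # just tag the output to make the routing visible.
        return f"[{self.base_model}+{tenant_id}] {prompt}"


def route(servers: dict, tenant_to_base: dict, tenant_id: str, prompt: str) -> str:
    """Send a request to the shared server hosting this tenant's base model."""
    server = servers[tenant_to_base[tenant_id]]
    return server.generate(tenant_id, prompt)


# Two tenants, two different fine-tunes, one shared server:
servers = {"llama-3-8b": SharedLoraServer("llama-3-8b")}
servers["llama-3-8b"].register_adapter("tenant_a", {"rank": 8})
servers["llama-3-8b"].register_adapter("tenant_b", {"rank": 16})
tenant_to_base = {"tenant_a": "llama-3-8b", "tenant_b": "llama-3-8b"}

print(route(servers, tenant_to_base, "tenant_a", "hello"))
print(route(servers, tenant_to_base, "tenant_b", "hello"))
```

Dedicated infrastructure per model is generally only needed for full fine-tunes (where every tenant has a distinct copy of all weights); if the fine-tunes are LoRA/adapter-based, the shared pattern above is a well-established path.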

no comments