I would really just like to run Open WebUI and a few models for local chat use. I'm not into training (yet) and am patient, so what is a good cost-effective way to get started?
VRAM is king if you want to run larger (and generally more capable) models. 12 GB of VRAM will let you run a 13B model at 4-bit quantization, which is great for local chat, but you can get away with 8 GB to run an 8B model at the same quantization; I'd recommend Llama 3 8B for that.
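If it helps, here's roughly how I'd wire that up with Ollama as the backend (the usual pairing with Open WebUI). This is a sketch based on the commands in the Ollama and Open WebUI READMEs at the time of writing; double-check the current docs, since flags and image tags do change:

```sh
# Install Ollama (Linux one-liner; see ollama.com for macOS/Windows installers)
curl -fsSL https://ollama.com/install.sh | sh

# Pull an 8B model that fits comfortably in 8 GB of VRAM at 4-bit quantization
ollama pull llama3:8b

# Run Open WebUI in Docker, pointed at the local Ollama instance;
# the port, volume name, and image tag here mirror the Open WebUI README
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

Then browse to http://localhost:3000, create a local account, and pick llama3:8b from the model dropdown. If you later move up to a 12 GB card, you can pull a quantized 13B model the same way without touching the rest of the setup.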