I used to build PCs for gaming 15 years ago. After having kids, I switched to a console to save time fiddling with endless Windows updates.<p>I want to get back into that hobby. This time it's less about gaming and more about running AI workloads such as LLMs and diffusion models.<p>Any recommendations for getting started?
Start with r/LocalLLaMA and r/StableDiffusion, and look for benchmarks of the various GPUs.<p>I have an RTX 3060 (12GB) and 32GB RAM. I just ran Qwen2.5-14B-Instruct-Q4_K_M.gguf in llama.cpp with flash attention enabled and an 8K context, and I get 845 t/s for prompt processing and 25 t/s for generation.<p>For a while I even ran llama.cpp without a GPU (don't recommend that for diffusion), and with the same model (Qwen2.5 14B) I got 11 t/s for processing and 4 t/s for generation. Acceptable for chats with short questions/instructions and answers.
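For reference, a setup like that looks roughly like the commands below. This is a sketch assuming a recent llama.cpp build; the model path is illustrative, and flag names can shift between versions, so check `--help` on your build.

```shell
# Run the model with all layers offloaded to the GPU (-ngl 99),
# flash attention enabled (-fa), and an 8K context window (-c 8192).
./llama-cli -m models/Qwen2.5-14B-Instruct-Q4_K_M.gguf \
    -ngl 99 -fa -c 8192 \
    -p "Summarize the difference between Q4_K_M and Q8_0 quantization."

# llama.cpp also ships llama-bench, which reports prompt-processing (pp)
# and token-generation (tg) throughput for a given model and settings:
./llama-bench -m models/Qwen2.5-14B-Instruct-Q4_K_M.gguf -ngl 99 -fa 1
```

To reproduce the CPU-only numbers, set `-ngl 0` (no layers offloaded) and compare the pp/tg figures.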