This is beyond exciting. Welcome to the new reality!<p>On one hand, the resources required to run these models continue to fall dramatically, thanks to the techniques discovered by researchers: GPTQ quantizing down to 4, 3, 2, even 1 bit! model pruning! hybrid VRAM offloading! better, more efficient architectures! 1-click finetuning on consumer hardware! Of course, the free lunches won't last forever, and this will level off, but it's still incredible.<p>And on the other side of the coin, the power of <i>all</i> computing devices continues its ever-upward exponential growth.<p>So you have a continuous <i>lowering</i> of requirements, combined with a continuous <i>increase</i> in available power... surely these two trends will collide, and I can only imagine what this stuff will be like at that intersection.