... or fat client era?<p>At the moment, ChatGPT and other instruction-tuned models show what's possible with modern LLMs. Although most SOTA models need cloud compute for <i>inference</i>, it's feasible to assume they might run on beefy standard desktops in the next 5 to 10 years (IMHO).<p>Now, historically we had:<p><pre><code> - thin-client mainframe architecture (1970s - 1980s)
- fat-client "home computers" (1980s - 2010s)
 - thin-client SaaS software platforms (2010s - mid-2020s)
 - fat-client LLM inference engines (mid-2020s - ?)
</code></pre>
In particular, I think there will be a lot of ethical questions and legal work for companies selling LLMs as SaaS. Out of fear of "recommending stuff against the status quo", their models might end up inferior to "open" (unconstrained) models, and running those might only be possible for private individuals (at first).<p>Just my 2 cents, what do you think?
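To make the fat-client scenario concrete, here's a minimal sketch of what fully local inference already looks like today, assuming the llama-cpp-python bindings and a GGUF-quantized model file (the path and parameters below are placeholders, not a recommendation):<p><pre><code> # Local LLM inference on a desktop; nothing leaves the machine.
 from llama_cpp import Llama

 llm = Llama(
     model_path="./models/13b-chat.Q4_K_M.gguf",  # local quantized weights
     n_ctx=2048,     # context window
     n_threads=8,    # CPU threads on the "beefy desktop"
 )

 out = llm("Q: Will fat clients return? A:", max_tokens=64)
 print(out["choices"][0]["text"])
</code></pre>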
It may already be happening, by sheer coincidence. A leak suggests Intel is building a remarkably wide, M1-like IGP for laptops: <a href="https://videocardz.com/newz/intel-arrow-lake-p-with-320eu-gpu-confirmed-by-a-leaked-roadmap-targeting-to-compete-with-apple-14-premium-laptops" rel="nofollow">https://videocardz.com/newz/intel-arrow-lake-p-with-320eu-gp...</a><p>That design was in the pipe <i>years</i> before the LLM craze. It's reasonable to assume AMD is on a similar trajectory, making their future CPUs GPU/NPU-heavy. Pair that with lots of RAM (and hopefully wider buses), and you have a respectable LLM client.<p>This might be a reasonable frontier for smartphone performance too, depending on how much DRAM continues to shrink. But maybe not, since mobile apps <i>love</i> their subscriptions and MTX, which rely on doing stuff in the cloud... otherwise why would you subscribe?
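As a rough sanity check on the RAM point, here's a back-of-envelope sketch of how much memory quantized weights alone need (model sizes are illustrative; KV cache and activations add more on top):<p><pre><code> # Bytes for weights = params * bits_per_weight / 8
 def weight_gb(params_billion, bits):
     return params_billion * 1e9 * bits / 8 / 1e9

 for p in (7, 13, 70):
     print(f"{p}B @ 4-bit: ~{weight_gb(p, 4):.1f} GB")
 # 7B ~3.5 GB, 13B ~6.5 GB, 70B ~35 GB: the last is why
 # "lots of RAM (and wider buses)" matters for a fat client.
</code></pre>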
StableDiffusion, which isn't an LLM but is related, suggests that we could. Just like crypto mining drove GPU sales (for those interested in cryptocurrencies), those interested in private output from such systems are going to look to fat clients.<p>What am I referring to with "private output"? I'm referring to what we know is coming: easy at-home production of porn deepfakes. Some aren't going to mind using AWS to produce their weird fantasies (not kink-shaming; we're all into some weird stuff, but not everyone is into their cloud provider potentially having access to it). Others are going to want that produced privately at home, on their own hardware.
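For the curious, generating images fully locally is already only a few lines, e.g. with the Hugging Face diffusers library; the model ID and prompt here are placeholders, and after the one-time weight download nothing leaves the machine:<p><pre><code> import torch
 from diffusers import StableDiffusionPipeline

 # Runs entirely on local hardware (a consumer GPU with ~6+ GB VRAM).
 pipe = StableDiffusionPipeline.from_pretrained(
     "runwayml/stable-diffusion-v1-5",
     torch_dtype=torch.float16,
 ).to("cuda")

 image = pipe("a watercolor of a mainframe terminal").images[0]
 image.save("local_output.png")
</code></pre>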
I hope so. But I'm not sure it would necessarily be driven by LLMs or AI explicitly, so much as by what may be a natural cycling between opposing views on how best to mix society and technology. Thin = centralised power & thick = decentralised.
No. The large compute power these systems require and the desire to have them accessible everywhere make them naturally live in the cloud. Folks used to run their own servers, email, etc., and that's already gone.