This is f* cool! Privacy is super important, and writing+running code this way would help mitigate some of the concerns about data access control: the browser can access whatever the user's identity can, which is a great improvement over, e.g., an approach where a centralized LLM (trained on data that is either too generic or too specific) is run for a whole company.
I have recently been exploring and testing this approach of offloading AI inference to the user instead of using a cloud API.
There are many advantages, and I can see this becoming the norm in the future.<p>Also, I was surprised by the number of people who have GPUs, and by how well SLMs perform in many cases, even those with just 1B parameters.