I think the example in the article is not a good use case for this technology. It would be better, cheaper, and less error-prone to have prebuilt forms that the LLM can call like tools, at least for things like changing a shipping address.

Shipping forms usually need address verification, and sometimes they even include a map.

That's especially true if, on the other end, the data entered in the form ends up in a traditional database.

A much better use case would be something that is dynamic by nature, for example an advanced prompt generator for image-generation models (sliders for the size of objects in a scene; dropdown menus with variants of backgrounds or styles, instead of the usual lists).
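Going back to the first point, a minimal sketch of the "prebuilt form exposed as a tool" idea, assuming an OpenAI-style function-tool schema; the tool name, field names, and the validation stub are illustrative assumptions, not anything from the article:

```python
# Hypothetical: a prebuilt "change shipping address" form exposed to the LLM
# as a tool, rather than having the model generate the form itself.
change_shipping_address_tool = {
    "type": "function",
    "function": {
        "name": "open_change_shipping_address_form",
        "description": "Open the prebuilt shipping-address form, prefilled "
                       "with any values the user already gave in chat.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "street": {"type": "string"},
                "city": {"type": "string"},
                "postal_code": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["order_id"],
        },
    },
}

def validate_address(fields: dict) -> dict:
    # Stub: in practice this would call the existing address-verification
    # service (and could return map coordinates for the form to display).
    return fields

def handle_tool_call(arguments: dict) -> dict:
    """The LLM only prefills the form; validation and storage stay in the
    existing, traditional backend."""
    prefill = validate_address(arguments)
    return {"form": "change_shipping_address", "prefill": prefill}
```

The point of the design is that the model never owns the data path: it just picks the form and prefills it, while verification and persistence stay in the conventional system.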
I was working on exactly this back in the GPT-3 days, and I still believe ad hoc generation of super-specific, contextually relevant UIs will solve a lot of the problems and friction that purely textual or speech-based conversational interfaces pose, especially if the UI elements (like sliders) provide some form of live feedback on their effect, and can be scrolled back to or pinned so changes can be made at any time.
I really believe this is the future.

Conversations are error-prone and noisy.

UI distills the mode of interaction down into something defined and well understood by both parties.

Humans have been able to speak to each other for a long time, but we fill out forms for anything formal.
Related to this: here is some recently published research we did at Microsoft Research on generating UX for prompt refinements based on the user prompt and other context (case study: https://www.iandrosos.me/promptly.html, paper link also in the intro).

We found it lowered barriers to providing context to AI, improved users' perception of control over AI, and gave users guidance for steering AI interactions.
Related: it's crazy to me that OpenAI hasn't already done something like this for Deep Research.

After your initial question, it always follows up with some clarifying questions, but it's completely up to the user to format their responses, and I always wonder whether the LLM gets confused when people are sloppy. It would make much more sense for OpenAI to break out each question and give it a dedicated answer box. That way the user's responses stay consistent, and there's less chance they make a mistake or forget to answer a question.
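A minimal sketch of that suggestion, assuming the clarifying questions come back as structured output rather than free text; the question ids, wording, and the use of input() as a stand-in for a real text box are all illustrative:

```python
import json

# Illustrative clarifying questions, returned as structured output
# instead of a free-text paragraph.
clarifying_questions = json.loads("""
[
  {"id": "scope",  "question": "Which regions should the research cover?"},
  {"id": "depth",  "question": "Do you want a short summary or a detailed report?"},
  {"id": "cutoff", "question": "Is there a publication-date cutoff for sources?"}
]
""")

def collect_answers(questions: list[dict]) -> dict:
    """One dedicated answer box per question, so answers come back keyed by
    question id and nothing gets skipped or merged into one blob of text."""
    answers = {}
    for q in questions:
        answers[q["id"]] = input(f"{q['question']} ")  # stand-in for a real text box
    return answers
```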
This seems much worse than the typical pre-AI mechanism of navigating to and clicking on a "Change Delivery Address" button.

I don't know why you wouldn't develop whatever forms you want to support upfront and make them available to the agent (and hopefully provide old-fashioned search as well). You can still use AI to develop and maintain the forms. Since the output can be reused as many times as you want, you can probably use more expensive, more capable models to develop the forms, rather than the cheaper, faster, but less capable models you're probably limited to for customer service.
I was hoping to do this over IRC but never got around to implementing it. I hate the idea of implementing a whole website/chat system when they already exist. I'd like to use it for communicating with my (currently nonexistent) home automation setup.
That's not a very innovative idea, nor even a better UX.
I think the future will have to do with voice commands, and MCPs will be the backend, exposing capabilities.
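A minimal sketch of an MCP backend exposing a capability, assuming the official Python MCP SDK's FastMCP helper; the home-automation tool and its arguments are made up for illustration:

```python
# Assumed: the official Python MCP SDK (FastMCP). The tool below is a
# hypothetical home-automation capability a voice front end could call.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("home-automation")

@mcp.tool()
def set_light_brightness(room: str, brightness: int) -> str:
    """Set brightness (0-100) for the lights in a given room."""
    # In practice this would call the real home-automation API.
    return f"Set {room} lights to {brightness}%"

if __name__ == "__main__":
    mcp.run()  # a voice assistant could then drive this capability over MCP
```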