The challenge with this kind of system is always the bit that figures out the most relevant text from the corpus to bake together into the prompt.<p>It's interesting to see that this example takes the simplest approach possible, and it seems to provide pretty decent results:<p>> Third, each word of the list cleaned up above is searched inside the information paragraph. When a word is found, the whole sentence that includes it is extracted. All the sentences found for each and all of the relevant words are put together into a paragraph that is then fed to GPT-3 for few-shot learning.<p>This is stripping punctuation and stopwords and then doing a straight string match to find the relevant sentences to include in the prompt!<p>A lot of people - myself included - have been trying semantic search using embeddings to solve this. I wrote about my approach here: <a href="https://simonwillison.net/2023/Jan/13/semantic-search-answers/" rel="nofollow">https://simonwillison.net/2023/Jan/13/semantic-search-answer...</a><p>My version dumps in entire truncated blog entries, but from this piece I'm thinking that breaking down to much smaller snippets (maybe even at the sentence level) is worth investigating further.
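Here's roughly what that naive retrieval step looks like in code (my own sketch, not the article's implementation; the stopword list and sentence splitting are deliberately crude):
<pre><code>import re

# A tiny stopword list just for illustration; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "for",
             "on", "and", "what", "how", "do", "does", "i", "my"}

def extract_keywords(question):
    # Lowercase, drop punctuation, drop stopwords.
    words = re.findall(r"[a-z0-9']+", question.lower())
    return [w for w in words if w not in STOPWORDS]

def matching_sentences(corpus, keywords):
    # Very naive sentence splitting on ., ! and ?
    sentences = re.split(r"[.!?]+\s+", corpus)
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        if any(kw in lowered for kw in keywords) and sentence not in hits:
            hits.append(sentence)
    return hits

def build_context(corpus, question):
    # Everything returned here gets pasted into the GPT-3 prompt.
    return " ".join(matching_sentences(corpus, extract_keywords(question)))
</code></pre>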
Maybe naively, I was hoping this would involve a way of hosting a specially trained model oneself. If this pre-feeding of a corpus needs to be done every time the bot is "launched", that seems like a lot of extra tokens to pay for. I question the practicality of getting people to input their own API keys, which seems to be the only purpose of the PHP wrapper. On the other hand, passing the costs on to people (the "intermediate solution"[1]) would only make sense if the value added by the several-shot training was really significant, e.g. a very large body of domain-specific knowledge, which again becomes impractical to feed in at the start of every session.<p>[1] <a href="https://towardsdatascience.com/custom-informed-gpt-3-models-for-your-website-with-very-simple-code-47134b25620b" rel="nofollow">https://towardsdatascience.com/custom-informed-gpt-3-models-...</a>
An alternative to using old-school NLP is to use GPT itself for the first stage of the pipeline as well, with a prompt like: "I have the following resources with data. power_troubleshooting.txt contains information for customers that have issues powering on the device" (and so forth in the next lines, with the other resources), followed by "This is the user question: ..., please reply with the name of the resource I should access."<p>Then you fetch that file and create a second prompt: "Based on the following information: ..., answer this question: ..."<p>A slower but more powerful variant is to show GPT different parts of potentially relevant text (for instance three at a time), ask it to score from 0 to 10 how useful each one would be for answering the question, and have it select which resource to use. But this requires a lot of back and forth.
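A rough sketch of that two-step flow, assuming the pre-1.0 openai Python package (with OPENAI_API_KEY set in the environment); the file names, descriptions and the ask() helper are purely illustrative:
<pre><code>import openai

# Hypothetical resources; in practice the descriptions come from you.
RESOURCES = {
    "power_troubleshooting.txt": "information for customers that have issues powering on the device",
    "network_setup.txt": "instructions for connecting the device to the network",
}

def ask(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"].strip()

def answer(question):
    # Step 1: ask the model which resource to consult.
    listing = "\n".join(f"{name} contains {desc}." for name, desc in RESOURCES.items())
    chosen = ask(
        f"I have the following resources with data.\n{listing}\n\n"
        f"This is the user question: {question}\n"
        "Reply with only the name of the resource I should access."
    )
    # Step 2: answer the question using the chosen resource as context.
    with open(chosen) as f:
        context = f.read()
    return ask(
        f"Based on the following information:\n{context}\n\n"
        f"Answer this question: {question}"
    )
</code></pre>
This assumes the model replies with exactly one of the file names; in practice you'd want to validate that before opening the file.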
(Probably) related discussion.<p>Natural language is the lazy user interface (2 days ago)
<a href="https://news.ycombinator.com/item?id=34549378" rel="nofollow">https://news.ycombinator.com/item?id=34549378</a><p>Chatbot is often a useless, unintuitive UX for solving problems. We already know how most websites work, so it's easier to navigate to the intended resources with a few clicks rather than typing uncertain questions.
This seems terrifying for Google Search and similar products. If I can cram the majority of my static, rarely changing information and proverbial (not literally, of course) consciousness into a better search format (a model) than Google et al can, why should I bother building out the rest of my website or sharing that model with Google et al? This is especially true if it's conversational enough for most people to chat with it casually.<p>It seems the obvious answer is that people still need to be able to find me and you can't easily backlink the contents of a model. Google can create an interface or standard for this à la bots talking to bots, but the compute cost is just fundamentally higher for everyone involved. Maybe it's worth it for the end-user's sake? Anyway, a search query can be shorter than the question(s) it's going to take to get that information out of a model too. And as for Google, OpenAI or similar scraping the entire internet and creating a model like ChatGPT, sure, that works now, but how are people going to feel about that now that the cat's out of the bag? It seems the knee-jerk reaction to this is to more highly scrutinize what you publicly make available for scraping, especially since I have no idea what level of accuracy a model like this is going to possess in terms of representing my information.<p>As a closing example, I have a friend who runs one of the most popular NPM packages available. He doesn't billboard his name all over the project, but it's public information that can be discovered trivially by a human with a search engine for various reasons (on govt. websites no less). Essentially, he's a de facto, albeit shy, public figure. I asked ChatGPT various questions about the library and it nailed the answers. Next I asked ChatGPT various formulations of who wrote or maintains the project. It gave us a random, wildly incorrect first name and said no other public information is available about him. To be honest, I'm really ambivalent about this because of all sorts of different reasons centered around the above topics.<p>It seems there's some tension here. For those of us willing to embrace this, we may want to maintain technical stewardship. However, those changes may fundamentally change the fabric of discoverability on the web. Please let me know if I'm misunderstanding the technology or you believe I'm jumping to any conclusions here. Thanks!
We need a transferable GPT, analogous to how CNN models are pretrained on basic shapes and patterns and can then be fine-tuned for a specific application. A transferable GPT wouldn't know the entire internet's worth of knowledge, but it would know how to predict generalized structures. Maybe those structures could have placeholders that could be filled with specific knowledge.
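To make the CNN analogy concrete, this is the usual transfer-learning pattern with a torchvision model (nothing GPT-specific here, and the 10-class head is made up):
<pre><code>import torch.nn as nn
from torchvision import models

# Backbone pretrained on general images (edges, shapes, textures).
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the general-purpose features...
for param in model.parameters():
    param.requires_grad = False

# ...and attach a small task-specific head that actually gets trained.
model.fc = nn.Linear(model.fc.in_features, 10)
</code></pre>
The idea in the comment is that a "transferable GPT" could be split the same way: a frozen general language backbone plus a small, swappable layer of domain-specific knowledge.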
How should one approach building a ChatGPT-like chatbot for a business application, to help users who don't like reading documentation? How much more precise does the documentation need to be for the chatbot to actually be useful? Is it possible to teach the chatbot by having sessions with users who are experts in the application, so the bot could gather the required information from them?
If you want a simple command line chatbot, I made this example: <a href="https://github.com/atomic14/command_line_chatbot">https://github.com/atomic14/command_line_chatbot</a>
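Not the repo's actual code, but the core of a bot like this is just a loop that keeps appending to the message history; a minimal sketch assuming the pre-1.0 openai package and OPENAI_API_KEY in the environment:
<pre><code>import openai

messages = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    reply = response["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    print("Bot:", reply)
</code></pre>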
I'm doing a similar thing but with a web-based platform that lets you build chatbots (AI Agents) in your browser: <a href="https://agent-hq.io" rel="nofollow">https://agent-hq.io</a>