[Co-founder of Shortwave here] I know a lot of folks are launching “AI assistants” right now – but ours isn’t just a “chat with your PDF” thin shim on GPT-4. We’ve got some serious infrastructure behind this.<p>Here are some notes on our architecture:<p>- We use LLMs at multiple points in the pipeline to choose what data to pull at each step. We use an additive approach rather than a chaining approach to avoid error propagation. We use GPT-3.5-turbo with a bunch of hand-rolled prompts for most of this.<p>- We’re using InstructorXL + Pinecone running on GCP for vector-based search. We combine this with more traditional search methods backed by Postgres & Elasticsearch, giving the assistant the ability to run fast searches of multiple types.
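For the curious, a toy sketch of what blending a vector retriever with a keyword retriever can look like (this is illustrative only, not our actual code – the score ranges and the `alpha` weight are made-up stand-ins):

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale one retriever's scores into [0, 1] so retrievers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_search(vector_hits: Dict[str, float],
                  keyword_hits: Dict[str, float],
                  alpha: float = 0.5,
                  k: int = 3) -> List[Tuple[str, float]]:
    """Blend vector-similarity scores with keyword (e.g. BM25-style) scores.

    alpha controls how much weight the vector retriever gets vs. keywords.
    """
    v, kw = normalize(vector_hits), normalize(keyword_hits)
    docs = set(v) | set(kw)
    blended = {d: alpha * v.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0)
               for d in docs}
    return sorted(blended.items(), key=lambda x: x[1], reverse=True)[:k]
```

In a real system the candidate lists would come from the vector index and the keyword index respectively, and the blended top-k would go on to reranking.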
We use a cross-encoder model trained on open-source Q&A data from Bing for scoring & reranking, which lets us combine multiple data sources and determine what makes the most sense to feed into the final prompt.<p>- We hand-rolled a bunch of rule-based algorithms and heuristics on top of the LLMs to deal with email-specific corner cases and other issues we couldn’t resolve reliably in prompts.<p>- Our user-facing output is generated with GPT-4.<p>This enables a bunch of capabilities that other AI assistants can’t match:<p>- Way better search – Ask a question and get a succinct, direct answer, including finding emails that would be tough to find through traditional search (i.e. when you can’t remember a keyword to use).<p>- Scheduling – Since we can dynamically pull in multiple types of data, we can access calendar data at the right time to help you schedule meetings.<p>- Analyze across multiple emails & types of data – The assistant can synthesize answers across multiple emails, your calendar, settings, etc. (e.g. “What are the top 5 issues customers emailed support about last week?”, “What are some meeting times that work for me and the other people on this thread?”).<p>- Write in your voice – The assistant can automatically learn your style and tone from your sent emails. This means it actually sounds like you and, while it still occasionally requires some tweaking, it’ll save you a lot of time.<p>- Summarize & translate – It can dynamically access the data you have <i>on your screen right now</i> if you reference it, so it can help you with whatever you’re reading.<p>A note on privacy: We take privacy very seriously. We’re running everything above on our own GPUs + using OpenAI for final outputs. We aren’t training any models on user data.<p>We’ve put a lot of thought and effort into this one – I hope you like it – either way, let me know what you think in the comments below!<p>-Andrew
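P.S. Since a few of the points above are about prompt assembly, here’s a toy sketch of the “write in your voice” idea – few-shot prompting with recent sent emails as style examples. The function name and prompt wording are made up for illustration; a production version would select examples by relevance and trim to the model’s context window:

```python
from typing import List


def build_style_prompt(sent_emails: List[str],
                       instruction: str,
                       max_examples: int = 3) -> str:
    """Assemble a few-shot prompt using the user's sent emails as style examples.

    Here we simply take the most recent emails; a real system would pick
    examples by similarity to the current task and enforce a token budget.
    """
    examples = sent_emails[-max_examples:]
    parts = [
        "You are drafting an email on behalf of the user.",
        "Match the tone and style of these emails they wrote:",
    ]
    for i, email in enumerate(examples, 1):
        parts.append(f"--- Example {i} ---\n{email.strip()}")
    parts.append(f"--- Task ---\n{instruction}")
    return "\n\n".join(parts)
```

The assembled string would then be sent to the output model (GPT-4 in our case) along with whatever retrieved context the earlier steps selected.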