
Show HN: A context-aware semantic cache for reducing LLM app latency and cost

2 points | by tmshapland | 12 months ago
We're Tom and Adrian, the cofounders of Canonical AI. We were building a conversational AI product and wanted to use semantic caching. We tried out a few different projects, but none of them were accurate enough. The problem with the semantic caches we tried was that they didn't have a sense of the context of the user query. That is, the same user query could mean two different things, depending on what the query is referencing.

So we changed course and started working on a semantic cache that understands the context of the user query. We've developed a number of different methods to make the cache more aware of the context. These methods include multi-tenancy (i.e., user-defined cache scopes), multi-turn cache keys, metadata tagging, etc.

We'd love to hear your thoughts on it!
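For readers unfamiliar with the idea, here is a minimal sketch of what a context-aware semantic cache lookup could look like. This is not Canonical AI's implementation; the embed() stub, the scope and recent_turns parameters, and the similarity threshold are all assumptions made purely for illustration:

    import math
    from dataclasses import dataclass, field

    def embed(text: str) -> list[float]:
        # Placeholder embedding: hash characters into a small fixed-size vector.
        # A real cache would call an actual embedding model here.
        vec = [0.0] * 16
        for i, ch in enumerate(text):
            vec[i % 16] += ord(ch)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def cosine(a: list[float], b: list[float]) -> float:
        # Vectors are already normalized, so the dot product is the cosine similarity.
        return sum(x * y for x, y in zip(a, b))

    @dataclass
    class CacheEntry:
        embedding: list[float]
        response: str
        metadata: dict = field(default_factory=dict)  # metadata tagging

    @dataclass
    class ContextAwareCache:
        threshold: float = 0.92                       # assumed similarity cutoff
        entries: dict = field(default_factory=dict)   # scope -> list[CacheEntry]

        def _key_text(self, query: str, recent_turns: list[str]) -> str:
            # Multi-turn cache key: fold prior turns into the text that gets embedded,
            # so the same query can hit different entries in different conversations.
            return " | ".join(recent_turns[-3:] + [query])

        def get(self, scope: str, query: str, recent_turns: list[str]) -> str | None:
            # Multi-tenancy: only search entries within the caller's cache scope.
            q = embed(self._key_text(query, recent_turns))
            best, best_sim = None, 0.0
            for entry in self.entries.get(scope, []):
                sim = cosine(q, entry.embedding)
                if sim > best_sim:
                    best, best_sim = entry, sim
            return best.response if best and best_sim >= self.threshold else None

        def put(self, scope: str, query: str, recent_turns: list[str],
                response: str, metadata: dict | None = None) -> None:
            key = embed(self._key_text(query, recent_turns))
            self.entries.setdefault(scope, []).append(
                CacheEntry(key, response, metadata or {}))

The point of the sketch is the cache key: instead of embedding the raw query alone, it embeds the query together with its tenant scope and recent conversation turns, which is one way to make "the same query" resolve differently depending on what it references.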

no comments