It occurred to me that a personal diary or journal is one of the most interesting data sets for a vector embedding / chat context product. People record their hopes, dreams, fears and much more in their journals, and a psychiatrist, psychologist or therapist could draw many insights from reading one.<p>That is what I'm building at Jumble Journal. As a first feature, we let people chat with their past journal entries.<p>Technology<p>For vector embeddings and similarity search, we use ChromaDB. It's open source and its performance has been great. For something this small scale, I didn't want to get locked into a hosted vector DB service like Pinecone.<p>The DB is hosted on an EC2 instance. The backend API is serverless, built on AWS Lambda and API Gateway. We back up all embeddings to S3 in case of failure.<p>I'm happy to discuss the technology stack and the feature itself.<p>Links
https://jumblejournal.org
https://www.trychroma.com/
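To make the "chat with your past journals" idea concrete, here is a minimal sketch of the retrieval step in plain Python. It is not our production code and does not use ChromaDB: `embed` is a toy bag-of-words stand-in for a real embedding model, and `top_k`, `build_prompt` and the sample entries are hypothetical names made up for illustration. The shape is the same though: embed the question, find the most similar entries, and pass them to the chat model as context.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, entries, k=2):
    """Return the k journal entries most similar to the question."""
    q = embed(question)
    ranked = sorted(entries, key=lambda e: cosine(q, embed(e["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question, entries, k=2):
    """Assemble the retrieved entries and the question into one chat prompt."""
    context = "\n".join(f"[{e['date']}] {e['text']}" for e in top_k(question, entries, k))
    return f"Journal excerpts:\n{context}\n\nQuestion: {question}"


# Hypothetical sample data, invented for this example.
ENTRIES = [
    {"date": "2023-01-05", "text": "Nervous about the job interview tomorrow."},
    {"date": "2023-02-11", "text": "Went hiking with friends and felt calm."},
    {"date": "2023-03-02", "text": "Got the job offer after the interview!"},
]

if __name__ == "__main__":
    print(build_prompt("How did the job interview go?", ENTRIES))
```

In a real setup, a vector DB stores embeddings produced by an actual model and handles the nearest-neighbour search, so the `embed`, `cosine` and `top_k` pieces would be replaced by calls to the DB.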
The formatting is a bit off.<p>The web app is here: <a href="https://jumblejournal.org" rel="nofollow noreferrer">https://jumblejournal.org</a><p>The DB used is here: <a href="https://www.trychroma.com/" rel="nofollow noreferrer">https://www.trychroma.com/</a>