With just two main components you can build a fully working semantic search app. The client does the heavy lifting of computing the embedding and simply sends the resulting vector to Qdrant (a vector database with a built-in API), which performs the actual search. Note that the model I'm using in the frontend is quantized (it weighs only about 30 MB), so its output differs slightly from that of the original model.
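
To make the flow concrete, here is a minimal sketch of what the client side could look like, assuming a quantized sentence-embedding model served via `@xenova/transformers` (e.g. `Xenova/all-MiniLM-L6-v2`) and a hypothetical Qdrant collection named `items` on a local instance; the exact model, collection name, and URL are assumptions, not part of the original setup.

```ts
import { pipeline } from '@xenova/transformers';

const QDRANT_URL = 'http://localhost:6333'; // assumed local Qdrant instance
const COLLECTION = 'items';                 // hypothetical collection name

// Load the quantized feature-extraction model once; it runs entirely in the browser.
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

export async function search(query: string, limit = 5) {
  // 1. The client computes the query embedding (the "heavy work").
  const output = await embedder(query, { pooling: 'mean', normalize: true });
  const vector = Array.from(output.data as Float32Array);

  // 2. Only the vector is sent to Qdrant, which performs the similarity search.
  const response = await fetch(`${QDRANT_URL}/collections/${COLLECTION}/points/search`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ vector, limit, with_payload: true }),
  });
  const { result } = await response.json();
  return result; // scored points with their payloads
}
```

Because the embedding happens in the browser, the backend stays as thin as a single Qdrant instance: no separate inference service is needed, at the cost of the small accuracy drift introduced by quantization.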