I think the most popular way to implement it is approximate k-nearest-neighbor (ANN) search. However, I should note that I'm not knowledgeable in that area.
Are there possible drawbacks, for example recall being traded for speed?
Yes, the algorithm makes that tradeoff: slightly lower accuracy in exchange for a large speedup when the dataset is large. This is what you can expect from commercial vector databases (example: https://supabase.com/vector).
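To make the accuracy side of that tradeoff concrete, here is a minimal sketch of how recall@k could be measured by comparing an approximate result list against the exact one. The function name `recall_at_k` and the document ids are made up for illustration; they do not come from this project or from any particular ANN library.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k neighbors that the approximate search also returned."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 1.0  # nothing to find, so nothing was missed
    hits = exact_top & set(approx_ids[:k])
    return len(hits) / len(exact_top)


# Hypothetical result lists (illustrative data only, not real measurements).
exact = ["d1", "d2", "d3", "d4", "d5"]    # ground truth from a brute-force scan
approx = ["d1", "d3", "d2", "d7", "d5"]   # what an ANN index might return

recall = recall_at_k(exact, approx, k=5)
print(recall)
```

In this made-up example the approximate search misses one of the five true neighbors. Order within the top k does not affect the score, only membership does.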
I see the implementation uses cosine similarity. Performance gains come from normalizing the embeddings and caching them.
Have you considered ANN? I guess something like https://github.com/DanielKRing1/Annoy.js?
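For reference, the normalize-and-cache approach described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the project's actual code: `ExactIndex`, `add`, and `search` are hypothetical names. Once embeddings are scaled to unit length, cosine similarity reduces to a plain dot product, so the norms only need to be computed once per stored item.

```python
import math


def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


class ExactIndex:
    """Hypothetical brute-force index: normalize once, cache, score by dot product."""

    def __init__(self):
        self._cache = {}  # item id -> unit-length embedding

    def add(self, item_id, embedding):
        # Normalization happens once at insert time, not on every query.
        self._cache[item_id] = normalize(embedding)

    def search(self, query, k=3):
        q = normalize(query)
        # For unit vectors the dot product equals the cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, v)), item_id)
            for item_id, v in self._cache.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]


index = ExactIndex()
index.add("a", [1.0, 0.0])
index.add("b", [0.0, 2.0])
index.add("c", [3.0, 3.0])
results = index.search([2.0, 0.0], k=2)
print(results)
```

Every query here still scans all cached embeddings, so the cost grows linearly with the dataset. That linear scan is the part an ANN index would replace.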