ANN support? #7

astoilkov · 2023-06-17T14:57:31Z

I see the implementation uses cosine similarity. Performance gains come from normalizing the embeddings and caching them.
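For reference, here is a minimal sketch of that exact-search approach as I understand it. The names (`normalize`, `addItem`, `search`, `cache`) are my own placeholders, not the library's actual API. It assumes embeddings are plain number arrays: each one is normalized once and cached, so cosine similarity at query time is just a dot product.

```typescript
// Hypothetical helper names for illustration; not taken from the library's source.
type Vec = number[];

// Scale a vector to unit length so cosine similarity reduces to a dot product.
function normalize(v: Vec): Vec {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

function dot(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cache of already-normalized embeddings, keyed by item id.
const cache = new Map<string, Vec>();

function addItem(id: string, embedding: Vec): void {
  cache.set(id, normalize(embedding));
}

// Exact (brute-force) search: score every cached vector, keep the top k.
function search(query: Vec, k: number): { id: string; score: number }[] {
  const q = normalize(query);
  const scored: { id: string; score: number }[] = [];
  cache.forEach((v, id) => scored.push({ id, score: dot(q, v) }));
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, k);
}
```

Every query still touches every cached vector, so the cost grows linearly with the number of items. That is the part ANN would try to avoid.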

Have you considered ANN? I guess something like https://github.com/DanielKRing1/Annoy.js?

MentalGear · 2023-06-22T09:20:49Z

Interesting, how would you implement this?
Are there possible drawbacks, for example where recall is traded for speed?

astoilkov · 2023-06-26T04:35:55Z

> Interesting, how would you implement this?

I think the most popular way to implement it is approximate k-nearest-neighbor search. However, I should note that I'm not knowledgeable in that area.

> Are there possible drawbacks, for example where recall is traded for speed?

Yes, the algorithm makes such a tradeoff: slightly lower accuracy in exchange for a large speedup when the dataset is large. This is what you can expect from commercial vector databases (example: https://supabase.com/vector).
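To make the tradeoff concrete, here is a rough sketch of one ANN technique, hyperplane-based locality-sensitive hashing. This is not how Annoy.js works internally (Annoy builds random projection trees), and `LshIndex` is a hypothetical name, not an API from any library. Vectors are bucketed by which side of each hyperplane they fall on, and a query only scores the vectors in its own bucket. Real implementations pick the hyperplanes at random; here they are passed in so the example is reproducible.

```typescript
// Illustrative sketch only; LshIndex is a hypothetical class, not a real library API.
type Vec = number[];

function dot(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function cosine(a: Vec, b: Vec): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// One bit per hyperplane: "1" if the vector lies on its non-negative side.
function signature(v: Vec, planes: Vec[]): string {
  return planes.map((p) => (dot(p, v) >= 0 ? "1" : "0")).join("");
}

class LshIndex {
  private planes: Vec[];
  private buckets = new Map<string, { id: string; vec: Vec }[]>();

  constructor(planes: Vec[]) {
    this.planes = planes;
  }

  add(id: string, vec: Vec): void {
    const key = signature(vec, this.planes);
    const bucket = this.buckets.get(key) || [];
    bucket.push({ id, vec });
    this.buckets.set(key, bucket);
  }

  // Approximate search: only vectors sharing the query's bucket are scored.
  query(q: Vec, k: number): { id: string; score: number }[] {
    const candidates = this.buckets.get(signature(q, this.planes)) || [];
    return candidates
      .map((c) => ({ id: c.id, score: cosine(q, c.vec) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}

const index = new LshIndex([[1, 0], [0, 1]]);
index.add("a", [1, 0.2]);
index.add("b", [1, -0.05]);
const nearest = index.query([1, 0.01], 1);
```

In this toy setup the query `[1, 0.01]` returns `a`, even though `b` has the higher cosine similarity, because `b` sits just across a hyperplane in a different bucket. That is the recall loss in miniature: fewer vectors are scored per query, and a true neighbor is occasionally missed. Practical indexes reduce the miss rate by using several hash tables or trees, at the cost of more memory and work per query.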
