Replies: 1 comment
I am also trying to learn, but I think the results make sense. IVF65536_HNSW32,PQ32 is not strictly better than HNSW32,PQ32: the accuracy of HNSW32,PQ32 is higher (0.9943 vs 0.8069). Where it suffers is memory consumption, and the higher memory consumption is also expected.

HNSW32,PQ32 stores compressed vectors. Each vector is compressed to 32 bytes, and each vector has 64 links in the base layer. Each link needs 4 bytes (a 32-bit integer), so 64 links = 256 bytes and per vector we have 288 bytes. For 4M vectors this translates to 1,152,000,000 bytes, which is close to your observation of 1.2G (note I did not factor in the storage used by the higher layers of the HNSW graph).

IVF65536_HNSW32,PQ32 stores the 4M vectors in 65536 clusters. The compressed vectors take 32 bytes per vector * 4M vectors = 128,000,000 bytes. Next we account for the inverted lists: between them, the 65536 lists hold the IDs of all 4M vectors, and each ID is stored as a 64-bit integer, so that is another 32,000,000 bytes. We also build an HNSW graph over the 65536 centroids. Each centroid is a full 768-dimensional float32 vector (768 * 4 = 3,072 bytes), so the centroids take 65536 * 3,072 = 201,326,592 bytes, and their links take 64 links per centroid * 4 bytes per link * 65536 centroids = 16,777,216 bytes. Adding it all up gives roughly 378,000,000 bytes, which is close to the 375M you observed.
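Putting that arithmetic into a quick Python check (the per-item sizes are my assumptions about FAISS's internal layout: 4-byte graph links, 8-byte inverted-list IDs, float32 centroids; the higher HNSW layers and other overheads are ignored):

```python
# Rough memory estimate for the two indexes.
# Assumptions: HNSW links stored as 4-byte ints, inverted-list IDs as 8-byte ints,
# centroids as float32; higher HNSW layers and other overheads ignored.
n = 4_000_000       # database vectors
d = 768             # dimension
code_bytes = 32     # PQ32 -> 32 bytes per compressed vector
base_links = 64     # HNSW32 -> 2 * 32 links per node in the base layer
link_bytes = 4
id_bytes = 8
nlist = 65_536      # IVF65536

# HNSW32,PQ32: PQ code + base-layer links for every vector
hnsw_pq = n * (code_bytes + base_links * link_bytes)

# IVF65536_HNSW32,PQ32: PQ codes + inverted-list IDs + HNSW graph over the centroids
ivf_hnsw_pq = (n * code_bytes                      # compressed vectors
               + n * id_bytes                      # IDs in the inverted lists
               + nlist * d * 4                     # float32 centroids
               + nlist * base_links * link_bytes)  # links between centroids

print(f"HNSW32,PQ32:          {hnsw_pq:,} bytes (~1.15 GB)")
print(f"IVF65536_HNSW32,PQ32: {ivf_hnsw_pq:,} bytes (~378 MB)")
```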
I did the following experiment.
I have 4M normalized embeddings with dimension 768 and tested the creation of two indexes: the first one built with the factory string `IVF65536_HNSW32,PQ32` and the second one with `HNSW32,PQ32`, where `embs` is a numpy array with the 4M vectors. I then did a small test to check the 1-recall@10 when searching a random sample of the same vectors: with the first index I got 0.8069, and with the second one 0.9943.
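In outline, the setup looked like the sketch below (a simplified reconstruction, not the exact code: the default L2 metric, training on the full set, and the sample size of 1,000 are assumptions):

```python
import numpy as np
import faiss

d = 768

# embs is assumed to already exist: a float32 array of shape (4_000_000, 768),
# L2-normalized. For a self-contained run you could substitute something like:
#   embs = np.random.rand(200_000, d).astype("float32")
#   faiss.normalize_L2(embs)

# Default L2 metric; for normalized vectors this ranks neighbors the same way
# as inner product.
index1 = faiss.index_factory(d, "IVF65536_HNSW32,PQ32")
index1.train(embs)   # k-means for the 65536 centroids + PQ codebooks
index1.add(embs)

index2 = faiss.index_factory(d, "HNSW32,PQ32")
index2.train(embs)   # PQ codebooks only
index2.add(embs)

# 1-recall@10 on a random sample of the database vectors: since each query is a
# database vector, its true nearest neighbor is itself, so we check whether its
# own ID appears in the top 10 results.
rng = np.random.default_rng(0)
sample = rng.choice(embs.shape[0], size=1_000, replace=False)
queries = embs[sample]

for name, index in [("IVF65536_HNSW32,PQ32", index1), ("HNSW32,PQ32", index2)]:
    _, labels = index.search(queries, 10)
    recall = np.mean([sample[i] in labels[i] for i in range(len(sample))])
    print(name, "1-recall@10 =", recall)

# The index sizes quoted below refer to the serialized indexes, e.g.
#   faiss.write_index(index1, "ivf65536_hnsw32_pq32.index")
```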
The first index is only 375M and the second one, 1.2G. These results are quite surprising and I was actually expecting the opposite.
What am I doing wrong?