Fudist: An efficient distance approximation tool to accelerate the search of approximate nearest neighbors
We benchmark the distance approximation component of the following algorithms:
- ADSampling
- LSH-APG (LSH-pruning part)
- PQ
- OPQ
- PCA
- DWT
We also benchmark and combine many heuristic ideas and engineering optimizations from state-of-the-art (SOTA) papers. Building on these results, we propose Fudist, the best tool for distance approximation and pruning in ANN search.
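To illustrate what "distance approximation and pruning" means in this setting, here is a minimal sketch of partial-distance early termination for squared Euclidean distance. It is not Fudist's actual method, nor any of the benchmarked algorithms. The function name `pruned_sq_l2` and the `check_every` parameter are hypothetical and chosen only for this example. The idea is that a running sum over a prefix of the dimensions is a lower bound on the full distance, so a candidate can be discarded as soon as that bound exceeds the current search threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of Fudist): squared Euclidean distance with
// early abandoning. Returns the exact squared distance when it does not
// exceed `threshold`; otherwise returns +infinity to signal "pruned".
float pruned_sq_l2(const std::vector<float>& q, const std::vector<float>& x,
                   float threshold, std::size_t check_every = 4) {
    assert(q.size() == x.size());
    assert(check_every > 0);
    const float pruned = std::numeric_limits<float>::infinity();
    float partial = 0.0f;
    for (std::size_t i = 0; i < q.size(); ++i) {
        const float diff = q[i] - x[i];
        partial += diff * diff;
        // The partial sum is a lower bound on the full squared distance,
        // so once it exceeds the threshold the candidate cannot qualify.
        if ((i + 1) % check_every == 0 && partial > threshold) {
            return pruned;
        }
    }
    // Final check covers dimensions not aligned to `check_every`.
    return partial > threshold ? pruned : partial;
}
```

In a graph-based search such as HNSW, `threshold` would typically be the distance of the current k-th best candidate, so most far-away points are rejected after scanning only a few dimensions. Methods such as ADSampling or PCA-based pruning refine this idea by transforming the vectors first so that the leading dimensions give tighter estimates.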
- Eigen == 3.4.0
- Download the Eigen library from https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz.
- Unzip it and move the `Eigen` folder to `./src/`.
The tested datasets are available at https://www.cse.cuhk.edu.hk/systems/hash/gqr/datasets.html.
- Download and preprocess the datasets. Detailed instructions can be found in `./data/README.md`.
- Index the datasets. It could take several hours.

      # Index HNSW/HNSW+/HNSW++
      ./script/index_hnsw.sh

- Test the queries of the datasets. The results are generated in `./results/`. Detailed configurations can be found in `./script/README.md`.

      # Search HNSW/HNSW+/HNSW++
      ./script/search_hnsw.sh