Example High Performance Approximate Nearest Neighbors Algorithm Extension #671
garrettwrong started this conversation in Show and tell
A core algorithmic component of our RIR class averaging is the Nearest Neighbors computation. At the time of writing, ASPIRE contains a legacy (direct) computation, ported from the classic MATLAB, and a sklearn implementation. The sklearn implementation is adaptive: it selects from several algorithms based on the problem size, and specifics can be found in their documentation.

For extreme-scale problems there is a lot of active research in both industry and academia, yielding many high performance options. In fact, there are several projects benchmarking these algorithms, such as https://github.com/erikbern/ann-benchmarks.
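As a point of reference, sklearn's adaptive selection described above is exposed through its `algorithm="auto"` option. A minimal sketch (this is plain sklearn usage, not ASPIRE code):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))  # 100 points in 8 dimensions

# algorithm="auto" lets sklearn choose brute force, KD-tree, or ball tree
# based on the data; this is the adaptive behavior mentioned above.
nn = NearestNeighbors(n_neighbors=4, algorithm="auto").fit(X)
dist, ind = nn.kneighbors(X)

# For each query point drawn from the training set, the first
# returned neighbor is the point itself at distance zero.
```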
Unfortunately, each additional external dependency brings increased potential for software engineering burdens, so we don't ship ASPIRE with additional NN codes. On the other hand, I will demonstrate that for testing and experimental purposes it is relatively straightforward to install and use an alternative nearest neighbors implementation.
The attached notebook extends ASPIRE code to test an alternative nearest neighbors implementation. In this case we choose the highly regarded annoy package for approximate nearest neighbors. While this experiment is not representative of the code we would merge in, it is an example of a good starting point for beginning to work with the ASPIRE developers on a feature or extension. Generally speaking, similar approaches could be taken to experiment with other components of various algorithms in ASPIRE.

NearestNeighborExtension.pdf
(GitHub does not allow me to post the .ipynb at this time, but feel free to contact me for it.)
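Independent of the annoy package itself, the core idea it rests on, random projection trees, can be sketched in a few lines of numpy. The helper names below (`build_tree`, `query`) are illustrative only and not annoy's actual API; this is a toy of the technique, not the library:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 16))  # 1000 points in 16 dimensions

def build_tree(idx, max_leaf=32):
    # Recursively split points by a random hyperplane until leaves
    # are small; annoy builds a forest of many such trees.
    if len(idx) <= max_leaf:
        return ("leaf", idx)
    w = rng.standard_normal(X.shape[1])      # random split direction
    proj = X[idx] @ w
    mid = np.median(proj)
    left, right = idx[proj <= mid], idx[proj > mid]
    if len(left) == 0 or len(right) == 0:    # degenerate split
        return ("leaf", idx)
    return ("node", w, mid, build_tree(left, max_leaf), build_tree(right, max_leaf))

def query(tree, q):
    # Descend to the leaf bucket the query point falls into.
    while tree[0] == "node":
        _, w, mid, left, right = tree
        tree = left if q @ w <= mid else right
    return tree[1]

# Build a small forest, union the candidate buckets for a query,
# then rank only those candidates exactly.
trees = [build_tree(np.arange(len(X))) for _ in range(10)]
q = X[0]
cand = np.unique(np.concatenate([query(t, q) for t in trees]))
d = np.linalg.norm(X[cand] - q, axis=1)
approx_nn = cand[np.argsort(d)[:5]]
```

The trade-off is the usual ANN one: only the union of a few leaf buckets is searched exactly, so queries touch a small fraction of the data at the cost of occasionally missing a true neighbor; more trees raise recall at the cost of time and memory.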