Allow for AbstractVector #29
Conversation
CC @andyferris
@@ -1,11 +1,12 @@
-__precompile__()
+# __precompile__()
StaticArrays should now be precompiled, if that was the problem here (use 0.0.4 in the REQUIRE file above)
Yeah, it is just commented out because it is a WIP so I don't have to wait for recompilation :)
Looks like a nice generalization, @KristofferC. So I've managed to get it working with … To make a large improvement to the timings, I think we will want more parts of the code, like …
ndim_tree = size(tree.data, 1)
if ndim_points != ndim_tree
function check_input{V1, V2}(tree::NNTree{V1}, points::Vector{V2})
    if length(V1) != length(V2)
Here and elsewhere you are allowing `V` to be `V <: AbstractVector`, and then calling `length()` on the type. Do you think `V <: StaticVector` is more appropriate?
Not really, since someone might want to use a `FixedSizeArray`, for example. I don't want to bind `V` too much; I'd rather specify an interface that `V` has to follow (like defining `length` and `eltype`).
Well, that's true! (Except `FixedSizeArray` isn't a subtype of `AbstractArray`.)
Yeah hehe. But still :P
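To illustrate the kind of interface being discussed here, the sketch below (not from the PR; the `Point3D` type and `check_dims` function are hypothetical, and it uses current `where` syntax rather than the 0.5-era method syntax in the diff) shows a hand-rolled point type that defines `length` at the type level, so a dimension check like the one in the diff can work without binding `V` to a particular package:

```julia
# Hand-rolled 3-D point type; `length` is defined on the *type*, not just on
# instances, so a dimension check can happen at the type level.
# `eltype` already follows from subtyping `AbstractVector{T}`.
struct Point3D{T} <: AbstractVector{T}
    x::T
    y::T
    z::T
end

Base.length(::Type{Point3D{T}}) where {T} = 3
Base.size(::Point3D) = (3,)
Base.getindex(p::Point3D, i::Int) = getfield(p, i)

# Hypothetical type-level dimension check, mirroring the spirit of the diff.
function check_dims(::Type{V1}, ::Type{V2}) where {V1, V2}
    length(V1) == length(V2) || throw(DimensionMismatch("point dimensions differ"))
    return nothing
end

check_dims(Point3D{Float64}, Point3D{Float64})   # passes
```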
I am not surprised that there is not much of a speed improvement (for now I'm glad if there aren't any regressions). The majority of the time is spent fetching data from memory, and I have already been careful to optimize that as much as I can. What this PR brings is more of a generalization: the possibility to use a vector of points instead of a matrix. The only computations done on the hyper rectangle are in the creation of the tree. Otherwise, only computations in the dimension that the points were split on are performed (see NearestNeighbors.jl/src/kd_tree.jl, line 175 at 6517578).
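As a rough, self-contained illustration of that point (hypothetical node layout, not the actual kd_tree.jl code), descending the tree only touches the coordinate in the node's split dimension, so the per-node cost does not grow with the dimensionality of the points:

```julia
# Illustrative node layout; field names are made up for this sketch.
struct SplitNode{T}
    split_dim::Int   # dimension this node splits on
    split_val::T     # coordinate of the splitting hyperplane
    left::Int        # index of the child on the <= side
    right::Int       # index of the child on the > side
end

# Deciding which child to descend into reads a single coordinate of the
# query point, regardless of how many dimensions the points have.
function closer_child(node::SplitNode, point::AbstractVector)
    return point[node.split_dim] <= node.split_val ? node.left : node.right
end
```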
Right, makes sense! Thanks
do_return_inrange(idxs, ::AbstractVector) = idxs[1]
do_return_inrange(idxs, ::AbstractMatrix) = idxs
function inrange{V, T <: Number}(tree::NNTree{V}, point::AbstractVector{T}, radius::Number, sortres=false)
    idxs = inrange(tree, Vector{T}[point], radius, sortres)
Should be `Vector{typeof(point)}[point]`. Also, this one seems a little wasteful. We could copy the kernel from the above function (lines 21-29): measured about a 10% speedup.
Yeah, there are some optimizations to be made with regard to querying for only one point at a time. It could also be good to provide an API where everything is preallocated, like the vectors for storing indices and distances. The easiest thing is probably to split up the `knn` function into an outer part and an inner part, where the inner part only takes a single point.
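A minimal sketch of that outer/inner split, assuming hypothetical names (`knn_single`, `knn_batch`) and a toy stand-in for the tree; the inner part here uses brute force purely to keep the example self-contained, whereas the real inner part would walk the tree:

```julia
# Toy stand-in for a tree: just the stored points.
struct ToyTree{V}
    data::Vector{V}
end

# Inner part: answers one query point. A scalar `knn` API can call this
# directly instead of wrapping the point in a one-element vector.
function knn_single(tree::ToyTree, point, k::Int)
    dists = [sqrt(sum(abs2, p .- point)) for p in tree.data]
    order = collect(partialsortperm(dists, 1:k))
    return order, dists[order]
end

# Outer part: loops over a batch of points and delegates to the inner part.
# Preallocation of index/distance buffers would also live here.
function knn_batch(tree::ToyTree, points, k::Int)
    results = [knn_single(tree, p, k) for p in points]
    return first.(results), last.(results)
end

tree = ToyTree([rand(3) for _ in 1:100])
idxs, dists = knn_single(tree, rand(3), 5)   # single-point query, no wrapping
```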
I updated this. It should deal with single-point queries better now. Feel free to try it out. :)
Force-pushed from 53ea5da to e3a061e
Some updates: benchmarking is looking great: https://gist.github.com/KristofferC/d3666e6428a473ea93ccfea396391ab3
I think changing the HyperSpheres from each having their own …
Current coverage is 93.61% (diff: 94.41%)

@@             master       #29   diff @@
==========================================
  Files            14        14
  Lines           523       517       -6
  Methods           0         0
  Messages          0         0
  Branches          0         0
==========================================
+ Hits            440       484      +44
+ Misses           83        33      -50
  Partials          0         0
OK, I had a look at those benchmark comparisons, and it really seems to have helped. I've been using `KDTree`. How efficient is `BallTree`?
BallTree is not as heavily optimized as KDTree, but should in theory be faster for higher dimensions, and it also works with more general metrics than Minkowski. If you run the benchmarks and then run the markdown generator on the resulting file, you can compare the absolute performance of the trees. If your data is biased in some way, you should probably benchmark with that type of data.
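For context, a small usage sketch. It assumes the present-day NearestNeighbors.jl, Distances.jl, and StaticArrays.jl APIs, which may differ in detail from what existed when this PR was written; the point is that data can be passed as a vector of fixed-size points (what this PR enables), that `KDTree` is limited to Minkowski-type metrics, and that `BallTree` accepts any `Distances.Metric`:

```julia
using NearestNeighbors, Distances, StaticArrays

# 10_000 random 3-D points stored as a vector of static vectors,
# instead of a 3 x 10_000 matrix.
data = [@SVector rand(3) for _ in 1:10_000]

kdtree   = KDTree(data, Euclidean())    # Minkowski-type metrics only
balltree = BallTree(data, Chebyshev())  # any Distances.Metric is accepted

query = @SVector rand(3)
idxs, dists = knn(kdtree, query, 5, true)    # 5 nearest neighbours, sorted by distance
near        = inrange(balltree, query, 0.1)  # indices of points within radius 0.1
```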
Allow for entering a vector of points, where a point is an instance of an `AbstractVector`. Tests should pass except the `DataFreeTree` stuff. Need to benchmark things to see that nothing slows down. Help with testing / benchmarking is appreciated.

Should fix: #24, #17, #13