Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comparison of calculation speed #130

Open
pucicu opened this issue Feb 17, 2022 · 18 comments
Open

Comparison of calculation speed #130

pucicu opened this issue Feb 17, 2022 · 18 comments

Comments

@pucicu
Copy link
Contributor

pucicu commented Feb 17, 2022

I did some comparison of the calculation speed for RPs and simple RQA. More details at https://github.com/pucicu/RP_Speed_Test. The Julia implementation (RecurrenceAnalysis.jl) is (surprising to me) not the fastest one.

Computation speed for recurrence plots and recurrence quantification measures for the Rössler system.

Have fun! 😁

Norbert

@Datseris
Copy link
Member

Datseris commented Feb 17, 2022

Hi Norbert!

Thanks a lot for your post. Generally speaking I would be happy to see that there can be improvements to the library. However, are we sure that these benchmarks are accurate? I had a look at the repo you cite.

  • For Pyunicorn it seems to me that the function being benchmarked is: rqa_summary. The docstring states that it computes "the recurrence rate RR, the determinism DET, the average diagonal line length L and the laminarity LAM".
  • For RecurrenceAnalysis.jl, the function being benchmarked is rqa which computes:
    image
    So, it computes at least triple the amount of quantities as rqa_summary, and the quantities computed are also much more complex than average rates.
  • (I have no comments on the GPU version, as a direct performance comparison between GPU and CPU code can only yield the obvious result "GPU is faster", and cannot really be of any help to the CPU code. Of course, it is great that a GPU version exists for RQA!)

It comes to no surprise for me that our version is slower.

Furthermore, the benchmark also mixes the time it takes to compute the recurrence matrix with the time it takes to compute RQA measures. It would be more useful to separate those two, and would also help each library see where they can improve.

@heliosdrm
Copy link
Collaborator

For Pyunicorn it seems to me that the function being benchmarked is: rqa_summary. The docstring states that it computes "the recurrence rate RR, the determinism DET, the average diagonal line length L and the laminarity LAM".

Maybe it would be closer for Julia if rqa is called with the keyword argument onlydiagonal=true. As noted in the documentation:

The keyword argument `onlydiagonal` (`false` by default) can be set to `true`
in order to restrict the analysis to the recurrence rate and the parameters related
to diagonal structures (`:RR`, `:DET`, `:L`, `:Lmax`, `:DIV` and `:ENTR`), which makes
this function slightly faster.

Also, I have not checked if Pyunicorn or other alternatives run in parallel threads by default. This package does not, unless parallel=true is passed to the calculation of recurrence matrices.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 17, 2022

I know that some of the tools I compared are not computing all measures. But pyunicorn computes measures quantifying the diagonal and the vertical line structures, thus, requires finding the histograms of them. The calculation of the histograms is the most time consuming part, the different quantifiers, e.g., LAM, TT, Vmax, Ventr etc. are easy to calculate and do not need so much time as the histogram itself. You can only complain about the missing "white vertical" lines.

Surely, for a deeper test we should separate the calculations of RP and RQA. For the purpose, for which I implemented this test, such a summary was completely fine. And also in your comparison I found such a combination of calculation times: https://github.com/JuliaDynamics/RecurrenceAnalysis.jl/wiki/Comparison-of-software-packages-for-RQA.

@Datseris
Copy link
Member

Datseris commented Feb 17, 2022

I was not aware of the cited page https://github.com/JuliaDynamics/RecurrenceAnalysis.jl/wiki/Comparison-of-software-packages-for-RQA before. I only learned of it now. I have not even read it yet (haven't had the time yet!!!)

I've already opened a PR to add it to the official docs: JuliaDynamics/DynamicalSystems.jl#172 . In my opinion, documentation should be in one, and only one place. Mixing github wikis and readmes and offiicial docs makes information harder to locate, and much harder to maintain as well. So, we will update the page accordingly and put it in the official docs so that it is also more discoverable!

In the same PR we will also add a new comparison of performances!

The calculation of the histograms is the most time consuming part, the different quantifiers, e.g., LAM, TT, Vmax, Ventr etc. are easy to calculate and do not need so much time as the histogram itself. You can only complain about the missing "white vertical" lines.

Okay, yes, that is a fair point.

@Datseris
Copy link
Member

@pucicu does rqa_summary run on parallel threads by default?

@pucicu
Copy link
Contributor Author

pucicu commented Feb 17, 2022

@pucicu does rqa_summary run on parallel threads by default?

I'm not sure, but I don't think so. At least the CPU load is not showing such. MATLAB is running parallel.

Maybe it would be closer for Julia if rqa is called with the keyword argument onlydiagonal=true. … Also, I have not checked if Pyunicorn or other alternatives run in parallel threads by default. This package does not, unless parallel=true is passed to the calculation of recurrence matrices.

I have just tested both but could not find a big improvement, only a small one.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 18, 2022

I have tested the calculation time for RP and RQA separately. The result is really interesting.

Computation speed for recurrence plots and recurrence quantification measures for the Rössler system.

I could not include the R package, because we cannot separate the calculations there. In the MATLAB version, I have commented out the network measures. But I have also included a version using loops. Now MATLAB (vector/ pdist) has the same performance as pyunicorn for RP, and same as Julia for RQA.

But I'm quite sure that I did some stupid mistake with the Julia implementation of the loop and it might be possible that it could be much faster. And I could not find a difference in Julia using parallel=true or false and onlydiagonal=true or false.

@Datseris
Copy link
Member

Thanks a lot Norbert. This is very helpful.

. And I could not find a difference in Julia using parallel=true or false and onlydiagonal=true or false.

Okay, that is clearly a problem on our end, as these settings should be making a difference. For the parallel setting to work you would need to start Julia with multiple threads, which doesn't happen by default. I think either me or @heliosdrm should take a look at your repo when we have time, and maybe open Pull Requests for the Julia code snippets, if we think this is necessary.

Given that the pyunicorn version is clearly faster, we should also look at its source code and see what techniques we can copy from there ;)

@pucicu
Copy link
Contributor Author

pucicu commented Feb 18, 2022

Hehe, I have now checked the results and found them a bit strange. I had used the z-component of the Rössler as Horatio did in his comparison. But this causes quite large number of recurrences (RR=0.56). Now I switched to the x-component which gives better RPs and much lower recurrence density (RR=0.03). Now the speed is much better for Julia. Nevertheless, it seems that pyunicorn is more efficient for higher RR than Julia.

And it means that all the results above hold only for larger RR.

@Datseris
Copy link
Member

Didn't we had a discussion somewhere, with a PR that @hkraemer started, and we made a choice regarding this? We had two versions to compute something and one was better for one recurrence and the other...? I don't remember very well, please correct me if I am wrong.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 18, 2022

I also remember only vague. Hauke should know.

@heliosdrm
Copy link
Collaborator

Yes, it was about the skeletonization algorithm (#128 ). It resulted that the algorithm that is used to make the histograms works better the more sparse are the matrices. This is consistent with what Norbert says.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 18, 2022

but does it be included per default?

@Datseris
Copy link
Member

Oh yeah... There we tested RR=0.08 and RR=0.01. In both of these cases the "current version" was faster. We decided to drop the "alternative" version completely then.

Well, we can see now that this was a mistake. I suggest that we have both versions, and we have a keyword like highrr = false. If the user expects the given data would have a really high recurrence rate, they should use highrr = true, which would use the algorithm we decide to drop in #128.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 19, 2022

Seems to make sense, indeed.

@hkraemer
Copy link
Contributor

Hm, I don't know if that is really worth the effort, since RR>.20 or RR>.30 is not so common. On the other hand it is just one additional keyword and it had been implemented already. So, your call.

If we do it, we can make it switch methods automatically, whenever RR> some high value we need to figure out. - At least when fixedrate=true. Otherwise, in case of a fixed threshold, the user has to decide a priori.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 19, 2022

A good approach for an automatic selection could be to compare the threshold and the standard deviation. If the ratio between both exceeds a certain threshold, the other implementation is used.

@pucicu
Copy link
Contributor Author

pucicu commented Feb 19, 2022

Here the updated results using the Rössler x-component. I have used julia --threads 4 software_speed_parallel.jl and I can see with top that mulitple threads are running, but the speed is not better than in the single version.

Computation speed for recurrence plots and recurrence quantification measures for the Rössler system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants