Implement SIMD where applicable #49
Comments
I would suggest looking/asking around about the best ways to support SIMD in Rust, too, because I think there are crates that allow for determining whether or not to use SIMD at runtime based on CPU features and such.
Yes, there's a stable std::is_x86_feature_detected! macro as well as the Another crate to look at is
Any idea what is optimal here? We'd first want some fixed size bigint crate that avoids the
This is what we wrote Should have another release soon, although it doesn't yet have any SIMD features. One thing I can try to do prior to the next release is add support for loading certain sizes of big ints (it's using const generics internally, but still allows specialization around size) into SIMD registers, which will at least leave the door open for SIMD optimizations.
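For illustration, runtime dispatch with that macro could look roughly like the sketch below. The function names (`sum`, `sum_scalar`, `sum_avx2`) are hypothetical and not part of this crate; the example just sums 32-bit limbs with wrapping arithmetic, using AVX2 when the CPU reports it and a portable loop otherwise.

```rust
/// Portable fallback: wrapping sum of 32-bit limbs.
pub fn sum_scalar(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// AVX2 path (hypothetical helper): adds eight u32 lanes per iteration.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut acc = _mm256_setzero_si256();
    let chunks = data.chunks_exact(8);
    let tail = chunks.remainder();
    for chunk in chunks {
        // Unaligned load of eight u32 lanes, then lane-wise wrapping add.
        let v = unsafe { _mm256_loadu_si256(chunk.as_ptr() as *const __m256i) };
        acc = _mm256_add_epi32(acc, v);
    }
    // Spill the eight partial sums and fold them together with the tail.
    let mut lanes = [0u32; 8];
    unsafe { _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, acc) };
    sum_scalar(&lanes).wrapping_add(sum_scalar(tail))
}

/// Picks the fastest available implementation at runtime.
pub fn sum(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed by the check above.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}
```

The same binary then runs on CPUs with and without AVX2, at the cost of one feature check per call (which could be cached if it ever shows up in profiles).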
What got me looking into this was the somewhat slow key generation and the related issue #29. A solution, or rather an improvement, would be to implement SIMD (Single Instruction, Multiple Data). I found a paper by Freescale Semiconductor on this topic here: http://application-notes.digchip.com/314/314-66328.pdf.
Since not all processors support the AVX/SSE family of SIMD instructions, this would have to be implemented under a feature flag or, as in the case of aes_gcm, enabled only when the compiler is passed the appropriate flags.
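As a general illustration of compile-time gating (not aes_gcm's actual configuration), the sketch below selects an implementation with `cfg(target_feature = ...)`. The function `add_lanes` is hypothetical, and it performs a carry-free lane-wise addition rather than a real big-integer add. The AVX2 version is only compiled when the target enables that feature, for example via rustc's `-C target-feature=+avx2`; otherwise the portable version is used.

```rust
/// AVX2 build: one 256-bit lane-wise wrapping add.
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
))]
pub fn add_lanes(a: [u32; 8], b: [u32; 8]) -> [u32; 8] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut out = [0u32; 8];
    // SAFETY: AVX2 is statically enabled for this build by the cfg above,
    // and all pointers refer to arrays of exactly 32 bytes.
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, _mm256_add_epi32(va, vb));
    }
    out
}

/// Portable build: the same operation, one lane at a time.
#[cfg(not(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
)))]
pub fn add_lanes(a: [u32; 8], b: [u32; 8]) -> [u32; 8] {
    let mut out = [0u32; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Compared with runtime detection, this has no dispatch overhead, but a binary built with the feature enabled will not run correctly on CPUs that lack it.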
Update: What takes time during key generation is finding big primes, and that is done here: https://github.com/dignifiedquire/num-bigint/blob/master/src/bigrand.rs#L324-L371
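To show why the prime search dominates, here is a scaled-down sketch of a generate-and-test loop on `u64` values. It is not the linked `num-bigint` code: the xorshift generator, the function names, and the 64-bit limit are all assumptions made to keep the example dependency-free. Most candidates are rejected, and each surviving candidate costs several modular exponentiations, so the modular multiplications are where a faster (possibly SIMD) backend would pay off.

```rust
/// Tiny xorshift64 generator (hypothetical, NOT cryptographically secure).
/// The seed must be nonzero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

fn pow_mod(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut result = 1 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = mul_mod(result, base, m);
        }
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    result
}

/// Miller-Rabin with the first twelve primes as bases, which is
/// deterministic for every u64 input.
pub fn is_prime(n: u64) -> bool {
    const BASES: [u64; 12] = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37];
    if n < 2 {
        return false;
    }
    // Cheap trial division first: rejects most candidates early.
    for &p in &BASES {
        if n % p == 0 {
            return n == p;
        }
    }
    // Write n - 1 as d * 2^s with d odd.
    let s = (n - 1).trailing_zeros();
    let d = (n - 1) >> s;
    'witness: for &a in &BASES {
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue;
        }
        for _ in 1..s {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'witness;
            }
        }
        return false; // `a` is a witness that n is composite
    }
    true
}

/// Draws random odd `bits`-bit candidates until one is prime.
/// Returns the prime and the number of candidates tried.
pub fn gen_prime(bits: u32, rng: &mut XorShift64) -> (u64, u32) {
    assert!((2..=64).contains(&bits));
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        let mut candidate = rng.next_u64() >> (64 - bits);
        candidate |= 1u64 << (bits - 1); // force the requested bit length
        candidate |= 1; // force odd
        if is_prime(candidate) {
            return (candidate, attempts);
        }
    }
}
```

By the prime number theorem, roughly one in ln(2^k)/2 odd k-bit candidates is prime, so a 1024-bit search tests on the order of a few hundred candidates, each with big-integer exponentiations.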
I might take a deeper look at this when my exams are over.