parallel reduction #36
Comments
Laser is still in research mode so plenty of things are implemented but not properly exposed in a high-level API. To do a reduction you can do it as it's done for the sum reduction:
I will create min and max tomorrow, so that they are ready to use. Alternatively, if you use a Tensor there are 4 ways to do parallel reduction in this example: https://github.com/numforge/laser/blob/af191c086b4a98c49049ecf18f5519dc6856cc77/examples/ex05_tensor_parallel_reduction.nim#L9-L95 Note that the underlying
I've added the min and max reductions. They only work for float32 at the moment, but if needed it's easy to extend them to other types.
thanks very much for your links and the new reduce_min stuff. I can get this to work from the laser src directory but if I move it elsewhere I get a long traceback ending with:
I can move the same file containing:

```nim
import
  random, sequtils,
  laser/primitives/reductions

proc main() =
  let interval = -1f .. 1f
  let size = 10_000_000
  let buf = newSeqWith(size, rand(interval))
  echo reduce_min(buf[0].unsafeAddr, buf.len)

main()
```

in and out of ~/src/laser, and it works inside that directory but not outside it.
btw, this gives a nearly 5x speed improvement on my laptop on my example use-case so this will be a nice improvement!
That's unfortunately one of Nim's limitations. The reductions_sse3 file calls SSE3 intrinsics. On x86_64 the compiler can only assume SSE2 support, and more advanced SIMD instructions require an explicit compiler flag. As I want the library to have a fallback when no SSE3 is available, I can't just pass the flag globally. So the SSE3 flag is passed per-file (instead of globally) via an undocumented feature of nim.cfg: https://github.com/numforge/laser/blob/2f619fdbb2496aa7a5e5538035a8d42d88db8c10/nim.cfg#L32. So you need to add yourfilename.always = "-msse3" if you use the primitive outside of laser. Ultimately, @Araq said that he wants to provide a way to set per-file compilation flags from within a .nim file, which would be very helpful.
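As a minimal sketch of that workaround, assume the program using the primitive is saved as `minbench.nim` (a hypothetical file name) and that laser is already importable from that project. A `nim.cfg` placed next to the file could then carry the per-file entry in the form quoted above:

```cfg
# nim.cfg, placed next to minbench.nim (hypothetical file name)
# Per-file C compiler flag, in the same form as the entry in laser's own nim.cfg.
minbench.always = "-msse3"
```

Because this nim.cfg feature is undocumented, treat the exact key format as an assumption and check it against the linked line in laser's nim.cfg.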
got it. thanks for the explanation.
hi, I wanted to try out laser. I have this code working:
do I need an omp_critical section for the final result? and/or any other problems?
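On the omp_critical question, here is a sequential sketch of the usual blocked-reduction pattern. It does not use laser's API or OpenMP, and `blockedMin` is a hypothetical helper name. It only shows where shared state is touched: each block computes a private minimum, and the single merge line is the step that would need a critical section (or an equivalent reduction mechanism) if the blocks ran on separate threads.

```nim
# Sketch only: sequential simulation of a blocked min reduction.
# blockedMin is a hypothetical helper, not part of laser.
proc blockedMin(data: openArray[float32], numBlocks: int): float32 =
  assert data.len > 0 and numBlocks > 0
  result = float32(Inf)
  # Ceiling division so every element falls into some block.
  let blockSize = (data.len + numBlocks - 1) div numBlocks
  for b in 0 ..< numBlocks:
    let start = b * blockSize
    let stop = min(start + blockSize, data.len)
    # Block-private accumulator: no synchronization needed while scanning.
    var local = float32(Inf)
    for i in start ..< stop:
      local = min(local, data[i])
    # Merge step: the only write to the shared result. In a threaded
    # version this is the line that must be protected.
    result = min(result, local)

echo blockedMin([3'f32, -1'f32, 4'f32, 1'f32, -5'f32, 9'f32], 4)
```

Keeping the accumulator private to each block means the protected merge runs once per block rather than once per element, so the cost of synchronization stays small.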
And here is my calling code from your examples/