Don't exploit sparsity in Transform3D, SpatialInertia? #212

Open
tkoolen opened this issue Apr 24, 2017 · 1 comment

tkoolen commented Apr 24, 2017

Especially on newer CPU architectures, it may be faster not to exploit sparsity, e.g. when multiplying homogeneous transforms: a single dense 4×4 matrix product instead of a 3×3 rotation product plus a separate translation update.
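To make the two options concrete, here is a minimal sketch showing that both compute the same transform. It assumes StaticArrays is installed; `homogeneous` is a hypothetical helper introduced only for this illustration and is not part of the package under discussion.

```julia
using StaticArrays

# Hypothetical helper (not from the issue): pack a 3×3 rotation block and a
# translation into a dense 4×4 homogeneous transform [R p; 0 0 0 1].
function homogeneous(rot::SMatrix{3, 3, T}, trans::SVector{3, T}) where {T}
    bottom = SMatrix{1, 4, T}(0, 0, 0, 1)
    return vcat(hcat(rot, trans), bottom)
end

arot, brot = rand(SMatrix{3, 3}), rand(SMatrix{3, 3})
atrans, btrans = rand(SVector{3}), rand(SVector{3})

# Exploiting sparsity: only the nontrivial blocks are computed.
rot_sparse = arot * brot
trans_sparse = atrans + arot * btrans

# Not exploiting sparsity: one dense 4×4 product, including the known
# zeros and the 1 in the bottom row.
c = homogeneous(arot, atrans) * homogeneous(brot, btrans)

# Both routes describe the same transform, up to floating-point rounding.
@assert c[1:3, 1:3] ≈ rot_sparse
@assert c[1:3, 4] ≈ trans_sparse
@assert c[4, :] == SVector(0.0, 0.0, 0.0, 1.0)
```

The question raised here is only which of the two routes runs faster, since their results agree.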

AVX2-capable machine:

Julia Version 0.6.0-pre.beta.295
Commit dc907c7 (2017-04-24 04:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6950X CPU @ 3.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

Older, non-AVX2-capable machine:

Julia Version 0.6.0-pre.beta.295
Commit dc907c760f (2017-04-24 04:37 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)

In each case, I rebuilt the system image for the native architecture.

Exploiting sparsity

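using BenchmarkTools, StaticArrays

# Compose two transforms block-wise: rotation product plus translation update.
@benchmark (arot * brot, atrans + arot * btrans) setup = begin
    arot = rand(SMatrix{3, 3})
    brot = rand(SMatrix{3, 3})
    atrans = rand(SVector{3})
    btrans = rand(SVector{3})
end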

AVX2:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     12.423 ns (0.00% GC)
  median time:      12.496 ns (0.00% GC)
  mean time:        12.906 ns (0.00% GC)
  maximum time:     32.182 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

Non-AVX2:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.598 ns (0.00% GC)
  median time:      11.208 ns (0.00% GC)
  mean time:        11.527 ns (0.00% GC)
  maximum time:     91.898 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999
  time tolerance:   5.00%
  memory tolerance: 1.00%

Not exploiting sparsity

# One dense 4×4 product standing in for a homogeneous-transform multiplication.
@benchmark a * b setup = (a = rand(SMatrix{4, 4}); b = rand(SMatrix{4, 4}))

AVX2:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     5.331 ns (0.00% GC)
  median time:      5.344 ns (0.00% GC)
  mean time:        5.565 ns (0.00% GC)
  maximum time:     26.861 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

Non-AVX2:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     7.679 ns (0.00% GC)
  median time:      8.138 ns (0.00% GC)
  mean time:        8.520 ns (0.00% GC)
  maximum time:     54.870 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999
  time tolerance:   5.00%
  memory tolerance: 1.00%
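
For context, a rough operation count for the two benchmarked expressions (not part of the original measurements). The symbols are assumptions for this note: R_a and R_b stand for `arot` and `brot`, p_a and p_b for `atrans` and `btrans`. Each term is written as multiplies + adds.

```latex
\begin{align*}
\text{exploiting sparsity:}\quad
  & \underbrace{27 + 18}_{R_a R_b}
  + \underbrace{9 + 6}_{R_a p_b}
  + \underbrace{0 + 3}_{p_a + (R_a p_b)}
  = 63 \text{ flops} \\
\text{dense } 4 \times 4\text{:}\quad
  & \underbrace{64}_{\text{multiplies}} + \underbrace{48}_{\text{adds}}
  = 112 \text{ flops}
\end{align*}
```

So the dense product does roughly 1.8× the arithmetic. Even so, its minimum time is about 2.3× lower on the AVX2 machine (5.331 ns vs 12.423 ns) and about 1.4× lower on the older one (7.679 ns vs 10.598 ns). One plausible reading, which is an assumption and not something measured here, is that 4-element Float64 columns map onto SIMD registers more cleanly than 3-element ones.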

tkoolen commented Apr 24, 2017

Doing this before #207 would simplify #207.
