Major change: build with the clang compiler instead of MSVC. In some cases clang optimizes the Eigen library's template code somewhat better than MSVC. By default we now also set -mavx2 -mfma
to target AVX2 and FMA instead of plain AVX for a bit more performance (note that on an older CPU without AVX2 support the program will crash, typically with an illegal-instruction error), and we set -O3
for some additional optimization.
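
Since a binary built with -mavx2 -mfma crashes on CPUs without those instructions, a startup check can turn the crash into a readable error. The following is only a minimal sketch, not code from this project: the names bit_set, read_xcr0 and cpu_supports_avx2_fma are hypothetical, and it assumes an x86 target compiled with clang or GCC, where <cpuid.h> provides __get_cpuid and __get_cpuid_count. On other targets the sketch simply reports false.

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // __get_cpuid, __get_cpuid_count (clang and GCC)
#define HAS_X86_CPUID 1
#else
#define HAS_X86_CPUID 0
#endif

// Hypothetical helper: true if bit `bit` (0-31) is set in `reg`.
constexpr bool bit_set(std::uint32_t reg, int bit) {
    return ((reg >> bit) & 1u) != 0;
}

#if HAS_X86_CPUID
// Hypothetical helper: read XCR0, which reports the register state
// the OS saves and restores. Only call this after confirming OSXSAVE.
static std::uint64_t read_xcr0() {
    std::uint32_t lo = 0, hi = 0;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#endif

// Hypothetical check: true only if the CPU reports AVX, FMA and AVX2
// and the OS has enabled the YMM register state.
bool cpu_supports_avx2_fma() {
#if HAS_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    // CPUID leaf 1: ECX bit 27 = OSXSAVE, bit 28 = AVX, bit 12 = FMA.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
    if (!bit_set(ecx, 27) || !bit_set(ecx, 28) || !bit_set(ecx, 12)) {
        return false;
    }

    // XCR0 bits 1 (SSE state) and 2 (AVX state) must both be enabled.
    if ((read_xcr0() & 0x6u) != 0x6u) return false;

    // CPUID leaf 7, subleaf 0: EBX bit 5 = AVX2.
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    return bit_set(ebx, 5);
#else
    return false;  // Non-x86 target: AVX2 does not apply.
#endif
}
```

One way to use this would be to call cpu_supports_avx2_fma() first thing in main and exit with a clear message when it returns false. For that to be reliable, the check should probably live in a source file compiled without -mavx2 -mfma, because otherwise the compiler is free to emit AVX2 instructions before the check runs.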