[BUG] Memory Data Volume formula wrong for AMD Zen2 #510
Comments
Today I reinstalled likwid 5.2.2 with spack to see what happens with an out-of-the-box installation.
It seems there are no values for … Interestingly, it seems to work with some applications, but not the one I want to benchmark.
According to the commit history, the |
Even so, sometimes those values are initialized (I suspect it has something to do with how I start my application, perhaps the environment), and when they are, the values are wrong. Our system has 8 memory NUMA domains (each with 16 cores and 4 L3 caches) across 2 sockets (each with 64 cores). This would lead to
The factor you see depends on the number of used NUMA domains of a CPU socket. With a single NUMA domain, the numbers are valid [1]. When using 2 NUMA domains [2], you get a factor of two. With 3 NUMA domains [3], you need a factor of 3, and so on [4]. But this is not something LIKWID can determine for arbitrary applications. AMD mentions in the docs that memory measurements are only supported for NPS1 mode. The "correction factor" was added by me based on observations and is not confirmed by AMD.
With Zen1, the memory controllers were per NUMA domain, so it could determine the traffic better. With Zen2, AMD switched to socket-local units, and this seems to be a problem now. Also, the more recent generations provide memory measurements only in NPS2 mode.
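To make the dependence on "used NUMA domains per socket" concrete, here is a minimal sketch that counts them for a given set of CPU IDs. It uses the topology described in this thread (16 cores per NUMA domain, 4 NUMA domains per socket) and assumes, hypothetically, that cores are numbered contiguously by domain and socket; real systems may number cores differently. The function name `used_domains_per_socket` is made up for illustration and is not part of LIKWID.

```python
from collections import defaultdict

CORES_PER_DOMAIN = 16    # from the system described in this thread
DOMAINS_PER_SOCKET = 4   # 8 NUMA domains across 2 sockets


def used_domains_per_socket(cpu_ids):
    """Return {socket index: number of distinct NUMA domains touched}.

    Assumption (hypothetical): CPU IDs are contiguous per NUMA domain,
    and NUMA domains are contiguous per socket.
    """
    domains = defaultdict(set)
    for cpu in cpu_ids:
        domain = cpu // CORES_PER_DOMAIN
        socket = domain // DOMAINS_PER_SOCKET
        domains[socket].add(domain)
    return {socket: len(ds) for socket, ds in domains.items()}


print(used_domains_per_socket(range(0, 16)))    # {0: 1}
print(used_domains_per_socket(range(0, 48)))    # {0: 3}
print(used_domains_per_socket(range(0, 128)))   # {0: 4, 1: 4}
```

Per the observation above (not confirmed by AMD), the per-socket count is the factor by which the measurement is off, which is why a run pinned to one NUMA domain looks correct while a full-socket run does not.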
For our AMD Zen2 processor (EPYC 7702) the formula to calculate the memory data volume seems to be wrong by a factor of 2.
We have a 2 socket system, meaning 8 NUMA domains.
The formula reads:
`likwid-bench` with `stream_mem_avx` (using 1 workgroup per L3 NUMA domain, but big enough to not fit in L3) gives:

While `likwid-perfctr -m -g MEM` on the benchmark gives:

For my own benchmarks I just hardcoded that `4.0/(num_numadomains/num_sockets)` factor to be 2, which gives results that coincide with `likwid-bench`. But I would still like it to work out of the box for my colleagues if they install likwid themselves. They might not be familiar enough with likwid to notice.

Where do these `num_numadomains` and `num_sockets` variables come from? Does this maybe depend on the actual processor within the Zen2 architecture?
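As a quick sanity check on the arithmetic, the following sketch evaluates the quoted factor for the system in this report. It assumes the two variables resolve to 8 NUMA domains and 2 sockets, which the thread does not confirm; the helper name `mem_scale_factor` is hypothetical and not a LIKWID function.

```python
def mem_scale_factor(num_numadomains: int, num_sockets: int) -> float:
    """Evaluate the factor 4.0/(num_numadomains/num_sockets) quoted in the issue."""
    return 4.0 / (num_numadomains / num_sockets)


# Assumed values for the dual-socket EPYC 7702 system described here.
from_formula = mem_scale_factor(num_numadomains=8, num_sockets=2)
hardcoded = 2.0  # the value the reporter substituted by hand

print(from_formula)              # 1.0
print(hardcoded / from_formula)  # 2.0
```

Under these assumptions the formula yields 1.0, so hardcoding 2 matches the reported factor-of-2 discrepancy.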