forked from RRZE-HPC/likwid
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'master' of github.com:RRZE-HPC/likwid
- Loading branch information
Showing
1 changed file
with
37 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
SHORT L3 cache miss rate/ratio | ||
|
||
EVENTSET | ||
FIXC0 INSTR_RETIRED_ANY | ||
FIXC1 CPU_CLK_UNHALTED_CORE | ||
FIXC2 CPU_CLK_UNHALTED_REF | ||
FIXC3 TOPDOWN_SLOTS | ||
PMC0 MEM_LOAD_RETIRED_L3_HIT | ||
PMC1 MEM_LOAD_RETIRED_L3_MISS | ||
PMC2 UOPS_RETIRED_SLOTS | ||
|
||
METRICS | ||
Runtime (RDTSC) [s] time | ||
Runtime unhalted [s] FIXC1*inverseClock | ||
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock | ||
CPI FIXC1/FIXC0 | ||
L3 request rate (PMC0+PMC1)/PMC2 | ||
L3 miss rate PMC1/PMC2 | ||
L3 miss ratio PMC1/(PMC0+PMC1) | ||
|
||
LONG | ||
Formulas: | ||
L3 request rate = (MEM_LOAD_RETIRED_L3_HIT+MEM_LOAD_RETIRED_L3_MISS)/UOPS_RETIRED_SLOTS | ||
L3 miss rate = MEM_LOAD_RETIRED_L3_MISS/UOPS_RETIRED_SLOTS | ||
L3 miss ratio = MEM_LOAD_RETIRED_L3_MISS/(MEM_LOAD_RETIRED_L3_HIT+MEM_LOAD_RETIRED_L3_MISS) | ||
- | ||
This group measures the locality of your data accesses with regard to the | ||
L3 cache. L3 request rate tells you how data intensive your code is | ||
or how many data accesses you have on average per instruction. | ||
The L3 miss rate gives a measure how often it was necessary to get | ||
cache lines from memory. And finally L3 miss ratio tells you how many of your | ||
memory references required a cache line to be loaded from a higher level. | ||
While the data cache miss rate might be given by your algorithm you should | ||
try to get data cache miss ratio as low as possible by increasing your cache reuse. | ||
With Intel Icelake, the retired UOPs cannot be measured anymore but instead, it uses the number | ||
of used retirement slots as basis for the rates. | ||
|