Skip to content

Commit

Permalink
Use same performance group L3 for SkylakeX and CascadelakeX
Browse files Browse the repository at this point in the history
  • Loading branch information
TomTheBear committed Apr 15, 2021
1 parent 87166b1 commit d548407
Showing 1 changed file with 23 additions and 11 deletions.
34 changes: 23 additions & 11 deletions groups/CLX/L3.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@ FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0 L2_LINES_IN_ALL
PMC1 L2_TRANS_L2_WB
PMC2 IDI_MISC_WB_DOWNGRADE
PMC3 IDI_MISC_WB_UPGRADE


METRICS
Runtime (RDTSC) [s] time
Expand All @@ -14,23 +17,32 @@ Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
L3 load bandwidth [MBytes/s] 1.0E-06*PMC0*64.0/time
L3 load data volume [GBytes] 1.0E-09*PMC0*64.0
L3 evict bandwidth [MBytes/s] 1.0E-06*PMC1*64.0/time
L3 evict data volume [GBytes] 1.0E-09*PMC1*64.0
L3 evict bandwidth [MBytes/s] 1.0E-06*PMC3*64.0/time
L3 evict data volume [GBytes] 1.0E-09*PMC3*64.0
L3|MEM evict bandwidth [MBytes/s] 1.0E-06*PMC1*64.0/time
L3|MEM evict data volume [GBytes] 1.0E-09*PMC1*64.0
Dropped CLs bandwidth [MBytes/s] 1.0E-6*PMC2*64.0/time
Dropped CLs data volume [GBytes] 1.0E-9*PMC2*64.0
L3 bandwidth [MBytes/s] 1.0E-06*(PMC0+PMC1)*64.0/time
L3 data volume [GBytes] 1.0E-09*(PMC0+PMC1)*64.0

LONG
Formulas:
L3 load bandwidth [MBytes/s] = 1.0E-06*L2_LINES_IN_ALL*64.0/time
L3 load data volume [GBytes] = 1.0E-09*L2_LINES_IN_ALL*64.0
L3 evict bandwidth [MBytes/s] = 1.0E-06*L2_TRANS_L2_WB*64.0/time
L3 evict data volume [GBytes] = 1.0E-09*L2_TRANS_L2_WB*64.0
L3 evict bandwidth [MBytes/s] = 1.0E-06*IDI_MISC_WB_UPGRADE*64.0/time
L3 evict data volume [GBytes] = 1.0E-09*IDI_MISC_WB_UPGRADE*64.0
Dropped CLs bandwidth [MBytes/s] = 1.0E-6*IDI_MISC_WB_DOWNGRADE*64.0/time
Dropped CLs data volume [GBytes] = 1.0E-9*IDI_MISC_WB_DOWNGRADE*64.0
L3|MEM evict bandwidth [MBytes/s] = 1.0E-06*L2_TRANS_L2_WB*64.0/time
L3|MEM evict data volume [GBytes] = 1.0E-09*L2_TRANS_L2_WB*64.0
L3 bandwidth [MBytes/s] = 1.0E-06*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64/time
L3 data volume [GBytes] = 1.0E-09*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64
-
Profiling group to measure L3 cache bandwidth. The bandwidth is computed by the
number of cache line allocated in the L2 and the number of modified cache lines
evicted from the L2. This group also output data volume transferred between the
L3 and measured cores L2 caches. Note that this bandwidth also includes data
transfers due to a write allocate load on a store miss in L2.

--
Profiling group to measure L3 cache bandwidth and data volume. For Intel Skylake
or Cascadelake, the L3 is a victim cache. This means that all data is loaded
from memory directly into the L2 cache (if L3 prefetcher is inactive). Modified
data in L2 is evicted to L3 (additional data transfer due to non-inclusivenss of
L3 can be measured). Clean cache lines (only loaded data) might get dropped in
L2 to reduce traffic. If amount of clean cache lines is smaller than L3, it
might be evicted to L3 due to some heuristic.

0 comments on commit d548407

Please sign in to comment.