likwid-perfscope
likwid-perfscope is a command-line application written in Lua that uses the timeline mode of likwid-perfctr to plot the current measurements on the fly. It uses the feedGnuplot Perl script to send the current data to gnuplot. For convenience, preconfigured plots of interesting metrics are embedded in likwid-perfscope. Because the plot windows would normally close as soon as the monitored application exits, likwid-perfscope keeps them open and waits until Ctrl+C is pressed.
-h, --help Help message
-v, --version Version information
-V, --verbose <level> Verbose output, 0 (only errors), 1 (info), 2 (details), 3 (developer)
-a Print all preconfigured plot configurations for the current system.
-c <list> Processor ids to measure, e.g. 1,2-4,8
-C <list> Processor ids to pin threads and measure, e.g. 1,2-4,8
-g, --group <string> Preconfigured plot group or custom event set string with plot config.
-t, --time <time> Measurement interval in s, ms or us, e.g. 300ms, for the timeline mode of likwid-perfctr
-d, --dump Print output as it is sent to feedGnuplot.
-p, --plotdump Use the dump functionality of feedGnuplot. Prints the plot configuration plus data, ready to submit directly to gnuplot.
--host <host> Run likwid-perfctr on the selected host using SSH. Evaluation and plotting is done locally.
This can be used for machines that have no gnuplot installed. All paths on the host must match those on the local machine.
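The -c and -C options take CPU lists such as 1,2-4,8. As an illustration of how such a list reads, the following minimal Python sketch expands the plain numeric form into individual processor ids. It is not part of LIKWID; the function name `expand_cpu_list` is hypothetical, and the sketch deliberately ignores domain expressions such as S0:0, which the tool resolves itself.

```python
def expand_cpu_list(spec: str) -> list[int]:
    """Expand a plain list such as '1,2-4,8' into individual processor ids.

    Assumption: only comma-separated ids and inclusive 'first-last' ranges
    are handled; LIKWID domain expressions (e.g. 'S0:0') are not.
    """
    cpus: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range like '2-4' includes both end points.
            first, last = (int(x) for x in part.split("-", 1))
            cpus.extend(range(first, last + 1))
        else:
            cpus.append(int(part))
    return cpus


print(expand_cpu_list("1,2-4,8"))
```

With the example from the option description, the list covers processors 1, 2, 3, 4 and 8.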
The basic usage of likwid-perfscope is to select one of the predefined plot configurations that are embedded in the Lua script. All of them plot rates or other time-resolved metrics, e.g. MByte/s or FLOP/s. A list of all plots available for the current architecture can be retrieved with
$ likwid-perfscope -a
which prints on an Intel IvyBridge EP system:
Group NUMA
Perfctr group: NUMA
Match for metric: Local DRAM bandwidth [MByte/s]
Title of plot: NUMA separated memory bandwidth
Title of x-axis: Time
Title of y-axis: Bandwidth [MBytes/s]
Match for second metric: Remote DRAM bandwidth [MByte/s]
Title of y2-axis: Bandwidth [MBytes/s]
Group MEM_BAND
Perfctr group: MEM
Match for metric: Memory bandwidth [MBytes/s]
Title of plot: Memory bandwidth
Title of x-axis: Time
Title of y-axis: Bandwidth [MBytes/s]
Group FLOPS_DP
Perfctr group: FLOPS_DP
Match for metric: MFlops/s
Title of plot: Double Precision Flop Rate
Title of x-axis: Time
Title of y-axis: MFlops/s
Group L2_BAND
Perfctr group: L2
Match for metric: L2 bandwidth [MBytes/s]
Title of plot: L2 cache bandwidth
Title of x-axis: Time
Title of y-axis: Bandwidth [MBytes/s]
Group L3_BAND
Perfctr group: L3
Match for metric: L3 bandwidth [MBytes/s]
Title of plot: L3 cache bandwidth
Title of x-axis: Time
Title of y-axis: Bandwidth [MBytes/s]
Group FLOPS_SP
Perfctr group: FLOPS_SP
Match for metric: MFlops/s
Title of plot: Single Precision Flop Rate
Title of x-axis: Time
Title of y-axis: MFlops/s
Group TEMP
Perfctr group: ENERGY
Match for metric: Temperature [C]
Title of plot: Temperature
Title of x-axis: Time
Title of y-axis: Temperature [C]
Group POWER
Perfctr group: ENERGY
Match for metric: Power [W]
Title of plot: Consumed power
Title of x-axis: Time
Title of y-axis: Power [W]
Match for second metric: Power DRAM [W]
Title of y2-axis: Power DRAM [W]
Group QPI_BAND
Perfctr group: QPI
Match for metric: QPI data bandwidth [MByte/s]
Title of plot: QPI bandwidth
Title of x-axis: Time
Title of y-axis: Bandwidth [MBytes/s]
Match for second metric: QPI link bandwidth [MByte/s]
Title of y2-axis: Bandwidth [MBytes/s]
You can run these groups in the same way as with likwid-perfctr, for example:
$ likwid-perfscope -C S0:0 -g L3_BAND ./a.out
which measures the L3 cache bandwidth on the first CPU of socket 0 and plots it with the title "L3 cache bandwidth", the x-axis label "Time" and the y-axis label "Bandwidth [MBytes/s]". If you run on multiple CPUs, each CPU gets its own line in the plot.
Some plot configurations, like POWER, plot two lines per CPU: one for the CPU package power consumption and one for the DRAM power consumption. The DRAM power consumption uses the right y-axis with its own axis label "Power DRAM [W]".
You can increase the number of samples by passing a shorter interval with -t <time> on the command line. The default is one sample per second.
$ likwid-perfscope -C S0:0 -g L3_BAND -t 500ms ./a.out
Moreover, you can use the group switching functionality of the timeline mode to measure multiple metrics at once:
$ likwid-perfscope -C S0:0 -g L3_BAND -g L2_BAND -g MEM_BAND -t 500ms ./a.out
Each group opens its own plotting window, and the windows are updated in a round-robin fashion. Each group is measured for 500ms.
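To get a feeling for how group switching affects the update rate of each window, the following Python sketch converts a -t style interval into seconds and multiplies it by the number of groups. The helper names are hypothetical, and the result rests on two assumptions that the text does not state explicitly: switching overhead is negligible, and a window only receives new data while its group is active.

```python
import re

# Unit suffixes named in the -t option description, mapped to seconds.
_UNITS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def interval_seconds(text: str) -> float:
    """Convert a string such as '500ms' into seconds."""
    match = re.fullmatch(r"(\d+)(s|ms|us)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised interval: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def refresh_period(interval: str, num_groups: int) -> float:
    """Approximate seconds between two updates of the same plot window
    when num_groups groups share the timeline in round-robin order."""
    return interval_seconds(interval) * num_groups


# Three groups (L3_BAND, L2_BAND, MEM_BAND) measured for 500ms each.
print(refresh_period("500ms", 3))
```

Under these assumptions, adding more groups stretches the time between updates of any single plot, so a shorter interval may be worth considering when several groups are measured together.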
If you want to record the measurements, you can use either -d or -p. The difference is that -d outputs the strings that are sent to feedGnuplot; the plot environment (title, labels) is not included. With -p, the dump is made by feedGnuplot, which prints the plot environment first and then, for each update step, all of the data collected so far.
Output format of -d:
<groupID> <runtime> <value_1_CPU1> (<value_2_CPU1>) (<value_1_CPU2>) (<value_2_CPU2>) ...
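For post-processing a recorded -d dump, a line in this format can be split into a group id, a timestamp and per-CPU values. The Python sketch below is one way to do that; it assumes the fields are separated by whitespace, that the group id is an integer, and that the caller knows how many metrics each CPU reports (one for most groups, two for groups with a second metric such as POWER). The names `Sample` and `parse_dump_line` and the input line are made up for illustration and are not real tool output.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    group_id: int
    runtime: float
    values: list[list[float]]  # values[cpu_index][metric_index]


def parse_dump_line(line: str, metrics_per_cpu: int = 1) -> Sample:
    """Split one '<groupID> <runtime> <values...>' line into a Sample."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"line too short: {line!r}")
    raw = [float(v) for v in fields[2:]]
    if len(raw) % metrics_per_cpu != 0:
        raise ValueError("value count is not a multiple of metrics_per_cpu")
    # Values are grouped CPU by CPU: all metrics of CPU1, then CPU2, ...
    values = [raw[i:i + metrics_per_cpu]
              for i in range(0, len(raw), metrics_per_cpu)]
    return Sample(int(fields[0]), float(fields[1]), values)


# Hypothetical line: group 1, t = 2.0 s, two CPUs with two metrics each.
sample = parse_dump_line("1 2.0 48.4 12.1 21.7 11.9", metrics_per_cpu=2)
print(sample.values)
```

Passing `metrics_per_cpu` explicitly keeps the parser independent of the group definitions, since the dump itself does not carry the plot configuration.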
Example output of -p:
set grid
set xlabel "Time"
set ylabel "Bandwidth [MBytes/s]"
set title "L3 cache bandwidth"
set boxwidth 1
histbin(x) = 1 * floor(0.5 + x/1)
set xtics
set xrange ["0":]
plot '-' title "L3 bandwidth [MBytes/s]" with linespoints
0 0
1.000161322585 48.433210629261
2.000241249986 21.798359943835
3.0003206090227 21.337482595053
4.0004001520114 14.873424079086
5.0004813269837 7.8612681493985
e
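Because -p repeats the whole data set at every update step, a recorded dump contains many inline data blocks, and usually only the last one is of interest. The following Python sketch collects the blocks that follow a `plot '-'` command up to the terminating `e` line. The function name `inline_data_blocks` is hypothetical, and the sketch assumes one curve per plot command with two columns (time, value), as in the example above; plots with several curves would need extra handling.

```python
def inline_data_blocks(script: str) -> list[list[tuple[float, float]]]:
    """Return the (x, y) rows of every inline data block in a gnuplot dump.

    Assumption: each "plot '-'" command is followed by exactly one data
    block that ends with a line containing only 'e'.
    """
    blocks: list[list[tuple[float, float]]] = []
    current = None  # rows of the block being read, or None outside a block
    for line in script.splitlines():
        stripped = line.strip()
        if current is None:
            if stripped.startswith("plot ") and "'-'" in stripped:
                current = []
        elif stripped == "e":
            blocks.append(current)
            current = None
        elif stripped:
            x, y = stripped.split()[:2]
            current.append((float(x), float(y)))
    return blocks


# Shortened excerpt of the example dump shown above.
sample = """set title "L3 cache bandwidth"
plot '-' title "L3 bandwidth [MBytes/s]" with linespoints
0 0
1.000161322585 48.433210629261
2.000241249986 21.798359943835
e
"""
latest = inline_data_blocks(sample)[-1]
print(len(latest), max(y for _, y in latest))
```

Taking the last block gives the complete time series collected up to the final update step.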
You can also perform the measurements on another host using the --host option:
$ likwid-perfscope -C S0:0@S1:0 -g POWER --host host1 ./a.out
For this to work, all paths need to match those on the local system, the group must be available on the host, and the CPU list must be valid there. This feature is currently experimental.