GPUscout

A tool for discovering data-movement-related bottlenecks on NVIDIA GPUs.

!!! GPUscout is in active development and is not yet production-ready !!!

GPUscout is a tool for systematically detecting the root causes of frequent memory performance bottlenecks on NVIDIA GPUs. It combines three approaches to performance analysis: static analysis of CUDA SASS code, warp stall sampling, and kernel performance metrics. By combining them, GPUscout can identify a problem, locate the code segment where it originates, and assess its importance.

Requirements

cmake (3.27+)
NVIDIA CUDA Toolkit (11.8+)
- NVIDIA Nsight Compute should be installed automatically as part of the toolkit.
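
A quick way to confirm these prerequisites before building is sketched below. The helper name `version_ge` is illustrative and not part of GPUscout, and the comparison assumes that `sort -V` (version sort, as in GNU coreutils) is available. `nvcc` is the compiler shipped with the CUDA Toolkit and `ncu` is the Nsight Compute command-line tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GPUscout): succeeds if version $1 >= version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Check cmake against the 3.27 minimum listed above.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.27; then
        echo "cmake $cmake_version: OK"
    else
        echo "cmake $cmake_version: too old, need 3.27+"
    fi
else
    echo "cmake: not found"
fi

# Only report whether the CUDA tools are on PATH; their versions are not parsed here.
for tool in nvcc ncu; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```

If `nvcc` is found, `nvcc --version` prints the toolkit release, which can be compared against 11.8 by eye.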

Installation

GPUscout can be installed with cmake; a Spack package is planned.

#mkdir executable
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=../inst-dir ..
make all install

Note that this tool has been tested with CUDA 11.8 on the NVIDIA Volta and Turing architectures.

Running an analysis

Generate the executable and cubin file

Generate two files from the same source using the nvcc compiler:

The regular executable, produced by a normal compilation, say <executable_name> (nvcc pr.cu -o pr).
The cubin file, produced by passing the -cubin flag to nvcc. If its name is the executable name prefixed with cubin-, i.e. cubin-<executable_name> (nvcc pr.cu -cubin -o cubin-pr), and it is in the same folder as the regular executable, the cubin file path does not need to be entered later on.
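
The naming convention above can be sketched as a small script. The function `default_cubin_path` is a hypothetical helper, not part of GPUscout; it only illustrates the name GPUscout is described as assuming when no cubin path is given. The compile step is skipped when nvcc or the source file is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GPUscout): derive the default cubin path
# for an executable, i.e. the same directory with a cubin- prefixed file name.
default_cubin_path() {
    local exe="$1"
    echo "$(dirname "$exe")/cubin-$(basename "$exe")"
}

src=pr.cu   # example source file name from the text above
exe=pr

# Only compile when nvcc and the source file are actually present.
if command -v nvcc >/dev/null 2>&1 && [ -f "$src" ]; then
    nvcc "$src" -o "$exe"                                  # regular executable
    nvcc "$src" -cubin -o "$(default_cubin_path "$exe")"   # cubin file next to it
fi

# Show the cubin path that matches a given executable path.
default_cubin_path ../executable/gaussian   # prints ../executable/cubin-gaussian
```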

Run the GPUscout script

Run the GPUscout.sh script, which was installed into the install directory chosen above (CMAKE_INSTALL_PREFIX). Specify the executable to analyze (-e executable) and, optionally, further parameters:

./GPUscout -e ../executable/gaussian -a '-q -s 2000'

The following input arguments and syntax are supported:

Usage: GPUscout [-h] [--dry_run] [--verbose] -e executable [-c directory] [--args]
    -h | --help : Display this help.
    --dry_run : Perform a dry run only. A dry run analyses just the SASS instructions; it reads neither warp stalls nor Nsight metrics.
    -v | --verbose : print more verbose output.
    -e | --executable : Path to the executable (compiled with nvcc).
    -c | --cubin : Path to the cubin file (compiled with nvcc, with -cubin). If left empty, the same path as executable and the name cubin-<executable> will be assumed.
    -a | --args : Arguments for running the binary, e.g. --args="64 2 2 temp_64 power_64 output_64.txt"
    --sm_count : Can be used to specify the number of streaming multiprocessors of the current GPU, as this will be used in calculations (default: 16)
    -j | --json : Save a JSON-formatted version of the output (Needed for the use of GPUscout-GUI)

GPUscout should then start analysing the code automatically and print its recommendations to the terminal.

For older NVIDIA architectures (such as Pascal), a dry-run option is provided that reports based on SASS instructions only. It is invoked by adding --dry_run to the other arguments.

About

GPUscout was initially developed by Soumya Sen and is now maintained by Stepan Vanecek ([email protected]) and CAPS TUM. Please contact us with questions, bug reports, and other feedback.

GPUscout is available under the Apache-2.0 license (see LICENSE).
