A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
Updated Dec 31, 2023 - C++
A GPU performance prediction toolkit for CUDA programs
A Python script for plotting roofline analyses in the style of Intel Advisor.
Fork of the CS Roofline Toolkit from Berkeley Lab
Code Comprehension Assistance for Evidence-Based performance Tuning