Intel® Advisor Help

Analyze GPU Roofline

Run the GPU Roofline Insights perspective to measure and visualize the actual performance of GPU kernels against hardware-imposed performance ceilings, using benchmarks and hardware metric profiling, and to determine the main factor that limits performance.

Use the Roofline chart to answer the following questions:

Run the GPU Roofline Insights perspective to measure the performance of Data Parallel C++ (DPC++), C++/Fortran with OpenMP* pragmas, Intel® oneAPI Level Zero (Level Zero), or OpenCL™ applications that are enabled to run on a GPU.
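As a rough illustration of what "performance against hardware-imposed ceilings" means, the sketch below applies the generic roofline bound: attainable performance is the smaller of the compute ceiling and the memory bandwidth multiplied by arithmetic intensity. The function names, the single memory roof, and the device numbers are assumptions made for this example and are not values reported by Intel Advisor.

```python
def attainable_gops(arithmetic_intensity, peak_gops, bandwidth_gbs):
    """Generic roofline bound in GOPS.

    arithmetic_intensity: operations per byte moved (OP/byte)
    peak_gops:            compute ceiling (hypothetical value)
    bandwidth_gbs:        memory bandwidth ceiling in GB/s (hypothetical value)
    """
    return min(peak_gops, bandwidth_gbs * arithmetic_intensity)


def limiting_factor(arithmetic_intensity, peak_gops, bandwidth_gbs):
    """Classify a kernel as memory-bound or compute-bound.

    The ridge point is the arithmetic intensity at which the memory roof
    meets the compute roof.
    """
    ridge_point = peak_gops / bandwidth_gbs
    return "memory" if arithmetic_intensity < ridge_point else "compute"


# Hypothetical device: 1000 GOPS compute ceiling, 100 GB/s memory bandwidth.
PEAK_GOPS = 1000.0
BANDWIDTH_GBS = 100.0

low_ai_bound = attainable_gops(2.0, PEAK_GOPS, BANDWIDTH_GBS)    # limited by the memory roof
high_ai_bound = attainable_gops(50.0, PEAK_GOPS, BANDWIDTH_GBS)  # limited by the compute roof
```

A kernel whose measured performance sits far below the bound for its arithmetic intensity has headroom. One that sits close to a roof is limited by that resource.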

How It Works

The GPU Roofline Insights perspective includes the following steps:

  1. Collect OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
  2. Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling.

    Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH.

    Intel Advisor automatically determines the data type of the collected operations from the destination (dst) register.
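The weighted sum over instruction groups described in step 2 can be sketched as follows. The group names come from the text above. The weights are hypothetical, because this topic does not list them, and counting an FMA as two operations is only a common convention assumed here.

```python
# Hypothetical per-group weights (assumption: not taken from Intel Advisor).
EXAMPLE_WEIGHTS = {
    "BASIC COMPUTE": 1,
    "FMA": 2,  # assumed: one fused multiply-add counted as two operations
    "BIT": 1,
    "DIV": 1,
    "POW": 1,
    "MATH": 1,
}


def weighted_operations(instruction_counts, weights=EXAMPLE_WEIGHTS):
    """Return the weighted sum of executed instructions per group.

    instruction_counts maps a group name to the number of executed
    instructions in that group. Unknown group names raise ValueError.
    """
    unknown = set(instruction_counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown instruction groups: {sorted(unknown)}")
    return sum(weights[group] * count
               for group, count in instruction_counts.items())


# Example: 10 FMA instructions and 5 basic compute instructions.
total_ops = weighted_operations({"FMA": 10, "BASIC COMPUTE": 5})
```

Dividing a total like this by the kernel's elapsed time gives an operations-per-second figure, and dividing it by the bytes transferred gives the arithmetic intensity.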

GPU Roofline Summary

The GPU Roofline Insights perspective measures the performance of kernels executed on the GPU and of loops/functions executed on the CPU, and shows what you should optimize your application for. Examine the following performance data:

Summary report for the GPU Roofline Insights

See the Summary section to examine the performance summary of your application, then continue to the GPU Roofline Insights Regions tab to examine the performance in more detail.
