Intel® Advisor Help

Analyze CPU Roofline

Visualize actual performance against hardware-imposed performance ceilings by running the CPU / Memory Roofline Insights perspective. It helps you determine the main limiting factor (memory bandwidth or compute capacity) and provides an ideal roadmap of potential optimization steps.

Use the Roofline chart to answer the following questions:

How It Works

The CPU / Memory Roofline Insights perspective includes the following steps:

  1. Collect loop/function timings using the Survey analysis.
  2. Collect floating-point and/or integer operations data and memory traffic data, and measure the hardware limitations of your system, using the FLOP analysis in the Characterization step.

    At this step, Intel® Advisor collects:

    • Compute operations, both floating-point (FLOP) and integer (INTOP):
      • FLOP is calculated as a sum of the following classes of instructions multiplied by their iteration count: FMA, ADD, SUB, DIV, DP, MUL, ATAN, FPREM, TAN, SIN, COS, SQRT, RCP, RSQRT, EXP, VSCALE, MAX, MIN, ABS, IMUL, IDIV, FIDIVR, CMP, VREDUCE, VRND.
      • INTOP is calculated by default as a sum of the following classes of instructions multiplied by their iteration count: ADD, ADC, SUB, MUL, IMUL, DIV, IDIV, INC/DEC, shifts, rotates.
    • Memory traffic data, calculated as the product of the number of memory operations and the number of bytes in the register accessed by the function/loop. For the memory traffic calculation, Intel Advisor counts the following classes of memory instructions:
      • scalar and vector MOV instructions
      • GATHER/SCATTER instructions
      • VBMI2 compress/expand instructions

    Note

    This collection can take three to four times longer than the Survey analysis.
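The counting rules above can be illustrated with a minimal Python sketch. It is not Intel Advisor code: the function names, the per-iteration instruction counts, and the simplification that each instruction counts as exactly one operation (ignoring vector width) are all assumptions made for illustration.

```python
# Illustrative sketch only; names and numbers are hypothetical, not Intel Advisor output.

def count_operations(instructions_per_iteration, iterations):
    """Sum the counted instruction classes, multiplied by the iteration count.

    Assumption: every instruction contributes exactly one operation.
    """
    return iterations * sum(instructions_per_iteration.values())


def memory_traffic_bytes(memory_ops, iterations):
    """Product of memory operations and bytes in the accessed register.

    memory_ops is a list of (operations_per_iteration, register_bytes) pairs.
    """
    return iterations * sum(count * size for count, size in memory_ops)


# Hypothetical loop: one ADD and one MUL per iteration, three 8-byte moves.
flop = count_operations({"ADD": 1, "MUL": 1}, iterations=1000)
traffic = memory_traffic_bytes([(3, 8)], iterations=1000)
arithmetic_intensity = flop / traffic  # operations per byte

print(flop, traffic, round(arithmetic_intensity, 4))
```

Dividing the operation count by the memory traffic gives the arithmetic intensity that the Roofline chart places on its horizontal axis.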

CPU Roofline Report

The Roofline chart plots an application's achieved performance and arithmetic intensity against the hardware maximum achievable performance:

Example of a CPU Roofline report
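How a roof bounds a loop can be sketched with the standard Roofline model, in which attainable performance is the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. The sketch below assumes a single compute roof and a single memory roof; the peak and bandwidth values are hypothetical, not measurements from Intel Advisor.

```python
# Illustrative sketch of the basic Roofline model; all values are hypothetical.

def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Upper bound on performance for a given arithmetic intensity (FLOP/byte)."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def limiting_factor(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Classify a loop by comparing its intensity with the ridge point."""
    ridge_point = peak_gflops / bandwidth_gbs  # intensity where the two roofs meet
    if arithmetic_intensity < ridge_point:
        return "memory bandwidth"
    return "compute capacity"


# Hypothetical machine: 100 GFLOPS compute peak, 20 GB/s memory bandwidth.
PEAK_GFLOPS = 100.0
BANDWIDTH_GBS = 20.0

for ai in (0.5, 8.0):
    bound = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
    factor = limiting_factor(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
    print(f"AI={ai} FLOP/byte: at most {bound} GFLOPS, limited by {factor}")
```

A loop whose measured performance sits well below its bound has headroom under the current roof; a loop close to the bound would need higher arithmetic intensity or a different roof to improve.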
