Intel® Advisor Help
Low
Survey + FLOP (Characterization)
The farther a dot is from the topmost roofs, the more room for improvement there is. In accordance with Amdahl's Law, optimizing the loops that take the largest portion of the program's total run time will lead to greater speedups than optimizing the loops that take a smaller portion of the run time.
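As a rough illustration of the Amdahl's Law argument above, the sketch below compares the whole-program speedup from doubling the speed of a loop that takes 60% of run time with doubling one that takes 5%. The function name and the percentages are hypothetical values chosen for the example, not figures from Intel Advisor.

```python
def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Whole-program speedup when `fraction` of the run time
    is made `local_speedup` times faster (Amdahl's Law)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical loops: both are made 2x faster, but they differ in
# how much of the total run time they account for.
big_loop = amdahl_speedup(fraction=0.60, local_speedup=2.0)
small_loop = amdahl_speedup(fraction=0.05, local_speedup=2.0)

print(f"60% loop, 2x faster: {big_loop:.3f}x overall")
print(f" 5% loop, 2x faster: {small_loop:.3f}x overall")
```

The same local improvement pays off far more on the loop with the larger share of run time, which is why the size of a dot matters as much as its distance from the roofs.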
The roofs above a dot represent the restrictions preventing it from achieving higher performance, although the roofs below can contribute somewhat. Each roof represents the maximum performance achievable without taking advantage of a particular optimization, which is associated with the next roof up. Depending on a dot's position, you can try the following optimizations.
Dot Position | Reason | To Optimize |
---|---|---|
Below a memory roof (DRAM Bandwidth, L1 Bandwidth, and so on) | The loop/function uses memory inefficiently. | Run a Memory Access Patterns analysis for this loop. |
Below Vector Add Peak | The loop/function under-utilizes available instruction sets. | Check the Traits column in the Survey report to see if FMAs are used. |
Just above Scalar Add Peak | The loop/function is undervectorized. | Check vectorization efficiency and performance issues in the Survey report. Follow the recommendations to improve the efficiency if it is low. |
Below Scalar Add Peak | The loop/function is scalar. | Check the Survey report to see if the loop vectorized. If not, try to get it to vectorize if possible. This may involve running a Dependencies analysis to see if it is safe to force vectorization. |
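The table can be read as a simple decision procedure. The sketch below is one simplified interpretation of it, not how Intel Advisor itself classifies loops: the `Roofs` class, the `suggest_optimizations` function, the roof values, and the `just_above_factor` threshold for "just above Scalar Add Peak" are all hypothetical. A memory roof is assumed to be bandwidth multiplied by arithmetic intensity, as in the standard roofline model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Roofs:
    """Hypothetical roof values for one machine."""
    memory_bandwidth_gb_s: float   # one memory roof, e.g. DRAM bandwidth
    scalar_add_peak_gflops: float
    vector_add_peak_gflops: float


def suggest_optimizations(perf_gflops: float,
                          intensity_flop_per_byte: float,
                          roofs: Roofs,
                          just_above_factor: float = 1.5) -> List[str]:
    """Return hints for every roof that lies above the dot."""
    hints = []
    # Height of the (sloped) memory roof at this arithmetic intensity.
    memory_roof = roofs.memory_bandwidth_gb_s * intensity_flop_per_byte
    if perf_gflops < memory_roof:
        hints.append("memory: run a Memory Access Patterns analysis")
    if perf_gflops < roofs.vector_add_peak_gflops:
        hints.append("vector: check the Traits column for FMA usage")
    if perf_gflops < roofs.scalar_add_peak_gflops:
        hints.append("scalar: check whether the loop vectorized")
    elif perf_gflops < roofs.scalar_add_peak_gflops * just_above_factor:
        # "Just above" is an assumed threshold, not an Advisor setting.
        hints.append("undervectorized: check vectorization efficiency")
    return hints


roofs = Roofs(memory_bandwidth_gb_s=50.0,
              scalar_add_peak_gflops=10.0,
              vector_add_peak_gflops=80.0)
loop_a = suggest_optimizations(4.0, 0.1, roofs)   # below Scalar Add Peak
loop_b = suggest_optimizations(12.0, 2.0, roofs)  # just above Scalar Add Peak
for name, hints in (("A", loop_a), ("B", loop_b)):
    print(name, hints)
```

Because every roof above the dot is reported, a single loop can receive several hints at once; which one to act on first remains a judgment call.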
In the following Roofline chart representation, loops A and G (large red dots), and to a lesser extent B (yellow dot far below the roofs), are the best candidates for optimization. Loops C, D, and E (small green dots) and H (yellow dot) are poor candidates because they do not have much room to improve or are too small to have significant impact on performance.
Some algorithms are incapable of breaking certain roofs. For instance, if Loop A in the example above cannot be vectorized due to dependencies, it cannot break the Scalar Add Peak.
Select a dot on the chart and open the Code Analytics tab to view detailed information about the selected loop.
Intel Advisor automatically determines the data type used in operations. The instruction mix groups the classes of instructions into the following categories:
Category | Instruction Types |
---|---|
Compute (FLOP and INTOP) | ADD, MUL, SUB, DIV, SAD, MIN, AVG, MAX, ABS, SIN, SQRT, FMA, RCCP, SCALE, FCOM, V4FMA, V4VNNI |
Memory | |
Mixed | Compute instructions with memory operands |
Other | MOVE, CONTROL FLOW, SYNC, OTHER |
Intel Advisor calculates floating-point operations (FLOP) as a sum of the following classes of instructions multiplied by their iteration count: FMA, ADD, SUB, DIV, DP, MUL, ATAN, FPREM, TAN, SIN, COS, SQRT, RCP, RSQRT, EXP, VSCALE, MAX, MIN, ABS, IMUL, IDIV, FIDIVR, CMP, VREDUCE, VRND.
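The counting rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the `count_flop` function and the sample instruction counts are hypothetical, a single trip count is assumed for the whole loop body, and each instruction is counted as one operation because the description above does not mention any per-instruction weighting.

```python
from typing import Mapping

# Instruction classes listed above as contributing to FLOP.
FLOP_CLASSES = frozenset({
    "FMA", "ADD", "SUB", "DIV", "DP", "MUL", "ATAN", "FPREM", "TAN",
    "SIN", "COS", "SQRT", "RCP", "RSQRT", "EXP", "VSCALE", "MAX",
    "MIN", "ABS", "IMUL", "IDIV", "FIDIVR", "CMP", "VREDUCE", "VRND",
})


def count_flop(per_iteration_counts: Mapping[str, int], iterations: int) -> int:
    """Sum the counted instruction classes for one iteration,
    then multiply by the iteration count."""
    per_iteration = sum(
        count
        for instruction_class, count in per_iteration_counts.items()
        if instruction_class in FLOP_CLASSES
    )
    return per_iteration * iterations


# Hypothetical loop body: two FMAs, one ADD, and three MOVEs per iteration.
# MOVE is not in the list, so it does not contribute.
loop_body = {"FMA": 2, "ADD": 1, "MOVE": 3}
total = count_flop(loop_body, iterations=1000)
print(total)  # 3000
```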
Integer operations (INTOP) are calculated in two modes: