Intel® Advisor Help

CPU Metrics

This reference section describes the contents of data columns in Survey and Refinement Reports of the Vectorization and Code Insights, CPU / Memory Roofline Insights, and Threading perspectives.

A

Access Pattern

Description: Summary of access types.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Access Type

Description: Memory access type: Read, Write, or Read/Write.

Collected during Memory Access Patterns Analysis and found in Memory Access Patterns Report.

Address Range

Description: Instruction address range in memory.

Interpretation: A wide range indicates one or more of the following:

Average

Description: Loop trip count average.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

B

C

Cache Line Utilization

Description: Simulated cache line utilization for data transfer operations.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Cache Misses

Description: Number of memory load operations served by a memory subsystem level higher than the cache. Calculated for the first instance of the loop (assuming a cold CPU cache). The value is a result of virtual cache modeling and might not match the exact counter value reported by hardware for this analysis run.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Call Count

Description: Number of times loop/function was invoked.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Interpretation: A high number means there is an outer loop in the selected loop call chain with high trip count values. If the loop has a low trip count value, the outer loop could be a better candidate for parallelization (threading/vectorization).

Compiler Estimated Gain

Description: Theoretical compiler estimate of relative loop performance speedup achieved or achievable due to vectorization.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: Gain Estimate is the Intel Advisor-calculated estimate of relative loop performance speedup achieved due to vectorization.

D

Data Types

Description: Data types provided by binary static analysis.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Interpretation: Bold indicates primary data type used for vectorization.

Description

Description: Code location classification.

Collected during Dependencies Analysis and found in Dependencies Report.

Dirty Evictions

Description: Number of evicted cache lines with a modified state introducing upstream memory traffic to a higher memory subsystem.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

E

Efficiency

Description: Intel Advisor-calculated performance estimated gain compared to maximum achievable gain from vectorization.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report).

Interpretation: Normally means how effectively vectorization was applied, compared to maximum possible gain (higher is better).

Calculation/Aggregation: (Estimated gain/Vector length) * 100%
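The formula above can be sketched in a few lines of Python. This is a minimal illustration, not Intel Advisor code: the function name and the sample values are hypothetical, and the inputs stand in for the Gain Estimate and VL (Vector Length) values shown in the report.

```python
def vectorization_efficiency(estimated_gain: float, vector_length: int) -> float:
    """Return efficiency in percent: (estimated gain / vector length) * 100."""
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    return estimated_gain / vector_length * 100.0


# Hypothetical loop: estimated gain of 3.2x with a vector length of 8.
print(f"{vectorization_efficiency(3.2, 8):.1f}%")
```

A loop whose estimated gain equals its vector length would score 100% under this formula.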

Interpretation: Hover mouse over data cell for more information.

Elapsed Time

Description: Elapsed (wall-clock) application time.

Collected during Survey Analysis and found in Filters banner.

F

First Instance Site Footprint

Description: For each memory access instruction for the first instance of a loop, the Intel Advisor:

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Comparison with similar metrics: This metric is more reliable than the Maximum Per-Instruction Address Range metric.

Property | Max. Per-Instruction Addr. Range | First Instance Site Footprint | Simulated Memory Footprint
Number of threads analyzed for loop/site | 1 | 1 | 1
Number of loop instances analyzed | All instances, but with some memory access instruction filtering | 1 | Depends on loop call count limit (GUI: Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced > Loop call count limit; CLI action option: -loop-call-count-limit)
Awareness of overlap between address ranges accessed in loop | No | Yes | Yes
Suitability for code with random memory access | No | No | Yes

Function

Description: Function name.

Collected during Dependencies Analysis and found in Dependencies Report.

Function Call Sites and Loops

Description: Information about parent function, source file, and line where site/loop begins in Loop Information Pane (Survey Report), and top-down call tree of target functions and loops in Loop Information Pane (Survey Report).

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Interpretation:

G

Gain Estimate

Description: Intel Advisor-calculated estimate of relative loop performance speedup achieved due to vectorization.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report).

Comparison with similar metrics: Compiler Estimated Gain is the theoretical compiler estimate of relative loop performance speedup achieved or achievable due to vectorization.

H

I

Instruction Address

Description: Instruction address in memory.

Collected during Dependencies Analysis and found in Dependencies Report.

Instruction Sets

Description: Instruction Set Architecture (ISA) usage for individual instructions.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Iteration Duration

Description: Average loop iteration time.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

J

K

L

Loop Instance Total Time

Description: Average loop instance total time.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Loop-Carried Dependencies

Description: Summary of dependencies across iterations.

Collected during Dependencies Analysis and found in Loop Information Pane (Refinement Reports).

Possible values:

M

Max

Description: Loop trip count maximum.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Max Site Footprint

Description: Maximum distance (among all instances of the loop) between the minimum and maximum memory address values.
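As a rough illustration of the definition above, the following Python sketch takes the addresses observed in each loop instance, computes the distance between the lowest and highest address per instance, and keeps the largest one. The function name and the sample addresses are hypothetical. Whether Intel Advisor also counts the size of the last access is not stated here, so the sketch uses the plain address difference.

```python
def max_site_footprint(instances):
    """Largest (max address - min address) over all loop instances.

    `instances` is a list of address lists, one list per loop instance.
    Empty instances are skipped; the result is 0 if nothing was recorded.
    """
    return max(
        (max(addrs) - min(addrs) for addrs in instances if addrs),
        default=0,
    )


# Two hypothetical loop instances touching different address ranges.
instances = [
    [0x1000, 0x1040, 0x1080],  # spans 0x80 bytes
    [0x2000, 0x2400],          # spans 0x400 bytes
]
print(max_site_footprint(instances))
```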

Maximum Per-Instruction Address Range

Description: For most memory access instructions for all instances of a loop, the Intel Advisor:

The value may be imprecise because the Intel Advisor filters some memory access instructions while analyzing all instances of a loop. Unreliable values are displayed in gray.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports) and Memory Access Patterns Report.

Comparison with similar metrics: This metric is less reliable than the First Instance Site Footprint metric.

Property | Max. Per-Instruction Addr. Range | First Instance Site Footprint | Simulated Memory Footprint
Number of threads analyzed for loop/site | 1 | 1 | 1
Number of loop instances analyzed | All instances, but with some memory access instruction filtering | 1 | Depends on loop call count limit (GUI: Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced > Loop call count limit; CLI action option: -loop-call-count-limit)
Awareness of overlap between address ranges accessed in loop | No | Yes | Yes
Suitability for code with random memory access | No | No | Yes

Memory Access Footprint

Description: Maximum distance (among all instances of the loop) between minimum and maximum memory address values, accessed by the instructions, generated from the current source line.

Memory Loads

Description: Number of memory load operations in first instance of the loop.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Memory Stores

Description: Number of memory store operations in first instance of the loop.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Memory, GB

Description: Number of data transfers, in GB, between the CPU and memory subsystem.

Important

This is a core metric that is the basis of the arithmetic intensity (AI) calculation.

Min

Description: Loop trip count minimum.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect Trip Counts option of the Characterization step on Analysis Workflow tab or enabled Collect information about Loop Trip Counts on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Module/Modules

Description: Executable or library name.

Collected during Survey Analysis, Dependencies Analysis, and Memory Access Patterns Analysis; and found in Loop Information Pane (Survey Report), Advanced View Pane (Survey Report), Dependencies Report, and Memory Access Patterns Report.

Multi-Pumping Factor

Description: The number of times the compiler applied a pumping optimization to extend vector length.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

N

Nested Function

Description: Name of the function (invoked from the site) where the stride diagnostic was detected.

Collected during Memory Access Patterns Analysis and found in Memory Access Patterns Report.

O

Optimization Details

Description: Compiler optimization details.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

P

Performance Issues

Description: Performance issues found.

Collected during Survey Analysis and Memory Access Patterns Analysis, and found in Loop Information Pane (Survey Report) and Memory Access Patterns Report.

Interpretation: Click to display confidence level about issue root cause and recommended fixes.

Problem Severity

Description: Seriousness of a detected problem.

Collected during Dependencies Analysis and found in Loop Information Pane (Refinement Reports).

Possible values:

Q

R

RFO Cache Misses

Description: Number of cache lines loaded to cache due to a modification request (Request for Ownership).

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

S

Self AI

Description: Ratio of Self GFLOPS to self L1 transferred bytes.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Self Elapsed Time

Description: Self Time-based wall time from beginning to end of loop/function execution, excluding time for callees.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: Total Elapsed Time is Total Time-based wall time from beginning to end of loop/function execution, including time for callees.

Interpretation: Same as Self Time for single-threaded applications.

Self GFLOP

Description: Giga floating-point operations, excluding GFLOP for callees.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Self GFLOPS

Description: Ratio of Self GFLOP to Self Elapsed Time.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Self Giga OP

Description: Giga floating-point operations plus giga integer operations, excluding giga floating-point and integer operations for callees.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Instruction types counted for INTOP calculation (default):

Self Giga OPS

Description: Ratio of Self GFLOP plus Self GINTOP to Self Elapsed Time.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).
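The rate metrics in this group (Self GFLOPS, Self GINTOPS, Self Giga OPS) are all ratios of an operation count to Self Elapsed Time. The sketch below shows how they relate, using hypothetical values and a hypothetical helper name. It assumes Self Elapsed Time is expressed in seconds.

```python
def per_second(giga_ops: float, elapsed_seconds: float) -> float:
    """Divide an operation count (in giga-operations) by elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return giga_ops / elapsed_seconds


# Hypothetical values for one loop (not taken from any real report).
self_gflop = 12.0    # Self GFLOP
self_gintop = 4.0    # Self GINTOP
self_elapsed = 2.0   # Self Elapsed Time, assumed to be in seconds

self_gflops = per_second(self_gflop, self_elapsed)
self_gintops = per_second(self_gintop, self_elapsed)
self_giga_ops = per_second(self_gflop + self_gintop, self_elapsed)

print(self_gflops, self_gintops, self_giga_ops)
```

Because all three rates share the same denominator, Self Giga OPS computed this way equals the sum of Self GFLOPS and Self GINTOPS.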

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Instruction types counted for INTOP calculation (default):

Self GINTOP

Description: Giga integer operations, excluding giga integer operations for callees.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for INTOP calculation (default):

Self GINTOPS

Description: Ratio of Self GINTOP to Self Elapsed Time.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for INTOP calculation (default):

Self INT AI

Description: Ratio of Self GINTOPS to self L1 transferred bytes.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for INTOP calculation (default):

Self Memory (GB)

Description: Data transfers between CPU and memory subsystem (total traffic, including caches and DRAM) in gigabytes, excluding transfers for callees.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect FLOP option of the Characterization step on Analysis Workflow tab or enabled Collect information about FLOP, L1 memory traffic, and AVX-512 mask usage on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Self Memory (GB/s)

Description: Data transfers between CPU and memory subsystem (total traffic, including caches and DRAM) in gigabytes per second, excluding transfers for callees.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display: Enabled Collect FLOP option of the Characterization step on Analysis Workflow tab or enabled Collect information about FLOP, L1 memory traffic, and AVX-512 mask usage on Trip Counts and FLOP Analysis tab of Project Properties Dialog Box.

Calculation/Aggregation: Self Memory (GB) / Self Elapsed Time
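A minimal sketch of this calculation, starting from a raw byte count. The function name and sample values are hypothetical, and the use of decimal gigabytes (10^9 bytes) is an assumption, since this section does not say which definition Intel Advisor uses.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def self_memory_gb_per_s(bytes_transferred: int, elapsed_seconds: float) -> float:
    """Convert a byte count to GB and divide by elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_transferred / BYTES_PER_GB / elapsed_seconds


# Hypothetical loop: 6 GB of traffic over 1.5 seconds of self elapsed time.
print(self_memory_gb_per_s(6_000_000_000, 1.5))
```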

Self Overall AI

Description: Ratio of Self GFLOPS plus Self GINTOPS to self L1 transferred bytes.

Collected during Trip Counts Analysis (Characterization), and found in Loop Information Pane (Survey Report).

Prerequisites for collection/display:

Instruction types counted for FLOP calculation:

Instruction types counted for INTOP calculation (default):

Self Time

Description: Time actively executing a function/loop, excluding time for callees.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: Total Time is time actively executing a function/loop, including time for callees.

Simulated Memory Footprint

Description: The summarized and overlap-aware memory footprint across all instances of a loop.

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).

Prerequisites for collection/display:

In the GUI Project Properties Dialog Box:

CLI example:

advisor -collect map -mark-up-list=1,2,7,17,26 -enable-cache-simulation -cachesim-mode=footprint -project-dir C:\my_advisor_project -- my_application.exe

Comparison with similar metrics:

Property | Max. Per-Instruction Addr. Range | First Instance Site Footprint | Simulated Memory Footprint
Number of threads analyzed for loop/site | 1 | 1 | 1
Number of loop instances analyzed | All instances, but with some memory access instruction filtering | 1 | Depends on loop call count limit (GUI: Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced > Loop call count limit; CLI action option: -loop-call-count-limit)
Awareness of overlap between address ranges accessed in loop | No | Yes | Yes
Suitability for code with random memory access | No | No | Yes

Calculation/Aggregation: Number of unique cache lines accessed during cache simulation * Cache line size.

For performance reasons, not all accesses and cache lines are simulated. Instead, Intel Advisor tracks a subset and then scales up to the whole cache size to determine the final footprint value.
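The core of the calculation (unique cache lines multiplied by cache line size) can be illustrated as follows. This is a simplified sketch under stated assumptions: a 64-byte cache line, every access fitting within one line, and every access being tracked. It does not model the subset-and-scale sampling described above. The names are hypothetical.

```python
CACHE_LINE_SIZE = 64  # assumption: 64-byte cache lines


def simulated_footprint(addresses, line_size=CACHE_LINE_SIZE):
    """Footprint in bytes: number of unique cache lines touched * line size."""
    # Two addresses fall in the same cache line when their integer
    # quotient by the line size is equal.
    unique_lines = {addr // line_size for addr in addresses}
    return len(unique_lines) * line_size


# Hypothetical byte addresses: 0 and 8 share a line, as do 64 and 72.
print(simulated_footprint([0, 8, 64, 72, 4096]))
```

Because overlapping accesses collapse into the same set entry, repeated or overlapping address ranges are counted once, which is what makes the metric overlap-aware.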

Site Location

Description: Information about parent function, source file, and line where site/loop begins.

Collected during Dependencies Analysis and Memory Access Patterns Analysis, and found in Loop Information Pane (Refinement Reports).

Site Name

Description: Site name if using source annotations; sequence ID if marking loops for deeper analysis in Survey Report.

Collected during Dependencies Analysis and Memory Access Patterns Analysis, and found in Loop Information Pane (Refinement Reports), Dependencies Report, and Memory Access Patterns Report.

Source/Source Location/Sources

Description: Source file name(s) and line number(s).

Collected during Survey Analysis, Dependencies Analysis, and Memory Access Patterns Analysis; and found in Loop Information Pane (Survey Report), Advanced View Pane (Survey Report), Dependencies Report, and Memory Access Patterns Report.

State

Description: State of most severe problem in problem set.

Collected during Dependencies Analysis and found in Dependencies Report.

Possible values:

Stride

Description: Distance, in elements, between memory accesses in two consecutive iterations.

Collected during Memory Access Patterns Analysis and found in Memory Access Patterns Report.
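To make "distance in elements" concrete, the sketch below converts the byte addresses accessed by one instruction in successive iterations into element strides. The function name and the sample addresses are hypothetical, and the sketch assumes address differences are exact multiples of the element size.

```python
def strides_in_elements(addresses, element_size):
    """Stride between consecutive iterations, expressed in elements.

    `addresses` holds the byte address accessed in each iteration, in order.
    Assumes each difference is a multiple of `element_size`.
    """
    return [(b - a) // element_size for a, b in zip(addresses, addresses[1:])]


# Hypothetical access a[2 * i] on 4-byte floats: addresses advance by 8 bytes.
print(strides_in_elements([1000, 1008, 1016, 1024], element_size=4))
```

Here each iteration skips one element, so every stride is 2 elements even though the byte distance is 8.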

Strides Distribution

Description: Stride ratio in following format: Unit%/Constant%/Variable%

Collected during Memory Access Patterns Analysis and found in Loop Information Pane (Refinement Reports).
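The Unit%/Constant%/Variable% format can be reproduced with a short sketch. The classification rule used here is an assumption made for illustration, not Intel Advisor's documented algorithm: an instruction counts as unit when its stride is always 0 or always 1, constant when it is always the same other value, and variable otherwise. All names and sample data are hypothetical.

```python
from collections import Counter


def classify(strides):
    """Classify one instruction's observed strides (assumed rule, see lead-in)."""
    distinct = set(strides)
    if len(distinct) != 1:
        return "variable"
    return "unit" if distinct <= {0, 1} else "constant"


def strides_distribution(per_instruction_strides):
    """Format the share of each stride class as Unit%/Constant%/Variable%."""
    counts = Counter(classify(s) for s in per_instruction_strides)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instructions to classify")
    return "/".join(
        f"{round(100 * counts[kind] / total)}%"
        for kind in ("unit", "constant", "variable")
    )


# Four hypothetical memory access instructions and their observed strides.
observed = [[1, 1, 1], [1, 1], [4, 4, 4], [3, 7, 1]]
print(strides_distribution(observed))
```

Percentages are rounded independently, so in general they may not add up to exactly 100.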

T

Total Elapsed Time

Description: Total Time-based wall time from beginning to end of loop/function execution, including time for callees.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: Self Elapsed Time is Self Time-based wall time from beginning to end of loop/function execution, excluding time for callees.

Interpretation: Same as Total Time for single-threaded applications.

Total Time

Description: Time actively executing a function/loop, including time for callees.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: Self Time is time actively executing a function/loop, excluding time for callees.

Traits

Description: Scalar and vectorization characteristics that may impact performance.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Possible values:

Trait | Detected ASM Instructions
Divisions | *DIV*
Square Roots | *SQRT*
Type Conversions | *CVT*
NT-stores | *MOVNT*
Gathers | *GATHER*
Scatters | *SCATTER*
Shuffles | *SHUF*
Permutes | *PERM*
Blends | *BLEND*
Packs | *PACK*
Unpacks | *UNPCK*
Inserts | *INSERT*
Extracts | *EXTRACT*
Masked Stores | *MASKMOV*
Shifts | *PROR*, *PROL*, *PSLL*, *PSRA*, *PSRL*
FMA | *FMADD*, *FMSUB*, *FNMADD*, *FNMSUB*
Mask Manipulations | *KADD*, *KTEST*, *KAND*, *KOR*, *KXOR*, *KXNOR*, *KNOT*, *KUNPCK*, *KMOV*, *KSHIFT*
Conflict Detections | *VPCONFLICT*
Exponent extractions | *VGETEXP*
Mantissa extractions | *VGETMANT*
Expands | *EXPAND*
Compresses | *COMPRESS*
VNNI | *VNNI*
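If the asterisks in the patterns above are read as wildcards (an interpretation, not something this section states), trait detection amounts to matching instruction mnemonics against those patterns. The sketch below uses Python's standard fnmatch module on a small subset of the list. The dictionary, function name, and sample mnemonics are hypothetical.

```python
from fnmatch import fnmatchcase

# Subset of the trait patterns listed above, treated as glob patterns.
TRAIT_PATTERNS = {
    "Divisions": ["*DIV*"],
    "Square Roots": ["*SQRT*"],
    "Gathers": ["*GATHER*"],
    "FMA": ["*FMADD*", "*FMSUB*", "*FNMADD*", "*FNMSUB*"],
}


def detect_traits(mnemonics):
    """Return the traits whose patterns match at least one mnemonic."""
    upper = [m.upper() for m in mnemonics]
    return [
        trait
        for trait, patterns in TRAIT_PATTERNS.items()
        if any(fnmatchcase(m, p) for m in upper for p in patterns)
    ]


# Hypothetical loop body: a divide, a fused multiply-add, and a plain move.
print(detect_traits(["vdivps", "vfmadd231ps", "vmovups"]))
```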

Transformations

Description: Loop transformations applied by compiler.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Type

Collected during Survey Analysis, Dependencies Analysis, and Memory Access Patterns Analysis; and found in Loop Information Pane (Survey Report), Advanced View Pane (Survey Report), Dependencies Report, and Memory Access Patterns Report.

Possible Survey Report values:

Possible Memory Access Patterns Report values:

Possible Dependencies Report values - See Problem and Message Types.

U

Unroll Factor

Description: Loop unroll factor applied by the compiler.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

V

Variable References

Description: Name of the variable for which the dependency or memory access stride is detected.

Collected during Dependencies Analysis and Memory Access Patterns Analysis, and found in Dependencies Report and Memory Access Patterns Report.

Vector ISA

Description: The highest vector Instruction Set Architecture used for individual instructions.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Comparison with similar metrics: An ISA higher than the ISA of your current hardware appears when you add corresponding codepaths with x, Qx / ax, Qax compiler options. To see the ISA of non-executed codepaths, enable the Analyze non-executed codepaths option in Project Properties.

Vector Widths

Description: Vector register width in bits.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Possible values: Combination of values, including 32, 64, 128, 256, 512, delimited by a slash or semi-colon (/ or ;).
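When post-processing exported report data, a cell in this format can be split on either delimiter. The following sketch is one way to do that. The function name and the sample cell strings are hypothetical.

```python
import re


def parse_vector_widths(cell: str) -> list:
    """Split a Vector Widths cell on '/' or ';' and return widths as integers."""
    return [int(token) for token in re.split(r"[/;]", cell) if token.strip()]


# Hypothetical cell contents using each delimiter.
print(parse_vector_widths("128/256"))
print(parse_vector_widths("256; 512"))
```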

Vectorization Details

Description: Compiler notes on vectorization.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

VL (Vector Length)

Description: The number of elements processed in a single iteration of vector loops, or the number of elements processed in individual vector instructions.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Calculation/Aggregation: Estimated by binary static analysis or the Intel compiler.

W

Why No Vectorization?

Description: The reason the compiler did not vectorize the loop.

Collected during Survey Analysis and found in Loop Information Pane (Survey Report) and Advanced View Pane (Survey Report).

Interpretation: Click to display the issue root cause and recommended fixes.

X, Y, Z