Intel® Advisor Help
Medium
Survey + Characterization (Trip Counts and FLOP, Call Stacks, Memory-Level) + Memory Access Patterns
In the Medium accuracy preset, Intel® Advisor extends the basic Roofline capability and collects metrics for all memory levels as well as call stack data, which allows you to analyze your application in more detail. The Roofline chart uses the results of the Memory Access Patterns analysis to determine what bounds each loop and to build recommendations in Roofline Guidance.
For information about Memory Access Patterns data interpretation, refer to Investigate Memory Usage and Traffic.
The Memory-Level Roofline allows you to examine each loop at different cache levels and arithmetic intensities and provides precise insights into which cache level causes the performance bottlenecks.
The Memory-Level Roofline can help you to:
To configure the Memory-Level Roofline chart:
Memory-Level Roofline Data
Intel® Advisor uses cache simulation to collect integrated traffic data for all traffic types between the CPU and the different memory subsystems. With this data, Intel® Advisor counts the number of data transfers for a given cache level and computes the arithmetic intensity (AI) for each loop at each memory level.
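As a rough illustration of the per-level computation, the sketch below divides one loop's floating-point operation count by the bytes transferred at each memory level. The function name and all numbers are hypothetical and are not taken from Intel® Advisor output.

```python
def arithmetic_intensity(flop, bytes_by_level):
    """Return FLOP per byte for each memory level that saw any traffic."""
    return {
        level: flop / nbytes
        for level, nbytes in bytes_by_level.items()
        if nbytes > 0
    }


# Hypothetical measurements for a single loop (not real tool output).
flop = 8.0e9
traffic_bytes = {"L1": 3.2e10, "L2": 1.6e10, "L3": 8.0e9, "DRAM": 2.0e9}

ai = arithmetic_intensity(flop, traffic_bytes)
for level, value in ai.items():
    print(f"{level}: {value:.2f} FLOP/byte")
```

Because the FLOP count is the same at every level, the AI differs between levels only through the traffic each level sees.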
Review how the traffic changes from one memory level to another and compare each level to its respective roof to identify the memory hierarchy bottleneck for the kernel. Use this information to determine optimization steps.
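One way to picture this comparison is the minimal sketch below. It assumes that each memory roof is the level's bandwidth multiplied by the loop's AI at that level, and that the level whose roof sits closest to the measured performance is the likely bottleneck. The bandwidth and performance figures are hypothetical, and the helper names are not part of any Intel® Advisor API.

```python
def roof_headroom(performance, ai_by_level, bandwidth_by_level):
    """Ratio of each level's memory roof (bandwidth * AI) to measured performance.

    A ratio close to 1 means the loop runs near that level's roof.
    """
    return {
        level: bandwidth_by_level[level] * ai / performance
        for level, ai in ai_by_level.items()
    }


def likely_bottleneck(headroom):
    """Pick the level with the least headroom (assumed heuristic)."""
    return min(headroom, key=headroom.get)


# Hypothetical values: GFLOPS, FLOP/byte, and GB/s.
performance = 20.0
ai_by_level = {"L1": 0.25, "L2": 0.5, "L3": 1.0, "DRAM": 4.0}
bandwidth = {"L1": 400.0, "L2": 150.0, "L3": 60.0, "DRAM": 6.0}

headroom = roof_headroom(performance, ai_by_level, bandwidth)
print(headroom)
print("Likely bottleneck:", likely_bottleneck(headroom))
```

With these made-up inputs the DRAM roof is only 1.2 times the measured performance, so DRAM traffic would be the first thing to investigate.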
Arithmetic intensity determines the order in which dots are plotted, which can provide some insight into your code's performance. For example, the L1 dot should be the largest dot and the first one plotted on the chart from left to right. However, memory access type, latency, or technical issues can change the order of the dots. If the order is unexpected, run the Memory Access Patterns analysis to investigate the issue.
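The ordering check can be sketched as follows. The expected left-to-right order beyond the L1 dot (L1, L2, L3, DRAM) is an assumption based on traffic usually shrinking toward DRAM; the function names and sample values are hypothetical.

```python
# Assumed typical left-to-right order of dots on the chart.
EXPECTED_ORDER = ["L1", "L2", "L3", "DRAM"]


def plotted_order(ai_by_level):
    """Memory levels sorted by arithmetic intensity, left to right."""
    return sorted(ai_by_level, key=ai_by_level.get)


def order_is_expected(ai_by_level):
    """True when the dots appear in the assumed typical order."""
    expected = [level for level in EXPECTED_ORDER if level in ai_by_level]
    return plotted_order(ai_by_level) == expected


typical = {"L1": 0.25, "L2": 0.5, "L3": 1.0, "DRAM": 4.0}
unusual = {"L1": 0.25, "L2": 0.5, "L3": 2.0, "DRAM": 1.5}

print(order_is_expected(typical))
print(plotted_order(unusual), order_is_expected(unusual))
```

A loop that fails this check is a candidate for a closer look with the Memory Access Patterns analysis.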
To examine a specific loop in more detail, select a dot on the chart and open the Code Analytics tab below the chart:
The Intel® Advisor basic Roofline model, the Cache-Aware Roofline Model (CARM), offers self data capability. The Intel® Advisor Roofline with Callstacks feature extends the basic model with total data capability:
Self data = Memory accesses, FLOPs, and duration related only to the loop/function itself, excluding data originating in other loops/functions called by it
Total data = Data from the loop/function itself and its inner loops/functions
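The difference between the two definitions can be sketched with a small call tree in which total data is the node's self data plus the total data of everything it calls. The `Node` class, its field names, and the numbers are hypothetical and do not reflect how Intel® Advisor stores its results.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A loop or function with its self metrics and its callees."""
    name: str
    self_flop: float
    self_bytes: float
    self_seconds: float
    children: list = field(default_factory=list)


def total_data(node):
    """Return (FLOP, bytes, seconds) for the node plus all of its descendants."""
    flop, nbytes, seconds = node.self_flop, node.self_bytes, node.self_seconds
    for child in node.children:
        c_flop, c_bytes, c_seconds = total_data(child)
        flop += c_flop
        nbytes += c_bytes
        seconds += c_seconds
    return flop, nbytes, seconds


# Hypothetical outer loop with no self data that calls two inner loops.
inner_a = Node("inner_a", self_flop=4.0e9, self_bytes=8.0e9, self_seconds=1.0)
inner_b = Node("inner_b", self_flop=2.0e9, self_bytes=4.0e9, self_seconds=0.5)
outer = Node("outer", 0.0, 0.0, 0.0, children=[inner_a, inner_b])

flop, nbytes, seconds = total_data(outer)
total_ai = flop / nbytes            # FLOP per byte
total_gflops = flop / seconds / 1e9
print(f"total AI = {total_ai:.2f} FLOP/byte, total performance = {total_gflops:.1f} GFLOPS")
```

In this example the outer loop has nothing to plot from its self data alone, yet its total data yields a meaningful position on the chart.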
The total-data capability in the Roofline with Callstacks feature can help you:
Investigate the source of loops/functions instead of just the loops/functions themselves.
Get a more accurate view of loops/functions that behave differently when called under different circumstances.
Uncover design inefficiencies higher up the call chain that could be the root cause of poor performance by smaller loops/functions.
To view the callstacks, enable the With Callstacks checkbox in the Roofline chart.
To show/hide dot descendants:
Click the collapse control on a loop/function dot to collapse descendant dots into the parent dot.
Click the expand control on a loop/function dot to show descendant dots, with visual indicators of their relationship to the parent dot.
Roofline with Callstacks Chart Data
The following Roofline chart representation shows some of the added benefits of the Roofline with Callstacks feature, including:
A navigable, color-coded Callstack pane that shows the entire call chain for the selected loop/function, but excludes its callees
Visual indicators (caller and callee arrows) that show the relationship among loops and functions
The ability to simplify dot-heavy charts by collapsing several small loops into one overall representation
Loops/functions with no self data are grayed out when expanded and in color when collapsed. Loops/functions with self data display at the coordinates, size, and color appropriate to the data when expanded, but have a gray halo of the size associated with their total time. When such loops/functions are collapsed, they change to the size and color appropriate to their total time and, if applicable, move to reflect the total performance and total arithmetic intensity.