Intel® Advisor Help

advisor Command Option Reference

The advisor command currently supports the options shown below.

Option

Description

accuracy

Set an accuracy level for the Offload Modeling collection preset.

append

Add loops (by file and line number) to the loops selected for deeper analysis.

app-working-dir

Specify the directory where the target application runs during analysis, if it is different from the current working directory.

assume-dependencies

Assume that a loop has dependencies if the loop dependency type is unknown.

assume-hide-taxes

Estimate invocation taxes assuming the invocation tax is paid only for the first kernel launch.

assume-ndim-dependency

When searching for an optimal N-dimensional offload, assume there are dependencies between inner and outer loops.

assume-single-data-transfer

Assume data is only transferred once for each offload, and all instances share that data.

auto-finalize

Finalize Survey and Trip Counts & FLOP analysis data after collection is complete.

batching

Emulate the execution of more than one instance simultaneously for a top-level offload.

benchmarks-sync

Run benchmarks on only one concurrently executing Intel Advisor instance to avoid concurrency issues with regard to platform limits.

bottom-up

Generate a Survey report in bottom-up view.

cache-binaries

Enable binary visibility in a read-only snapshot you can view any time.

cache-binaries-mode

Select what binary files will be added to a read-only snapshot.

cache-config

Set the cache hierarchy to collect modeling data for CPU cache behavior during Trip Counts & FLOP analysis.

cache-simulation

Simulate device cache behavior for your application.

cache-sources

Enable source code visibility in a read-only snapshot you can view any time (with the --snapshot action). Enable keeping source code cache within a project (with the --collect action).

cachesim

Enable cache simulation for Performance Modeling.

cachesim-associativity

Set the cache associativity for modeling CPU cache behavior during the Memory Access Patterns analysis.

cachesim-cacheline-size

Set the cache line size (in bytes) for modeling CPU cache behavior during Memory Access Patterns analysis.

cachesim-mode

Set the focus for modeling CPU cache behavior during Memory Access Patterns analysis.

cachesim-sampling-factor

Specify what percentage of total memory accesses should be processed during cache simulation.

cachesim-sets

Set the cache set size (in bytes) for modeling CPU cache behavior during Memory Access Patterns analysis.

check-profitability

Check the profitability of offload regions and add only profitable regions to a report.

clear

Clear all loops previously selected for deeper analysis.

config

Specify a device configuration to model your application performance for.

count-logical-instructions

Use the projection of x86 logical instructions to GPU logical instructions.

count-memory-instructions

Project x86 memory instructions to GPU SEND/SENDS instructions.

count-memory-objects-accesses

Count the number of accesses to memory objects created by code regions.

count-mov-instructions

Project x86 MOV instructions to GPU MOV instructions.

count-send-latency

Select how to model SEND instruction latency.

cpu-scale-factor

Specify a scale factor to approximate a host CPU that is faster than the baseline CPU by this factor.

csv-delimiter

Set the delimiter for a report in CSV format.

custom-config

Specify the absolute path or name for a custom TOML configuration file with additional modeling parameters.

data-limit

Limit the maximum amount (in MB) of raw data collected during Survey analysis.

data-reuse-analysis

Analyze potential data reuse between code regions.

data-transfer

Set the level of detail for modeling data transfers during Characterization.

data-transfer-histogram

Estimate data transfers in detail, including latencies for each transferred object.

data-transfer-page-size

Specify memory page size to set the traffic measurement granularity for the data transfer simulator.

data-type

Show only floating-point data, only integer data, or data for the sum of both data types in a Roofline interactive HTML report.

delete-tripcounts

Remove previously collected trip counts data when re-running a Survey analysis with changed binaries.

disable-fp64-math-optimization

Do not account for optimized traffic for transcendentals on a GPU.

display-callstack

Show a callstack for each loop/function call in a report.

dry-run

List all steps included in Offload Modeling batch collection at a specified accuracy level without running them.

duration

Specify the maximum amount of time (in seconds) an analysis runs.

dynamic

Show (in a Survey report) how many instructions of a given type actually executed during Trip Counts & FLOP analysis.

enable-batching

Deprecated.

enable-cache-simulation

Model CPU cache behavior on your target application.

enable-data-transfer-analysis

Model data transfer between host memory and device memory.

enable-grf-simulation

Enable a simulator to model GRF.

enable-slm

Deprecated. SLM is modeled by default if available.

enable-task-chunking

Examine specified annotated sites for opportunities to perform task-chunking modeling in a Suitability report.

enforce-baseline-decomposition

Use the same local size and SIMD width as measured on a baseline device.

enforce-fallback

Emulate data distribution over stacks if stacks collection is disabled.

enforce-offloads

Offload all selected code regions even if offloading their child loops/functions is more profitable.

estimate-max-speedup

Estimate region speedup with relaxed constraints.

evaluate-min-speedup

Consider loops recommended for offloading only if they reach the minimum estimated speedup specified in a configuration file.

exclude-files

Exclude the specified files or directories from annotation scanning during analysis.

executable-of-interest

Specify an application for analysis that is not the starting application.

exp-dir

Specify a path to an unpacked result snapshot or an MPI rank result to generate a report or model performance.

filter

Filter data by the specified column name and value in a Survey and Trip Counts & FLOP report.

filter-by-scope

Enable filtering detected stack variables by scope (warning vs. error) in a Dependencies analysis.

filter-reductions

Mark all potential reductions with a specific diagnostic during Dependencies analysis.

flex-cachesim

Enable flexible cache simulation to change cache configuration without re-running collection.

flop

Collect data about floating-point and integer operations, memory traffic, and mask utilization metrics for AVX-512 platforms during Trip Counts & FLOP analysis.

force-32bit-arithmetics

Consider all arithmetic operations as single-precision floating-point or int32 operations.

force-64bit-arithmetics

Consider all arithmetic operations as double-precision floating-point or int64 operations.

format

Set a report output format.

gpu

With the Offload Modeling perspective, analyze OpenCL™ and oneAPI Level Zero programs running on Intel® Graphics. With the GPU Roofline Insights perspective, create a Roofline interactive HTML report for data collected on GPUs.

gpu-carm

Collect memory traffic generated by OpenCL™ and Intel® Media SDK programs executed on Intel® Processor Graphics.

gpu-kernels

Deprecated. Use --profile-gpu or --gpu instead.

gpu-sampling-intervals

Specify time interval, in milliseconds, between GPU samples during Survey analysis.

hide-data-transfer-tax

Disable data transfer tax estimation.

ignore

Specify runtimes or libraries whose time should be ignored when calculating per-program speedup.

ignore-app-mismatch

Ignore mismatched target or application parameter errors before starting analysis.

ignore-checksums

Ignore mismatched module checksums before starting analysis.

instance-of-interest

Analyze the Nth child process during Memory Access Patterns and Dependencies analysis.

integrated

Model traffic on all levels of the memory hierarchy for a Roofline report.

interval

Set the length of time (in milliseconds) to wait before collecting each sample during Survey analysis.

limit

Set the maximum number of top items to show in a report.

loop-call-count-limit

Set the maximum number of instances to analyze for all marked loops.

loop-filter-threshold

Specify total time, in milliseconds, to filter out loops that fall below this value.

loops

Select loops (by criteria instead of human input) for deeper analysis.

mark-up

Enable/disable user selection as a way to control loops/functions identified for deeper analysis.

mark-up-list

After running a Survey analysis and identifying loops of interest, select loops (by file and line number or ID) for deeper analysis.

memory-level

Model specific memory level(s) in a Roofline interactive HTML report, including L1, L2, L3, and DRAM.

memory-operation-type

Model only load memory operations, store memory operations, or both, in a Roofline interactive HTML report.

mix

Show dynamic or static instruction mix data in a Survey report.

mkl-user-mode

Collect Intel® oneAPI Math Kernel Library (oneMKL) loops and functions data during the Survey analysis.

model-baseline-gpu

Use the baseline GPU configuration as a target device for modeling.

model-children

Analyze child loops of the region head to determine whether any of them provides a more profitable offload.

model-extended-math

Model calls to math functions such as EXP, LOG, SIN, and COS as extended math instructions, if possible.

model-system-calls

Analyze code regions with system calls, assuming the system calls are separated from the offload code and executed on a host device.

module-filter

Specify application (or child application) module(s) to include in or exclude from analysis.

module-filter-mode

Limit, by inclusion or exclusion, application (or child application) module(s) for analysis.

mpi-rank

Specify MPI process data to import.

mrte-mode

Set the Microsoft* runtime environment mode for analysis.

ndim-depth-limit

When searching for an optimal N-dimensional offload, limit the maximum loop depth that can be converted to one offload.

option-file

Specify a text file containing command line arguments.

overlap-taxes

Enable asynchronous execution to overlap offload overhead with execution time.

pack

Pack a snapshot into an archive.

profile-gpu

Analyze OpenCL™ and oneAPI Level Zero programs running on Intel® Processor Graphics.

profile-intel-perf-libs

Show Intel® performance libraries loops and functions in Intel® Advisor reports.

profile-jit

Collect metrics about Just-In-Time (JIT) generated code regions during the Trip Counts & FLOP analysis.

profile-python

Collect Python* loop and function data during Survey analysis.

profile-stripped-binaries

Collect metrics for stripped binaries.

project-dir

Specify the top-level directory where a result is saved if you want to save the collection somewhere other than the current working directory.

quiet

Minimize status messages during command execution.

recalculate-time

Recalculate total time after filtering a report.

record-mem-allocations

Enable heap allocation tracking to identify heap-allocated variables for which access strides are detected during Memory Access Patterns analysis.

record-stack-frame

Capture stack frame pointers to identify stack variables for which access strides are detected during Memory Access Patterns analysis.

reduce-lock-contention

Examine specified annotated sites for opportunities to reduce lock contention or find deadlocks in a Suitability report.

reduce-lock-overhead

Examine specified annotated sites for opportunities to reduce lock overhead in a Suitability report.

reduce-site-overhead

Examine specified annotated sites for opportunities to reduce site overhead in a Suitability report.

reduce-task-overhead

Examine specified annotated sites for opportunities to reduce task overhead in a Suitability report.

refinalize-survey

Refinalize a survey result collected with a previous Intel® Advisor version or if you need to correct or update source and binary search paths.

remove

Remove loops (by file and line number) from the loops selected for deeper analysis.

report-output

Redirect report output from stdout to another location.

report-template

Specify the path/name of a custom report template file.

result-dir

Specify a directory to identify the running analysis.

resume-after

Resume collection after the specified number of milliseconds.

return-app-exitcode

Return the target exit code instead of the command line interface exit code.

search-dir

Specify the location(s) for finding target support files.

search-n-dim

Enable searching for an optimal N-dimensional offload.

select

Select loops (by file and line number, ID, or criteria) for deeper analysis.

set-dependency

Assume loops with specified IDs or source locations have a dependency.

set-parallel

Assume loops with specified IDs or source locations are parallel.

set-parameter

Specify a single-line parameter to modify in a target device configuration.

show-all-columns

Show data for all available columns in a Survey report.

show-all-rows

Show data for all available rows, including data for child loops, in a Survey report.

show-functions

Show only functions in a report.

show-loops

Show only loops in a report.

show-not-executed

Show not-executed child loops in a Survey report.

show-report

Generate a Survey report for data collected for GPU kernels.

small-node-filter

Specify the total time threshold, in milliseconds, to filter out nodes that fall below this value from PDF and DOT Offload Modeling reports.

sort-asc

Sort data in ascending order (by specified column name) in a report.

sort-desc

Sort data in descending order (by specified column name) in a report.

spill-analysis

Run register flow analysis to calculate the number of consecutive load/store operations in registers and related memory traffic in bytes during Survey analysis.

stack-access-granularity

Specify stack access size to set stack memory access measurement granularity for the data transfer simulation.

stack-stitching

Restructure the call flow during Survey analysis to attach stacks to a point introducing a parallel workload.

stack-unwind-limit

Set stack size limit for analyzing stacks after collection.

stacks

Perform advanced collection of callstack data during Roofline and Trip Counts & FLOP analysis.

stackwalk-mode

Choose between online and offline modes to analyze stacks during Survey analysis.

start-paused

Start executing the target application for analysis purposes, but delay data collection.

static-instruction-mix

Statically calculate the number of specific instructions present in the binary during Survey analysis.

strategy

Specify processes and/or children for instrumentation during Survey analysis.

support-multi-isa-binaries

Collect a variety of data during Survey analysis for loops that reside in non-executed code paths.

target-device

Specify the device configuration for which to model cache during Trip Counts collection.

target-gpu

Specify a target GPU to collect data for if you have multiple GPUs connected to your system.

target-pid

Attach Survey or Trip Counts & FLOP collection to a running process specified by the process ID.

target-process

Attach Survey or Trip Counts & FLOP collection to a running process specified by the process name.

target-system

Specify the hardware configuration to use for modeling purposes in a Suitability report.

threading-model

Specify the threading model to use for modeling purposes in a Suitability report.

threads

Specify the number of parallel threads to use for offload heads.

top-down

Generate a Survey report in top-down view.

trace-mode

Set how to trace loop iterations during Memory Access Patterns analysis.

trace-mpi

Configure collectors to trace MPI code and determine MPI rank IDs for non-Intel® MPI library implementations.

track-memory-objects

Attribute memory objects to the analyzed loops that accessed the objects.

track-stack-accesses

Track accesses to stack memory.

track-stack-variables

Enable parallel data sharing analysis for stack variables during Dependencies analysis.

trip-counts

Collect loop trip counts data during Trip Counts & FLOP analysis.

use-collect-configs

Deprecated.

user-data-dir

Deprecated.

verbose

Maximize status messages during command execution.

with-stack

Show call stack data in a Roofline interactive HTML report (if call stack data is collected).
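The options above are passed on the advisor command line together with an action such as --collect or --report. As a rough illustration only, the Python sketch below assembles two such command lines as argument lists and prints them without running anything. The helper build_advisor_command, the directory ./advi_results, the target ./myApplication, the output file ./survey.csv, and the values survey and csv are assumptions made for this example and are not taken from the table; check each option's own reference page for its accepted values.

```python
import shlex


def build_advisor_command(action, options, target=None):
    """Assemble an advisor command line as a list of arguments.

    Hypothetical helper for illustration; it is not part of Intel Advisor.
    """
    args = ["advisor", action]
    for name, value in options.items():
        if value is True:
            # Options without a value, such as quiet, are passed as bare flags.
            args.append(f"--{name}")
        else:
            args.append(f"--{name}={value}")
    if target:
        # A bare "--" separates advisor options from the target application.
        args.append("--")
        args.extend(target)
    return args


# Assumed example: collect Survey data into a project directory.
collect = build_advisor_command(
    "--collect=survey",
    {"project-dir": "./advi_results", "quiet": True},
    target=["./myApplication"],
)

# Assumed example: write a Survey report in CSV format to a file.
report = build_advisor_command(
    "--report=survey",
    {
        "project-dir": "./advi_results",
        "format": "csv",
        "report-output": "./survey.csv",
    },
)

# Print the commands for inspection instead of executing them.
print(shlex.join(collect))
print(shlex.join(report))
```

Keeping the command as a list of arguments makes it straightforward to hand to subprocess.run later without shell quoting problems, while shlex.join gives a copy-and-paste form for a terminal.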