Intel® Advisor Help
The advisor command currently supports the options shown below.
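Each option is combined with an action such as --collect, --report, or --snapshot, with the target application given after a standalone --. As a hedged sketch (the project directory and application name are placeholders), a Survey run followed by a Trip Counts & FLOP refinement might look like this:

```sh
# Survey analysis: find the hottest loops and functions (paths are illustrative).
advisor --collect=survey --project-dir=./advi_results -- ./myApplication

# Refine the same result with trip counts, FLOP, and call stack data.
advisor --collect=tripcounts --flop --stacks --project-dir=./advi_results -- ./myApplication
```

Run advisor --help for the full, version-specific syntax of each action and option.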
| Option | Description |
|---|---|
|  | Set an accuracy level for the Offload Modeling collection preset. |
|  | Add loops (by file and line number) to the loops selected for deeper analysis. |
|  | Specify the directory where the target application runs during analysis, if it is different from the current working directory. |
|  | Assume that a loop has dependencies if the loop dependency type is unknown. |
|  | Estimate invocation taxes assuming the invocation tax is paid only for the first kernel launch. |
|  | When searching for an optimal N-dimensional offload, assume there are dependencies between inner and outer loops. |
|  | Assume data is only transferred once for each offload, and all instances share that data. |
|  | Finalize Survey and Trip Counts & FLOP analysis data after collection is complete. |
|  | Emulate the execution of more than one instance simultaneously for a top-level offload. |
|  | Run benchmarks on only one concurrently executing Intel Advisor instance to avoid concurrency issues with regard to platform limits. |
|  | Generate a Survey report in bottom-up view. |
|  | Enable binary visibility in a read-only snapshot you can view any time. |
|  | Select what binary files will be added to a read-only snapshot. |
|  | Set the cache hierarchy to collect modeling data for CPU cache behavior during Trip Counts & FLOP analysis. |
|  | Simulate device cache behavior for your application. |
|  | Enable source code visibility in a read-only snapshot you can view any time (with the --snapshot action). Enable keeping source code cache within a project (with the --collect action). |
|  | Enable cache simulation for Performance Modeling. |
|  | Set the cache associativity for modeling CPU cache behavior during the Memory Access Patterns analysis. |
|  | Set the cache line size (in bytes) for modeling CPU cache behavior during Memory Access Patterns analysis. |
|  | Set the focus for modeling CPU cache behavior during Memory Access Patterns analysis. |
|  | Specify what percentage of total memory accesses should be processed during cache simulation. |
|  | Set the cache set size (in bytes) for modeling CPU cache behavior during Memory Access Patterns analysis. |
|  | Check the profitability of offload regions and add only profitable regions to a report. |
|  | Clear all loops previously selected for deeper analysis. |
|  | Specify a device configuration to model your application performance for. |
|  | Use the projection of x86 logical instructions to GPU logical instructions. |
|  | Project x86 memory instructions to GPU SEND/SENDS instructions. |
|  | Count the number of accesses to memory objects created by code regions. |
|  | Project x86 MOV instructions to GPU MOV instructions. |
|  | Select how to model SEND instruction latency. |
|  | Specify a scale factor to approximate a host CPU that is faster than the baseline CPU by this factor. |
|  | Set the delimiter for a report in CSV format. |
|  | Specify the absolute path or name for a custom TOML configuration file with additional modeling parameters. |
|  | Limit the maximum amount (in MB) of raw data collected during Survey analysis. |
|  | Analyze potential data reuse between code regions. |
|  | Set the level of detail for modeling data transfers during Characterization. |
|  | Estimate data transfers in detail, including latencies for each transferred object. |
|  | Specify memory page size to set the traffic measurement granularity for the data transfer simulator. |
|  | Show only floating-point data, only integer data, or data for the sum of both data types in a Roofline interactive HTML report. |
|  | Remove previously collected trip counts data when re-running a Survey analysis with changed binaries. |
|  | Do not account for optimized traffic for transcendentals on a GPU. |
|  | Show a callstack for each loop/function call in a report. |
|  | List all steps included in Offload Modeling batch collection at a specified accuracy level without running them. |
|  | Specify the maximum amount of time (in seconds) an analysis runs. |
|  | Show (in a Survey report) how many instructions of a given type actually executed during Trip Counts & FLOP analysis. |
| enable-batching | Deprecated. |
|  | Model CPU cache behavior on your target application. |
|  | Model data transfer between host memory and device memory. |
|  | Enable a simulator to model GRF. |
| enable-slm | Deprecated. SLM is modeled by default if available. |
|  | Examine specified annotated sites for opportunities to perform task-chunking modeling in a Suitability report. |
|  | Use the same local size and SIMD width as measured on a baseline device. |
|  | Emulate data distribution over stacks if stacks collection is disabled. |
|  | Offload all selected code regions even if offloading their child loops/functions is more profitable. |
|  | Estimate region speedup with relaxed constraints. |
|  | Consider loops recommended for offloading only if they reach the minimum estimated speedup specified in a configuration file. |
|  | Exclude the specified files or directories from annotation scanning during analysis. |
|  | Specify an application for analysis that is not the starting application. |
|  | Specify a path to an unpacked result snapshot or an MPI rank result to generate a report or model performance. |
|  | Filter data by the specified column name and value in a Survey and Trip Counts & FLOP report. |
|  | Enable filtering detected stack variables by scope (warning vs. error) in a Dependencies analysis. |
|  | Mark all potential reductions by specific diagnostic during Dependencies analysis. |
|  | Enable flexible cache simulation to change cache configuration without re-running collection. |
|  | Collect data about floating-point and integer operations, memory traffic, and mask utilization metrics for AVX-512 platforms during Trip Counts & FLOP analysis. |
|  | Consider all arithmetic operations as single-precision floating-point or int32 operations. |
|  | Consider all arithmetic operations as double-precision floating-point or int64 operations. |
|  | Set a report output format. |
|  | With the Offload Modeling perspective, analyze OpenCL™ and oneAPI Level Zero programs running on Intel® Graphics. With the GPU Roofline Insights perspective, create a Roofline interactive HTML report for data collected on GPUs. |
|  | Collect memory traffic generated by OpenCL™ and Intel® Media SDK programs executed on Intel® Processor Graphics. |
| gpu-kernels | Deprecated. Use --profile-gpu or --gpu instead. |
|  | Specify time interval, in milliseconds, between GPU samples during Survey analysis. |
|  | Disable data transfer tax estimation. |
|  | Specify runtimes or libraries to ignore time spent in these regions when calculating per-program speedup. |
|  | Ignore mismatched target or application parameter errors before starting analysis. |
|  | Ignore mismatched module checksums before starting analysis. |
|  | Analyze the Nth child process during Memory Access Patterns and Dependencies analysis. |
|  | Model traffic on all levels of the memory hierarchy for a Roofline report. |
|  | Set the length of time (in milliseconds) to wait before collecting each sample during Survey analysis. |
|  | Set the maximum number of top items to show in a report. |
|  | Set the maximum number of instances to analyze for all marked loops. |
|  | Specify total time, in milliseconds, to filter out loops that fall below this value. |
|  | Select loops (by criteria instead of human input) for deeper analysis. |
|  | Enable/disable user selection as a way to control loops/functions identified for deeper analysis. |
|  | After running a Survey analysis and identifying loops of interest, select loops (by file and line number or ID) for deeper analysis. |
|  | Model specific memory level(s) in a Roofline interactive HTML report, including L1, L2, L3, and DRAM. |
|  | Model only load memory operations, store memory operations, or both, in a Roofline interactive HTML report. |
|  | Show dynamic or static instruction mix data in a Survey report. |
|  | Collect Intel® oneAPI Math Kernel Library (oneMKL) loops and functions data during the Survey analysis. |
|  | Use the baseline GPU configuration as a target device for modeling. |
|  | Analyze child loops of the region head to find if some of the child loops provide more profitable offload. |
|  | Model calls to math functions such as EXP, LOG, SIN, and COS as extended math instructions, if possible. |
|  | Analyze code regions with system calls, assuming they are separated from the offloaded code and executed on a host device. |
|  | Specify application (or child application) module(s) to include in or exclude from analysis. |
|  | Limit, by inclusion or exclusion, application (or child application) module(s) for analysis. |
|  | Specify MPI process data to import. |
|  | Set the Microsoft* runtime environment mode for analysis. |
|  | When searching for an optimal N-dimensional offload, limit the maximum loop depth that can be converted to one offload. |
|  | Specify a text file containing command line arguments. |
|  | Enable asynchronous execution to overlap offload overhead with execution time. |
|  | Pack a snapshot into an archive. |
|  | Analyze OpenCL™ and oneAPI Level Zero programs running on Intel® Processor Graphics. |
|  | Show Intel® performance libraries loops and functions in Intel® Advisor reports. |
|  | Collect metrics about Just-In-Time (JIT) generated code regions during the Trip Counts and FLOP analysis. |
|  | Collect Python* loop and function data during Survey analysis. |
|  | Collect metrics for stripped binaries. |
|  | Specify the top-level directory where a result is saved if you want to save the collection somewhere other than the current working directory. |
|  | Minimize status messages during command execution. |
|  | Recalculate total time after filtering a report. |
|  | Enable heap allocation tracking to identify heap-allocated variables for which access strides are detected during Memory Access Patterns analysis. |
|  | Capture stack frame pointers to identify stack variables for which access strides are detected during Memory Access Patterns analysis. |
|  | Examine specified annotated sites for opportunities to reduce lock contention or find deadlocks in a Suitability report. |
|  | Examine specified annotated sites for opportunities to reduce lock overhead in a Suitability report. |
|  | Examine specified annotated sites for opportunities to reduce site overhead in a Suitability report. |
|  | Examine specified annotated sites for opportunities to reduce task overhead in a Suitability report. |
|  | Refinalize a survey result if it was collected with a previous Intel® Advisor version or if you need to correct or update source and binary search paths. |
|  | Remove loops (by file and line number) from the loops selected for deeper analysis. |
|  | Redirect report output from stdout to another location. |
|  | Specify the PATH/name of a custom report template file. |
|  | Specify a directory to identify the running analysis. |
|  | Resume collection after the specified number of milliseconds. |
|  | Return the target exit code instead of the command line interface exit code. |
|  | Specify the location(s) for finding target support files. |
|  | Enable searching for an optimal N-dimensional offload. |
|  | Select loops (by file and line number, ID, or criteria) for deeper analysis. |
|  | Assume loops with specified IDs or source locations have a dependency. |
|  | Assume loops with specified IDs or source locations are parallel. |
|  | Specify a single-line parameter to modify in a target device configuration. |
|  | Show data for all available columns in a Survey report. |
|  | Show data for all available rows, including data for child loops, in a Survey report. |
|  | Show only functions in a report. |
|  | Show only loops in a report. |
|  | Show not-executed child loops in a Survey report. |
|  | Generate a Survey report for data collected for GPU kernels. |
|  | Specify the total time threshold, in milliseconds, to filter out nodes that fall below this value from PDF and DOT Offload Modeling reports. |
|  | Sort data in ascending order (by specified column name) in a report. |
|  | Sort data in descending order (by specified column name) in a report. |
|  | Enable register flow analysis to calculate the number of consecutive load/store operations in registers and related memory traffic in bytes during Survey analysis. |
|  | Specify stack access size to set stack memory access measurement granularity for the data transfer simulation. |
|  | Restructure the call flow during Survey analysis to attach stacks to a point introducing a parallel workload. |
|  | Set stack size limit for analyzing stacks after collection. |
|  | Perform advanced collection of callstack data during Roofline and Trip Counts & FLOP analysis. |
|  | Choose between online and offline modes to analyze stacks during Survey analysis. |
|  | Start executing the target application for analysis purposes, but delay data collection. |
|  | Statically calculate the number of specific instructions present in the binary during Survey analysis. |
|  | Specify processes and/or children for instrumentation during Survey analysis. |
|  | Collect a variety of data during Survey analysis for loops that reside in non-executed code paths. |
|  | Specify a device configuration to model cache for during Trip Counts collection. |
|  | Specify a target GPU to collect data for if you have multiple GPUs connected to your system. |
|  | Attach Survey or Trip Counts & FLOP collection to a running process specified by the process ID. |
|  | Attach Survey or Trip Counts & FLOP collection to a running process specified by the process name. |
|  | Specify the hardware configuration to use for modeling purposes in a Suitability report. |
|  | Specify the threading model to use for modeling purposes in a Suitability report. |
|  | Specify the number of parallel threads to use for offload heads. |
|  | Generate a Survey report in top-down view. |
|  | Set how to trace loop iterations during Memory Access Patterns analysis. |
|  | Configure collectors to trace MPI code and determine MPI rank IDs for non-Intel® MPI library implementations. |
|  | Attribute memory objects to the analyzed loops that accessed the objects. |
|  | Track accesses to stack memory. |
|  | Enable parallel data sharing analysis for stack variables during Dependencies analysis. |
|  | Collect loop trip counts data during Trip Counts & FLOP analysis. |
| use-collect-configs | Deprecated. |
| user-data-dir | Deprecated. |
|  | Maximize status messages during command execution. |
|  | Show call stack data in a Roofline interactive HTML report (if call stack data is collected). |
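Several of the reporting and snapshot options above can be applied after a collection finishes. The following hedged sketch (output paths and the snapshot name are illustrative) writes a Survey report in CSV format and then packs a read-only snapshot that keeps source and binary visibility:

```sh
# Write the Survey report to a CSV file instead of stdout (paths are illustrative).
advisor --report=survey --project-dir=./advi_results --format=csv --report-output=./reports/survey.csv

# Pack a portable, read-only snapshot with cached sources and binaries.
advisor --snapshot --project-dir=./advi_results --pack --cache-sources --cache-binaries -- ./my_snapshot
```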