Targeting IA-32 Architecture Processors for Run-time Performance Checking

The -ax (Linux* and Mac OS* X) or /Qax (Windows*) option instructs the compiler to determine if opportunities exist to generate multiple, specialized code paths to take advantage of performance gains and features available on newer Intel® processors based on IA-32 and Intel® 64 architectures. The option also instructs the compiler to generate a generic code path that should allow the same application to run on a larger number of processors; however, the generic code is usually slower than the specialized code.

The compiler inserts run-time checking code to help determine which version of the code to execute. The size of the compiled binary increases because it contains both a processor-specific version of some of the code and a generic version of all code. Application performance is affected slightly due to the run-time checks needed to determine which code to use. The code path executed depends strictly on the processor detected at run time.
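The run-time selection described above can be pictured with a small model. This is only an illustrative sketch, not the compiler's actual dispatch mechanism: the path labels, the `SPECIALIZED_PATHS` table, and the `select_code_path` function are hypothetical names introduced here, and the model assumes the most specialized supported path is preferred.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical table of specialized code paths, most specialized first.
# Each entry pairs a label with the instruction sets that path requires.
SPECIALIZED_PATHS: List[Tuple[str, FrozenSet[str]]] = [
    ("ssse3_path", frozenset({"sse", "sse2", "sse3", "ssse3"})),
    ("sse3_path", frozenset({"sse", "sse2", "sse3"})),
]

# Label for the generic version that is always present in the binary.
GENERIC_PATH = "generic_path"


def select_code_path(detected: FrozenSet[str]) -> str:
    """Return the first specialized path the processor supports.

    Falls back to the generic path when no specialized path qualifies.
    """
    for label, required in SPECIALIZED_PATHS:
        if required <= detected:  # every required instruction set is present
            return label
    return GENERIC_PATH
```

In this model a processor that reports only SSE and SSE2 gets `generic_path`, which mirrors the statement that the path executed depends on the processor detected at run time.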

Processor support for the generic code path is determined by the architecture specified in the -x (Linux and Mac OS X) or /Qx (Windows) option, which has default architecture values. You can raise the minimum architecture for the generic code path above the default level, but you cannot specify a value lower than the default. The generic code will not operate correctly on processors that are not compatible with the minimum architecture specified in the x option.

Optimizations in the specialized code paths can include generating and using Intel® Streaming SIMD Extensions 4 (SSE4), Supplemental Streaming SIMD Extensions 3 (SSSE3), Streaming SIMD Extensions 3 (SSE3), Streaming SIMD Extensions 2 (SSE2), or Streaming SIMD Extensions (SSE) instructions for supported processors; however, the instructions are executed on the processor only after run-time checking verifies the instruction sets are supported.

Unless otherwise indicated, the following processor values are valid for IA-32 and Intel® 64 architectures.

| Linux and Mac OS X | Windows | Description |
| --- | --- | --- |
| -axS | /QaxS | Can generate a specialized code path using SSE4 Vectorizing Compiler and Media Accelerators instructions for future Intel processors that support the instruction set, and can optimize for the architecture. Mac OS X: IA-32 architectures only. |
| -axT | /QaxT | Can generate a specialized code path for SSSE3, SSE3, SSE2, and SSE instructions for Intel processors and optimize for the Intel® Core™2 Duo processor family. Mac OS X: IA-32 and Intel® 64 architectures. |
| -axP | /QaxP | Can generate a specialized code path for SSE3, SSE2, and SSE instructions for Intel processors and optimize for Intel processors based on Intel® Core™ microarchitecture and Intel Netburst® microarchitecture. Mac OS X: IA-32 and Intel® 64 architectures. |
| -axB | /QaxB | Deprecated. Can generate a specialized code path for SSE2 and SSE instructions for Intel processors and optimize for Intel® Pentium® M processors. If you are using this value for the first time, consider using the N or W values instead. Linux and Windows: IA-32 architectures only. |
| -axN | /QaxN | Can generate a specialized code path for SSE2 and SSE instructions for Intel processors and optimize for Intel® Pentium® 4 processors and Intel® Xeon® processors with SSE2. Linux and Windows: IA-32 architectures only. |
| -axW | /QaxW | Can generate a specialized code path for SSE2 and SSE instructions for Intel processors and optimize for Intel® Pentium® 4 processors and Intel® Xeon® processors with SSE2. Minimum value for Intel® 64 architectures. |
| -axK | /QaxK | Can generate a specialized code path for SSE instructions for Intel processors and optimize for Intel® Pentium® III and Intel® Pentium® III Xeon® processors. Linux and Windows: IA-32 architectures only. |

Note

You can specify -diag-disable cpu-dispatch (Linux and Mac OS X) or /Qdiag-disable:cpu-dispatch (Windows) to disable CPU dispatch remarks.

If your application does not need to run on multiple processors based on IA-32 or Intel® 64 architectures, consider using the -x (Linux and Mac OS X) or /Qx (Windows) option instead. You can also combine this option with the x option. Combining the options allows the compiler to generate multiple specialized code paths while raising the minimum processor required by the generic code path. If you combine the options, the -x (Linux and Mac OS X) or /Qx (Windows) option takes precedence for the generic code: the generic code executes only on processors compatible with the processor value specified in the x option.
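To make the interaction between the two options concrete, the following sketch models which code path would run for a given processor. It is a simplified illustration under stated assumptions, not a description of the compiler's internals: `path_taken` and `INSTRUCTION_SETS` are hypothetical names, only the K, W, P, and T values are modeled (with instruction sets copied from the processor value table in this section), and the model assumes the most demanding supported path is preferred. It also ignores the fact that the compiler generates a specialized path only where it expects a performance gain.

```python
from typing import Dict, FrozenSet

# Instruction sets per processor value, copied from the processor value
# table in this section. Only K, W, P, and T are modeled here.
INSTRUCTION_SETS: Dict[str, FrozenSet[str]] = {
    "K": frozenset({"SSE"}),
    "W": frozenset({"SSE", "SSE2"}),
    "P": frozenset({"SSE", "SSE2", "SSE3"}),
    "T": frozenset({"SSE", "SSE2", "SSE3", "SSSE3"}),
}


def path_taken(detected: FrozenSet[str], x_value: str, ax_values: str) -> str:
    """Hypothetical model: which code path runs on a given processor."""
    # The x value sets the minimum processor for the generic code path.
    if not INSTRUCTION_SETS[x_value] <= detected:
        return "unsupported"  # generic code may not operate correctly
    # Among the ax values the processor supports, prefer the one that
    # requires the most instruction sets (an assumption of this model).
    usable = [v for v in ax_values if INSTRUCTION_SETS[v] <= detected]
    if usable:
        best = max(usable, key=lambda v: len(INSTRUCTION_SETS[v]))
        return "specialized:" + best
    return "generic:" + x_value
```

For a hypothetical combination of the W value for the x option with the P and T values for the ax option, a processor that supports only SSE and SSE2 yields `generic:W` in this model, while a processor that supports only SSE yields `unsupported`.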

The following compilation examples demonstrate how to generate an executable that includes up to three versions of the code: an optimized version for Intel® Core™2 Duo processors and an optimized version for Intel® Core™ Duo processors, each generated only where there is a performance gain, and a generic version that runs on any IA-32 architecture processor that supports the minimum required instruction sets for that operating system and architecture.

| Platform | Example |
| --- | --- |
| Linux | ifort -axPT sample.f90 |
| Windows | ifort /QaxPT sample.f90 |
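The examples above pass two processor values, P and T, in a single option. As a reading aid, the sketch below splits such an option string into its individual processor values and rejects letters that are not in the processor value table. The function `split_ax_option` and the constant `VALID_VALUES` are hypothetical helpers written for this illustration; they are not part of the compiler or its tools.

```python
from typing import List

# Processor values listed in the processor value table of this section.
VALID_VALUES = "STPBNWK"


def split_ax_option(option: str) -> List[str]:
    """Split an option such as '-axPT' or '/QaxPT' into processor values."""
    for prefix in ("-ax", "/Qax"):
        if option.startswith(prefix):
            values = option[len(prefix):]
            break
    else:
        raise ValueError("not an ax option: " + option)
    if not values:
        raise ValueError("no processor value given: " + option)
    unknown = [v for v in values if v not in VALID_VALUES]
    if unknown:
        raise ValueError("unknown processor value(s): " + "".join(unknown))
    return list(values)
```

Both `split_ax_option("-axPT")` and `split_ax_option("/QaxPT")` return `["P", "T"]`, one entry per specialized code path the compiler may generate.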

Other Options for Generating Processor-Specific Optimized Applications

The -mtune (Linux and Mac OS X) or /G{n} (Windows) option generates code that should be compatible with earlier Intel® processors in the same processor family. For example, code generated with -mtune=pentium4 (Linux and Mac OS X) or /G7 (Windows) should run on earlier processors based on IA-32 architecture; however, on those processors the code might be slower than if it had been compiled with -mtune=pentium (Linux and Mac OS X) or /G5 (Windows).

The following options can optimize application performance for specific processors based on IA-32 or Intel® 64 architectures.

| Linux and Mac OS X | Windows | Optimizes applications for... |
| --- | --- | --- |
| -mtune=pentium4 | /G7 | Default. Intel® Pentium® 4 processors, Intel® Core™ Duo processors, Intel® Core™ Solo processors, Intel® Xeon® processors, Intel® Pentium® M processors, and Intel® Pentium® 4 processors with Streaming SIMD Extensions 3 (SSE3) instruction support |
| -mtune=pentiumpro | /G6 | Intel® Pentium® Pro, Pentium® II, and Pentium® III processors |
| -mtune=pentium or -mtune=pentium-mmx | /G5 | Intel® Pentium® processors and Pentium® processors with MMX™ technology |
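The correspondence between the Linux and Mac OS X spellings and the Windows spellings in the table above can be captured as a small lookup. This is only a convenience sketch for build scripts: `MTUNE_TO_G` and `windows_equivalent` are hypothetical names, and the mapping contains nothing beyond the rows of the table.

```python
from typing import Dict

# Mapping taken directly from the table above.
MTUNE_TO_G: Dict[str, str] = {
    "pentium4": "/G7",
    "pentiumpro": "/G6",
    "pentium": "/G5",
    "pentium-mmx": "/G5",
}


def windows_equivalent(linux_option: str) -> str:
    """Translate a '-mtune=<value>' option to its Windows /G{n} spelling."""
    prefix = "-mtune="
    if not linux_option.startswith(prefix):
        raise ValueError("not an -mtune option: " + linux_option)
    value = linux_option[len(prefix):]
    if value not in MTUNE_TO_G:
        raise ValueError("value not listed in the table: " + value)
    return MTUNE_TO_G[value]
```

For example, `windows_equivalent("-mtune=pentium4")` returns `"/G7"`, the default listed in the table.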

Note

Windows: For this release, the /G5, /G6, and /G7 options have been deprecated but not removed.

Each of the following example commands compiles the source program sample.f90 into a binary optimized for Pentium 4 and Intel® Xeon® processors, which is the default behavior. The same binary will also run on Pentium, Pentium III, and more advanced processors.

| Platform | Example |
| --- | --- |
| Linux and Mac OS X | ifort -mtune=pentium4 sample.f90 |
| Windows | ifort /G7 sample.f90 |