FFTW to Intel® Math Kernel Library Wrappers Technical User Notes
for FFTW 2.x

Introduction
Wrappers Reference
     Complex FFTs
     Real FFTs
     Wisdom Wrappers
     Memory Allocation
Parallel Mode
     Multi-threaded FFTW
Calling Wrappers from Fortran
Installation
     Creating a Wrapper Library
     Application Assembling
     Running Examples
MPI FFTW
     MPI FFTW Wrappers Reference
     Creating MPI FFTW Wrapper Library
     Application Assembling with MPI FFTW Wrapper Library
     Running Examples of MPI FFTW Wrappers
Technical Support
Disclaimer and Legal Information

Introduction

This document describes a collection of wrappers: a layer that exposes the FFTW interfaces and calls functions of the Intel® Math Kernel Library (Intel® MKL) Fourier transform interface (DFTI). These wrappers correspond to FFTW version 2.x and Intel MKL versions 7.0 and later.

The purpose of this set of wrappers is to enable developers whose programs currently use FFTW to gain performance with the Intel MKL Fourier transforms without changing the program source code. Because FFTW and Intel MKL DFTI differ in functionality, there are restrictions on using the wrappers in place of the FFTW functions, and some FFTW functions have empty wrappers. However, many typical DFTs can be computed using these wrappers.

Refer to the Intel MKL Reference Manual, Chapter 11 "Fourier Transform Functions", to better understand the effects of using the wrappers.

Additional wrappers may be added in the future to extend FFTW functionality available with Intel MKL.

Wrappers Reference

This section provides a reference for the FFTW C interface.

Each FFTW function has its own wrapper. Wrappers that are not expressly listed below are empty and do nothing, but they are still needed to satisfy the function calls and avoid link errors.
Intel MKL DFTI operates on both single-precision (float) and double-precision data types.

Complex FFTs

One-dimensional FFTs

fftw_plan fftw_create_plan(int n, fftw_direction dir, int flags);
fftw_plan fftw_create_plan_specific(int n, fftw_direction dir, int flags, fftw_complex *in, int istride, fftw_complex *out, int ostride);

void fftw(fftw_plan plan, int howmany, fftw_complex *in, int istride, int idist, fftw_complex *out, int ostride, int odist);
void fftw_one(fftw_plan plan, fftw_complex *in, fftw_complex *out);

void fftw_destroy_plan(fftw_plan plan);

Argument restrictions. The flags parameter has no effect: the same algorithm is used for all values of flags.
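
The howmany, istride, and idist arguments of fftw describe where each input element lives. The FFTW 2.x documentation places element j of transform k at in[j * istride + k * idist] (and likewise for the output with ostride and odist). The following sketch only computes that offset; it assumes the wrappers keep the FFTW layout, and the helper name demo_fftw_index is hypothetical, not part of FFTW or Intel MKL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTW or Intel MKL): returns the array
 * offset of element j of transform k for the layout that the FFTW 2.x
 * documentation describes for fftw(): in[j * stride + k * dist]. */
static size_t demo_fftw_index(int j, int k, int stride, int dist)
{
    assert(j >= 0 && k >= 0 && stride > 0 && dist >= 0);
    return (size_t)j * (size_t)stride + (size_t)k * (size_t)dist;
}
```

With this layout, howmany transforms of length n stored one after another correspond to istride = 1 and idist = n, while interleaved transforms correspond to istride = howmany and idist = 1.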

Multi-dimensional FFTs

fftwnd_plan fftwnd_create_plan(int rank, const int *n, fftw_direction dir, int flags);
fftwnd_plan fftw2d_create_plan(int nx, int ny, fftw_direction dir, int flags);
fftwnd_plan fftw3d_create_plan(int nx, int ny, int nz, fftw_direction dir, int flags);
fftwnd_plan fftwnd_create_plan_specific(int rank, const int *n, fftw_direction dir, int flags, fftw_complex *in, int istride, fftw_complex *out, int ostride);
fftwnd_plan fftw2d_create_plan_specific(int nx, int ny, fftw_direction dir, int flags, fftw_complex *in, int istride, fftw_complex *out, int ostride);
fftwnd_plan fftw3d_create_plan_specific(int nx, int ny, int nz, fftw_direction dir, int flags, fftw_complex *in, int istride, fftw_complex *out, int ostride);

void fftwnd(fftwnd_plan plan, int howmany, fftw_complex *in, int istride, int idist, fftw_complex *out, int ostride, int odist);
void fftwnd_one(fftwnd_plan plan, fftw_complex *in, fftw_complex *out);

void fftwnd_destroy_plan(fftwnd_plan plan);

Argument restrictions. The flags parameter has no effect: the same algorithm is used for all values of flags.

Real FFTs

One-dimensional FFTs

All wrappers are empty and do nothing, because Intel MKL DFTI does not currently support this functionality (the halfcomplex array format).

Multi-dimensional FFTs

rfftwnd_plan rfftwnd_create_plan(int rank, const int *n, fftw_direction dir, int flags);
rfftwnd_plan rfftw2d_create_plan(int nx, int ny, fftw_direction dir, int flags);
rfftwnd_plan rfftw3d_create_plan(int nx, int ny, int nz, fftw_direction dir, int flags);

void rfftwnd_real_to_complex(rfftwnd_plan plan, int howmany, fftw_real *in, int istride, int idist, fftw_complex *out, int ostride, int odist);
void rfftwnd_complex_to_real(rfftwnd_plan plan, int howmany, fftw_complex *in, int istride, int idist, fftw_real *out, int ostride, int odist);
void rfftwnd_one_real_to_complex(rfftwnd_plan plan, fftw_real *in, fftw_complex *out);
void rfftwnd_one_complex_to_real(rfftwnd_plan plan, fftw_complex *in, fftw_real *out);

void rfftwnd_destroy_plan(rfftwnd_plan plan);

Argument restrictions. The same algorithm corresponds to all values of the flags parameter.

Wisdom Wrappers

All wrappers are empty and do nothing, as Intel MKL DFTI currently does not support these functionalities.

Memory Allocation

void* fftw_malloc(size_t n);

void fftw_free(void* x);

Unlike the original FFTW fftw_malloc function, the fftw_malloc wrapper does not align the allocated array. To obtain aligned DFT data, allocate extra memory and shift the array address. See also the Managing Performance and Memory chapter in the Intel MKL User's Guide (file userguide.pdf).
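
The "allocate extra memory and shift the address" approach can be sketched as follows. This is a minimal illustration, not code from Intel MKL: the helper names demo_aligned_malloc and demo_aligned_free are hypothetical, standard malloc and free stand in for the fftw_malloc and fftw_free wrappers, and the alignment value is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers, not part of FFTW or Intel MKL. */

/* Allocates n bytes at an address that is a multiple of align (a power of
 * two). The pointer returned by malloc is saved just before the aligned
 * block so that demo_aligned_free can release it. Returns NULL on failure. */
static void *demo_aligned_malloc(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (align < sizeof(void *))
        align = sizeof(void *);      /* keep the saved-pointer slot aligned */

    /* Extra room: up to align - 1 bytes of shift plus one saved pointer. */
    size_t extra = align - 1 + sizeof(void *);
    if (n > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(n + extra);   /* stand-in for fftw_malloc */
    if (raw == NULL)
        return NULL;

    /* Shift the address up to the next multiple of align. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    ((void **)aligned)[-1] = raw;    /* remember the original pointer */
    return (void *)aligned;
}

/* Releases a block obtained from demo_aligned_malloc. */
static void demo_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);      /* stand-in for fftw_free */
}
```

The shifted pointer must never be passed to fftw_free directly; only the original pointer may be released, which is why the sketch stores it next to the data.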

Parallel Mode

This section covers only the multi-threaded FFTW wrappers. MPI FFTW wrappers, available with Intel MKL for Linux* and Windows*, are described in a separate section.

Multi-threaded FFTW

FFTW multi-threaded functions use the number of threads parameter, which the fftw_threads_init function defines. However, the int fftw_threads_init(void) wrapper is empty and does nothing, as the Intel MKL DFTI implements a different mechanism of parallelization. If you want to use Intel MKL DFTI routines in parallel mode or call wrappers from a multi-threaded application, please refer to the Intel MKL documentation to learn how to manage the number of threads.

Each of the other wrappers in this section is the same as the corresponding wrapper in the Complex FFTs or Real FFTs section, whose name is obtained by removing "threads_" from the name of the given wrapper.

For example,
void fftw_threads_one(int threads, fftw_plan plan, fftw_complex *in, fftw_complex *out);
is the same as
void fftw_one(fftw_plan plan, fftw_complex *in, fftw_complex *out);

Argument restrictions. The threads parameter is ignored. Both functions may run single-threaded or in parallel, depending on the Intel MKL variables.
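
One way to picture this relationship is a threaded entry point that discards its threads argument and forwards the call. The sketch below uses stand-in types and names (demo_plan, demo_complex, demo_one, demo_threads_one) that are purely illustrative; it does not reproduce the wrapper source, and the stand-in transform only copies its input so that the effect of a call is observable.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; real code uses fftw_plan and
 * fftw_complex from fftw.h. */
typedef struct { double re, im; } demo_complex;
typedef struct { int n; } demo_plan;

/* Stand-in for the fftw_one wrapper: copies the input to the output
 * (a real wrapper would compute the transform instead). */
static void demo_one(const demo_plan *plan, const demo_complex *in,
                     demo_complex *out)
{
    assert(plan != NULL && in != NULL && out != NULL);
    for (int i = 0; i < plan->n; i++)
        out[i] = in[i];
}

/* Stand-in for the fftw_threads_one wrapper: the threads argument is
 * accepted for source compatibility and then ignored. */
static void demo_threads_one(int threads, const demo_plan *plan,
                             const demo_complex *in, demo_complex *out)
{
    (void)threads;
    demo_one(plan, in, out);
}
```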

Calling Wrappers from Fortran

Wrappers are available for all Fortran FFTW functions.
For example, in Fortran you should call the fftw_f77_one wrapper instead of the C wrapper fftw_one.

FFTW Fortran functions are wrappers to FFTW C functions, and the Fortran wrappers are in turn wrappers to the C wrappers. Their functionality and argument restrictions are therefore the same as those of the corresponding C wrappers.

Installation

Wrappers are delivered as source code, which you must compile to build the wrapper library. The FFTW library can then be replaced with the wrapper library and the Intel MKL libraries. The source code for the wrappers and the makefiles with the wrapper list files are located in the \interfaces\fftw2xc and \interfaces\fftw2xf subdirectories in the Intel MKL directory for C and Fortran wrappers, respectively.

Creating a Wrapper Library

Two header files are used to compile the C wrapper library: fftw2_mkl.h and fftw.h.
The fftw2_mkl.h file is located in the \interfaces\fftw2xc\wrappers subdirectory in the Intel MKL directory.

Three header files are used to compile the Fortran wrapper library: fftw2_mkl.h, fftw2_f77_mkl.h, and fftw.h.
The fftw2_mkl.h and fftw2_f77_mkl.h files are located in the \interfaces\fftw2xf\wrappers subdirectory in the Intel MKL directory.

The file fftw.h, used to compile libraries for both interfaces and located in the \include\fftw subdirectory in the Intel MKL directory, slightly differs from the original FFTW (www.fftw.org) header file fftw.h.

A wrapper library contains C and Fortran wrappers for complex and real transforms in serial and multi-threaded mode for one of the two data types (double or float). The data type is managed by a makefile parameter.

Makefiles contain the following parameters: platform, compiler, precision and function. These parameters are described in the comment heading of the makefile.

platform is a required parameter. Possible values:

lib32 – 32-bit applications
libem64t – Intel® 64 architecture based applications
lib64 – IA-64 architecture based applications.

The remaining parameters are optional and have default values.

The compiler parameter may have values:

intel – Intel® compilers version 8.0 or higher, default
gnu – GNU compiler on Linux* or Mac OS* X
mc – Microsoft* C++ Compiler on Windows*

The precision parameter may have values:

MKL_DOUBLE – double-precision data, default
MKL_SINGLE – float, single-precision data.

The function parameter is not used for building a wrapper library.

Because the Fortran wrapper library is built by a C compiler, function names in the wrapper library and in the Fortran object module may differ. The file fftw2_f77_mkl.h in the \interfaces\fftw2xf\source subdirectory in the Intel MKL directory defines function names according to the names in the Fortran module. If a required name is missing from the file, you can edit the file to add the name.

Examples

The command
make lib64
builds a double-precision wrapper library for IA-64 architecture based applications using the Intel® C++ Compiler and Intel® Fortran Compiler version 8.0 or higher (the compilers and PRECISION=MKL_DOUBLE are chosen by default).
The command
make lib64 PRECISION=MKL_SINGLE
builds a single-precision wrapper library for IA-64 architecture based applications using the Intel C++ Compiler and Intel Fortran Compiler version 8.0 or higher (the compilers are chosen by default).

As a result, the wrapper library is created in the directory that holds the Intel MKL libraries for the selected platform, for example, \lib\64 or \ia32\lib.

In the wrapper library names, the suffix corresponds to the compiler used, and the letter preceding the underscore is "f" for Fortran and "c" for C.
For example,
     fftw2xf_intel.lib (Windows*)     libfftw2xf_intel.a (Linux* and Mac OS* X)
     fftw2xc_intel.lib (Windows)      libfftw2xc_intel.a (Linux and Mac OS X)
     fftw2xc_ms.lib (Windows)         libfftw2xc_gnu.a (Linux and Mac OS X).
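
The naming rule can be expressed as a small helper, for instance in a build script that needs to locate the library. This is a sketch under the assumption that the examples above cover the whole pattern; the function demo_wrapper_lib_name is hypothetical and is not shipped with Intel MKL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not shipped with Intel MKL): composes a wrapper
 * library file name following the pattern of the examples above.
 *   iface    - 'c' for the C wrappers, 'f' for the Fortran wrappers
 *   compiler - suffix as it appears in the file name, e.g. "intel" or "gnu"
 *   windows  - nonzero for the Windows form, zero for Linux and Mac OS X
 * Returns 0 on success, -1 on invalid arguments or if buf is too small. */
static int demo_wrapper_lib_name(char *buf, size_t size, char iface,
                                 const char *compiler, int windows)
{
    if (buf == NULL || compiler == NULL || strlen(compiler) == 0)
        return -1;
    if (iface != 'c' && iface != 'f')
        return -1;

    int len = windows
        ? snprintf(buf, size, "fftw2x%c_%s.lib", iface, compiler)
        : snprintf(buf, size, "libfftw2x%c_%s.a", iface, compiler);

    return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

Calling the helper with 'c', "gnu", and 0 yields libfftw2xc_gnu.a, matching the last example in the list.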

Application Assembling

The necessary original FFTW (www.fftw.org) header files are used without any modifications. The created wrapper library and the Intel MKL library are used instead of the FFTW library.

Running Examples

There are some examples that demonstrate how to use the wrapper library. The source code for the examples, makefiles used to run them, and the example list files are located in the \examples\fftw2xc and \examples\fftw2xf subdirectories in the Intel MKL directory for C and Fortran, respectively. To build examples, several additional files are needed: fftw.h, fftw_threads.h, rfftw.h, rfftw_threads.h, and fftw_f77.i. These files are distributed with permission from FFTW and are available in \include\fftw. The original files can also be found in FFTW 2.1.5 at http://www.fftw.org/download.html.

Parameters for the example makefiles are described in the makefiles comment heading and are the same as the wrapper library makefile parameters (see Creating a Wrapper Library). Example makefiles normally invoke examples. However, if the appropriate wrapper library is not yet created, the makefile will first build it in the same way as the wrapper library makefile does and then proceed to examples.

If the parameter function=<example_name> is defined, then only the specified example will run. Otherwise, all examples from the appropriate subdirectory will run. The subdirectory \_results will be created, and the results will be stored there in the example_name.res files.

MPI FFTW

MPI FFTW wrappers are available with Intel® MKL for Linux* and Windows*.

MPI FFTW Wrappers Reference

This section provides a reference for the MPI FFTW C interface.

Complex MPI FFTW

Complex One-dimensional MPI FFTW Transforms

fftw_mpi_plan fftw_mpi_create_plan(MPI_Comm comm, int n, fftw_direction dir, int flags);

void fftw_mpi(fftw_mpi_plan p, int n_fields, fftw_complex *local_data, fftw_complex *work);

void fftw_mpi_local_sizes(fftw_mpi_plan p, int *local_n, int *local_start, int *local_n_after_transform, int *local_start_after_transform, int *total_local_size);

void fftw_mpi_destroy_plan(fftw_mpi_plan plan);

Argument restrictions:

Supported values of flags are FFTW_ESTIMATE, FFTW_MEASURE, FFTW_SCRAMBLED_INPUT and FFTW_SCRAMBLED_OUTPUT. The same algorithm corresponds to all these values of the flags parameter. If any other flags value is supplied, the wrapper library reports an error "CDFT error in wrapper: unknown flags".
The only supported value of n_fields is 1.

Complex Multi-dimensional MPI FFTW Transforms


fftwnd_mpi_plan fftw2d_mpi_create_plan(MPI_Comm comm, int nx, int ny, fftw_direction dir, int flags);

fftwnd_mpi_plan fftw3d_mpi_create_plan(MPI_Comm comm, int nx, int ny, int nz, fftw_direction dir, int flags);

fftwnd_mpi_plan fftwnd_mpi_create_plan(MPI_Comm comm, int dim, int *n, fftw_direction dir, int flags);

void fftwnd_mpi(fftwnd_mpi_plan p, int n_fields, fftw_complex *local_data, fftw_complex *work, fftwnd_mpi_output_order output_order);

void fftwnd_mpi_local_sizes(fftwnd_mpi_plan p, int *local_nx, int *local_x_start, int *local_ny_after_transpose, int *local_y_start_after_transpose, int *total_local_size);

void fftwnd_mpi_destroy_plan(fftwnd_mpi_plan plan);

Argument restrictions:

Supported values of flags are FFTW_ESTIMATE and FFTW_MEASURE. If any other value of flags is supplied, the wrapper library reports an error "CDFT error in wrapper: unknown flags".
The only supported value of output_order is FFTW_NORMAL_ORDER. If any other value of output_order is supplied, the wrapper library reports an error "CDFT error in wrapper: unknown output_order".
The only supported value of n_fields is 1.
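
An application can check these restrictions itself before calling the wrappers. The sketch below is illustrative only: the DEMO_ constants are placeholders for the values that fftw.h and fftw_mpi.h define, the function demo_check_fftwnd_mpi_args is hypothetical, and it combines checks that the wrappers perform in separate calls (flags at plan creation, output_order and n_fields at transform time).

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder constants for illustration; real programs take the flag and
 * output-order values from fftw.h and fftw_mpi.h. */
enum { DEMO_ESTIMATE = 0, DEMO_MEASURE = 1 };
enum demo_output_order { DEMO_NORMAL_ORDER, DEMO_TRANSPOSED_ORDER };

/* Hypothetical pre-flight check. Returns NULL when the arguments satisfy
 * the restrictions listed above, otherwise a description of the problem. */
static const char *demo_check_fftwnd_mpi_args(int flags, int n_fields,
                                              enum demo_output_order order)
{
    if (flags != DEMO_ESTIMATE && flags != DEMO_MEASURE)
        return "CDFT error in wrapper: unknown flags";
    if (order != DEMO_NORMAL_ORDER)
        return "CDFT error in wrapper: unknown output_order";
    if (n_fields != 1)
        /* Assumed wording: the notes do not quote a message for this case. */
        return "n_fields must be 1";
    return NULL;
}
```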

Real MPI FFTW

Real Multi-dimensional MPI FFTW Transforms

The wrappers are empty and do nothing. If the wrappers are used, the wrapper library reports an error "CDFT error in wrapper".

Creating MPI FFTW Wrapper Library

The source code for the wrappers and makefiles with the wrapper list files are located in the \interfaces\fftw2x_cdft subdirectory in the Intel MKL directory.

A wrapper library contains C wrappers for Complex One-dimensional MPI FFTW Transforms and Complex Multi-dimensional MPI FFTW Transforms. The library also contains empty C wrappers for Real Multi-dimensional MPI FFTW Transforms. For details, see MPI FFTW Wrappers Reference.

The precision is managed by a makefile parameter.

Makefiles contain the following parameters: platform, compiler, precision, mpi, and mpidir. These parameters are described in the comment heading of the makefile.

platform is a required parameter. Possible values:

lib32 – 32-bit applications
libem64t – Intel® 64 architecture based applications
lib64 – IA-64 architecture based applications.

The mpidir parameter value is a path to the MPI installation directory. If this directory is specified in the PATH system variable, then you can omit the parameter.

The remaining parameters are also optional and have default values.

The compiler parameter may have values:

intel – Intel® compilers version 8.x or higher on Linux*, default
gnu – GNU compiler on Linux.

On Windows, this parameter is not used. The default Intel® compiler is used to build the library.

The precision parameter may have values:

MKL_DOUBLE – double-precision data, default
MKL_SINGLE – float, single-precision data.

The mpi parameter specifies the MPI library to be used. The parameter may have values:

intel1 – Intel® MPI 1.0.x on Linux.
intel2 – Intel® MPI 2.0.x on Linux, default for Linux.
intel3 – Intel® MPI 3.0.x on Linux.
mpich – MPICH 1.2.x on Linux.
mpich2 – MPICH2 1.0.x on Linux or Windows*, default for Windows.
msmpi – Microsoft* MPI library (for Intel® 64 architecture only) on Windows.

Examples

The command
make lib64
builds a double-precision wrapper library for IA-64 architecture based applications using Intel MPI 2.0 and the Intel® C++ Compiler version 8.x or higher on Linux (the compilers and PRECISION=MKL_DOUBLE are chosen by default).
The command
make lib32 mpi=mpich PRECISION=MKL_SINGLE
builds a single-precision wrapper library for 32-bit applications using MPICH 1.2.x and the Intel C++ Compiler version 8.x or higher on Linux (the compilers are chosen by default).

As a result, the wrapper library is created in the directory that holds the Intel MKL libraries for the selected platform, for example, \lib\64 or \ia32\lib.

In the wrapper library names, the suffix corresponds to the used data precision.
For example,
fftw2x_cdft_SINGLE.lib for Windows
libfftw2x_cdft_DOUBLE.a for Linux.

Application Assembling with MPI FFTW Wrapper Library

The necessary original FFTW (www.fftw.org) header files are used without any modifications. The created MPI FFTW wrapper library and the Intel MKL library are used instead of the FFTW library.

Running Examples of MPI FFTW Wrappers

There are some examples that demonstrate how to use the MPI FFTW wrapper library. The source C code for the examples, makefiles used to run them, and the example list files are located in the \examples\fftw2x_cdft subdirectory in the Intel MKL directory. To build examples, one additional file fftw_mpi.h is needed. This file is distributed with permission from FFTW and is available in \include\fftw. The original file can also be found in FFTW 2.1.5 at http://www.fftw.org/download.html.

Parameters for the example makefiles are described in the makefiles comment heading and are the same as the wrapper library makefile parameters (see Creating MPI FFTW Wrapper Library) except for precision, which takes different values:

FFTW_ENABLE_DOUBLE – double-precision data, default
FFTW_ENABLE_FLOAT – float, single-precision data.

The table below lists examples available in the \examples\fftw2x_cdft\source subdirectory.

Example source file	Description
wrappers_c1d.c	One-dimensional Complex MPI FFTW transform, using plan = fftw_mpi_create_plan(...)
wrappers_c2d.c	Two-dimensional Complex MPI FFTW transform, using plan = fftw2d_mpi_create_plan(...)
wrappers_c3d.c	Three-dimensional Complex MPI FFTW transform, using plan = fftw3d_mpi_create_plan(...)
wrappers_c4d.c	Four-dimensional Complex MPI FFTW transform, using plan = fftwnd_mpi_create_plan(...)

Technical Support

Please see the Intel MKL support website at http://www.intel.com/support/performancetools/libraries/mkl/.

Disclaimer and Legal Information

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.
