Introduction
Wrappers Reference
1. Wrappers for Using Plans
2. Basic Interface Wrappers
3. Advanced Interface Wrappers
4. Guru Interface Wrappers
5. Wisdom Wrappers
6. Memory Allocation
Parallel Mode
Calling Wrappers from Fortran
Installation
Creating a Wrapper Library
Application Assembling
Running Examples
Technical Support
Disclaimer and Legal Information
Introduction
This document describes a collection of wrappers that layer the FFTW interface on top of the Intel® Math Kernel Library (Intel® MKL): each wrapper translates an FFTW call into calls to the Intel MKL Fourier transform (DFTI) or Trigonometric Transform (TT) interface. The wrappers correspond to FFTW version 3.x and Intel MKL versions 7.0 and later.
The purpose of this set of wrappers is to enable developers whose programs currently use FFTW to gain performance with the Intel MKL Fourier transforms without changing the program source code. The only required change is to use a modified fftw3.h header file (see Creating a Wrapper Library).
Because the FFTW and Intel MKL DFTI/TT functionalities differ, the wrappers are subject to a number of restrictions compared with the FFTW functions they replace, and some FFTW functions have empty wrappers. However, many typical DFTs can be computed using these wrappers.
Refer to the Intel MKL DFTI/TT documentation to better understand the effects of using the wrappers.
Additional wrappers may be added in the future to extend FFTW functionality available with Intel MKL.
Wrappers Reference
This section provides a reference for the wrappers to the FFTW C interface.
Each FFTW function has its own wrapper. Some wrappers, which are not expressly listed below, are empty and do nothing; they are still needed to satisfy the function calls and avoid link errors.
Note that Intel MKL DFTI operates on single-precision (float) and double-precision data types and does not support the long double data type used by the FFTW functions.
1. Wrappers for Using Plans
void fftw_execute(const fftw_plan plan);
void fftw_destroy_plan(const fftw_plan plan);
void fftwf_execute(const fftwf_plan plan);
void fftwf_destroy_plan(const fftwf_plan plan);
2. Basic Interface Wrappers
Wrappers for execution and plan destruction functions are listed in Wrappers for Using Plans.
2.1 Complex DFTs
fftw_plan fftw_plan_dft_1d(int n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft_2d(int nx, int ny, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft_3d(int nx, int ny, int nz, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft(int rank, const int *n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_1d(int n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_2d(int nx, int ny, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_3d(int nx, int ny, int nz, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft(int rank, const int *n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
Argument restrictions. The same algorithm is used for all values of the flags parameter.
2.2 Real-Data DFTs
fftw_plan fftw_plan_dft_r2c(int rank, const int *n, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_2d(int nx, int ny, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_3d(int nx, int ny, int nz, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r(int rank, const int *n, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_1d(int n, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_2d(int nx, int ny, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_3d(int nx, int ny, int nz, fftw_complex *in, double *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c(int rank, const int *n, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_1d(int n, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_2d(int nx, int ny, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_3d(int nx, int ny, int nz, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r(int rank, const int *n, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_1d(int n, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_2d(int nx, int ny, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_3d(int nx, int ny, int nz, fftwf_complex *in, float *out, unsigned flags);
Argument restrictions. The same algorithm is used for all values of the flags parameter.
2.3 Real-to-Real Transforms
Currently, only real 1D even/odd DFTs (cosine/sine transforms) are supported.
fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out, fftw_r2r_kind kind, unsigned flags);
fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out, const fftw_r2r_kind *kind, unsigned flags);
Argument restrictions and extension:
The same algorithm is used for all values of the flags parameter.
A new value MKL_RODFT00 of the kind parameter was introduced in the wrappers. For better performance, you are strongly encouraged to use this value rather than FFTW_RODFT00 and to provide input/output vectors that have an extra first element equal to 0.0.
For example, let the input vector in1 for the function call
plan1=fftw_plan_r2r_1d(n, in1, out1, FFTW_RODFT00, FFTW_ESTIMATE);
be (u, v, w) of length 3. To accomplish the same transform with kind = MKL_RODFT00, that is, with the function call
plan1=fftw_plan_r2r_1d(n, in2, out2, MKL_RODFT00, FFTW_ESTIMATE);
the input vector in2 should be (0.0, u, v, w) of length 4. Similarly, whereas the result out1 is (x, y, z), the result out2 is (0.0, x, y, z).
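The padded layout used with MKL_RODFT00 can be prepared with a small helper. The following is a minimal sketch in plain C; the function names pad_for_mkl_rodft00 and unpad_from_mkl_rodft00 are hypothetical and are not part of FFTW or Intel MKL. The sketch only builds the (0.0, u, v, w) input from (u, v, w) and strips the leading element from a padded result; plan creation and execution are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an FFTW or Intel MKL function):
 * copies the n values of src into dst[1..n] and sets dst[0] to 0.0,
 * giving the layout described for kind = MKL_RODFT00.
 * dst must have room for n + 1 elements. */
void pad_for_mkl_rodft00(const double *src, size_t n, double *dst)
{
    assert(src != NULL && dst != NULL);
    dst[0] = 0.0;
    memcpy(dst + 1, src, n * sizeof *src);
}

/* Hypothetical helper: copies the n meaningful values of a padded
 * result, padded[1..n], into dst, dropping the leading element. */
void unpad_from_mkl_rodft00(const double *padded, size_t n, double *dst)
{
    assert(padded != NULL && dst != NULL);
    memcpy(dst, padded + 1, n * sizeof *padded);
}
```

For in1 = (u, v, w), the call pad_for_mkl_rodft00(in1, 3, in2) fills in2 with (0.0, u, v, w). Applications that can keep their data in the padded form throughout avoid these copies entirely.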
3. Advanced Interface Wrappers
Wrappers for execution and plan destruction functions are listed in Wrappers for Using Plans.
3.1 Advanced Complex DFTs
fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);
fftwf_plan fftwf_plan_many_dft(int rank, const int *n, int howmany, fftwf_complex *in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);
Argument restrictions. The same algorithm is used for all values of the flags parameter.
3.2 Advanced Real-Data DFTs
fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, double *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, unsigned flags);
fftwf_plan fftwf_plan_many_dft_r2c(int rank, const int *n, int howmany, float *in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, unsigned flags);
fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, double *out, const int *onembed, int ostride, int odist, unsigned flags);
fftwf_plan fftwf_plan_many_dft_c2r(int rank, const int *n, int howmany, fftwf_complex *in, const int *inembed, int istride, int idist, float *out, const int *onembed, int ostride, int odist, unsigned flags);
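The advanced interface describes the data layout through the stride and dist arguments. As a sketch of the convention documented by FFTW for the one-dimensional case (rank = 1), element j of transform k is located at index j*istride + k*idist of the input array, and likewise with ostride and odist for the output array. The helper name many_dft_index below is hypothetical, and it is an assumption here, not a statement from this document, that the wrappers follow the same convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTW or Intel MKL).
 * Returns the array index of element j of transform k for a batch of
 * one-dimensional transforms laid out with the given stride and dist,
 * using the FFTW convention index = j*stride + k*dist. */
size_t many_dft_index(size_t j, size_t k, int stride, int dist)
{
    assert(stride > 0 && dist >= 0);
    return j * (size_t)stride + k * (size_t)dist;
}
```

For howmany = 3 transforms of length 8 stored one after another, istride = 1 and idist = 8, so element 2 of transform 1 is at index 10. If the same batch is stored interleaved, istride = 3 and idist = 1, and the same element is at index 2*3 + 1 = 7.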
3.3 Advanced Real-to-Real Transforms
All wrappers are empty and do nothing. The wrappers may be added in later versions of Intel MKL.
4. Guru Interface Wrappers
4.1 Guru Complex DFTs
fftw_plan fftw_plan_guru_dft(int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_guru_dft(int rank, const fftwf_iodim *dims, int howmany_rank, const fftwf_iodim *howmany_dims, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
Argument restrictions. The same algorithm is used for all values of the flags parameter. The only supported value of howmany_rank is 1.
The rest of the wrappers are empty and do nothing, as the Intel MKL DFTI currently does not support split arrays.
4.2 Guru Real-Data DFTs
All wrappers are empty and do nothing.
Real-data wrappers (without support of split arrays) may be added in later versions of Intel MKL.
4.3 Guru Real-to-Real Transforms
All wrappers are empty and do nothing. The wrappers may be added in later versions of Intel MKL.
4.4 Guru Execution of Plans
void fftw_execute_dft(const fftw_plan p, fftw_complex *in, fftw_complex *out);
void fftw_execute_dft_r2c(const fftw_plan p, double *in, fftw_complex *out);
void fftw_execute_dft_c2r(const fftw_plan p, fftw_complex *in, double *out);
void fftwf_execute_dft(const fftwf_plan p, fftwf_complex *in, fftwf_complex *out);
void fftwf_execute_dft_r2c(const fftwf_plan p, float *in, fftwf_complex *out);
void fftwf_execute_dft_c2r(const fftwf_plan p, fftwf_complex *in, float *out);
The rest of the wrappers are empty and do nothing.
Real-data wrappers (without support of split arrays) will be added in later versions of Intel MKL.
Wrappers for more execution and plan destruction functions are listed in Wrappers for Using Plans.
5. Wisdom Wrappers
All wisdom wrappers are empty and do nothing, as the Intel MKL DFTI currently does not support this functionality.
6. Memory Allocation
void* fftw_malloc(size_t n);
void fftw_free(void* x);
void* fftwf_malloc(size_t n);
void fftwf_free(void* x);
Unlike the FFTW functions fftw_malloc and fftwf_malloc, the wrappers of the same names do not align the allocated array. To obtain aligned DFT data, it is necessary to allocate extra memory and shift the array address. See also the Managing Performance and Memory chapter in the Intel MKL User's Guide (file userguide.pdf).
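The "allocate extra memory and shift the array address" technique can be sketched as follows. This is a minimal illustration, not code from Intel MKL: the type aligned_block and the function alloc_aligned are hypothetical names, and the standard malloc stands in for the fftw_malloc wrapper. The appropriate alignment value is also an assumption to be checked against the Intel MKL User's Guide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type (not part of FFTW or Intel MKL). */
typedef struct {
    void *raw;     /* address returned by the allocator; release this one */
    void *aligned; /* address inside the block, rounded up to the alignment */
} aligned_block;

/* Allocates nbytes usable bytes whose start address is a multiple of
 * alignment (which must be a power of two). The function over-allocates
 * by alignment - 1 bytes and shifts the address forward.
 * malloc stands in for fftw_malloc here.
 * On failure both members of the result are NULL. */
aligned_block alloc_aligned(size_t nbytes, size_t alignment)
{
    aligned_block b = { NULL, NULL };
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    if (nbytes > SIZE_MAX - (alignment - 1))
        return b; /* the size computation would overflow */
    b.raw = malloc(nbytes + alignment - 1);
    if (b.raw != NULL) {
        uintptr_t addr = (uintptr_t)b.raw;
        size_t shift = (size_t)((alignment - addr % alignment) % alignment);
        b.aligned = (unsigned char *)b.raw + shift;
    }
    return b;
}
```

The aligned member is the pointer to pass as the in or out argument of a plan, while the raw member must be kept and passed to the matching deallocation function, because the shifted address is not the one the allocator returned.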
Parallel Mode
FFTW multi-threaded functions use the number of threads parameter, which the function fftw_init_threads defines and the function fftw_plan_with_nthreads sets. However, the wrappers to these functions and the fftw_cleanup_threads wrapper are empty and do nothing, as the Intel MKL DFTI implements a different parallelization mechanism. To use Intel MKL DFTI routines in parallel mode or to call wrappers from a multi-threaded application, refer to the Intel MKL documentation to learn how to manage the number of threads.
Calling Wrappers from Fortran
Wrappers are available for all the Fortran FFTW functions. The FFTW Fortran functions are themselves wrappers to the FFTW C functions, and the Fortran wrappers are likewise wrappers to the C wrappers. Their functionality and argument restrictions are therefore the same as those of the corresponding C wrappers.
DFFTW_EXECUTE(PLAN)
DFFTW_DESTROY_PLAN(PLAN)
SFFTW_EXECUTE(PLAN)
SFFTW_DESTROY_PLAN(PLAN)
DFFTW_PLAN_DFT_1D(PLAN, N, IN, OUT, SIGN, FLAGS)
DFFTW_PLAN_DFT_2D(PLAN, NX, NY, IN, OUT, SIGN, FLAGS)
DFFTW_PLAN_DFT_3D(PLAN, NX, NY, NZ, IN, OUT, SIGN, FLAGS)
DFFTW_PLAN_DFT(PLAN, RANK, N, IN, OUT, SIGN, FLAGS)
SFFTW_PLAN_DFT_1D(PLAN, N, IN, OUT, SIGN, FLAGS)
SFFTW_PLAN_DFT_2D(PLAN, NX, NY, IN, OUT, SIGN, FLAGS)
SFFTW_PLAN_DFT_3D(PLAN, NX, NY, NZ, IN, OUT, SIGN, FLAGS)
SFFTW_PLAN_DFT(PLAN, RANK, N, IN, OUT, SIGN, FLAGS)
DFFTW_PLAN_DFT_R2C(PLAN, RANK, N, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_R2C_1D(PLAN, N, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_R2C_2D(PLAN, NX, NY, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_R2C_3D(PLAN, NX, NY, NZ, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_C2R(PLAN, RANK, N, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_C2R_1D(PLAN, N, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_C2R_2D(PLAN, NX, NY, IN, OUT, FLAGS)
DFFTW_PLAN_DFT_C2R_3D(PLAN, NX, NY, NZ, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_R2C(PLAN, RANK, N, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_R2C_1D(PLAN, N, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_R2C_2D(PLAN, NX, NY, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_R2C_3D(PLAN, NX, NY, NZ, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_C2R(PLAN, RANK, N, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_C2R_1D(PLAN, N, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_C2R_2D(PLAN, NX, NY, IN, OUT, FLAGS)
SFFTW_PLAN_DFT_C2R_3D(PLAN, NX, NY, NZ, IN, OUT, FLAGS)
DFFTW_PLAN_MANY_DFT(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, SIGN, FLAGS)
SFFTW_PLAN_MANY_DFT(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, SIGN, FLAGS)
DFFTW_PLAN_MANY_DFT_R2C(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, FLAGS)
SFFTW_PLAN_MANY_DFT_R2C(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, FLAGS)
DFFTW_PLAN_MANY_DFT_C2R(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, FLAGS)
SFFTW_PLAN_MANY_DFT_C2R(PLAN, RANK, N, HOWMANY, IN, INEMBED, ISTRIDE, IDIST, OUT, ONEMBED, OSTRIDE, ODIST, FLAGS)
DFFTW_PLAN_GURU_DFT(PLAN, RANK, DIMS_N, DIMS_IS, DIMS_OS, HOWMANY_RANK, HOWMANY_DIMS_N, HOWMANY_DIMS_IS, HOWMANY_DIMS_OS, IN, OUT, SIGN, FLAGS)
SFFTW_PLAN_GURU_DFT(PLAN, RANK, DIMS_N, DIMS_IS, DIMS_OS, HOWMANY_RANK, HOWMANY_DIMS_N, HOWMANY_DIMS_IS, HOWMANY_DIMS_OS, IN, OUT, SIGN, FLAGS)
DFFTW_EXECUTE_DFT(PLAN, IN, OUT)
DFFTW_EXECUTE_DFT_R2C(PLAN, IN, OUT)
DFFTW_EXECUTE_DFT_C2R(PLAN, IN, OUT)
SFFTW_EXECUTE_DFT(PLAN, IN, OUT)
SFFTW_EXECUTE_DFT_R2C(PLAN, IN, OUT)
SFFTW_EXECUTE_DFT_C2R(PLAN, IN, OUT)
Installation
Wrappers are delivered as source code, which you must compile to build the wrapper library. The FFTW library can then be replaced by the wrapper library and the Intel MKL libraries. The source code for the wrappers and the makefiles with the wrapper list files are located in the \interfaces\fftw3xc and \interfaces\fftw3xf subdirectories of the Intel MKL directory for C and Fortran wrappers, respectively.
Creating a Wrapper Library
Two header files are used to compile the C wrapper library: fftw3_mkl.h and fftw3.h. The fftw3_mkl.h file is located in the \interfaces\fftw3xc\wrappers subdirectory in the Intel MKL directory.
Three header files are used to compile the Fortran wrapper library: fftw3_mkl.h, fftw3_f77_mkl.h, and fftw3.h. The fftw3_mkl.h and fftw3_f77_mkl.h files are located in the \interfaces\fftw3xf\wrappers subdirectory in the Intel MKL directory.
The file fftw3.h, used to compile libraries for both interfaces and located in the \include\fftw subdirectory in the Intel MKL directory, differs slightly from the original FFTW (www.fftw.org) header file fftw3.h in that all lines containing calls to fftw3.lib are commented out.
As the Fortran wrapper library is built by a C compiler, function names in the wrapper library and in the Fortran object module may differ. The file fftw3_f77_mkl.h in the \interfaces\fftw3xf\wrappers subdirectory in the Intel MKL directory defines function names according to the names in the Fortran module. If a required name is missing from the file, you can edit the file to add it.
Makefiles contain the following parameters: platform (required), compiler, and function. These parameters are described in the comment heading of each makefile.
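Examples
The command
make lib64
specifies only the required platform parameter (lib64). The command
make lib64 compiler=gnu
additionally sets the compiler parameter to gnu.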
In the wrapper library names, the suffix after the underscore corresponds to the compiler used, and the letter before the underscore is "f" for Fortran and "c" for C. For example:
fftw3xf_intel.lib (Windows*), libfftw3xf_intel.a (Linux* and Mac OS* X)
fftw3xc_intel.lib (Windows), libfftw3xc_intel.a (Linux and Mac OS X)
fftw3xc_ms.lib (Windows), libfftw3xc_gnu.a (Linux and Mac OS X).
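The naming pattern of the wrapper libraries can be expressed as a short function, which may be convenient in a build script generator. This is only a sketch of the pattern as read from the examples above; the function wrapper_lib_name is a hypothetical name and is not shipped with Intel MKL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not shipped with Intel MKL).
 * Writes the wrapper library file name into buf, following the pattern
 * seen in the example names:
 *   Windows:            fftw3x<iface>_<compiler>.lib
 *   Linux and Mac OS X: libfftw3x<iface>_<compiler>.a
 * iface is 'c' for the C wrappers or 'f' for the Fortran wrappers.
 * Returns 0 on success, -1 if buf is too small. */
int wrapper_lib_name(char *buf, size_t size, char iface,
                     const char *compiler, int is_windows)
{
    int len;
    assert(buf != NULL && compiler != NULL);
    assert(iface == 'c' || iface == 'f');
    assert(strlen(compiler) > 0);
    len = snprintf(buf, size, "%sfftw3x%c_%s%s",
                   is_windows ? "" : "lib", iface, compiler,
                   is_windows ? ".lib" : ".a");
    return (len >= 0 && (size_t)len < size) ? 0 : -1;
}
```

With iface = 'f', compiler = "intel", and is_windows set, the function writes fftw3xf_intel.lib; with iface = 'c', compiler = "gnu", and is_windows cleared, it writes libfftw3xc_gnu.a.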
Application Assembling
The adapted fftw3.h header file (see above) should be used when you build C applications. The native fftw3.h header file should be used when you build Fortran applications.
Running Examples
Several examples demonstrate how to use the wrapper library. The source code for the examples, the makefiles used to run them, and the example list files are located in the \examples\fftw3xc and \examples\fftw3xf subdirectories in the Intel MKL directory. To build the Fortran examples, one additional file, fftw3.f, is needed. This file is distributed with permission from FFTW and is available in the \include\fftw subdirectory of the Intel MKL directory. The original file can also be found in FFTW 3.1 at http://www.fftw.org/download.html.
Example makefile parameters are the same as the wrapper library makefile parameters. Example makefiles normally run the examples. However, if the appropriate wrapper library has not yet been created, the makefile first builds it in the same way as the wrapper library makefile does and then proceeds to the examples.
If the parameter function=<example_name> is defined, only the specified example runs. Otherwise, all examples from the appropriate subdirectory run. The subdirectory \_results is created, and the results are stored there in the example_name.res files.
Disclaimer and Legal Information
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO
LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY
RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS
OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS
ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING
LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY,
OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED
NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD
CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without
notice. Designers must not rely on the absence or characteristics of any features
or instructions marked "reserved" or "undefined." Intel reserves these for future
definition and shall have no responsibility whatsoever for conflicts or incompatibilities
arising from future changes to them. The information here is subject to change without
notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known
as errata which may cause the product to deviate from published specifications.
Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications
and before placing your product order.
Copies of documents which have an order number and are referenced in this document,
or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting
Intel's Web Site.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.
* Other names and brands may be claimed as the property of others.
Copyright (C) 2006-2007, Intel Corporation. All rights reserved.