Using FFTW

- Q3.1. Why not support the FFTW 2 interface in FFTW 3?
- Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
- Q3.3. FFTW seems really slow.
- Q3.4. FFTW slows down after repeated calls.
- Q3.5. An FFTW routine is crashing when I call it.
- Q3.6. My Fortran program crashes when calling FFTW.
- Q3.7. FFTW gives results different from my old FFT.
- Q3.8. FFTW gives different results between runs
- Q3.9. Can I save FFTW's plans?
- Q3.10. Why does your inverse transform return a scaled result?
- Q3.11. How can I make FFTW put the origin (zero frequency) at the center of its output?
- Q3.12. How do I FFT an image/audio file in
*foobar*format? - Q3.13. My program does not link (on Unix).
- Q3.14. I included your header, but linking still fails.
- Q3.15. My program crashes, complaining about stack space.
- Q3.16. FFTW seems to have a memory leak.
- Q3.17. The output of FFTW's transform is all zeros.
- Q3.18. How do I call FFTW from the Microsoft language du jour?
- Q3.19. Can I compute only a subset of the DFT outputs?
- Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix transposition?

- It was important for performance reasons that the plan be specific to array characteristics like the stride (and alignment, for SIMD), and requiring that the user maintain these invariants is error prone.
- In most high-performance applications, as far as we can tell, you are usually transforming the same array over and over, so FFTW's semantics should not be a burden.
- If you need to transform another array of the same size, creating a new plan once the first exists is a cheap operation.
- If you need to transform many arrays of the same size at once, you
should really use the
`plan_many`

routines in FFTW's "advanced" interface. - If the abovementioned array characteristics are the same, you are willing to pay close attention to the documentation, and you really need to, we provide a "new-array execution" interface to apply a plan to a new array.

- First, you create a plan. This will take several seconds.
- Then, you reuse the plan many times to perform FFTs. These are fast.

`FFTW_ESTIMATE`

option in the planner, which uses heuristics
instead of runtime measurements and produces a good plan in a short
time. Second, you can use the wisdom feature to precompute the plan;
see Q3.9 `Can I save FFTW's plans?'
`make check`

, or `cd tests; make bigcheck`

if you want to be paranoid)? If so, you almost
certainly have a bug in your own code. For example, you could be
passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
you could simply have memory corruption elsewhere in your program that
causes random crashes later on. Please don't complain to us unless
you can come up with a minimal self-contained program (preferably
under 30 lines) that illustrates the problem.
`integer*8`

. We recommend using `integer*8`

on 32-bit machines as well, to simplify porting.
`FFTW_FORWARD`

/`FFTW_BACKWARD`

directions correspond to signs of -1/+1 in the exponent of the DFT definition.
(You should also know that we compute an unnormalized transform. In contrast, Matlab is an example of program that computes a normalized transform. See Q3.10 `Why does your inverse transform return a scaled result?'.

Finally, note that floating-point arithmetic is not exact, so different FFT algorithms will give slightly different results (on the order of the numerical accuracy; typically a fractional difference of 1e-15 or so in double precision).

`FFTW_MEASURE`

or `FFTW_PATIENT`

mode, then the algorithm FFTW employs is not deterministic: it depends on
runtime performance measurements. This will cause the results to vary
slightly from run to run. However, the differences should be slight,
on the order of the floating-point precision, and therefore should
have no practical impact on most applications.
If you use saved plans (wisdom) or `FFTW_ESTIMATE`

mode, however, then the algorithm is deterministic and the results should be
identical between runs.

`wisdom`

mechanism for saving plans; see the FFTW manual.
`-lfftw3 -lm`

for FFTW 3.x) and `#include <fftw3.h>`

. (Yes, this is really a FAQ.)
`fftw_complex array[N]`

); you should use `fftw_malloc`

(or equivalent) to allocate the arrays you want
to transform if they are larger than a few hundred elements.
`fftw_cleanup()`

, as documented in the manual.
`FFTW_ESTIMATE`

: planning with `FFTW_MEASURE`

or `FFTW_PATIENT`

overwrites the input/output arrays, as described in the manual.
There are some specific cases in which you can get the O(N log K) performance benefits easily, however, by combining a few ordinary FFTs. In particular, the case where you want the first K outputs, where K divides N, can be handled by performing N/K transforms of size K and then summing the outputs multiplied by appropriate phase factors. For more details, see pruned FFTs with FFTW.

There are also some algorithms that compute pruned transforms
*approximately*, but they are beyond the scope of this FAQ.

For double-valued data stored in row-major format, plan creation looks like this:

fftw_plan plan_transpose(int rows, int cols, double *in, double *out) { const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */ fftw_iodim howmany_dims[2]; howmany_dims[0].n = rows; howmany_dims[0].is = cols; howmany_dims[0].os = 1; howmany_dims[1].n = cols; howmany_dims[1].is = 1; howmany_dims[1].os = rows; return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL, /*howmany_rank=*/ 2, howmany_dims, in, out, /*kind=*/ NULL, flags); }(This entry was written by Rhys Ulerich.)

Next: Internals of FFTW.

Back: Installing FFTW.

Return to contents.

Matteo Frigo and Steven G. Johnson / fftw@fftw.org - 04 March 2014

Extracted from FFTW Frequently Asked Questions with Answers, Copyright © 2014 Matteo Frigo and Massachusetts Institute of Technology.