Real even/odd DFTs (cosine/sine transforms) (FFTW 3.3.11)

Next: The Discrete Hartley Transform, Previous: The Halfcomplex-format DFT, Up: More DFTs of Real Data [Contents][Index]

2.5.2 Real even/odd DFTs (cosine/sine transforms) ¶

The Fourier transform of a real-even function f(-x) = f(x) is real-even, and i times the Fourier transform of a real-odd function f(-x) = -f(x) is real-odd. Similar results hold for a discrete Fourier transform, and thus for these symmetries the need for complex inputs/outputs is entirely eliminated. Moreover, one gains a factor of two in speed/space from the fact that the data are real, and an additional factor of two from the even/odd symmetry: only the non-redundant (first) half of the array need be stored. The result is the real-even DFT (REDFT) and the real-odd DFT (RODFT), also known as the discrete cosine and sine transforms (DCT and DST), respectively.

(In this section, we describe the 1d transforms; multi-dimensional transforms are just a separable product of these transforms operating along each dimension.)

Because of the discrete sampling, one has an additional choice: is the data even/odd around a sampling point, or around the point halfway between two samples? The latter corresponds to shifting the samples by half an interval, and gives rise to several transform variants denoted by REDFTab and RODFTab: a and b are 0 or 1, and indicate whether the input (a) and/or output (b) are shifted by half a sample (1 means it is shifted). These are also known as types I-IV of the DCT and DST, and all four types are supported by FFTW’s r2r interface.³

The r2r kinds for the various REDFT and RODFT types supported by FFTW, along with the boundary conditions at both ends of the input array (n real numbers in[j=0..n-1]), are:

FFTW_REDFT00 (DCT-I): even around j=0 and even around j=n-1.
FFTW_REDFT10 (DCT-II, “the” DCT): even around j=-0.5 and even around j=n-0.5.
FFTW_REDFT01 (DCT-III, “the” IDCT): even around j=0 and odd around j=n.
FFTW_REDFT11 (DCT-IV): even around j=-0.5 and odd around j=n-0.5.
FFTW_RODFT00 (DST-I): odd around j=-1 and odd around j=n.
FFTW_RODFT10 (DST-II): odd around j=-0.5 and odd around j=n-0.5.
FFTW_RODFT01 (DST-III): odd around j=-1 and even around j=n-1.
FFTW_RODFT11 (DST-IV): odd around j=-0.5 and even around j=n-0.5.

Note that these symmetries apply to the “logical” array being transformed; there are no constraints on your physical input data. So, for example, if you specify a size-5 REDFT00 (DCT-I) of the data abcde, it corresponds to the DFT of the logical even array abcdedcb of size 8. A size-4 REDFT10 (DCT-II) of the data abcd corresponds to the size-8 logical DFT of the even array abcddcba, shifted by half a sample.

All of these transforms are invertible. The inverse of R*DFT00 is R*DFT00; of R*DFT10 is R*DFT01 and vice versa (these are often called simply “the” DCT and IDCT, respectively); and of R*DFT11 is R*DFT11. However, the transforms computed by FFTW are unnormalized, exactly like the corresponding real and complex DFTs, so computing a transform followed by its inverse yields the original array scaled by N, where N is the logical DFT size. For REDFT00, N=2(n-1); for RODFT00, N=2(n+1); otherwise, N=2n.

Note that the boundary conditions of the transform output array are given by the input boundary conditions of the inverse transform. Thus, the above transforms are all inequivalent in terms of input/output boundary conditions, even neglecting the 0.5 shift difference.

FFTW is most efficient when N is a product of small factors; note that this differs from the factorization of the physical size n for REDFT00 and RODFT00! There is another oddity: n=1 REDFT00 transforms correspond to N=0, and so are not defined (the planner will return NULL). Otherwise, any positive n is supported.

For the precise mathematical definitions of these transforms as used by FFTW, see What FFTW Really Computes. (For people accustomed to the DCT/DST, FFTW’s definitions have a coefficient of 2 in front of the cos/sin functions so that they correspond precisely to an even/odd DFT of size N. Some authors also include additional multiplicative factors of √2 for selected inputs and outputs; this makes the transform orthogonal, but sacrifices the direct equivalence to a symmetric DFT.)

Which type do you need? ¶

Since the required flavor of even/odd DFT depends upon your problem, you are the best judge of this choice, but we can make a few comments on relative efficiency to help you in your selection. In particular, R*DFT01 and R*DFT10 tend to be slightly faster than R*DFT11 (especially for odd sizes), while the R*DFT00 transforms are sometimes significantly slower (especially for even sizes).⁴

Thus, if only the boundary conditions on the transform inputs are specified, we generally recommend R*DFT10 over R*DFT00 and R*DFT01 over R*DFT11 (unless the half-sample shift or the self-inverse property is significant for your problem).

If performance is important to you and you are using only small sizes (say n<200), e.g. for multi-dimensional transforms, then you might consider generating hard-coded transforms of those sizes and types that you are interested in (see Generating your own code).

We are interested in hearing what types of symmetric transforms you find most useful.

Footnotes

(3)

There are also type V-VIII transforms, which correspond to a logical DFT of odd size N, independent of whether the physical size n is odd, but we do not support these variants.

(4)

R*DFT00 is sometimes slower in FFTW because we discovered that the standard algorithm for computing this by a pre/post-processed real DFT—the algorithm used in FFTPACK, Numerical Recipes, and other sources for decades now—has serious numerical problems: it already loses several decimal places of accuracy for 16k sizes. There seem to be only two alternatives in the literature that do not suffer similarly: a recursive decomposition into smaller DCTs, which would require a large set of codelets for efficiency and generality, or sacrificing a factor of 2 in speed to use a real DFT of twice the size. We currently employ the latter technique for general n, as well as a limited form of the former method: a split-radix decomposition when n is odd (N a multiple of 4). For N containing many factors of 2, the split-radix method seems to recover most of the speed of the standard algorithm without the accuracy tradeoff.