Installation of FFTW Version 3.0.1 on Microsoft Windows
========================================================

Last updated: 15/MAR/04 -- updated acknowledgements.

Author: Mark G. Beckett (g.beckett@epcc.ed.ac.uk).
        EPCC, The University of Edinburgh, EH9 3JZ, Scotland.

Acknowledgements
================

Thanks to Gavin J. Pringle and Ratnadeep Abrol (both of The University
of Edinburgh, Scotland) for their comments on this document and the
accompanying document "Cross-calling FFTW from Borland Delphi".

The work was carried out under EC funding (Contract No.
IST-2000-29598/3D-PATHOLOGY).

Introduction
============

These instructions explain how to build the FFTW libraries on Microsoft
Windows. They are tailored towards Microsoft Windows 2000 and use the
Borland C++ command-line compiler suite, Version 5.5. This suite may be
freely downloaded from Borland's website, http://www.borland.com, and
includes comprehensive installation instructions.

Following the instructions below, we build the single-precision version
of the FFTW library. Both a static version (FFTW3.LIB) and a dynamic
version (FFTW3.DLL) of the library are created.

Note that, currently, only the static FFTW3.LIB version of the library
is validated using the check code in the tests\ directory. This should
not represent a significant risk, since both the LIB and DLL versions
are assembled from the same object files.

Note: We have been unable to build the SSE instructions extension to
the library using this version of the Borland compiler. The SSE
instruction set gives better single-precision performance on
Pentium III, Pentium IV, etc.

(1) Download the FFTW source archive from http://www.fftw.org/ and
unpack it into a suitable location. At the time of writing, the current
stable version number is 3.0.1.

(2) Edit "config.h" to match your hardware/software configuration. A
sample "config.h" is included in the distribution.
Note that on MS Windows-based systems, most compilers do not allow the
user to force data to be aligned on 16-byte boundaries (such alignment
gives better performance for double-precision arithmetic with SIMD
extensions). FFTW provides a workaround, using its own version of
malloc to allocate correctly aligned memory. To enable this, you should
set the following config.h variables:

    /* Define to enable alignment debugging hacks. */
    #define FFTW_DEBUG_ALIGNMENT 1

    /* Define to 1 if you have the declaration of `memalign', and to 0
       if you don't. */
    #define HAVE_DECL_MEMALIGN 0

    /* Define to 1 if you have the declaration of `posix_memalign', and
       to 0 if you don't. */
    #define HAVE_DECL_POSIX_MEMALIGN 0

There seems to be some inconsistency in the way in which
FFTW_DEBUG_ALIGNMENT is used. We found that we needed to change the
#ifdef at line 730 of "ifftw.h" to an #ifndef, to stop FFTW from
checking that the compiler has aligned data correctly.

For a first attempt, we recommend that you do not try to compile the
hardware optimisations for SSE, SSE2, 3DNOW, etc. We have been unable
to build the SSE extended instructions using the Borland Compiler
Version 5.5.

(2a) Selecting single or double precision (or long double, if your
compiler supports it).

FFTW can be compiled to use single-precision floating-point numbers,
should you prefer. This is done by setting the following defines in
"config.h":

    #define BENCHFFT_SINGLE 1
    #define FFTW_SINGLE 1

The default settings (which give double-precision floating point) are:

    #undef BENCHFFT_SINGLE
    #undef FFTW_SINGLE

Similar variables are used to control long double precision.

For consistency with the documentation, you should consider renaming
the library as fftw3f.LIB or fftw3l.LIB for single precision and long
double precision, respectively.
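To illustrate the 16-byte alignment requirement discussed in step (2),
here is a minimal sketch of one common way to obtain aligned memory
from an ordinary malloc. It is not FFTW's own allocator: the names
aligned_malloc16, aligned_free16 and is_aligned16 are hypothetical, and
the sketch assumes a compiler that provides <stdint.h> (which the
Borland 5.5 suite may not).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 16 /* bytes; the boundary discussed in step (2) */

/* Hypothetical helper: returns 1 if p lies on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % ALIGNMENT) == 0;
}

/* Hypothetical helper: over-allocate, round the address up to the next
   16-byte boundary, and store the pointer returned by malloc just
   before the aligned block so that it can be freed later. */
static void *aligned_malloc16(size_t n)
{
    void *raw = malloc(n + ALIGNMENT + sizeof(void *));
    uintptr_t addr;

    if (raw == NULL)
        return NULL;

    addr = ((uintptr_t)raw + sizeof(void *) + ALIGNMENT - 1)
           & ~(uintptr_t)(ALIGNMENT - 1);
    ((void **)addr)[-1] = raw;      /* remember the original pointer */
    assert(is_aligned16((void *)addr));
    return (void *)addr;
}

/* Hypothetical helper: release a block from aligned_malloc16. */
static void aligned_free16(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A caller would pair aligned_malloc16(n) with aligned_free16(p), never
with plain free(p), because the pointer handed back is not the one that
malloc returned.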
(3) Once you have done this, you need to make a small change to the
function prototype for X(solvtab_exec), as follows.

cd into the directory kernel\ and remove the const qualifier from the
function definition in "solvtab.c":

    void X(solvtab_exec)(const solvtab tbl, planner *p){
                         ^^^^^

Remove the corresponding qualifier from the function declaration in
"ifftw.h".

N.B. This seems to be a genuine problem, since tbl (a pointer to an
array) appears to be modified within this function.

(4) In order to link into a single FFTW3 library, we need to make a few
changes to the source code filenames. This is because the Borland
librarian TLIB (Version 4.5) complains if two object files have the
same name. It only includes symbols from the first object file,
assuming that the other instances are duplicates!

We have to change the following source filenames:

    dft\buffered.c
    dft\ct.c
    dft\plan.c
    dft\problem.c
    dft\rader.c
    ...\inplace\codlist.c
    rdft\buffered.c
    rdft\conf.c
    rdft\direct.c
    rdft\generic.c
    rdft\indirect.c
    rdft\nop.c
    rdft\plan.c
    rdft\problem.c
    rdft\rank-geq2.c
    rdft\rank0.c
    rdft\solve.c
    rdft\vrank-geq1.c
    ...\r2hc\codlist.c
    ...\hc2r\codlist.c
    ...\r2r\codlist.c
    reodft\conf.c

The new names are made by prefixing the source filename with the name
of the immediate parent directory and an underscore. For example,
reodft\conf.c becomes reodft\reodft_conf.c

(4a) Makefiles and response files (for use with TLIB) are provided. For
the Borland C++ compiler, we have enabled the following optimisations:

    -O2   general optimisation for speed of execution;
    -Ov   enable loop induction variable and strength reduction;
    -a16  align data on 16-byte boundaries ???
    -Oc   eliminate duplicate expressions within basic blocks and
          functions.

(5) To build the library, type "make" from the root directory. At the
end of the build, you should find "fftw3.lib" and "fftw3.dll" in the
top-level directory.

If you also wish to build "fftw3_threads.lib", cd into the "threads\"
directory and type "make".

(6) Having built the library, you should test it.
A test script called "check.pl" is provided in the "tests\" directory.
To use the "check.pl" script, you will need an installation of Perl,
e.g. from Cygwin. Then run:

    perl check.pl --random

from the "tests\" directory. Add --verbose for more detailed output.
Without --verbose, a successful test is indicated by no output.

The above check takes about 5 minutes on an AMD Athlon 1600 XP with
512 MB RAM.

You can also test the installation using the bench command-line tool
with a range of problem sizes and types. E.g.:

    bench -y irf256x256

tests the correctness of an in-place, forward transform of a real,
two-dimensional array of dimensions 256x256. See the README file in the
"tests\" directory for more details.

If you wish to build the Wisdom command-line tool, change to the
"tools\" directory and type "make".

(7) ... and you are done! Unless ...

(8) If you are planning to link FFTW with the Borland Delphi compiler,
then you need to create a dynamic library (commonly called a DLL in MS
Windows speak) from your object files. The above make process will, in
addition to creating a static .LIB library file, also create a DLL for
you, called "fftw3.dll". However, the DLL will be of limited use unless
you specify the functions that you wish to be exported in the DLL
symbols table. One way to do this is to change the function definition
in the original C code. For example, to export the fftw_malloc()
function, change the function definition in line 70 of alloc.c from:

    void *X(malloc)(size_t n)

to:

    __declspec(dllexport) void *X(malloc)(size_t n)

i.e., prefix the definition of the function with
"__declspec(dllexport)".

You will note that fftw_malloc is defined as "X(malloc)" in the source
code. X(...) is a macro that replaces its argument with fftw_... or
fftwf_... or fftwl_..., for double precision, single precision, and
long double precision, respectively.

Within the symbols table, functions are likely to be represented in C
declaration ("cdecl") format, i.e. with a leading underscore: for
single precision, the malloc routine has the symbol "_fftwf_malloc". To
check the symbol names of the functions that you have exported, examine
the DLL using the TDUMP utility provided as part of the Borland C
command-line utilities.

(9) To clean the source distribution, execute either "make clean" or
"make distclean". Both remove object files and target executables;
"make distclean" also recursively deletes all compiler-generated files
in and below the top-level directory.

Useful references
=================

[ ] FFTW User Manual (section 8 covers installation on Unix/non-Unix
    systems): http://www.fftw.org/fftw3_doc/

[ ] Windows Installation Notes:
    http://www.fftw.org/install/install-Windows.html

[ ] Tuning FFTW For Win32 Compilers (description of Windows alignment
    problems):
    http://www.research.microsoft.com/users/jch/fftw-performance.html
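
As a supplement to step (8), the following sketch shows how an X(...)
style macro can turn "X(malloc)" into a precision-specific name, and
how an export prefix can be attached to the definition. It is an
illustration under stated assumptions, not a copy of the FFTW headers:
DEMO_EXPORT, STR and is_exported_name are hypothetical names, the macro
FFTW_LDOUBLE is assumed to be the long double counterpart of
FFTW_SINGLE, and the function body simply forwards to the C library's
malloc rather than reproducing FFTW's allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FFTW_SINGLE 1 /* as set in "config.h" for single precision */

/* Two-level paste so that arguments are expanded before joining. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)

#if defined(FFTW_SINGLE)
#define X(name) CONCAT(fftwf_, name)   /* single precision      */
#elif defined(FFTW_LDOUBLE)            /* assumed macro name    */
#define X(name) CONCAT(fftwl_, name)   /* long double precision */
#else
#define X(name) CONCAT(fftw_, name)    /* double precision      */
#endif

/* Hypothetical macro: export from a DLL on Windows, no-op elsewhere. */
#if defined(_WIN32)
#define DEMO_EXPORT __declspec(dllexport)
#else
#define DEMO_EXPORT
#endif

/* Expands to "void *fftwf_malloc(size_t n)" when FFTW_SINGLE is set.
   Stand-in body: the real routine returns aligned memory. */
DEMO_EXPORT void *X(malloc)(size_t n)
{
    void *p = malloc(n);
    assert(n == 0 || p != NULL);
    return p;
}

/* Hypothetical helpers: turn the expanded name into a string so it can
   be compared with the name expected in the symbols table (without the
   leading underscore that the cdecl convention adds). */
#define STR_(x) #x
#define STR(x) STR_(x)

static int is_exported_name(const char *s)
{
    return strcmp(s, STR(X(malloc))) == 0;
}
```

With FFTW_SINGLE defined, is_exported_name("fftwf_malloc") holds;
removing the define would make the same source produce fftw_malloc
instead, which is why the symbol to look for with TDUMP depends on the
precision chosen in step (2a).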