FFTW is a free collection of fast C routines for computing the Discrete Fourier Transform in one or more dimensions. It includes complex, real, symmetric, and parallel transforms, and can handle arbitrary array sizes efficiently. FFTW is typically faster than other publically-available FFT implementations, and is even competitive with vendor-tuned libraries. There are several machine-dependent optimizations that can be enabled in order to generate SIMD code; take a look in fftw.SlackBuild to enable them.