[转] Fast Fourier transform — FFT (一篇不错的关于一维FFT原理及CPU实现的文章)
[转] Fast Fourier transform — FFT (一篇不错的关于一维FFT原理及CPU实现的文章)<div id="cnblogs_post_body">原文地址:http://www.librow.com/articles/article-10
Fast Fourier transform &mdash; FFT
Category. Digital signal processing (DSP) software development.
Abstract. The article is a practical tutorial for fast Fourier transform &mdash; FFT &mdash; understanding and implementation. Article contains theory, C++ source code and programming instructions. Popular Cooley-Tukey technique is considered.
References. Case study: ECG processing &mdash; R-peaks detection and 4D ultrasound volume image processing.
Download fast Fourier transform &mdash; FFT &mdash; C++ source code (zip, 4 Kb)
http://www.librow.com/content/common/images/articles/common/fft_thumb.gif
1. Introduction to fast Fourier transform
Fast Fourier transform &mdash; FFT &mdash; is speed-up technique for calculating discrete Fourier transform &mdash; DFT, which in turn is discrete version of continuous Fourier transform, which indeed is origin for all its versions. So, historically continuous form of the transform was discovered, then discrete form was created for sampled signals and then algorithm for fast calculation of discrete version was invented.
2. Understanding FFT
First of all let us have a look at what Fourier transform is. Fourier transform is an integral of the form:
http://www.librow.com/content/common/images/articles/article-10/1.gif(1)The transform operates in complex domain. Recall, that imaginary exponent could be written as:
http://www.librow.com/content/common/images/articles/article-10/2.gif(2)For sampled function continuous transform (1) turns into discrete one:
http://www.librow.com/content/common/images/articles/article-10/3.gif(3)Expression (3) is discrete Fourier transform &mdash; DFT. Here {f0, f1, ... , fN-1} is input discrete function and {F0, F1, ... ,FN-1} is result of Fourier transform.
It is easily could be seen that to program DFT it is enough to write double loop and just calculate sums of products of input samples and imaginary exponents. The number of operations required is obviously of O(N2) order. But due to transform properties it is possible to reduce the number of operations to the order of O(Nlog2N). Now, let us explore tricks we can use to speed-up calculations.
Let us put N=8 and write down our DFT:
http://www.librow.com/content/common/images/articles/article-10/4.gif(4)Easily could be seen we can split the sum into two similar sums separating odd and even terms and factoring out the latter sum:
http://www.librow.com/content/common/images/articles/article-10/5.gif(5)Now we can split the sums in brackets again:
http://www.librow.com/content/common/images/articles/article-10/6.gif(6)Thus, we have 3 &mdash; log28 &mdash; levels of summation. The deepest one in parenthesis, the middle one in brackets and the outer one. For every level exponent multiplier for the second term is the same.
http://www.librow.com/content/common/images/articles/article-10/7.gif(7)And now the most important observation one can make to get speed-up: periodicity of the exponent multipliers.
http://www.librow.com/content/common/images/articles/article-10/8.gif(8)For the exponent multiplier in parenthesis period is n=2, which means sums in parenthesis are exactly the same forn=0, 2, 4, 6 and for n=1, 3, 5, 7. It means on deepest level in parenthesis we need 4&times;2=8 &mdash; number of sums times period &mdash; sums in total. And note, since n=1, 3, 5, 7 corresponds to the half of the period &pi;, exponent multiplier is the same as for n=0, 2, 4, 6 but with the opposite sign.
http://www.librow.com/content/common/images/articles/article-10/9.gif(9)Indeed they are 1 for n=0, 2, 4, 6 and -1 for n=1, 3, 5, 7.
For the exponent multiplier in brackets period is n=4, which means we have the same sums for pairs n=0, 4; n=1, 5;n=2, 6 and n=3, 7. It means on the middle level in brackets we have 2&times;4=8 sums and the second half of them could be received again by changing sign in first half of them &mdash; due to the fact distance between n and n+2 is &pi;. Thus, forn=0, 4 the factor is 1 and for n=2, 6 it is -1; for n=1, 5 it equals -i and for n=3, 7 it is i.
On the outer level we have just one sum for every transform component, and the period of the exponent multiplier is 8. Which gives us 1&times;8=8 sums and second half of them could be received by changing sign in the first half.
So, on every calculation level we have 8 sums. In terms of N it means we have log2N levels and N sums on every level, which gives us O(Nlog2N) order of number of operations. On the other hand having the constant number of sums on every level means we can process data in-place.
In summary, we have got fast Fourier transform &mdash; FFT. Now it is time to develop step-by-step instruction list to be carved in code.
3. FFT algorithm
Having understanding of what features of DFT we are going to exploit to speed-up calculation we can write down the following algorithm:
[*]Prepare input data for summation &mdash; put them into convenient order;
[*]For every summation level:
[*]For every exponent factor of the half-period:
[*]Calculate factor;
[*]For every sum of this factor:
[*]Calculate product of the factor and the second term of the sum;
[*]Calculate sum;
[*]Calculate difference.
First step of reordering is putting input data into their natural order in (7): {f0, f1, f2, f3, f4, f5, f6, f7}&rarr;{f0, f4, f2,f6, f1, f5, f3, f7}. Since this new order was received as result of successive splitting terms into even and odd ones, in binary domain it looks like ordering based on bit greatness starting from the least significant bit &mdash; see table 1.
0000000000010010101004201010001023011110110641000010011510101110156110101011371111111117Table. 1. Reordering in binary domain.
This leads to &ldquo;mirrored arithmetics&rdquo; &mdash; result binary column in the mirror looks like {0, 1, 2, 3, 4, 5, 6, 7} &mdash; our initial order indeed. Thus, to reorder input elements it is enough just to count in mirrored manner. Since double mirroring gives again the same number, reordering reduces to swaps.
Summation levels include our parenthesis, brackets and outer level. In general case this leads to iterations on pairs, quadruples, octads and so on.
Further, we iterate on components of half-period, second half-period we are getting as result of taking differences instead of sums for the first half. Thus, for the deepest level of parenthesis period is 2 and half-period is 1, which means this cycle will be executed only once. For the second level period is 4 and half-period is 2 and cycle will be executed 2 times. In general case we have succession 1, 2, 4, 8, ... for this cycle.
Factor calculation is calculation of our imaginary exponent. To restrict the number of trigonometric function calls (2) we use the recurrence:
http://www.librow.com/content/common/images/articles/article-10/10.gif(10)Which is indeed expression
http://www.librow.com/content/common/images/articles/article-10/11.gif(11)written in a tricky way not to lose significance for small &delta; &mdash; for this case cos&delta;&asymp;1 and sin&delta;&asymp;&delta;, which tells us that for cos&delta; memory will be just packed with .999999(9) but for sin&delta; there will be much more useful information. Thus, (10) is just the way to eliminate cos&delta; from calculations. If you look back at (7) you will find, that for N=8 &delta;=&pi;/2,&pi;/4 &mdash; not a small numbers. But for transforms of much bigger N &delta;=&pi;/2, &pi;/4, &pi;/8, ... up to 2&pi;/N, for sure, could be very small.
The innermost loop looks for sums, where calculated imaginary exponent present, calculates product and takes sum and difference, which is the sum for the second half-period at &pi; distance, where our exponent changes its sign but not the absolute value according to (9). To perform in-place processing we utilize the following scheme:
http://www.librow.com/content/common/images/articles/article-10/butterfly.gifFig. 1. Butterfly.
This operation is elementary brick of in-place FFT calculation and usually is called butterfly. The bottom term is multiplied by imaginary exponent and then sum of the terms is stored in place of the upper term and difference is stored in place of the bottom term. General butterfly picture is depicted below &mdash; fig. 2.
http://www.librow.com/content/common/images/articles/article-10/butterfly_all.gifFig. 2. All butterflies for N=8.
Elegant scheme. It is time to engrave this beauty in code. But before we delve into programming let us make a small digression: it is known thing that Fourier transform is a transfer to frequency domain, so, let us see how to be back from the realm of frequencies.
4. Inverse Fourier transform
Expression for inverse Fourier transform is
http://www.librow.com/content/common/images/articles/article-10/12.gif(12)and its discrete counterpart is
http://www.librow.com/content/common/images/articles/article-10/13.gif(13)Thus, the difference between forward (3) and inverse (13) transforms is just a sign and not necessary scale factor &mdash; one does not need it if interested in ratio between components but not in absolute values. It means that the routine for forward transform with slight modification can perform inverse one as well.
5. FFT programming
Here is our FFT class declaration.
<div class="SourceCode">class CFFT{public: // FORWARD FOURIER TRANSFORM // Input- input data // Output - transform result // N - length of both input data and result static bool Forward(const complex *const Input, complex *const Output, const unsigned int N); // FORWARD FOURIER TRANSFORM, INPLACE VERSION // Data - both input data and output // N - length of both input data and result static bool Forward(complex *const Data, const unsigned int N); // INVERSE FOURIER TRANSFORM // Input- input data // Output - transform result // N - length of both input data and result // Scale- if to scale result static bool Inverse(const complex *const Input, complex *const Output, const unsigned int N, const bool Scale = true); // INVERSE FOURIER TRANSFORM, INPLACE VERSION // Data- both input data and output // N - length of both input data and result // Scale - if to scale result static bool Inverse(complex *const Data, const unsigned int N, const bool Scale = true);protected: // Rearrange function and its inplace version static void Rearrange(const complex *const Input, complex *const Output, const unsigned int N); static void Rearrange(complex *const Data, const unsigned int N); // FFT implementation static void Perform(complex *const Data, const unsigned int N, const bool Inverse = false); // Scaling of inverse FFT result static void Scale(complex *const Data, const unsigned int N);};
页:
[1]