The present invention is related to Discrete Fourier Transform (DFT) Twiddle Factors (TFs) generation algorithms.
Preferred embodiments of the present invention deal with DFT TFs generation algorithms, more particularly with DFT TFs New Projection Algorithms (NPAs) and methods for generating any Twiddle Factor (TF) of the DFT matrix.
In conventional systems, all TFs of the DFT matrix are pre-calculated and stored in a Memory Storage (MS) as lookup tables.
Storing all TFs requires a large amount of hardware MS space, which consumes extra power to maintain the stored values.
Another approach is to calculate the needed TFs of the DFT matrix using complicated algorithms such as the COordinate Rotation DIgital Computer (CORDIC). These algorithms consume time and require heavy calculations.
Using the proposed DFT TFs NPAs eliminates the need to store complete lookup tables for all TFs, hence increasing efficiency from the MS requirements perspective. They also eliminate the need for slow, complicated algorithms like CORDIC.
The DFT TFs NPAs of the present invention will generate any required TF of the DFT matrix.
The need to retrieve the required TF from pre-saved lookup tables of the DFT matrix is thereby avoided, as is the need to calculate it with slow, complicated algorithms like CORDIC as implemented in the prior art.
To get the maximum performance from the DFT TFs NPAs of the present invention, they are preferably implemented using Field Programmable Gate Arrays (FPGA), parallel architecture Application Specific Integrated Circuits (ASIC) and/or other platforms capable of actual parallel processing for real time applications.
The DFT TFs NPAs of the present invention are very efficient due to their low hardware requirements, high speed and low power consumption.
Such efficiency is vital for real time computation of the DFT when a massive number of samples is to be transformed from the time domain to the frequency domain.
The present invention addresses these needs.
The object of the present invention is to overcome the above disadvantages and to provide DFT TFs NPAs.
The NPAs are very efficient due to their low MS requirements, high speed and low power consumption.
Such efficiency is vital for real time DFT applications that transform a massive number of samples from the time domain to the frequency domain.
The thorough exploitation of the TFs symmetries and similarities is the backbone concept of the operation of the present invention.
The present invention presents how to exploit these symmetries and similarities to construct a methodology of operation from which the NPAs are formulated. The NPAs generate any required TF of the DFT without the need to retrieve that TF from pre-saved lookup tables of the DFT matrix. They also avoid the need to calculate it with slow, complicated algorithms like CORDIC as used in the prior art.
The NPAs of the present invention generate the required TF of the DFT matrix using only a small portion of the TFs, which is stored in an MS.
In the present invention, N denotes the number of samples in the time domain to be transformed to frequency domain using DFT.
In Embodiment-1 (EMB-1) of
It requires only N stored TFs in an MS to generate any required TF of the DFT matrix. The NPA-1 works for any odd or even value of N. The NPA-1 will reduce the MS requirements by a factor of
is the positive natural numbers {1, 2, 3, . . . }.
In Embodiment-2 (EMB-2) of
The NPA-2 requires
stored TFs in an MS to generate any required TF of the DFT matrix. The NPA-2 works for any odd or even value of N. The NPA-2 will reduce the MS requirements by a factor of
In Embodiment-3 (EMB-3) of
The NPA-3 requires only
stored TFs in an MS to generate any required TF of the DFT matrix. The NPA-3 works only for even values of N. The NPA-3 will reduce the MS requirements by a factor of
In Embodiment-4 (EMB-4) of
The NPA-4 requires only
stored TFs in an MS to generate any required TF of the DFT matrix. The NPA-4 works only for even values of N. The NPA-4 will reduce the MS requirements by a factor of
The present invention can be implemented on platforms with actual parallel processing capability, such as FPGA, ASIC and/or other platforms. Such platforms bring the strength of actual parallel computation to real time applications.
The N complex valued numbers generated by the DFT for the time-sampled signal $x_n$ are shown in Eq.(1).
The N complex valued numbers generated by the Inverse DFT (IDFT) for the frequency-sampled $X_k$ are shown in Eq.(2).
where:
$X_k$ are the frequency harmonics.
$x_n$ are the signal time samples.
k is the index in the frequency domain.
n is the index in the time domain.
is a complex number.
$j=\sqrt{-1}$.
Equation (3) presents $W_N$, which is the principal Nth root of unity used in Eq.(1) and Eq.(2).
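Eq.(1) to Eq.(3) are not reproduced in this text. For orientation only, the conventional textbook forms are sketched below. The $1/N$ scaling on the IDFT and the negative exponent in $W_N$ are assumptions of this sketch; the negative exponent is chosen because it agrees with the conjugate relation stated in Eq.(42).

```latex
X_k = \sum_{n=0}^{N-1} x_n \, W_N^{(n \times k)}, \qquad k = 0, 1, \ldots, N-1
\\
x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \, W_N^{(-(n \times k))}, \qquad n = 0, 1, \ldots, N-1
\\
W_N = e^{-j 2\pi / N}
```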
The matrix form of Eq.(1) is shown in Eq.(4). Equation (4) shows the DFT matrix.
The compact DFT matrix form of Eq.(4) is shown in Eq.(5).
When an integer A is divided by another integer B as shown in Eq.(6), the result of the modulo operation is the remainder R. The modulo operator is denoted by the symbol presented in Eq.(7).
Where:
A is the dividend, $A \in \mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
B is the divisor, $B \in \mathbb{Z} \setminus \{0\}$.
Q is the quotient, $Q \in \mathbb{Z}$.
R is the remainder, $R \in \phi = \{0, 1, 2, \ldots, (B-1)\}$.
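The division and modulo relation described above can be illustrated with a minimal Python sketch. The function name `quotient_and_remainder` is hypothetical, and the sketch assumes a positive divisor B, which is the case when B is the number of samples N; under that assumption the remainder always falls in {0, 1, ..., B−1}, even for a negative dividend.

```python
def quotient_and_remainder(a: int, b: int) -> tuple[int, int]:
    """Return (Q, R) such that a == Q*b + R and 0 <= R <= b-1.

    Assumption of this sketch: the divisor b is positive.
    """
    if b <= 0:
        raise ValueError("this sketch assumes a positive divisor B")
    # divmod floors the quotient, so for b > 0 the remainder is never negative.
    q, r = divmod(a, b)
    return q, r


if __name__ == "__main__":
    for a, b in [(7, 3), (-7, 3), (12, 4)]:
        q, r = quotient_and_remainder(a, b)
        print(f"A={a}, B={b}: Q={q}, R={r}")
```

Flooring the quotient, rather than truncating it toward zero, is what keeps R non-negative when A is negative.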
In EMB-1 106 of the present invention as shown in
Instead of storing the $N^2$ TFs that are needed to calculate the DFT, which would demand a large MS, only the N TFs that are members of the set $\zeta_1 = \{W_N^{(0)}, W_N^{(1)}, \ldots, W_N^{(N-1)}\}$, $\forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$, need to be stored in an MS 103.
The NPA-1 101 is used to project a specific value to the required TF $W_N^{(n\times k)}$ 102 from one of the N TFs of the $\zeta_1$ 104 vector that is stored in an MS 103.
The outcome 105 of the NPA-1 101 is the projected required TF $W_N^{(n\times k)}$ 102.
The NPA-1 PE formulation is presented below.
The property shown in Eq.(8) is the PE of EMB-1 106 and is used to design the NPA-1.
$$W_N^{(n\times k)} = W_N^{((n\times k) \bmod N)} \qquad (8)$$
$\forall n, k, (n\times k) \bmod N \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
By using Eq.(8), we can project a specific value to the required TF $W_N^{(n\times k)}$ 102 from one of the N TFs of the $\zeta_1$ 104 vector.
The MS arrangement of the NPA-1 is presented below.
The N elements of the $\zeta_1$ 104 TFs vector are constructed according to Eq.(9).
The memory location index of the $\zeta_1$ 104 TFs vector is $G_1$.
The NPA-1 101 is connected to the $\zeta_1$ 104 TFs vector that is stored in an MS 103.
From Eq.(8) and Eq.(9), the NPA-1 101 of EMB-1 106 of the present invention is realized as shown in the flowchart of
The flowchart of
Step 503 calculates the value of $(n\times k) \bmod N$ using Eq.(7).
Step 504 reads the value of the TF stored in the memory location of index $(n\times k) \bmod N$ of the $\zeta_1$ 104 TFs vector.
According to Eq.(10), step 505 projects and assigns the value of $\zeta_1[(n\times k) \bmod N]$ to the required TF $W_N^{(n\times k)}$ 102.
$$W_N^{(n\times k)} = \zeta_1[(n\times k) \bmod N] \qquad (10)$$
$\forall n, k, (n\times k) \bmod N \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
The NPA-1 101 ends at step 506.
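Steps 503 to 505 can be illustrated with a minimal software sketch. The function names `build_zeta1` and `npa1` are hypothetical, the table is assumed to hold $W_N^{(g)}$ at index g with $W_N = e^{-j2\pi/N}$, and a Python list stands in for the MS; this is not a hardware implementation.

```python
import cmath


def build_zeta1(N: int) -> list[complex]:
    """Assumed table layout: entry g holds W_N^g for g = 0..N-1,
    with W_N = exp(-2j*pi/N)."""
    return [cmath.exp(-2j * cmath.pi * g / N) for g in range(N)]


def npa1(n: int, k: int, N: int, zeta1: list[complex]) -> complex:
    """Look up the twiddle factor W_N^(n*k) from the N stored entries."""
    r = (n * k) % N   # step 503: reduce the exponent modulo N
    tf = zeta1[r]     # step 504: read the stored TF at that index
    return tf         # step 505: assign it to the required TF


if __name__ == "__main__":
    N = 8
    table = build_zeta1(N)
    print(npa1(3, 5, N, table))  # same value as W_8^15 = W_8^7
```

Only N values are stored here, compared with the N×N entries of a full table, because every exponent n×k collapses onto one of the N residues.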
In EMB-2 206 of the present invention as shown in
Instead of storing the $N^2$ TFs that are needed to calculate the DFT, which would demand a large MS, only
TFs that are members of the set
$\forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$ need to be stored in an MS 203.
The NPA-2 201 is used to project a specific value to the required TF $W_N^{(n\times k)}$ 202 from one of the
TFs of the $\zeta_2$ 204 vector that is stored in an MS 203.
The outcome 205 of the NPA-2 201 is the projected required TF $W_N^{(n\times k)}$ 202.
The NPA-2 201 PEs formulation is presented below.
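We have the property of Eq.(11).
$$W_N^{(-(n\times k))} = \left(W_N^{(n\times k)}\right)^{*} = \qquad (11)$$
$\forall n, k, (n\times k) \bmod N \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
From Eq.(8) and Eq.(11), we can get Eq.(12).
$$W_N^{(n\times k)} = W_N^{((n\times k) \bmod N)} = \left(W_N^{(N - ((n\times k) \bmod N))}\right)^{*} \qquad (12)$$
$\forall n, k, (n\times k) \bmod N \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$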
Let
be defined as shown in Eq.(13).
Where $\lfloor\ \rfloor$ is the floor symbol. $\lfloor\ \rfloor$ is the floor of
that is the greatest integer ≤
.
Using Eq.(12) and Eq.(13), we can get Eq.(14).
From examining Table 1, we can clearly see that by only storing
TFs of the $\zeta_2$ 204 vector in an MS 203, we can generate any required TF $W_N^{(n\times k)}$ 202 of the DFT matrix.
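From Eq.(14) and Table 1, we can get Eq.(15), which shows the two PEs of EMB-2 206 that are used to design the NPA-2 201. Equation (15) can be used to generate any required TF $W_N^{(n\times k)}$ 202.
$$W_N^{(n\times k)} = \begin{cases} W_N^{((n\times k) \bmod N)}, & (n\times k) \bmod N \in \beta_1 \\ \left(W_N^{(N - ((n\times k) \bmod N))}\right)^{*}, & (n\times k) \bmod N \in \beta_2 \end{cases} \qquad (15)$$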
By using Eq.(15), we can project a specific value to the required TF $W_N^{(n\times k)}$ 202 from one of the
TFs of the $\zeta_2$ 204 vector.
The MS arrangement of the NPA-2 201 is presented below.
The elements of the $\zeta_2$ 204 TFs vector are constructed according to Eq.(16).
The memory location index of the $\zeta_2$ 204 TFs vector is $G_2$.
The NPA-2 201 is connected to the $\zeta_2$ 204 TFs vector that is stored in an MS 203.
From the above equations, the NPA-2 201 of EMB-2 206 of the present invention is realized as shown in the flowchart of
The flowchart of
Step 603 calculates the value of $(n\times k) \bmod N$ using Eq.(7).
Step 604 calculates the value of
using Eq.(13).
Step 605 checks whether
is equal to 1. If it is equal to 1, step 606 is executed; otherwise, step 608 is executed.
Step 606 reads the value of the TF stored in the memory location of index $(n\times k) \bmod N$ of the $\zeta_2$ 204 TFs vector.
According to Eq.(17), step 607 projects and assigns the value of $\zeta_2[(n\times k) \bmod N]$ to the required TF $W_N^{(n\times k)}$ 202.
$$W_N^{(n\times k)} = \zeta_2[(n\times k) \bmod N] \qquad (17)$$
$\forall n, k \in \psi \wedge \forall (n\times k) \bmod N \in \beta_1 \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
The NPA-2 201 ends at step 610.
Step 608 reads the value of the TF stored in the memory location of index $(N - ((n\times k) \bmod N))$ of the $\zeta_2$ 204 TFs vector.
According to Eq.(18), step 609 projects and assigns the value of $\left(\zeta_2[N - ((n\times k) \bmod N)]\right)^{*}$ to the required TF $W_N^{(n\times k)}$ 202.
$$W_N^{(n\times k)} = \left(\zeta_2[N - ((n\times k) \bmod N)]\right)^{*} \qquad (18)$$
$\forall n, k \in \psi \wedge \forall (n\times k) \bmod N \in \beta_2 \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
The NPA-2 201 ends at step 610.
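The two branches of the NPA-2 can be illustrated with a minimal Python sketch. Several details are assumptions of this sketch, because the corresponding equations are not reproduced in this text: the table is assumed to hold $W_N^{(g)}$ for g = 0 to ⌊N/2⌋ (that is, ⌊N/2⌋+1 entries), and the test of step 605 is assumed to be equivalent to checking whether the reduced exponent is at most ⌊N/2⌋. The names `build_zeta2` and `npa2` are hypothetical.

```python
import cmath


def build_zeta2(N: int) -> list[complex]:
    """Assumed table layout: entry g holds W_N^g for g = 0..floor(N/2)."""
    return [cmath.exp(-2j * cmath.pi * g / N) for g in range(N // 2 + 1)]


def npa2(n: int, k: int, N: int, zeta2: list[complex]) -> complex:
    """Return W_N^(n*k) using the stored half table and conjugate symmetry."""
    r = (n * k) % N                      # step 603
    if r <= N // 2:                      # assumed form of the step 605 test
        return zeta2[r]                  # steps 606-607, Eq.(17)
    # N - r is at most floor(N/2) here, so the index is always in range.
    return zeta2[N - r].conjugate()      # steps 608-609, Eq.(18)


if __name__ == "__main__":
    N = 7
    table = build_zeta2(N)
    print(len(table), npa2(3, 4, N, table))
```

The conjugate branch works for odd and even N alike, which matches the statement that the NPA-2 works for any value of N.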
In EMB-3 306 of the present invention as shown in
Instead of storing the $N^2$ TFs that are needed to calculate the DFT, which would demand a large MS, only
TFs that are members of the set
$\forall N \in \rho_2 \wedge \forall W_N \in \mathbb{C}$ need to be stored in an MS 303.
The NPA-3 301 is used to project a specific value to the required TF $W_N^{(n\times k)}$ 302 from one of the
TFs of the $\zeta_3$ 304 vector that is stored in an MS 303.
The outcome 305 of the NPA-3 301 is the projected required TF $W_N^{(n\times k)}$ 302.
The NPA-3 301 PEs formulation is presented below.
We have the properties shown in Eq.(19), Eq.(20) and Eq.(21).
From Eq.(8), Eq.(19), Eq.(20), and Eq.(21), we get Eq.(22).
From Eq.(12) and Eq.(22), we get Eq.(23).
From Eq.(13) and Eq.(23), we get Eq.(24).
From examining Table 2, we can clearly see that by only storing
TFs of the $\zeta_3$ 304 vector in an MS 303, we can generate any required TF $W_N^{(n\times k)}$ 302.
From Eq.(24) and Table 2, we can get Eq.(25), which shows the five PEs of EMB-3 306 that are used to design the NPA-3 301. Equation (25) can be used to find any required TF $W_N^{(n\times k)}$ 302.
By using Eq.(25), we can project a specific value to the required TF $W_N^{(n\times k)}$ 302 from one of the
TFs of the $\zeta_3$ 304 vector.
The MS arrangement of the NPA-3 301 is presented below.
The
elements of the $\zeta_3$ 304 TFs vector are constructed according to Eq.(26).
The memory location index of the $\zeta_3$ 304 TFs vector is $G_3$.
The NPA-3 301 is connected to the $\zeta_3$ 304 TFs vector that is stored in an MS 303.
From the above equations, the NPA-3 301 of EMB-3 306 of the present invention is realized as shown in the flowchart of
The flowchart of
Step 703 calculates the value of $(n\times k) \bmod N$ using Eq.(7).
Step 704 calculates the value of
using Eq.(13).
Step 705 checks whether
is equal to 1. If it is equal to 1, step 706 is executed; otherwise, step 708 is executed.
Step 706 reads the value of the TF stored in the memory location of index $(n\times k) \bmod N$ of the $\zeta_3$ 304 TFs vector.
According to Eq.(27), step 707 projects and assigns the value of $\zeta_3[(n\times k) \bmod N]$ to the required TF $W_N^{(n\times k)}$ 302.
$$W_N^{(n\times k)} = \zeta_3[(n\times k) \bmod N] \qquad (27)$$
$\forall n, k \in \psi \wedge \forall (n\times k) \bmod N \in \beta_3 \wedge \forall N \in \rho_2 \wedge \forall W_N \in \mathbb{C}$
The NPA-3 301 ends at step 710.
Step 708 reads the value of the TF stored in the memory location of index $(N - ((n\times k) \bmod N))$ or
of the $\zeta_3$ 304 TFs vector.
According to Eq.(28), step 709 projects and assigns the value of $\left(\zeta_3[N - ((n\times k) \bmod N)]\right)^{*}$ or
to the required TF $W_N^{(n\times k)}$ 302.
The NPA-3 301 ends at step 710.
In EMB-4 406 of the present invention as shown in
Instead of storing the $N^2$ TFs that are needed to calculate the DFT, which would demand a large MS, only
TFs that are members of the set
need to be stored in an MS 403.
The NPA-4 401 is used to project a specific value to the required TF $W_N^{(n\times k)}$ 402 from one of the
TFs of the $\zeta_4$ 404 vector that is stored in an MS 403.
The outcome 405 of the NPA-4 401 is the projected required TF $W_N^{(n\times k)}$ 402.
The NPA-4 401 PEs formulation is presented below.
Let
be defined as in Eq.(29).
From Eq.(29), the computation of
will result in one of the values shown in Eq.(30).
From Eq.(24), we can get Eq.(31).
From examining Table 3, we can clearly see that by only storing
TFs of the $\zeta_4$ 404 vector in an MS 403, we can generate any required TF $W_N^{(n\times k)}$ 402.
From Eq.(31) and Table 3, we can get Eq.(32), which shows the six PEs of EMB-4 406 that are used to design the NPA-4 401. Equation (32) can be used to find any required TF $W_N^{(n\times k)}$ 402.
By using Eq.(32), we can project a specific value to the required TF $W_N^{(n\times k)}$ 402 from one of the
TFs of the $\zeta_4$ 404 vector.
The MS arrangement of the NPA-4 401 is presented below.
The
elements of the $\zeta_4$ 404 TFs vector are constructed according to Eq.(33).
The memory location index of the $\zeta_4$ 404 TFs vector is $G_4$.
The NPA-4 401 is connected to the $\zeta_4$ 404 TFs vector that is stored in an MS 403.
From the above equations, the NPA-4 401 of EMB-4 406 of the present invention is realized as shown in the flowchart of
The flowchart of
Step 803 calculates the value of $(n\times k) \bmod N$ using Eq.(7).
Step 804 calculates the value of
using Eq.(29).
Step 805 checks whether
is equal to 0. If it is equal to 0, the path of connector CC1 806 is followed and step 812 is executed. If it is not equal to 0, step 807 is executed.
Connector CC1 806 connects step 805 and step 812.
Step 807 checks whether
is equal to 1. If it is equal to 1, the path of connector CC2 808 is followed and step 814 is executed. If it is not equal to 1, step 809 is executed.
Connector CC2 808 connects step 807 and step 814.
Step 809 checks whether
is equal to 2. If it is equal to 2, the path of connector CC3 810 is followed and step 817 is executed. If it is not equal to 2, meaning that
is equal to 3, the path of connector CC4 811 is followed and step 819 is executed.
Connector CC3 810 connects step 809 and step 817.
Connector CC4 811 connects step 809 and step 819.
Step 812 reads the value of the TF stored in the memory location of index $(n\times k) \bmod N$ of the $\zeta_4$ 404 TFs vector.
According to Eq.(34), step 813 projects and assigns the value of $\zeta_4[(n\times k) \bmod N]$ to the required TF $W_N^{(n\times k)}$ 402.
$$W_N^{(n\times k)} = \zeta_4[(n\times k) \bmod N] \qquad (34)$$
$\forall n, k \in \psi \wedge \forall (n\times k) \bmod N \in \lambda_1 \wedge \forall N \in \rho_2 \wedge \forall W_N \in \mathbb{C}$
The NPA-4 401 ends at step 816.
Step 814 reads the value of the TF stored in the memory location of index
of the $\zeta_4$ 404 TFs vector.
According to Eq.(35), step 815 projects and assigns the value of
to the required TF $W_N^{(n\times k)}$ 402.
The NPA-4 401 ends at step 816.
Step 817 reads the value of the TF stored in the memory location of index
of the $\zeta_4$ 404 TFs vector.
According to Eq.(36), step 818 projects and assigns the value of
to the required TF $W_N^{(n\times k)}$ 402.
The NPA-4 401 ends at step 816.
Step 819 reads the value of the TF stored in the memory location of index $(N - ((n\times k) \bmod N))$ of the $\zeta_4$ 404 TFs vector.
According to Eq.(37), step 820 projects and assigns the value of $\left(\zeta_4[N - ((n\times k) \bmod N)]\right)^{*}$ to the required TF $W_N^{(n\times k)}$ 402.
$$W_N^{(n\times k)} = \left(\zeta_4[N - ((n\times k) \bmod N)]\right)^{*} \qquad (37)$$
$\forall n, k \in \psi \wedge \forall (n\times k) \bmod N \in \lambda_4 \wedge \forall N \in \rho_2 \wedge \forall W_N \in \mathbb{C}$
The NPA-4 401 ends at step 816.
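The four-way branching of steps 805 to 820 can be illustrated with a hedged Python sketch based on quadrant symmetry of the unit circle. Eq.(29) to Eq.(36) are not reproduced in this text, so the following are assumptions of this sketch rather than the formulas of EMB-4: the table is assumed to hold $W_N^{(g)}$ for g = 0 to ⌊N/4⌋, the region value is assumed to be q = ⌊4r/N⌋ with r the reduced exponent, and the index and sign used for q = 1 and q = 2 are one consistent choice. Only the q = 0 and q = 3 branches are taken directly from Eq.(34) and Eq.(37). The names `build_zeta4` and `npa4` are hypothetical.

```python
import cmath


def build_zeta4(N: int) -> list[complex]:
    """Assumed table layout: entry g holds W_N^g for g = 0..floor(N/4)."""
    if N % 2 != 0:
        raise ValueError("this sketch assumes an even N")
    return [cmath.exp(-2j * cmath.pi * g / N) for g in range(N // 4 + 1)]


def npa4(n: int, k: int, N: int, zeta4: list[complex]) -> complex:
    """Return W_N^(n*k) from a quarter table using quadrant symmetry."""
    r = (n * k) % N                          # step 803
    q = (4 * r) // N                         # assumed region value: 0, 1, 2 or 3
    if q == 0:
        return zeta4[r]                      # Eq.(34)
    if q == 1:
        # Assumed: W^r = W^(N/2) * W^(r - N/2) = -conj(W^(N/2 - r))
        return -zeta4[N // 2 - r].conjugate()
    if q == 2:
        # Assumed: W^r = W^(N/2) * W^(r - N/2) = -W^(r - N/2)
        return -zeta4[r - N // 2]
    return zeta4[N - r].conjugate()          # Eq.(37)


if __name__ == "__main__":
    N = 8
    table = build_zeta4(N)
    print(len(table), npa4(3, 5, N, table))
```

The two middle branches rely on $W_N^{(N/2)} = -1$, which holds only when N is even; this is consistent with the NPA-4 being restricted to even N.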
Performance Evaluation
The TPR is the percentage ratio of the MNRTF that needs to be stored in an MS 103, MS 203, MS 303 or MS 403 to the $N^2$ total TFs needed to calculate the DFT, as shown in Eq.(38).
From Eq.(38), the PMRR for the memory requirements of storing the TFs that are needed to calculate the DFT is shown in Eq.(39).
Table 4 presents a comparison between the NPA-1 101 of EMB-1 106, the NPA-2 201 of EMB-2 206, the NPA-3 301 of EMB-3 306, the NPA-4 401 of EMB-4 406 of the present invention and the DFT regarding their MNRTF, TPR, PMRR, the number of samples N in the time domain to be transformed to the frequency domain using the DFT, and the number of their PEs.
From Table 4, it is clear that the NPA-4 401 of EMB-4 406, which works $\forall N \in \rho_2$, has the lowest MNRTF, the lowest TPR and the highest PMRR. It therefore performs better than the NPA-1 101 of EMB-1 106, the NPA-2 201 of EMB-2 206, the NPA-3 301 of EMB-3 306 of the present invention and the DFT.
From Table 4, it is clear that the NPA-2 201 of EMB-2 206, which works $\forall N \in \rho_1$, and the NPA-3 301 of EMB-3 306, which works $\forall N \in \rho_2$, have the same MNRTF, the same TPR and the same PMRR when compared $\forall N \in \rho_2$.
From Table 4, it is clear that the DFT has the highest MNRTF, the highest TPR and the lowest PMRR. It therefore has the lowest performance when compared with the NPA-1 101 of EMB-1 106, which works $\forall N \in \rho_1$, the NPA-2 201 of EMB-2 206, which works $\forall N \in \rho_1$, the NPA-3 301 of EMB-3 306, which works $\forall N \in \rho_2$, and the NPA-4 401 of EMB-4 406, which works $\forall N \in \rho_2$.
From Table 4, it is clear that, among the NPAs, the NPA-1 101 of EMB-1 106, which works $\forall N \in \rho_1$, has the highest MNRTF, the highest TPR and the lowest PMRR. It therefore has the lowest performance when compared with the NPA-2 201 of EMB-2 206, which works $\forall N \in \rho_1$, the NPA-3 301 of EMB-3 306, which works $\forall N \in \rho_2$, and the NPA-4 401 of EMB-4 406, which works $\forall N \in \rho_2$.
A low MNRTF is highly recommended since it results in low MS requirements for the TFs that have to be stored in an MS 103, MS 203, MS 303 or MS 403 and used by the NPA-1 101 of EMB-1 106, the NPA-2 201 of EMB-2 206, the NPA-3 301 of EMB-3 306 and the NPA-4 401 of EMB-4 406, respectively, to generate any required TF of the DFT matrix.
A low MNRTF results in low power consumption. It also results in a low TPR and a high PMRR, both of which are desired. A high MNRTF is not desired since it results in a high TPR and a low PMRR.
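The two metrics can be illustrated with a short arithmetic sketch. The TPR follows the description of Eq.(38) given above (stored TFs as a percentage of the $N^2$ TFs). Eq.(39) is not reproduced in this text, so the PMRR is assumed here to be the complement of the TPR, that is, 100 minus the TPR; the function names `tpr` and `pmrr` are hypothetical. Only the two storage counts stated explicitly in the text are used: $N^2$ for the full table and N for the NPA-1.

```python
def tpr(mnrtf: int, N: int) -> float:
    """Stored TFs as a percentage of the N^2 TFs of the full DFT matrix."""
    return 100.0 * mnrtf / (N * N)


def pmrr(mnrtf: int, N: int) -> float:
    """Assumed form of Eq.(39): percentage of storage saved, 100 - TPR."""
    return 100.0 - tpr(mnrtf, N)


if __name__ == "__main__":
    N = 1024
    for name, stored in [("full DFT table", N * N), ("NPA-1", N)]:
        print(f"{name}: TPR = {tpr(stored, N):.4f} %, PMRR = {pmrr(stored, N):.4f} %")
```

Because the denominator grows as $N^2$ while the stored count grows at most linearly in N, the TPR shrinks as N increases.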
For the IDFT, the matrix form of Eq.(2) is shown in Eq.(40).
From Eq.(40), we can get the compact matrix form of the IDFT as shown in Eq.(41).
From Eq.(1), Eq.(2), Eq.(5) and Eq.(41), we can conclude that the NPA-1 101 of EMB-1 106, the NPA-2 201 of EMB-2 206, the NPA-3 301 of EMB-3 306 and the NPA-4 401 of EMB-4 406 of the present invention can also be easily adapted and used to calculate the TFs needed to calculate the IDFT.
The relation between the TF $W_N^{(-(n\times k))}\big|_{\mathrm{IDFT}}$ and the TF $W_N^{(n\times k)}\big|_{\mathrm{DFT}}$, each taken $\forall n, k \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$,
is shown in Eq.(42).
$$W_N^{(-(n\times k))}\big|_{\mathrm{IDFT}} = \left(W_N^{(n\times k)}\right)^{*}\big|_{\mathrm{DFT}} \qquad (42)$$
$\forall n, k \in \psi \wedge \forall N \in \rho_1 \wedge \forall W_N \in \mathbb{C}$
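Eq.(42) means that any routine producing DFT TFs can serve the IDFT by conjugating its output. A minimal Python sketch follows; `dft_tf` is a hypothetical stand-in for whichever NPA generates the DFT TF, computed directly here for brevity, and $W_N = e^{-j2\pi/N}$ is assumed.

```python
import cmath


def dft_tf(n: int, k: int, N: int) -> complex:
    """Stand-in DFT twiddle factor W_N^(n*k), assuming W_N = exp(-2j*pi/N)."""
    return cmath.exp(-2j * cmath.pi * ((n * k) % N) / N)


def idft_tf(n: int, k: int, N: int) -> complex:
    """IDFT twiddle factor W_N^(-(n*k)), obtained per Eq.(42) by conjugation."""
    return dft_tf(n, k, N).conjugate()


if __name__ == "__main__":
    print(idft_tf(3, 5, 8))
```

No additional table is needed for the IDFT under this relation, since conjugation only negates the imaginary part of the value already generated.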