The invention relates to a method for processing a data signal, a data processing unit for processing a data signal and a computer program product.
Space time block coding (STBC) is a technique for transmitting data used in wireless communication systems where multiple copies of the same data stream are sent using a plurality of antennas. In a group space time block coding (GSTBC) system, there are groups of antennas wherein each group of antennas transmits one data stream and the antennas of one group transmit the same data stream. At the expense of using more transmit antennas a performance improvement in terms of packet error rate (PER) may be achieved compared to multiple input multiple output (MIMO) systems where also a plurality of data streams are transmitted in parallel but one data stream is only transmitted using a single antenna.
For example, a GSTBC system with a two dimensional (2D) interleaver outperforms the OFDM system according to IEEE 802.11a when the SNR (signal to noise ratio) is greater than 26 dB even with a data rate that is 3 times higher.
However, the complexity of a receiver for a GSTBC system is significantly higher than that of a conventional single antenna system, even when linear detectors are employed.
An object of the invention is to provide an improved method for detecting data signals received via a communication channel.
The object is achieved by a method for processing a data signal, a data processing unit for processing a data signal and a computer program product with the features according to the independent claims.
A method for processing a data signal received via a communication channel is provided, comprising
Further, a data processing unit for processing a data signal and a computer program product according to the method for processing a data signal described above are provided.
Illustratively, to generate a matrix for processing the received data signal, for example a filter matrix, the first matrix characterizing the communication channel is inverted. The first matrix is for example a channel matrix comprising components characterizing the transmission characteristics of the communication channel. It is inverted using a matrix inversion approach that reduces the number of multiplications necessary for matrix inversion and signal processing compared with conventional methods.
The data processing unit is for example used in a receiver, for example the receiver of a 6×3 (6 transmit antennas, 3 receiver antennas) MIMO (multiple input multiple output) WLAN (wireless local area network) system. The data signal may for example be transmitted and received via a communication system according to WLAN 11a, WLAN 11g, WLAN 11n, Super 3G, HIPERLAN 2, WIMAX (Worldwide Interoperability for Microwave Access) and B3G (beyond 3G) or other communication systems.
Embodiments of the invention emerge from the dependent claims. The embodiments which are described in the context of the method for processing a data signal are analogously valid for the data processing unit for processing a data signal and the computer program product.
In one embodiment of the invention, the first matrix is inverted according to Strassen's method for matrix inversion.
The communication channel is for example a radio communication channel. The received data signal is for example processed for estimating a sent data signal that was received as the received data signal. In one embodiment, the sent data signal is transmitted using a plurality of transmit antennas. For example, the sent data signal is transmitted according to Groupwise Space Time Block Code. The sent data signal may be transmitted using OFDM modulation.
In one embodiment, at least one of the four sub matrices has Alamouti structure. For example, the first matrix consists of sub matrices having Alamouti structure. A sub matrix having Alamouti structure can be easily inverted, which allows a further decrease of complexity compared to a direct approach for matrix inversion, for example by solving a linear equation system.
In one embodiment, the inverted matrix is used for performing a filter operation on the received data signal. The received data signal is for example linearly filtered using the inverted matrix. The received data signal may be filtered according to a zero forcing interference suppression method, a linear minimum mean squared error detection method or an interference cancellation method.
The first matrix is for example generated from a channel matrix modelling the transmission characteristics of the communication channel. The first matrix may for example be a sub matrix of the channel matrix, or the first matrix may be the channel matrix itself. The first matrix may also be a product of matrices with one factor being the channel matrix or the Hermitian transpose of the channel matrix.
Illustrative embodiments of the invention are explained below with reference to the drawings.
The transmitter 101 is a GSTBC (Groupwise space time block coding) transmitter comprising a plurality of STBC transmitter groups 103, 104, 105, in this example a first STBC transmitter group 103, a second STBC transmitter group 104 and a third STBC transmitter group 105. The number of STBC transmitter groups 103, 104, 105 is denoted by Lg. In this example Lg=3.
The transmitter 101 comprises a plurality of transmit antennas 106. The number of transmit antennas 106 is denoted by Lt. In this embodiment, each STBC transmitter group uses Lt/Lg transmit antennas 106. It is not necessary that each STBC transmitter group uses the same number of transmit antennas. In other embodiments, the STBC transmitter groups use different numbers of transmit antennas.
The first STBC transmitter group 103 receives as input a first data stream 107, generates STBC codewords from the first data stream 107 and sends the STBC codewords using OFDM (orthogonal frequency division multiplexing) and using Lt/Lg transmit antennas 106. Each antenna used by the first STBC transmitter group 103 sends the same signal.
The second STBC transmitter group 104 receives as input a second data stream 108, generates STBC codewords from the second data stream 108 and sends the STBC codewords using OFDM (orthogonal frequency division multiplexing) and using Lt/Lg transmit antennas 106. Each antenna used by the second STBC transmitter group 104 sends the same signal.
The third STBC transmitter group 105 receives as input a third data stream 109, generates STBC codewords from the third data stream 109 and sends the STBC codewords using OFDM (orthogonal frequency division multiplexing) and using Lt/Lg transmit antennas 106. Each antenna used by the third STBC transmitter group 105 sends the same signal.
In this way, the first data stream 107, the second data stream 108 and the third data stream 109 are sent in parallel. Altogether, the STBC transmitter groups 103, 104, 105 send data 110 being separated into the data streams 107, 108, 109. For example, each data stream 107, 108, 109 corresponds to a different user that wants to transmit the respective data stream 107, 108, 109.
In one embodiment, the transmitter 101 comprises one or more coded modulation units for encoding the data 110 such that the data streams 107, 108, 109 are, for example convolutionally encoded, bit-interleaved, and modulated to MQAM (M-order quadrature amplitude modulation) signals. The data streams 107, 108, 109 may be bit streams, e.g. to be transmitted using BPSK (binary phase-shift keying) or streams of symbols, e.g. to be transmitted using QPSK (quadrature phase-shift keying) or QAM (quadrature amplitude modulation).
The receiver 102 comprises receive antennas 111. The number of receive antennas 111 is denoted by Lr. In one embodiment Lr≧Lg. In the following, it is assumed that Lr=Lg such that the minimum number of receive antennas 111, in this example 3, is used. This is for example of advantage if the receiver 102 should be kept small, for example when the receiver 102 is integrated in a mobile device, such as a cell phone.
The signals received by the receive antennas 111 are fed into a filter unit 112 which generates filtered bit streams that are respectively fed into a first OFDM decoder 113, a second OFDM decoder 114 and a third OFDM decoder 115. The first OFDM decoder 113 generates a first received bit stream 116, the second OFDM decoder 114 generates a second received bit stream 117 and the third OFDM decoder 115 generates a third received bit stream 118.
In the embodiment described in the following, an Alamouti code is used. Therefore, Lt/Lg=2 and full rate transmission is achieved. More antennas can be employed to allow further diversity and higher system performance. The embodiment described in the following can be extended to such cases.
A STBC transmitter group 103, 104, 105 forms data vectors from the respective data stream 107, 108, 109. A data vector formed by one STBC transmitter group 103, 104, 105 from the respective data stream 107, 108, 109 is denoted by
where g=1, . . . , Lg is the number of the respective STBC transmitter group 103, 104, 105. When xg has been transmitted (in one transmission step), as will be explained in the following, the STBC transmitter group 103, 104, 105 forms the next data vector.
The data vectors xg can be written together in the form of the vector
which corresponds to the data transmitted in one transmission step by the STBC transmitter groups 103, 104, 105.
The data received by the receiver 102 in one transmission step, i.e. when the vector x is transmitted, is written as
The relation between the sent data vector x and the received data vector r can be modelled as
r=Hx+n (1)
where n models independent additive white Gaussian noise (AWGN) with zero mean and with variance σn^2. The matrix H is called the channel matrix and is defined by
where Hgrt (t=0, 1 for the two antennas of each STBC transmitter group 103, 104, 105) models the transmission characteristics of the (frequency domain) channel between the tth antenna 106 of the gth STBC transmitter group 103, 104, 105 and the rth receive antenna 111.
The sub matrices Hgr of matrix H are called channel sub matrices. They have a special structure given by
where h0, h1 are complex numbers. A matrix having such a structure is called Alamouti matrix in the following. For an Alamouti matrix H, a vector h can be defined according to the following correspondence:
The vector h holds the same information as the matrix H and is introduced for notational convenience. It allows the orthogonality of the matrix H to be written as
H^H H=H H^H=∥h∥^2 I (5)
where ∥h∥ denotes the norm of the vector h and I is the identity matrix of the appropriate dimension (2×2 in this case).
The result of adding and multiplying Alamouti matrices as well as of taking the inverse of an Alamouti matrix is still an Alamouti matrix. This means that the set of all Alamouti matrices is closed under addition, multiplication and inversion. This can be seen when these operations are written down explicitly. If A and B are Alamouti matrices,
This allows the detection process to be simplified.
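The orthogonality (5) and the closure properties can be checked numerically. The following is a minimal sketch, assuming the Alamouti layout [[a, b], [−b*, a*]], which is the layout implied by the multiplication rule in (31). The helper names `alamouti` and `is_alamouti` are illustrative and not part of the described embodiment.

```python
import numpy as np

def alamouti(a, b):
    """2x2 matrix [[a, b], [-conj(b), conj(a)]] (assumed Alamouti layout)."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]], dtype=complex)

def is_alamouti(m, tol=1e-9):
    """True if m has the layout produced by alamouti()."""
    return bool(abs(m[1, 0] + np.conj(m[0, 1])) < tol
                and abs(m[1, 1] - np.conj(m[0, 0])) < tol)

A = alamouti(1 + 2j, 3 - 1j)
B = alamouti(-0.5 + 1j, 2 + 0.5j)

# Orthogonality (5): A^H A = A A^H = ||a||^2 I
norm_sq = abs(A[0, 0]) ** 2 + abs(A[0, 1]) ** 2
orthogonal = (np.allclose(A.conj().T @ A, norm_sq * np.eye(2))
              and np.allclose(A @ A.conj().T, norm_sq * np.eye(2)))

# Closure under addition, multiplication and inversion
closed = all(is_alamouti(m) for m in (A + B, A @ B, np.linalg.inv(A)))
print(orthogonal, closed)
```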
In one embodiment, the filter unit 112 carries out a general zero forcing interference suppression (ZFIS) algorithm. This is a linear filter algorithm applied to the received data vector
such that for a STBC transmitter group 103, 104, 105, the interference from the other STBC transmitter groups 103, 104, 105 in the received signal is suppressed.
The linear filter used is given by the matrix
where each sub matrix Gij has to be determined based on the transmission characteristics. Filtering the received data vector r using the matrix G gives the filtered received data vector
according to
{tilde over (r)}=Gr=GHx+Gn={tilde over (H)}x+ñ (8)
where ñ=Gn. The matrix G is chosen such that
such that in the filtered received data vector {tilde over (r)}, there is no interference between the data sent by different STBC transmitter groups 103, 104, 105. It can be seen from (9) that after filtering according to (8) the data transmitted by a STBC transmitter group 103, 104, 105 can be processed independently of the data transmitted by the other STBC transmitter groups 103, 104, 105 since all interference from the other STBC transmitter groups 103, 104, 105 has been removed. The channel model for the gth STBC transmitter group 103, 104, 105 can then be written as
{tilde over (r)}g={tilde over (H)}gxg+{tilde over (n)}g (10)
where ΣkGgkHkg={tilde over (H)}g and the noise component {tilde over (n)}g is bi-variate complex Gaussian distributed with zero mean and correlation matrix
The generality of this formulation results in some flexibility in implementation. The matrix G has Lg×2×2 degrees of freedom which are not all used for fulfilling (9). Therefore, in one embodiment a first constraint (denoted by constraint A) is used and in another embodiment, a second constraint (denoted by constraint B) is used such that the matrix G is uniquely defined by equation (9) and the respective constraint. These two embodiments are equivalent in the sense that the same results are given for coded and uncoded systems. The different choice of the constraint for matrix G allows different hardware architectures to be used which may be more appropriate for different scenarios.
Constraint A is given by Gii=I where i=1, . . . , Lg. When Gii=I, the sub matrices Gij with i≠j (i=1, . . . , Lg) can be uniquely determined using (9).
Constraint B is given by {tilde over (H)}i=I where i=1, . . . , Lg. When {tilde over (H)}i=I, the sub matrices Gij with i≠j (i=1, . . . , Lg) can be uniquely determined using (9). In this case, G=H−1.
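The two constraints can be illustrated numerically. The following is a minimal sketch, assuming the Alamouti layout [[h0, h1], [−h1*, h0*]] for the channel sub matrices and a randomly drawn channel. The block-row rescaling used here to obtain a constraint A filter from H^−1 is an illustration chosen for brevity, not the computation according to (18), and all function and variable names are hypothetical.

```python
import numpy as np

def alamouti(h0, h1):
    """Assumed Alamouti layout [[h0, h1], [-conj(h1), conj(h0)]]."""
    return np.array([[h0, h1], [-np.conj(h1), np.conj(h0)]], dtype=complex)

rng = np.random.default_rng(0)
Lg = 3  # number of STBC transmitter groups, with Lr = Lg

def random_complex():
    return complex(rng.standard_normal(), rng.standard_normal())

# 6x6 channel matrix built from nine random Alamouti sub matrices
H = np.block([[alamouti(random_complex(), random_complex())
               for _ in range(Lg)] for _ in range(Lg)])

# Constraint B: the filtered channel is the identity, so G = H^-1
G_b = np.linalg.inv(H)

# Constraint A: rescale each block row of H^-1 so that G_ii = I
G_a = np.zeros_like(G_b)
for i in range(Lg):
    rows = slice(2 * i, 2 * i + 2)
    G_a[rows, :] = np.linalg.inv(G_b[rows, rows]) @ G_b[rows, :]

def is_block_diagonal(M, tol=1e-9):
    """True if all entries outside the 2x2 diagonal blocks vanish."""
    mask = np.kron(np.eye(Lg), np.ones((2, 2)))
    return bool(np.all(np.abs(M * (1 - mask)) < tol))

x = np.array([random_complex() for _ in range(2 * Lg)])
r = H @ x          # noise-free received vector, i.e. n = 0 in (1)
x_hat = G_b @ r    # zero forcing estimate under constraint B
```

Both filters remove the interference between groups: `G_a @ H` is block diagonal with identity diagonal blocks in `G_a`, while `G_b @ H` is the identity matrix.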
In an uncoded system, based on (10), for unbiased estimation, the data transmitted by the gth STBC transmitter group 103, 104, 105 is estimated by
xg,ZF={tilde over (H)}g^−1{tilde over (r)}g=xg+{tilde over (H)}g^H{tilde over (n)}g/∥{tilde over (h)}g∥^2. (11)
A slicer is used to determine the corresponding transmitted symbols based on this estimation. As will be explained below, the estimation xg,ZF is also used for computing the metric for soft decoding a forward error control (FEC) code.
Instead of using two filter steps in the linear filter 112, namely the filter steps according to (8) and (11), the linear filter 112 may also generate xg,ZF in one filter step which is a combination of these two filter steps according to
where
In a FEC-coded system, for example when the bit streams 107, 108, 109 are convolutionally encoded, the data received from each STBC transmitter group 103, 104, 105 can be processed independently based on (10). Since the noise variance of {tilde over (n)}g is
the optimal metric used for soft decoding is given as
The second form of metric calculation, i.e. the right hand side of (14), is more desirable since ∥xg,ZF−xg∥2 can be computed easily element by element for a fixed constellation of a transmitted symbol. Therefore, in one embodiment, this form is used for metric calculation.
In this case xg,ZF and the channel normalization factor defined as
are explicitly calculated.
The remaining required operations according to (14) can be carried out by standard computation and the results can be passed as soft input to a Viterbi decoder.
The detection according to (11) (denoted by ZFIS Option 1) and (12) (denoted by ZFIS Option 2) are based on constraint A. When the detection is based on constraint B, and G is calculated as the inverse of H (denoted by ZFIS Option 3), a matrix inversion of H is required. In this case, the complexity can be reduced if the matrix H is inverted efficiently.
In the specific case Lr=Lg=3, the channel matrix H is a 6×6 matrix which consists of nine Alamouti matrices. Additions can be carried out much faster than multiplications and, in view of a hardware implementation, less silicon area or chip space is required to implement additions than multiplications. Therefore, when considering complexity, only the number of multiplications and divisions is taken into account. A distinction is made between the number of real operations and the number of complex operations. In terms of complexity and necessary hardware, each complex multiplication can be expressed as a multiple of a real multiplication, usually a multiple of 3 or 4.
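The factor of 3 or 4 can be made concrete. The following sketch shows the direct form of a complex multiplication with 4 real multiplications and the well-known rearrangement (often attributed to Gauss) that needs only 3 real multiplications at the cost of additional additions. The function names are illustrative.

```python
def cmul4(a, b, c, d):
    """(a + jb)(c + jd) using 4 real multiplications and 2 additions."""
    return a * c - b * d, a * d + b * c

def cmul3(a, b, c, d):
    """(a + jb)(c + jd) using 3 real multiplications and 5 additions."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2

# Both forms give the same real and imaginary parts
print(cmul4(1, 2, 3, 4), cmul3(1, 2, 3, 4))
```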
The ZFIS algorithms introduced above require matrix additions, matrix multiplications and matrix inversions. Since Alamouti matrices are closed under addition and multiplication, when these operations are carried out for two Alamouti matrices, only the first row of the resulting matrix needs to be calculated. The second row can be obtained by sign change and/or conjugate operations. Therefore, for a multiplication of two 2×2 Alamouti matrices, only 4 complex multiplications are required.
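As a sketch of this saving, the product of two Alamouti matrices can be formed from its first row alone. The layout [[a, b], [−b*, a*]] is assumed here, as implied by the multiplication rule in (31), and the function names are illustrative.

```python
import numpy as np

def alamouti(a, b):
    """Assumed Alamouti layout [[a, b], [-conj(b), conj(a)]]."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]], dtype=complex)

def alamouti_product_first_row(a1, b1, a2, b2):
    """First row of the product, using exactly 4 complex multiplications."""
    a = a1 * a2 - b1 * np.conj(b2)
    b = a1 * b2 + b1 * np.conj(a2)
    return a, b

a1, b1 = 1 + 2j, 3 - 1j
a2, b2 = -1 + 1j, 2 + 0.5j

a, b = alamouti_product_first_row(a1, b1, a2, b2)
product = alamouti(a, b)  # second row needs only sign changes and conjugation
reference = alamouti(a1, b1) @ alamouti(a2, b2)  # full 2x2 product, 8 multiplications
print(np.allclose(product, reference))
```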
From (6), it can be seen that a matrix inversion of a 2×2 Alamouti matrix can be done by scaling and reordering of matrix elements. If
the norm of a is given by
∥a∥^2=aR^2+aI^2+bR^2+bI^2 (16)
where the subscript R denotes the real part of the respective complex number and I denotes the imaginary part of the respective complex number. ∥a∥2 can therefore be computed using 4 real multiplications. Taking the reciprocal of ∥a∥2 requires 1 real division. For multiplying
with the real parts and imaginary parts of the first row of the matrix
to get the first row of the inverse matrix (the second row can, as mentioned above, be calculated without further multiplications), 4 real multiplications are additionally needed. For a multiplication of a matrix A with the identity matrix, no operations are required since AI=IA=A.
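The operation count can be followed in a short sketch. It assumes the layout [[a, b], [−b*, a*]] and uses the fact, which follows from (5), that the inverse equals the Hermitian transpose divided by ∥a∥^2. The function names are illustrative.

```python
import numpy as np

def alamouti(a, b):
    """Assumed Alamouti layout [[a, b], [-conj(b), conj(a)]]."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]], dtype=complex)

def alamouti_inverse_first_row(a, b):
    """First row of the inverse: 8 real multiplications and 1 real division."""
    aR, aI, bR, bI = a.real, a.imag, b.real, b.imag
    norm_sq = aR * aR + aI * aI + bR * bR + bI * bI  # (16): 4 real multiplications
    s = 1.0 / norm_sq                                # 1 real division
    # First row of A^H is (conj(a), -b); scaling it by s costs 4 real multiplications
    inv_a = complex(aR * s, -(aI * s))
    inv_b = complex(-(bR * s), -(bI * s))
    return inv_a, inv_b

a, b = 1 + 2j, 3 - 1j
inv_a, inv_b = alamouti_inverse_first_row(a, b)
inverse = alamouti(inv_a, inv_b)  # second row by sign change and conjugation
print(np.allclose(inverse @ alamouti(a, b), np.eye(2)))
```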
The channel matrix H is estimated in a channel training phase. Subsequently, the receiver 102 receives the data vector r which is filtered to reconstruct the sent data vector x.
In one embodiment, the detection method is divided into two stages, a pre-computation stage (before the vector r has been received or is processed) and a filtering stage (when the vector r is processed).
In the pre-computation stage, ρg is calculated according to (15) and all computations are carried out that allow the calculation of xg,ZF according to (11) or (12) respectively as soon as the vector r is available. Specifically, in the pre-computation stage, (after the channel has been estimated and the matrix H has been generated), the following steps are carried out:
(a) The matrix G and
are calculated
(b) The matrices {tilde over (H)}g and {tilde over (H)}g−1 and ∥{tilde over (h)}g∥2 are calculated
(c) The matrix G′ is calculated (this requires G and {tilde over (H)}g−1 and is only done when ZFIS Option 2 is used)
(d) ρg is calculated according to (15)
ZFIS Option 1 requires less pre-computation than ZFIS Option 2 but uses a two-step filter instead of a one-step filter.
In the filtering stage in case of ZFIS Option 1, the following step is carried out
(a1) {tilde over (r)}=Gr and xg,ZF={tilde over (H)}g−1{tilde over (r)}g are calculated.
In the filtering stage in case of ZFIS Option 2, the following step is carried out
(a2) xZF=G′r is calculated.
In the example with Lr=Lg=3, the ZFIS filter matrix G has the form
where Ggg=I (g=1, 2, 3) and the other 2×2 square sub matrices are calculated according to (9) by
The terms for Gij (i≠j according to (18)) consist of two factors (enclosed in square brackets). They are denoted as the left side factor and the right side factor according to the order in which they are written in (18). Since Ggg=I (g=1, 2, 3), the matrices Ggg do not need to be calculated.
The right side factors of all Gij can be calculated first using the matrices given in the second column of table 1. The left side factors can be calculated subsequently using the matrices in the second column and the third column.
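TABLE 1
 | Products used for the right side factors | Further products used for the left side factors |
---|---|---|
 | H23H33^H, H32H22^H | H13H23^H |
 | H13H33^H, H31H11^H | H21H31^H |
 | H12H22^H, H21H11^H | H32H12^H |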
The required complexity (for 9 HiHj^H terms) is:
9×4=36 complex multiplications for HiHj^H
9×4=36 real multiplications for ∥hi∥^2
For example, the sub matrix G12 is calculated according to
where
C=∥h33∥^2 H22H32^H−∥h32∥^2 H23H33^H (20)
The required complexity (for 6 Gij terms where i≠j) is:
6×4×4=96 real multiplications for the real scalar-multiplications written in square brackets in (19)
6×4=24 real multiplications for ∥c∥2
6×1 real divisions for reciprocal of ∥c∥2
6×4=24 complex multiplications for the matrix multiplications
6×4=24 real multiplications for real scalar-matrix normalization
For the calculation of
in step (a) in the case Lr=Lg=3, i.e. for the calculation of 6 ∥ggk∥2 terms with g≠k (since ∥ggg∥2=1) 6×4=24 real multiplications are required.
These terms are combined according to
In stage (b),
is calculated. For example, when Lr=Lg=3
{tilde over (H)}1=H11+G12H21+G13H31
{tilde over (H)}2=G21H12+H22+G23H32
{tilde over (H)}3=G31H13+G32H23+H33 (23)
This requires 3×2×4=24 complex multiplications (for the three {tilde over (H)}g terms).
Further, in step (b), the matrix {tilde over (H)}g−1 and the value ∥{tilde over (h)}g∥2 need to be calculated. For Lr=Lg=3, this requires
3×4=12 real multiplications for 3 ∥{tilde over (h)}g∥2 terms
3×1=3 real divisions for the reciprocals of ∥{tilde over (h)}g∥2
3×4=12 real multiplications for 3 scalar-matrix multiplications.
In step (c), G′ is calculated, in case Lr=Lg=3 this is
This requires 6×4=24 complex multiplications.
As mentioned above, this is not done in ZFIS Option 1.
In step (d) (15) is calculated. This requires 3 real multiplications if Lr=Lg=3.
In the filtering step according to ZFIS Option 1, 3×8+18=42 complex multiplications are required for {tilde over (r)}=Gr and xg,ZF={tilde over (H)}g−1{tilde over (r)}g if Lr=Lg=3. Since Ggg=I, {tilde over (r)}=Gr may be simplified.
In the filtering step according to ZFIS Option 2, 6×6=36 complex multiplications are required for xZF=G′r if Lr=Lg=3.
The complexities of ZFIS Options 1 and 2 are summarized in tables 2 and 3 for Lr=Lg=3.
Gij (i ≠ j)
{tilde over (H)}g^−1, ∥{tilde over (h)}g∥^2
G′
xZF = G′r

Gij (i ≠ j)
{tilde over (H)}g^−1, ∥{tilde over (h)}g∥^2
{tilde over (r)} = Gr
xZF,g = {tilde over (H)}g^−1{tilde over (r)}
In one embodiment where ZFIS Option 3 is used, the inversion of the matrix H is carried out according to the matrix inversion scheme that Strassen published in 1969 alongside his fast matrix multiplication approach (see [1]) and which uses a divide and conquer approach.
For inverting a matrix H having the structure
where aij and cij can be scalars or sub matrices of appropriate dimensions, the following computations are carried out:
R1=(a11)^−1
R2=a21×R1
R3=R1×a12
R4=a21×R3
R5=R4−a22
R6=(R5)^−1
c12=R3×R6
c21=R6×R2
R7=R3×c21
c11=R1−R7
c22=−R6 (26)
This can be done for a matrix of any dimension having the structure according to (25). However, the above operations have to be carried out serially.
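The steps (26) translate directly into code. The following is a minimal sketch, assuming that (25) partitions H into the blocks a11, a12, a21, a22 and the inverse into the blocks c11, c12, c21, c22 at the same positions. The function names, the `split` parameter and the randomly drawn test matrix are illustrative assumptions.

```python
import numpy as np

def strassen_inverse(H, split, inv=np.linalg.inv):
    """Invert H by the steps (26); `split` is the size of the square block a11."""
    a11, a12 = H[:split, :split], H[:split, split:]
    a21, a22 = H[split:, :split], H[split:, split:]
    R1 = inv(a11)
    R2 = a21 @ R1
    R3 = R1 @ a12
    R4 = a21 @ R3
    R5 = R4 - a22
    R6 = inv(R5)
    c12 = R3 @ R6
    c21 = R6 @ R2
    R7 = R3 @ c21
    c11 = R1 - R7
    c22 = -R6
    return np.block([[c11, c12], [c21, c22]])

def nested_inverse(m):
    """Apply (26) again to blocks larger than 2x2, invert 2x2 blocks directly."""
    return strassen_inverse(m, 2) if m.shape[0] > 2 else np.linalg.inv(m)

rng = np.random.default_rng(0)
# Well-conditioned complex 6x6 test matrix (random values plus a strong diagonal)
H = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6)) + 10 * np.eye(6)

H_inv = strassen_inverse(H, 4, inv=nested_inverse)  # 4x4 block a11, 2x2 block a22
H_inv_alt = strassen_inverse(H, 2)                  # 2x2 block a11, 4x4 block a22
print(np.allclose(H_inv @ H, np.eye(6)))
```

Each step depends on the result of an earlier one, which reflects the serial nature of the scheme.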
For example, in the case Lr=Lg=3, the channel matrix H is a 6×6 matrix consisting of 9 sub matrices Hij (i, j=1, 2, 3), which may be partitioned into the structure (25) according to
For higher numbers of Lr and Lg, H can be partitioned in such a way that the lower-right submatrix of H, a22, in this example
has the largest possible dimension (being a power of 2) such that the STBC matrix H11 is not partitioned.
If needed for higher numbers of Lr and Lg, the partitioning can be done recursively such that there is one sub-matrix (e.g. a22) with the largest possible dimension (being a power of 2) making sure that the STBC matrix H11 is not partitioned. This sub-matrix can then be partitioned itself and inverted according to the inverting scheme above.
In the example Lr=Lg=3, another partitioning is possible, for example by forming a 2×2 sub matrix in the upper left of H. When equations (26) are applied to this partitioning of H, the matrices generated and multiplied according to equations (26) have compatible dimensions, so that no multiplication that is impossible in terms of matrix dimensions (for example of a 4×4 matrix with a 2×2 matrix) has to be carried out. The inversion of the lower right 4×4 matrix
may also be done according to the equations (26). Since the Hij (i, j=1, 2, 3) have Alamouti structure (and these are closed under multiplication, addition and inversion), only the first rows of the matrices need to be computed. Further, simplifications for calculation of inverse matrices of 2×2 matrices and calculation of determinants of 2×2 matrices can be used.
With G=H^−1, the filtered channel matrix {tilde over (H)} is an identity matrix. In this case, ∥{tilde over (h)}g∥^2 need not be computed.
The complexity of ZFIS Option 3 is summarized in table 4 for Lr=Lg=3. It involves all the necessary computations for generating the estimate xZF,g (g=1, 2, 3) and the channel normalization factor according to (14) for the decoder (for example a Viterbi decoder).
H^−1
xZF,g = {tilde over (H)}g^−1{tilde over (r)}g
The ZFIS algorithm is a generalization of the zero-forcing (ZF) detector. A minimum mean squared error (MMSE) detector can also be used by the filter unit 112. In this case, the filter matrix is given by
As in the case G=H−1, the inverse of a matrix needs to be calculated. Since
also consists of Alamouti sub matrices, the method explained above for low-complexity matrix inversion can be used as well.
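The structural claim can be checked numerically. The following sketch assumes the standard linear MMSE form G=(H^H H+σn^2 I)^−1 H^H, which may differ in detail from the filter matrix of this embodiment, and the Alamouti layout [[a, b], [−b*, a*]]. The helper names, the random channel and the noise variance value are illustrative.

```python
import numpy as np

def alamouti(a, b):
    """Assumed Alamouti layout [[a, b], [-conj(b), conj(a)]]."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]], dtype=complex)

def blocks_are_alamouti(M, tol=1e-9):
    """True if every 2x2 block of M has the assumed Alamouti layout."""
    n = M.shape[0] // 2
    for i in range(n):
        for j in range(n):
            blk = M[2 * i:2 * i + 2, 2 * j:2 * j + 2]
            if (abs(blk[1, 0] + np.conj(blk[0, 1])) > tol
                    or abs(blk[1, 1] - np.conj(blk[0, 0])) > tol):
                return False
    return True

rng = np.random.default_rng(0)

def random_complex():
    return complex(rng.standard_normal(), rng.standard_normal())

H = np.block([[alamouti(random_complex(), random_complex())
               for _ in range(3)] for _ in range(3)])
sigma2 = 0.1  # assumed noise variance

A = H.conj().T @ H + sigma2 * np.eye(6)  # matrix to be inverted (assumed MMSE form)
G_mmse = np.linalg.inv(A) @ H.conj().T
print(blocks_are_alamouti(A), blocks_are_alamouti(G_mmse))
```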
In the following, an example is given for the inversion of a 6×6 matrix comprising 2×2 matrices with Alamouti structure. The matrix to be inverted is given by
The objective is to compute
Given two matrices
represented as S1={a1, b1, k1} and S2={a2, b2, k2}, the following rules are used:
S1+S2≡{k2 a1+k1 a2, k2 b1+k1 b2, k1 k2}
S1−S2≡{k2 a1−k1 a2, k2 b1−k1 b2, k1 k2}
S1 S2≡{a1 a2−b1 b2*, a1 b2+b1 a2*, k1 k2}
S1^−1≡{k1 a1*, −k1 b1, |a1|^2+|b1|^2}
−S1≡{−a1, −b1, k1} (31)
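The rules (31) can be sketched as a small class. This assumes that the triple {a, b, k} stands for the matrix (1/k)·[[a, b], [−b*, a*]] with a real scale factor k, which is the reading under which the rules hold. The class name `ScaledAlamouti` and its methods are hypothetical.

```python
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class ScaledAlamouti:
    """Triple {a, b, k}, assumed to represent (1/k) * [[a, b], [-conj(b), conj(a)]]."""
    a: complex
    b: complex
    k: float

    def __add__(self, o):
        return ScaledAlamouti(o.k * self.a + self.k * o.a,
                              o.k * self.b + self.k * o.b, self.k * o.k)

    def __sub__(self, o):
        return ScaledAlamouti(o.k * self.a - self.k * o.a,
                              o.k * self.b - self.k * o.b, self.k * o.k)

    def __mul__(self, o):
        return ScaledAlamouti(self.a * o.a - self.b * np.conj(o.b),
                              self.a * o.b + self.b * np.conj(o.a), self.k * o.k)

    def __neg__(self):
        return ScaledAlamouti(-self.a, -self.b, self.k)

    def inverse(self):
        norm_sq = (self.a.real ** 2 + self.a.imag ** 2
                   + self.b.real ** 2 + self.b.imag ** 2)
        return ScaledAlamouti(self.k * np.conj(self.a), -self.k * self.b, norm_sq)

    def to_matrix(self):
        """Recover the explicit 2x2 matrix; the only place where a division occurs."""
        m = np.array([[self.a, self.b], [-np.conj(self.b), np.conj(self.a)]])
        return m / self.k

S1 = ScaledAlamouti(1 + 2j, 3 - 1j, 2.0)
S2 = ScaledAlamouti(-1 + 1j, 0.5j, 4.0)
M1, M2 = S1.to_matrix(), S2.to_matrix()
print(np.allclose((S1 * S2).to_matrix(), M1 @ M2))
```

Because the scale factor is carried along as a separate number, sums, products and inverses need no division until `to_matrix` is called.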
The solution to the 6×6 matrix inversion is obtained by first performing matrix inversion on the 4×4 matrix
corresponding to (26) according to the equations:
R1=m11^−1
R2=m21R1
R3=R1m12
R4=m21R3
R5=R4−m22
R6=R5^−1
c12=R3R6
c21=R6R2
R7=R3c21
c11=R1−R7
c22=−R6 (33)
Equations (26) are then iteratively used a second time such that the final solution is calculated according to
Q2,11a=m31c11
Q2,11b=m32c21
Q2,11=Q2,11a+Q2,11b
Q2,12a=m31c12
Q2,12b=m32c22
Q2,12=Q2,12a+Q2,12b
Q3,11a=c11m13
Q3,11b=c12m23
Q3,11=Q3,11a+Q3,11b
Q3,21a=c21m13
Q3,21b=c22m23
Q3,21=Q3,21a+Q3,21b
Q4a=m31Q3,11
Q4b=m32Q3,21
Q4=Q4a+Q4b
Q5=Q4−m33
Q6=Q5^−1
g13=Q3,11Q6
g23=Q3,21Q6
g31=Q6Q2,11
g32=Q6Q2,12
Q7,11=Q3,11g31
Q7,12=Q3,11g32
Q7,21=Q3,21g31
Q7,22=Q3,21g32
g11=c11−Q7,11
g12=c12−Q7,12
g21=c21−Q7,21
g22=c22−Q7,22
g33=−Q6 (34)
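The sequences (33) and (34) can be traced in a short numerical sketch. It works directly on 2×2 complex blocks instead of the triple notation, assumes the Alamouti layout [[a, b], [−b*, a*]], and inverts every 2×2 block by conjugate transposition and scaling, which is valid because all intermediate blocks remain Alamouti matrices. The test matrix and helper names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def alamouti(a, b):
    """Assumed Alamouti layout [[a, b], [-conj(b), conj(a)]]."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]], dtype=complex)

def random_alamouti(offset=0.0):
    a, b = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    return alamouti(a + offset, b)

def inv2(m):
    """Inverse of a 2x2 Alamouti matrix: Hermitian transpose divided by the squared norm."""
    return m.conj().T / (abs(m[0, 0]) ** 2 + abs(m[0, 1]) ** 2)

# 3x3 grid of Alamouti blocks; the diagonal offset keeps the example well conditioned
m = [[random_alamouti(4.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
(m11, m12, m13), (m21, m22, m23), (m31, m32, m33) = m

# (33): inversion of the upper left 4x4 matrix
R1 = inv2(m11); R2 = m21 @ R1; R3 = R1 @ m12; R4 = m21 @ R3
R5 = R4 - m22; R6 = inv2(R5)
c12 = R3 @ R6; c21 = R6 @ R2; R7 = R3 @ c21
c11 = R1 - R7; c22 = -R6

# (34): second application of (26) to the full 6x6 matrix
Q2_11 = m31 @ c11 + m32 @ c21
Q2_12 = m31 @ c12 + m32 @ c22
Q3_11 = c11 @ m13 + c12 @ m23
Q3_21 = c21 @ m13 + c22 @ m23
Q4 = m31 @ Q3_11 + m32 @ Q3_21
Q5 = Q4 - m33
Q6 = inv2(Q5)
g13 = Q3_11 @ Q6; g23 = Q3_21 @ Q6
g31 = Q6 @ Q2_11; g32 = Q6 @ Q2_12
g11 = c11 - Q3_11 @ g31; g12 = c12 - Q3_11 @ g32
g21 = c21 - Q3_21 @ g31; g22 = c22 - Q3_21 @ g32
g33 = -Q6

M = np.block(m)
G = np.block([[g11, g12, g13], [g21, g22, g23], [g31, g32, g33]])
print(np.allclose(G @ M, np.eye(6)))
```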
Divisions are only required when the matrices are recovered to their final form from the notation {a1, b1, k1} to
All operations above are operations on 2×2 matrices.
The reciprocal computation can be done using the Newton Raphson method. This is based on the method to obtain a zero of a function f (the value of x for which f(x)=0). A guess for a zero x[j], i.e. an approximation of a zero of the function, is iteratively enhanced by
x[j+1]=x[j]−f(x[j])/f′(x[j]) (35)
where f′(x[j]) is evaluated at x[j].
This can be applied to the function f(R)=1/R−k, which has a zero at R=1/k, for computing the reciprocal of the value k. The iterative formula is in this case given by
R[j+1]=R[j](2−R[j]k) (36)
where R[0] is an initial guess, i.e. the initial approximation. The iterative formula can be re-written as
R[j+1]=R[j]+R[j]−(R[j]k)R[j] (37)
for computation using a RISC-like architecture. Quadratic convergence can be achieved. The better the initial guess, the faster the convergence. Multiple initial guesses can be used for different ranges of the value of k.
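The iteration (37) can be sketched in a few lines. For positive k it converges whenever 0<R[0]<2/k, since the error 1−R[j]k is squared in every step. The initial guess rule below, a power of two chosen from the binary exponent of k, is a hypothetical stand-in for the range-dependent initial guess selection and is not taken from the embodiment.

```python
import math

def initial_guess(k):
    """Hypothetical guess: 2^-e for k = m * 2^e with 0.5 <= m < 1, so that 0.5 <= R*k < 1."""
    _, e = math.frexp(k)
    return math.ldexp(1.0, -e)

def reciprocal_newton(k, r0=None, iterations=6):
    """Approximate 1/k for k > 0 with the division-free iteration (37)."""
    r = initial_guess(k) if r0 is None else r0
    for _ in range(iterations):
        r = r + r - (r * k) * r  # two multiplications, one addition, one subtraction
    return r

for k in (3.0, 0.02, 1000.0):
    print(k, reciprocal_newton(k), 1.0 / k)
```

With this guess the initial error is at most 0.5, so six iterations are enough to reach double precision.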
Since only one division at the end of the procedure is necessary, hardware implementation is facilitated. However, this also enlarges the range of the values that appear in the course of the procedure. Numeric simulations show that the bit widths of some intermediate results may exceed 20 bits. Therefore, in one embodiment, intermediate data is dynamically scaled.
In the following, a possible hardware architecture for carrying out the inversion process described above is described with reference to
The processor element 200 comprises a dual-port memory 201 for storing the input and output, two hardware multipliers 202 for multiplying two words, two adding/subtracting units 203 for adding/subtracting, a dynamic scale unit comprising a first scale unit 204, a second scale unit 205 and a scale function unit 206, pipeline registers to store intermediate results, a plurality of multiplexers 208 to direct the data flow and initial guess selection units 209.
Since the memory 201 is a dual-port memory, data stored at the same address can be simultaneously accessed at two output ports. Therefore, operations on the same two complex numbers (multiplication, addition, subtraction) are possible as well as operations on one complex number (squaring, determination of absolute value). The memory is for example operated at 80 MHz, wherein the read cycles and write cycles are alternated (and thus are each clocked at 40 MHz).
The multiplexers denoted by RMUX0, IMUX0, RMUX1, IMUX1 in
The inversion method based on the Newton Raphson method as explained above is used for reciprocal calculation. When the first iteration is to be initiated, the signal denoted by INIT in
Since there are two independent multipliers 202, the reciprocals of two numbers can be calculated simultaneously using the iterative Newton Raphson method. This allows an efficient use of hardware resources.
The processing element 200 is for example used for each sub carrier of an OFDM system. For example, in the common case of 48 sub carriers, 48 processing elements 200 are used in parallel. The processing element has a RISC-like architecture and is optimized for the 5 basic matrix operations and reciprocal calculation. It can be implemented with high speed (e.g. in an ASIC) for a low latency or for allowing processing of more than one sub carrier by the same processing element 200. The scaling decreases the risk of numerical underflow or overflow during the processing.
In the following, a processing flow for detection according to one embodiment of the invention is described.
In step 301, a first matrix comprising components describing characteristics of the communication channel is determined. This may be the channel matrix H itself or another matrix, for example the matrix
for use in an MMSE filter.
The first matrix is inverted by the following steps. In step 302, the first matrix is sub-divided into at least four sub matrices. For example, the matrix is written according to equation (25).
In step 303, a first sub matrix of the four sub matrices is inverted. In the example using the formulas (26), the first sub matrix would be a11 which is inverted to get R1.
In step 304, a second matrix (R4 in the example using (26)) is generated by multiplying a second sub matrix (a21 in the example using (26)) of the four sub matrices with the inverted first sub matrix and a third sub matrix of the four sub matrices (a12 in the example using (26)).
In step 305, the difference matrix (R5 in the example using (26)) between the second matrix and a fourth sub matrix (a22 in the example using (26)) of the four sub matrices is determined.
In step 306, the difference matrix is inverted (the result would be R6 in the example using (26)).
In step 307, the inverted matrix is calculated based on the inverted difference matrix and the data signal is processed using the inverted first matrix. The inverted first matrix is for example used as a part of a filter matrix or a filter matrix is generated from the inverted first matrix for filtering the received data signal.
In this document, the following publication is cited:
The present application claims the benefit of U.S. provisional application 60/774,251, filed 16 Feb. 2006, the entire contents of which is incorporated herein by reference for all purposes.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SG2007/000048 | 2/14/2007 | WO | 00 | 5/17/2010 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2007/094745 | 8/23/2007 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
20080279091 | Zhang et al. | Nov 2008 | A1 |
20110150145 | Tsai | Jun 2011 | A1 |
Number | Date | Country | |
---|---|---|---|
20100220815 A1 | Sep 2010 | US |
Number | Date | Country | |
---|---|---|---|
60774251 | Feb 2006 | US |