1. Field of the Invention
The present invention relates to wireless communication systems, such as but not limited to wireless local area networks (WLANs), and in particular, to an 802.11b receiver that contains a packet-based multiplication-free CCK demodulator with a Fast Multipath Interference Cipher (FMIC).
2. Description of the Prior Art
U.S. patent application Publication No. US 2001/0036223 to Webster et al. (“Webster”) discloses a RAKE receiver that is used for indoor multipath WLAN applications on direct spread spectrum signals having relatively short codeword lengths. The RAKE receiver has an embedded decision feedback equalizer structure in the signal processing path through the receiver's channel matched filter and codeword correlator. The decision feedback equalizer serves to cancel inter-codeword interference (also known as Inter-Symbol Interference, ISI) (i.e., bleed-over between CCK codewords).
In
In another embodiment illustrated in
In yet another embodiment in
In summary, Webster's architecture requires significant hardware complexity and the execution of large numbers of complex operations (complex multiplications and additions that are required for complex convolution and complex correlation). As a result, large power consumption, complex hardware, and long processing times will be required to implement the embodiments described in
It is an object of the present invention to provide a CCK receiver which utilizes a reduced number of complex operations.
It is another object of the present invention to simplify the hardware for a CCK receiver.
In order to accomplish the objects of the present invention, the present invention provides an optimal algorithmic structure for use in a RAKE receiver to compute multipath interferences (MPIs) required for canceling intra-codeword chip interference (ICI). The algorithmic structure jointly computes the ICI of a plurality of possible codewords. This algorithmic structure is similar to the optimal architecture required for CCK correlation computations. Therefore, the present invention takes advantage of this similarity between these two structures and uses the same hardware at different times to compute MPIs and to compute the CCK correlation of a plurality of possible codewords.
Further scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The present invention will become more fully understood from the detailed description given hereinbelow and the accompanying drawings which are given by way of illustration only, and thus are not limitative of the present invention, and wherein:
The following detailed description is of the best presently contemplated modes of carrying out the invention. This description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating general principles of embodiments of the invention. The scope of the invention is best defined by the appended claims. In certain instances, detailed descriptions of well-known devices, components, mechanisms and methods are omitted so as to not obscure the description of the present invention with unnecessary detail.
A simplified 802.11b Packet format is shown in
A receiver 20 is illustrated in
The receiver 20 has a CIR (channel impulse response) estimation block 24 having inputs coupled to the first selector 22, a first set of outputs coupled to a Channel Matched Filter (CMF) block 28 (which has another set of inputs coupled to the first selector 22, and an output coupled to a multiplexer block 34), and a second set of outputs coupled to the second selector block 26. The multiplexer 34 (MUX) has inputs coupled to the CMF block 28 and outputs coupled to the second selector 26. A Shared Hardware (SH) block 36 has inputs coupled to the second selector 26 and outputs coupled to a CCK Decoder 38. As described below, the SH block 36 can be either a Fast-Multipath-Interference-Cipher (FMIC) block 36a or a CCK Correlator 36b, depending on the Mode of operation. The output of the CCK Decoder 38 is provided to an input of a CCK remodulation (Remod) block 32, which has outputs coupled to the DFE block 30. The Mode 1 and Mode 2 signals are provided to the selectors 22, 26, the SH block 36, and the CCK Decoder 38. During Mode 1 operation, the second selector 26 routes the 8 inputs from the CIR Estimation block 24 to the SH block 36. During Mode 2 operation, the second selector 26 routes the 8 inputs from the MUX 34 to the SH block 36. The details of the operations will be described below. The output of the CCK Decoder 38 is an 8-bit index of the decoded CCK codeword. This index also represents the 8-bit decoded data.
Although it is not shown in
In
The CIR Estimation block 24 works in Mode 1 during preamble processing for each packet. During the Mode 1 operation, the CIR Estimation block 24 (1) uses Barker code correlation to determine an estimated channel impulse response (CIR), (2) uses the estimated CIR to generate the corresponding CMF taps as outputs for the CMF block 28 to use in Mode 2, and (3) synthesizes the feedback (FB) and feed-forward (FF) taps as outputs for the Selector block 26. Let hCIR(n), where n=−j, −j+1, . . . , 0, . . . , m, denote the estimated CIR, with n=0 being the tap with the maximum power, and let hCMF(n) denote the complex-valued CMF taps. Then
hCMF(n)=hCIR*(−n),
where ( )* denotes complex conjugate. One can then derive the multipath profile, hMP(n);
hMP(n)=hCIR(n)*hCMF(n),
wherein ( )*( ) denotes convolution. The FB taps, Bk=hMP(k), where k>0. The FF taps, Fk's, are the taps which precede the main tap hMP(0), i.e., Fk=hMP(−k), where k>0. For ICI cancellations, these FB and FF taps represent an estimate of the joint effect of the multipath channel (via an estimated CIR) and the CMF. For convenience, we will, hereafter, refer to the FF and FB taps as multipath profile.
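The tap synthesis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary representation keyed by tap index n, and the 3-tap example channel are hypothetical and do not come from the specification.

```python
def multipath_profile(h_cir):
    """Return (h_cmf, h_mp) for a CIR given as {tap index n: complex gain}."""
    # hCMF(n) = conj(hCIR(-n))
    h_cmf = {-n: complex(v).conjugate() for n, v in h_cir.items()}
    # hMP(n) = hCIR(n) convolved with hCMF(n)
    h_mp = {}
    for a, va in h_cir.items():
        for b, vb in h_cmf.items():
            h_mp[a + b] = h_mp.get(a + b, 0j) + va * vb
    return h_cmf, h_mp


def fb_ff_taps(h_mp):
    """Split the multipath profile into FB taps B_k and FF taps F_k (k > 0)."""
    fb = {k: v for k, v in h_mp.items() if k > 0}    # B_k = hMP(k)
    ff = {-k: v for k, v in h_mp.items() if k < 0}   # F_k = hMP(-k)
    return fb, ff


# Hypothetical 3-tap CIR: one pre-cursor, the main tap, one post-cursor.
h_cir = {-1: 0.2j, 0: 1.0 + 0j, 1: 0.5 + 0j}
h_cmf, h_mp = multipath_profile(h_cir)
fb, ff = fb_ff_taps(h_mp)
```

In this example the main tap hMP(0) equals the total channel energy (1.29), and each FF tap is the complex conjugate of the FB tap with the same index, as expected when a channel is convolved with its own matched filter.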
The MPIs provided by the FMIC block 36a are a post-correlation representation of the Intra-codeword Chip Interference (ICI) due to the currently received CCK symbol (8 chips long). The multipath profile is assumed to be unchanged during a packet period. Therefore, the MPIs are calculated only once for each packet in Mode 1. These MPIs will be used later in Mode 2 when the CCK Decoder 38 cancels out the effect of the ICI by subtracting the corresponding MPI values from its correlator outputs. The same hardware that is used to implement the FMIC operation in Mode 1 is used to implement the CCK Correlator block 36b in Mode 2. In other words, the same hardware is “shared”, or used to perform two different functions at two different times.
During Mode 2 operation, the CIR Estimation block 24 provides the CMF tap weights for the CMF block 28 to coherently combine the energy of the signal received through a multipath channel. The DFE block 30 uses the FB tap weights from the CIR Estimation block 24 to cancel the inter-symbol interference (ISI) from the previous CCK symbol. The previous symbol was decoded in the CCK Decoder block 38 and re-modulated by the CCK Remod block 32.
The MUX 34 groups every eight complex chips (one CCK symbol) after the CMF block 28. The second selector 26 then routes the outputs from MUX 34 to the SH block 36. The SH block 36 operates as a CCK Correlator 36b during Mode 2. Finally, the CCK Decoder 38 is utilized to find an index of a CCK codeword having the maximum correlation between the received signal (represented by 8 received chips) and all possible CCK codewords. This index is the decoded 8-bit data.
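The maximum-correlation decision described above can be illustrated with a brute-force search over all 256 codewords. This sketch does not use the FWT or the shared hardware. It assumes the standard 802.11b CCK codeword structure, uses the real part of the correlation as the decision metric, and packs the four phases into an 8-bit index with a hypothetical bit ordering (`phases_to_index`). All names are illustrative.

```python
from itertools import product

UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def decode_cck(r):
    """Return the phase tuple whose codeword best matches the 8 chips in r."""
    def metric(q):
        c = cck_codeword(*q)
        return sum((ci.conjugate() * ri).real for ci, ri in zip(c, r))
    return max(product(range(4), repeat=4), key=metric)


def phases_to_index(q):
    """Hypothetical packing of the four 2-bit phases into one 8-bit index."""
    q1, q2, q3, q4 = q
    return (q1 << 6) | (q2 << 4) | (q3 << 2) | q4


sent = (1, 2, 3, 0)
# Mildly distorted copy of the transmitted chips (hypothetical distortion).
received = [0.9 * chip + 0.05 for chip in cck_codeword(*sent)]
decoded = decode_cck(received)
```

The search evaluates every codeword independently, which is exactly the cost that a joint transform such as the FWT is meant to avoid.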
The operations of the Shared Hardware 36 are shown in
The CCK Decoder 38 is illustrated in
A CCK modulated symbol (8 chips long) designed for an 802.11b system is described in IEEE Std. 802.11b, D8.0, IEEE Standards Department, Piscataway, N.J., September 2001, whose disclosure is incorporated by this reference as though set forth fully herein. Each codeword described by a vector is given by
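C≡[ck]=[e^{j(φ1+φ2+φ3+φ4)}, e^{j(φ1+φ3+φ4)}, e^{j(φ1+φ2+φ4)}, −e^{j(φ1+φ4)}, e^{j(φ1+φ2+φ3)}, e^{j(φ1+φ3)}, −e^{j(φ1+φ2)}, e^{jφ1}]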
Each of the four encoded phases, φ1, φ2, φ3, and φ4, has one of the following four values: 0, π/2, π, or 3π/2. Therefore, there are 4^4=256 codewords. An optimal demodulator finds the one of the 256 codewords which has the maximum correlation with the received CCK symbol. The 64 CCK correlation outputs generated from a well-known Fast-Walsh-Transform (FWT) are given by:
where ri (i=1, 2, 3, . . . , 8) is the received CCK symbol (8 complex chips), and the notations, star ‘*’ and minus ‘−’, denote the complex conjugate and the complex negative (or subtraction) of a complex number, respectively. The Fast Walsh Transform (FWT) is a well-known algorithm that calculates all 64 or 256 correlations jointly with a minimal number of mathematical operations (multiplications and/or additions) and hardware processing time. Features of a FWT include:
R=h0C+B1[C(+1)]+F1C(−1)
The first term in the above equation represents the desired signal vector. The post- and pre-cursor ICIs are described as the right- and left-shifts of the transmitted codeword vector, C, respectively. Then, the CCK Decoder 38 is applied to find one codeword, out of 256 correlation results, having the maximal correlation, R^H C, with the received signal vector, i.e.:
R^H C = h0* C^H C + B1* [C(+1)]^H C + F1* [C(−1)]^H C
where the superscript H denotes the complex conjugate transpose operation and the superscript * denotes the complex conjugate. The last two terms in the above equation are the post-correlation ICIs from the post-cursor and the pre-cursor, respectively, which may cause an incorrect decoding decision and degrade the system performance.
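The decomposition of the correlator output into a desired term and two ICI terms can be checked numerically. The sketch below assumes the standard 802.11b codeword structure, models C(+1) and C(−1) as zero-filled right and left shifts, and uses hypothetical gains for h0, B1 and F1. The helper names are illustrative only.

```python
UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def shift(c, k):
    """C(k): zero-filled right shift for k > 0, left shift for k < 0."""
    n = len(c)
    if k >= 0:
        return [0j] * k + c[:n - k]
    return c[-k:] + [0j] * (-k)


def herm_dot(x, y):
    """x^H y for two equal-length chip vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


c = cck_codeword(1, 2, 3, 0)
h0, b1, f1 = 1 + 0j, 0.3 - 0.1j, 0.2j  # hypothetical main, post- and pre-cursor gains
r = [h0 * x + b1 * y + f1 * z for x, y, z in zip(c, shift(c, 1), shift(c, -1))]

desired = h0.conjugate() * herm_dot(c, c)
ici = (b1.conjugate() * herm_dot(shift(c, 1), c)
       + f1.conjugate() * herm_dot(shift(c, -1), c))
total = herm_dot(r, c)
```

Here `total` equals `desired + ici`, with `herm_dot(c, c)` equal to 8 for any codeword, so the ICI is the only part of the correlator output that depends on the multipath taps B1 and F1.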
In general, the post-correlation ICIs of the received signal vector with a MP profile shown in
which has 7 pre-cursors, 7 post-cursors, and one desired signal path. Therefore, the output of the CCK Correlator 36b is given by
In the present invention, the post-correlation ICIs from both pre-cursors and post-cursors will be calculated as follows using the FMIC block 36a and then cancelled in the CCK Decoder block 38.
It is observed that the ICI of the received signal vector is a summation of shifted versions of the desired codeword. For example, the pre-correlation ICIs from the pre-cursor F1 and the post-cursor B1 are given by
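F1C(−1)≡F1[e^{j(φ1+φ3+φ4)}, e^{j(φ1+φ2+φ4)}, −e^{j(φ1+φ4)}, e^{j(φ1+φ2+φ3)}, e^{j(φ1+φ3)}, −e^{j(φ1+φ2)}, e^{jφ1}, 0]
B1C(+1)≡B1[0, e^{j(φ1+φ2+φ3+φ4)}, e^{j(φ1+φ3+φ4)}, e^{j(φ1+φ2+φ4)}, −e^{j(φ1+φ4)}, e^{j(φ1+φ2+φ3)}, e^{j(φ1+φ3)}, −e^{j(φ1+φ2)}]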
For multipaths with a relative delay equal to an even number of chips from the desired signal path, it can be shown that the post-correlation ICIs are all equal to zero at the CCK Correlator 36b output when applying the following auto-correlation property of the CCK codewords:
[C(−k)]^H C = [C(+k)]^H C = 0, if k=2, 4, or 6. Eq.(1a)
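The auto-correlation property in Eq. (1a) can be confirmed by exhaustive enumeration. The following sketch assumes the standard 802.11b codeword structure and zero-filled shifts. The helper names are illustrative and not part of the specification.

```python
from itertools import product

UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def shift(c, k):
    """C(k): zero-filled right shift for k > 0, left shift for k < 0."""
    n = len(c)
    if k >= 0:
        return [0j] * k + c[:n - k]
    return c[-k:] + [0j] * (-k)


def herm_dot(x, y):
    """x^H y for two equal-length chip vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


# Largest even-shift auto-correlation magnitude over all 256 codewords.
even_shift_max = max(
    abs(herm_dot(shift(cck_codeword(*q), s), cck_codeword(*q)))
    for q in product(range(4), repeat=4)
    for s in (-6, -4, -2, 2, 4, 6)
)

# An odd shift, by contrast, generally leaves a non-zero residue.
c0 = cck_codeword(0, 0, 0, 0)
odd_shift_example = herm_dot(shift(c0, -1), c0)
```

Only the odd relative delays (1, 3, 5 and 7 chips) therefore contribute to the post-correlation ICI.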
Then, the overall ICI computation can be simplified to:
where
ƒF(φ2,φ3,φ4)≡ejφ
ƒB(φ2,φ3,φ4)≡e−jφ
The functions fF(φ2, φ3, φ4) and fB(φ2, φ3, φ4) are the ICIs caused by pre-cursors (terms containing Fi's) and post-cursors (terms containing Bi's), respectively. Therefore, not only is the ICI from post-cursors canceled (as in Webster), but the ICI from pre-cursors is also canceled. In contrast, Webster addressed ICI due to post-cursors only.
Equation (2) is a function of the encoded phases φ2, φ3, and φ4 only (φ1 has no impact on ICI). Each of the CCK encoded phases φ2, φ3, and φ4 has one of the following four values: 0, π/2, π, or 3π/2. Therefore, there are 4^3=64 MPI outputs in Equation (2). As a result, the total number of MPI values for all 256 codewords is reduced from 256 (=4^4) as in Webster to 64 (=4^3) in the present invention. Thus, the architecture of the present invention provides for a reduction in the complexity of the hardware and for power savings. Further reductions to 32, 16, or 8 can be derived as follows.
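The reduction from 256 to 64 MPI values can be illustrated numerically. The sketch below is a direct, non-fast evaluation and not the FMT itself. It assumes that the MPI of a codeword is the sum of B_k* [C(+k)]^H C and F_k* [C(−k)]^H C over the available taps, which generalizes the single-tap expression given earlier. The tap values and function names are hypothetical.

```python
from itertools import product

UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def shift(c, k):
    """C(k): zero-filled right shift for k > 0, left shift for k < 0."""
    n = len(c)
    if k >= 0:
        return [0j] * k + c[:n - k]
    return c[-k:] + [0j] * (-k)


def herm_dot(x, y):
    """x^H y for two equal-length chip vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def mpi(q, fb, ff):
    """Post-correlation ICI of codeword q for FB taps fb and FF taps ff."""
    c = cck_codeword(*q)
    post = sum(b.conjugate() * herm_dot(shift(c, k), c) for k, b in fb.items())
    pre = sum(f.conjugate() * herm_dot(shift(c, -k), c) for k, f in ff.items())
    return post + pre


fb = {1: 0.3 - 0.1j, 2: 0.1 + 0j, 3: 0.05j}   # hypothetical B_k
ff = {1: 0.2j, 3: -0.1 + 0.05j}               # hypothetical F_k

# One MPI per (phi2, phi3, phi4), evaluated with phi1 = 0.
table = {(q2, q3, q4): mpi((0, q2, q3, q4), fb, ff)
         for q2, q3, q4 in product(range(4), repeat=3)}

# Largest deviation when the same table is reused for every value of phi1.
worst = max(abs(mpi(q, fb, ff) - table[q[1:]])
            for q in product(range(4), repeat=4))
```

The common factor e^{jφ1} cancels in every product of a conjugated chip with an unconjugated chip, which is why `worst` is zero and a 64-entry table serves all 256 codewords.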
The structure to calculate MPI outputs in Equation (2) is found to be similar to that which is used to calculate the 64 CCK correlation outputs (denoted as Xi, i=1, 2, . . . , 64) in Equation (1). Equation (1) shows that:
The fast algorithm to compute the MPIs will be called “Fast Multipath Transform”, or FMT. A detailed description on the reuse of CCK Correlator 36b hardware for MPI computations in the FMIC block 36a will be described below.
The basic hardware building block that can be shared for an FMT in Mode 1 and an FWT in Mode 2 is denoted as a Fast Multipath/Walsh Transform (FMWT) block and is illustrated in
In Mode 1 (FMT outputs):
Re{D1}=Re{A1+A2}=Re{A1}+Re{A2}
Im{D1}=Im{A1+A2}=Im{A1}+Im{A2}
Re{D2}=Re{jA1−jA2}=−Im{A1}+Im{A2}
Im{D2}=Im{jA1−jA2}=Re{A1}−Re{A2}
Re{D3}=Re{−A1−A2}=−Re{A1}−Re{A2}
Im{D3}=Im{−A1−A2}=−Im{A1}−Im{A2}
Re{D4}=Re{−jA1+jA2}=Im{A1}−Im{A2}
Im{D4}=Im{−jA1+jA2}=−Re{A1}+Re{A2} Eq.(3a)
In Mode 2 (FWT outputs):
Re{D1}=Re{A1+A2}=Re{A1}+Re{A2}
Im{D1}=Im{A1+A2}=Im{A1}+Im{A2}
Re{D2}=Re{jA1+A2}=−Im{A1}+Re{A2}
Im{D2}=Im{jA1+A2}=Re{A1}+Im{A2}
Re{D3}=Re{−A1+A2}=−Re{A1}+Re{A2}
Im{D3}=Im{−A1+A2}=−Im{A1}+Im{A2}
Re{D4}=Re{−jA1+A2}=Im{A1}+Re{A2}
Im{D4}=Im{−jA1+A2}=−Re{A1}+Im{A2} Eq.(3b)
where the notations, Re{ } and Im{ }, denote the real part and the imaginary part of the complex number inside the brackets and the complex scalar j is defined as the square-root of (−1). Therefore, no multiplication is required for this basic building block in both modes 1 and 2.
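Equations (3a) and (3b) translate directly into a small software model of the FMWT block. This is a minimal sketch: the function name `fmwt` and the representation of complex values as (real, imaginary) tuples are illustrative choices, not taken from the specification. Only additions, subtractions and sign changes are used.

```python
def fmwt(a1, a2, mode):
    """Model of one FMWT block: return [D1, D2, D3, D4] as (re, im) tuples."""
    r1, i1 = a1
    r2, i2 = a2
    if mode == 1:   # FMT outputs, Eq. (3a)
        return [(r1 + r2, i1 + i2),     # D1 =   A1 +  A2
                (-i1 + i2, r1 - r2),    # D2 =  jA1 - jA2
                (-r1 - r2, -i1 - i2),   # D3 =  -A1 -  A2
                (i1 - i2, -r1 + r2)]    # D4 = -jA1 + jA2
    if mode == 2:   # FWT outputs, Eq. (3b)
        return [(r1 + r2, i1 + i2),     # D1 =   A1 + A2
                (-i1 + r2, r1 + i2),    # D2 =  jA1 + A2
                (-r1 + r2, -i1 + i2),   # D3 =  -A1 + A2
                (i1 + r2, -r1 + i2)]    # D4 = -jA1 + A2
    raise ValueError("mode must be 1 or 2")


def as_tuple(z):
    """Convert a Python complex number to an (re, im) tuple for comparison."""
    return (z.real, z.imag)


a1, a2 = (3, -2), (1, 5)               # arbitrary test inputs A1 = 3 - 2j, A2 = 1 + 5j
z1, z2 = complex(*a1), complex(*a2)
mode1 = fmwt(a1, a2, 1)
mode2 = fmwt(a1, a2, 2)
```

Both modes share the D1 adders. The remaining outputs differ only in which real or imaginary parts are routed to each adder and with which sign, which is what allows one hardware block to serve both transforms.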
One may program the same hardware block to operate in either Mode 1 or Mode 2 as shown in
Furthermore, the number of MPI outputs can be reduced from 64 to 32 or 16 for all 256 codewords. One can observe from Equation (2):
ƒ1(φ2+π,φ3,φ4)=−ƒ1(φ2,φ3,φ4),φ2=0 or π/2. Eq.(4a)
From Equation (4a), it is not necessary to calculate the MPI outputs for the cases when encoded phase φ2 is π or 3π/2. Similarly,
ƒ1(φ2,φ3,φ4+π)=−ƒ1(φ2,φ3,φ4),φ4=0 or π/2. Eq.(4b)
ƒ1(φ2+π,φ3,φ4+π)=ƒ1(φ2,φ3,φ4),φ2,φ4=0 or π/2. Eq.(4c)
From Equations (4b) and (4c), only two choices are necessary to calculate the MPI outputs for the encoded phases φ2 or φ4. Since there are four choices to choose the encoded phase φ3, the number of required MPI outputs is 4×2×2 or 16. All other 48 MPI outputs have the same absolute values as these 16 MPI outputs but may have a negative sign.
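The symmetry properties in Equations (4a)–(4c) can be exercised with a short numerical sketch that computes only 16 MPI values and derives the other 48 by sign changes. It assumes the standard 802.11b codeword structure and that the MPI is the tap-weighted sum of shifted auto-correlations. The tap values and helper names are hypothetical, and the MPIs are evaluated directly rather than with the FMT.

```python
from itertools import product

UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def shift(c, k):
    """C(k): zero-filled right shift for k > 0, left shift for k < 0."""
    n = len(c)
    if k >= 0:
        return [0j] * k + c[:n - k]
    return c[-k:] + [0j] * (-k)


def herm_dot(x, y):
    """x^H y for two equal-length chip vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def mpi(q, fb, ff):
    """Post-correlation ICI of codeword q for FB taps fb and FF taps ff."""
    c = cck_codeword(*q)
    post = sum(b.conjugate() * herm_dot(shift(c, k), c) for k, b in fb.items())
    pre = sum(f.conjugate() * herm_dot(shift(c, -k), c) for k, f in ff.items())
    return post + pre


fb = {1: 0.3 - 0.1j, 2: 0.1 + 0j, 3: 0.05j}   # hypothetical B_k
ff = {1: 0.2j, 3: -0.1 + 0.05j}               # hypothetical F_k

# 16 base values: phi2 and phi4 restricted to 0 or pi/2, all four phi3.
base = {(q2, q3, q4): mpi((0, q2, q3, q4), fb, ff)
        for q2 in (0, 1) for q3 in range(4) for q4 in (0, 1)}


def mpi_from_base(q2, q3, q4):
    """Derive any of the 64 MPIs from the 16 base values by a sign change."""
    value = base[(q2 % 2, q3, q4 % 2)]
    negate = (q2 >= 2) != (q4 >= 2)   # adding pi to exactly one phase flips the sign
    return -value if negate else value


worst = max(abs(mpi_from_base(*q) - mpi((0,) + q, fb, ff))
            for q in product(range(4), repeat=3))
```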
If the dominant ICI in a system is caused by the post-cursors (or the pre-cursors) only, the number of required MPI outputs can be further reduced to 8 using the following properties:
ƒF(φ2,φ3,φ4)≡ejφ
ƒB(φ2,φ3,φ4)≡e−jφ
For example, one only needs to calculate the function, fF1(φ2,φ3) or fB1(φ2,φ3), which has two choices of φ2 and four choices of φ3. The complex scalar, exp(jφ4), has one of the four values: 1, j, −1, or −j. In the practical hardware implementation, these operations are implemented by changing the (positive/negative) signs and/or by switching the real/imaginary parts of the values, i.e., without any calculations.
In short, it is required to first calculate the eight values of the function, fF1(φ2,φ3), or fB1(φ2,φ3), then the function, fF(φ2,φ3,φ4), or fB(φ2,φ3,φ4) can be obtained from changing the (positive/negative) signs and/or switching the real/imaginary parts of the function, fF1(φ2,φ3), or fB1(φ2,φ3).
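For the pre-cursor-only case, the eight-value reduction can be sketched as follows. The sketch assumes that the pre-cursor ICI factors as exp(jφ4) times a function of φ2 and φ3 only, which is what the text above describes, and that the ICI is the sum of F_k* [C(−k)]^H C over the FF taps. The tap values and helper names are hypothetical.

```python
from itertools import product

UNIT = (1 + 0j, 1j, -1 + 0j, -1j)  # e^{j*q*pi/2} for q = 0, 1, 2, 3


def cck_codeword(q1, q2, q3, q4):
    """8-chip CCK codeword; each phase is q * pi/2 with q in {0, 1, 2, 3}."""
    u = lambda q: UNIT[q % 4]
    return [u(q1 + q2 + q3 + q4), u(q1 + q3 + q4), u(q1 + q2 + q4), -u(q1 + q4),
            u(q1 + q2 + q3), u(q1 + q3), -u(q1 + q2), u(q1)]


def left_shift(c, k):
    """C(-k): the codeword shifted left by k chips, zero filled."""
    return c[k:] + [0j] * k


def pre_cursor_mpi(q2, q3, q4, ff):
    """Direct evaluation of the pre-cursor ICI for one (phi2, phi3, phi4)."""
    c = cck_codeword(0, q2, q3, q4)
    return sum(f.conjugate() * sum(s.conjugate() * x for s, x in zip(left_shift(c, k), c))
               for k, f in ff.items())


def rotate(z, q):
    """z * j**q using only sign changes and real/imaginary swaps."""
    re, im = z.real, z.imag
    return (complex(re, im), complex(-im, re),
            complex(-re, -im), complex(im, -re))[q % 4]


ff = {1: 0.2j, 3: -0.1 + 0.05j, 5: 0.03 + 0j}   # hypothetical F_k

# Eight base values: phi4 = 0, phi2 in {0, pi/2}, all four phi3.
base = {(q2, q3): pre_cursor_mpi(q2, q3, 0, ff)
        for q2 in (0, 1) for q3 in range(4)}


def pre_cursor_from_base(q2, q3, q4):
    """Derive any of the 64 pre-cursor MPIs from the eight base values."""
    value = rotate(base[(q2 % 2, q3)], q4)
    return -value if q2 >= 2 else value


worst = max(abs(pre_cursor_from_base(*q) - pre_cursor_mpi(*q, ff))
            for q in product(range(4), repeat=3))
```

The `rotate` helper mirrors the hardware remark above: multiplying by 1, j, −1 or −j never needs a multiplier.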
Implementations of Equations (4a)–(4e) will be provided below as three embodiments. Other mathematically-equivalent embodiments based upon (but not limited to) Equation (2) are given by:
ƒ2(φ2,φ3,φ4)=ej(φ
ƒ3(φ2,φ3,φ4)=ejφ
ƒ4(φ2,φ3,φ4)=ejφ
ƒ5(φ2,φ3,φ4)=[F*7ej(φ
Since there are no common factors in Equations (5a)–(5d) among items, the FMT algorithm cannot be applied to evaluate the 64 (or 32 or 16) MPI outputs in Equations (5a)–(5d). More hardware complexity is therefore required to compute the MPIs using any one of the Equations (5a)–(5d) when compared to the FMT implementation based on Equation (2). However, each embodiment in Equations (5a)–(5d) requires much less hardware complexity and processing time than those required in Webster, which requires all 256 MPIs to be computed.
In summary, the present invention derives a mathematical equation, Equation (2), which provides an optimal algorithmic structure (FMT) to compute MPIs required for ICI cancellations. The algorithm jointly computes the ICI of a plurality of possible codewords. This algorithmic structure is similar to the optimal architecture (FWT) required for CCK correlation computations. One can, therefore, take advantage of this similarity between these two structures and reuse essentially the same hardware at different times. Other mathematically equivalent embodiments to Equation (2) are characterized by Equations (5a)–(5d). One can use any of the Equations (5a)–(5d) to show that the number of required MPI outputs for ICI cancellation is also reduced from 256 to 64. From Equations (2), or (5a)–(5d), one can derive the symmetry properties shown in Equations (4a)–(4e). With these symmetry properties, one needs to compute only 32, 16, or 8 of the 64 MPIs and then derive all 64 MPIs based on Equations (4a)–(4e).
One first embodiment of the FMIC block 36a of
Since it is not necessary to calculate all 64 MPI outputs, it is desirable to deactivate the unnecessary outputs and FMWT blocks in practical implementations. A half FMWT block, shown in
Re{D1}=Re{A1+A2}=Re{A1}+Re{A2}
Im{D1}=Im{A1+A2}=Im{A1}+Im{A2}
Re{D2}=Re{jA1−jA2}=−Im{A1}+Im{A2}
Im{D2}=Im{jA1−jA2}=Re{A1}−Re{A2} Eq.(6)
This (multiplication-free) half FMWT block operates in Mode 1 only.
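A software model of the half FMWT block follows from Eq. (6). In Mode 1, Eq. (3a) gives D3 = −D1 and D4 = −D2, so the two omitted outputs can be recovered by sign changes whenever they are needed. The function names and the (real, imaginary) tuple representation below are illustrative assumptions.

```python
def half_fmwt(a1, a2):
    """Mode 1 half block: return only D1 and D2 as (re, im) tuples, Eq. (6)."""
    r1, i1 = a1
    r2, i2 = a2
    d1 = (r1 + r2, i1 + i2)     # D1 =  A1 +  A2
    d2 = (-i1 + i2, r1 - r2)    # D2 = jA1 - jA2
    return d1, d2


def negate(d):
    """Sign change of both parts; no arithmetic beyond negation."""
    return (-d[0], -d[1])


a1, a2 = (3, -2), (1, 5)        # arbitrary test inputs A1 = 3 - 2j, A2 = 1 + 5j
d1, d2 = half_fmwt(a1, a2)
d3, d4 = negate(d1), negate(d2)  # full Mode 1 outputs of Eq. (3a), if required
```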
Referring now to the embodiment shown in
Referring now to the embodiment shown in
The implementations of the FMIC block according to the present invention are not limited to those modeled based on Equations (2), (5a)–(5d). Other implementations of FMIC blocks which are mathematically equivalent to Equations (2), (5a)–(5d) are also encompassed within the scope of the present invention.
Thus, when compared with the RAKE receiver in Webster, the present invention provides numerous important advantages:
(1) The 64 MPI outputs for the FMIC block 36a in the present invention are obtained from essentially the same hardware used for the CCK Correlator 36b of
(2) Only 64 (or 32, or 16 or even 8) MPI outputs are required for all 256 codewords, as opposed to 256 ICI outputs required by Webster for an optimal CCK Decoder.
(3) The 64 MPI outputs of the present invention are obtained from a structure similar to FWT which minimizes the required hardware (i.e., adders only), minimizes the number of complex calculations, and minimizes the processing time.
(4) While the 256 ICI outputs in Webster are calculated independently, the 64 MPI outputs in the present invention are jointly calculated. All common operations to obtain 64 (or 32 or even 16) different MPI outputs are calculated only once. Therefore, the number of operations is minimized.
(5) No multiplication operations are required for the present invention.
(6) In the present invention, the MPI from pre-cursors and post-cursors are calculated from the Feed-Forward (FF) and Feedback (FB) taps. In contrast, in Webster, only the MPI from post-cursors are calculated from only the FB taps. Therefore, the present invention can cancel more interference when compared to the architecture in Webster.
Those skilled in the art will appreciate that the embodiments and alternatives described above are non-limiting examples only, and that certain modifications can be made without departing from the spirit and scope thereof. The accompanying claims are intended to cover such modifications as would fall within the true scope and spirit of the present invention.
Number | Date | Country | |
---|---|---|---|
20040091023 A1 | May 2004 | US |