This application claims the benefit of Russian Application No. 2010147930, filed Nov. 25, 2010, which is hereby incorporated by reference in its entirety.
The present invention relates to forward error correction codes generally and, more particularly, to a method and/or apparatus for implementing reconfigurable encoding per multiple communications standards.
Turbo and convolutional codes are widely used forward error correction codes. Turbo codes were proposed by Berrou and Glavieux in 1993 and have been adopted in many communications standards such as Wideband-CDMA (WCDMA), Code Division Multiple Access 2000 (CDMA2000), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE) and Digital Video Broadcasting-Return Channel via Satellite (DVB-RCS). The codes allow near optimal decoding with excellent performance approaching the Shannon limit for Additive White Gaussian Noise (AWGN) channels.
Conventional implementations of convolutional and turbo encoders handle a single input bit per clock cycle. If a conventional encoder simultaneously supports many different standards, straightforward implementations utilize a significant amount of additional configuration data. Moreover, the configuration data is prepared outside the encoder and loaded into internal registers when the encoder is initialized. If the configuration data is sufficiently long, many clock cycles are used to configure the encoder.
The present invention concerns an apparatus generally including a first circuit and a second circuit. The first circuit may be configured to (i) receive a configuration signal that identifies a current one of a plurality of communications standards and (ii) generate a plurality of matrix elements based on the configuration signal. The second circuit may include a plurality of matrixes. The second circuit may be configured to (i) fill the matrixes with the matrix elements and (ii) generate an encoded signal by forward error correction encoding an input signal using the matrixes. The encoded signal generally complies with the current communications standard.
The objects, features and advantages of the present invention include providing apparatus for implementing reconfigurable encoding per multiple communications standards that may (i) be used for any particular set of wireless communications standards, (ii) reconfigure in a single clock cycle, (iii) implement hardware-only reconfiguration, (iv) handle several input bits per clock cycle, (v) implement a convolutional encoder, (vi) implement a turbo encoder, (vii) occupy an area close to a non-configurable encoder and/or (viii) perform with a throughput close to a non-configurable encoder.
These and other objects, features and advantages of the present invention will be apparent from the following detailed description and the appended claims and drawings in which:
Some embodiments of the present invention generally concern a reconfigurable chip (or die) for encoding an input signal in accordance with two or more wireless communications standards. The wireless communications standards may include, but are not limited to, a Long Term Evolution (LTE) standard (3GPP Release 8), an Institute of Electrical and Electronics Engineering (IEEE) 802.16 standard (WiMAX), a Wideband-CDMA/High Speed Packet Access (WCDMA/HSPA) standard (3GPP Release 7) and a CDMA-2000/Ultra Mobile Broadband (UMB) standard (3GPP2). Other wired and/or wireless communications standards may be implemented to meet the criteria of a particular application.
Instead of using a separate scheme for each wireless communications standard, the standards may be supported by hardware-only reconfiguration. For each standard, a specific configuration code generally controls the generation of matrix elements for multiple matrixes. The matrixes may be used in the manipulation of the input bits to generate an encoded signal. The resulting encoder may handle several input bits per clock cycle. Furthermore, reconfiguration from a current communications standard to another communications standard may be achieved in a single clock cycle.
Referring to
The signal IN may convey an information word received by the apparatus 100. The information word “d” (e.g., data to be transmitted) may be described by formula 1 as follows:
$$d = (d_1, \ldots, d_k) \in \{0,1\}^k \tag{1}$$
where each $d_i \in \{0,1\}$ may be an information bit and parameter “k” may be an information word length. The apparatus 100 generally adds redundancy to the information word d and produces a codeword “c” in the signal OUT. Codeword c is generally illustrated by formula 2 as follows:
$$c = (c_1, \ldots, c_n) \in \{0,1\}^n \tag{2}$$
where “n” is the codeword length and R=k/n may be a code rate.
For convolutional rate 1/s, the apparatus 100 may be defined by a transfer matrix T. Transfer Matrix T is generally shown in formula 3 as follows:
$$T = [t_1(D), \ldots, t_s(D)] \tag{3}$$
where each $t_i(D)$ (e.g., formula 4):
may be a rational function in variable D over the binary field $F_2 = \{0,1\}$. The elements $h^{(i)}(D), g^{(i)}(D) \in F_2(D)$ may be polynomials in D with coefficients in $F_2$ and $h^{(i)}(0) = g^{(i)}(0) = 1$. When the apparatus 100 receives the signal IN carrying an infinite binary sequence (e.g., formula 5):
$$d = d_1, d_2, \ldots, d_i, \ldots \tag{5}$$
the signal IN may be interpreted as a formal power series per formula 6 as follows:
$$d(D) = d_1 + d_2 D + \ldots + d_i D^{i-1} + \ldots \tag{6}$$
The apparatus 100 may generate multiple signals (e.g., P1 to PS). A combination of the signals P1 to PS may form the signal OUT. Each signal P1 to PS may carry a sequence (e.g., p(1) to p(s)) as shown in formulae 7 as follows:
The sequences may be considered as formal power series and calculated as shown in formulae 8 as follows:
The resulting codeword c may be represented by formula 9 as follows:
$$c = (p_1^{(1)}, \ldots, p_1^{(s)}, p_2^{(1)}, \ldots, p_2^{(s)}, \ldots, p_k^{(1)}, \ldots, p_k^{(s)}) \tag{9}$$
where $p^{(j)}$ (e.g., formula 10):
$$p^{(j)} = (p_1^{(j)}, \ldots, p_k^{(j)}) \tag{10}$$
may be the j-th element created by the convolutional encoding. The word p(j) may be referred to as a parity word.
In the case of convolutional codes (CC) generally used in wireless standards, the channel encoding is generally not systematic (e.g., the encoding may have a polynomial transfer matrix). In the case of convolutional turbo codes (CTC), the encoding may be systematic (e.g., the information word d may be a part of the codeword c).
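As a non-limiting illustration of the non-systematic case, the following Python sketch encodes an information word with a polynomial (feedforward-only) transfer matrix and orders the parity bits as in formula 9. The function name `conv_encode` and the two generator polynomials (1+D+D^2 and 1+D^2) are assumptions chosen only for the example and are not taken from any of the standards named above.

```python
def conv_encode(d, generators):
    """Encode bits d with feedforward generator polynomials over F2.

    generators[j][i] is the coefficient of D**i in the j-th polynomial.
    Returns (parity_words, codeword).
    """
    parity_words = []
    for g in generators:
        word = []
        for t in range(len(d)):
            bit = 0
            for i, coeff in enumerate(g):
                # Multiply the input power series by the polynomial, modulo 2.
                if coeff and t - i >= 0:
                    bit ^= d[t - i]
            word.append(bit)
        parity_words.append(word)
    # Formula 9 ordering: all s parity bits for time 1, then time 2, and so on.
    codeword = [word[t] for t in range(len(d)) for word in parity_words]
    return parity_words, codeword


# Hypothetical rate 1/2 example (s = 2) driven by a single 1 bit.
parity, c = conv_encode([1, 0, 0], [[1, 1, 1], [1, 0, 1]])
```

With a single 1 bit at the input, each parity word simply reproduces the coefficients of the corresponding polynomial, which may serve as a quick sanity check.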
Referring to
The circuit 104 may implement a Recursive Systematic Convolutional (RSC) encoder. The circuit 104 is generally operational to encode the information word d to generate the parity word p(1). The information word d may be received in the signal IN. The parity word p(1) may be presented in the signal P1. The encoding may be a recursive systematic convolutional encoding.
The circuit 106 may implement another RSC encoder. The circuit 106 is generally operational to encode a permuted word π(d) (e.g., formula 11) as follows:
$$\pi(d) = (d_{\pi(1)}, \ldots, d_{\pi(k)}) \tag{11}$$
to generate the parity word p(2). The permuted word π(d) may be received in the signal PER from the circuit 108. The parity word p(2) may be presented in the signal P2. The encoding may also be a recursive systematic convolutional encoding. The circuit 106 may be a duplicate of the circuit 104 and perform the same encoding technique.
The circuit 108 may implement an interleaver circuit. The circuit 108 is generally operational to generate the permuted word π(d) by permuting the information word d. The information word d may be received in the signal IN. The permuted word π(d) may be presented to the circuit 106 in the signal PER.
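The permutation of formula 11 may be sketched in Python as follows. The function name `interleave` and the three-entry table are hypothetical; an actual interleaver table would come from the applicable standard.

```python
def interleave(d, pi):
    """Return the permuted word (d[pi(1)], ..., d[pi(k)]).

    pi is a 1-based permutation of {1, ..., k}, matching formula 11.
    """
    k = len(d)
    if sorted(pi) != list(range(1, k + 1)):
        raise ValueError("pi must be a permutation of 1..k")
    return [d[p - 1] for p in pi]


# Hypothetical table for k = 3, used only to show the indexing.
permuted = interleave(["d1", "d2", "d3"], [3, 1, 2])
```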
Each standard LTE, W-CDMA/HSPA and WiMAX may include rate 1/3 turbo codes. In the WiMAX standard, the codeword c may be given by formula 12 as follows:
$$c = (d_1, p_1^{(1)}, p_1^{(2)}, \ldots, d_k, p_k^{(1)}, p_k^{(2)}) \tag{12}$$
where n=3k and tail-biting may be utilized. In the LTE standard and the W-CDMA/HSPA standard, the codeword c is generally illustrated by formula 13 as follows:
$$c = (d_1, p_1^{(1)}, p_1^{(2)}, \ldots, d_k, p_k^{(1)}, p_k^{(2)}, t_1, \ldots, t_{12}) \tag{13}$$
where n=3k+12 and the final several bits (e.g., 12 bits t1, . . . , t12) may be used for trellis termination. The trellis termination generally forces the apparatus 102 to an initial zero state. In the case of trellis termination, the actual code rate k/(3k+12) may be a little smaller than the rate 1/3.
In the above cases, the parity word p(1) in the signal P1 may convey the parity bits obtained by the circuit 104 for the unpermuted information word d. The parity word p(2) may be obtained for the permuted word π(d) generated by the circuit 108. The operation π may be a permutation on the set {1, 2, . . . , k} specified by an interleaver table of the standard.
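The codeword layouts of formulae 12 and 13 and the resulting code rates may be illustrated by the short Python sketch below. The function names and the `tail` parameter are assumptions made for the example; the parity words are taken as given rather than computed.

```python
from fractions import Fraction


def assemble_turbo_codeword(d, p1, p2, tail=()):
    """Interleave systematic and parity bits as (d_i, p1_i, p2_i), then append tail bits."""
    if not (len(d) == len(p1) == len(p2)):
        raise ValueError("d, p1 and p2 must have the same length k")
    c = []
    for d_i, a_i, b_i in zip(d, p1, p2):
        c.extend((d_i, a_i, b_i))
    c.extend(tail)  # empty for the tail-biting layout of formula 12
    return c


def actual_rate(k, tail_bits=12):
    """Code rate k / (3k + tail_bits); tail_bits = 0 gives exactly 1/3."""
    return Fraction(k, 3 * k + tail_bits)


c12 = assemble_turbo_codeword([1, 0], [0, 1], [1, 1])
```

For example, an information word of k=40 bits with 12 termination bits gives a rate of 40/132, which is slightly below 1/3.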
Referring to
The circuit 122 may present a signal to the circuit 124a and the circuit 128a. Each circuit 124a to 124m−1 may present a signal to the next respective circuit 124b to 124m, respective circuit 126a to 126m−1 and a respective circuit 130a to 130m−1. The circuit 124m may present a signal to the circuits 126m and 130m. Each circuit 126a to 126m may present a signal to a respective circuit 128a to 128m. Each circuit 128a to 128m−1 may present a signal to a respective next circuit 128b to 128m. Each circuit 130a to 130m−1 may present a signal to a respective circuit 132a to 132m−1. The circuit 130m may also present a signal to the circuit 132m−1. Each circuit 132b to 132m−1 may present a signal to a respective previous circuit 132a to 132m−2. The circuit 132a may present a signal back to the circuit 122.
Each circuit 122, 128a to 128m and 132a to 132m−1 may implement an adder circuit. The circuits 122, 128a to 128m and 132a to 132m−1 are generally operational to generate a sum at an output port of two values received at the respective input ports.
Each circuit 124a to 124m may implement a delay circuit (e.g., register). The circuits 124a to 124m may be operational to buffer a received value for a single clock cycle.
Each circuit 126a to 126m may implement a transfer circuit. The circuits 126a to 126m may be operational to transfer an input value to an output value per a respective polynomial (e.g., H1 to Hm).
Each circuit 130a to 130m may implement another transfer circuit. The circuits 130a to 130m may be operational to transfer an input value to an output value per a respective polynomial (e.g., G1 to Gm).
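The adder, delay and transfer circuits described above may be modeled, one input bit per clock cycle, by the following Python sketch. The model assumes that the H coefficients tap the delay line toward the output adders and the G coefficients tap the delay line toward the feedback adders, with the leading coefficient of each polynomial equal to 1. The function name `rsc_parity` and the example polynomials are assumptions chosen only for illustration.

```python
def rsc_parity(bits, h, g):
    """Bit-serial recursive encoder with feedforward taps h and feedback taps g.

    h = [1, h1, ..., hm] and g = [1, g1, ..., gm] are coefficient lists over F2.
    """
    m = len(g) - 1
    q = [0] * m  # delay line (circuits 124a to 124m), initially zero
    out = []
    for x in bits:
        a = x
        for j in range(m):
            a ^= g[j + 1] & q[j]  # feedback sum returned to the input adder
        y = a
        for j in range(m):
            y ^= h[j + 1] & q[j]  # feedforward sum toward the output
        out.append(y)
        q = [a] + q[:-1]  # shift the delay line by one clock cycle
    return out


# Illustrative polynomials h(D) = 1 + D + D^3 and g(D) = 1 + D^2 + D^3.
parity = rsc_parity([1, 0, 0, 0], h=[1, 1, 0, 1], g=[1, 0, 1, 1])
```

When h and g are equal, the feedback and feedforward sums cancel and the output reproduces the input, which may be used as a simple check of the model.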
A number of additional rates may be easily obtained by applying puncturing. Puncturing generally deletes some of the parity symbols according to a puncturing scheme defined in each standard.
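Puncturing may be sketched in Python as a periodic keep/delete mask applied to the parity symbols. The function name `puncture` and the pattern shown are hypothetical; the actual puncturing schemes are defined in each standard.

```python
def puncture(symbols, pattern):
    """Keep symbols[i] only where the periodic pattern holds a 1."""
    if not any(pattern):
        raise ValueError("pattern must keep at least one symbol per period")
    return [s for i, s in enumerate(symbols) if pattern[i % len(pattern)]]


# Hypothetical pattern that deletes every second parity symbol.
kept = puncture([1, 0, 1, 1, 0, 0], [1, 0])
```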
In a general case, a convolutional rate k/n encoder (e.g., k input bits and n output bits) may be defined by a transfer matrix T. An example transfer matrix T is generally shown in formula 14 as follows:
where each tij(D) (formula 15):
is generally a rational function in variable D. The elements $H_{ij}(D)$ and $G_{ij}(D)$ may be polynomials in D with coefficients in $F_2$ and $H_{ij}(0) = G_{ij}(0) = 1$. When an encoder is fed by the k-input infinite binary sequences in the signal IN (e.g., formula 16):
$$IN = [x^{(1)}, \ldots, x^{(k)}] \tag{16}$$
each sequence $x^{(i)}$ (formula 17):
$$x^{(i)} = x_0^{(i)}, x_1^{(i)}, \ldots \tag{17}$$
may be interpreted as a formal power series as illustrated in formula 18 as follows:
$$x^{(i)}(D) = x_0^{(i)} + x_1^{(i)} D + \ldots \tag{18}$$
Hence, the signal OUT of the encoder may be given by formula 19 as follows:
$$OUT = [y^{(1)}(D), \ldots, y^{(n)}(D)] \tag{19}$$
where matrix y may be defined by the formula 20 as follows:
Referring to
Each circuit 142a to 142f may implement a delay circuit (e.g., register). The circuits 142a to 142f may be operational to buffer a received value for a single clock cycle. Each circuit 144a to 148d may implement an adder circuit. The circuits 144a to 148d are generally operational to generate a sum at an output port of two values received at the respective input ports.
Referring to
The circuit 162 may implement a constituent encoder circuit. The circuit 162 is generally operational to generate a portion of the signal OUT by encoding the signal IN. The circuit 164 may implement another constituent encoder circuit. The circuit 164 is generally operational to generate a portion of the signal OUT by encoding the signal IN′. In some embodiments, the circuit 164 may be a copy of the circuit 162. The circuit 166 may implement an interleaver circuit. The circuit 166 is generally operational to generate the signal IN′ by permuting (interleaving) the signal IN.
Each circuit 168 and 170 may implement a switch. The circuit 168 may switch an input signal into the circuit 162 between the signal IN and a feedback signal of the circuit 162. The circuit 170 may switch an input signal into the circuit 164 between the signal IN′ and a feedback signal of the circuit 164.
Consider a general rate 1/s code in the following. In a simple case where n=k=1 (e.g.,
where h(D) is generally given by formula 22 as follows:
$$h(D) = h_0 + h_1 D + \ldots + h_m D^m \tag{22}$$
and g(D) is given by formula 23 as follows:
$$g(D) = g_0 + g_1 D + \ldots + g_m D^m \tag{23}$$
Generally, $h_0 = g_0 = 1$. A vector (e.g., q(t) formula 24):
$$q(t) = [q_1(t), \ldots, q_m(t)] \in F_2^m \tag{24}$$
may represent an encoder state, the vector $X(t) \in F_2$ may be an input (e.g., signal IN) and the vector $Y(t) \in F_2$ an output (e.g., signal OUT) at the moment t=0, 1, 2, 3, etc. If an initial state q(0) of the encoder is given by formula 25 as follows:
$$q(0) = [q_1(0), \ldots, q_m(0)] \in F_2^m \tag{25}$$
the encoder may work as described by formulae 26 as follows:
In matrix form, the operation of the encoder may be described by formulae 27 as follows:
The matrixes G, e1, h, q(t) and q(0) may be defined by formulae 28, 29, 30, 31 and 32 respectively as follows:
At time t+2, vector q(t+2) may be given by formulae 33 as follows:
By induction, formula 34 may be as follows:
$$q(t+s) = G^s \cdot q(t) + b^{(s-1)} x(t) + \ldots + b^{(0)} x(t+s-1) \tag{34}$$
where $b^{(i)} = G^i \cdot e_1$, may be obtained for any time t+s, where $b^{(i)}$ may be a first column of matrix $G^i$. In matrix form, q may be expressed by formula 35 as follows:
A relation between Y(s) (t) and q(t), X(s) (t) may be expressed by formula 36 as follows:
may apply for i=1, . . . , s, where δij may be 1 if i=j and 0 (zero) otherwise. The vector Y may be written in matrix form in formula 38 as follows:
Therefore, formulae 39:
may be implemented for the encoder to operate s times faster.
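The state relation of formula 34 may be checked numerically with the Python sketch below. The sketch assumes a one-step update q(t+1)=G·q(t)+e1·x(t), where G is taken to be a companion matrix whose first row holds the feedback coefficients g1 to gm. That form of G, the function names, and the grouping of G^s and the columns b(s-1) to b(0) into matrixes labeled A and B are assumptions made for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product over F2."""
    return [sum(a & b for a, b in zip(row, v)) & 1 for row in M]


def mat_mul(A, B):
    """Matrix-matrix product over F2."""
    cols = list(zip(*B))
    return [[sum(a & b for a, b in zip(row, col)) & 1 for col in cols] for row in A]


def mat_pow(M, e):
    """M raised to the power e over F2 (e >= 0)."""
    m = len(M)
    R = [[int(i == j) for j in range(m)] for i in range(m)]
    for _ in range(e):
        R = mat_mul(R, M)
    return R


def companion(g):
    """Assumed state matrix for g = [1, g1, ..., gm]: feedback row plus shift."""
    m = len(g) - 1
    G = [[0] * m for _ in range(m)]
    G[0] = list(g[1:])
    for i in range(1, m):
        G[i][i - 1] = 1
    return G


def step(G, q, x):
    """One clock cycle: q(t+1) = G q(t) + e1 x(t)."""
    nxt = mat_vec(G, q)
    nxt[0] ^= x
    return nxt


def block_matrices(G, s):
    """A = G^s; column j of B is b(s-1-j) = G^(s-1-j) e1, per formula 34."""
    m = len(G)
    e1 = [1] + [0] * (m - 1)
    A = mat_pow(G, s)
    cols = [mat_vec(mat_pow(G, s - 1 - j), e1) for j in range(s)]
    B = [[cols[j][r] for j in range(s)] for r in range(m)]
    return A, B


def jump(A, B, q, xs):
    """Advance the state by s = len(xs) input bits in one step."""
    return [a ^ b for a, b in zip(mat_vec(A, q), mat_vec(B, xs))]
```

Stepping the state s times bit by bit and jumping once with the block matrixes should produce the same state for any starting state and any s input bits.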
Consider a convolutional rate 1/n encoder to be universal where the encoder supports any transfer matrix T with a maximum possible constraint length L (e.g., number of delays in encoder). In order to implement an s times faster version of such a universal encoder, the binary matrixes A(S), B(S) and n different pairs of matrixes C(S), D(S) (e.g., a pair of matrixes for each of n outputs) may be calculated using the previous formulae. For the case n=1, four matrixes may be generated. For other cases, the sizes of matrixes generally increase as the parameter s increases (e.g., the number of input bits per clock cycle). Therefore, all the elements of matrixes may be initialized and stored in a configuration register.
Referring to
The signal IN may be received by the circuits 188 and 192. The signal OUT may be generated by the circuit 198. A signal (e.g., CONFIG1) may be received by the circuit 182. The circuit 182 may generate a signal (e.g., EA) received by the circuit 186. A signal (e.g., EB) may also be generated by the circuit 182 and received by the circuit 188. The circuit 182 may generate a signal (e.g., EC) received by the circuit 190. A signal (e.g., ED) may be generated by the circuit 182 and received by the circuit 192.
The circuit 182 may implement a configuration register circuit. The circuit 182 may be operational to store a set of matrix elements used by the circuit 184. A particular set of matrix elements may be loaded into the circuits 186, 188, 190 and 192 for encoding according to a particular communications standard. The particular set of matrix elements may be received in the signal CONFIG1 from a source external to the apparatus 180. In some embodiments of the present invention, the source may be implemented as a software driver. Other sources of the configuration information (e.g., matrix elements) may be implemented to meet the criteria of a particular application.
The circuit 184 may implement an encoder circuit. The circuit 184 is generally operational to generate the signal OUT by encoding the signal IN. Encoding may be performed to the communications standard defined by the matrix elements received in the signals EA, EB, EC and ED. The signal IN may convey the sequence of input vectors X(S)(t).
Each circuit 186, 188, 190 and 192 may implement a matrix multiplication circuit. The circuits 186 to 192 are generally operational to multiply a word (e.g., vector) by the respective matrix elements to generate another vector.
The circuit 188 may multiply an information word (e.g., X(i)(t)) as received in the signal IN by the matrix (e.g., B(S)) received in the signal EB. The resulting vector may be transferred to the circuit 194.
The circuit 194 may implement an adder circuit. The circuit 194 is generally operational to add the vector received from the circuit 188 with a vector generated by the circuit 186. The sum vector may be presented to the circuit 196.
The circuit 196 may implement a register circuit. The circuit 196 may be operational to buffer the sum vector generated by the circuit 194. Buffering may last for a single clock cycle. On the next clock cycle, the buffered sum vector may be transferred to the circuits 186 and 190.
The circuit 186 may multiply the vector received from the circuit 196 by the matrix (e.g., A(S)) received in the signal EA. The resulting vector may be fed back to the circuit 194. The circuit 190 may multiply the vector received from the circuit 196 by the matrix (e.g., C(S)) received in the signal EC. The resulting vector may be transferred to the circuit 198. The circuit 192 may multiply the vector received in the signal IN by the matrix (e.g., D(S)) received in the signal ED. The resulting vector may be transferred to the circuit 198.
The circuit 198 may implement another adder circuit. The circuit 198 is generally operational to generate the sequence of output vectors Y(S)(t) in the signal OUT by adding the vectors received from the circuits 190 and 192.
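The datapath formed by the circuits 186 to 198 may be modeled behaviorally in Python as shown below. The class name `BlockEncoder`, the zero initial state and the example matrixes are assumptions; the example matrixes describe a hypothetical single-bit-per-cycle case and are not taken from any standard.

```python
def gf2_mat_vec(M, v):
    """Matrix-vector product over F2."""
    return [sum(a & b for a, b in zip(row, v)) & 1 for row in M]


def gf2_add(u, v):
    """Element-wise sum over F2."""
    return [a ^ b for a, b in zip(u, v)]


class BlockEncoder:
    """Behavioral model: state register plus four matrix multipliers."""

    def __init__(self, A, B, C, D):
        self.A, self.B, self.C, self.D = A, B, C, D
        self.state = [0] * len(A)  # register (circuit 196), assumed cleared

    def clock(self, x):
        """Consume one input vector and return one output vector."""
        # Output path: multipliers 190 and 192 feeding the adder 198.
        y = gf2_add(gf2_mat_vec(self.C, self.state), gf2_mat_vec(self.D, x))
        # State path: multipliers 186 and 188 feeding the adder 194.
        self.state = gf2_add(gf2_mat_vec(self.A, self.state),
                             gf2_mat_vec(self.B, x))
        return y


# Hypothetical matrixes for a 3-delay encoder handling one bit per clock cycle.
enc = BlockEncoder(A=[[0, 1, 1], [1, 0, 0], [0, 1, 0]],
                   B=[[1], [0], [0]],
                   C=[[1, 1, 0]],
                   D=[[1]])
outputs = [enc.clock(x) for x in ([1], [0], [0], [0])]
```

Changing the communications standard in this model amounts to replacing the four matrixes, which mirrors loading new matrix elements over the signals EA, EB, EC and ED.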
Referring to
The circuit 222 may implement a universal multipole circuit. The circuit 222 is generally operational to calculate the binary matrix elements for the matrixes A(S), B(S), C(S) and D(S) based on the signal CONFIG2. An architecture of the apparatus 220 in the case when n=1 is illustrated in
A Boolean chain for functions of n variables is generally a sequence of steps where each step combines the results from two previous steps. A Boolean chain that includes all functions of the n variables may be referred to as a universal multipole. The universal multipole for variables J1, J2, . . . , Jn is generally a scheme with n inputs and 2^(2^n) outputs. The universal multipole may implement the 2^(2^n) outputs using all Boolean functions on the variables J1, J2, . . . , Jn. A universal multipole may be constructed by common techniques using no more than 2^(2^n) elements from the set of all Boolean logical operations of two variables {AND, OR, NOT, . . . }. Additional information may be found in “The Art of Computer Programming”, volume 4, Pre-Fascicle 0C, by Donald E. Knuth, section 7.1.2: Boolean Evaluation, pages 0-61, copyright 2006 by Addison-Wesley, which is hereby incorporated by reference in its entirety.
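The input/output behavior of a universal multipole may be illustrated by the Python sketch below, which evaluates all 2^(2^n) Boolean functions of n inputs at once. The sketch is a behavioral model only and does not build a Boolean chain from two-input gates. The function name and the convention that output number f uses the binary digits of f as its truth table are assumptions.

```python
def universal_multipole(inputs):
    """Return the values of all 2**(2**n) Boolean functions of the n inputs.

    Output number f is the function whose truth table is the binary
    expansion of f; bit 'row' of f gives the value for input row 'row'.
    """
    n = len(inputs)
    row = 0
    for bit in inputs:
        row = (row << 1) | bit  # the first input is the most significant bit
    return [(f >> row) & 1 for f in range(2 ** (2 ** n))]


# For n = 2 there are 16 outputs; under this numbering, outputs 8, 14 and 6
# are AND, OR and XOR respectively.
outs = universal_multipole([1, 0])
```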
Let v be the number of different convolutional and turbo codes used in a chosen set of wireless standards (e.g., LTE, W-CDMA, CDMA-2000). Usually the number v has a small value (e.g., v<8) and so each code used in the set of communications standards may be identified as a multi-bit (e.g., 3-bit) vector J=(J1, J2, J3). Each element of the matrixes A(S), B(S), C(S) and D(S) may be represented as a Boolean function f(J1, J2, J3). Therefore, a universal multipole U for variables J1, J2, J3 may be implemented by the circuit 222.
All of the matrix elements of the matrixes A(S), B(S), C(S) and D(S) may be calculated by the circuit 222. In some embodiments, the matrix elements for the matrix A(S) may be presented in the signal EA, the matrix elements for the matrix B(S) in the signal EB, the matrix elements for the matrix C(S) in the signal EC and the matrix elements for the matrix D(S) in the signal ED. In such cases, configuration (or reconfiguration) of the apparatus 220 to encode in accordance with a particular communications standard generally involves loading the vector J to a register in the circuit 222. The vector J may carry the corresponding 3-bit vector for the particular communications standard. Loading the vector J into a register and calculating the subsequent matrix elements may be performed in a single clock cycle for a hardware-only implementation of the circuit 222. Thus, reconfiguration of the apparatus 220 may be accomplished in the single clock cycle.
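One way to picture the generation of matrix elements from the vector J is sketched below in Python. Each matrix element is stored as an 8-entry truth table over (J1, J2, J3), and evaluating every table for a given J yields the whole matrix at once. The function names, the bit ordering of J and the two example matrixes are hypothetical; in hardware the evaluation would be combinational logic rather than a table lookup.

```python
def build_element_functions(matrices_by_code):
    """Derive one truth table (an int with 8 bits) per matrix element.

    matrices_by_code maps a code index 0..7 to a binary matrix of fixed size.
    Bit 'code' of a table is that element's value for the given code.
    """
    first = next(iter(matrices_by_code.values()))
    rows, cols = len(first), len(first[0])
    tables = [[0] * cols for _ in range(rows)]
    for code, M in matrices_by_code.items():
        for r in range(rows):
            for c in range(cols):
                if M[r][c]:
                    tables[r][c] |= 1 << code
    return tables


def generate_matrix(tables, J):
    """Evaluate every element function f(J1, J2, J3) for the vector J."""
    code = (J[0] << 2) | (J[1] << 1) | J[2]  # assumed ordering: J1 is the MSB
    return [[(t >> code) & 1 for t in row] for row in tables]


# Hypothetical 2x2 matrixes for two codes identified by J = 000 and J = 101.
tables = build_element_functions({0: [[1, 0], [0, 1]], 5: [[1, 1], [0, 0]]})
selected = generate_matrix(tables, (1, 0, 1))
```

Because the stored tables do not change at run time, switching standards in this model requires only a new value of J, which is consistent with the single-cycle reconfiguration described above.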
The apparatus 180 and the apparatus 220 generally allow processing of several (e.g., up to 8) information bits per clock cycle. The circuit 222 of the apparatus 220 may not implement a large buffer to store large amounts of configuration data and so may quickly configure the circuit 184. Moreover, reconfiguration may be made on-the-fly in a single clock cycle without support from external driver software and/or circuitry.
The functions performed by the diagrams of
The present invention may also be implemented by the preparation of ASICs (application specific integrated circuits), Platform ASICs, FPGAs (field programmable gate arrays), PLDs (programmable logic devices), CPLDs (complex programmable logic devices), sea-of-gates, RFICs (radio frequency integrated circuits), ASSPs (application specific standard products), monolithic integrated circuits, one or more chips or die arranged as flip-chip modules and/or multi-chip modules or by interconnecting an appropriate network of conventional component circuits, as is described herein, modifications of which will be readily apparent to those skilled in the art(s).
The elements of the invention may form part or all of one or more devices, units, components, systems, machines and/or apparatuses. The devices may include, but are not limited to, servers, workstations, storage array controllers, storage systems, personal computers, laptop computers, notebook computers, palm computers, personal digital assistants, portable electronic devices, battery powered devices, set-top boxes, encoders, decoders, transcoders, compressors, decompressors, pre-processors, post-processors, transmitters, receivers, transceivers, cipher circuits, cellular telephones, digital cameras, positioning and/or navigation systems, medical equipment, heads-up displays, wireless devices, audio recording, storage and/or playback devices, video recording, storage and/or playback devices, game platforms, peripherals and/or multi-chip modules. Those skilled in the relevant art(s) would understand that the elements of the invention may be implemented in other types of devices to meet the criteria of a particular application.
As would be apparent to those skilled in the relevant art(s), the signals illustrated in
While the invention has been particularly shown and described with reference to the preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2010147930 | Nov 2010 | RU | national |
Number | Name | Date | Kind |
---|---|---|---|
5991308 | Fuhrmann et al. | Nov 1999 | A |
7536624 | Eroz et al. | May 2009 | B2 |
8261168 | Wang et al. | Sep 2012 | B2 |
Entry |
---|
Berrou, Claude et al., Near Shannon Limit Error—Correcting Coding and Decoding: Turbo-Codes (1), IEEE, pp. 1064-1070, copyright 1993. |
Number | Date | Country | |
---|---|---|---|
20120137190 A1 | May 2012 | US |