This invention relates to Viterbi decoding. Viterbi decoding is commonly used in the receiving side of digital communication systems where potentially disrupted signals (e.g., disrupted by a fading channel, noise, etc.) must be decoded. Such signals are typically the result of bit-streams that have been encoded using convolutional codes and modulated for transmission, and such received encoded signals are typically decoded using a maximum-likelihood algorithm, generally based on the ‘Viterbi algorithm’.
In considering the Viterbi algorithm, two aspects in particular must be considered: the ‘Metric Calculation’ and the ‘Viterbi decoder’ itself. The theory of both of these aspects, involving calculation of branch, node and path metrics between different trellis nodes, is well known and ubiquitously applied in the field of digital communications.
The main problem of the Viterbi algorithm lies in its arithmetical decoding complexity (thus leading to high power consumption, etc., which is a paramount consideration in battery-operated portable communication devices). A lot of research has been done with the aim of reducing complexity associated with the Viterbi algorithm.
However, this research has invariably not taken into account the needs of ‘broadband communications’ systems. In these systems account must be taken of the very high bit rates involved, which require adaptation of the Viterbi algorithm for efficient maximum-likelihood decoding.
Standard implementations of the Viterbi algorithm are distinctly sub-optimum for ‘Broadband Communication’ systems because:
A need therefore exists for a Viterbi decoder, unit therefor and method wherein the abovementioned disadvantage(s) may be alleviated.
In accordance with a first aspect of the present invention there is provided a Viterbi decoder as claimed in claim 1.
In accordance with a second aspect of the present invention there is provided a method of producing metrics, for use in a Viterbi decoder, as claimed in claim 4.
In accordance with a third aspect of the present invention there is provided a butterfly unit, for use in a Viterbi decoder Add-Compare-Select unit, as claimed in claim 11.
One Viterbi decoder incorporating the present invention will now be described, by way of example only, with reference to the accompanying drawing(s), in which:
The following description, explanation and associated drawings are based (for the sake of example) on use of an encoder whose code rate is of the type R=1/m, with m integer. However, it will be understood that the invention is not limited to such an encoder type and may be more generally applied, e.g., to cases of code rate type R=k/m, where k (>1) and m are integer.
Convolutional codes are commonly used in digital communication systems in order to encode a bit-stream before transmission. In the receiver, a deconvolution has to be performed on the received symbols that have been possibly corrupted by fading due to a multipath channel and by additive noise. A classical implementation of the Viterbi algorithm, as shown in
The present invention concerns techniques for reducing the complexity of a Viterbi decoder.
Briefly stated, the present invention provides a new ACS unit that may be used at certain positions in a Viterbi decoder to simplify the processing required, and provides certain new metrics for use with the new ACS units to decrease the overall complexity of Viterbi decoding.
The critical element in a Viterbi decoder is usually the ACS unit, of which a typical example is shown in
ACS butterfly operations have to be performed per trellis transition if an N-state convolutional encoder is used. In a high-speed application, all
or at least some
ACS butterflies have to work in parallel, requiring a significant amount of chip surface in the case of a hardware implementation. Consequently, the power consumption of the ACS units is significant compared to the total consumption of the decoder.
For the ‘HIPERLAN/2’ standard, for example, massively parallel structures are necessary in order to guarantee the required bit rates (up to 54 Mbit/s). Even if all ACS units work in parallel in order to decode 1 bit per clock cycle, a minimum clock speed of 54 MHz is mandatory.
In order to reduce the complexity of Viterbi decoding, the following is proposed:
Following these proposals produces the advantages that:
(this type of code rate, together with a constraint length of K=7, leads to a convolutional code that is commonly used, for example by the ‘BRAN HIPERLAN/2’ standard), 50% of all classical butterflies can be substituted by the optimised ones leading to approximately 8% gain in surface/complexity compared to that of a Viterbi decoder using only the conventional butterfly configuration.
ACS butterflies must be implemented. It is possible to find hybrid structures where a number of butterflies between 1 and
are implemented and reused once or several times per transition. So, a trade-off is possible between decoding speed and chip surface in a hardware implementation.
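The trade-off between decoding speed and chip surface can be illustrated with a small calculation. This is only a sketch: the function name and parameters are hypothetical, and it assumes a radix-2 trellis with N/2 butterflies per transition, one decoded bit per transition, and one butterfly evaluation per clock cycle per implemented unit.

```python
import math

def required_clock_mhz(bit_rate_mbps, n_states, butterflies_implemented):
    """Clock needed when a reduced set of butterflies is reused per transition.

    Assumptions (not from the original text): N/2 butterflies per trellis
    transition, one decoded bit per transition, one evaluation per cycle.
    """
    total_butterflies = n_states // 2
    passes_per_transition = math.ceil(total_butterflies / butterflies_implemented)
    return bit_rate_mbps * passes_per_transition

# Fully parallel 64-state decoder at 54 Mbit/s: one pass per transition.
full_parallel = required_clock_mhz(54, 64, 32)
# Only 8 butterflies in hardware: each is reused 4 times per transition.
quarter_area = required_clock_mhz(54, 64, 8)
```

Under these assumptions the fully parallel structure needs 54 MHz, while a structure with a quarter of the butterflies needs four times that clock rate.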
The following discussion explains adaptation of metrics in general to suit the new ACS butterfly unit of
Considering a convolutional encoder based on a code rate R=1/m with m integer, m encoded bits are output by the encoder at each transition. These m bits appear in the decoder as metrics m1(bit=0), m1(bit=1), m2(bit=0), m2(bit=1), . . . , mm(bit=0), mm(bit=1). Per trellis transition, there are l=2^m different branch metrics possible:
mb1=m1(bit=0)+m2(bit=0)+ . . . +mm(bit=0)
mb2=m1(bit=1)+m2(bit=0)+ . . . +mm(bit=0)
. . .
mbl=m1(bit=1)+m2(bit=1)+ . . . +mm(bit=1)
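The enumeration above can be sketched in a few lines. The function name, the data layout and the numeric values are illustrative assumptions; only the summation rule is taken from the text.

```python
from itertools import product

def branch_metrics(bit_metrics):
    """Build all 2**m branch metrics from m per-bit metric pairs.

    bit_metrics[x] = (metric if coded bit x is 0, metric if coded bit x is 1).
    Returns a dict keyed by the hypothesised bit pattern.
    """
    m = len(bit_metrics)
    table = {}
    for bits in product((0, 1), repeat=m):
        table[bits] = sum(bit_metrics[x][b] for x, b in enumerate(bits))
    return table

# Hypothetical per-bit metrics for m = 2 (values chosen to be exact in binary).
example = branch_metrics([(1.5, -0.5), (0.25, 0.75)])
```

For m = 2 the table holds four entries, one per hypothesised pair of coded bits.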
Assuming that positive and negative branch metrics are possible, any branch metric mba ∈ (mb1, mb2, . . . , mbl) may be chosen and subtracted from all other branch metrics. The new resulting branch metrics, marked here with a prime to distinguish them from the original ones, are thus:
mb1′=mb1−mba=m1(bit=0)+m2(bit=0)+ . . . +mm(bit=0)−mba
mb2′=mb2−mba=m1(bit=1)+m2(bit=0)+ . . . +mm(bit=0)−mba
. . .
mba′=mba−mba=0
. . .
mbl′=mbl−mba=m1(bit=1)+m2(bit=1)+ . . . +mm(bit=1)−mba
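The reason this subtraction is harmless can be checked numerically: every candidate path metric is shifted by the same amount, so the add-compare-select decision does not change. The sketch below assumes that the larger metric wins and uses made-up numbers; the function name is hypothetical.

```python
def acs_select(m_node1, m_node2, m_branch1, m_branch2):
    """Add-compare-select: return (survivor index, new path metric).

    Assumption: the larger accumulated metric is the better one.
    """
    cand1 = m_node1 + m_branch1
    cand2 = m_node2 + m_branch2
    return (0, cand1) if cand1 >= cand2 else (1, cand2)

branch = [1.75, -0.25, 2.25, 0.25]           # illustrative classical branch metrics
reference = branch[0]                         # chosen metric to subtract
shifted = [mb - reference for mb in branch]   # new metrics, first one is 0

decision_old, metric_old = acs_select(3.0, 2.5, branch[1], branch[2])
decision_new, metric_new = acs_select(3.0, 2.5, shifted[1], shifted[2])
```

Both calls pick the same survivor; the two path metrics differ only by the subtracted reference value.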
Considering now the inputs to the ACS unit, there are in any case two path (or node) metrics Mnode1 and Mnode2 as well as two branch metrics mbranch1 ∈ (mb1, mb2, . . . , mbl) and mbranch2 ∈ (mb1, mb2, . . . , mbl) at the input of the ACS unit. Two cases have to be considered separately:
This rule is based on the typically valid observation that the encoder output bits remain unchanged if both the input bit to the encoder and the most significant bit (MSB) of the encoder state are inverted.
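Because of this observation, the two branch metrics of a butterfly are shared crosswise between its two add-compare-select operations. The following sketch contrasts a classical butterfly with one in which one branch metric is known to be zero. It is an illustration under assumptions (larger metric wins, hypothetical function names) and is not a description of the circuit in the drawings.

```python
def classical_butterfly(m_node1, m_node2, mb_x, mb_y):
    """Two ACS operations sharing two branch metrics (four additions)."""
    out1 = max(m_node1 + mb_x, m_node2 + mb_y)
    out2 = max(m_node1 + mb_y, m_node2 + mb_x)
    return out1, out2

def zero_metric_butterfly(m_node1, m_node2, mb_y):
    """Same butterfly when mb_x is known to be 0 (two additions remain)."""
    out1 = max(m_node1, m_node2 + mb_y)
    out2 = max(m_node1 + mb_y, m_node2)
    return out1, out2

reference_result = classical_butterfly(3.0, 2.5, 0.0, -1.5)
simplified_result = zero_metric_butterfly(3.0, 2.5, -1.5)
```

In this sketch the simplified butterfly returns the same pair of path metrics while skipping the two additions that would only add zero.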
In general, this method has the disadvantage that the new metrics obtained after the subtraction might have a larger dynamic range than the classical metrics mb1, mb2, . . . , mbl. However, the following discussion progresses from the above general case to a slightly specialised case where this disadvantage is resolved.
The only restriction that is imposed on the metrics in the following specialisation is
ma(bit=0)=−ma(bit=1) ∀a
where the expression “∀a” stands for “for all valid a”. That is to say, assuming a bit “0” has been sent, a metric “ma(bit=0)” is produced. The metric corresponding to the assumption that a bit “1” has been sent instead is simply calculated by multiplying the previous result by “−1”. This is valid for “all valid a”.
Now, the l=2^m different branch metrics can be expressed as follows:
mb1=m1(bit=0)+m2(bit=0)+ . . . +mm(bit=0)
mb2=−m1(bit=0)+m2(bit=0)+ . . . +mm(bit=0)
. . .
mbl=−m1(bit=0)−m2(bit=0)− . . . −mm(bit=0)
If any metric mba ∈ (mb1, mb2, . . . , mbl) is chosen among them and subtracted from all metrics mb1, mb2, . . . , mbl, the resulting metrics are
Each contribution ±mx(bit=0) is either multiplied by 2 or set to 0. Since all metrics can be multiplied by a positive constant factor without changing the decision path of the Viterbi decoder, the resulting metrics shall be multiplied by 1/2.
Then, we find l=2^m new metrics adapted to the new ACS units that require neither more complex metric calculation nor a higher dynamic range:
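As a worked sketch of this specialisation, derived here from the definitions above rather than quoted from the original formulas, let b = (b_1, ..., b_m) be the hypothesised coded bits of a branch and a = (a_1, ..., a_m) those of the chosen reference metric. The index notation and the primed symbol are assumptions introduced for this illustration.

```latex
\begin{align*}
mb(b) &= \sum_{x=1}^{m} (1 - 2b_x)\, m_x(\mathrm{bit}=0)
  && \text{using } m_x(\mathrm{bit}=1) = -m_x(\mathrm{bit}=0) \\
mb(b) - mb(a) &= \sum_{x=1}^{m} 2\,(a_x - b_x)\, m_x(\mathrm{bit}=0)
  && \text{each term is } 0 \text{ or } \pm 2\, m_x(\mathrm{bit}=0) \\
mb'(b) = \tfrac{1}{2}\bigl(mb(b) - mb(a)\bigr) &= \sum_{x=1}^{m} (a_x - b_x)\, m_x(\mathrm{bit}=0)
  && \text{each term is } 0 \text{ or } \pm m_x(\mathrm{bit}=0)
\end{align*}
```

With the all-zero pattern as reference, this reduces to mb'(b) being minus the sum of m_x(bit=0) over the positions where b_x = 1, so no term exceeds the magnitude of a classical per-bit metric.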
In OFDM (Orthogonal Frequency Division Multiplex) systems, the metrics are very often calculated based on symbols which have been constructed using BPSK (Binary Phase Shift Keying), QPSK (Quadrature Phase Shift Keying), QAM-16, QAM-64 (Quadrature Amplitude Modulation) or similar constellations. U.S. Pat. No. 5,742,621, 1998 (MOTOROLA) presents a very efficient implementation of the known BPSK/QPSK metrics:
In the example metrics of Table 1, z1 is the complex transmitted symbol, H1* is the complex conjugate of the channel coefficient and y1=H1·z1+ν is the received complex symbol with ν being additive white Gaussian noise (AWGN). For QAM-16, QAM-64, etc., similar metrics can be derived. These metrics are especially important in the framework of OFDM systems.
For this example, a code rate of R=1/2, a constraint length of K=7 and a convolutional encoder based on the generator polynomials G1=133 (octal), G2=171 (octal) are assumed. The non-optimised BPSK metrics may be defined for example as
mb1=m1(bit=0)+m2(bit=0)=sign(real(z1))·real(y1·H1*)+sign(real(z2))·real(y2·H2*)
mb2=m1(bit=1)+m2(bit=0)=−sign(real(z1))·real(y1·H1*)+sign(real(z2))·real(y2·H2*)
mb3=m1(bit=0)+m2(bit=1)=sign(real(z1))·real(y1·H1*)−sign(real(z2))·real(y2·H2*)
mb4=m1(bit=1)+m2(bit=1)=−sign(real(z1))·real(y1·H1*)−sign(real(z2))·real(y2·H2*)
Choosing for example ma=mb1, the optimised metrics are
All l−1=2^m−1 non-zero metrics are pre-calculated by the Transition Metric Unit (TMU). Altogether there are l=2^m different ACS butterflies (the two butterfly entries are not independent, which is why not all metric combinations are mixed and the number of different butterflies is limited to l=2^m). With K being the constraint length of the convolutional encoder, there are
ACS butterflies having a zero-metric as an input. Here, the new, optimised butterfly of
It should be noted that the new, optimised metrics are less complex (2 multiplications, 1 addition) than the classical ones mb1, mb2, mb3, mb4 (2 multiplications, 2 additions, 2 sign inversions).
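The operation count can be made concrete with a small numeric sketch of the R=1/2 BPSK case. It assumes that bit 0 maps to a symbol with positive real part, so that sign(real(z)) is +1 for the hypothesis bit=0, that mb1 is the reference metric, and that the differences are halved. Function names and sample values are hypothetical.

```python
def bpsk_bit_metric(y, h):
    """real(y * conj(h)): metric for the hypothesis 'bit = 0'.

    Assumption: bit 0 is mapped to the BPSK symbol with positive real part.
    """
    return (y * h.conjugate()).real

def classical_metrics(c1, c2):
    """mb1..mb4 as defined above: two additions plus sign inversions."""
    return [c1 + c2, -c1 + c2, c1 - c2, -c1 - c2]

def optimised_metrics(c1, c2):
    """mb1 taken as reference and differences halved: a single addition."""
    return [0.0, -c1, -c2, -(c1 + c2)]

# Illustrative received symbols and channel coefficients.
c1 = bpsk_bit_metric(complex(1.0, 0.5), complex(0.5, -0.5))    # 0.25
c2 = bpsk_bit_metric(complex(-0.5, 1.0), complex(1.0, -0.5))   # -1.0

classical = classical_metrics(c1, c2)
optimised = optimised_metrics(c1, c2)
```

In this sketch the optimised list equals the classical list minus its first entry, divided by two, so the ordering of the four hypotheses is preserved.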
The resulting four ACS butterflies are presented by
In
In the preceding discussion, additive Gaussian noise with a constant mean noise power σ²noise and a mean value μnoise=0 has been assumed. In the case of a non-zero mean value μnoise≠0, the mean value is simply subtracted from the received symbols. Using the notation of example 1, the received symbol is in this case
y1=[H1·z1+v]−μnoise=H1·z1+(v−μnoise).
Now, (v−μnoise) can be considered as zero-mean and the metrics can be used as before.
If the mean noise power depends on the received symbol (σ²noise → |cn|²·σ²noise), the new metrics must be divided by the corresponding gain:
Respecting these rules, the metrics can also be used in coloured noise environments.
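The two rules can be sketched as a pre-processing step. This is an illustration under assumptions: the function names are hypothetical, and the "corresponding gain" is taken to be the noise power factor |cn|², which is an interpretation of the text rather than a quoted formula.

```python
def remove_noise_mean(y, mu_noise):
    """Subtract a known non-zero noise mean from a received complex symbol."""
    return y - mu_noise

def scale_metric(metric, c_n):
    """Divide a metric by the noise power gain |c_n|**2 (assumed gain term)."""
    return metric / abs(c_n) ** 2

# Illustrative values only.
y_zero_mean = remove_noise_mean(complex(1.5, -0.5), complex(0.5, 0.5))
scaled = scale_metric(3.0, complex(0.0, 2.0))
```

After these two steps the residual noise is treated as zero-mean with the same power for every symbol, which is the condition the metrics above assume.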
In general, the placements of the different butterfly types are found by the following exhaustive search:
Practically, the ACS structure can be exploited in different ways:
Based on the exhaustive search proposed above, the four different ACS butterfly types shown in
There are 2^(K−1)=64 trellis states and correspondingly 64 path (or node) metric buffers. These buffers are connected to the ACS units as indicated by the following Table 2 (for the standard generator polynomials G1=133 (octal), G2=171 (octal) of the convolutional encoder used by the HIPERLAN/2 standard).
It will be understood that 50% of all butterflies are of the type I and II (low complexity) and the other 50% are of the type III and IV (classical butterflies), and that the total saving in complexity is approx. 8% compared to the total complexity of the classical Viterbi decoder.
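The 50% figure can be reproduced with a short enumeration for G1=133 (octal), G2=171 (octal), K=7. This sketch does not reproduce Table 2. It assumes the 7-bit encoder register holds the new input bit at one end and the oldest state bit at the other, and it classifies each butterfly by the coded bit pair produced when both end bits are zero. All names are hypothetical.

```python
from collections import Counter

K = 7
G1, G2 = 0o133, 0o171
END_BITS = (1 << (K - 1)) | 1   # newest input bit and oldest state bit

def parity(x):
    return bin(x).count("1") & 1

def encoder_output(register):
    """Coded bit pair for a 7-bit register content."""
    return (parity(register & G1), parity(register & G2))

def end_bits_flip_is_invisible():
    """Check that inverting both end bits never changes the coded bits."""
    return all(encoder_output(r) == encoder_output(r ^ END_BITS)
               for r in range(1 << K))

def classify_butterflies():
    """Count butterflies by coded pair, one butterfly per middle-bit pattern."""
    counts = Counter()
    for middle in range(1 << (K - 2)):
        counts[encoder_output(middle << 1)] += 1
    return counts

counts = classify_butterflies()
# If the metric of pattern (0, 0) is the zero metric, butterflies labelled
# (0, 0) or its complement (1, 1) have a zero metric at their input.
zero_metric_share = (counts[(0, 0)] + counts[(1, 1)]) / sum(counts.values())
```

Under these assumptions the 32 butterflies split evenly into four groups of 8, and half of them see a zero metric at their input.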
In conclusion, it will be understood that the Viterbi decoder described above provides the following advantages:
The proposed technique may be used for any Viterbi decoder in general. However, it is especially interesting for OFDM systems, since the resulting optimised metrics do not require any additional precision, at least if the metric calculation is performed adequately, as presented by the example of Table 1.
The technique is especially interesting for a coding rate of R=½, since 50% of all ACS butterflies can be substituted by low-complexity, optimised ACS butterflies. For smaller coding rates, this percentage decreases exponentially.
The applications of the new method are principally found in high-speed applications where massive-parallel structures are required. Here, the savings in complexity/surface/power-consumption are maximal.
Number | Date | Country | Kind |
---|---|---|---|
00403711 | Dec 2000 | EP | regional |

Number | Name | Date | Kind |
---|---|---|---|
5291499 | Behrens et al. | Mar 1994 | A |
5327440 | Fredrickson et al. | Jul 1994 | A |
5414738 | Bienz | May 1995 | A |
5530707 | Lin | Jun 1996 | A |
5742621 | Amon et al. | Apr 1998 | A |
5815515 | Dabiri | Sep 1998 | A |
5928378 | Choi | Jul 1999 | A |
5970097 | Ishikawa et al. | Oct 1999 | A |
6163581 | Kang | Dec 2000 | A |
6334202 | Pielmeier | Dec 2001 | B1 |
6553541 | Nikolic et al. | Apr 2003 | B1 |
6697443 | Kim et al. | Feb 2004 | B1 |
20010007142 | Hocevar et al. | Jul 2001 | A1 |

Number | Date | Country |
---|---|---|
2 769 434 | Apr 1999 | FR |

Number | Date | Country |
---|---|---|
20020126776 A1 | Sep 2002 | US |