Iterative decoding

Information

  • Patent Grant
  • Patent Number
    6,700,937
  • Date Filed
    Thursday, April 13, 2000
  • Date Issued
    Tuesday, March 2, 2004
  • Examiners
    • Tran; Khai
    • Tran; Khanh Cong
  • Agents
    • Brendzel; Henry T.
Abstract
This invention provides an iterative process to maximum a posteriori (MAP) decoding. The iterative process uses an auxiliary function which is defined in terms of a complete data probability distribution. The auxiliary function is derived based on an expectation maximization (EM) algorithm. For a special case of trellis coded modulators, the auxiliary function may be iteratively evaluated by a combination of forward-backward and Viterbi algorithms. The iterative process converges monotonically and thus improves the performance of any decoding algorithm. The MAP decoding minimizes a probability of error. A direct approach to achieve this minimization results in complexity which grows exponentially with T, where T is the size of the input. The iterative process avoids this complexity by converging on the MAP solution through repeated maximization of the auxiliary function.
Description




BACKGROUND OF THE INVENTION




1. Field of Invention




This invention relates to iterative decoding of input sequences.




2. Description of Related Art




Maximum a posteriori (MAP) sequence decoding selects a most probable information sequence X_1^T = (X_1, X_2, . . . , X_T) that produced the received sequence Y_1^T = (Y_1, Y_2, . . . , Y_T). For transmitters and/or channels that are modeled using Hidden Markov Models (HMMs), the process for obtaining the information sequence X_1^T that corresponds to a maximum probability is difficult due to a large number of possible hidden states as well as a large number of possible information sequences X_1^T. Thus, new technology is needed to improve MAP decoding for HMMs.




SUMMARY OF THE INVENTION




This invention provides an iterative process to maximum a posteriori (MAP) decoding. The iterative process uses an auxiliary function which is defined in terms of a complete data probability distribution. The MAP decoding is based on an expectation maximization (EM) algorithm which finds the maximum by iteratively maximizing the auxiliary function. For a special case of trellis coded modulation, the auxiliary function may be maximized by a combination of forward-backward and Viterbi algorithms. The iterative process converges monotonically and thus improves the performance of any decoding algorithm.




The MAP decoding decodes received inputs by minimizing a probability of error. A direct approach to achieve this minimization results in a complexity which grows exponentially with T, where T is the size of the input. The iterative process avoids this complexity by converging on the MAP solution through repeated use of the auxiliary function.











BRIEF DESCRIPTION OF THE DRAWINGS




The invention is described in detail with reference to the following figures where like numerals reference like elements, and wherein:





FIG. 1 shows a diagram of a communication system;

FIG. 2 shows a flow chart of an exemplary iterative process;

FIGS. 3-6 show state trajectories determined by the iterative process;

FIG. 7 shows an exemplary block diagram of the receiver shown in FIG. 1;

FIG. 8 shows a flowchart for an exemplary process of the iterative process for a TCM example; and

FIG. 9 shows step 1004 of the flowchart of FIG. 8 in greater detail.











DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS





FIG. 1 shows an exemplary block diagram of a communication system 100. The communication system 100 includes a transmitter 102, a channel 106 and a receiver 104. The transmitter 102 receives an input information sequence I_1^T (i.e., I_1, I_2, . . . , I_T) of length T, for example. The input information sequence may represent any type of data including analog voice, analog video, digital image, etc. The transmitter 102 may represent a speech synthesizer, a signal modulator, etc.; the receiver 104 may represent a speech recognizer, a radio receiver, etc.; and the channel 106 may be any medium through which the information sequence X_1^T (i.e., X_1, X_2, . . . , X_T) is conveyed to the receiver 104. The transmitter 102 may encode the information sequence I_1^T and transmit the encoded information sequence X_1^T through the channel 106, and the receiver 104 receives the information sequence Y_1^T (i.e., Y_1, Y_2, . . . , Y_T). The problem in communications is, of course, to decode Y_1^T in such a way as to retrieve I_1^T.




Maximum a posteriori (MAP) sequence decoding is a technique that decodes the received sequence Y_1^T by minimizing a probability of error to obtain X_1^T (and, if a model of the transmitter 102 is included, to obtain I_1^T). In MAP, the goal is to choose a most probable X_1^T that produces the received Y_1^T. The MAP estimator may be expressed by equation 1 below.











X̂_1^T = arg max_{X_1^T} Pr(X_1^T, Y_1^T)    (1)













where Pr(·) denotes a corresponding probability or probability density function and X̂_1^T is an estimate of X_1^T. Equation 1 sets X̂_1^T to the X_1^T that maximizes Pr(X_1^T, Y_1^T).




The Pr(X_1^T, Y_1^T) term may be obtained by modeling the channel 106 of the communication system 100 using techniques such as Hidden Markov Models (HMMs). An input-output HMM λ = (S, X, Y, π, {P(X,Y)}) is defined by its internal states S = {1, 2, . . . , n}, inputs X, outputs Y, initial state probability vector π, and the input-output probability density matrices (PDMs) P(X,Y), X ∈ X, Y ∈ Y. The elements of P(X,Y), p_ij(X,Y) = Pr(j, X, Y | i), are conditional probability density functions (PDFs) of input X and corresponding output Y after transferring from state i to state j. It is assumed that the state sequence S_0^t = (S_0, S_1, . . . , S_t), input sequence X_1^t = (X_1, X_2, . . . , X_t), and output sequence Y_1^t = (Y_1, Y_2, . . . , Y_t) possess the following Markovian property








Pr(S_t, X_t, Y_t | S_0^{t−1}, X_1^{t−1}, Y_1^{t−1}) = Pr(S_t, X_t, Y_t | S_{t−1}).






Using HMM, the PDF of the input sequence X_1^T and output sequence Y_1^T may be expressed by equation 2 below:











p_T(X_1^T, Y_1^T) = π ∏_{i=1}^{T} P(X_i, Y_i) 1    (2)













where 1 is a column vector of n ones, π is a vector of state initial probabilities, and n is a number of states in the HMM. Thus, the MAP estimator when using HMM may be expressed by equation 3 below:











X̂_1^T = arg max_{X_1^T} [ π ∏_{i=1}^{T} P(X_i, Y_i) 1 ]    (3)













The maximization required by equation 3 is a difficult problem because all possible sequences of X_1^T must be considered. This requirement results in a complexity that grows exponentially with T. This invention provides an iterative process to obtain the maximum without the complexity of directly achieving the maximization by evaluating equation 2 for all possible X_1^T, for example. In the iterative process, an auxiliary function is developed whose iterative maximization generates a sequence of estimates for X_1^T approaching the maximum point of equation 2.
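By way of illustration, a direct search that evaluates equation 2 for every candidate sequence makes the exponential cost explicit. The following sketch is illustrative only; it assumes a small input alphabet, toy HMM matrices indexed as P[x][y], and hypothetical function names.

```python
import itertools
import numpy as np

def hmm_joint_pdf(pi, P, xs, ys):
    """Equation 2: pi * prod_t P(X_t, Y_t) * 1 for one candidate input sequence xs."""
    v = pi                              # row vector of initial state probabilities
    for x, y in zip(xs, ys):
        v = v @ P[x][y]                 # P[x][y]: n-by-n PDM evaluated at (X_t, Y_t)
    return v.sum()                      # right-multiply by the column vector of ones

def brute_force_map(pi, P, alphabet, ys):
    """Direct MAP search of equation 3: complexity |alphabet|**T, exponential in T."""
    best_xs, best_val = None, -np.inf
    for xs in itertools.product(alphabet, repeat=len(ys)):
        val = hmm_joint_pdf(pi, P, xs, ys)
        if val > best_val:
            best_xs, best_val = xs, val
    return best_xs
```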




The iterative process is derived based on the expectation maximization (EM) algorithm. Because the EM algorithm converges monotonically, the iterative process may improve the performance of any decoding algorithm by using its output as an initial sequence of the iterative decoding algorithm. In the following description, it is assumed that HMM parameters for the channel 106 and/or the transmitter 102 are available either by design or by techniques such as training.
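The monotone convergence referred to above is the standard EM inequality; the following short derivation, supplied here for clarity in the notation of equations 4-6 below (it is not part of the patent text), indicates why maximizing the auxiliary function cannot decrease the joint probability.

```latex
% Jensen's inequality applied to \log\Pr(X_1^T,Y_1^T)=\log\sum_z\Psi(z,X_1^T,Y_1^T),
% with weights \Psi(z,X_{1,p}^T,Y_1^T)/\Pr(X_{1,p}^T,Y_1^T) that sum to one, gives
\log\Pr(X_1^T,Y_1^T)-\log\Pr(X_{1,p}^T,Y_1^T)\;\ge\;
  \frac{Q(X_1^T,X_{1,p}^T)-Q(X_{1,p}^T,X_{1,p}^T)}{\Pr(X_{1,p}^T,Y_1^T)} .
% Hence any X_{1,p+1}^T with Q(X_{1,p+1}^T,X_{1,p}^T)\ge Q(X_{1,p}^T,X_{1,p}^T)
% satisfies \Pr(X_{1,p+1}^T,Y_1^T)\ge\Pr(X_{1,p}^T,Y_1^T).
```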




The auxiliary function may be defined in terms of a complete data probability distribution shown in equation 4 below.











Ψ(z, X_1^T, Y_1^T) = π_{i_0} ∏_{t=1}^{T} p_{i_{t−1} i_t}(X_t, Y_t),    (4)













where z = i_0^T is an HMM state sequence, π_{i_0} is the probability of the initial state i_0, and p_ij(X,Y) are the elements of the matrix P(X,Y). The MAP estimator of equation 1 can be obtained iteratively by equations 5-9 as shown below.











X_{1,p+1}^T = arg max_{X_1^T} Q(X_1^T, X_{1,p}^T),   p = 0, 1, 2, . . .    (5)













where p is the iteration number and Q(X_1^T, X_{1,p}^T) is the auxiliary function, which may be expressed as










Q(X_1^T, X_{1,p}^T) = Σ_z Ψ(z, X_{1,p}^T, Y_1^T) log(Ψ(z, X_1^T, Y_1^T)).    (6)













The auxiliary function may be expanded based on equation 4 as follows:










Q(X_1^T, X_{1,p}^T) = Σ_{t=1}^{T} Σ_{i=1}^{n} Σ_{j=1}^{n} γ_{t,ij}(X_{1,p}^T) log(p_ij(X_t, Y_t)) + C    (7)













where C does not depend on X_1^T, n is the number of states in the HMM, and

γ_{t,ij}(X_{1,p}^T) = α_i(X_{1,p}^{t−1}, Y_1^{t−1}) p_ij(X_{t,p}, Y_t) β_j(X_{t+1,p}^T, Y_{t+1}^T)    (8)






where α_i(X_{1,p}^t, Y_1^t) and β_j(X_{t+1,p}^T, Y_{t+1}^T) are the elements of the following forward and backward probability vectors











α(X_1^t, Y_1^t) = π ∏_{i=1}^{t} P(X_i, Y_i),  and  β(X_{t+1}^T, Y_{t+1}^T) = ∏_{i=t+1}^{T} P(X_i, Y_i) 1.    (9)













Based on equations 5-9, the iterative process may proceed as follows. At p = 0, an initial estimate of X_{1,0}^T is generated. Then, Q(X_1^T, X_{1,0}^T) is generated for all possible sequences of X_1^T. From equations 7 and 8, Q(X_1^T, X_{1,0}^T) may be evaluated by generating γ_{t,ij}(X_{1,0}^T) and log(p_ij(X_t, Y_t)) for each t, i, and j. γ_{t,ij}(X_{1,0}^T) may be generated by using the forward-backward algorithm as shown below:






α(X_{1,p}^0, Y_1^0) = π,  α(X_{1,p}^t, Y_1^t) = α(X_{1,p}^{t−1}, Y_1^{t−1}) P(X_{t,p}, Y_t),  t = 1, 2, . . . , T

β(X_{T+1,p}^T, Y_{T+1}^T) = 1,  β(X_{t,p}^T, Y_t^T) = P(X_{t,p}, Y_t) β(X_{t+1,p}^T, Y_{t+1}^T),  t = T, T−1, . . . , 1
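These recursions are repeated matrix-vector products with the per-step probability density matrices. The sketch below is a minimal illustration of equations 8 and 9, assuming the matrices P(X_{t,p}, Y_t) along the current estimate have already been evaluated and are passed in as a list; the function name is hypothetical.

```python
import numpy as np

def forward_backward_gammas(pi, P_t):
    """P_t[t-1] is the n-by-n matrix P(X_{t,p}, Y_t) along the current estimate.
    Returns forward vectors, backward vectors and the gamma_{t,ij} of equation 8."""
    T, n = len(P_t), len(pi)
    alphas = [pi]                                # alpha(X^0, Y^0) = pi
    for t in range(T):
        alphas.append(alphas[t] @ P_t[t])        # forward recursion
    betas = [None] * (T + 2)
    betas[T + 1] = np.ones(n)                    # beta at T+1 is the all-ones vector
    for t in range(T, 0, -1):
        betas[t] = P_t[t - 1] @ betas[t + 1]     # backward recursion
    # gamma_{t,ij} = alpha_i(t-1) * p_ij(X_{t,p}, Y_t) * beta_j(t+1)
    gammas = [np.outer(alphas[t - 1], betas[t + 1]) * P_t[t - 1] for t in range(1, T + 1)]
    return alphas, betas, gammas
```

Here gammas[t-1][i][j] corresponds to γ_{t,ij}(X_{1,p}^T) of equation 8.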






Log(p_ij(X_t, Y_t)) is generated for all possible X_t for t = 1, 2, . . . , T, and the X_t's that maximize Q(X_1^T, X_{1,0}^T) are selected as X_{1,1}^T. After X_{1,1}^T is obtained, it is compared with X_{1,0}^T. If a measure D(X_{1,1}^T, X_{1,0}^T) of the difference between the sequences exceeds a compare threshold, then the above process is repeated until the difference measure D(X_{1,p}^T, X_{1,p−1}^T) is within the threshold. The last X_{1,p}^T after p iterations is the decoded output. The measure of difference may be an amount of mismatch information. For example, if X_1^T is a sequence of symbols, then the measure may be the number of differing symbols between X_{1,p}^T and X_{1,p−1}^T (Hamming distance); if X_1^T is a sequence of real numbers, then the measure may be a Euclidean distance D(X_{1,p}^T, X_{1,p−1}^T) = [Σ_{i=1}^{T} (X_{i,p} − X_{i,p−1})^2]^{1/2}.
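Either measure of difference is simple to evaluate; a minimal sketch (with hypothetical helper names) follows.

```python
import numpy as np

def hamming_difference(x_new, x_old):
    """Number of differing symbols between successive estimates (symbol sequences)."""
    return sum(a != b for a, b in zip(x_new, x_old))

def euclidean_difference(x_new, x_old):
    """Euclidean distance between successive estimates (real-valued sequences)."""
    return float(np.sqrt(np.sum((np.asarray(x_new) - np.asarray(x_old)) ** 2)))
```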





FIG. 2 shows a flowchart of the above-described process. In step 1000, the receiver 104 receives the input information sequence Y_1^T and goes to step 1002. In step 1002, the receiver 104 selects an initial estimate for the decoded output information sequence X_{1,0}^T and goes to step 1004. In step 1004, the receiver 104 generates γ_{t,ij}(X_{1,p}^T), where p = 0 for the first iteration, and goes to step 1006. In step 1006, the receiver 104 generates all the log(p_ij(X_t, Y_t)) values and goes to step 1008.




In step 1008, the receiver 104 selects a sequence X_{1,p+1}^T that maximizes Q(X_{1,p+1}^T, X_{1,p}^T) and goes to step 1010. In step 1010, the receiver 104 compares X_{1,p}^T with X_{1,p+1}^T. If the compare result is within the compare threshold, then the receiver 104 goes to step 1012; otherwise, the receiver 104 returns to step 1004 and continues the process with the new sequence X_{1,p+1}^T. In step 1012, the receiver 104 outputs X_{1,p+1}^T, goes to step 1014, and ends the process.




The efficiency of the above-described iterative technique may be improved if the transmitted sequence is generated by modulators such as a trellis coded modulator (TCM). A TCM may be described as a finite state machine defined by equations 10 and 11 shown below.








S_{t+1} = f_t(S_t, I_t)    (10)

X_t = g_t(S_t, I_t)    (11)






Equation 10 specifies the TCM state transitions while equation 11 specifies the transmitted information sequence based on the state and the input information sequence. For example, after receiving input I


t


in state S


t


, the finite state machine transfers to state S


t+1


based on S


t


and I


t


as shown in equation 10. The actual output by the transmitter


102


is X


t


according to equation 11. Equation 10 may represent a convolutional encoder and equation 11 may represent a modulator. For the above example, the transmitter output information sequence may not be independent even if the input information sequence X


1




T


is independent.
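As a concrete, purely illustrative instance of equations 10 and 11 (the particular taps and symbol mapping below are assumptions, not taken from the patent), a four-state rate-1/2 convolutional encoder feeding a four-symbol mapper can be written as:

```python
def f_t(state, bit):
    """Equation 10: next encoder state (2-bit shift register), assuming a 4-state code."""
    return ((state << 1) | bit) & 0b11

def g_t(state, bit):
    """Equation 11: transmitted symbol; two coded bits (assumed taps) mapped to one of
    four modulation symbols."""
    s1, s0 = (state >> 1) & 1, state & 1
    c0 = bit ^ s0 ^ s1          # first coded bit (assumed taps)
    c1 = bit ^ s1               # second coded bit (assumed taps)
    return 2 * c1 + c0          # index of the modulation symbol X_t

def encode(info_bits):
    """Run the finite state machine over I_1^T starting from state 0."""
    state, symbols = 0, []
    for bit in info_bits:
        symbols.append(g_t(state, bit))
        state = f_t(state, bit)
    return symbols
```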




In equation 15, the log(p_ij(Y_t, X_t)) term may be analyzed based on the TCM state transitions because the information actually transmitted, X_t, is related to the source information I_t by X_t = g_t(S_t, I_t). This relationship between X_t and I_t forces many elements p_ij(Y_t, X_t) of P(Y_t, X_t) to zero, since the finite state machine (equations 10 and 11) removes many possibilities that otherwise must be considered. Thus, unlike the general case discussed in relation to equations 5-9, evaluation of p_ij(Y_t, X_t) may be divided into a portion that is channel related and another portion that is TCM related. The following discussion describes the iterative technique in detail for the TCM example.




For a TCM system with an independent and identically distributed information sequence, an input-output HMM may be described by equations 12 and 13 below.








P(X_t, Y_t) = [ p_{S_t S_{t+1}} P_c(Y_t | X_t) ],    (12)

where

p_{S_t S_{t+1}} = Pr(I_t) if S_{t+1} = f_t(S_t, I_t), and 0 otherwise.    (13)

P_c(Y_t | X_t) is the conditional PDM of receiving Y_t given that X_t has been transmitted for the HMM of a medium (channel) through which the information sequence is transmitted; p_{S_t S_{t+1}} is the probability of the TCM transition from state S_t to state S_{t+1}; and Pr(I_t) is the probability of an input I_t. Thus, equation 2 may be written as

p_T(I_1^T, Y_1^T) = π_c ∏_{t=1}^{T} p_{S_t S_{t+1}} P_c(Y_t | X_t) 1,

where π_c is a vector of the initial probabilities of the channel states, X_t = g_t(S_t, I_t), and the product is taken along the state trajectory S_{t+1} = f_t(S_t, I_t) for t = 1, 2, . . . , T.
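Evaluating this product along a given state trajectory amounts to walking the encoder states while multiplying channel PDMs (the input probabilities Pr(I_t) contribute only a scalar factor and are omitted below). A minimal sketch, reusing the hypothetical f_t, g_t and Pc[y][x] conventions of the earlier sketches:

```python
import numpy as np

def tcm_sequence_likelihood(pi_c, Pc, f_t, g_t, info_bits, y):
    """Evaluate pi_c * prod_t Pc(Y_t | X_t) * 1 along the trajectory S_{t+1} = f_t(S_t, I_t)."""
    v, s = pi_c, 0                     # channel forward vector and encoder state (start at 0)
    for i_t, y_t in zip(info_bits, y):
        x_t = g_t(s, i_t)              # equation 11
        v = v @ Pc[y_t][x_t]           # channel PDM evaluated at (Y_t | X_t)
        s = f_t(s, i_t)                # equation 10
    return v.sum()
```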




If all elements of the input information sequence are equally probable, then the MAP estimate may be expressed by equation 14 below.












Î_1^T = arg max_{I_1^T} [ π_c ∏_{t=1}^{T} P_c(Y_t | X_t) 1 ],    (14)













The auxiliary function may be expressed by equations 15-17 below corresponding to equations 7-9 above.










Q(I_1^T, I_{1,p}^T) = Σ_{t=1}^{T} Σ_{i=1}^{n} Σ_{j=1}^{n} γ_{t,ij}(I_{1,p}^T) log(p_ij(Y_t | X_t)) + C    (15)













where X_t = g_t(S_t, I_t) and






γ_{t,ij}(I_{1,p}^T) = α_i(Y_1^{t−1} | I_{1,p}^{t−1}) p_{c,ij}(Y_t | X_{t,p}) β_j(Y_{t+1}^T | I_{t+1,p}^T)    (16)

α_i(Y_1^{t−1} | I_{1,p}^{t−1}) and β_j(Y_{t+1}^T | I_{t+1,p}^T)






are the elements of the forward and backward probability vectors










α(Y_1^t | I_1^T) = π_c ∏_{i=1}^{t} P_c(Y_i | X_i)  and  β(Y_{t+1}^T | I_{t+1}^T) = ∏_{i=t+1}^{T} P_c(Y_i | X_i) 1    (17)













From equation 15, the Viterbi algorithm may be applied with the branch metric











m(I_t) = Σ_{i=1}^{n} Σ_{j=1}^{n} γ_{t,ij}(I_{1,p}^T) log p_{c,ij}(Y_t | X_t),   t = 1, 2, . . . , T    (18)













to find a maximum of Q(I_1^T, I_{1,p}^T), which can be interpreted as a longest path leading from the initial zero state to one of the states S_T, where only the encoder trellis is considered. The Viterbi algorithm may be combined with the backward portion of the forward-backward algorithm as follows.




1. Select an initial source information sequence I_{1,0}^T = I_{1,0}, I_{2,0}, . . . , I_{T,0}.

2. Forward part:

a. set α(Y_1^0 | I_1^0) = π, where π is an initial state probability estimate; and

b. for t = 1, 2, . . . , T, compute X_{t,p} = g_t(S_t, I_{t,p}) and α(Y_1^t | I_{1,p}^t) = α(Y_1^{t−1} | I_{1,p}^{t−1}) P_c(Y_t | X_{t,p}), where I_{1,p}^t is a prior estimate of I_1^t.

3. Backward part:

a. set β(Y_{T+1}^T | I_{T+1,p}^T) = 1 and the last state transition lengths L(S_T) to 0 for all the states;

for t = T, T−1, . . . , 1 compute:

b. X_t = g_t(S_t, I_t),

c. γ_{t,ij}(I_{1,p}^T) = α_i(Y_1^{t−1} | I_{1,p}^{t−1}) p_{c,ij}(Y_t | X_{t,p}) β_j(Y_{t+1}^T | I_{t+1,p}^T),

d. L(S_t) = max_{I_t} { L[f_t(S_t, I_t)] + m(I_t) }.

This step selects the paths with the largest lengths (the survivors).

e. Î_t(S_t) = arg max_{I_t} { L[f_t(S_t, I_t)] + m(I_t) }.

This step estimates the I_t corresponding to the state S_t by selecting the I_t of the survivor in step d.

f. β(Y_t^T | I_{t,p}^T) = P_c(Y_t | X_{t,p}) β(Y_{t+1}^T | I_{t+1,p}^T).

g. End (of "for" loop).

4. Reestimate the information sequence: I_{t,p+1} = Î_t(Ŝ_t), Ŝ_{t+1} = f_t(Ŝ_t, I_{t,p+1}), t = 1, 2, . . . , T, where Ŝ_1 = 0; and

5. If I_{t,p+1} ≠ I_{t,p} for any t, go to step 2; otherwise decode the information sequence as I_{1,p+1}^T.
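The combined backward/Viterbi recursion above can be sketched compactly as follows. This is an illustrative outline only, assuming binary inputs, the hypothetical f_t and g_t of the earlier encoder sketch, channel PDMs indexed as Pc[y][x], and made-up helper names; it is not the patent's reference implementation.

```python
import numpy as np

def viterbi_backward_step(y, i_prev, pi_c, Pc, f_t, g_t, n_states_tcm=4, n_inputs=2):
    """One iteration (steps 2-4): returns the re-estimated source sequence I_{1,p+1}^T."""
    T = len(y)
    # Step 2: forward part along the previous trajectory i_prev.
    alphas, states_prev = [pi_c], [0]              # encoder starts in state 0
    for t in range(T):
        x = g_t(states_prev[t], i_prev[t])
        alphas.append(alphas[t] @ Pc[y[t]][x])
        states_prev.append(f_t(states_prev[t], i_prev[t]))
    # Step 3: backward part combined with the Viterbi survivor search.
    beta = np.ones(len(pi_c))                      # beta at T+1
    L = np.zeros(n_states_tcm)                     # L(S_T) = 0 for all states
    best_input = [[0] * n_states_tcm for _ in range(T)]
    for t in range(T, 0, -1):
        x_prev = g_t(states_prev[t - 1], i_prev[t - 1])
        gamma = np.outer(alphas[t - 1], beta) * Pc[y[t - 1]][x_prev]       # eq 16
        new_L = np.full(n_states_tcm, -np.inf)
        for s_t in range(n_states_tcm):
            for i_t in range(n_inputs):
                x_t = g_t(s_t, i_t)
                m = np.sum(gamma * np.log(Pc[y[t - 1]][x_t] + 1e-300))     # eq 18
                cand = L[f_t(s_t, i_t)] + m                                # step d
                if cand > new_L[s_t]:
                    new_L[s_t], best_input[t - 1][s_t] = cand, i_t         # step e
        beta = Pc[y[t - 1]][x_prev] @ beta                                 # step f
        L = new_L
    # Step 4: read the survivors forward from the initial zero state.
    s, i_next = 0, []
    for t in range(T):
        i_t = best_input[t][s]
        i_next.append(i_t)
        s = f_t(s, i_t)
    return i_next
```

An outer loop like the one sketched after FIG. 2 would call this function repeatedly (steps 1 and 5) until the estimate stops changing.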





FIGS. 3-6 show an example of the iterative process discussed above where there are four states in the TCM and T = 5. The dots represent possible states and the arrows represent a state trajectory that corresponds to a particular information sequence. The iterative process may proceed as follows. First, an initial input information sequence I_{1,0}^5 is obtained. I_{1,0}^5 may be the output of an existing decoder or may simply be a guess.




The Viterbi algorithm together with the backward algorithm may be used to obtain a next estimate of the input information sequence I_{1,1}^5. This process begins with the state transitions between t = 4 and t = 5 by selecting the state transitions leading to each of the states s0-s3 at t = 4 from states at t = 5 that have the largest value of the branch metric L(S_4) = m(I_5) of equation 18 above. Then, the process moves to select state transitions between the states at t = 3 and t = 4 that have the largest cumulative distance L(S_3) = L(S_4) + m(I_4). This process continues until t = 0, and the sequence of input information I_1^5 corresponding to the path connecting the states from t = 0 to t = 5 that has the longest path L(S_0) = Σ_{t=1}^{5} m(I_t) is selected as the next input information sequence I_{1,1}^5.




For the example in FIG. 3, state transitions from the states at t = 4 to all the states at t = 5 are considered. Assuming that the I_t's are binary, only two transitions can emanate from each of the states at t = 4: one transition for I_5 = 0 and one transition for I_5 = 1. Thus, FIG. 3 shows two arrows terminating on each state at t = 4 (the arrows are "backwards" because the backward algorithm is used). State transitions 301 and 302 terminate at state s0; state transitions 303 and 304 terminate at state s1; state transitions 305 and 306 terminate at state s2; and state transitions 307 and 308 terminate at state s3.




The branch metric m(I_t) of equation 18 represents a "distance" between the states and is used to select the state transition that corresponds to the longest path for each of the states s0-s3 at t = 4:

m(I_5) = Σ_i Σ_j γ_{5,ij}(X_{1,0}^5) log p_ij(X_5, Y_5)
       = Σ_i Σ_j α_i(X_{1,0}^4, Y_1^4) p_ij(X_{5,0}, Y_5) β_j(X_{6,0}^5, Y_6^5) log p_ij(Y_5 | X_5),    (19)

where β_j(X_{6,0}^5, Y_6^5) = 1, and X_5 = g_5(S_5, I_5) by definition. There is an I_5 that corresponds to each of the state transitions 301-308. For this example, the values L(S_4) = m(I_5) corresponding to the odd numbered state transitions 301-307 are greater than those for the even numbered state transitions 302-308. Thus, the odd numbered state transitions are "survivors." Each of them may be part of the state trajectory that has the longest path from t = 0 to t = 5. These transitions (the survivors) are depicted by solid arrows while the transitions with smaller lengths are depicted by dashed lines.




The state sequence determination process continues by extending the survivors to t = 3 as shown in FIG. 4, forming state transitions 309-316. The distances between state transitions for each of the states are compared based on L(S_4) + m(I_4), where m(I_4) is shown in equation 20 below.













m(I_4) = Σ_i Σ_j γ_{4,ij}(X_{1,0}^5) log p_ij(X_4, Y_4)
       = Σ_i Σ_j α_i(X_{1,0}^3, Y_1^3) p_ij(X_{4,0}, Y_4) β_j(X_{5,0}^5, Y_5^5) log p_ij(Y_4 | X_4).    (20)













For this example, the distances corresponding to the odd numbered state transitions 309-315 are longer than the distances corresponding to the even numbered state transitions 310-316. Thus, the paths corresponding to the odd numbered state transitions are the survivors. As shown in FIG. 4, the state transition 301 is not connected to any of the states at t = 3 and thus is eliminated even though it was a survivor. The other surviving state transitions may be connected into partial state trajectories. For example, partial state trajectories are formed by the odd numbered state transitions 307-309, 303-311, 303-313 and 305-315.




The above process continues until t = 0 is reached, as shown in FIG. 5, where two surviving state trajectories 320-322 are formed by the surviving state transitions. All the state trajectories terminate at state zero for this example because, usually, encoders start at state zero. As shown in FIG. 6, the state trajectory that corresponds to the longest cumulative distance is selected, and the input information sequence I_1^5 (via S_{t+1} = f_t(S_t, I_t)) that corresponds to the selected trajectory is selected as the next estimated input information sequence Î_{1,1}^5. For this example, the state trajectory 320 is selected and the input information sequence I_1^5 corresponding to the state trajectory 320 is selected as Î_{1,1}^5.





FIG. 7 shows an exemplary block diagram of the receiver 104. The receiver 104 may include a controller 202, a memory 204, a forward processor 206, a backward processor 208, a maximal length processor 210 and an input/output device 212. The above components may be coupled together via a signal bus 214. While the receiver 104 is illustrated using a bus architecture, any architecture may be suitable as is well known to one of ordinary skill in the art.




All the functions of the forward, backward and maximal length processors 206, 208 and 210 may also be performed by the controller 202, which may be either a general purpose or special purpose computer (e.g., a DSP). FIG. 7 shows separate processors for illustration only. The forward, backward and maximal length processors 206, 208 and 210 may be combined and may be implemented using ASICs, PLAs, PLDs, etc., as is well known in the art.




The forward processor 206 generates the forward probability vectors α_i(X_{1,p}^{t−1}, Y_1^{t−1}), herein referred to as α_i. For every iteration, when a new X_{1,p}^T (or I_{1,p}^T) is generated, the forward processor 206 may generate a complete set of α_i.




The backward processor 208, together with the maximal length processor 210, generates a new state sequence by searching for maximal length state transitions based on the branch metric m(I_t). Starting with the final state transition between states corresponding to t = T−1 and t = T, the backward processor generates β_j(X_{t+1,p}^T, Y_{t+1}^T) (hereinafter referred to as β_j) as shown in equation 8 for each state transition.




The maximal length processor 210 generates m(I_t) based on the results of the forward processor 206, the backward processor 208 and p_ij(X_t, Y_t). After generating all the m(I_t)'s corresponding to each of the possible state transitions, the maximal length processor 210 compares all the L(S_t) + m(I_t) values and selects the state transition that corresponds to the largest L(S_t) + m(I_t), and the I_t (via S_{t+1} = f_t(S_t, I_t)) that corresponds to the selected state transition is selected as the estimated input information for that t. The above process is performed for each t = 1, 2, . . . , T to generate a new estimate I_{1,p}^T for each iteration p.




Initially, the controller 202 places an estimate of the PDM P(X,Y) and π in the memory 204 that corresponds to the HMM for the channel 106 and/or the transmitter 102. The PDM P(X,Y) may be obtained via well known training processes, for example.




When ready, the controller 202 receives the received input information sequence Y_1^T, places it in the memory 204, and selects an initial estimate of I_{1,0}^T (or X_{1,0}^T). The controller 202 coordinates the above-described iterative process until a new estimate I_{1,1}^T (or X_{1,1}^T) is obtained. Then, the controller 202 compares I_{1,0}^T with I_{1,1}^T to determine if the compare result is below the compare threshold value (e.g., matching a predetermined number of elements or symbols of the information sequence). The compare threshold may be set to 0, in which case I_{1,0}^T must be identical with I_{1,1}^T. If an acceptable compare result is reached, I_{1,1}^T is output as the decoded output. Otherwise, the controller 202 iterates the above-described process again and compares the estimated I_{1,p}^T with I_{1,p−1}^T until an acceptable result is reached, and I_{1,p}^T is output as the decoded output.





FIG. 8 shows a flowchart of the above-described process. In step 1000, the controller 202 receives Y_1^T via the input/output device 212, places Y_1^T in the memory 204, and goes to step 1002. In step 1002, the controller 202 selects an initial estimate for I_{1,0}^T and goes to step 1004. In step 1004, the controller 202 determines a new state sequence and a next estimated I_{1,1}^T (I_{1,p}^T, where p = 1) (via the forward, backward and maximal length processors 206, 208 and 210) and goes to step 1006. In step 1006, the controller 202 compares I_{1,0}^T with I_{1,1}^T. If the compare result is within the predetermined threshold, then the controller 202 goes to step 1008; otherwise, the controller 202 returns to step 1004. In step 1008, the controller 202 outputs I_{1,p}^T, where p is the index of the last iteration, goes to step 1010, and ends the process.





FIG. 9 shows a flowchart that expands step 1004 in greater detail. In step 2000, the controller 202 instructs the forward processor 206 to generate α_i as shown in equation 8, and goes to step 2002. In step 2002, the controller 202 sets the parameter t = T and goes to step 2004. In step 2004, the controller 202 instructs the backward processor 208 to generate β_j and the maximal length processor 210 to determine the next set of survivors based on equation 18 and the time t+1 survivors, and goes to step 2006.

In step 2006, the controller 202 decrements t and goes to step 2008. In step 2008, the controller 202 determines whether t is equal to 0. If t is equal to 0, the controller 202 goes to step 2010; otherwise, the controller 202 returns to step 2004. In step 2010, the controller 202 outputs the new estimated I_1^T, goes to step 2012, and returns to step 1006 of FIG. 8.




A specific example of the iterative process for convolutional encoders is enclosed in the appendix.




While this invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art. Accordingly, preferred embodiments of the invention as set forth herein are intended to be illustrative, not limiting. Various changes may be made without departing from the spirit and scope of the invention.




For example, a channel may be modeled as P_c(Y|X) = P_c B_c(Y|X), where P_c is a channel state transition probability matrix and B_c(Y|X) is a diagonal matrix of state output probabilities. For example, based on the Gilbert-Elliott model,

B_c(X | X) = [ 1−b_1  0 ; 0  1−b_2 ]  and  B_c(X̄ | X) = [ b_1  0 ; 0  b_2 ],

where X̄ is the complement of X. For this case, m(I_t) may be simplified as

m(I_t) = Σ_{i=1}^{n_c} γ_{t,i}(I_{1,p}^T) b_i(Y_t | X_t),   t = 1, 2, . . . , T,

and

γ_{t,i}(I_{1,p}^T) = α_i(Y_1^t | I_{1,p}^t) β_i(Y_{t+1}^T | I_{t+1,p}^T),

where the b_i(Y_t | X_t) are the diagonal elements of B_c.
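A minimal sketch of this simplified branch metric for a two-state Gilbert-Elliott channel, assuming binary symbols, crossover probabilities b1 and b2 in the two channel states, and hypothetical function names, is:

```python
import numpy as np

def gilbert_elliott_output_matrix(y, x, b1, b2):
    """B_c(y | x): diagonal matrix of state output probabilities for a binary channel."""
    if y == x:
        return np.diag([1.0 - b1, 1.0 - b2])   # symbol received correctly in each state
    return np.diag([b1, b2])                   # symbol flipped in each state

def gilbert_elliott_metric(gamma_t, y_t, x_t, b1, b2):
    """Simplified branch metric m(I_t) = sum_i gamma_{t,i} * b_i(y_t | x_t)."""
    b = np.diag(gilbert_elliott_output_matrix(y_t, x_t, b1, b2))
    return float(np.dot(gamma_t, b))
```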



Claims
  • 1. A method for maximum a posteriori (MAP) decoding of an input information sequence based on a first information sequence received through a channel, comprising:iteratively generating decode results, Xi, for i=1,2, . . . n, where n is an integer, and where each Xi is generated by employing Xi−1 and X1 is generated from said first information sequence and from initial decode result, X0; and ceasing said step of iteratively generating, and outputting last-generated decode results when difference between said last-generated decode results and next-to-last-generated decode results is within a compare threshold.
  • 2. A method for maximum a posteriori (MAP) decoding of an input information sequence based on a first information sequence received through a channel, comprising:iteratively generating a sequence of one or more decode results starting with an initial decode result; and outputting one of adjacent decode results as a decode of the input information sequence if the adjacent decode results are within a compare threshold, wherein the step of iteratively generating comprises: a. generating the initial decode result as a first decode result; b. generating a second decode result based on the first decode result and a model of the channel; c. comparing the first and second decode results; d. replacing the first decode result with the second decode result; and e. repeating b-d if the first and second decode results are not within the compare threshold.
  • 3. The method of claim 2, wherein the generating a second decode result comprises searching for a second information sequence that maximizes a value of an auxiliary function.
  • 4. The method of claim 3, wherein the auxiliary function is based on the expectation maximization (EM) algorithm.
  • 5. The method of claim 4, wherein the model of the channel is a Hidden Markov Model (HMM) having an initial state probability vector π and probability density matrix (PDM) of P(X,Y), where XεX, YεY and elements of P(X,Y), pij(X,Y)=Pr(j,X,Y|i), are conditional probability density functions of an information element X of the second information sequence that corresponds to a received element Y of the first information sequence after the HMM transfers from a state i to a state j, the auxiliary function being expressed as: Q⁡(X1T,X1,pT)=∑z⁢Ψ⁡(x,X1,pT,Y1T)⁢log⁡(Ψ⁡(z,X1T,Y1T)),where p is a number of iterations, Ψ⁡(z,X1T,Y1T)=πio⁢∏t=1T⁢ ⁢pit-1⁢it⁡(Xt,Y1), T is a number of information elements in a particular information sequence, z is a HMM state sequence i0T, πi0 is the probability of an initial state i0,X1T is the second information sequence, X1,pT is a second information sequence estimate corresponding to a pth iteration, and Y1T is the first information sequence.
  • 6. The method of claim 5, wherein the auxiliary function is expanded to be: Q⁢(X1T,X1,pT)=∑t=1T⁢ ⁢∑i=1n⁢ ⁢∑j=1n⁢ ⁢γt,ij⁡(X1,pT)⁢log⁡(pij⁡(Xt,Yt))+Cwhere C does not depend on X1T andγt,ij(X1,pT)=αi(X1,pt−1,Y1t−1)Pij(Xt,p,Yt)βj(Xt+1,pT,Yt+1T) where αi(X1,pt,Y1T) and βj(Xt+1,pT,Yt+1T) are the elements of forward and backward probability vectors defined as α⁡(X1t,Y1t)=π⁢∏i=1t⁢ ⁢P⁡(Xi,Yi),and⁢ ⁢β⁢ ⁢(X1T,Y1T)=∏j=tT⁢ ⁢P⁡(Xj,Yj)⁢1,π is an initial probability vector, 1 is the column vector of ones.
  • 7. The method of claim 6, wherein a source of an encoded sequence is a trellis code modulator (TCM), the TCM receiving a source information sequence I1T and outputting X1T as an encoded information sequence that is transmitted, the TCM defining Xt=gt(St,It) where Xt and It are the elements of X1T and I1T for each time t, respectively, St is a state of the TCM at t, and gt(·) is a function relating Xt, to It and St, the method comprising:generating, for iteration p+1, a source information sequence estimate I1,p+1T that corresponds to a sequence of TCM state transitions that has a longest cumulative distance L(St−1) at t=1 or L(S0), wherein a distance for each of the TCM state transitions is defined by L(St)=L(St+1)+m(Ît+1(St+1)) for the TCM state transitions at each t for t=1, . . . , T and the cumulative distance is the sum of m(Ît(St)) for all t, m(Ît(St)) being defined as m⁡(I^t⁡(St))=∑i=1nc⁢ ⁢∑j=1nc⁢ ⁢γt,ij⁡(I1,pT)⁢log⁢ ⁢pc,ij⁡(Yt|Xt⁡(St)),for⁢ ⁢each⁢ ⁢t=1,2,…⁢ ,T,where Xt(St)=gt(St,Ît(St)), nc is a number of states in an HMM of the channel and pc,ij(Yt|Xt(St)) are channel conditional probability density functions of Yt when Xt(St) is transmitted by the TCM, I1,p+1T being set to a sequence of Ît for all t.
  • 8. The method of claim 7, wherein for each t=1,2, . . . , T, the method comprises:generating m(Ît(St)) for each possible state transition of the TCM; selecting state trajectories that correspond to largest L(St)=L(St+1)+m(Ît+1(St+1)) for each state as survivor state trajectories; and selecting Ît(St)s that correspond to the selected state trajectories as It,p+1(St).
  • 9. The method of claim 8, further comprising:a. assigning L(ST)=0 for all states at t=T; b. generating m(Ît(St)) for all state transitions between states St and all possible states St+1; c. selecting state transitions between the states St and St+1 that have a largest L(St)=L(St−1)+m(Ît+1(St+1)) and Ît+1(St+1) that correspond to the selected state transitions; d. updating the survivor state trajectories at states St by adding the selected state transitions to the corresponding survivor state trajectories at state St+1; e. decrementing t by 1; f. repeating b-e until t=0; and g. selecting all the Ît(St) that correspond to a survivor state trajectory that corresponding to a largest L(St) at t=0 as I1,p+1T.
  • 10. The method of claim 6, wherein the channel is modeled as Pc(Y|X)=PcBc(Y|X) where Pc is a channel state transition probability matrix and Bc(Y|X) is a diagonal matrix of state output probabilities, the method comprising for each t=1,2, . . . , T:generating γt,i(I1,pT)=αi(Y1t|I1,pt)βi(Yt+1T|It+1,pT); selecting an Ît(St) that maximizes L(St)=L(St+1)+m(Ît+1(St+1)), where m(Ît(St)) is defined as m⁡(I^t⁡(St))=∑i=1nc⁢γt,ij⁡(I1,pT)⁢βj⁡(Yt|Xt⁡(St)),nc being a number of states in an HMM of the channel; selecting state transitions between states St and St−1that corresponds to a largest L(St)=L(St+1)+m(Ît+1(St+1)); and forming survivor state trajectories by connecting selected state transitions.
  • 11. The method of claim 10, further comprising:selecting Ît(St) that corresponds to a survivor state trajectory at t=0 that has the largest L(St) as It,p+1T for each pth iteration; comparing I1,pT and I1,p+1T; and outputting I1,p+1T as the second decode result if I1,pT and I1,p+1T are within the compare threshold.
  • 12. A maximum a posteriori (MAP) decoder that decodes a transmitted information sequence using a received information sequence received through a channel, comprising:a memory; and a controller coupled to the memory, the controller iteratively-generating decode results, Xi, for i=1,2, . . . n, where n is an integer, and where each Xi is generated by employing Xi−1, and X1 is generated from said first information sequence and from an initial decode result, X0, and ceasing said step of iteratively generating, and outputting last-generated decode results when difference between said last-generated decode results and next-to-last-generated decode results is within a compare threshold.
  • 13. A maximum a posteriori (MAP) decoder that decodes a transmitted information sequence using a received information sequence received through a channel, comprising:a memory; and a controller coupled to the memory, the controller iteratively generating a sequence of one or more decode results starting with an initial decode result, and outputting one of adjacent decode results as a decode of the input information sequence if the adjacent decode results are within a compare threshold wherein the controller: a. generates the initial decode result as a first decode result; b. generates a second decode result based on the first decode result and a model of the channel; c. compares the first and second decode results; d. replaces the first decode result with the second decode result; and e. repeats b-d until the first and second decode result are not within the compare threshold.
  • 14. The decoder of claim 13, wherein the controller searches for information sequence that maximizes a value of an auxiliary function.
  • 15. The decoder of claim 14, wherein the auxiliary function is based on expectation maximization (EM).
  • 16. The decoder of claim 15, wherein the model of the channel is a Hidden Markov Model (HMM) having an initial state probability vector π and probability density matrix (PDM) of P(X,Y), where XεX, YεY and elements of P(X,Y), pij(X,Y)=Pr(j,X,Y|i), are conditional probability density functions of an information element X of the second information sequence that corresponds to a received element Y of the first information sequence after the HMM transfers from a state i to a state j, the auxiliary function being expressed as: Q⁡(X1T,X1,pT)=∑z⁢Ψ⁡(z,X1,pT,Y1T)⁢log⁡(Ψ⁡(z,X1T,Y1T)),where p is a number of iterations, Ψ⁡(z,X1T,Y1T)=πio⁢∏t=1T⁢ ⁢pit-1⁢it⁡(Xt,Yt), T is a number of information elements in a particular information sequence, z is a HMM state sequence i0T, πi0 is the probability of an initial state i0,X1T is the second information sequence, X1,pT is a second information sequence estimate corresponding to a pth iteration, and Y1T is the first information sequence.
  • 17. The decoder of claim 16, wherein the auxiliary function is expanded to be: Q⁢(X1T,X1,pT)=∑t=1T⁢ ⁢∑i=1n⁢ ⁢∑j=1n⁢ ⁢γt,ij⁡(X1,pT)⁢log⁡(pij⁡(Xt,Yt))+Cwhere C does not depend on X1T andγt,ij(X1,pT)=αi(X1,pt−1,Y1t−1)pij(Xt,p,Yt)βj(Xt+1,pT,Yt+1T) where αi(X1,pt,Y1T) and βj(Xt+1,pT,Yt+1T) are the elements of forward and backward if probability vectors defined as α⁡(X1t,Y1t)=π⁢∏i=1t⁢ ⁢P⁡(Xi,Yi),and⁢ ⁢β⁢ ⁢(Xt+1T,Yt+1T)=∏j=t+1T⁢ ⁢P⁡(Xj,Yj)⁢1,π is an initial probability vector, 1 is the column vector of ones.
  • 18. The decoder of claim 17, wherein a source of an encoded sequence is a trellis code modulator (TCM), the TCM receiving a source information sequence I1T and outputting X1T as an encoded information sequence that is transmitted, the TCM defining Xt=gt(St,It) where Xt and It are the elements of X1T and I1T for each time t, respectively, St is a state of the TCM at t, and gt(·) is a function relating Xt, to It and St, the controller generates, for iteration p+1, an input information sequence estimate I1,p+1T that corresponds to a sequence of TCM state transitions that has a longest cumulative distance L(St−1) at t=1 or L(S0), wherein a distance for each of the TCM state transitions is defined by L(St+1)=L(St+1)+m(Ît+1(St+1)) for the TCM state transitions at each t for t=1, . . . , T and the cumulative distance is the sum of m(Ît(St)) for all t, m(Ît(St)) being defined as m⁡(I^t⁡(St))=∑i=1nc⁢ ⁢∑j=1nc⁢ ⁢γt,ij⁡(I1,pT)⁢log⁢ ⁢pc,ij⁢(Yt|Xt⁡(St)),for each t=1,2, . . . , T, where Xt(St)=gt(St,Ît(St)), nc is a number of states in an HMM of the channel and Pc,ij(Yt|Xt(St)) are channel conditional probability density functions of Yt when Xt(St) is transmitted by the TCM, I1,p+1T being set to a sequence of Ît for all t.
  • 19. The decoder of claim 18, wherein for each t=1,2, . . . , T, the controller generating m(Ît(St)) for each possible state transition of the TCM, selecting state trajectories that correspond to largest L(St)=L(St+1)+m(Ît+1(St+1)) for each state as survivor state trajectories, and selecting Ît+1(St+1)s that correspond to the selected state trajectories as It+1,p+1(St+1).
  • 20. The decoder of claim 19, wherein the controller:a. assigns L(ST)=0 for all states at t=T; b. generates m(Ît(St)) for all state transitions between states St and all possible states St+1; c. selects state transitions between the states St and St+1 that have a largest L(St)=L(St+1)+m(Ît+1(St+1)) and Ît+1(St+1) that correspond to the selected state transitions; d. updates the survivor state trajectories at states St by adding the selected state transitions to the corresponding survivor state trajectories at state St+1; e. decrements t by 1; f. repeats b-e until t=0; and g. selects all the Ît(St) that correspond to a survivor state trajectory that corresponding to a largest L(St) at t=0 as I1,p+1T.
  • 21. The decoder of claim 20, wherein the channel is modeled as Pc(Y|X)=PcBc(Y|X) where Pc is a channel state transition probability matrix and Bc(Y|X) is a diagonal matrix of state output probabilities, for each t=1,2, . . . , T, the controller:generates γt,i(I1,pT)=αi(Y1t|I1,pt)βi(Yt+1T|It+1,pT); selects an Ît(St) that maximizes L(St)=L(St+1)+m(Ît+1(St+1)), where m(Ît(St)) is defined as m⁡(I^t⁡(St))=∑i=1nc⁢γt,i⁡(I1,pT)⁢βj⁡(Yt|Xt⁡(St)),nc being a number of states in an HMM of the channel; selects state transitions between states St and St+1 that corresponds to a largest L(St)=L(St+1)+m(Ît+1(St+1)); and forms survivor state trajectories by connecting selected state transitions.
  • 22. The decoder of claim 21, wherein the controller selects Ît(St) that corresponds to a survivor state trajectory at t=0 that has the largest L(St) as I1,p+1T for each pth iteration, compares I1,pT and I1,p+1T, and outputs I1,p+1T as the second decode result if I1,pT and I1,p+1T are within the compare threshold.
Parent Case Info

This nonprovisional application claims the benefit of U.S. provisional application No. 60/174,601 entitled “Map Decoding In Channels With Memory” filed on Jan. 5, 2000. The Applicant of the provisional application is William Turin. The above provisional application is hereby incorporated by reference including all references cited therein.

US Referenced Citations (6)
Number Name Date Kind
5721746 Hladik et al. Feb 1998 A
6167552 Gagnon et al. Dec 2000 A
6182261 Haller et al. Jan 2001 B1
6223319 Ross et al. Apr 2001 B1
6343368 Lerzer Jan 2002 B1
6377610 Hagenauer et al. Apr 2002 B1
Non-Patent Literature Citations (2)
Entry
Georghiades, et al., “Sequence Estimation in the Presence of Random Parameters Via the EM Algorithms”, IEEE Transactions on Communications, vol. 45, No. 3, Mar. 1997.
Turin, Digital Transmission System; Performance Analysis and Modeling, pp. 126-143, pp. 227-228, 1998.
Provisional Applications (1)
Number Date Country
60/174601 Jan 2000 US