Multiple description transform coding using optimal transforms of arbitrary dimension

Information

  • Patent Grant
  • 6345125
  • Patent Number
    6,345,125
  • Date Filed
    Wednesday, February 25, 1998
  • Date Issued
    Tuesday, February 5, 2002
Abstract
A multiple description (MD) joint source-channel (JSC) encoder in accordance with the invention encodes n components of a signal for transmission over m channels of a communication medium. In illustrative embodiments, the invention provides optimal or near-optimal transforms for applications in which at least one of n and m is greater than two, and applications in which the failure probabilities of the m channels are non-independent and non-equivalent. The signal to be encoded may be a data signal, a speech signal, an audio signal, an image signal, a video signal or other type of signal, and each of the m channels may correspond to a packet or a group of packets to be transmitted over the medium. A given n×m transform implemented by the MD JSC encoder may be in the form of a cascade structure of several transforms each having dimension less than n×m. The transform may also be configured to provide a substantially equivalent rate for each of the m channels.
Description




FIELD OF THE INVENTION




The present invention relates generally to multiple description transform coding (MDTC) of data, speech, audio, images, video and other types of signals for transmission over a network or other type of communication medium.




BACKGROUND OF THE INVENTION




Multiple description transform coding (MDTC) is a type of joint source-channel coding (JSC) designed for transmission channels which are subject to failure or “erasure.” The objective of MDTC is to ensure that a decoder which receives an arbitrary subset of the channels can produce a useful reconstruction of the original signal. A distinguishing characteristic of MDTC is the introduction of correlation between transmitted coefficients in a known, controlled manner so that lost coefficients can be statistically estimated from received coefficients. This correlation is used at the decoder at the coefficient level, as opposed to the bit level, so it is fundamentally different than techniques that use information about the transmitted data to produce likelihood information for the channel decoder. The latter is a common element in other types of JSC coding systems, as shown, for example, in P. G. Sherwood and K. Zeger, “Error Protection of Wavelet Coded Images Using Residual Source Redundancy,” Proc. of the 31st Asilomar Conference on Signals, Systems and Computers, November 1997.




A known MDTC technique for coding pairs of independent Gaussian random variables is described in M. T. Orchard et al., “Redundancy Rate-Distortion Analysis of Multiple Description Coding Using Pairwise Correlating Transforms,” Proc. IEEE Int. Conf. Image Proc., Santa Barbara, Calif., October 1997. This MDTC technique provides optimal 2×2 transforms for coding pairs of signals for transmission over two channels. However, this technique, like other conventional techniques, fails to provide optimal generalized n×m transforms for coding any n signal components for transmission over any m channels. Moreover, the optimality of the 2×2 transforms in the M. T. Orchard et al. reference requires that the channel failures be independent and have equal probabilities. The conventional techniques thus generally do not provide optimal transforms for applications in which, for example, channel failures either are dependent or have unequal probabilities, or both. This inability of conventional techniques to provide suitable transforms for arbitrary dimensions and different types of channel failure probabilities unduly restricts the flexibility of MDTC, thereby preventing its effective implementation in many important applications.




SUMMARY OF THE INVENTION




The invention provides MDTC techniques which can be used to implement optimal or near-optimal n×m transforms for coding any number n of signal components for transmission over any number m of channels. A multiple description (MD) joint source-channel (JSC) encoder in accordance with an illustrative embodiment of the invention encodes n components of a signal for transmission over m channels of a communication medium, in applications in which at least one of n and m may be greater than two, and in which the failure probabilities of the m channels may be non-independent and non-equivalent. An n×m transform implemented by the MD JSC encoder may be in the form of a cascade structure of several transforms each having dimension less than n×m. An exemplary transform in accordance with the invention may include an additional degree of freedom not found in conventional MDTC transforms. This additional degree of freedom provides considerable improvement in design flexibility, and may be used, for example, to partition a total available rate among the m channels such that each channel has substantially the same rate.




In accordance with another aspect of the invention, an MD JSC encoder may include a series combination of N “macro” MD encoders followed by an entropy coder, and each of the N macro MD encoders includes a parallel arrangement of M “micro” MD encoders. Each of the M micro MD encoders implements one of: (i) a quantizer block followed by a transform block, (ii) a transform block followed by a quantizer block, (iii) a quantizer block with no transform block, and (iv) an identity function. This general MD JSC encoder structure allows the encoder to implement any desired n×m transform while also minimizing design complexity.




The MDTC techniques of the invention do not require independent or equivalent channel failure probabilities. As a result, the invention allows MDTC to be implemented effectively in a much wider range of applications than has heretofore been possible using conventional techniques. The MDTC techniques of the invention are suitable for use in conjunction with signal transmission over many different types of channels, including lossy packet networks such as the Internet as well as broadband ATM networks, and may be used with data, speech, audio, images, video and other types of signals.











BRIEF DESCRIPTION OF THE DRAWINGS





FIG. 1 shows an exemplary communication system in accordance with the invention.

FIG. 2 shows a multiple description (MD) joint source-channel (JSC) encoder in accordance with the invention.

FIG. 3 shows an exemplary macro MD encoder for use in the MD JSC encoder of FIG. 2.

FIG. 4 shows an entropy encoder for use in the MD JSC encoder of FIG. 2.

FIGS. 5A through 5D show exemplary micro MD encoders for use in the macro MD encoder of FIG. 3.

FIGS. 6A, 6B and 6C show respective audio encoder, image encoder and video encoder embodiments of the invention, each including the MD JSC encoder of FIG. 2.

FIG. 7A shows a relationship between redundancy and channel distortion in an exemplary embodiment of the invention.

FIG. 7B shows relationships between distortion when both of two channels are received and distortion when one of the two channels is lost, for various rates, in an exemplary embodiment of the invention.

FIG. 8 illustrates an exemplary 4×4 cascade structure which may be used in an MD JSC encoder in accordance with the invention.











DETAILED DESCRIPTION OF THE INVENTION




The invention will be illustrated below in conjunction with exemplary MDTC systems. The techniques described may be applied to transmission of a wide variety of different types of signals, including data signals, speech signals, audio signals, image signals, and video signals, in either compressed or uncompressed formats. The term “channel” as used herein refers generally to any type of communication medium for conveying a portion of an encoded signal, and is intended to include a packet or a group of packets. The term “packet” is intended to include any portion of an encoded signal suitable for transmission as a unit over a network or other type of communication medium.





FIG. 1 shows a communication system 10 configured in accordance with an illustrative embodiment of the invention. A discrete-time signal is applied to a pre-processor 12. The discrete-time signal may represent, for example, a data signal, a speech signal, an audio signal, an image signal or a video signal, as well as various combinations of these and other types of signals. The operations performed by the pre-processor 12 will generally vary depending upon the application. The output of the pre-processor is a source sequence {x_k} which is applied to a multiple description (MD) joint source-channel (JSC) encoder 14. The encoder 14 encodes n different components of the source sequence {x_k} for transmission over m channels, using transform, quantization and entropy coding operations. Each of the m channels may represent, for example, a packet or a group of packets. The m channels are passed through a network 15 or other suitable communication medium to an MD JSC decoder 16. The decoder 16 reconstructs the original source sequence {x_k} from the received channels. The MD coding implemented in encoder 14 operates to ensure optimal reconstruction of the source sequence in the event that one or more of the m channels are lost in transmission through the network 15. The output of the MD JSC decoder 16 is further processed in a post-processor 18 in order to generate a reconstructed version of the original discrete-time signal.





FIG. 2 illustrates the MD JSC encoder 14 in greater detail. The encoder 14 includes a series arrangement of N macro MD_i encoders MD_1, . . . MD_N corresponding to reference designators 20-1, . . . 20-N. An output of the final macro MD_i encoder 20-N is applied to an entropy coder 22. FIG. 3 shows the structure of each of the macro MD_i encoders 20-i. Each of the macro MD_i encoders 20-i receives as an input an r-tuple, where r is an integer. Each of the elements of the r-tuple is applied to one of M micro MD_j encoders MD_1, . . . MD_M corresponding to reference designators 30-1, . . . 30-M. The output of each of the macro MD_i encoders 20-i is an s-tuple, where s is an integer greater than or equal to r.





FIG. 4 indicates that the entropy coder 22 of FIG. 2 receives an r-tuple as an input, and generates as outputs the m channels for transmission over the network 15. In accordance with the invention, the m channels may have any distribution of dependent or independent failure probabilities. More specifically, given that a channel i is in a state S_i ∈ {0, 1}, where S_i = 0 indicates that the channel has failed while S_i = 1 indicates that the channel is working, the overall state S of the system is given by the Cartesian product of the channel states S_i over the m channels, and the individual channel probabilities may be configured so as to provide any probability distribution function which can be defined on the overall state S.





FIGS. 5A through 5D illustrate a number of possible embodiments for each of the micro MD_j encoders 30-j. FIG. 5A shows an embodiment in which a micro MD_j encoder 30-j includes a quantizer (Q) block 50 followed by a transform (T) block 51. The Q block 50 receives an r-tuple as input and generates a corresponding quantized r-tuple as an output. The T block 51 receives the r-tuple from the Q block 50, and generates a transformed r-tuple as an output. FIG. 5B shows an embodiment in which a micro MD_j encoder 30-j includes a T block 52 followed by a Q block 53. The T block 52 receives an r-tuple as input and generates a corresponding transformed s-tuple as an output. The Q block 53 receives the s-tuple from the T block 52, and generates a quantized s-tuple as an output, where s is greater than or equal to r. FIG. 5C shows an embodiment in which a micro MD_j encoder 30-j includes only a Q block 54. The Q block 54 receives an r-tuple as input and generates a quantized s-tuple as an output, where s is greater than or equal to r. FIG. 5D shows another possible embodiment, in which a micro MD_j encoder 30-j does not include a Q block or a T block but instead implements an identity function, simply passing an r-tuple at its input through to its output. The micro MD_j encoders 30-j of FIG. 3 may each include a different one of the structures shown in FIGS. 5A through 5D.
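By way of illustration, the four micro MD encoder variants of FIGS. 5A through 5D can be summarized in the following minimal Python sketch. The stepsize delta, the 2×2 transform used for the T block, and the sample input are placeholder values chosen only for this example and are not taken from the patent.

```python
import numpy as np

def quantize(v, delta):
    # Q block: uniform scalar quantization, rounding each component
    # to the nearest multiple of the stepsize delta.
    return delta * np.round(np.asarray(v, dtype=float) / delta)

def transform(v, T):
    # T block: apply a linear transform to the input tuple.
    return T @ np.asarray(v, dtype=float)

# Placeholder 2x2 transform with determinant one (illustrative only).
T = np.array([[1.0, 0.5],
              [-1.0, 0.5]])
delta = 0.1
x = np.array([0.37, -1.24])

y_fig5a = transform(quantize(x, delta), T)   # FIG. 5A: Q block followed by T block
y_fig5b = quantize(transform(x, T), delta)   # FIG. 5B: T block followed by Q block
y_fig5c = quantize(x, delta)                 # FIG. 5C: Q block only
y_fig5d = x                                  # FIG. 5D: identity function
```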





FIGS. 6A through 6C illustrate the manner in which the MD JSC encoder 14 of FIG. 2 can be implemented in a variety of different encoding applications. In each of the embodiments shown in FIGS. 6A through 6C, the MD JSC encoder 14 is used to implement the quantization, transform and entropy coding operations typically associated with the corresponding encoding application. FIG. 6A shows an audio coder 60 which includes an MD JSC encoder 14 configured to receive input from a conventional psychoacoustics processor 61. FIG. 6B shows an image coder 62 which includes an MD JSC encoder 14 configured to interact with an element 63 providing preprocessing functions and perceptual table specifications. FIG. 6C shows a video coder 64 which includes first and second MD JSC encoders 14-1 and 14-2. The first encoder 14-1 receives input from a conventional motion compensation element 66, while the second encoder 14-2 receives input from a conventional motion estimation element 68. The encoders 14-1 and 14-2 are interconnected as shown. It should be noted that these are only examples of applications of an MD JSC encoder in accordance with the invention. It will be apparent to those skilled in the art that numerous alternate configurations may also be used, in audio, image, video and other applications.




A general model for analyzing MDTC techniques in accordance with the invention will now be described. Assume that a source sequence {x_k} is input to an MD JSC encoder, which outputs m streams at rates R_1, R_2, . . . R_m. These streams are transmitted on m separate channels. One version of the model may be viewed as including many receivers, each of which receives a subset of the channels and uses a decoding algorithm based on which channels it receives. More specifically, there may be 2^m−1 receivers, one for each distinct subset of streams except for the empty set, and each experiences some distortion. An equivalent version of this model includes a single receiver when each channel may have failed or not failed, and the status of the channel is known to the receiver decoder but not to the encoder. Both versions of the model provide reasonable approximations of behavior in a lossy packet network. As previously noted, each channel may correspond to a packet or a set of packets. Some packets may be lost in transmission, but because of header information it is known which packets are lost. An appropriate objective in a system which can be characterized in this manner is to minimize a weighted sum of the distortions subject to a constraint on a total rate R. For m=2, this minimization problem is related to a problem from information theory called the multiple description problem. D_0, D_1 and D_2 denote the distortions when both channels are received, only channel 1 is received, and only channel 2 is received, respectively. The multiple description problem involves determining the achievable (R_1, R_2, D_0, D_1, D_2)-tuples. A complete characterization for an independent, identically-distributed (i.i.d.) Gaussian source and squared-error distortion is described in L. Ozarow, “On a source-coding problem with two channels and three receivers,” Bell Syst. Tech. J., 59(8):1417-1426, 1980. It should be noted that the solution described in the L. Ozarow reference is non-constructive, as are other achievability results from the information theory literature.
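As a small illustration of the 2^m−1 receiver model, the following Python sketch enumerates the nonempty subsets of received channels and the probability of each reception pattern. The per-channel failure probabilities are hypothetical, and independent failures are assumed here purely to keep the example short; the model itself does not require independence.

```python
from itertools import combinations

m = 3
p_fail = [0.05, 0.10, 0.20]   # hypothetical, possibly unequal failure probabilities

# Enumerate the 2^m - 1 nonempty subsets of received channels and the
# probability of each reception pattern (independent failures assumed here).
for k in range(1, m + 1):
    for received in combinations(range(m), k):
        prob = 1.0
        for ch in range(m):
            prob *= (1.0 - p_fail[ch]) if ch in received else p_fail[ch]
        print(received, round(prob, 6))
```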




An MDTC coding structure for implementation in the MD JSC encoder 14 of FIG. 2 in accordance with the invention will now be described. In this illustrative embodiment, it will be assumed for simplicity that the source sequence {x_k} input to the encoder is an i.i.d. sequence of zero-mean jointly Gaussian vectors with a known correlation matrix R_x = E[x_k x_k^T]. The vectors can be obtained by blocking a scalar Gaussian source. The distortion will be measured in terms of mean-squared error (MSE). Since the source in this example is jointly Gaussian, it can also be assumed without loss of generality that the components are independent. If the components are not independent, one can use a Karhunen-Loeve transform of the source at the encoder and the inverse at each decoder. This embodiment of the invention utilizes the following steps for implementing MDTC of a given source vector x:




1. The source vector x is quantized using a uniform scalar quantizer with stepsize Δ: x_qi = [x_i]_Δ, where [·]_Δ denotes rounding to the nearest multiple of Δ.




2. The vector x_q = [x_q1, x_q2, . . . x_qn]^T is transformed with an invertible, discrete transform {circumflex over (T)}: ΔZ^n → ΔZ^n, y = {circumflex over (T)}(x_q). The design and implementation of {circumflex over (T)} are described in greater detail below.




3. The components of y are independently entropy coded.




4. If n>m, the components of y are grouped to be sent over the m channels.




When all of the components of y are received, the reconstruction process is to exactly invert the transform {circumflex over (T)} to get {circumflex over (x)} = x_q. The distortion is the quantization error from Step 1 above. If some components of y are lost, these components are estimated from the received components using the statistical correlation introduced by the transform {circumflex over (T)}. The estimate {circumflex over (x)} is then generated by inverting the transform as before.
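The four steps above can be sketched as follows. This is only an illustrative Python outline: the round-robin grouping of components into channels, the identity stand-in for the discrete transform {circumflex over (T)}, and the omission of the entropy coding step are simplifying assumptions of the sketch, not features of the invention.

```python
import numpy as np

def encode_md(x, delta, discrete_transform, m):
    # Step 1: uniform scalar quantization with stepsize delta.
    x_q = delta * np.round(np.asarray(x, dtype=float) / delta)
    # Step 2: invertible discrete transform mapping the lattice to itself.
    y = discrete_transform(x_q)
    # Step 3 (independent entropy coding of the components) is omitted here.
    # Step 4: group the components of y into m channels (round-robin grouping
    # is an arbitrary choice made for this sketch).
    return [y[i::m] for i in range(m)]

# Identity map as a trivial stand-in for the discrete transform T-hat.
channels = encode_md([0.9, -0.2, 1.7, 0.4], delta=0.1,
                     discrete_transform=lambda v: v, m=2)
```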




Starting with a linear transform T with a determinant of one, the first step in deriving a discrete version {circumflex over (T)} is to factor T into “lifting” steps. This means that T is factored into a product of lower and upper triangular matrices with unit diagonals, T = T_1 T_2 . . . T_k. The discrete version of the transform is then given by:

$$\hat{T}(x_q) = \left[T_1\left[T_2\cdots\left[T_k x_q\right]_{\Delta}\right]_{\Delta}\right]_{\Delta}. \qquad (1)$$






The lifting structure ensures that the inverse of {circumflex over (T)} can be implemented by reversing the calculations in (1):

$$\hat{T}^{-1}(y) = \left[T_k^{-1}\cdots\left[T_2^{-1}\left[T_1^{-1} y\right]_{\Delta}\right]_{\Delta}\right]_{\Delta}.$$






The factorization of T is not unique. Different factorizations yield different discrete transforms, except in the limit as Δ approaches zero. The above-described coding structure is a generalization of a 2×2 structure described in the above-cited M.T. Orchard et al. reference. As previously noted, this reference considered only a subset of the possible 2×2 transforms; namely, those implementable in two lifting steps.
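A minimal Python sketch of the lifting construction in (1), for a 2×2 transform, is given below. The particular three-factor factorization (valid when b≠0) is one of many possible factorizations, consistent with the remark above that the factorization of T is not unique; the numerical transform and stepsize are placeholders, and rounding ties are ignored.

```python
import numpy as np

def round_lattice(v, delta):
    # [.]_Delta : round each component to the nearest multiple of delta.
    return delta * np.round(v / delta)

def lifting_factors(T):
    # One factorization of a 2x2, determinant-one transform T = [[a, b], [c, d]]
    # into unit-diagonal triangular "lifting" factors T = T1 T2 T3 (needs b != 0).
    (a, b), (c, d) = T
    assert abs(a * d - b * c - 1.0) < 1e-9 and b != 0.0
    T1 = np.array([[1.0, 0.0], [(d - 1.0) / b, 1.0]])
    T2 = np.array([[1.0, b], [0.0, 1.0]])
    T3 = np.array([[1.0, 0.0], [(a - 1.0) / b, 1.0]])
    return [T1, T2, T3]

def T_hat(xq, factors, delta):
    # Discrete transform of equation (1): apply the factors right to left,
    # rounding back to the lattice after each lifting step.
    v = np.asarray(xq, dtype=float)
    for Ti in reversed(factors):
        v = round_lattice(Ti @ v, delta)
    return v

def T_hat_inv(y, factors, delta):
    # Inverse: reverse the steps using the inverses of the factors
    # (exact on the lattice, rounding ties aside).
    v = np.asarray(y, dtype=float)
    for Ti in factors:
        v = round_lattice(np.linalg.inv(Ti) @ v, delta)
    return v

delta = 0.25
T = np.array([[1.0, 0.5], [-1.0, 0.5]])           # placeholder determinant-one transform
xq = round_lattice(np.array([1.3, -0.6]), delta)  # quantized source vector
y = T_hat(xq, lifting_factors(T), delta)
assert np.allclose(T_hat_inv(y, lifting_factors(T), delta), xq)
```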




It is important to note that the illustrative embodiment of the invention described above first quantizes and then applies a discrete transform. If one were to instead apply a continuous transform first and then quantize, the use of a nonorthogonal transform could lead to non-cubic partition cells, which are inherently suboptimal among the class of partition cells obtainable with scalar quantization. See, for example, A. Gersho and R. M. Gray, “Vector Quantization and Signal Compression,” Kluwer Acad. Pub., Boston, Mass. 1992. The above embodiment permits the use of discrete transforms derived from nonorthogonal linear transforms, resulting in improved performance.




An analysis of an exemplary MDTC system in accordance with the invention will now be described. This analysis is based on a number of fine quantization approximations which are generally valid for small Δ. First, it is assumed that the scalar entropy of y = {circumflex over (T)}([x]_Δ) is the same as that of [Tx]_Δ. Second, it is assumed that the correlation structure of y is unaffected by the quantization. Finally, when at least one component of y is lost, it is assumed that the distortion is dominated by the effect of the erasure, such that quantization can be ignored. The variances of the components of x are denoted by σ_1^2, σ_2^2, . . . σ_n^2 and the correlation matrix of x is denoted by R_x, where R_x = diag(σ_1^2, σ_2^2, . . . σ_n^2). Let R_y = T R_x T^T. In the absence of quantization, R_y would correspond to the correlation matrix of y. Under the above-noted fine quantization approximations, R_y will be used in the estimation of rates and distortions.




The rate can be estimated as follows. Since the quantization is fine, y_i is approximately the same as [(Tx)_i]_Δ, i.e., a uniformly quantized Gaussian random variable. If y_i is treated as a Gaussian random variable with power σ_yi^2 = (R_y)_ii quantized with stepsize Δ, the entropy of the quantized coefficient is given by:

$$H(y_i) \approx \tfrac{1}{2}\log 2\pi e\,\sigma_{y_i}^2 - \log\Delta = \tfrac{1}{2}\log\sigma_{y_i}^2 + \tfrac{1}{2}\log 2\pi e - \log\Delta = \tfrac{1}{2}\log\sigma_{y_i}^2 + k_{\Delta},$$

where k_Δ ≜ (log 2πe)/2 − log Δ and all logarithms are base two. Notice that k_Δ depends only on Δ. The total rate R can therefore be estimated as:









$$R = \sum_{i=1}^{n} H(y_i) = n k_{\Delta} + \frac{1}{2}\log\prod_{i=1}^{n}\sigma_{y_i}^2. \qquad (2)$$

The minimum rate occurs when the product of the σ_yi^2, i=1 to n, equals the product of the σ_i^2, i=1 to n, and at this rate the components of y are uncorrelated. It should be noted that T=I is not the only transform which achieves the minimum rate. In fact, it will be shown below that an arbitrary split of the total rate among the different components of y is possible. This provides a justification for using a total rate constraint in subsequent analysis.
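Equation (2) can be evaluated directly. The following sketch computes the rate estimate for a placeholder determinant-one transform and source variances, and also the redundancy ρ = R − R* obtained by comparing against the identity transform; all numerical values are illustrative assumptions.

```python
import numpy as np

def rate_estimate(T, source_vars, delta):
    # Equation (2): R = n*k_Delta + (1/2)*log2( prod_i sigma_{y_i}^2 ),
    # with k_Delta = (1/2)*log2(2*pi*e) - log2(delta).
    Rx = np.diag(source_vars)                  # independent components assumed
    sigma_y2 = np.diag(T @ Rx @ T.T)
    n = len(source_vars)
    k_delta = 0.5 * np.log2(2.0 * np.pi * np.e) - np.log2(delta)
    return n * k_delta + 0.5 * np.sum(np.log2(sigma_y2))

T = np.array([[1.0, 0.5], [-1.0, 0.5]])        # placeholder determinant-one transform
variances = [1.0, 0.25]                        # sigma_1^2, sigma_2^2 (illustrative)
R = rate_estimate(T, variances, delta=0.05)
R_star = rate_estimate(np.eye(2), variances, delta=0.05)   # minimum rate
rho = R - R_star                               # redundancy rho = R - R*
```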




The distortion will now be estimated, considering first the average distortion due only to quantization. Since the quantization noise is approximately uniform, the distortion is Δ^2/12 for each component. Thus the distortion when no components are lost is given by:

$$D_0 = n\,\frac{\Delta^{2}}{12} \qquad (3)$$

and is independent of T.




The case when l>0 components are lost will now be considered. It first must be determined how the reconstruction will proceed. By renumbering the components if necessary, assume that y_1, y_2, . . . y_(n−l) are received and y_(n−l+1), . . . y_n are lost. First partition y into “received” and “not received” portions as y = [y_r, y_nr], where y_r = [y_1, y_2, . . . y_(n−l)]^T and y_nr = [y_(n−l+1), . . . y_n]^T. The minimum MSE estimate {circumflex over (x)} of x given y_r is E[x|y_r], which has a simple closed form because in this example x is a jointly Gaussian vector. Using the linearity of the expectation operator gives the following sequence of calculations:













$$\hat{x} = E[x\,|\,y_r] = E[T^{-1}Tx\,|\,y_r] = T^{-1}E[Tx\,|\,y_r]
= T^{-1}E\!\left[\begin{bmatrix} y_r\\ y_{nr}\end{bmatrix}\Big|\;y_r\right]
= T^{-1}\begin{bmatrix} y_r\\ E[y_{nr}\,|\,y_r]\end{bmatrix}. \qquad (4)$$













If the correlation matrix of y is partitioned in a way compatible with the partition of y as:

$$R_y = T R_x T^{T} = \begin{bmatrix} R_1 & B\\ B^{T} & R_2\end{bmatrix},$$










then it can be shown that the conditional distribution of y_nr given y_r is Gaussian with mean B^T R_1^(−1) y_r and correlation matrix A ≜ R_2 − B^T R_1^(−1) B. Thus, E[y_nr|y_r] = B^T R_1^(−1) y_r, and η ≜ y_nr − E[y_nr|y_r] is Gaussian with zero mean and correlation matrix A. The variable η denotes the error in predicting y_nr from y_r and hence is the error caused by the erasure. However, because a nonorthogonal transform has been used in this example, T^(−1) is used to return to the original coordinates before computing the distortion. Substituting y_nr − η in (4) above gives the following expression for {circumflex over (x)}:









$$T^{-1}\begin{bmatrix} y_r\\ y_{nr}-\eta\end{bmatrix} = x + T^{-1}\begin{bmatrix} 0\\ -\eta\end{bmatrix},$$

such that ∥x−{circumflex over (x)}∥^2 is given by:

$$\left\|\,T^{-1}\begin{bmatrix} 0\\ \eta\end{bmatrix}\right\|^{2} = \eta^{T}U^{T}U\eta,$$

where U is the last l columns of T^(−1). The expected value E[∥x−{circumflex over (x)}∥^2] is then given by:












$$\sum_{i=1}^{l}\sum_{j=1}^{l}\left(U^{T}U\right)_{ij}A_{ij}. \qquad (5)$$
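The estimation procedure of (4), together with the conditional-mean formula E[y_nr|y_r] = B^T R_1^(−1) y_r, can be transcribed numerically as in the following sketch. The 2×2 transform, the source variances and the choice of which component is lost are placeholder assumptions for illustration.

```python
import numpy as np

def reconstruct(y_received, received_idx, lost_idx, T, Rx):
    # Partition R_y = T Rx T^T compatibly with y = [y_r, y_nr], per the text.
    Ry = T @ Rx @ T.T
    R1 = Ry[np.ix_(received_idx, received_idx)]
    B = Ry[np.ix_(received_idx, lost_idx)]
    # E[y_nr | y_r] = B^T R1^{-1} y_r (zero-mean jointly Gaussian components).
    y_nr_hat = B.T @ np.linalg.solve(R1, y_received)
    # Reassemble y in the original component order and invert the transform,
    # giving x_hat = T^{-1} [y_r; E[y_nr | y_r]] as in equation (4).
    y_full = np.empty(T.shape[0])
    y_full[received_idx] = y_received
    y_full[lost_idx] = y_nr_hat
    return np.linalg.solve(T, y_full)

T = np.array([[1.0, 0.5], [-1.0, 0.5]])        # placeholder 2x2 transform
Rx = np.diag([1.0, 0.25])                      # sigma_1^2, sigma_2^2 (illustrative)
x_hat = reconstruct(y_received=np.array([0.8]),
                    received_idx=[0], lost_idx=[1], T=T, Rx=Rx)
```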













The distortion with l erasures is denoted by D_l. To determine D_l, (5) above is averaged over all possible combinations of erasures of l out of n components, weighted by their probabilities if the probabilities are non-equivalent. An additional distortion criterion is a weighted sum {overscore (D)} of the distortions incurred with different numbers of channels available, where {overscore (D)} is given by:

$$\overline{D} = \sum_{l=1}^{n}\alpha_{l}D_{l}.$$

For a case in which each channel has a failure probability of p and the channel failures are independent, the weighting

$$\alpha_{l} = \binom{n}{l}p^{l}(1-p)^{n-l}$$

makes the weighted sum {overscore (D)} the overall expected MSE. Other choices of weighting could be used in alternative embodiments. Consider an image coding example in which an image is split over ten packets. One might want acceptable image quality as long as eight or more packets are received. In this case, one could set α_3 = α_4 = . . . = α_10 = 0.
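The erasure distortion of (5), its average D_l over erasure patterns, and the binomial weighting α_l can be computed as in the sketch below. Equal pattern probabilities, independent equal channel failures, and the omission of the all-channels-lost term are simplifying assumptions of this sketch; the transform and variances are placeholders.

```python
import numpy as np
from itertools import combinations
from math import comb

def erasure_distortion(lost_idx, T, Rx):
    # Equation (5): E||x - x_hat||^2 = sum_ij (U^T U)_ij A_ij, where U holds the
    # columns of T^{-1} for the lost components and A = R2 - B^T R1^{-1} B.
    n = T.shape[0]
    received_idx = [i for i in range(n) if i not in lost_idx]
    Ry = T @ Rx @ T.T
    R1 = Ry[np.ix_(received_idx, received_idx)]
    R2 = Ry[np.ix_(lost_idx, lost_idx)]
    B = Ry[np.ix_(received_idx, lost_idx)]
    A = R2 - B.T @ np.linalg.solve(R1, B)
    U = np.linalg.inv(T)[:, lost_idx]
    return float(np.sum((U.T @ U) * A))

def D(l, T, Rx):
    # D_l: average of (5) over the erasure patterns with l lost components
    # (equal pattern probabilities assumed in this sketch).
    patterns = list(combinations(range(T.shape[0]), l))
    return sum(erasure_distortion(list(p), T, Rx) for p in patterns) / len(patterns)

T = np.array([[1.0, 0.5], [-1.0, 0.5]])
Rx = np.diag([1.0, 0.25])
p, n = 0.1, T.shape[0]                          # equal, independent failure probability p
# Weighted sum with binomial weights alpha_l; the all-channels-lost term is omitted here.
D_bar = sum(comb(n, l) * p**l * (1 - p)**(n - l) * D(l, T, Rx) for l in range(1, n))
```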




The above expressions may be used to determine optimal transforms which minimize the weighted sum {overscore (D)} for a given rate R. Analytical solutions to this minimization problem are possible in many applications. For example, an analytical solution is possible for the general case in which n=2 components are sent over m=2 channels, where the channel failures have unequal probabilities and may be dependent. Assume that the channel failure probabilities in this general case are as given in the following table.



















                              Channel 1
                              no failure             failure
  Channel 2    failure        1 − p_0 − p_1 − p_2    p_1
               no failure     p_2                    p_0

















If the transform T is given by:

$$T = \begin{bmatrix} a & b\\ c & d\end{bmatrix},$$










minimizing (2) over transforms with a determinant of one gives a minimum possible rate of:

$$R^{*} = 2k_{\Delta} + \log\sigma_{1}\sigma_{2}.$$






The difference ρ = R − R* is referred to as the redundancy, i.e., the price that is paid to reduce the distortion in the presence of erasures. Applying the above expressions for rate and distortion to this example, and assuming that σ_1 ≥ σ_2, it can be shown that the optimal transform will satisfy the following expression:







$$|a| = \frac{\sigma_{2}}{2\,|c|\,\sigma_{1}}\left[\sqrt{2^{2\rho}-1} + \sqrt{2^{2\rho}-1-4bc(bc+1)}\right].$$











The optimal value of bc is then given by:

$$(bc)_{\mathrm{optimal}} = -\frac{1}{2} + \frac{1}{2}\left(\frac{p_{1}}{p_{2}}-1\right)\left[\left(\frac{p_{1}}{p_{2}}+1\right)^{2} - 4\,\frac{p_{1}}{p_{2}}\,2^{-2\rho}\right]^{-1/2}.$$













The value of (bc)_optimal ranges from −1 to 0 as p_1/p_2 ranges from 0 to ∞. The limiting behavior can be explained as follows: Suppose p_1>>p_2, i.e., channel 1 is much more reliable than channel 2. Since (bc)_optimal approaches 0, ad must approach 1, and hence one optimally sends x_1 (the larger variance component) over channel 1 (the more reliable channel) and vice-versa.




If p_1 = p_2 in the above example, then (bc)_optimal = −1/2, independent of ρ. The optimal set of transforms is then given by: a ≠ 0 (but otherwise arbitrary), c = −1/(2b), d = 1/(2a), and

$$b = \pm\left(2^{\rho} - \sqrt{2^{2\rho}-1}\right)\frac{\sigma_{1}\,a}{\sigma_{2}}.$$






Using a transform from this set gives:

$$D_{1} = \frac{1}{2}\left(D_{1,1}+D_{1,2}\right) = \sigma_{1}^{2} - \frac{1}{2\cdot 2^{\rho}\left(2^{\rho}-\sqrt{2^{2\rho}-1}\right)}\left(\sigma_{1}^{2}-\sigma_{2}^{2}\right). \qquad (6)$$













This relationship is plotted in FIG. 7A for values of σ_1 = 1 and σ_2 = 0.5. As expected, D_1 starts at a maximum value of (σ_1^2 + σ_2^2)/2 and asymptotically approaches a minimum value of σ_2^2. By combining (2), (3) and (6), one can find the relationship between R, D_0 and D_1. FIG. 7B shows a number of plots illustrating the trade-off between D_0 and D_1, for various values of R. It should be noted that the optimal set of transforms given above for this example provides an “extra” degree of freedom, after fixing ρ, that does not affect the ρ vs. D_1 performance. This extra degree of freedom can be used, for example, to control the partitioning of the total rate between the channels, or to simplify the implementation.
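Equation (6), as reconstructed above, and the endpoint behavior noted for FIG. 7A can be checked numerically; σ_1 = 1 and σ_2 = 0.5 below match the values used for that figure.

```python
import math

def D1(rho, sigma1=1.0, sigma2=0.5):
    # Equation (6): side distortion for the equal-probability optimal transforms.
    s = 2.0 ** rho - math.sqrt(2.0 ** (2.0 * rho) - 1.0)
    return sigma1 ** 2 - (sigma1 ** 2 - sigma2 ** 2) / (2.0 * 2.0 ** rho * s)

print(D1(0.0))   # (sigma1^2 + sigma2^2)/2 = 0.625 at zero redundancy
print(D1(4.0))   # approaches sigma2^2 = 0.25 as the redundancy grows
```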




Although the conventional 2×2 transforms described in the above-cited M.T. Orchard et al. reference can be shown to fall within the optimal set of transforms described herein when channel failures are independent and equally likely, the conventional transforms fail to provide the above-noted extra degree of freedom, and are therefore unduly limited in terms of design flexibility.




Moreover, the conventional transforms in the M.T. Orchard et al. reference do not provide channels with equal rate (or, equivalently, equal power). The extra degree of freedom in the above example can be used to ensure that the channels have equal rate, i.e., that R_1 = R_2, by implementing the transform such that |a|=|c| and |b|=|d|. This type of rate equalization would generally not be possible using conventional techniques without rendering the resulting transform suboptimal.
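One way to exercise the extra degree of freedom for rate equalization is sketched below: within the equal-probability optimal family above, a is chosen so that |a|=|c| and |b|=|d|, and the resulting coefficient variances (and hence rates) are checked to be equal. The closed-form choice of a is an assumption of this sketch, verified numerically rather than taken from the patent.

```python
import math
import numpy as np

def equal_rate_transform(sigma1, sigma2, rho):
    # Equal-probability optimal family: c = -1/(2b), d = 1/(2a),
    # b = (2^rho - sqrt(2^(2 rho) - 1)) * sigma1 * a / sigma2, with a free;
    # here a is chosen so that |a| = |c| (and then |b| = |d| follows).
    s = 2.0 ** rho - math.sqrt(2.0 ** (2.0 * rho) - 1.0)
    a = math.sqrt(sigma2 / (2.0 * s * sigma1))
    b = s * sigma1 * a / sigma2
    return np.array([[a, b], [-1.0 / (2.0 * b), 1.0 / (2.0 * a)]])

sigma1, sigma2 = 1.0, 0.5
T = equal_rate_transform(sigma1, sigma2, rho=0.5)
Ry = T @ np.diag([sigma1 ** 2, sigma2 ** 2]) @ T.T
assert abs(abs(T[0, 0]) - abs(T[1, 0])) < 1e-12 and abs(abs(T[0, 1]) - abs(T[1, 1])) < 1e-12
assert abs(Ry[0, 0] - Ry[1, 1]) < 1e-9       # equal coefficient variances, hence equal rates
```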




As previously noted, the invention may be applied to any number of components and any number of channels. For example, the above-described analysis of rate and distortion may be applied to transmission of n=3 components over m=3 channels. Although it becomes more complicated to obtain a closed form solution, various simplifications can be made in order to obtain a near-optimal solution. If it is assumed in this example that σ_1 ≥ σ_2 ≥ σ_3, and that the channel failure probabilities are equal and small, a set of transforms that gives near-optimal performance is given by:







$$\begin{bmatrix} a & -\dfrac{\sqrt{3}\,\sigma_{1}a}{\sigma_{2}} & -\dfrac{\sigma_{2}\sqrt{6}}{3\,\sigma_{1}^{2}a^{2}}\\[6pt] 2a & 0 & \dfrac{\sigma_{2}\sqrt{6}}{3\,\sigma_{1}^{2}a^{2}}\\[6pt] a & \dfrac{\sqrt{3}\,\sigma_{1}a}{\sigma_{2}} & -\dfrac{\sigma_{2}\sqrt{6}}{3\,\sigma_{1}^{2}a^{2}}\end{bmatrix}.$$










Optimal or near-optimal transforms can be generated in a similar manner for any desired number of components and number of channels.





FIG. 8 illustrates one possible way in which the MDTC techniques described above can be extended to an arbitrary number of channels, while maintaining reasonable ease of transform design. This 4×4 transform embodiment utilizes a cascade structure of 2×2 transforms, which simplifies the transform design, as well as the encoding and decoding processes (both with and without erasures), when compared to use of a general 4×4 transform. In this embodiment, a 2×2 transform T_α is applied to components x_1 and x_2, and a 2×2 transform T_β is applied to components x_3 and x_4. The outputs of the transforms T_α and T_β are routed to inputs of two 2×2 transforms T_γ as shown. The outputs of the two 2×2 transforms T_γ correspond to the four channels y_1 through y_4. This type of cascade structure can provide substantial performance improvements as compared to the simple pairing of coefficients in conventional techniques, which generally cannot be expected to be near optimal for values of m larger than two. Moreover, the failure probabilities of the channels y_1 through y_4 need not have any particular distribution or relationship. FIGS. 2, 3, 4 and 5A-5D above illustrate more general extensions of the MDTC techniques of the invention to any number of signal components and channels.
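The cascade of FIG. 8 can be sketched as a composition of 2×2 transforms. The particular routing below (one T_γ block combines the first outputs of T_α and T_β, the other combines their second outputs) and the identical placeholder 2×2 matrices are assumptions made only for illustration.

```python
import numpy as np

def cascade_4x4(T_alpha, T_beta, T_gamma):
    # Stage 1: T_alpha acts on (x1, x2) and T_beta acts on (x3, x4).
    # Stage 2: one T_gamma block combines the first outputs of the stage-1
    # transforms and the other combines their second outputs (one possible
    # routing of the FIG. 8 structure, assumed here for illustration).
    def apply(x):
        u = T_alpha @ x[0:2]
        v = T_beta @ x[2:4]
        y_top = T_gamma @ np.array([u[0], v[0]])
        y_bot = T_gamma @ np.array([u[1], v[1]])
        return np.array([y_top[0], y_top[1], y_bot[0], y_bot[1]])
    return apply

T2 = np.array([[1.0, 0.5], [-1.0, 0.5]])      # placeholder determinant-one 2x2 block
encode = cascade_4x4(T2, T2, T2)
y = encode(np.array([1.0, -0.3, 0.7, 0.2]))   # four transform coefficients y1..y4
```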




The above-described embodiments of the invention are intended to be illustrative only. It should be noted that a complementary decoder structure corresponding to the encoder structure of FIGS. 2, 3, 4 and 5A-5D may be implemented in the MD JSC decoder 16 of FIG. 1. Alternative embodiments of the invention may utilize other coding structures and arrangements. Moreover, the invention may be used for a wide variety of different types of compressed and uncompressed signals, and in numerous coding applications other than those described herein. These and numerous other alternative embodiments within the scope of the following claims will be apparent to those skilled in the art.



Claims
  • 1. A method of encoding a signal for transmission, comprising the steps of:encoding n components of the signal in a multiple description encoder, wherein the encoding step utilizes a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into m groups of multiple description components for encoding and transmission over m channels, wherein at least one of n and m is greater than two; and transmitting the encoded components of the signal.
  • 2. The method of claim 1 wherein the signal includes at least one of a data signal, a speech signal, an audio signal, an image signal and a video signal.
  • 3. The method of claim 1 wherein each of the channels corresponds to at least one packet.
  • 4. The method of claim 1 wherein at least a subset of the m channels have probabilities of failure which are not independent of one another.
  • 5. The method of claim 1 wherein at least a subset of the m channels have non-equivalent probabilities of failure.
  • 6. The method of claim 1 wherein the encoding step includes encoding the n components for transmission over the m channels using a transform of dimension n×m.
  • 7. The method of claim 1 wherein the encoding step includes encoding the n components for transmission over the m channels using a transform which is in the form of a cascade structure of a plurality of transforms each having dimension less than n×m.
  • 8. The method of claim 1 wherein the encoding step includes encoding the n components for transmission over the m channels using a transform which is configured to provide a substantially equivalent rate for each of the channels.
  • 9. The method of claim 1 wherein the encoding step includes encoding the n components for transmission over the m channels in a multiple description joint source-channel encoder which includes a series combination of N multiple description encoders followed by an entropy coder, wherein each of the N multiple description encoders includes a parallel arrangement of M multiple description encoders.
  • 10. The method of claim 9 wherein each of the M multiple description encoders implements one of: (i) a quantizer block followed by a transform block, (ii) a transform block followed by a quantizer block, (iii) a quantizer block with no transform block, and (iv) an identity function.
  • 11. An apparatus for encoding a signal for transmission, comprising:a processor for processing the signal to form components thereof; and a multiple description encoder for encoding n components of the signal, wherein the encoding process utilizes a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into m groups of multiple description components for encoding and transmission over m channels, wherein at least one of n and m is greater than two.
  • 12. The apparatus of claim 11 wherein the signal includes at least one of a data signal, a speech signal, an audio signal, an image signal and a video signal.
  • 13. The apparatus of claim 11 wherein each of the channels corresponds to at least one packet.
  • 14. The apparatus of claim 11 wherein at least a subset of the m channels have probabilities of failure which are not independent of one another.
  • 15. The apparatus of claim 11 wherein at least a subset of the m channels have non-equivalent probabilities of failure.
  • 16. The apparatus of claim 11 wherein the multiple description joint source-channel encoder is operative to encode the n components for transmission over the m channels using a transform of dimension n×m.
  • 17. The apparatus of claim 11 wherein the multiple description joint source-channel encoder is operative to encode the n components for transmission over the m channels using a transform which is in the form of a cascade structure of a plurality of transforms each having dimension less than n×m.
  • 18. The apparatus of claim 11 wherein the multiple description joint source-channel encoder is operative to encode the n components for transmission over the m channels using a transform which is configured to provide a substantially equivalent rate for each of the channels.
  • 19. The apparatus of claim 11 wherein the multiple description joint source-channel encoder further includes a series combination of N multiple description encoders followed by an entropy coder, wherein each of the N multiple description encoders includes a parallel arrangement of M multiple description encoders.
  • 20. The apparatus of claim 19 wherein each of the M multiple description encoders implements one of: (i) a quantizer block followed by a transform block, (ii) a transform block followed by a quantizer block, (iii) a quantizer block with no transform block, and (iv) an identity function.
  • 21. A method of decoding a signal received over a communication medium, comprising the steps of:receiving encoded components of the signal over m channels of the medium, wherein the components are encoded utilizing a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into m groups of multiple description components for encoding and transmission over the m channels; and decoding the received encoded components of the signal in a multiple description decoder, wherein at least one of n and m is greater than two.
  • 22. An apparatus for decoding a signal received over a communication medium, comprising:a multiple description decoder for decoding encoded components of the signal received over m channels of the medium, wherein the components are encoded utilizing a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into m groups of multiple description components for encoding and transmission over the m channels, and wherein at least one of n and m is greater than two.
  • 23. A method of encoding a signal for transmission, comprising the steps of:encoding n components of the signal in a multiple description encoder for transmission over m channels, wherein the encoding step utilizes a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into n groups of multiple description components for encoding and transmission over the m channels, and wherein at least a subset of the m channels have probabilities of failure which are not independent of one another; and transmitting the encoded components of the signal.
  • 24. An apparatus for encoding a signal for transmission, comprising:a processor for processing the signal to form components thereof; and a multiple description encoder for encoding n components of the signal for transmission over m channels, wherein the encoding step utilizes a non-identity multiple description transform to produce at least n multiple description components each of which corresponds to a different output of the multiple description transform, and the resulting multiple description components are grouped into n groups of multiple description components for encoding and transmission over the m channels, and wherein at least a subset of the m channels have probabilities of failure which are not independent of one another.
US Referenced Citations (5)
Number Name Date Kind
4894713 Delogne et al. Jan 1990 A
5028995 Izawa et al. Jul 1991 A
5263100 Kim et al. Nov 1993 A
5394473 Davidson Feb 1995 A
5928331 Bushmitch Jul 1999 A
Non-Patent Literature Citations (22)
Entry
P.G. Sherwood et al., “Error Protection of Wavelet Coded Images Using Residual Source Redundancy,” Proc. of the 31st Asilomar Conference on Signals, Systems and Computers, Nov. 1997.
M.T. Orchard et al., “Redundancy Rate-Distortion Analysis of Multiple Description Coding Using Pairwise Correlating Transforms,” Proc. IEEE Int. Conf. Image Proc., Santa Barbara, CA, Oct. 1997.
L. Ozarow, “On a source-coding problem with two channels and three receivers,” Bell Syst. Tech. J., 59(8):1417-1426, 1980.
A.A. El Gamal et al., “Achievable Rates for Multiple Descriptions,” IEEE Trans. Inform. Th., 28(6):851-857, Nov. 1982.
V.A. Vaishampayan, “Design of Multiple Description Scalar Quantizers,” IEEE Trans. Inform. Th., 39(3):821-834, May 1993.
P. Subrahmanya et al., “Multiple Descriptions Encoding of Images,” Preprint, 1997.
Y. Wang et al., “Multiple Description Image Coding for Noisy Channels by Pairing Transform Coefficients,” Proc. First IEEE SP Workshop on Multimedia Signal Processing, pp. 419-424, Princeton, NJ, Jun. 1997.
J.K. Wolf et al., “Source Coding for Multiple Descriptions,” Bell Syst. Tech. J., 59(8):1417-1426, 1980.
J.-C. Batlo et al., “Asymptotic Performance of Multiple Description Transform Codes,” IEEE Trans. Inform. Th., 43(2):703-707, 1997.
V.K. Goyal et al., “Quantized Overcomplete Expansions in IRN: Analysis, Synthesis and Algorithms,” IEEE Trans. Inform. Th., 44(1):Jan. 16, 1998.
T. Berger et al., “Minimum Breakdown Degradation in Binary Source Encoding” IEEE Trans. Inform. Th., 29(6):807, Nov. 1983.
R.M. Gray et al., “Source Coding for a Simple Network,” Bell Syst. Tech. J., 53(8):1681, Nov. 1974.
Z. Zhang et al., “New Results in Binary Multiple Descriptions,” IEEE Trans. Inform. Th., 33(4):502, Jul. 1987.
R. Ahlswede, “The Rate-Distortion Region for Multiple Descriptions Without Excess Rate,” IEEE Trans. Inform. Th., 1995.
W.H.R. Equitz et al., “Successive Refinement of Information,” IEEE Trans. Inform. Th., 37(2):269, Mar. 1991.
H.S. Witsenhausen et al., “Source Coding for Multiple Descriptions II: A Binary Source,” Bell Syst. Tech. J., 60(10):2281, Dec. 1981.
V.A. Vaishampayan et al., “Design of Entropy-Constrained Multiple-Description Scalar Quantizers,” IEEE Trans. Inform. Th., 40(1), Jan. 1994.
V.A. Vaishampayan et al., “Asymptotic Analysis of Multiple Description Quantizers,” IEEE Trans. Inform. Th., 1994.
S.-M. Yang et al., “Low-Delay Communications for Rayleigh Fading Channels: An Application of the Multiple Description Quantizer,” IEEE Trans. Comm., 43(11), Nov. 1995.
A. Ingle et al., “DPCM System Design for Diversity Systems with Applications to Packetized Speech,” IEEE Trans. Sp. and Audio Proc., 3(1):48, Jan. 1995.
V.A. Vaishampayan et al., “Speech Predictor Design for Diversity Communication Systems,” IEEE Workshop on Speech Coding for Telecommunications, Annapolis, MD, Sep. 1995.
V.A. Vaishampayan, “Application of Multiple Description Codes to Image and Video Transmission over Lossy Networks,” 7th Int'l Workshop on Packet Video, Mar. 18-19, 1996, Brisbane, Australia.