This invention relates generally to conditional source coding, and more particularly to compound conditional source coding, Slepian-Wolf list decoding, and applications for media coding.
Distributed source coding and predictive or “conditional” source coding are used in a wide range of applications, including temporal video and media compression, sensor networks, and secure multimedia coding. See, D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, 19:471-480, July 1973, and R. M. Gray, “Conditional rate-distortion theory,” Technical Report No. 6502-2, Stanford Electronics Laboratories, 1972.
As an example, video coding can be treated as a conditional source coding problem: because switch 60 is closed, each frame can be predictively encoded based on the previous frames. Video coding can also be approached as a distributed source coding problem, as discussed in, e.g., A. Aaron, R. Zhang, and B. Girod, “Wyner-Ziv coding of motion video,” in Proc. Asilomar Conf. on Signals, Systems and Comput., Monterey, Calif., November 2002, and R. Puri and K. Ramchandran, “PRISM: A new robust video coding architecture based on distributed compression principles,” in Proc. 40th Allerton Conf. on Commun., Control and Comput., Monticello, Ill., October 2002.
Wyner-Ziv video coding is a rate-distortion version of Slepian-Wolf coding. At a high level, a Wyner-Ziv system is a conventional vector quantizer, followed by a Slepian-Wolf encoder and decoder, followed by post-processing that forms a joint estimate of the source x from the decoded vector quantization of x and the side-information vector y. Thus, the Slepian-Wolf core is the only distributed aspect of a Wyner-Ziv system.
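As a point of reference only, this three-stage structure can be sketched in a few lines of Python. The scalar quantizer, the modulo-based binning that stands in for the Slepian-Wolf stage, and the simple averaging estimator are illustrative assumptions, not the method described below.

import numpy as np

STEP = 0.5   # quantizer step size (illustrative)
M = 4        # number of cosets used for binning (illustrative)

def quantize(x):
    # Conventional scalar quantizer: map each source sample to an integer index.
    return np.round(x / STEP).astype(int)

def sw_encode(indices):
    # Stand-in for the Slepian-Wolf encoder: transmit only the coset (index mod M),
    # i.e., a simple form of binning.
    return np.mod(indices, M)

def sw_decode(cosets, y):
    # Stand-in for the Slepian-Wolf decoder: within each coset, pick the index
    # closest to the quantized side information.
    y_idx = quantize(y)
    offset = np.mod(cosets - y_idx, M)
    offset[offset > M // 2] -= M          # choose the nearest coset representative
    return y_idx + offset

def joint_estimate(indices, y):
    # Post-processing: combine the dequantized value with the side information y.
    return 0.5 * (indices * STEP + y)     # hypothetical simple estimator

rng = np.random.default_rng(0)
x = rng.normal(size=8)                    # source vector x
y = x + 0.05 * rng.normal(size=8)         # correlated side-information vector y
x_hat = joint_estimate(sw_decode(sw_encode(quantize(x)), y), y)
print(np.round(np.abs(x_hat - x), 3))     # reconstruction error per sample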
For a number of applications, e.g., Wyner-Ziv video coding, it is desirable to represent the side-information vector y as a set of possibilities, rather than as predefined information.
Embodiments of the invention provide a compound conditional source coding system and method that model a number of media coding scenarios. Distributed source coding methods, while centrally important for robustly addressing the compound nature of these problems, do not by themselves characterize the full range of operational possibilities. The invention demonstrates an encoding technique whose reliability exceeds that of distributed source coding.
Length-n random uncompressed source data are drawn according to a distribution px(x) and serve as input to an encoder. A set of P candidate side-information vectors is also input to the encoder. The encoder encodes the source data, using the set of candidate side-information vectors, to produce an encoded message. The message is transmitted to a decoder. The decoder decodes the received message to produce a source estimate, using a selected side-information vector and the index of the selected side-information vector in the set of candidate side-information vectors.
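The following minimal Python sketch illustrates only the data flow just described; the encoder and decoder bodies are deliberately trivial placeholders (the binning and resolution-information steps are described below), and every name in it is hypothetical.

import numpy as np

def compound_encode(x, candidate_set):
    # The encoder sees the source x and the full candidate set {y_1, ..., y_P},
    # but not which candidate the decoder will observe.  This placeholder simply
    # forwards x; a real encoder would bin x at a rate adequate for the worst
    # candidate and append the resolution information.
    return {"payload": x.copy(), "P": len(candidate_set)}

def compound_decode(message, y_k, k):
    # The decoder observes one candidate y_k together with its index k in the set,
    # and combines them with the received message to form the source estimate.
    del y_k, k  # unused by this trivial placeholder
    return message["payload"]

rng = np.random.default_rng(1)
n, P = 16, 3
x = rng.integers(0, 2, size=n)                        # source drawn from p_x
flip = lambda q: (rng.random(n) < q).astype(int)      # assumed correlation model
candidates = [x ^ flip(0.1) for _ in range(P)]        # candidate side-information set

message = compound_encode(x, candidates)
k = 2                                                 # index selected at the decoder side
x_hat = compound_decode(message, candidates[k], k)
print("exact recovery:", bool(np.array_equal(x_hat, x)))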
Compound Conditional Source Encoding System
Thus, the encoder 220 only knows that the side-information vector y is one of a certain small, finite set of candidate side-information vectors {y_1, y_2, . . . , y_P} 260, but does not know which particular side-information vector is observed at the decoder.
As defined herein, the set of candidate side-information vectors 260 includes two or more members. The encoder 220 encodes the source x 210, using the set of candidate side-information vectors 260, to produce an encoded message 230. The message 230 is sent by a transmitter 281 over a channel to a decoder 240. The decoder 240 decodes the received message 230 to produce a source estimate 250, using the selected side-information vector y_k 270 and the index k 280 of the selected side-information vector 270 in the set of candidate side-information vectors 260. Our invention does not require a probability distribution on the selection of the index k 280, though such a distribution can be incorporated. The encoder 220 and the decoder 240 both know all of the joint distributions p_{x,y_p}, p = 1, . . . , P.
In contrast to compound conditional source coding, in conditional source coding P = 1, and in distributed source coding the encoder knows only that the side-information vector y is a member of a typical set of possibilities, hence P ~ 2^{nH(y|x)}. Because in compound systems the encoder does not know which of the P possibilities is received by the decoder, conditional coding fails.
On the other hand, distributed source coding can operate successfully if the compression rate is chosen large enough. However, because the set of possibilities has been narrowed from an exponential to a sub-exponential number, the encoder 220 is able to operate more efficiently than conventional encoders that use only Slepian-Wolf coding techniques.
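A back-of-the-envelope comparison, with illustrative numbers that are not taken from the original text, makes the advantage concrete:

import math

n = 1000              # block length (illustrative)
H_y_given_x = 0.5     # conditional entropy H(y|x) in bits per symbol (illustrative)
P_compound = 4        # cardinality of the candidate set in the compound setting

# A distributed (Slepian-Wolf) encoder must guard against roughly 2^{nH(y|x)}
# typical side-information sequences; the compound encoder covers only P of them.
print("distributed coding: about 2^%d typical side-information vectors" % round(n * H_y_given_x))
print("compound coding:    %d candidate side-information vectors" % P_compound)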
Pre-Encoding Process
The encoder 320 produces an encoded message 340, which is sent 370 to a decoder 360. Here, the decoder 360 is a Slepian-Wolf list decoder. In maximum likelihood decoding, the output of the decoder is the single best estimate of the source x. In list decoding, the decoder outputs a length-L list of source possibilities 380. The list decoder fails only if the input source x 310 is not on the list. Thus, the list decoder 360 produces the list L(y_j) 380 of L possibilities for the source x 310. As for a conventional decoder, the side-information vector y_j 350 is also an input to the list decoder 360.
Elements of the list L 380 are compared 385 with the input source x 310. The result of the comparison 385 is the matching index j 390 of the list element that matches the input source x 310.
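A minimal sketch of this pre-encoding step is given below. The stand-in list decoder simply returns the L sequences in the source's bin that are closest to the side information in Hamming distance; a real system would use a Slepian-Wolf list decoder, and all names and parameters here are hypothetical.

import numpy as np

def list_decode(bin_members, y, L):
    # Stand-in Slepian-Wolf list decoder: among the source sequences in the bin
    # identified by the encoded message, return the L closest to the side
    # information y in Hamming distance.
    ranked = sorted(bin_members, key=lambda s: int(np.sum(s != y)))
    return ranked[:L]

def matching_index(x, decoded_list):
    # Compare the true source x with each list entry; return the index of the match.
    for j, s in enumerate(decoded_list):
        if np.array_equal(s, x):
            return j
    return None  # list-decoding failure: x is not on the list

rng = np.random.default_rng(2)
n, L = 12, 4
x = rng.integers(0, 2, size=n)                                        # input source x
y = x ^ (rng.random(n) < 0.1).astype(int)                             # one candidate side-information vector
bin_members = [rng.integers(0, 2, size=n) for _ in range(20)] + [x]   # sequences sharing x's bin
decoded_list = list_decode(bin_members, y, L)
print("matching index:", matching_index(x, decoded_list))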
Encoding
The encoded message 470 includes additional resolution information, i.e., the matching indexes 460 produced by the pre-encode step 300. These additional resolution bits identify which entry on each list, i.e., for each of the P possible side-information vectors, is the correct source sequence. As described above, the pre-encode step 300 determines the matching indexes by list-decoding with each of the P candidate side-information vectors.
Because the set of candidate side-information vectors has cardinality P, the total number of resolution bits is P log L. The rate of the resolution information is P log L/n, which decreases to zero as the block length n increases. Thus, asymptotically, the resolution information requires zero additional rate. The message 470 is sent to a decoding process 500.
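The vanishing rate of the resolution information can be checked directly; the values of P, L, and n below are illustrative only.

import math

P, L = 4, 8                        # candidate-set size and list size (illustrative)
resolution_bits = P * math.log2(L) # total resolution information in bits
for n in (1_000, 10_000, 100_000): # increasing block lengths
    print(f"n={n:>6}: resolution rate = {resolution_bits / n:.6f} bits per source symbol")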
Compound Decoding
Analysis
Below, we describe technical analysis results for our embodiments. These include the rate requirements of compound conditional source coding, achievable error exponents for Slepian-Wolf list decoding, and achievable error exponents for compound conditional source coding. For some embodiments, we state results for the case of memoryless, independent and identically distributed (i.i.d.) sources.
Compound Conditional Source Coding Theorem 1
Let

p_{x,y_p}(x, y_p) = Π_{i=1}^{n} p_{x,y_p}(x_i, y_{p,i}),

where p_{x,y_p} is the per-letter joint distribution of the source x and the p-th candidate side-information vector y_p.
In maximum likelihood decoding, the output of the decoder is the single best estimate of the source sequence. In list decoding, the decoder outputs a length-L list of possible sources. The list decoder fails only if the true source sequence is not on the list; see P. Elias, “List decoding for noisy channels,” Research Laboratory of Electronics Technical Report 335, Massachusetts Institute of Technology, 1957.
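The difference between the two decoders can be seen with a toy posterior over four candidate source sequences (the sequences and probabilities are made up for illustration):

# Toy posterior over candidate source sequences given the received message and y.
posterior = {"0001": 0.40, "0011": 0.30, "0111": 0.20, "1111": 0.10}

ml_estimate = max(posterior, key=posterior.get)                   # single best estimate
L = 2
list_output = sorted(posterior, key=posterior.get, reverse=True)[:L]

print("ML decoder output:  ", ml_estimate)    # fails whenever the true sequence is not the argmax
print("List decoder output:", list_output)    # fails only if the true sequence is not on the list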
We derive the following list-coding result for distributed Slepian-Wolf source coding.
List-Decoding for Slepian-Wolf Systems Theorem 2
Let p_{x,y}(x, y) be the joint distribution of a pair of length-n random sequences (x, y), where x is the source input to the encoder and y is the decoder side-information vector. There exists a rate-R encoder/list-decoder pair, where the list L(y) is of size |L(y)| = L, such that the average probability of a list-decoding error is bounded for every choice of ρ, 0 ≤ ρ ≤ L.
In the special case of an i.i.d. source distribution

p_{x,y}(x, y) = Π_{i=1}^{n} p_{x,y}(x_i, y_i),

and maximizing over the free parameter 0 ≤ ρ ≤ L, we obtain the following error exponent.
IID Corollary 1
For i.i.d. sources there exists a rate-R distributed source coding list-encoder/decoder pair such that Pr[x ∉ L(y)] ≤ 2^{-nE} for all E ≤ E_{SW,list}(p_{x,y}, R, L), where the exponent E_{SW,list}(p_{x,y}, R, L) is obtained by maximizing the bound of Theorem 2 over the free parameter 0 ≤ ρ ≤ L.
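The display defining E_{SW,list} is not reproduced in this text. For orientation only, a Gallager-style source-coding exponent with side information, with the usual range 0 ≤ ρ ≤ 1 enlarged to 0 ≤ ρ ≤ L, has the form below; this is an assumed reconstruction consistent with the surrounding description, not a verbatim copy of the omitted formula.

E_{\mathrm{SW,list}}(p_{x,y}, R, L)
  = \max_{0 \le \rho \le L} \left[ \rho R
      - \log_2 \sum_{y} \Bigl( \sum_{x} p_{x,y}(x, y)^{\frac{1}{1+\rho}} \Bigr)^{1+\rho} \right].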
The following corollary states that the error exponent of compound conditional source coding is at least as large as the list-decoding error exponent of the distributed source coding problem under the selected joint distribution p_{x,y_k}.
Error Exponent of Compound Conditional Source Coding Corollary 2
Consider the compound conditional source coding problem of Theorem 1, and let k ∈ {1, 2, . . . , P} be the index of the side-information vector observed at the decoder. Then the error exponent of the compound scheme is at least as large as the list-decoding error exponent of Corollary 1 evaluated under the selected joint distribution p_{x,y_k}.
In maximum likelihood decoding for conventional Slepian-Wolf systems, 0 ≤ ρ ≤ 1, while in length-L list decoding, 0 ≤ ρ ≤ L. This additional freedom translates into a large increase in the exponent at higher rates. This is the same effect as when list decoding is used in channel coding.
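The effect of the enlarged range for ρ can be illustrated numerically using the Gallager-style exponent sketched above, for a uniform binary source whose side information is the source passed through a binary symmetric channel; the correlation model and all parameter values are assumptions chosen only for illustration.

import math

def E0(rho, q):
    # Gallager-style term for a uniform binary source with a BSC(q) correlation to y:
    # E0(rho) = log2( sum_y ( sum_x p(x, y)^{1/(1+rho)} )^{1+rho} ).
    inner = (0.5 * (1 - q)) ** (1 / (1 + rho)) + (0.5 * q) ** (1 / (1 + rho))
    return math.log2(2 * inner ** (1 + rho))

def exponent(R, q, rho_max, steps=2000):
    # Grid search for max over 0 <= rho <= rho_max of rho * R - E0(rho).
    return max(rho * R - E0(rho) for rho in (rho_max * i / steps for i in range(steps + 1)))

q = 0.1                                   # crossover probability of the assumed x-to-y channel
for R in (0.6, 0.8, 0.95):                # rates above H(x|y) ≈ 0.47 bits per symbol
    e_ml = exponent(R, q, rho_max=1)      # conventional decoding: 0 <= rho <= 1
    e_list = exponent(R, q, rho_max=4)    # list decoding with L = 4: 0 <= rho <= 4
    print(f"R={R:.2f}: E_ML={e_ml:.3f}   E_list(L=4)={e_list:.3f}")

At the lower rates the two exponents coincide because the maximizing ρ is interior, while near R = 0.95 the enlarged range yields a visibly larger exponent, consistent with the statement above.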
Certain media coding applications in which distributed source coding techniques are used can be stated more precisely as compound conditional source coding problems. This insight can lead to improved system performance, as we demonstrate through error exponents.
Examples of Compound Conditional Source Coding Applications
Multiview Coding
In multiview video/image coding, images of a scene are acquired by multiple cameras at each time instant t. For the purpose of this description, each time instant is associated with a frame.
The possible side-information vector sequences for the encoder are predetermined. For example, the prediction reference frame for frame (2, 4) can be either frame (1, 4) or frame (2, 3), depending on the desired decoding order.
Robust Video Coding
Wyner-Ziv coding of video is used to reduce error propagation when video frames are transmitted over a lossy channel.
This is another example of a compound conditional source coding application, because the encoder knows in advance the possible side-information vectors (frame 4, frame 3, frame 2, or frame 1) that the decoder might use in decoding frame 5.
Stream Switching for Multiresolution Video Coding
A key issue in streaming video is that the network bandwidth can vary over time. Some applications use Wyner-Ziv video coding to allow the transmitter to vary the bit rate, resolution, or quality of the video stream dynamically. Enabling the decoder to “switch” from one resolution to another is complicated by the fact that the decoder may not have the prediction reference frames from the other video stream.
Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.